Figure 2

Feature selection conducted through LASSO regression with tenfold cross-validation. (a) Coefficient profiles of the clinical features plotted against log(lambda). As lambda increases, the coefficient of each feature shrinks towards zero, reflecting the regularization inherent in LASSO regression; this process selects relevant features while mitigating potential overfitting. (b) Tenfold cross-validation curve for LASSO regression, used for model selection. The left dotted vertical line marks the log(lambda) value, and corresponding number of features, that yielded the smallest mean squared error (λ = 0.009451193). The right dotted vertical line marks the model chosen by the one-standard-error criterion, which retains 19 variables (λ = 0.03476508), balancing predictive accuracy against model simplicity and thereby avoiding unnecessary complexity. λ, lambda.
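The λ-minimum and one-standard-error selections described in panel (b) can be sketched in Python with scikit-learn's `LassoCV` (a minimal illustration only: the synthetic data stand in for the clinical features, which are not available here, and `LassoCV` does not report a 1-SE lambda directly, so it is computed by hand from the cross-validation error path):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Synthetic stand-in for the clinical feature matrix (assumption:
# the study's actual data are not reproduced here).
X, y = make_regression(n_samples=200, n_features=30, n_informative=10,
                       noise=5.0, random_state=0)

# Tenfold cross-validated LASSO: scans a grid of lambda (alpha) values
# and records the per-fold mean squared error at each one.
lasso = LassoCV(cv=10, random_state=0).fit(X, y)

# Lambda minimizing the mean CV error (analogue of the left dotted line).
alpha_min = lasso.alpha_

# One-standard-error rule (analogue of the right dotted line): take the
# largest lambda whose mean CV error is within one standard error of the
# minimum, trading a little accuracy for a sparser model.
mse_mean = lasso.mse_path_.mean(axis=1)
mse_se = lasso.mse_path_.std(axis=1) / np.sqrt(lasso.mse_path_.shape[1])
i_min = np.argmin(mse_mean)
alpha_1se = lasso.alphas_[mse_mean <= mse_mean[i_min] + mse_se[i_min]].max()

n_selected = int(np.sum(lasso.coef_ != 0))
print(alpha_min, alpha_1se, n_selected)
```

Because the 1-SE lambda imposes a penalty at least as strong as the error-minimizing one, it always yields an equally sparse or sparser model, which is the simplicity/accuracy trade-off the caption describes.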