Multi-feature fusion RFE random forest for schizophrenia classification and treatment response prediction

Wang, Chang; Zhang, Rui; Zhang, Jiyuan; Ren, Yaning; Pang, Ting; Chen, Xiangyu; Li, Xiao; Zhao, Zongya; Yang, Yongfeng; Ren, Wenjie; Yu, Yi

doi:10.1038/s41598-025-89359-5

Article
Open access
Published: 12 March 2025

Chang Wang^1,2,3,4,5,
Rui Zhang^1,2,3,4,5,
Jiyuan Zhang^1,2,3,4,5,
Yaning Ren^1,2,3,4,5,
Ting Pang^1,2,3,4,5,
Xiangyu Chen^1,2,3,4,
Xiao Li^2,5,
Zongya Zhao^1,2,3,4,
Yongfeng Yang³,
Wenjie Ren^1,4 &
Yi Yu^1,2,3,4

Scientific Reports volume 15, Article number: 8594 (2025)

Abstract

Schizophrenia (SZ) classification and treatment response prediction have substantial clinical value. However, few studies have exploited the multi-feature information derived from resting-state functional magnetic resonance imaging (rs-fMRI) for SZ classification and for predicting the response to short-term drug treatment. We developed a multi-feature fusion recursive feature elimination random forest (RFE-RF) approach for SZ classification and treatment response prediction. First, we computed multiple features: regional homogeneity, fractional amplitude of low-frequency fluctuations, and functional connectivity. The RFE-RF method was then used for SZ classification. We also used the rate of score reduction (RR) of the Positive and Negative Syndrome Scale (PANSS) to predict the treatment response of individual patients. Finally, we identified the neuroimaging biomarkers for SZ classification and drug-treatment response prediction. The method achieved an accuracy of 91.7%, a sensitivity of 90.9%, and a specificity of 92.6%, and abnormalities in the visual and default mode networks emerged as potential neuroimaging biomarkers for differentiating SZ from healthy controls (HC). Additionally, we predicted the drug-treatment response of SZ patients in terms of their total PANSS scores, as well as their negative and positive symptom scores, after eight weeks of treatment. Specifically, abnormalities in the visual network, sensorimotor network, and right superior frontal gyrus are crucial biomarkers of the short-term drug-treatment response of negative symptoms in SZ patients, whereas abnormalities in the visual and default mode networks are important biomarkers of the short-term drug-treatment response of positive symptoms. These findings offer novel insights into the neural mechanisms underlying SZ following eight weeks of short-term drug treatment.
With further clinical validation in the future, this research may provide potential biomarkers and intervention targets for personalized treatment of SZ.

Introduction

SZ is a severe chronic mental disorder that most often occurs in adolescence or early adulthood¹. Its clinical symptoms fall into three main groups: positive symptoms, negative symptoms, and general psychopathological symptoms. Negative symptoms, which appear earlier than other symptoms, are considered core symptoms of SZ and are associated with poor functional outcomes^2,3. It is therefore crucial to assess the effectiveness of early treatment of negative symptoms, to find individual prognostic biomarkers of SZ, and to guide the choice of appropriate treatment in the clinic.

Numerous studies^4,5,6,7 have documented differences in brain imaging metrics between patients with SZ and healthy controls (HC). These metrics are pivotal for uncovering individualized prognostic indicators for SZ. One line of research focuses on the functional activity of specific brain regions. For instance, regional homogeneity (ReHo), amplitude of low-frequency fluctuations (ALFF), and fractional amplitude of low-frequency fluctuations (fALFF)^8,9,10 probe the functional activity patterns of each voxel or brain region. ReHo measures the temporal similarity between a voxel and its neighboring voxels, thereby indirectly reflecting changes in the synchronization of local neuronal activity. Several investigations^11,12 have shown that ReHo values in the prefrontal cortex are correlated with clinical symptoms and can be used to predict treatment responses, laying a theoretical groundwork for the application of machine learning in this domain. Guo et al.¹³ corroborated this by showing that support vector regression (SVR) could accurately predict treatment responses in SZ patients from abnormal ReHo values in the sensorimotor network and the right putamen. ALFF directly reflects the intensity of spontaneous blood oxygen level-dependent (BOLD) signal activity within a region, offering insight into local spontaneous neuronal activity, and fALFF further reduces the impact of physiological noise. Regions such as the postcentral gyrus and thalamus have been linked to therapeutic efficacy in patients^14,15. Studies employing SVR analysis have indicated that a substantial decline in fALFF after just one week of treatment in SZ patients is strongly and positively associated with improvements in positive symptoms after eight weeks of treatment. 
Another analogous study discovered that fluctuations in fALFF within the right caudate nucleus region could serve as an effective biomarker for forecasting treatment responses in SZ patients^16,17. These studies enable the identification of region-specific changes within the brain, facilitating a thorough exploration of the functional characteristics of localized brain regions. However, they fall short when it comes to providing a profound comprehension of the intricate interactions occurring between different brain regions.

The other mainstream approach integrates functional activity across voxels or brain regions. Functional connectivity (FC)¹⁸ measures the co-activity of brain regions by computing the correlations between their signals, which reveals whether the synchronization between these regions is abnormal. Currently, much of the research that uses abnormal FC to predict patients’ responses to drugs focuses on the striatum and hippocampus^19,20. Similar investigations^21,22 have found that enhanced connectivity between the two hemispheres of the brain is associated with more favorable treatment responses. Specifically, abnormalities in inter-hemispheric connectivity, along with certain abnormal values, can be used to forecast how patients with SZ will respond to olanzapine. These methods offer valuable insights into the functional connectivity across brain regions and contribute to a better understanding of the structure and function of the whole-brain network. However, FC analyses generate a large amount of redundant information and therefore cannot readily single out individual brain regions.

The methods mentioned above each rely on a single feature extracted from resting-state functional magnetic resonance imaging (rs-fMRI). SZ, however, is an extremely complex disorder, and a single feature is far from adequate to capture its multifaceted characteristics, which limits both classification accuracy and the prediction of treatment responses. Previous research has shown that a model that integrates information from multiple features can achieve higher prediction accuracy²³. Nevertheless, feature fusion may introduce a large amount of irrelevant information, which increases the complexity of the model. Hence, feature reduction techniques are essential for minimizing noise and errors and for optimizing the model’s performance. Recursive Feature Elimination (RFE) is a result-driven technique of this kind. It starts from all the features and measures feature importance according to the training outcomes of the model. This enables it to select features that are highly relevant to the predicted target variable while discarding those that are only weakly related to it.

Current research has relatively neglected the use of multi-feature fusion, along with the integration of pre-treatment and post-treatment information, for treatment response prediction. In this study, we developed a multi-feature fusion recursive feature elimination random forest (RFE-RF) approach for SZ classification and treatment response prediction. First, we computed three features: regional homogeneity, fractional amplitude of low-frequency fluctuations, and functional connectivity. Next, a two-sample t-test was used to filter out features that did not differ significantly between groups. We then used the RFE-RF method to classify SZ. In addition, the rate of score reduction (RR) of the Positive and Negative Syndrome Scale (PANSS) was used to predict the treatment response of individual patients. Through this process, we identified the neuroimaging biomarkers relevant to SZ classification and treatment response prediction.

The main contributions of this study can be summarized as follows:

(1) A multi-feature fusion RFE-RF was proposed for schizophrenia classification and treatment response prediction.
(2) The abnormalities in the visual and default mode networks are the potential neuroimaging biomarkers in distinguishing SZ from HC.
(3) The abnormalities in the visual network, sensorimotor network, and right superior frontal gyrus are the important biomarkers of short-term treatment response of negative symptoms in SZ patients.
(4) The abnormalities in the visual and default mode networks are important biomarkers of the short-term treatment response of positive symptoms.

Materials and methods

Figure 1 shows the workflow of this study, which comprises data acquisition and preprocessing, feature extraction, statistical analysis, recursive feature elimination and random forest, and classification and prediction.

Participants

A total of 190 participants were included in this study. We recruited 104 SZ patients from the Second Affiliated Hospital of Xinxiang Medical University in China and 86 HC from nearby communities. The inclusion criteria were as follows: (1) age 18–55 years, (2) Han nationality, (3) right-handedness, (4) normal IQ, (5) for patients, fulfillment of the clinical diagnostic criteria for schizophrenia in the DSM-IV, and (6) for HC, no history of psychiatric disorders in themselves or in relatives within three generations. The exclusion criteria were as follows: (1) history of head trauma, (2) neurological disorders, (3) alcohol or drug dependence, (4) unstable physical conditions, (5) breastfeeding or pregnancy, and (6) contraindications to MRI. Symptoms and neurocognitive function of SZ patients were assessed with the PANSS and the MATRICS Consensus Cognitive Battery (MCCB). The treatment response was evaluated by the RR, which is calculated as follows:

$$RR=\frac{PANSS_{Baseline} - PANSS_{8weeks}}{PANSS_{Baseline}}$$

(1)
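As a small illustration of Eq. (1), the sketch below computes the RR for one patient. The function name `reduction_rate` and the example scores are hypothetical and are not taken from the study data.

```python
def reduction_rate(baseline: float, week8: float) -> float:
    """Rate of score reduction (RR): (baseline - week 8) / baseline."""
    if baseline == 0:
        raise ValueError("baseline PANSS score must be non-zero")
    return (baseline - week8) / baseline


# Hypothetical scores, chosen only for illustration.
rr_total = reduction_rate(baseline=80.0, week8=60.0)
print(f"RR = {rr_total:.2f}")  # RR = 0.25
```

A positive RR means that the score fell during treatment; an RR of zero means no change.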

Patients were treated with second-generation antipsychotics for eight weeks before MRI examination, and the selection of dosage and medication was performed by psychiatrists. This study was approved by the Ethics Committee of the Second Affiliated Hospital of Xinxiang Medical University and performed in accordance with the tenets of the Declaration of Helsinki. All patients and control subjects provided written informed consent.

Data acquisition and preprocessing

We obtained the raw resting-state functional magnetic resonance imaging (rs-fMRI) data using a 3.0T MR scanner (Siemens, Verio) with an echo planar imaging sequence sensitive to blood oxygen level-dependent (BOLD) contrast. The imaging parameters were set as follows: repetition time of 2000 ms, echo time of 30 ms, flip angle of 90°, matrix size of 64 × 64, axial slice resolution of 3.4 × 3.4 mm², slice thickness of 4 mm, and a 0.6-mm gap between slices. The data collection lasted for 8 min, yielding 240 time points.

The raw rs-fMRI data were preprocessed with the Data Processing Assistant for Resting-State fMRI (DPABI). Preprocessing comprised removal of the initial time points, slice timing correction, head motion correction, normalization, spatial smoothing, linear detrending, covariate regression, and filtering. At the start of MRI data collection, the scanner needs time for the magnetic field gradient to stabilize, and the subjects need to adapt to the scanning environment and reach a resting state. Discarding the first 10 time points is commonly considered to improve data quality and stability, so we removed them. Participants with head translation exceeding 3 mm or rotation greater than 3° during scanning were excluded from the study. To register different subjects to the same standardized space, we used the echo planar imaging (EPI) template. For spatial smoothing, an appropriately sized kernel removes noise and boosts the signal-to-noise ratio; a kernel that is too large may cause signal loss, whereas one that is too small leaves residual noise. A reasonable guideline is to choose a full-width at half-maximum (FWHM) of twice the voxel size²⁴. Given that the voxel size in this experiment was 3 mm, we used a Gaussian smoothing kernel with an FWHM of 6 mm. Finally, we filtered the data of each subject to reduce both low-frequency drift and high-frequency noise.

Feature extraction

We divided the whole brain into 116 regions (90 cortical and subcortical non-cerebellar regions and 26 cerebellar regions) using the standardized Automated Anatomical Labeling (AAL) atlas. fALFF, ReHo, and FC were calculated, and Z-score normalization was applied to these features.

fALFF has been shown to correspond stably to local glucose metabolism in the brain, thus reflecting the brain’s energy expenditure²⁵. fALFF is calculated as follows. The BOLD signal of each voxel in the whole brain is converted into a frequency-domain power spectrum by the Fourier transform, and the square root of the power spectrum is taken at each frequency. Averaging these square roots over the low-frequency band (0.01–0.08 Hz) gives the ALFF. fALFF is the low-frequency ALFF value divided by the corresponding value for the full band (0.01–0.25 Hz).

$${\text{fALF}}{{\text{F}}_{\text{i}}}=\frac{{\sum\nolimits_{{\text{k}}} {\sqrt {{X_{\text{i}}}{{\left[ {\text{k}} \right]}^2}} } }}{{\frac{1}{N}\sum\nolimits_{{{\text{m}}=0}}^{{N - 1}} {\sqrt {{X_{\text{i}}}{{\left[ {\text{m}} \right]}^{\text{2}}}} } }}{\text{,k}} \in \left[ {0.01\frac{{2\pi }}{{{\text{f}}_{S}^{2}}}{\text{,}}0.08\frac{{2\pi }}{{{\text{f}}_{S}^{2}}}} \right]$$

(2)
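The fALFF computation described above can be sketched with NumPy as follows. This is a minimal illustration, not the toolbox implementation used in the study: it assumes the common ratio-of-sums form (summed low-band amplitude over summed full-band amplitude), a repetition time of 2 s, and 230 retained time points (240 minus the 10 discarded). The function name `falff` and the synthetic signals are hypothetical.

```python
import numpy as np


def falff(ts, tr=2.0, low=(0.01, 0.08), full=(0.01, 0.25)):
    """Ratio of summed low-band amplitude to summed full-band amplitude."""
    ts = np.asarray(ts, dtype=float)
    ts = ts - ts.mean()                      # remove the constant offset
    freqs = np.fft.rfftfreq(ts.size, d=tr)   # frequency of each bin in Hz
    amp = np.abs(np.fft.rfft(ts))            # square root of the power spectrum
    low_mask = (freqs >= low[0]) & (freqs <= low[1])
    full_mask = (freqs >= full[0]) & (freqs <= full[1])
    return amp[low_mask].sum() / amp[full_mask].sum()


t = np.arange(230) * 2.0                     # 230 volumes at TR = 2 s
slow = np.sin(2 * np.pi * 0.05 * t)          # inside the 0.01-0.08 Hz band
fast = np.sin(2 * np.pi * 0.20 * t)          # outside the low band
mostly_slow = falff(slow + 0.1 * fast)
mostly_fast = falff(0.1 * slow + fast)
print(mostly_slow > mostly_fast)             # True
```

A signal dominated by slow fluctuations yields a value close to 1, while a signal dominated by faster fluctuations yields a value close to 0.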

ReHo represents the local consistency of the time series of neighboring voxels, which is gauged by Kendall’s Coefficient of Concordance (KCC). Abnormalities in ReHo mirror disruptions in the local coordinated neuronal activity. Indeed, numerous studies have already detected ReHo abnormalities in SZ^26,27. The KCC was calculated as follows.

$$W=\frac{\sum {({R_i})^2} - n{(\overline {R} )^2}}{\frac{1}{12}{K^2}({n^3} - n)}$$

(3)

Where n is the number of time points, ${R_i}$ is the sum of the ranks of the K voxels at the i-th time point, $\overline {R}$ is the mean of the ${R_i}$, and K is the number of voxels in the cluster (the given voxel and its surrounding voxels).
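To make the KCC concrete, the sketch below computes Kendall's W for one cluster of voxels, assuming the standard form in which the squared rank sums are summed over time points. The function name `kendall_w` and the cluster size of 27 (a voxel plus 26 neighbors) are assumptions for illustration; the text does not state the neighborhood size. Ties are not averaged in this simplified ranking.

```python
import numpy as np


def kendall_w(ts):
    """Kendall's W for an array of shape (n_timepoints, K_voxels)."""
    ts = np.asarray(ts, dtype=float)
    n, k = ts.shape
    # Rank each voxel's time series over time (1..n); ties are not averaged.
    ranks = ts.argsort(axis=0).argsort(axis=0) + 1
    r = ranks.sum(axis=1)                     # R_i: rank sum at time point i
    num = np.sum(r ** 2) - n * r.mean() ** 2
    den = (k ** 2) * (n ** 3 - n) / 12.0
    return num / den


rng = np.random.default_rng(0)
base = rng.standard_normal(20)
identical = np.tile(base[:, None], (1, 27))   # 27 perfectly concordant voxels
opposed = np.column_stack([base, -base])      # two voxels in exact opposition
print(kendall_w(identical), kendall_w(opposed))
```

Perfectly concordant time series give W = 1, and two exactly opposed series give W = 0, which brackets the range of ReHo values.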

FC serves as an indicator to reveal whether there are any anomalies in the synergistic activities or correlations among different regions in SZ. It can be calculated as follows. The time series of all voxels in each brain region were averaged. Pearson correlation coefficients were adopted to construct the functional connectivity matrix for individual subjects. Pearson correlation (PC) can be calculated by the linear correlation between ROIs as follows.

$$W_{ij} = \frac{(x_{i} - \bar{x}_{i})^{T} (x_{j} - \bar{x}_{j})}{\sqrt{(x_{i} - \bar{x}_{i})^{T} (x_{i} - \bar{x}_{i})} \sqrt{(x_{j} - \bar{x}_{j})^{T} (x_{j} - \bar{x}_{j})}}$$

(4)

Where ${x_i}$ and ${x_j}$ are the BOLD signal time series of any two ROIs, and ${\bar {x}_i}$ and ${\bar {x}_j}$ are the averages of ${x_i}$ and ${x_j}$.
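The FC construction can be sketched with NumPy as below. The data are synthetic stand-ins, and the function name `fc_features` is hypothetical. With 116 ROIs there are 116 × 115 / 2 = 6670 unique region pairs, so the upper triangle of the correlation matrix yields 6670 FC features per subject.

```python
import numpy as np


def fc_features(roi_ts):
    """Upper-triangle Pearson correlations from (n_timepoints, n_rois) data."""
    roi_ts = np.asarray(roi_ts, dtype=float)
    corr = np.corrcoef(roi_ts, rowvar=False)      # (n_rois, n_rois) matrix
    iu = np.triu_indices(corr.shape[0], k=1)      # unique pairs, no diagonal
    return corr, corr[iu]


rng = np.random.default_rng(0)
roi_ts = rng.standard_normal((230, 116))          # synthetic stand-in data
corr, features = fc_features(roi_ts)
print(corr.shape, features.shape)                 # (116, 116) (6670,)
```

Only the upper triangle is kept because the matrix is symmetric and its diagonal is always 1.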

Statistical analysis

Demographic and clinical data were analyzed in SPSS (Statistical Product and Service Solutions) 26.0 using chi-square tests and two-sample t-tests where applicable. Gender, age, and years of education were used as covariates at baseline, and two-sample t-tests were used for between-group comparisons of SZ and HC. Gaussian Random Field (GRF) theory (voxel significance: p < 0.001, cluster significance: p < 0.05) was used to correct for multiple comparisons and obtain the differential brain regions.

Recursive feature elimination and Random Forest

Recursive Feature Elimination (RFE) builds a model that assigns an importance score to each feature in the dataset and ranks the features by these scores. The model is first trained on the complete set of original features, and the least important feature, as determined by the feature weights, is removed. A new model is then trained on the remaining features, and the procedure is repeated until only one feature remains. Finally, we compare the feature subsets from all iterations by accuracy and select the subset with the highest accuracy, which identifies the optimal number of features. The importance ranking is computed with a feature-importance function in Python. Each time a node splits during model construction, the contribution of each feature to the reduction in impurity or in mean squared error (MSE) is calculated. These contributions are then aggregated into a relative importance score for every feature, which serves as the feature weight. This iterative procedure retains the most influential features from a large number of candidates, mitigating the effects of redundancy, noise, and high dimensionality and improving the model’s generalization performance, accuracy, and predictive capability.
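The elimination loop described above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' code: it assumes that the importance scores come from `RandomForestClassifier.feature_importances_`, it uses 5-fold rather than leave-one-out cross-validation to keep the run short, and the function name `rfe_rf`, the tree count, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def rfe_rf(X, y, n_trees=25, cv=5, seed=0):
    """Drop the least important feature each round; keep the best subset."""
    remaining = list(range(X.shape[1]))
    best_acc, best_subset = -1.0, list(remaining)
    eliminated = []                               # least important first
    while remaining:
        model = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        # Cross-validated accuracy of the current feature subset.
        acc = cross_val_score(model, X[:, remaining], y, cv=cv).mean()
        if acc > best_acc:
            best_acc, best_subset = acc, list(remaining)
        if len(remaining) == 1:
            eliminated.append(remaining.pop())
            break
        # Refit on the current subset and remove the weakest feature.
        model.fit(X[:, remaining], y)
        worst = int(np.argmin(model.feature_importances_))
        eliminated.append(remaining.pop(worst))
    return best_subset, best_acc, eliminated[::-1]  # ranking: best first


# Synthetic data: only features 0 and 1 carry group information.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.standard_normal((60, 6))
X[:, 0] += 2.0 * y
X[:, 1] -= 2.0 * y
subset, acc, ranking = rfe_rf(X, y)
print(subset, round(acc, 3), ranking)
```

The returned ranking lists features from most to least important, and the returned subset is the one with the highest cross-validated accuracy over all rounds.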

Random Forest (RF)²⁸, an ensemble learning method, constructs and integrates multiple decision tree models. In the tree-building process, a bootstrap sampling strategy is adopted to randomly extract multiple sub-datasets from the original feature set. Each sub-dataset is then used to independently train a decision tree. This approach reduces the risk of over-fitting, thereby improving the model’s generalization ability. Additionally, it exhibits good robustness against noise. Random Forest offers high accuracy and strong robustness, conferring significant advantages in both classification and regression tasks. As a result, it has emerged as one of the most widely utilized machine learning algorithms in practical applications.

Classification and prediction

We used RFE-RF for SZ classification and treatment response prediction. For SZ classification, we used the features of the brain regions that differed between SZ and HC as the original feature set, adopted a leave-one-out cross-validation strategy, and assessed the performance of the classifier using accuracy, specificity, and sensitivity. These metrics are calculated as follows.

$$ACC=\frac{{TP+TN}}{{TP+FP+TN+FN}}$$

(5)

$$SPE=\frac{{TN}}{{TN+FP}}$$

(6)

$$SEN=Recall=\frac{{TP}}{{TP+FN}}$$

(7)

Where TP is true positive, FN is false negative, TN is true negative, and FP is false positive. In this process, we identified the optimal number of features and reported the importance rankings of the top ten features for SZ classification.
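Eqs. (5) to (7) can be checked with a few lines of Python. The counts below are not quoted from the confusion matrix; they are inferred as the integer counts that reproduce the rates reported later in the paper for the 88 patients and 81 controls, and should be read as an assumption. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, specificity and sensitivity from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    spe = tn / (tn + fp)
    sen = tp / (tp + fn)
    return acc, spe, sen


# Counts inferred from the reported rates, not read from the confusion matrix.
acc, spe, sen = classification_metrics(tp=80, fn=8, tn=75, fp=6)
print(f"ACC={acc:.1%} SPE={spe:.1%} SEN={sen:.1%}")
# ACC=91.7% SPE=92.6% SEN=90.9%
```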

For treatment response prediction in SZ, we focused on abnormal changes in brain regions before and after treatment, using pre- and post-treatment information on the abnormal brain regions. We used the RFE-RF method combined with leave-one-out cross-validation to assess the performance of the model. The mean squared error (MSE) was the performance metric, and the RR values of the PANSS total score, negative symptom score, and positive symptom score were the prediction targets. During feature selection, we removed one feature at a time until only a single feature remained and then selected the feature subset with the lowest MSE, which determined the optimal number of features for each target. We again used the feature importance function to report the ten most important features for predicting the RR of the PANSS negative score and of the PANSS positive score.
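The evaluation step for a single feature subset can be sketched as below: a random forest regressor is scored by leave-one-out cross-validation, and the MSE is computed from the held-out predictions. The scikit-learn classes are real, but the function name `loocv_mse`, the tree count, and the synthetic features and RR values are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_predict


def loocv_mse(X, y, n_trees=25, seed=0):
    """Leave-one-out predictions and their mean squared error."""
    model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
    return float(np.mean((y - pred) ** 2)), pred


rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))                            # synthetic features
y = 0.4 + 0.1 * X[:, 0] + 0.02 * rng.standard_normal(30)    # synthetic RR values
mse, pred = loocv_mse(X, y)
r = np.corrcoef(y, pred)[0, 1]                              # predicted vs. actual
print(f"MSE={mse:.4f}, r={r:.2f}")
```

Repeating this evaluation for each subset produced during elimination, and keeping the subset with the lowest MSE, gives the selection rule described in the text.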

Results

Subject characteristics

A total of 190 participants were enrolled in this study. After strict quality control and the application of the exclusion criteria, 88 patients with SZ and 81 HC were included in the analysis. The demographic and clinical profiles of the subjects are presented in Table 1. There was no significant difference in age or gender between SZ and HC, but there was a significant difference in years of education (p < 0.001). After eight weeks of treatment, SZ patients showed notable improvements in the PANSS total score, negative symptom score, and positive symptom score compared with baseline. All improvements were statistically significant (p < 0.001), as detailed in Table 2.

Table 1 Demographic data and clinical characteristics of all subjects.

Table 2 Changes in PANSS scores before and after 8 weeks of treatment.

Multi-feature with statistical differences

For each participant, a multi-feature vector with a dimension of 6902 × 1 was constructed by concatenating several components. Specifically, it included 6670 FC features, 116 ReHo features, and 116 fALFF features. After conducting a two-sample t-test followed by false discovery rate (FDR) correction with a significance level of P < 0.001, 693 functional connection features, 4 ReHo features, and 3 fALFF features exhibited statistical differences, as shown in Fig. 2. Among the top ten brain regions showing statistically significant differences in FC, eight brain regions belonged to the visual network. Regarding the regions with significant differences in ReHo, they were the left inferior temporal gyrus, the precuneus, and the right inferior parietal marginal angular gyrus. As for the fALFF, the brain regions demonstrating statistical differences were the right superior frontal gyrus (orbital part), the right lenticular nucleus (putamen), and the left caudate nucleus.
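The dimensions of the fused vector follow directly from the atlas size, as the sketch below shows. The concatenation order (FC, then ReHo, then fALFF) is assumed from the order in which the text lists the components, and the helper names `fuse` and `feature_kind` are hypothetical. Keeping such an index map makes it possible to trace a selected feature back to its type.

```python
import numpy as np

N_ROI = 116
N_FC = N_ROI * (N_ROI - 1) // 2        # 6670 unique ROI pairs


def fuse(fc, reho, falff):
    """Concatenate FC, ReHo and fALFF features into one vector."""
    return np.concatenate([fc, reho, falff])


def feature_kind(index):
    """Map an index in the fused vector back to its feature type."""
    if index < N_FC:
        return "FC"
    if index < N_FC + N_ROI:
        return "ReHo"
    return "fALFF"


rng = np.random.default_rng(0)
vec = fuse(rng.standard_normal(N_FC), rng.standard_normal(N_ROI),
           rng.standard_normal(N_ROI))
print(vec.shape, feature_kind(6670))   # (6902,) ReHo
```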

Classification results

In the feature selection process, RFE-RF identified 181 optimal classification features. With these features, the model reached a training accuracy of 98.2%, a sensitivity of 97.7%, and a specificity of 98.8%. In the SZ classification task, it achieved an accuracy of 91.7%, a sensitivity of 90.9%, and a specificity of 92.6%. In the confusion matrix (Fig. 3), the top-left and bottom-right cells give the numbers of correctly classified samples, and the top-right and bottom-left cells give the numbers of misclassified samples. SZ currently lacks specific diagnostic biomarkers, which leads to a certain misdiagnosis rate in clinical practice. Compared with recent studies^29,30, both our false positive rate and our false negative rate were lower. This reduction indicates that the RFE-RF approach has practical value, pending further clinical validation.

We further report the importance rankings of the ten most discriminative brain regions for SZ classification (Table 3). Notably, over half of these regions exhibit intra-network connectivity abnormalities within the visual and default mode networks. Prior research has indicated that abnormal visual processing in SZ is caused by dysfunction in multiple nodes of the visual network, and these abnormal visual processing mechanisms contribute to various manifestations of cognitive dysfunction³¹. Abnormalities in the default mode network have also been reported repeatedly, and they link default mode network dysfunction to both the positive and negative symptoms of SZ³².

Table 3 Top 10 most discriminating categorized features.

Treatment response prediction results

We also determined the optimal number of features for predicting the RR of the PANSS total score, negative symptom score, and positive symptom score. The optimal feature combination for the RR of positive symptoms included only one feature that was not a functional connection, the ReHo of the left precuneus. This suggests that the ReHo of the left precuneus has a distinct role in predicting the RR of the PANSS positive symptoms. The optimal numbers of features and the corresponding MSE values are shown in Table 4. With the optimal features, the predicted RR values of the PANSS total score, negative symptoms, and positive symptoms were all positively correlated with the actual RR values (Fig. 4).

Table 4 Characterization categories screened and predicted results.

Biomarkers for treatment response prediction

We also identified the optimal features for predicting the RR of the PANSS negative and positive symptoms after eight weeks of treatment (Fig. 5). As shown in Table 5, the optimal features for predicting the RR of the PANSS negative symptom scores predominantly involved the visual network, the sensorimotor network, and the superior frontal gyrus in the default mode network. As shown in Table 6, the optimal features for predicting the RR of the PANSS positive symptom scores mainly involved the default mode network and the visual network.

Table 5 The ten most important features of RR for negative symptom scoring.

Table 6 The ten most important features of RR for positive symptom scoring.

Discussion

In this study, we found that, compared with HC, SZ patients predominantly exhibited abnormal intra-network connectivity within the default mode network and the visual network. We used these abnormal brain features to distinguish SZ from HC, and we predicted the short-term treatment response at the individual level using information on abnormal brain regions both before and after treatment. We also identified neuroimaging biomarkers associated with positive and negative symptoms during the treatment of SZ patients. Once clinically validated, these biomarkers have significant potential for application in the precise diagnosis and targeted treatment of SZ.

SZ is an intricate psychiatric disorder, entailing a wide array of cognitive, emotional, behavioral, and perceptual abnormalities. The visual network plays a pivotal role in the manifestation of these SZ-related abnormalities^33,34,35. The visual network is tasked with processing visual information, facilitating our understanding of the ever-changing external world. It further enables us to concentrate visual attention, filter out irrelevant visual cues, prioritize and process more critical visual data, and store visual memories. The visual network is interconnected with motor control regions, emotional processing centers, and social cognitive areas, among others. Any impairment to this network can disrupt the transmission and integration of brain information, rendering it arduous to interpret information from multiple dimensions, including external perception, executive function, and emotion³⁶. This study revealed that eight of the top ten brain regions, which showed statistically significant differences in functional connectivity between SZ and HC, were part of the visual network. In the classification of SZ and HC, a substantial portion of the top ten brain region features were associated with the visual network and the default mode network. This further underscores the prominent contribution of visual network abnormalities in SZ.

Abnormalities in the visual network have also been widely observed in previous studies, with substantial evidence of both structural and functional abnormalities. These include structural alterations in the visual network, disturbances in functional connectivity^37,38,39, and changes in visual information processing^40,41. Previous studies^42,43 have also identified abnormalities within the default mode network in SZ patients. Collectively, these findings corroborate our observations regarding abnormalities in the visual network and the default mode network among SZ patients.

Negative symptoms are the core symptoms of SZ; their main manifestations include loss of motivation and pleasure, impoverished thinking, and emotional apathy. These symptoms are a crucial determinant of the clinical and functional prognosis of SZ patients, so treating negative symptoms is of considerable clinical value⁴⁴. In this study, we found that abnormalities in the visual network, the sensorimotor network, and the right superior frontal gyrus played a substantial role in predicting the treatment response of negative symptoms in SZ patients after short-term drug treatment. The sensorimotor network plays a central role in the detection and processing of sensory inputs as well as in the preparation and execution of motor functions. It was the first resting-state brain network to be identified in fMRI studies⁴⁵. Multiple investigations have shown that abnormal indicators of the sensorimotor network and visual network are correlated with negative symptoms in patients^46,47,48. Similarly, the right superior frontal gyrus has been repeatedly associated with negative symptoms in SZ^49,50. This is consistent with our finding that abnormalities in the visual network, the sensorimotor network, and the right superior frontal gyrus are involved in the neural mechanisms of negative symptoms.

Positive symptoms are also an important part of the symptoms of SZ, with hallucinations, delusions, and abnormal behavior as the most prominent features. The default mode network is known to be active during the resting state and is associated with individual mental processes⁵¹. Many studies have confirmed that the default mode network and visual network are associated with positive symptoms^52,53,54. In this study, the ReHo values of the precuneus, which serves as a hub node within the default mode network, were specifically selected. Abnormalities of the precuneus are also widely recognized as playing a role in SZ^55,56. This further supports the concept that SZ results from the interaction of two or more brain regions rather than a lesion confined to a single brain region⁵⁷.

K-fold cross-validation is a widely used performance assessment approach in machine learning. The dataset is partitioned into k subsets; k–1 of these subsets are used to train the model, and the remaining subset serves as the test set. The procedure is repeated k times so that each subset acts as the test set exactly once, which yields more stable and trustworthy estimates of model performance than a single train/test split. Leave-one-out cross-validation is the special case in which k equals the number of samples: each data point is held out in turn as the test set, and all other data points are used for training. Given the relatively limited size of our dataset, we employed leave-one-out cross-validation to make the fullest use of the available data. However, this method requires the model to be trained once per sample and is therefore computationally costly. Hence, more efficient validation methods should be explored in the future.
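As an illustration of the validation scheme described above, the following minimal sketch builds k-fold splits in plain Python and obtains leave-one-out cross-validation by setting k to the number of samples. It is not the authors' code: the helper names (`k_fold_indices`, `nearest_centroid_predict`, `cross_validate`), the toy nearest-centroid classifier, and the six-sample dataset are all hypothetical stand-ins, and the classifier replaces the RFE-RF model purely to keep the example self-contained.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Fold sizes differ by at most one sample.
    base, extra = divmod(n_samples, k)
    splits, start = [], 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        splits.append((train, test))
        start = stop
    return splits


def nearest_centroid_predict(train_x, train_y, x):
    """Toy classifier (a stand-in, not the RFE-RF model of the paper)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        # Squared Euclidean distance from x to the class centroid.
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sorted(sums), key=sq_dist)


def cross_validate(xs, ys, k: int) -> float:
    """Return the accuracy pooled over all k held-out folds."""
    correct = 0
    for train, test in k_fold_indices(len(xs), k):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test:
            correct += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return correct / len(xs)


# Hypothetical toy data: two well-separated groups of three samples each.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = [0, 0, 0, 1, 1, 1]

# Leave-one-out is k-fold with k equal to the number of samples.
loo_accuracy = cross_validate(X, y, k=len(X))
```

With n samples, the leave-one-out loop refits the model n times, which is the computational cost noted above; choosing a smaller k reduces the number of fits at the price of smaller training sets.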

Our research has some limitations. First, the sample size was limited, and future studies should include more samples to enhance the generalizability of the model. Second, our study relied solely on rs-fMRI data, so we cannot determine whether the functional changes are due to disease-related structural alterations in the brain, and we may have missed information provided by sMRI changes. We plan to incorporate structural brain data in future research. Finally, many studies⁵⁸ have suggested that demographic factors, such as gender, age, and years of education, are associated with SZ. In our study, years of education showed significant differences. Although this variable was included as a covariate in the analysis, demographic factors should be incorporated into the model as independent inputs in future research.

Conclusion

In this study, we developed a multi-feature fusion recursive feature elimination random forest for SZ classification and treatment response prediction, and we identified the associated neuroimaging biomarkers. Abnormalities within the visual network and the default mode network are potential neuroimaging biomarkers for distinguishing SZ from HC. To predict the treatment response of individual patients, we used the rate of score reduction (RR) of the PANSS, which allowed us to identify the biomarkers that contribute to treatment response prediction. Specifically, abnormalities in the visual network, the sensorimotor network, and the right superior frontal gyrus were predictive of the short-term drug-treatment response of negative symptoms in SZ patients, while abnormalities within the visual network and the default mode network were predictive of the short-term drug-treatment response of positive symptoms. After clinical validation, these biomarkers may have application value in the accurate diagnosis and treatment of SZ.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

1. Solmi, M. et al. Age at onset of mental disorders worldwide: large-scale meta-analysis of 192 epidemiological studies. Mol. Psychiatry 27, 281–295 (2021).
2. Galderisi, S. & Kaiser, S. The pathophysiology of negative symptoms of schizophrenia: main hypotheses and open challenges. Br. J. Psychiatry. 223, 298–300 (2023).
3. Marder, S. R. & Umbricht, D. Negative symptoms in schizophrenia: newly emerging measurements, pathways, and treatments. Schizophr. Res. 258, 71–77 (2023).
4. Lei, D. et al. Integrating machine learning and multimodal neuroimaging to detect schizophrenia at the level of the individual. Hum. Brain. Mapp. 41, 1119–1135 (2019).
5. Guo, W. et al. Olanzapine modulates the default-mode network homogeneity in recurrent drug-free schizophrenia at rest. Australian New. Z. J. Psychiatry. 51, 1000–1009 (2017).
6. Gong, J. et al. Abnormalities of intrinsic regional brain activity in first-episode and chronic schizophrenia: a meta-analysis of resting-state functional MRI. J. Psychiatry Neurosci. 45, 55–68 (2020).
7. Wang, Y. et al. Resting-state functional connectivity changes within the default mode network and the salience network after antipsychotic treatment in early-phase schizophrenia. Neuropsychiatr. Dis. Treat. 13, 397–406 (2017).
8. Zang, Y., Jiang, T., Lu, Y., He, Y. & Tian, L. Regional homogeneity approach to fMRI data analysis. NeuroImage 22, 394–400 (2004).
9. Yu-Feng, Z. et al. Altered baseline brain activity in children with ADHD revealed by resting-state functional MRI. Brain Develop. 29, 83–91 (2007).
10. Zou, Q. H. et al. An improved approach to detection of amplitude of low-frequency fluctuation (ALFF) for resting-state fMRI: fractional ALFF. J. Neurosci. Methods. 172, 137–141 (2008).
11. Li, X. et al. Abnormalities of Regional Brain activity in patients with Schizophrenia: a longitudinal resting-state fMRI study. Schizophr. Bull. 49, 1336–1344 (2023).
12. van Veelen, N. M. J. et al. Prefrontal lobe dysfunction predicts treatment response in medication-naive first-episode schizophrenia. Schizophr. Res. 129, 156–162 (2011).
13. Jing, H. et al. Deviant spontaneous neural activity as a potential early-response predictor for therapeutic interventions in patients with schizophrenia. Front. NeuroSci. 17, 1243168 (2023).
14. Wang, P. et al. Amplitude of low-frequency fluctuation (ALFF) may be associated with cognitive impairment in schizophrenia: a correlation study. BMC Psychiatry 19, 63 (2019).
15. Cui, L. B. et al. Prediction of early response to overall treatment for schizophrenia: a functional magnetic resonance imaging study. Brain Behav. 9, e01211 (2019).
16. Wu, R. et al. Reduced brain activity in the right Putamen as an early predictor for treatment response in Drug-Naive, first-episode Schizophrenia. Front. Psychiatry. 10, 741 (2019).
17. Li, H. et al. Enhanced baseline activity in the left ventromedial putamen predicts individual treatment response in drug-naive, first-episode schizophrenia: results from two independent study samples. eBioMedicine 46, 248–255 (2019).
18. Friston, K. J. Functional and effective connectivity: a review. Brain Connect. 1, 13–36 (2011).
19. Cao, B. et al. Treatment response prediction and individualized identification of first-episode drug-naïve schizophrenia using brain functional connectivity. Mol. Psychiatry. 25, 906–913 (2018).
20. Goff, D. C. et al. Anterior hippocampal–cortical functional connectivity distinguishes antipsychotic Naïve First-Episode Psychosis patients from controls and May Predict response to second-generation antipsychotic treatment. Schizophr. Bull. 46, 680–689 (2020).
21. Cui, L. B., Fu, Y. F., Liu, L., Wei, Y. & Yin, H. Baseline structural and functional magnetic resonance imaging predicts early treatment response in schizophrenia with radiomics strategy. Cold Spring Harbor Lab. Press. 53, 1961–1975 (2020).
22. Zhu, F. et al. Disrupted asymmetry of inter- and intra-hemispheric functional connectivity in patients with drug-naive, first-episode schizophrenia and their unaffected siblings. EBioMedicine 36, 429–435 (2018).
23. Sun, H. et al. Psychoradiologic utility of MR Imaging for diagnosis of attention deficit hyperactivity disorder: a Radiomics Analysis. Radiology 287, 620–630 (2018).
24. Soares, J. M. et al. A Hitchhiker’s guide to functional magnetic resonance imaging. Front. NeuroSci. 10 (2016).
25. Lencz, T. et al. Frontal lobe fALFF measured from resting-state fMRI as a prognostic biomarker in first-episode psychosis. Neuropsychopharmacology 47, 2245–2251 (2022).
26. Wang, F. et al. Spontaneous activity Associated with delusions of Schizophrenia in the Left Medial Superior Frontal Gyrus: a resting-state fMRI study. Plos One 10, 45 (2015).
27. Peng, Y. et al. Unraveling multi-scale neuroimaging biomarkers and molecular foundations for schizophrenia: a combined multivariate pattern analysis and transcriptome‐neuroimaging association study. CNS Neurosci. Ther. 30 (2024).
28. Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
29. Zhang, R. et al. Aberrant patterns of spontaneous brain activity in schizophrenia: a resting-state fMRI study and classification analysis. Prog. Neuropsychopharmacol. Biol. Psychiatry 134, 85 (2024).
30. Guo, H. et al. Discriminative analysis of schizophrenia patients using an integrated model combining 3D CNN with 2D CNN: a multimodal MR image and connectomics analysis. Brain Res. Bull. 206, 859 (2024).
31. Silverstein, S. M. & Keane, B. P. Vision Science and Schizophrenia Research: toward a re-view of the disorder editors’ introduction to Special Section. Schizophr. Bull. 37, 681–689 (2011).
32. Garrity, A. G. et al. Aberrant default mode functional connectivity in schizophrenia. Am. J. Psychiatry. 164, 450–457 (2007).
33. Dondé, C., Avissar, M., Weber, M. M. & Javitt, D. C. A century of sensory processing dysfunction in schizophrenia. Eur. Psychiatry. 59, 77–79 (2020).
34. Kang, S. S., Sponheim, S. R., Chafee, M. V. & MacDonald, A. W. Disrupted functional connectivity for controlled visual processing as a basis for impaired spatial working memory in schizophrenia. Neuropsychologia 49, 2836–2847 (2011).
35. Belge, J. B. et al. Facial decoding in schizophrenia is underpinned by basic visual processing impairments. Psychiatry Res. 255, 167–172 (2017).
36. Adámek, P., Langová, V. & Horáček, J. Early-stage visual perception impairment in schizophrenia, bottom-up and back again. Schizophrenia 8, 21 (2022).
37. Jurisic, D. et al. New insights into Schizophrenia: a look at the Eye and related structures. Psychiatria Danubina. 32, 60–69 (2020).
38. Yang, W. et al. Alterations of dynamic functional connectivity between visual and executive-control networks in schizophrenia. Brain Imaging Behav. 16, 1294–1302 (2022).
39. Jimenez, A. M., Riedel, P., Lee, J., Reavis, E. A. & Green, M. F. Linking resting-state networks and social cognition in schizophrenia and bipolar disorder. Hum. Brain. Mapp. 40, 4703–4715 (2019).
40. Lee, S. H., Lee, J. S., Park, G., Song, M. J. & Choi, K. H. Early visual processing for low spatial frequency fearful face is correlated with cortical volume in patients with schizophrenia. Neuropsychiatr. Dis. Treat. 12, 1–14 (2015).
41. Kogata, T. & Iidaka, T. A review of impaired visual processing and the daily visual world in patients with schizophrenia. Nagoya J. Med. Sci. 80, 317–328 (2018).
42. Wang, C. et al. Multi feature fusion network for schizophrenia classification and abnormal brain network recognition. Brain Res. Bull. 206, 110848 (2024).
43. Fan, F. et al. Functional fractionation of default mode network in first episode schizophrenia. Schizophr. Res. 210, 115–121 (2019).
44. Bègue, I., Kaiser, S. & Kirschner, M. Pathophysiology of negative symptom dimensions of schizophrenia – current developments and implications for treatment. Neurosci. Biobehavioral Reviews. 116, 74–88 (2020).
45. Biswal, B., Zerrin Yetkin, F., Haughton, V. M. & Hyde, J. S. Functional connectivity in the motor cortex of resting human brain using echo-planar mri. Magn. Reson. Med. 34, 537–541 (1995).
46. Zhao, C. et al. Cross-cohort replicable resting‐state functional connectivity in predicting symptoms and cognition of schizophrenia. Hum. Brain. Mapp. 45, e26694 (2024).
47. Zhang, C. et al. Fractional amplitude of low-frequency fluctuations in sensory-motor networks and limbic system as a potential predictor of treatment response in patients with schizophrenia. Schizophr. Res. 267, 519–527 (2024).
48. Deng, Y. et al. Ventral and dorsal visual pathways exhibit abnormalities of static and dynamic connectivities, respectively, in patients with schizophrenia. Schizophr. Res. 206, 103–110 (2019).
49. Brady, R. O. et al. Cerebellar-prefrontal network connectivity and negative symptoms in schizophrenia. Am. J. Psychiatry. 176, 512–520 (2019).
50. Zhang, Y., Yang, R. & Cai, X. Frequency-specific alternations in the moment-to-moment BOLD signals variability in schizophrenia. Brain Imaging Behav. 15, 68–75 (2020).
51. Raichle, M. E. The brain’s default Mode Network. Annu. Rev. Neurosci. 38, 433–447 (2015).
52. Yuan, L. et al. Abnormal Brain Network Interaction Associated with positive symptoms in drug-naive patients with First-Episode Schizophrenia. Front. Psychiatry. 13, 870709 (2022).
53. Rotarska-Jagiela, A. et al. Resting-state functional network correlates of psychotic symptoms in schizophrenia. Schizophr. Res. 117, 21–30 (2010).
54. Li, P. et al. Altered brain network connectivity as a potential endophenotype of schizophrenia. Sci. Rep. 7 (2017).
55. Aryutova, K. et al. Differential aberrant connectivity of precuneus and anterior insula may underpin the diagnosis of schizophrenia and mood disorders. World J. Psychiatry 11, 1274–1287 (2021).
56. Fransson, P. & Marrelec, G. The precuneus/posterior cingulate cortex plays a pivotal role in the default mode network: evidence from a partial correlation network analysis. NeuroImage 42, 1178–1184 (2008).
57. Stephan, K. E., Friston, K. J. & Frith, C. D. Dysconnection in Schizophrenia: from abnormal synaptic plasticity to failures of self-monitoring. Schizophr. Bull. 35, 509–527 (2009).
58. Luo, Y. et al. Gender difference in the association between education and schizophrenia in Chinese adults. BMC Psychiatry 20, 296 (2020).

Acknowledgements

This work was supported by the Open Project of Henan Collaborative Innovation Center (Program No. XTkf01, XTkf07, XTgh01), the Scientific and Technological Project of Henan Province (No. 242102310002, 242102310055), the Open Program of Henan Key Laboratory of Biological Psychiatry (Program No. ZDSYS2022008), the Scientific Research and Innovation Support Plan of Xinxiang Medical University (No. YJSCX202304Z), the Key Research Projects of Higher Education Institutions in Henan Province (No. 23A416001), the International Science and Technology Cooperation Project of Henan Province (No. 242102521012), and the Innovative Research Team (in Science and Technology) in University of Henan Province (No. 24IRTSTHN042).

Author information

Authors and Affiliations

School of Medical Engineering, Xinxiang Medical University, Xinxiang, China
Chang Wang, Rui Zhang, Jiyuan Zhang, Yaning Ren, Ting Pang, Xiangyu Chen, Zongya Zhao, Wenjie Ren & Yi Yu
Henan Collaborative Innovation Center of Prevention and treatment of mental disorder, the Second Affiliated Hospital of Xinxiang Medical University, Xinxiang, China
Chang Wang, Rui Zhang, Jiyuan Zhang, Yaning Ren, Ting Pang, Xiangyu Chen, Xiao Li, Zongya Zhao & Yi Yu
Henan Key Laboratory of Biological Psychiatry, Xinxiang, China
Chang Wang, Rui Zhang, Jiyuan Zhang, Yaning Ren, Ting Pang, Xiangyu Chen, Zongya Zhao, Yongfeng Yang & Yi Yu
Engineering Technology Research Center of Neurosense and Control of Henan Province, Xinxiang, China
Chang Wang, Rui Zhang, Jiyuan Zhang, Yaning Ren, Ting Pang, Xiangyu Chen, Zongya Zhao, Wenjie Ren & Yi Yu
Henan International Joint Laboratory of Neural Information Analysis and Drug Intelligent Design, Xinxiang, China
Chang Wang, Rui Zhang, Jiyuan Zhang, Yaning Ren, Ting Pang & Xiao Li

Authors

Chang Wang
Rui Zhang
Jiyuan Zhang
Yaning Ren
Ting Pang
Xiangyu Chen
Xiao Li
Zongya Zhao
Yongfeng Yang
Wenjie Ren
Yi Yu

Contributions

Chang Wang and Rui Zhang contributed equally to this work. Chang Wang: Revision - first draft, Writing and experimental guidance. Rui Zhang: Get the results of the experiment, Write the first draft. Jiyuan Zhang: Find data and filter data. Yaning Ren: Auxiliary drawing and sorting results. Ting Pang: Writing and experimental guidance. Xiangyu Chen: Find data and filter data. Xiao Li: Find data and filter data. Zongya Zhao: Programming guidance and result correction. Yongfeng Yang: Supervision, result correction. Wenjie Ren: Supervision, Project administration. Yi Yu: Supervision, Project administration, Writing - review & editing.

Corresponding authors

Correspondence to Yongfeng Yang, Wenjie Ren or Yi Yu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wang, C., Zhang, R., Zhang, J. et al. Multi-feature fusion RFE random forest for schizophrenia classification and treatment response prediction. Sci Rep 15, 8594 (2025). https://doi.org/10.1038/s41598-025-89359-5

Received: 02 September 2024
Accepted: 04 February 2025
Published: 12 March 2025
DOI: https://doi.org/10.1038/s41598-025-89359-5