Enhanced machine learning approaches for OSA patient screening: model development and validation study

Dai, Rongrong; Yang, Kang; Zhuang, Jiajing; Yao, Ling; Hu, Yiming; Chen, Qingquan; Zheng, Huaxian; Zhu, Xi; Ke, Jianfeng; Zeng, Yifu; Fan, Chunmei; Chen, Xiaoyang; Fan, Jimin; Zhang, Yixiang

doi:10.1038/s41598-024-70647-5

Article
Open access
Published: 26 August 2024

Rongrong Dai^1,2^na1,
Kang Yang^3,7^na1,
Jiajing Zhuang³^na1,
Ling Yao³,
Yiming Hu^2,4,
Qingquan Chen²,
Huaxian Zheng³,
Xi Zhu²,
Jianfeng Ke⁵,
Yifu Zeng⁶,
Chunmei Fan⁵,
Xiaoyang Chen^1,5,
Jimin Fan^1,5 &
…
Yixiang Zhang^1,5

Scientific Reports volume 14, Article number: 19756 (2024)

Abstract

Age, gender, body mass index (BMI), and mean heart rate during sleep were found to be risk factors for obstructive sleep apnea (OSA), and a variety of methods have been applied to predict the occurrence of OSA. This study aimed to develop and evaluate OSA prediction models using simple and accessible parameters, combined with multiple machine learning algorithms, and integrate them into a cloud-based mobile sleep medicine management platform for clinical use. The study data were obtained from the clinical records of 610 patients who underwent polysomnography (PSG) at the Sleep Medicine Center of the Second Affiliated Hospital of Fujian Medical University between January 2021 and December 2022. The participants were randomly divided into a training-test group (80%) and an independent validation group (20%). The logistic regression, artificial neural network, naïve Bayes, support vector machine, random forest, and decision tree algorithms were used with age, gender, BMI, and mean heart rate during sleep as predictors to build a risk prediction model for moderate-to-severe OSA. To evaluate the performance of the models, we calculated the area under the receiver operating characteristic curve (AUROC), accuracy, recall, specificity, precision, and F1-score for the independent validation set. In addition, the calibration curve, decision curve, and clinical impact curve were generated to determine clinical usefulness. Age, gender, BMI, and mean heart rate during sleep were significantly associated with OSA. The artificial neural network model had the best efficacy compared with the other prediction algorithms. The AUROC, accuracy, recall, specificity, precision, F1-score, and Brier score were 80.4% (95% CI 76.7–84.1%), 69.9% (95% CI 69.8–69.9%), 86.5% (95% CI 81.6–91.3%), 61.5% (95% CI 56.6–66.4%), 53.2% (95% CI 47.7–58.7%), 65.9% (95% CI 60.2–71.5%), and 0.165, respectively, for the artificial neural network model. The AUROCs for the LR, NB, SVM, RF, and DT models were 80.2%, 79.7%, 79.2%, 78.4%, and 70.4%, respectively. The six models based on four simple and easily accessible parameters effectively predicted moderate-to-severe OSA in patients with PSG screening, with the artificial neural network model having the best performance. These models can provide a reliable tool for early OSA diagnosis, and their integration into a cloud-based mobile sleep medicine management platform could improve clinical decision making.

Introduction

Obstructive sleep apnea

Obstructive sleep apnea (OSA) is characterized by recurrent closure of the upper airway during sleep, resulting in decreased oxyhemoglobin saturation and fragmented sleep patterns. Patients typically exhibit symptoms such as snoring, witnessed apnea, morning headache, and daytime sleepiness^1,2. OSA leads to depression, cognitive impairment, and increased risk of cardiovascular, metabolic, and other diseases^3,4,5. Globally, it is estimated that over 425 million adults suffer from moderate-to-severe OSA, with the prevalence of moderate-to-severe OSA ranging from 6 to 17% in the general adult population and up to 49% among the elderly^6,7.

Overnight polysomnography (PSG) is the gold standard for the diagnosis of OSA, which records multiple data channels including electroencephalogram (EEG), electrocardiogram (ECG), and electromyogram (EMG)^8,9. PSG is complex and costly, which may make it difficult to scale up for large numbers of potential patients. However, undiagnosed OSA can lead to severe medical and economic consequences¹⁰. A previous study showed that moderate-to-severe OSA is highly associated with adverse health events such as hypertension, depression, and all-cause mortality¹¹.

Obstructive respiratory events during sleep impair physical restoration during nocturnal sleep. Individuals with OSA may experience excessive daytime sleepiness, compromised vigilance, and a diminished capacity to carry out daily tasks effectively. This state may ultimately lead to an elevated risk of complications, traffic accidents, and workplace accidents, thereby placing a considerable financial burden on both the healthcare system and the economy. Therefore, the development of a simple tool to identify moderate-to-severe OSA is imperative.

Machine learning

Machine learning (ML), the process of using observed data to build statistical models to predict outcomes, holds great promise in medical research¹² and has been applied in predicting OSA. ML algorithms can continually refine their predictions based on new data, thereby enhancing accuracy and predictive efficacy over time.

Related works

Features derived from electrophysiological data, such as heart rate variability, oxygen saturation and ECG-derived respiration (EDR) have been used for OSA prediction. For ML models, previous studies have applied professional questionnaires, sociodemographic information, clinical symptoms, physical examination and laboratory indicators to predict OSA for screening and further grading of suspected OSA patients.

BMI, age, and sex were the earliest and most commonly applied clinical factors, followed by neck circumference, snoring, the Epworth Sleepiness Scale, witnessed apneas, waist circumference, etc.^1,13,14,15. Heart rate has also been correlated with the severity of OSA¹⁶. Age-related factors such as increased fat deposition in the parapharyngeal region, elongation of the soft palate, and changes in peripharyngeal body structure contribute to an increased prevalence of OSA¹⁷. Differences in body fat distribution, differences in upper airway anatomy, and pathophysiological differences such as hormonal status lead to greater susceptibility to OSA in men than in women^18,19. Obesity exacerbates OSA by reducing the airway lumen and increasing upper airway collapsibility¹⁹. Sympathetic nerve activity appears to be increased during sleep in OSA patients^20,21, leading to increased heart rates.

Frequently used machine learning approaches include logistic regression, linear regression, support vector machines, neural networks, decision trees, etc. For instance, Kim et al. applied logistic regression (LR), support vector machine (SVM), random forest (RF), and XGBoost (XGB) using features such as hypertension, waist circumference, length between the subnasale and stomion (subnasale to stomion), and the Fatigue Severity Scale total score to develop predictive models for OSA²². Other researchers have also chosen age, gender, BMI, head circumference, neck circumference, dry throat, and witnessed apnea as features to develop prediction models^23,24. These studies demonstrate the feasibility of machine learning approaches in OSA screening and provide several clinical tools for early diagnosis and management of OSA. However, the predictive features selected in the models developed by different researchers vary widely, and the majority of the predictive models have not been translated into clinical applications. Most studies lack external validation and the accuracy level of diagnosis is insufficient. Challenges encompassing the complexity of selecting and acquiring predictive features, the limited availability of PSG data, imbalanced datasets, the interpretability of models, etc. have constrained the development of ML models for OSA prediction. Therefore, a simple, effective, and low-cost OSA prediction model and application platform are still needed to meet clinical needs.

The objective of the study

This study aims to provide a model for screening patients with moderate-to-severe OSA using only a small number of easily available, low-cost, and generalisable variables. The objective is to promote the awareness, treatment, and control rate among the population. Accordingly, this research employed machine learning models to predict moderate-to-severe OSA using six machine learning methods—LR, artificial neural network (ANN), naïve Bayes (NB), SVM, RF, and decision tree (DT)—based on age, sex, BMI, and mean heart rate during sleep, with the aim to derive the best fit model for predicting moderate-to-severe OSA and apply it to the cloud-based mobile sleep medicine management platform at the Sleep Medicine Center of the Second Affiliated Hospital of Fujian Medical University for practical clinical use.

Materials and methods

Data set

This was a retrospective study using the clinical records from patients who underwent PSG at the Sleep Medicine Center of the Second Affiliated Hospital of Fujian Medical University from January 2021 to December 2022.

Patients were included if they (1) were older than 18 years of age and (2) had undergone overnight PSG testing. Participants were excluded if they (1) had received any treatment related to OSA, (2) had undergone continuous positive airway pressure (CPAP) treatment, (3) had a total PSG recording time of less than 4 h, or (4) had a nighttime sleep duration of less than 3 h.

Data collection was facilitated through our self-developed cloud-based mobile sleep medicine management platform, which enabled real-time data collection, including basic information and physical examination data on admission, PSG test data, follow-up and scientific research questionnaire data, etc. The platform is regularly upgraded and maintained, and is connected to the hospital intranet without external access, thus ensuring access stability, data compatibility and security.

In addition, we improved the level of privacy protection in three ways. (1) Data desensitization: user-submitted data were used solely for model inference calculations and were not retained. (2) Differential privacy: noise was added during model training so that the influence of any single record on the output always stays below a certain threshold; a third party therefore cannot infer the change, addition, or deletion of a single record from changes in the output. This is considered the highest level of security among perturbation-based privacy protection methods. (3) Privacy protection agreement: before using our model to make predictions, users are presented in the system with an informed consent form that makes clear what data are being collected and for what purpose.

Polysomnography

All participants underwent an overnight PSG (Compumedics Grael, Compumedics SOMTEPSG, SOMNO screen plus and NOX T3). Sleep signals were recorded by qualified technicians.

The apnea–hypopnea index (AHI) was computed for each participant based on the total number of apnea and hypopnea events divided by the total sleep time. All sleep apnea and respiratory events were scored following the updated 2007 American Academy of Sleep Medicine (AASM) scoring rules²⁵. Apnea was defined as a ≥ 90% reduction in airflow lasting ≥ 10 s. Hypopnea was defined as a ≥ 30% reduction in airflow lasting ≥ 10 s and a ≥ 4% reduction in arterial oxygen saturation.

OSA severity classification relied on AHI cutoffs: ≥ 5 for the presence of OSA, ≥ 15 for moderate-to-severe OSA, and ≥ 30 for severe OSA. We labeled the dataset accordingly and divided the participants into a none-to-mild OSA group (AHI < 15) and a moderate-to-severe OSA group (AHI ≥ 15) to develop the OSA risk prediction model.
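The AHI calculation and the cutoffs above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes that total sleep time is expressed in hours, so that the AHI is a number of events per hour of sleep; it is not the scoring software used in the study.

```python
def apnea_hypopnea_index(n_apneas, n_hypopneas, total_sleep_time_hours):
    """AHI = (apnea events + hypopnea events) / total sleep time.

    Assumption: total sleep time is given in hours (events per hour).
    """
    if total_sleep_time_hours <= 0:
        raise ValueError("total sleep time must be positive")
    return (n_apneas + n_hypopneas) / total_sleep_time_hours


def osa_severity(ahi):
    """Map an AHI value to the severity categories defined by the cutoffs above."""
    if ahi >= 30:
        return "severe"
    if ahi >= 15:
        return "moderate"
    if ahi >= 5:
        return "mild"
    return "none"


def moderate_to_severe_label(ahi):
    """Binary modelling label: 1 for AHI >= 15, 0 for AHI < 15."""
    return 1 if ahi >= 15 else 0


if __name__ == "__main__":
    ahi = apnea_hypopnea_index(n_apneas=60, n_hypopneas=60, total_sleep_time_hours=6.0)
    print(ahi, osa_severity(ahi), moderate_to_severe_label(ahi))
```

Keeping the four-level severity and the binary label as separate functions mirrors the text: the cutoffs describe severity, while only the AHI ≥ 15 boundary is used as the prediction target.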

Ethical review

The Second Affiliated Hospital of Fujian Medical University, the Sleep Medicine Key Laboratory of the University in Fujian Province, and the Sleep Disorder Medicine Center of the Second Affiliated Hospital of Fujian Medical University review committee approved the protocol of this study (2023-133). In addition, all participants and/or their legal guardians consented to the use of their medical data in this study. This study was carried out in accordance with the Declaration of Helsinki.

Statistical analysis

Statistical analysis of the clinical records was performed using R version 4.2.1. Continuous data are presented as means and standard deviations, whereas frequencies and percentages were calculated for categorical variables. To explore the effects of age, gender, BMI, and mean heart rate during sleep on moderate-to-severe OSA, univariate analysis and multivariate analysis were performed. A Pearson's chi-squared test was conducted to determine whether the variables were statistically significant, and multivariate analysis was used to determine independent predictors of moderate-to-severe OSA. A two-tailed P-value less than 0.05 indicated a statistically significant difference. A multiple imputation method was implemented to fill in missing data for the height and weight of a total of 58 patients. As a repetitive simulation-based method, multiple imputation can generate a complete set of datasets from a dataset containing missing values, and the missing data in each dataset is filled in using Monte Carlo methods.

Development and validation of the prediction models

Six models were built using LR, ANN, NB, SVM, RF, and DT. These six algorithms are mainstream supervised learning algorithms that can handle both numerical and categorical features and have an established research base in OSA prediction, which makes comparison with other studies straightforward. Users are also relatively familiar with these six models, which favors the adoption of applications and the subsequent customization and upgrading of the models. All models were trained using age, gender, BMI, and mean heart rate during sleep to predict moderate-to-severe OSA (Fig. 1). The included data were randomly divided into a training-test set (80%) and an independent validation set (20%). To avoid overfitting and improve the model, we used ten-fold cross-validation in the training-test set. In this process, the data were divided into ten parts, and each part in turn served as the validation set while the remaining nine served as the training set. The hyperparameters were kept the same across the ten folds, and the average training loss and average validation loss of the 10 models were used to assess them. Once satisfactory hyperparameters were obtained, the entire data set was used as the training set, and a model was trained with those hyperparameters. Finally, the model with the best prediction performance was considered the best model and applied to the independent validation set.
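The splitting scheme described above can be sketched in plain Python. This is an illustration only, not the RapidMiner workflow used in the study: the function names, the random seed, and the `fit` and `loss` callables are hypothetical placeholders for whatever learner and loss function are being assessed.

```python
import random


def split_holdout(n_samples, holdout_fraction=0.2, seed=0):
    """Randomly split sample indices into a training-test set and an
    independent validation (hold-out) set. The seed is arbitrary."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_holdout = round(n_samples * holdout_fraction)
    return indices[n_holdout:], indices[:n_holdout]


def k_fold_indices(indices, k=10):
    """Yield (training, validation) index lists; each of the k parts
    serves as the validation fold exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def cross_validated_loss(indices, fit, loss, k=10):
    """Average validation loss over k folds for one fixed hyperparameter setting.

    `fit(train_idx)` returns a fitted model and `loss(model, val_idx)` returns
    a number; both are supplied by the caller (hypothetical interfaces).
    """
    losses = [loss(fit(train_idx), val_idx)
              for train_idx, val_idx in k_fold_indices(indices, k)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    train_test_idx, validation_idx = split_holdout(571)
    print(len(train_test_idx), len(validation_idx))
```

A hyperparameter search would call `cross_validated_loss` once per candidate setting and keep the setting with the lowest average loss; the hold-out indices are not touched until the final evaluation.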

Hyperparameters were tuned automatically using RapidMiner Studio, an open source Java-based platform for implementing predictive models and data mining algorithms. The hyperparameters of the developed ANN model were set as: hidden_layers = 1; training_cycles = 1000; learning_rate = 0.01; momentum = 0.6; decay = false; shuffle = true; normalize = true; error_epsilon = 1.0E-4; use_local_random_seed = true; local_random_seed = 1992. Hyperparameters for the other models are available in the source code, which can be freely downloaded from GitHub²⁶, as are the data sets used in this study. Supplementary Table S1 of Supplementary Material 1 also provides the hyperparameters of all models.

We evaluated the model performance by calculating the area under the receiver operating characteristic curve (AUROC) for the six models on the independent validation set. In addition, we computed model performance metrics such as accuracy, recall, specificity, precision, and F1-score.
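For reference, the metrics named above follow directly from the confusion matrix, and the AUROC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below is a plain-Python illustration with hypothetical function names and toy data; it is not the evaluation code used in the study.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, recall, specificity, precision and F1 from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    recall = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "recall": recall,
        "specificity": _ratio(tn, tn + fp),
        "precision": precision,
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly;
    ties count as half. Assumes both classes are present."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data for illustration only, not study data.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(classification_metrics(y_true, y_pred))
    print(auroc(y_true, scores))
```

The pairwise AUROC is quadratic in the number of samples, which is acceptable for a validation set of roughly a hundred patients; a rank-based formulation would be preferable for large data sets.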

In this study, calibration curve analysis was performed to assess consistency in terms of the slope, intercept, and Brier score of the calibration curve. The calibration curve groups the continuous predicted probabilities into discrete bins to show whether the predicted probability of the model is close to the true probability. The Brier score is a combined evaluation of model discrimination and calibration and takes a value in the range of 0–1; the closer the score is to 0, the better the model. Decision curve analysis was performed by quantifying the net clinical benefits at different threshold probabilities, and clinical impact curve analysis was performed to determine the clinical usefulness of the prediction model by quantifying the cost-to-benefit ratio at different threshold probabilities.
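Two of the quantities above have compact definitions that a short sketch can make concrete. The Brier score is the mean squared difference between the predicted probability and the observed outcome, and the net benefit plotted in a decision curve is commonly defined as TP/n − (FP/n) × p_t/(1 − p_t) at threshold probability p_t. The code below assumes these standard definitions; the function names and toy data are hypothetical, and the study's own curves were not produced with this code.

```python
def brier_score(y_true, probs):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    Assumes the standard decision-curve definition:
    TP/n - FP/n * threshold / (1 - threshold), with 0 < threshold < 1.
    """
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy in which every patient is treated."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


if __name__ == "__main__":
    # Toy data for illustration only, not study data.
    y_true = [1, 0, 1, 0]
    probs = [0.8, 0.3, 0.6, 0.7]
    print(brier_score(y_true, probs))
    print(net_benefit(y_true, probs, 0.5), net_benefit_treat_all(y_true, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing it with the treat-all strategy and with the treat-none strategy, whose net benefit is zero at every threshold.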

R version 4.2.1 was used for all machine learning model development and validation.

Ethics approval and consent to participate

The Second Affiliated Hospital of Fujian Medical University, the Sleep Medicine Key Laboratory of the University in Fujian Province, and the Sleep Disorder Medicine Center of the Second Affiliated Hospital of Fujian Medical University review committee approved the protocol of this study (2023-133). This study is a retrospective secondary analysis of existing clinical databases, and all subjects involved in the study provided informed consent. In addition, this study was carried out in accordance with the Declaration of Helsinki.

Results

Demographic and clinical characteristics of the participants

A total of 610 participants underwent PSG at the sleep center during the study, 39 of whom were excluded from further analysis as they were less than 18 years old (n = 33) or had a total sleep time less than 3 h (n = 6) (Fig. 1). Ultimately, 571 participants (115 women and 456 men) were included in our study. Among them, 379 participants (66.4%) were diagnosed with moderate-to-severe OSA (AHI ≥ 15), while 192 patients (33.6%) had none-to-mild OSA (AHI < 15).

Relationship between participant characteristics and OSA

Men were found to have a higher rate of moderate-to-severe OSA than women (Table 1). Patients with moderate-to-severe OSA were characterized by older age, higher BMI, lower mean oxygen saturation, and higher mean heart rate during sleep compared with none-to-mild OSA patients (Table 1).

Table 1 Demographic and clinical characteristics of the participants.

Multivariate analysis was applied to investigate factors independently predicting moderate-to-severe OSA after adjusting for age, sex, BMI, and mean heart rate during sleep. The results revealed significant associations between advanced age, male gender, high BMI, high mean heart rate during sleep and an increased risk of moderate-to-severe OSA. Details are shown in Table 2.

Table 2 Unadjusted and adjusted odds ratios identifying independent predictors of OSA.

Model performance

As previously outlined, age, gender, BMI, and mean heart rate during sleep served as input parameters to build multiple machine learning models for predicting moderate-to-severe OSA. The AUROCs of the LR, ANN, NB, SVM, RF, and DT models in the training-test set were 79.0%, 79.1%, 71.0%, 78.8%, 77.3%, and 67.8%, respectively. The AUROCs with the standard deviation of each fold are presented in Supplementary Table S2 of Supplementary Material 1. Following sufficient training, the LR, ANN, NB, SVM, RF, and DT models were applied to the independent validation set. Table 3 summarizes the performance of the six models in the independent validation set. The ANN model exhibited a relatively higher AUROC than the others, with an AUROC, accuracy, recall, specificity, precision, and F1-score of 80.4%, 69.9%, 86.5%, 61.5%, 53.2%, and 65.9%, respectively. This was followed by LR, NB, SVM, RF, and DT, with AUROCs of 80.2%, 79.7%, 79.2%, 78.4%, and 70.4%, respectively. The differences in AUC between the models were statistically significant (P < 0.05). Figure 2A,B show the receiver operating characteristic (ROC) curves and AUROCs for the six models applied to the training-test set and independent validation set, respectively.

Table 3 The performance of the six OSA prediction models based on machine learning.

The calibration curves (Fig. 3) show that the predictions agreed well with the observations, and the LR, ANN, NB, SVM, RF, and DT models gave Brier scores of 0.165, 0.165, 0.182, 0.197, 0.175, and 0.206, respectively. The details are shown in Table 4.

Table 4 Results of calibration tests of the six machine learning models for predicting OSA.

To determine the clinical usefulness of the models, we plotted decision curves and clinical impact curves for the six prediction models. The clinical decision curves (Fig. 4) showed that the OSA thresholds for achieving a greater net benefit than the "no treatment" or "all treatment" scenarios when making clinical decisions using the LR, ANN, NB, SVM, RF, and DT prediction models were 0.92, 0.90, 0.67, 0.65, 0.78, and 0.65, respectively.

The clinical impact curve (Fig. 5) analysis illustrated the clinical effectiveness of the six predictive models. At threshold probabilities greater than 70%, 65%, 75%, 75%, 60%, and 70%, respectively, the populations identified as high risk of moderate-to-severe OSA by the LR, ANN, NB, SVM, RF, and DT models significantly overlapped with those actually experiencing moderate-to-severe OSA, indicating their clinical efficacy.

Combining these performance metrics, the ANN model was identified as the most suitable for this study due to its superior nonlinear modeling, adaptive learning, generalization capability, and flexibility. Figure 6 shows the feature importance of the ANN model.

Comparison with previous research

We compared the effectiveness of the generated ANN model with the results of previous studies. A systematic search of PubMed was conducted to identify OSA prediction models developed since 2017. Four relevant studies^22,23,24,27 were selected for comparative analysis alongside our ANN model. Our study demonstrates better predictive performance, with higher recall and specificity, than most of these previous models (Table 5). The features used in each study are shown in Supplementary Table S3 of Supplementary Material 1.

Table 5 Effectiveness of the ANN model in our study compared to that of models from previous research.

Application of the model

The risk of moderate-to-severe OSA could be estimated by inputting the patient's age, gender, BMI, and mean heart rate during sleep into the model. Further testing and treatment are recommended if the risk is ≥ 50%, and lifestyle improvement and regular follow-up are recommended if the risk is < 50%.
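The recommendation rule above amounts to a single threshold on the predicted risk. The sketch below is a hypothetical illustration of that rule only; the function name and return strings are assumptions, and the trained ANN that produces the risk is not reproduced here.

```python
def screening_recommendation(risk, threshold=0.5):
    """Map a predicted risk of moderate-to-severe OSA (0..1) to the
    recommendation described in the text, using the 50% cutoff by default."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    if risk >= threshold:
        return "further testing and treatment"
    return "lifestyle improvement and regular follow-up"


if __name__ == "__main__":
    for risk in (0.12, 0.50, 0.87):
        print(risk, screening_recommendation(risk))
```

Exposing the cutoff as a parameter leaves room for the alternative threshold probabilities explored in the decision curve and clinical impact curve analyses.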

Deployed as a mobile online preliminary screening tool, the model operates within a cloud-based mobile sleep medicine management platform, which supports data collection across various scenarios including home, primary hospital and sleep medicine center screening via multi-ports accessibility, facilitating accurate outcomes. Details are shown in Fig. 6. In addition, we established a website for patients and clinicians to use the ANN model (https://slumbot.angelong.cn/)²⁸ (Fig. 7).

Discussion

Principal findings

Obstructive sleep apnea (OSA) represents a significant global healthcare burden, yet remains largely undiagnosed and untreated due to limited awareness and high diagnostic costs¹⁰. In this study, we developed six risk prediction models for moderate-to-severe OSA using four parameters: age, gender, BMI, and mean heart rate during sleep. The developed models employing LR, ANN, NB, SVM, RF, and DT exhibited excellent performance in predicting moderate-to-severe OSA in the independent validation set, with AUROCs of 80.2%, 80.4%, 79.7%, 79.2%, 78.4%, and 70.4%, respectively. Calibration curve analysis confirmed the models' reliability, and decision curve analysis and clinical impact curve analysis highlighted their practical clinical utility. Our models offer potential for early clinical diagnosis of adult moderate-to-severe OSA and may mitigate its adverse consequences.

Model selection was primarily based on AUROC and recall metrics. Higher AUROC values indicate better prediction ability. Higher recall can help screen for moderate-severe OSA patients more effectively, which is crucial for early detection and intervention. LR, ANN, NB, SVM, RF, and DT are all models widely used in clinical medicine. The ANN model, performing notably well with high AUROC, recall, specificity, and low Brier score, likely benefits from its ability to capture nonlinear relationships within data structures. An ANN is a computational model based on the function of biological neural networks that can be used for modeling nonlinear statistics and revealing complex relationships between input and output data, and it is well suited to solving problems such as clinical prediction and image recognition^29,30.

A large dataset of adult samples from a hospital was used in our study for accurate comparisons between the none-to-mild OSA group and the moderate-to-severe OSA group. To evaluate whether the included sample size was sufficient to draw reliable conclusions, a post hoc sample size calculation³¹ was conducted using an interactive tool available online (https://riskcalc.org/samplesize/)³². In the final model with 4 predictors, using the C-statistic in conjunction with the expected incidence to approximate the Cox–Snell R-squared, a minimum sample of 343 participants and a minimum of 22.8 events per predictor parameter were required. Thus, the real sample of 571 participants in this study may provide adequate power to ensure the reliability of our results.

While previous research has established that age, sex, BMI, and mean heart rate during sleep are associated with OSA, few studies have integrated these four factors into OSA risk prediction models. Studies have shown that machine learning can enhance the precision of prediction models and promote awareness of potentially latent conditions^33,34,35. Many models have used machine learning to predict OSA. However, most of them incorporate parameters that are too complex to obtain in daily life, such as head measurement images³⁶, sleep questionnaires^23,36, breathing sounds during sleep³⁷, and anthropometric features including neck circumference, waist circumference, and head circumference^23,36,38. Moreover, few prediction models have been put into real clinical application.

Clinical implications

In clinical practice, our prediction model effectively stratifies patients into none-to-mild OSA and moderate-to-severe OSA. Patients with moderate-to-severe OSA show more serious clinical symptoms such as drowsiness, headache, and snoring, which have a greater impact on daily life and are more strongly associated with comorbidities such as cardiovascular and cerebrovascular diseases, so early screening is more clinically significant in these patients. The modeling parameters included in our study are simple and easy to obtain, and the generated decision curves and clinical impact curves indicate the high clinical usefulness of the model. Moreover, the model developed in this study is deployed on a cloud-based mobile sleep medicine management platform for clinical application, and it has shown distinct advantages in several clinical application scenarios. For residents screening at home, it can reduce the time spent on hospital appointments and queues, save costs, and increase acceptance and cooperation because initial screening is conducted in the familiar home environment. For primary hospital screening, the utilization rate of sleep monitoring equipment can be improved, and medical resources and data can be exchanged with tertiary hospitals, so as to provide more sleep medical services to patients. For sleep center screening, the economic burden on patients can be reduced; the use of IoT makes the consultation process simpler; the initial screening results are presented in real time; and a personalized treatment or referral plan can be formulated with the advice of sleep physicians. The OSA prediction model provides an effective tool for moderate-to-severe OSA prediction or referral under limited clinical conditions. In addition, this model can be used in public education about OSA to improve people's understanding of OSA and its hazards.

Limitations and future work

While our model demonstrates good performance, there are several potential limitations that merit attention and future investigation.

First, our study used only a single-center dataset exclusively comprising individuals of an Asian ethnicity. Therefore, we will further validate the developed model using an external dataset, exploring its ability to be generalized to other institutions or ethnic groups.

Second, due to the COVID-19 epidemic, fewer patients underwent full-night polysomnography during our study, resulting in a limited sample size. Moving forward, the model will be continuously trained and optimized using additional data from the Sleep Medicine Center. Meanwhile, transfer learning techniques will be explored to enhance training efficiency and generalize the model's applicability while mitigating overfitting.

Third, the participants in this study, all of whom underwent PSG at the sleep medicine center and were generally considered likely to have OSA or other sleep disorders, had a 66.4% prevalence of moderate–severe OSA, which is higher than the prevalence reported in the general public^6,7,39. This may limit the model's representativeness of the broader community.

Additionally, our retrospective study did not exclude individuals at high risk for undiagnosed OSA-related conditions such as cardiovascular disease, metabolic disease, and neurological disease. Such comorbidities may confound the model's predictions, suggesting a potential bias towards individuals at higher risk for OSA-related complications. Taken together, this indicates that our model may be more suitable for detecting people at high risk of OSA, and collinearity problems cannot be ruled out.

Fourth, the performance of our model falls within a mid-range spectrum in comparison with previous studies, which may be related to differences in the sample demographics and outcome criteria. For example, the relatively young age of our study cohort implies a lower prevalence of moderate-to-severe OSA compared to older populations⁴⁰, posing challenges for accurate prediction models. Future efforts will focus on diversifying patient data across different age groups to achieve a balanced dataset, ensuring robust prediction performance across diverse population segments.

Fifth, in aiming for broad applicability from home settings to primary care facilities and specialized sleep centers, our model incorporated only four readily available predictors. This selection may compromise the model's specificity in favor of simplicity. In future work, multidimensional indicators including physical examination data and laboratory indicators will be incorporated to construct a more accurate risk stratification model suitable for medical settings.

Sixth, traditional questionnaire-based assessments, requiring users to input data sequentially, may diminish user engagement. Future studies may explore AI-driven medical chatbots to enhance user interactivity. The Internet of Things and wearable technology can also be applied to collect and analyze OSA-related data in real time.

Finally, as the number of individuals screened for OSA continues to grow, there will be increasing demands for the stability, accuracy, real-time performance, and scalability of the algorithms used. To address the computational challenges posed by large-scale datasets, it may be necessary to adopt parallel computing and distributed processing techniques. Additionally, optimizing model complexity and hyperparameter tuning will be ongoing priorities to align with the computational resources available in medical institutions.

Conclusions

In conclusion, our study has highlighted the underappreciated impact and disease burden associated with moderate-to-severe obstructive sleep apnea (OSA). We addressed this gap by developing six streamlined risk prediction models utilizing logistic regression (LR), artificial neural network (ANN), naïve Bayes (NB), support vector machine (SVM), random forest (RF), and decision tree (DT).

Among these models, the artificial neural network (ANN) model demonstrated significant potential for practical implementation within a cloud-based mobile sleep medicine management platform. Particularly noteworthy is the identification and integration of four key predictors (age, gender, BMI, and mean heart rate during sleep) into the modeling process. Using only these predictors, the ANN model identified moderate-to-severe OSA with an AUROC of 80.4% in the independent validation set.

Our findings present a convenient and effective tool for the early detection of OSA that leverages readily available data, and they demonstrate the practical application of an OSA screening model in clinical scenarios. Future research could explore combinations of predictors that are more sensitive and specific, thereby enhancing overall model performance.

Data availability

The source code and the datasets used and/or analyzed during the current study are available on GitHub (https://github.com/Zhangyixiang2023/slumbot [accessed 2023-08-17]).

Abbreviations

BMI: Body mass index
OSA: Obstructive sleep apnea
PSG: Polysomnography
AHI: Apnea–hypopnea index
AUROC: Area under the receiver operating curve
ROC: Receiver operating characteristics
ML: Machine learning
ANN: Artificial neural network
DT: Decision tree
LR: Logistic regression
NB: Naïve Bayes
RF: Random forest
SVM: Support vector machine
XGB: XGBoost
EEG: Electroencephalogram
ECG: Electrocardiogram
EMG: Electromyogram
EDR: ECG-derived respiration
AASM: American Academy of Sleep Medicine

References

1. Patel, S. R. Obstructive sleep apnea. Ann. Intern. Med. 171(11), ITC81–ITC96. https://doi.org/10.7326/AITC201912030 (2019).
2. Jordan, A. S., McSharry, D. G. & Malhotra, A. Adult obstructive sleep apnoea. Lancet. 383(9918), 736–747. https://doi.org/10.1016/S0140-6736(13)60734-5 (2014).
3. Vanek, J. et al. Obstructive sleep apnea, depression and cognitive impairment. Sleep Med. 72, 50–58. https://doi.org/10.1016/j.sleep.2020.03.017 (2020) (Epub 2020 Mar 23).
4. Sánchez-de-la-Torre, M., Campos-Rodriguez, F. & Barbé, F. Obstructive sleep apnoea and cardiovascular disease. Lancet Respir. Med. 1(1), 61–72. https://doi.org/10.1016/S2213-2600(12)70051-6 (2013) (Epub 2012 Nov 6).
5. Lam, J. C., Mak, J. C. & Ip, M. S. Obesity, obstructive sleep apnoea and metabolic syndrome. Respirology. 17(2), 223–236. https://doi.org/10.1111/j.1440-1843.2011.02081.x (2012).
6. Benjafield, A. V. et al. Estimation of the global prevalence and burden of obstructive sleep apnoea: A literature-based analysis. Lancet Respir. Med. 7(8), 687–698. https://doi.org/10.1016/S2213-2600(19)30198-5 (2019) (Epub 2019 Jul 9).
7. Senaratna, C. V. et al. Prevalence of obstructive sleep apnea in the general population: A systematic review. Sleep Med. Rev. 34, 70–81. https://doi.org/10.1016/j.smrv.2016.07.002 (2017) (Epub 2016 Jul 18).
8. Kapur, V. K. et al. Clinical practice guideline for diagnostic testing for adult obstructive sleep apnea: An American academy of sleep medicine clinical practice guideline. J. Clin. Sleep Med. 13(3), 479–504. https://doi.org/10.5664/jcsm.6506 (2017).
9. Fleetham, J. et al. Canadian Thoracic Society 2011 guideline update: Diagnosis and treatment of sleep disordered breathing. Can. Respir. J. 18(1), 25–47. https://doi.org/10.1155/2011/506189 (2011).
10. Knauert, M., Naik, S., Gillespie, M. B. & Kryger, M. Clinical consequences and economic costs of untreated obstructive sleep apnea syndrome. World J. Otorhinolaryngol. Head Neck Surg. 1(1), 17–27. https://doi.org/10.1016/j.wjorl.2015.08.001 (2015).
11. Young, T. et al. Burden of sleep apnea: Rationale, design, and major findings of the Wisconsin Sleep Cohort study. WMJ. 108(5), 246–249 (2009).
12. Rubinger, L., Gazendam, A., Ekhtiari, S. & Bhandari, M. Machine learning and artificial intelligence in research and healthcare. Injury. https://doi.org/10.1016/j.injury.2022.01.046 (2022) (Epub ahead of print).
13. Gabbay, I. E. & Lavie, P. Age- and gender-related characteristics of obstructive sleep apnea. Sleep Breath. 16(2), 453–460. https://doi.org/10.1007/s11325-011-0523-z (2012) (Epub 2011 Apr 16).
14. Tom, C. et al. Correlations between waist and neck circumferences and obstructive sleep apnea characteristics. Sleep Vigil. 2(2), 111–118. https://doi.org/10.1007/s41782-018-0041-1 (2018) (Epub 2018 May 15).
15. Ferreira-Santos, D., Amorim, P., Silva Martins, T., Monteiro-Soares, M. & Pereira, R. P. Enabling early obstructive sleep apnea diagnosis with machine learning: Systematic review. J. Med. Internet. Res. 24(9), e39452. https://doi.org/10.2196/39452 (2022).
16. Kawano, Y., Tamura, A., Watanabe, T. & Kadota, J. Influence of the severity of obstructive sleep apnea on heart rate. J. Cardiol. 56(1), 27–34. https://doi.org/10.1016/j.jjcc.2010.01.007 (2010) (Epub 2010 Mar 6).
17. Punjabi, N. M. The epidemiology of adult obstructive sleep apnea. Proc. Am. Thorac. Soc. 5(2), 136–143. https://doi.org/10.1513/pats.200709-155MG (2008).
18. Zhang, Z. et al. Gender differences in clinical manifestations and polysomnographic findings in Chinese patients with obstructive sleep apnea. Sleep Breath. 24(3), 1019–1026. https://doi.org/10.1007/s11325-019-01943-y (2020) (Epub 2019 Oct 23).
19. Pillar, G. & Lavie, P. Obstructive sleep apnea: Diagnosis, risk factors, and pathophysiology. Handb. Clin. Neurol. 98, 383–399. https://doi.org/10.1016/B978-0-444-52006-7.00025-3 (2011).
20. Ucak, S., Dissanayake, H. U., Sutherland, K., de Chazal, P. & Cistulli, P. A. Heart rate variability and obstructive sleep apnea: Current perspectives and novel technologies. J. Sleep Res. 30(4), e13274. https://doi.org/10.1111/jsr.13274 (2021) (Epub 2021 Jan 18).
21. Qin, H. et al. The different facets of heart rate variability in obstructive sleep apnea. Front. Psychiatry. 22(12), 642333. https://doi.org/10.3389/fpsyt.2021.642333 (2021).
22. Kim, Y. J., Jeon, J. S., Cho, S. E., Kim, K. G. & Kang, S. G. Prediction models for obstructive sleep apnea in Korean adults using machine learning techniques. Diagnostics (Basel). 11(4), 612. https://doi.org/10.3390/diagnostics11040612 (2021).
23. Huang, W. C., Lee, P. L., Liu, Y. T., Chiang, A. A. & Lai, F. Support vector machine prediction of obstructive sleep apnea in a large-scale Chinese clinical sample. Sleep. 43(7), zsz295. https://doi.org/10.1093/sleep/zsz295 (2020).
24. Kuan, Y. C., Hong, C. T., Chen, P. C., Liu, W. T. & Chung, C. C. Logistic regression and artificial neural network-based simple predicting models for obstructive sleep apnea by age, sex, and body mass index. Math. Biosci. Eng. 19(11), 11409–11421. https://doi.org/10.3934/mbe.2022532 (2022).
25. Berry, R. B. et al. Rules for scoring respiratory events in sleep: Update of the 2007 AASM manual for the scoring of sleep and associated events deliberations of the sleep apnea definitions task force of the American Academy of Sleep Medicine. J. Clin. Sleep Med. 8(5), 597–619. https://doi.org/10.5664/jcsm.2172 (2012).
26. slumbot. GitHub. https://github.com/Zhangyixiang2023/slumbot (accessed 17 Aug 2023).
27. Keshavarz, Z., Rezaee, R., Nasiri, M. & Pournik, O. Obstructive sleep apnea: A prediction model using supervised machine learning method. Stud. Health Technol. Inform. 26(272), 387–390. https://doi.org/10.3233/SHTI200576 (2020).
28. A simple tool for predicting moderate-to-severe OSA of adults. https://slumbot.angelong.cn/ (accessed 19 Mar 2023).
29. Azimi, P. et al. Artificial neural networks in neurosurgery. J. Neurol. Neurosurg. Psychiatry. 86(3), 251–256. https://doi.org/10.1136/jnnp-2014-307807 (2015) (Epub 2014 Jul 1).
30. Azimi, P. & Mohammadi, H. R. Predicting endoscopic third ventriculostomy success in childhood hydrocephalus: An artificial neural network analysis. J. Neurosurg. Pediatr. 13(4), 426–432. https://doi.org/10.3171/2013.12.PEDS13423 (2014) (Epub 2014 Jan 31).
31. Collins, G. S., Reitsma, J. B., Altman, D. G. & Moons, K. G. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement. BMJ. 7(350), g7594. https://doi.org/10.1136/bmj.g7594 (2015).
32. Sample Size Calculator. https://riskcalc.org/samplesize/ (accessed Jul 2020).
33. Handelman, G. S. et al. eDoctor: Machine learning and the future of medicine. J. Intern. Med. 284(6), 603–619. https://doi.org/10.1111/joim.12822 (2018) (Epub 2018 Sep 3).
34. Obermeyer, Z. & Emanuel, E. J. Predicting the future—Big data, machine learning, and clinical medicine. N. Engl. J. Med. 375(13), 1216–1219. https://doi.org/10.1056/NEJMp1606181 (2016).
35. Rajkomar, A., Dean, J. & Kohane, I. Machine learning in medicine. N. Engl. J. Med. 380(14), 1347–1358. https://doi.org/10.1056/NEJMra1814259 (2019).
36. Skotko, B. G. et al. A predictive model for obstructive sleep apnea and Down syndrome. Am. J. Med. Genet. A. 173(4), 889–896. https://doi.org/10.1002/ajmg.a.38137 (2017) (Epub 2017 Jan 26).
37. Kim, J. W. et al. Prediction of obstructive sleep apnea based on respiratory sounds recorded between sleep onset and sleep offset. Clin. Exp. Otorhinolaryngol. 12(1), 72–78. https://doi.org/10.21053/ceo.2018.00388 (2019) (Epub 2018 Sep 8).
38. Liu, W. T. et al. Prediction of the severity of obstructive sleep apnea by anthropometric features via support vector machine. PLoS One. 12(5), e0176991. https://doi.org/10.1371/journal.pone.0176991 (2017).
39. Peppard, P. E. et al. Increased prevalence of sleep-disordered breathing in adults. Am. J. Epidemiol. 177(9), 1006–1014. https://doi.org/10.1093/aje/kws342 (2013) (Epub 2013 Apr 14).
40. Chang, J. L. et al. International consensus statement on obstructive sleep apnea. Int. Forum Allergy Rhinol. https://doi.org/10.1002/alr.23079 (2022).

Funding

We thank the Fujian Provincial Health and Family Planning Commission, China (Grant No. 2021GG01010048); the Natural Science Foundation of Fujian Province, China (Grant No. 2022J01792); and the Quanzhou Science and Technology Project, China (Grant No. 2021C061R). This article was also subsidized by the Fujian Provincial University students’ innovation and entrepreneurship training program project (Grant No. S202210392005S).

Author information

These authors contributed equally: Rongrong Dai, Kang Yang and Jiajing Zhuang.

Authors and Affiliations

The Sleep Disorder Medicine Center of the Second Affiliated Hospital of Fujian Medical University, Quanzhou, 362000, Fujian, China
Rongrong Dai, Xiaoyang Chen, Jimin Fan & Yixiang Zhang
The School of Public Health, Fujian Medical University, Fuzhou, 350108, Fujian, China
Rongrong Dai, Yiming Hu, Qingquan Chen & Xi Zhu
The School of Clinical Medicine, Fujian Medical University, Fuzhou, 350108, Fujian, China
Kang Yang, Jiajing Zhuang, Ling Yao & Huaxian Zheng
National Center for Chronic and Noncommunicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
Yiming Hu
The Second Affiliated Hospital of Fujian Medical University, Quanzhou, 362000, Fujian, China
Jianfeng Ke, Chunmei Fan, Xiaoyang Chen, Jimin Fan & Yixiang Zhang
Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510030, Guangdong, China
Yifu Zeng
Department of Neurosurgery, National Clinical Research Center of Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
Kang Yang

Contributions

Conceptualization, Y.Z. and J.F.; methodology, X.C. and C.F.; writing—original draft preparation, K.Y.; writing—review and editing, R.D. and J.Z.; validation, L.Y., Y.H. and H.Z.; formal analysis, X.Z. and Q.C.; data curation, J.K.; platform development, Y.Z. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Yixiang Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Tables.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Dai, R., Yang, K., Zhuang, J. et al. Enhanced machine learning approaches for OSA patient screening: model development and validation study. Sci Rep 14, 19756 (2024). https://doi.org/10.1038/s41598-024-70647-5

Received: 01 December 2023
Accepted: 20 August 2024
Published: 26 August 2024
DOI: https://doi.org/10.1038/s41598-024-70647-5
