Obstructive sleep apnea (OSA) is a disorder characterized by repeated partial or complete obstruction of the upper airway during sleep, which causes a series of clinical manifestations1,2. One-third or more of patients with essential hypertension have an apnea-hypopnea index (AHI) greater than or equal to 15 times/hour3,4. Drivers with untreated OSA have a significantly increased risk of car crashes5. An ageing population increases the burden of disease. A study using a Markov computer simulation model found that as China’s population grows and ages, annual cardiovascular disease (CVD) events will increase by more than 50% between 2010 and 20306. Many studies have revealed an association between sleep problems and CVD7,8. However, most previous studies have focused on the relationship between sleep duration and CVD, ignoring the many other aspects of sleep problems in older adults. Therefore, the early identification and treatment of elderly patients with suspected OSA are vitally important.

The gold standard for diagnosing OSA is nocturnal polysomnography (PSG). However, PSG is time-consuming and expensive, and only a small number of hospitals can perform it, so a large number of OSA patients are not diagnosed and treated in time9. Therefore, there is an urgent need for screening tools for the rapid assessment of OSA. Screening questionnaires such as the NoSAS Score, the Epworth Sleepiness Scale (ESS), the GOAL Questionnaire, the STOP-Bang Questionnaire and the Berlin Questionnaire have greatly facilitated OSA screening9,10,11,12,13. The GOAL Questionnaire is a recently developed OSA screening tool that has shown sensitivity and specificity similar to those of other screening scales in the Brazilian population9. These tools are currently used to screen for OSA, but they differ in sensitivity, specificity, predictive values, area under the receiver operating characteristic (ROC) curve and diagnostic odds ratio (DOR). However, few studies have validated these questionnaires for OSA screening in the elderly population. Therefore, we compared the performance of these five screening tools in elderly patients with suspected OSA to find the most suitable scale.

Method

From January 2012 to June 2017, subjects aged 60 and above who received PSG monitoring at the Sleep Center of the First Affiliated Hospital of Guangzhou Medical University were retrospectively included. The following questionnaire items were collected: (1) The GOAL Questionnaire includes four items, namely male gender, body mass index (BMI) > 30 kg/m², age > 50 and loud snoring, each answered with “yes” or “no”, with 1 point for “yes” and 0 points for “no”; a total score > 2 points indicates a high risk of OSA. (2) The NoSAS Score includes neck circumference, BMI, snoring history, age and gender, with a total score of 0 to 17 points: neck circumference > 40 cm scores 4 points, 25 ≤ BMI < 30 scores 3 points, BMI > 30 scores 5 points, snoring scores 2 points, age > 55 scores 4 points and male gender scores 2 points; a NoSAS Score > 8 indicates a high risk of OSA. (3) The STOP-Bang Questionnaire includes eight items, namely snoring, fatigue, observed apnea, hypertension, BMI > 35 kg/m², age > 50 years, neck circumference > 40 cm and male gender, each answered with “yes” or “no”, with 1 point for “yes” and 0 points for “no”; a total score > 3 indicates a high risk of OSA. (4) The ESS includes eight questions asking subjects to rate their degree of daytime sleepiness in a variety of settings, with 0 for no sleepiness and 1, 2 and 3 for light, moderate and heavy sleepiness, respectively; the maximum total score is 24, and an ESS score > indicates a high risk of OSA. (5) The Berlin Questionnaire includes 11 questions in three categories (severity of snoring, daytime sleepiness, and high blood pressure or obesity), each of which is rated as negative or positive after its score is calculated; if two or more of the three categories are positive, the patient is considered to be at high risk for OSA.
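The item-based scoring rules for GOAL, NoSAS and STOP-Bang can be sketched in a few lines of Python. This is a minimal illustration of the rules as worded in this section, not the scoring software used in the study: the `Patient` record and all function names are hypothetical, the cut-offs are applied as strictly greater than, and the NoSAS 3-point BMI band is read as 25 up to 30. The ESS and the Berlin Questionnaire are left out because their full item-level scoring is not given here.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical record of the questionnaire inputs named in the text."""
    male: bool
    age: float                    # years
    bmi: float                    # kg/m^2
    neck_cm: float                # neck circumference in cm
    snoring: bool                 # loud snoring / snoring history
    tired: bool = False           # daytime fatigue (STOP-Bang item)
    observed_apnea: bool = False  # witnessed apnea (STOP-Bang item)
    hypertension: bool = False    # STOP-Bang item


def goal_score(p: Patient) -> int:
    """GOAL: 1 point each for male, BMI > 30, age > 50, loud snoring."""
    return sum([p.male, p.bmi > 30, p.age > 50, p.snoring])


def nosas_score(p: Patient) -> int:
    """NoSAS: weighted items, total 0 to 17."""
    score = 0
    if p.neck_cm > 40:
        score += 4
    if p.bmi > 30:
        score += 5
    elif p.bmi >= 25:
        # Assumption: the 3-point band covers BMI from 25 up to 30.
        score += 3
    if p.snoring:
        score += 2
    if p.age > 55:
        score += 4
    if p.male:
        score += 2
    return score


def stop_bang_score(p: Patient) -> int:
    """STOP-Bang: 1 point for each of the eight yes/no items."""
    return sum([p.snoring, p.tired, p.observed_apnea, p.hypertension,
                p.bmi > 35, p.age > 50, p.neck_cm > 40, p.male])


def high_risk(p: Patient) -> dict:
    """Apply the cut-offs as written in the text (strictly greater than)."""
    return {
        "GOAL": goal_score(p) > 2,
        "NoSAS": nosas_score(p) > 8,
        "STOP-Bang": stop_bang_score(p) > 3,
    }
```

Because every subject in this cohort is at least 60 years old, the age items always score, so GOAL and STOP-Bang each start from 1 point and NoSAS from 4 points before any other item is considered.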

This study was conducted in accordance with the Declaration of Helsinki and was approved by the Medical Ethics Committee of the First Affiliated Hospital of Guangzhou Medical University (Ethics No.: 201705). All participants gave written informed consent before participating in the study and eligible patients were selected according to the inclusion and exclusion criteria. Inclusion criteria: (1) patients who came to the sleep breathing center for PSG monitoring due to complaints of snoring, drowsiness, hypertension or apnea; (2) subjects aged 60 and above; (3) patients with autonomous behavior ability and cognitive ability who completed the five scales in the sleep laboratory and agreed to sign their informed consent; (4) patients with a total sleep time of > 4 h. Exclusion criteria: (1) patients with coronary heart disease, diabetes, kidney disease, chronic lung disease or cerebrovascular disease; (2) patients with a history of brain tumor or epilepsy; (3) patients with various mental and psychological diseases; (4) patients with severe organ failure; (5) OSA patients who had received treatment; (6) patients who submitted incomplete answers to the scales; (7) patients with sleep apnea hypopnea syndrome dominated by central or mixed events.

Basic data collection: in our study, we collected each patient’s name, age, sex, height, weight, neck circumference, waist circumference, blood pressure and other general data. The items in the ESS, GOAL, NoSAS, STOP-Bang and Berlin scales were then collected, and each item was verified by a sleep technician to ensure reliability.

PSG monitoring was mainly used to diagnose sleep-disordered breathing. Recorded signals included electroencephalography, electrooculography, mandibular electromyography (EMG), oronasal airflow and respiratory motion, electrocardiography (ECG), blood oxygen saturation, snoring, limb movement and body position. We used an Alice 5 PSG system from Philips Respironics for continuous synchronous recording for at least seven hours. The raw data were analyzed automatically, then manually reviewed and corrected, and finally interpreted by a trained sleep physician. Technicians did not modify participants' questionnaire responses. The AHI was defined as the number of apnea and hypopnea events per hour of sleep. OSA severity was classified according to AHI: normal group (AHI < 5 times/hour), mild OSA group (5 ≤ AHI < 15 times/hour), moderate OSA group (15 ≤ AHI < 30 times/hour) and severe OSA group (AHI ≥ 30 times/hour)14.
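The AHI definition and the severity grouping can be written as a short sketch. The function names are hypothetical, and the bands are read as half-open intervals (normal below 5, mild from 5 to below 15, moderate from 15 to below 30, severe from 30 upward); the handling of values that fall exactly on a boundary is an assumption of this sketch.

```python
def ahi(events: int, sleep_hours: float) -> float:
    """Apnea plus hypopnea events per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return events / sleep_hours


def osa_severity(ahi_value: float) -> str:
    """Map an AHI value (events/hour) to the four study groups."""
    if ahi_value < 0:
        raise ValueError("AHI cannot be negative")
    if ahi_value < 5:
        return "normal"
    if ahi_value < 15:
        return "mild"
    if ahi_value < 30:
        # Assumption: boundary values belong to the higher band.
        return "moderate"
    return "severe"
```

For example, 90 scored events over 6 hours of sleep give an AHI of 15, which this sketch places in the moderate group.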

Statistical processing

SPSS 23.0 statistical software was used for analysis. Measurement data were expressed as mean ± standard deviation and count data as frequencies. One-way ANOVA was used for measurement data and the χ² test for count data. ROC curve analysis was performed using MedCalc software to evaluate the ROC curve of each scale. The sensitivity, specificity, positive predictive value and negative predictive value of the five scales were calculated from 2 × 2 contingency tables and reported with their respective 95% confidence intervals (CI) to evaluate the diagnostic value of the five screening scales for OSA. P < 0.05 was considered statistically significant.
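As an illustration of how these indices follow from a 2 × 2 table, the sketch below computes sensitivity, specificity, predictive values and the DOR from the four cell counts. The function names are hypothetical, and the normal-approximation (Wald) interval is an assumption, since the text does not state which confidence interval method was used. The counts in the usage note are invented for illustration and are not study data.

```python
import math


def proportion_ci(successes: int, total: int, z: float = 1.96):
    """Proportion with a Wald (normal-approximation) confidence interval."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Screening indices from a 2 x 2 table (questionnaire vs. PSG)."""
    # The DOR is undefined when fp or fn is zero; returned as infinity here.
    dor = (tp * tn) / (fp * fn) if fp and fn else math.inf
    return {
        "sensitivity": proportion_ci(tp, tp + fn),  # positives among diseased
        "specificity": proportion_ci(tn, tn + fp),  # negatives among healthy
        "ppv": proportion_ci(tp, tp + fp),          # diseased among positives
        "npv": proportion_ci(tn, tn + fn),          # healthy among negatives
        "dor": dor,
    }
```

With hypothetical counts `diagnostic_metrics(tp=80, fp=20, fn=20, tn=80)`, each of the four proportions is 0.8 and the DOR is (80 × 80) / (20 × 20) = 16.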

Results

Basic information

Among the 273 subjects included in this study, there were no statistically significant differences in age or gender between groups (P > 0.05). Minimum nocturnal oxygen saturation, neck circumference, AHI and waist circumference all increased with the severity of OSA. The GOAL, STOP-Bang and NoSAS Scores differed significantly between the normal or mild OSA group and the severe OSA group, while there were no significant differences between the other groups. The Berlin Questionnaire score did not differ significantly between the mild OSA group and the moderate-to-severe OSA group, while the comparisons between the other groups were statistically significant. The ESS score differed significantly between the severe OSA group and the other groups (see Table 1 for details).

Table 1 Basic data of patients in each group and scores of the five scales.

Predictive value of the five scales

Using AHI values of 5, 15 and 30 times/hour as the cut-off points, the areas under the ROC curve (AUC) of the five scale scores were compared (Figs. 1, 2 and 3). Regardless of the cut-off point, the AUC of the Berlin Questionnaire was higher than that of the other scales, and the AUC of all five scales increased with OSA severity. At the AHI cut-off of 5 times/hour, the AUC of the Berlin Questionnaire was the highest at 0.670 (0.611–0.725), and the AUC of the NoSAS Score was the lowest at 0.594 (0.533–0.653). The AUCs of GOAL, STOP-Bang and ESS were 0.599 (0.538–0.658), 0.616 (0.555–0.674) and 0.597 (0.537–0.656), respectively (Fig. 1).
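The AUC at a given AHI cut-off can be understood as the probability that a randomly chosen patient at or above the cut-off has a higher questionnaire score than a randomly chosen patient below it, with ties counted as one half. The sketch below implements that rank-based definition directly. It is an illustration only, not the MedCalc procedure used in the study; the function names are hypothetical, and treating an AHI equal to the cut-off as positive is an assumption.

```python
def auc(pos_scores, neg_scores) -> float:
    """Rank-based AUC: P(positive score > negative score), ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def auc_at_cutoff(scores, ahis, cutoff: float) -> float:
    """Dichotomize patients by AHI cut-off, then compute the AUC of the score."""
    # Assumption: an AHI equal to the cut-off counts as positive.
    pos = [s for s, a in zip(scores, ahis) if a >= cutoff]
    neg = [s for s, a in zip(scores, ahis) if a < cutoff]
    return auc(pos, neg)
```

Calling `auc_at_cutoff` with the same scores and cut-offs of 5, 15 and 30 regroups the patients each time, which is why a single questionnaire yields three different AUC values.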

Fig. 1 ROC curve with AHI 5 as cut-off point.

Fig. 2 ROC curve with AHI 15 as cut-off point.

Fig. 3 ROC curve with AHI 30 as cut-off point.

Predictors of the five scales

With increasing severity of OSA, the sensitivity and negative predictive value of the five scales increased, while their specificity and positive predictive value decreased. The GOAL Questionnaire had the highest sensitivity but the lowest specificity, whereas the ESS had the highest specificity but the lowest sensitivity. At the AHI cut-off of 5 times/hour, the specificity of the ESS was the highest at 0.743 (0.644–0.843). At the AHI cut-offs of 5 and 15 times/hour, the GOAL Questionnaire had the highest DOR, followed by the Berlin Questionnaire. However, at the AHI cut-off of 30 times/hour, the Berlin Questionnaire had the highest DOR (see Tables 2, 3 and 4 for details). Moreover, at the AHI cut-off of 5 times/hour, the Berlin Questionnaire had a sensitivity of 0.653 (0.587–0.719) and a specificity of 0.608 (0.497–0.719), which is an acceptable level for a screening tool.

Table 2 AHI 5 as cut-off point for diagnosis of OSA.
Table 3 AHI 15 as cut-off point for diagnosis of OSA.
Table 4 AHI 30 as cut-off point for diagnosis of OSA.

Discussion

The results of our study show that the Berlin Questionnaire has the highest AUC at every cut-off point, with a value of 0.670 (95% CI: 0.611–0.725) at an AHI cut-off of 5 events per hour. Its sensitivity and specificity are 0.653 and 0.608, 0.699 and 0.533, and 0.803 and 0.503 at AHI cut-offs of 5, 15 and 30 events per hour, respectively. The GOAL Questionnaire demonstrates the highest diagnostic odds ratio (DOR) at AHI cut-off points of 5 and 15 events per hour, while the Berlin Questionnaire has the highest DOR at an AHI cut-off point of 30 events per hour. Among the five scales, GOAL, NoSAS and STOP-Bang exhibit high sensitivity, but their specificity is very low, which could lead to a significant number of false positives. By contrast, the Berlin Questionnaire has the largest area under the ROC curve (close to 0.7), and its sensitivity and specificity are both above 0.6 at the AHI cut-off of 5 events per hour, suggesting that it may offer higher efficacy in elderly patients than the other questionnaires. However, the Berlin Questionnaire’s positive predictive value (PPV) for severe OSA is 0.384, indicating that fewer than 40% of those who screen positive actually have severe OSA. Comprehensive use of these five screening questionnaires for suspected OSA in elderly patients aged 60 years and older is therefore valuable and worth promoting among the elderly population.
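The low PPV for severe OSA follows from Bayes' rule: for fixed sensitivity and specificity, the PPV falls as the condition becomes less common among those screened. The sketch below makes that dependence explicit. The function name is hypothetical, and the prevalence values in the loop are illustrative assumptions rather than study data; only the sensitivity (0.803) and specificity (0.503) are taken from the text.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float):
    """Return (PPV, NPV) for a given prevalence via Bayes' rule."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be strictly between 0 and 1")
    tp = sensitivity * prevalence              # true positive fraction
    fn = (1 - sensitivity) * prevalence        # false negative fraction
    tn = specificity * (1 - prevalence)        # true negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false positive fraction
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("predictive value undefined for these inputs")
    return tp / (tp + fp), tn / (tn + fn)


# Berlin Questionnaire at the AHI cut-off of 30 (values from the text),
# evaluated at hypothetical prevalences of severe OSA.
for prev in (0.10, 0.20, 0.30):
    ppv, npv = predictive_values(0.803, 0.503, prev)
    print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

The same calculation shows why the NPV moves in the opposite direction: when severe OSA is uncommon, a negative screen is more reassuring even though a positive screen is less informative.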

The prevalence of OSA and severe OSA is generally higher in older than in younger adults15. In our study, 199 of the 273 elderly patients with suspected OSA were diagnosed with OSA, and the proportion of males was much higher than that of females, which is consistent with the results of previous epidemiological studies16. It has been reported that age, neck circumference, waist circumference and BMI all affect the severity of OSA17. In addition, studies have shown that neck circumference, waist circumference and BMI are also risk factors for hypertension. As people grow older, sleep disorders and CVD become more common18,19. In population-based studies of sleep disorders in elderly patients, the prevalence of OSA can reach 25–46%20,21,22. The higher prevalence of OSA with age may also be associated with factors unrelated to weight, including changes in the morphology and function of the skeletal muscle of the pharynx23. Although fat deposition is associated with OSA in the general population, indirect measures suggest that reduced lean body mass plays an important role in the development of OSA in elderly patients24. Age is a high-risk item in the screening questionnaires and affects the identification and severity distribution of elderly OSA patients, but the exact mechanism by which age increases the risk of severe OSA in the elderly population is not fully understood25,26. An anatomically small pharynx and impaired pharyngeal anatomy are key factors in the occurrence of upper airway obstruction during sleep27. In addition to anatomical features, lung and muscle-related mechanisms play an important role in OSA and have been shown to vary in the elderly population28, which could explain a higher incidence of OSA independent of obesity status in this age group. A population-based study in Europe examined the prevalence of OSA in different age and gender groups. The continued increase in OSA with age challenges the current theory that OSA and cardiovascular comorbidities may influence the prevalence of OSA in older age29. OSA increases the risk of hypertension in a dose-dependent manner30. OSA may play a role in the sympathetic activation and increased oxidative stress associated with the pathogenesis of hypertension31. Therefore, the increased prevalence of OSA in elderly patients leads to a series of complications, impairs their quality of life and increases the social burden. Our results show an increased frequency of hypertension in older adults with OSA, suggesting that a similar mechanism may be present in older adults. Because our study excluded patients with other underlying diseases, it may have underestimated the prevalence of OSA. If elderly patients with other underlying diseases were included, the prevalence of OSA might be higher and the social burden heavier. Therefore, using a simple questionnaire to screen for OSA, followed by timely treatment, may improve the long-term prognosis of patients and reduce the social burden in the long run.

The GOAL, STOP-Bang and NoSAS scores show significant differences between the normal or mild OSA group and the severe OSA group. This suggests that these questionnaires may have some value in distinguishing normal or mild OSA from severe OSA. The Berlin Questionnaire does not show significant differences between the mild OSA group and the moderate-to-severe OSA group, but it does show statistically significant differences in comparisons between other groups. This indicates that the Berlin Questionnaire may have limitations in differentiating between certain degrees of OSA but can still be useful in other group comparisons. The ESS score only shows statistical significance in the comparison between the severe OSA group and other groups, suggesting that the ESS may be more sensitive for identifying severe OSA. Therefore, these screening questionnaires should be used in combination when screening elderly patients for OSA to maximize their value. Additionally, other symptoms and signs may need to be incorporated into the screening process when necessary. The NoSAS Score is a tool newly developed in a Swiss cohort and subsequently validated in a Brazilian cohort9. Studies have shown that, compared with the Berlin and STOP-Bang questionnaires, NoSAS had better discrimination in the general population32. However, some studies have shown that STOP-Bang is superior to NoSAS and Berlin and can be used for OSA screening in communities and hospitals. A meta-analysis found that STOP-Bang was a more accurate screening tool for detecting mild, moderate and severe OSA than Berlin, STOP and ESS33. The Berlin Questionnaire resulted from the Primary Care Sleep Conference held in Berlin, Germany in April 1996. Although these tools are widely used in clinical practice, their sensitivity and specificity vary across patient populations depending on age, gender and the presence of comorbidities. One study has shown that, compared with the STOP-Bang and ESS scores, Berlin had the best performance in screening for OSA in older adults, which is consistent with our findings34. The Berlin Questionnaire is divided into three categories of questions. Category 1 contains four questions about snoring and one about apnea during sleep; Category 2 covers sleepiness while driving, fatigue and daytime sleepiness; and Category 3 concerns BMI and systemic arterial hypertension (SAH). The Berlin Questionnaire covers many aspects, including sleepiness and hypertension, and many elderly people have underlying diseases, which may explain why it is effective in screening for OSA in older adults.

Shortcomings of this study: This single-center retrospective study was mainly conducted on the population of Guangdong Province, which cannot represent the general population of China. However, the research center is the National Center for Respiratory Medicine, and the patients came from all over the country, which could make up for this deficiency to a certain extent. In addition, the questionnaire data in this study were recorded according to strict procedures before sleep monitoring at our center and were completed by patients together with their families, so the resulting bias was likely small. Furthermore, the sample size was small, and the study focused on elderly individuals undergoing sleep monitoring; it therefore represents a clinical cohort of patients with suspected OSA (from sleep clinics) rather than the general elderly population. In the future, national multi-center large-scale studies, as well as the Shenzhen international multi-center collaborative study, which include community-based patients, are needed to validate the clinical efficacy of these screening tools.

Conclusion

To sum up, the comprehensive use of these five screening questionnaires is valuable for screening suspected OSA in elderly patients aged 60 years and older. The combined use of these questionnaires is therefore worth promoting in communities and remote areas where medical resources are scarce, as it has the potential to help prevent complications and reduce the medical burden on patients.