Introduction

Rheumatoid arthritis (RA) is an autoimmune inflammatory disease affecting about 0.5% of the global population [1]. It is two times more prevalent in women than men [2]. Although the aetiology of RA is not fully elucidated, it is widely accepted that genetic, environmental, and autoimmune factors are involved in the development of the disease [3]. Diet, as a modifiable environmental factor, has gained increasing attention for its role in RA prevention [4].

The Mediterranean diet (MD) is one of the best-known and most widely evaluated dietary patterns, characterized by high consumption of fruits, vegetables, legumes, cereals, fish, and olive oil, moderate consumption of alcohol, and low consumption of meat and dairy products [5]. Although it originated in the Mediterranean region, it has been widely promoted in other regions, such as the UK [6], for its benefits in preventing numerous chronic diseases. Adherence to the MD reduces the incidence of cardiovascular diseases, diabetes mellitus, dementia, and mortality in both Mediterranean and non-Mediterranean regions [7,8,9,10,11,12]. Given its anti-inflammatory and weight loss properties, the MD may have a protective effect against RA [13, 14]. Firstly, being overweight/obese is a risk factor for developing RA [15]. A meta-analysis of prospective cohort studies revealed that adherence to the MD was inversely associated with the risk of being overweight/obese, as well as with five-year weight gain [16]. Secondly, inflammation has been proven to play an important role in the pathogenesis of RA. The MD, which is rich in multiple anti-inflammatory nutrients, was associated with lower serum levels of inflammatory markers such as C-reactive protein (CRP) and interleukin (IL)-6 [17].

However, most existing studies focused on individual MD components rather than the MD as a whole. Consumption of individual components of the MD, such as fish and olive oil, has been shown to reduce the incidence of RA [18, 19], and a meta-analysis of eight prospective studies demonstrated a J-shaped association between alcohol and the risk of RA [20]. To the best of our knowledge, few studies have investigated the association between the MD and the risk of RA, and those that did yielded inconclusive results. A case-control study in Sweden found an inverse relationship between the MD and the presence of RA [21]. However, a study based on the Nurses’ Health Study (NHS) I and II did not observe any association between the MD and the risk of RA [22]. Although two systematic reviews summarized the previous evidence on this theme, they did not conduct a meta-analysis to pool results, and they concluded that the association between the MD and the risk of RA needs further study.

Therefore, we aimed to investigate the association between the MD and the risk of RA in a cohort study using UK Biobank (UKB) data, and to combine the results with those of studies from other regions in an updated systematic review with meta-analysis. We hypothesized that adherence to the MD is inversely associated with the incidence of RA.

Methods

This study included a cohort study and an updated systematic review with meta-analysis of all available studies to date.

Cohort study

Data source and study population

We retrieved data from the UKB, which is a population-based cohort comprising more than 0.5 million participants aged 40 to 70 years from 22 sites across England, Wales, and Scotland. The full details of the UKB can be found elsewhere [23]. The baseline survey was conducted between 2006 and 2010. Information on demographics, lifestyles, and medical history was obtained via a touchscreen questionnaire and face-to-face interviews. Physical measurements and blood samples were also collected. Data were linked to hospital records, primary care records, and death registries. From 2009 to 2012, a web-based 24-h recall dietary instrument (Oxford WebQ) was used to collect detailed dietary information up to five times. The first assessment was completed during the baseline survey. Subsequently, participants received emails every three to four months (four times in total) inviting them to complete the Oxford WebQ online via their own computer.

The current cohort study included UKB participants enrolled from 2006 to 2010 who had completed the Oxford WebQ on at least two occasions and had no RA or juvenile RA before the last Oxford WebQ assessment. We then excluded participants who (i) had left the UK before completing the Oxford WebQ; (ii) reported an implausible energy intake (females: <500 or >3500 kcal/day; males: <800 or >4000 kcal/day) [24]; or (iii) had missing information on covariates (detailed in the following subsections). Because RA diagnosis may lag behind symptom onset, and dietary information collected close to the diagnosis might not reflect habitual intake, we further excluded participants who developed RA within two years of the last Oxford WebQ assessment to minimize reverse causation [25, 26]. The final analysis included 117,341 participants (Fig. 1). Follow-up began on the date of the last completed diet assessment and ended on the date of incident RA (773 participants), the date of loss to follow-up (i.e., left the UK or withdrew consent for future linkage; 296 participants), the date of death (4820 participants), or the last date with available information (30 September 2021; 111,452 participants), whichever came first.
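
The energy-intake exclusion rule and the follow-up accounting described above can be sketched as follows. This is a minimal illustration, not the authors' code: the thresholds and participant counts come from the text, while the function name, dictionary keys, and data layout are hypothetical.

```python
# Sex-specific plausible ranges of mean daily energy intake (kcal/day), from the text.
PLAUSIBLE_KCAL = {"female": (500, 3500), "male": (800, 4000)}


def has_plausible_energy(sex: str, kcal_per_day: float) -> bool:
    """Return True when the intake lies inside the sex-specific plausible range."""
    low, high = PLAUSIBLE_KCAL[sex]
    return low <= kcal_per_day <= high


# Reasons why follow-up ended, with the participant counts reported in the text.
# The key names are hypothetical labels.
follow_up_end = {
    "incident_ra": 773,
    "lost_to_follow_up": 296,
    "died": 4820,
    "end_of_available_data": 111452,
}

# The four mutually exclusive end points should add up to the analytic sample.
total_participants = sum(follow_up_end.values())
```

Under these assumptions, `total_participants` equals the 117,341 participants reported for the final analysis, which is a quick consistency check on the four follow-up end points.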

Fig. 1: Flowchart of study participants in the cohort study.

Dietary assessment

The Oxford WebQ was developed for large population studies and validated against an interviewer-administered 24-h recall questionnaire [27]. It collects information on the quantities of 206 widely consumed foods and 32 beverages consumed over the previous day. Food and beverage quantities were calculated by multiplying the number of portions consumed by the set quantity of each food/beverage portion size [28]. Alcohol and energy intake were estimated by multiplying the quantity of each food/beverage by its alcohol and energy content, respectively, according to the UK Nutrient Databank food composition tables [29, 30]. The average intake across multiple Oxford WebQ assessments was calculated to reflect each participant’s daily diet.
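
The calculation described above (portions multiplied by portion size, then by nutrient content, then averaged over recalls) can be illustrated with a short sketch. The portion sizes, energy densities, and food names below are hypothetical placeholders, not values from the Oxford WebQ or the UK Nutrient Databank.

```python
from statistics import mean

# Hypothetical portion sizes (grams per portion) and energy densities (kcal per 100 g).
PORTION_SIZE_G = {"apple": 100.0, "white_fish": 150.0}
KCAL_PER_100G = {"apple": 50.0, "white_fish": 100.0}


def quantity_g(portions: float, portion_size_g: float) -> float:
    """Quantity consumed = number of portions x set portion size."""
    return portions * portion_size_g


def energy_kcal(quantities_g: dict, kcal_per_100g: dict) -> float:
    """Energy intake = sum over foods of quantity x energy content."""
    return sum(q * kcal_per_100g[food] / 100 for food, q in quantities_g.items())


# Two hypothetical 24-h recalls for one participant (portions per food).
recalls = [{"apple": 1, "white_fish": 1}, {"apple": 2, "white_fish": 0}]

daily_energy = []
for recall in recalls:
    quantities = {food: quantity_g(n, PORTION_SIZE_G[food]) for food, n in recall.items()}
    daily_energy.append(energy_kcal(quantities, KCAL_PER_100G))

# Averaging across recalls approximates the participant's usual daily intake.
mean_energy = mean(daily_energy)
```

The same pattern would apply to alcohol, with alcohol content in place of energy content.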

We calculated a validated literature-based adherence score, the MEDI-LITE score, to assess the degree of adherence to the MD [31]. Food groups that are beneficial in the MD [fruits, vegetables (without potatoes), legumes, cereals, fish, and olive oil] were assigned 2 points for the highest consumption, 1 point for middle consumption, and 0 points for the lowest consumption. Food groups that are harmful in the MD (meat and dairy products) were scored in reverse, with 2 points for the lowest consumption, 1 point for middle consumption, and 0 points for the highest consumption. For alcohol, the lowest, middle, and highest consumption were assigned 1 point, 2 points, and 0 points, respectively. The cut-off value of each component is provided in Supplementary Table S1. The total score (range 0–18) was the sum of all nine components, with higher scores representing higher adherence.
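
A minimal sketch of this scoring rule is given below. It takes each component's consumption category (0 = lowest, 1 = middle, 2 = highest) as input, because the cut-offs themselves are in Supplementary Table S1 and are not reproduced here. It assumes, as the 0–18 range implies, that meat and dairy products are reverse-scored. The function and component names are hypothetical.

```python
BENEFICIAL = ("fruits", "vegetables", "legumes", "cereals", "fish", "olive_oil")
HARMFUL = ("meat", "dairy")

# Alcohol: lowest -> 1 point, middle -> 2 points, highest -> 0 points (from the text).
ALCOHOL_POINTS = {0: 1, 1: 2, 2: 0}


def medi_lite_score(levels: dict) -> int:
    """Sum the nine component scores.

    `levels` maps each component to its consumption category:
    0 (lowest), 1 (middle) or 2 (highest).
    """
    score = sum(levels[group] for group in BENEFICIAL)    # 0, 1 or 2 points each
    score += sum(2 - levels[group] for group in HARMFUL)  # reverse-scored (assumption)
    score += ALCOHOL_POINTS[levels["alcohol"]]
    return score


# A participant with the most favourable category on every component.
ideal = {group: 2 for group in BENEFICIAL}
ideal.update({group: 0 for group in HARMFUL})
ideal["alcohol"] = 1
```

With these inputs, `medi_lite_score(ideal)` reaches the maximum of 18, and the opposite pattern on every component gives 0.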

Outcome

The outcome was incident RA, determined using hospital records, primary care records, the death registry, and self-reporting. We defined incident RA by International Classification of Diseases, Tenth Revision (ICD-10) codes M05 and M06. Read codes used in primary care were converted to ICD-10 codes using the UKB lookup table. For self-reported RA, participants were asked whether they had been told by a doctor that they had RA.

Covariates

Demographic data (age, sex, ethnicity, Townsend deprivation index (TDI), education), lifestyle factors (smoking status, sleep duration, physical activity) [22, 32], and health indicators (hypertension, diabetes, other autoimmune diseases [33], genetic predisposition to RA [21]) were collected at baseline. These covariates were chosen based on previous literature as potential confounders of the exposure-outcome association. Additional covariates were considered for females only, including age at menarche, menopausal status, parity, use of oral contraceptives, and use of hormone replacement therapy. Rheumatoid factor (RF) was considered a potential effect modifier, as participants with positive RF (>14 IU/mL) [34] were at high risk of RA. Detailed information on these factors is provided in Supplementary Text 1.

Statistical analysis

We used the Kruskal-Wallis H test or the chi-square test to compare characteristics across quartiles of the MEDI-LITE score. In the main analyses, the Cox proportional hazards (PH) model was used to calculate the hazard ratio (HR) and 95% confidence interval (CI) for the association between adherence to the MD and the incidence of RA, with the MEDI-LITE score as a four-level categorical (quartile) variable. Model 1 was a crude model with the MEDI-LITE score as the only predictor; Model 2 adjusted for age and sex; Model 3 additionally adjusted for ethnicity, TDI, education, smoking status, sleep duration, physical activity, and energy intake; and Model 4 further adjusted for other autoimmune diseases, hypertension, diabetes, and genetic predisposition. We additionally analyzed the MEDI-LITE score as a standardized continuous variable (z-score) in Cox PH models, given the absence of established clinical cutoffs. In these models, the HR represents the risk associated with each standard deviation (SD) increase in adherence. The linear trend was tested by assigning the median value of each quartile of the MEDI-LITE score to its members and modeling it as a continuous variable. The PH assumption was examined using Schoenfeld residuals, a diagnostic method that evaluates whether hazard ratios remain constant over time by testing the correlation between residuals and event times.
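
The three codings of the exposure described above (quartile category, z-score, and quartile-median trend variable) can be sketched as follows. This illustrates only how the exposure variables might be derived; it does not fit a Cox model, and the function names and example scores are hypothetical.

```python
from statistics import mean, median, quantiles, stdev


def z_scores(scores):
    """Standardize scores so that one unit corresponds to one SD."""
    mu, sd = mean(scores), stdev(scores)
    return [(s - mu) / sd for s in scores]


def quartile_groups(scores):
    """Assign each score to quartile 0..3 using the sample quartile cut points."""
    cuts = quantiles(scores, n=4)
    return [sum(s > c for c in cuts) for s in scores]


def trend_values(scores):
    """Replace each score by the median score of its quartile (for the trend test)."""
    groups = quartile_groups(scores)
    medians = {
        g: median([s for s, gi in zip(scores, groups) if gi == g])
        for g in set(groups)
    }
    return [medians[g] for g in groups]


# Hypothetical MEDI-LITE scores for eight participants.
example_scores = [2, 4, 6, 8, 10, 12, 14, 16]
```

In a real cohort the integer score has many ties, so the quartile groups would generally not be of equal size.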

Stratified analyses were conducted by sex, age group, education, smoking status, history of hypertension, genetic predisposition, and RF, with potential effect modification assessed by testing the significance of multiplicative interaction terms between the MEDI-LITE score and the subgroup variables. Additionally, we explored the associations between each component of the MEDI-LITE score and RA risk separately. Multivariable restricted cubic splines (RCS) with three knots (at the 10th, 50th, and 90th percentiles, selected based on the Akaike Information Criterion) were used to explore potential non-linear associations of the MEDI-LITE score and alcohol consumption with the risk of RA.

In the sensitivity analyses, we repeated the main analysis (i) defining RA as at least two RA diagnosis records; (ii) excluding RA cases determined by self-reporting only; (iii) among participants who completed the Oxford WebQ on at least three occasions, which may better reflect usual intake; (iv) on 20 imputed datasets generated through multiple imputation by chained equations for covariates with missing values; and (v) excluding participants with other autoimmune diseases, who are at high risk of RA and may alter their diet following the diagnosis of autoimmune disease [35, 36].

All analyses were conducted using R 4.2.1. The two-tailed P < 0.05 was considered the significance level.

Systematic review and meta-analysis

This systematic review was pre-registered in PROSPERO (CRD42023481228).

Data sources and search strategy

We conducted a comprehensive search of PubMed, Embase, Scopus, Web of Science, CNKI, and SinoMed from database inception to November 2023 without language restriction. The search strategy, developed in consultation with a medical librarian, combined terms related to “Mediterranean diet OR dietary pattern” and “rheumatoid arthritis”. The detailed strategy is provided in Supplementary Table S2. The systematic review also included the current cohort study (UKB-based study). Additionally, the reference lists of all included studies were reviewed for relevant papers.

Study eligibility

Studies were included if they: (i) were observational studies (cross-sectional, case-control, or cohort studies) or intervention studies (non-randomized trials or randomized controlled trials); (ii) examined the relationship between the MD and the risk of RA development; and (iii) reported effect sizes as odds ratio (OR), relative risk (RR), or HR with 95% confidence interval, or provided data allowing for their calculation. Animal studies were excluded. For duplicate publications of the same dataset, the publication with the larger sample size was prioritized. If a study was published both as a conference paper and as a journal article, only the journal publication was included.

Study screening, data extraction, and quality assessment

Two reviewers (P.H. and Q.L.) independently evaluated the eligibility of studies through title/abstract screening and full-text assessment. The following information was extracted by two reviewers independently: first author, year of publication, study design, location, project name, sample size, duration of follow-up (for cohort studies), age, sex, assessment of adherence to the MD, assessment of RA, adjustment variables, and reported effect sizes. For studies reporting effect sizes from multiple statistical adjustment models, the model with the highest number of covariates was selected. The quality of the included studies was assessed using the Newcastle–Ottawa Scale (NOS). The NOS scores were converted to the Agency for Healthcare Research and Quality (AHRQ) standards (good, fair, poor) [37].

Any discrepancies were resolved through discussion and consensus.

Statistical analysis

Due to the low incidence of RA (<0.01), the OR serves as a good approximation of the HR and RR [38, 39]. Therefore, the OR (95% CI) was used as the reported effect size in this study. Heterogeneity was examined using the Cochran Q test and the I² statistic. We employed a fixed-effect meta-analysis (I² < 50%) for the main analysis to calculate the pooled OR for the highest versus lowest adherence to the MD and risk of RA. We conducted two types of sensitivity analyses. Firstly, we applied a random-effects model to calculate the pooled OR and obtain a more conservative estimate of the effect of the MD. Secondly, we excluded each study in turn from the main analysis and re-estimated the pooled OR. A series of subgroup analyses were conducted by study design, number of cases, study quality, and sex of participants. We evaluated publication bias by visually inspecting the funnel plot, as Egger’s test was inapplicable with fewer than 10 studies. We additionally explored the dose-response association, modeling the score using RCS with three knots at the 10th, 50th, and 90th percentiles of the distribution. For studies that did not use a 9-point scale to assess adherence to the MD, we transformed the scores to a 9-point scale [40]. One study was excluded from the dose-response analysis due to insufficient data; an attempt was made to contact the author, but no response was received.
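
As a hedged illustration of the fixed-effect pooling and heterogeneity statistics described above, the sketch below applies standard inverse-variance weighting on the log-OR scale, with standard errors recovered from the reported 95% CIs. It is not the authors' R code; the function names are hypothetical, and the three example studies are invented numbers used only to show the call.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_or_and_se(or_, lower, upper):
    """Convert an OR with its 95% CI to a log-OR and its standard error."""
    return math.log(or_), (math.log(upper) - math.log(lower)) / (2 * Z_975)


def fixed_effect_pool(estimates):
    """Inverse-variance fixed-effect pooling of (OR, lower, upper) tuples."""
    logs_ses = [log_or_and_se(*e) for e in estimates]
    weights = [1 / se ** 2 for _, se in logs_ses]
    pooled_log = sum(w * y for w, (y, _) in zip(weights, logs_ses)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    # Cochran's Q and the I^2 statistic (percentage of variation due to heterogeneity).
    q = sum(w * (y - pooled_log) ** 2 for w, (y, _) in zip(weights, logs_ses))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(pooled_log),
        "lower": math.exp(pooled_log - Z_975 * pooled_se),
        "upper": math.exp(pooled_log + Z_975 * pooled_se),
        "q": q,
        "i2": i2,
    }


# Hypothetical study-level estimates (OR, lower 95% limit, upper 95% limit).
example_studies = [(0.70, 0.50, 0.98), (0.90, 0.75, 1.08), (0.85, 0.70, 1.03)]
pooled = fixed_effect_pool(example_studies)
```

Because the pooled log-OR is a weighted average, the pooled OR always lies between the smallest and largest study-level OR, and more precise studies (narrower CIs) receive more weight.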

All analyses were conducted using R 4.2.1. The two-tailed P < 0.05 was considered the significance level.

Results

Cohort study

A total of 117,341 participants were included in the final analysis. Baseline characteristics according to quartiles of the MEDI-LITE score are displayed in Table 1. The mean MEDI-LITE score in this cohort was 8.9 ± 2.8. Older, female, and non-white participants were more likely to adhere to the MD. Participants with the highest MEDI-LITE score had higher TDI, higher education level, higher physical activity level, and slept longer; they were also more likely to be non-smokers and to have no history of other autoimmune diseases, hypertension, or diabetes. Females in the highest quartile of the MEDI-LITE score tended to have gone through menopause and not to use oral contraceptives.

Table 1 Baseline characteristics of participants according to MEDI-LITE score in the cohort study.

During a median follow-up of 9.42 years (1,098,276 person-years), 773 participants developed RA. No violation of the PH assumption was observed (all Schoenfeld residual test P > 0.05). The risk of RA was lower in the highest quartile of the MEDI-LITE score compared with the bottom quartile (HR = 0.713, 95% CI = 0.580 to 0.876). Each SD increment in the MEDI-LITE score was associated with a lower risk of RA (HR = 0.873, 95% CI = 0.812 to 0.939) (Table 2). No non-linear association between the MEDI-LITE score and the risk of RA was found (P for non-linearity = 0.509) (Fig. 2A). No effect modification by sex, age group, education, smoking status, hypertension, genetic predisposition, or RF was found (all P for interaction > 0.05, Supplementary Fig. S1).
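
For orientation, the case count and person-time reported above imply a crude incidence rate, which can be worked out as below. This rate is a derived illustration rather than a result reported by the study, and the variable names are hypothetical.

```python
# Figures reported in the text.
incident_cases = 773
person_years = 1_098_276

# Crude incidence rate, expressed per 100,000 person-years.
rate_per_100k = incident_cases / person_years * 100_000

print(f"Crude incidence rate: {rate_per_100k:.1f} per 100,000 person-years")
```

This gives roughly 70 cases per 100,000 person-years, before any adjustment for age, sex, or other covariates.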

Fig. 2: Adjusted HR (solid lines) and 95% CI (dashed lines) for risk of rheumatoid arthritis by the MEDI-LITE score and alcohol consumption.

A Model was adjusted for age, sex, ethnicity, Townsend deprivation index, education, energy intake, physical activity, smoking, sleep duration, history of other autoimmune diseases, hypertension, diabetes, and genetic predisposition; B Model was further adjusted for the other eight MEDI-LITE score components.

Table 2 Hazard ratios for the associations between MEDI-LITE score and risk of rheumatoid arthritis in the cohort study.

Regarding the MEDI-LITE score components, adherence to the MD recommendations for legumes and olive oil was associated with a reduced risk of RA (Supplementary Table S3). Multivariable-adjusted RCS results showed a J-shaped association between alcohol intake and RA risk (P for non-linearity = 0.006) (Fig. 2B). The risk of RA decreased with increasing alcohol intake up to 28 grams/day, and thereafter the risk increased slowly and gradually.

The association between MD and the risk of RA was robust in several sensitivity analyses (Supplementary Table S4).

Systematic review and meta-analysis

After deduplication, 2068 records remained for title or abstract screening or full-text assessment. Finally, five articles (six studies, as Hu et al. [22] reported the results from two cohorts in their article) were included in this review (Supplementary Fig. S2).

Table 3 shows the characteristics of the included studies, which comprised four cohort studies and two case-control studies. Two studies were conducted in the US, two in Sweden, one in France, and one in the UK. These studies included 362,268 participants, with 4273 cases of RA. Three studies were conducted only in females, and the other three involved both males and females. The NOS scores of all included studies were seven or higher, and two studies [22] were rated as fair quality (Supplementary Tables S5, S6).

Table 3 Characteristics of included studies of the systematic review and meta-analysis.

As shown in Fig. 3, the pooled OR for the highest versus lowest adherence to the MD and risk of RA was 0.838 (95% CI = 0.758 to 0.926), with low heterogeneity (I² = 17%, p = 0.30). This association remained significant in sensitivity analyses using the random-effects model (Fig. 3) or excluding one study at a time (Supplementary Fig. S3). No subgroup difference was detected by study design, number of cases, or study quality. Regarding subgroups by sex, two studies reported results for males and five for females. The association was stronger in males than in females (p = 0.002, Supplementary Table S7). After excluding fair-quality studies, the association remained significant in females (OR = 0.857, 95% CI = 0.743 to 0.989). No substantial evidence of publication bias was found in the funnel plot (Supplementary Fig. S4). The dose-response analysis found a non-linear relationship between adherence to the MD and the risk of RA (P for non-linearity < 0.001). As adherence to the MD increased, the slope of the curve tended to flatten (Supplementary Fig. S5).

Fig. 3: Forest plot for the association between the highest vs. lowest adherence to the Mediterranean diet and the risk of RA. *The current cohort study.

Discussion

The present study aimed to investigate the association between adherence to the MD and the risk of RA using UKB data. To provide a comprehensive evaluation, we conducted a systematic review with meta-analysis, incorporating our results with other regional studies. The findings from both the cohort study and the meta-analysis support that adherence to the MD is associated with a reduced risk of RA development.

Previous studies have yielded inconclusive results. For example, the NHS and NHS II [22], which were assessed as fair-quality studies in the systematic review, demonstrated no association between the MD and the risk of RA. These studies recruited only nurses, who generally have higher health literacy and tend to adopt healthier lifestyles. The variation in their participants’ adherence to the MD might be too small to detect the association. This may introduce selection bias, potentially resulting in an underestimation of the true effect of the MD. Our results demonstrated that the protective effect was stronger when only good-quality studies were considered. The effect sizes (HR or OR) of the other included studies were all less than one, indicating a protective effect of the MD on the risk of RA. Although two studies [32, 41] did not reach statistical significance, this could be attributed to insufficient sample size.

Although the role of MD in RA pathogenesis is not yet clearly understood, it has been postulated to involve reducing inflammation and promoting weight loss, thereby lowering the risk of developing RA [4]. The anti-inflammatory properties of MD can be attributed to its abundant supply of anti-inflammatory nutrients, such as fiber in vegetables and cereal, as well as the presence of carotenoids, flavonoids, and vitamin C in fruits, magnesium in legumes and vegetables, and omega-3 fatty acids in fish and olive oil [42]. In contrast, the MD is low in pro-inflammatory nutrients like saturated fatty acids and heme iron in meat [43]. Furthermore, evidence was found supporting the effectiveness of MD in reducing weight/BMI [44], which is a risk factor in RA development. Obesity can disrupt the immune system, leading to abnormal immune reactions against self-tissues by releasing pro-inflammatory cytokines such as interferon-α and IL-6 [45, 46].

In this cohort study, only three components (olive oil, legumes, and alcohol) were associated with the risk of RA, which suggests that the overall effect of the MD might be greater than that of individual foods. The effect of other components might be too small to be detected, and combinations of foods may work synergistically to reduce the risk of RA [47]. Our findings revealed a J-shaped association between alcohol consumption and RA risk, suggesting a protective effect against RA development at moderate levels but heightened susceptibility with excessive intake. This phenomenon could be attributed to the differential immunomodulatory effects of ethanol concentrations, with moderate alcohol consumption suppressing pro-inflammatory cytokines while heavy drinking disrupts immune homeostasis and promotes a pro-inflammatory state [48]. One study examined the relationship between alcohol consumption and inflammatory markers in preclinical RA patients and found a U-shaped association between daily alcohol intake and IL-6 levels before RA symptoms appeared [49].

To the best of our knowledge, this is the first cohort study involving both males and females to examine the association between the MD and the incidence of RA. In addition, our cohort study accounted for important potential confounders, including genetic predisposition. Furthermore, external validation by a systematic review confirmed our finding. However, our study has the following limitations. Firstly, dietary information was collected using a 24-h recall questionnaire rather than a food frequency questionnaire covering the past year. However, the study of Carter et al. showed that ≥2 Oxford WebQ assessments could capture usual intake habits [50]. In the sensitivity analysis, the results using ≥3 Oxford WebQ assessments showed an association similar to that of the main analysis, supporting the robustness of the results. Secondly, compared to the general population in the UK, UKB participants were more likely to be female, healthier, older, and well educated, particularly those who voluntarily completed the Oxford WebQ [51, 52]. This may introduce selection bias, potentially leading to an overestimation of the observed association in our study, as healthier and better-educated participants may have more favorable overall lifestyle factors [53], which could independently contribute to a reduced risk of RA. Thirdly, although many confounders have been considered, residual confounding may still arise from unmeasured biological variables (e.g., gut microbiome composition) or socioeconomic factors (e.g., food accessibility). Fourthly, the dietary information and the potential confounders were only measured at baseline, without dynamic follow-up. Changes in these factors during the study period were therefore not captured, which could reduce the accuracy of the estimated associations. Lastly, the studies included in the systematic review primarily involved Western populations. Given that cultural traditions and geographical location influence adherence to the MD [54, 55], and genetic polymorphisms are known to modulate nutrient metabolism pathways [56], the extrapolation of the observed associations to other ethnic populations may be limited.

Conclusions

Higher adherence to MD was associated with a lower risk of RA. Our finding emphasizes the importance of diet in RA development and provides novel directions for the prevention of RA. However, further investigations are required to fully understand the exact mechanisms involved.