Introduction

Ischemic stroke (IS) remains a leading cause of death globally, accounting for 70-80% of all stroke cases1,2,3. According to the Global Burden of Disease Study (GBD 2021), approximately 7.63 million individuals were diagnosed with IS in 2019, leading to an economic loss of approximately $96.451 billion, placing a significant burden on the global economy4,5. The “China Stroke Report 2020” identified stroke as the primary cause of premature death in China, with a notable increase in IS cases. Furthermore, 41% and 17% of IS patients experience recurrence within five years and one year, respectively6,7,8. According to the 2019 GBD data, stroke is the leading cause of disability-adjusted life years (DALYs) in China, surpassing heart disease and cancers of the respiratory and digestive systems9,10. In 2019, China recorded 4.33 million hospitalizations for IS, with hospital costs quadrupling since 2010, posing a significant health burden and prevention challenge11. Given the high burden of IS and the limitations of secondary prevention, identifying modifiable risk factors and upstream molecular mechanisms has become increasingly important.

Key modifiable risk factors for IS prevention include hypertension, diabetes, dyslipidemia, heart disease, smoking, alcohol intake, diet, overweight/obesity, physical inactivity, and psychological factors9,12,13. Hypertension is the most significant risk factor for stroke, with persistent hypertension after IS increasing the risk of poor outcomes and recurrence within one year14,15,16. Diabetes affects approximately 14% of adult stroke patients in China and significantly increases stroke-related mortality17,18. Dyslipidemia independently contributes to stroke risk; each 1 mmol/L increase in total cholesterol and low-density lipoprotein cholesterol significantly raises IS risk19,20. A study with a 34-year follow-up found that age-adjusted stroke incidence was more than double in patients with coronary heart disease compared to those without21. Smoking increases stroke occurrence and worsens prognosis, whereas quitting smoking substantially reduces recurrence and mortality22. Alcohol consumption proportionally increases stroke-related mortality23. Excess weight and obesity significantly elevate stroke risk, with metabolic abnormalities further amplifying this risk24,25. Clarifying the genetic underpinnings of these modifiable risk factors is essential for identifying precise biomarkers and therapeutic targets, promoting personalized prevention and clinical intervention strategies.

The rapidly evolving field of metabolomics plays a pivotal role in elucidating IS mechanisms, facilitating the systematic identification of metabolites and pathways essential in IS prevention and treatment, and highlighting potential therapeutic targets. Compared to proteomics or transcriptomics, metabolomics uniquely detects small-molecule metabolites capable of crossing the blood-brain barrier, offering distinct advantages for IS biomarker discovery26. Metabolites, as intermediate or final products of metabolic reactions, influence disease risk and therapeutic targets. Mendelian randomization (MR), which leverages genetic variation from genome-wide association studies (GWAS), investigates causal relationships between exposures and diseases while minimizing confounding and reverse causality27,28,32. In 2023, a large-scale GWAS published in Nature Genetics analysed 1,091 metabolites and 309 metabolite ratios from 8,299 individuals in the Canadian Longitudinal Study on Aging (CLSA), providing insights into their genetic architecture and relevance to IS, presenting opportunities for therapeutic target discovery29. A systematic search revealed only one study exploring the correlation between 1,400 plasma metabolites and IS30. No study has systematically explored potential metabolite targets demonstrating a causal relationship with IS and IS-related risk factors among these 1,400 plasma metabolites or conducted mediation effects and MR analysis across the entire phenome.

To systematically investigate the causal pathways and potential therapeutic targets between the human plasma metabolome and IS, we conducted a two-sample MR analysis involving 1,400 plasma metabolites. Genetic instruments were derived from GWAS data from 8,299 participants, and summary statistics for IS and 11 established risk factors were obtained from the Integrative Epidemiology Unit database. We further examined the mediating roles of these risk factors in the metabolite-IS associations and explored the disease implications of IS-associated metabolites through a phenome-wide MR analysis across 3,948 phenotypes from the UKBB GWAS31.

Materials and methods

Study design

In this study, we initially conducted a two-sample MR analysis on 1,400 metabolites and IS to investigate their causal relationship. Subsequently, we employed two-sample MR to identify the causal relationships between IS and 11 common clinical risk factors: body mass index (BMI), low-density lipoprotein (LDL), high-density lipoprotein (HDL), non-high-density lipoprotein (non-HDL), systolic blood pressure (SBP), diastolic blood pressure (DBP), smoking, alcohol consumption, coffee intake, type 2 diabetes (T2D), and coronary heart disease (CHD). Additionally, we performed two-sample MR analyses to determine the causal relationships between metabolites and each positive risk factor. We conducted mediation analysis on metabolites with causal relationships to both IS and IS-related risk factors, aiming to elucidate the underlying mechanisms through which these metabolites affect IS by influencing IS risk factors. Finally, we carried out a PheWAS MR analysis to comprehensively explore metabolites with causal regulatory relationships with IS, assessing potential side effects and additional indications for IS-associated metabolites across 3,948 phenotypes. The overall study design is illustrated in Fig. 1. The STROBE-MR (Strengthening the Reporting of Observational Studies in Epidemiology using Mendelian Randomization) checklist was completed for this observational study32,33.

Fig. 1

Overall study design. IS ischemic stroke, MR Mendelian randomization, SNPs single nucleotide polymorphisms, GWAS genome-wide association studies, CLSA Canadian Longitudinal Study on Aging, BMI body mass index, LDL low-density lipoprotein, HDL high-density lipoprotein, non-HDL non-high-density lipoprotein, SBP systolic blood pressure, DBP diastolic blood pressure, T2D type 2 diabetes, CHD coronary heart disease, MR-PRESSO MR pleiotropy residual sum and outlier, FDR false discovery rate, OR odds ratio, IVs instrumental variables, PheMR phenome-wide MR.

Data sources

We obtained summary data for 1,091 plasma metabolites and 309 metabolite ratios from a large genome-wide association study (GWAS) involving 8,299 European individuals in the Canadian Longitudinal Study on Aging (CLSA) cohort29. The data are publicly accessible as GWAS summary statistics. For ischemic stroke (GWAS ID ebi-a-GCST005843; 34,217 cases and 406,111 controls) and the 11 risk factors (BMI, LDL, HDL, non-HDL, SBP, DBP, smoking, alcohol consumption, coffee intake, T2D, and CHD), we retrieved GWAS data from the Integrative Epidemiology Unit (IEU) database (https://gwas.mrcieu.ac.uk/). The search keywords included: ischemic stroke/IS, body mass index/BMI, low-density lipoprotein/LDL, high-density lipoprotein/HDL, non-high-density lipoprotein/non-HDL, systolic blood pressure/SBP, diastolic blood pressure/DBP, smoking, alcohol, coffee, type 2 diabetes, and coronary heart disease/CHD. The search covered the database’s entire duration up to February 9, 2024. When several datasets were available for a trait, those with larger sample sizes were prioritized for MR analysis. We utilized summary data from the UK Biobank’s online repository for PheWAS analysis. The data, curated in the IEU database, include 3,948 phenotypes encompassing diagnoses, current health status, treatment records, biochemical analyses, body and anthropometric measurements, family history, lifestyle factors, and mental health. Detailed information about the sources, original questionnaires, or measurements can be found on the UK Biobank website. Detailed information on the GWAS datasets used for ischemic stroke and its risk factors is summarized in Table 1.

Table 1 Summary of GWAS datasets for ischemic stroke and its clinical risk factors.

Identification of causal plasma metabolites for IS

We extracted exposure data for 1,400 plasma metabolites and selected SNPs using a significance threshold of p < 1 × 10⁻⁵. SNPs in linkage disequilibrium (clumping window of 10,000 kb, r² threshold of 0.001) and weak instruments (F-statistic < 10) were removed. The causal effects of each metabolite on IS were evaluated using five MR methods: MR Egger, weighted median, inverse variance weighted, simple mode, and weighted mode, as described in the Statistical Analysis section. To establish the direction of the causal relationship, we further performed reverse MR analysis, treating IS as the exposure and the positive metabolites as the outcome, to rule out reverse causality.
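
The instrument-strength filter described above can be illustrated with a minimal sketch. It assumes the common per-SNP approximation F ≈ (β/SE)², which the text does not specify, and uses hypothetical SNP identifiers and effect sizes. Linkage disequilibrium clumping requires a reference panel and is not shown.

```python
import math
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str    # hypothetical identifier, for illustration only
    beta: float  # SNP-exposure effect estimate
    se: float    # standard error of beta


def f_statistic(snp: Snp) -> float:
    """Per-SNP F-statistic, approximated as (beta / se)^2 (an assumption)."""
    return (snp.beta / snp.se) ** 2


def p_value(snp: Snp) -> float:
    """Two-sided normal p-value for the SNP-exposure association."""
    z = abs(snp.beta / snp.se)
    return math.erfc(z / math.sqrt(2))


def select_instruments(snps, p_threshold=1e-5, min_f=10.0):
    """Keep SNPs that pass the p-value threshold and are not weak (F >= min_f)."""
    return [s for s in snps
            if p_value(s) < p_threshold and f_statistic(s) >= min_f]


# Illustrative values only; these are not taken from the study data.
candidates = [
    Snp("rs_demo_1", beta=0.12, se=0.02),  # z = 6.0, passes both filters
    Snp("rs_demo_2", beta=0.08, se=0.02),  # z = 4.0, fails the p-value threshold
    Snp("rs_demo_3", beta=0.05, se=0.02),  # z = 2.5, fails both filters
]
kept = select_instruments(candidates)
print([s.rsid for s in kept])
```

Under this particular approximation, any SNP with p < 1 × 10⁻⁵ already has F above roughly 19.5, so the F-statistic filter removes additional SNPs mainly when F is computed in another way, for example from the variance explained and the sample size.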

Identification of causal risk factors for IS

Based on stroke risk factors outlined in the 2021 Guideline for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack9, we selected 11 common clinical risk factors for IS: BMI, LDL, HDL, non-HDL, SBP, DBP, smoking, alcohol consumption, coffee intake, T2D, and CHD, to establish causal inferences with IS. GWAS summary data for these exposure factors were obtained, and SNPs were selected using p < 5 × 10⁻⁸ as the significance threshold. Instrumental variables in linkage disequilibrium or with weak instrument strength were excluded. MR analysis was performed to assess the causal effects of each risk factor on IS, following the procedures detailed in the Statistical Analysis section.

Identification of causal metabolites for IS risk factors

We conducted MR analysis to explore the causal effects of 1,400 metabolites on the 11 IS-related risk factors. SNPs were selected using the same criteria as described above. MR analysis procedures and filtering criteria were consistent with those described in the Statistical Analysis section.

Mediation analysis for metabolites, risk factors, and IS

For metabolites causally associated with both IS and risk factors, we performed mediation analysis to quantify the indirect effects through the identified risk factors. Inclusion criteria were: (a) positive MR results for metabolites and risk factors; (b) positive MR results for risk factors and IS; (c) positive causal association between metabolites and IS without reverse causation. To distinguish between direct and indirect effects, we employed two-step MR results. The product method was used to estimate the beta of the indirect effect, while the delta method was used to calculate the standard error (SE) and confidence interval (CI)34.
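
As a rough illustration of the two-step calculation, the sketch below multiplies the exposure-mediator and mediator-outcome estimates (product method) and propagates their uncertainty with a first-order delta-method approximation that treats the two estimates as independent. The function names and all numeric inputs are hypothetical and are not taken from the study.

```python
import math


def indirect_effect(beta_em, se_em, beta_mo, se_mo, z=1.96):
    """Product-method indirect effect with a first-order delta-method SE.

    beta_em, se_em: effect of exposure (metabolite) on mediator (risk factor)
    beta_mo, se_mo: effect of mediator on outcome (IS)
    Assumes the two estimates are independent.
    """
    beta_ind = beta_em * beta_mo
    se_ind = math.sqrt(beta_em ** 2 * se_mo ** 2 + beta_mo ** 2 * se_em ** 2)
    ci = (beta_ind - z * se_ind, beta_ind + z * se_ind)
    return beta_ind, se_ind, ci


def proportion_mediated(beta_ind, beta_total):
    """Share of the total effect carried by the indirect path."""
    return beta_ind / beta_total


# Hypothetical inputs: a metabolite that lowers the mediator, which raises risk.
beta_ind, se_ind, ci = indirect_effect(beta_em=-0.02, se_em=0.005,
                                       beta_mo=0.5, se_mo=0.1)
share = proportion_mediated(beta_ind, beta_total=-0.10)
print(f"indirect = {beta_ind:.4f}, SE = {se_ind:.4f}, "
      f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f}), mediated = {share:.1%}")
```

When the indirect and total effects have opposite signs, `proportion_mediated` returns a negative value, which is how a negative mediation proportion can arise from this calculation.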

Phenome-wide MR (Phe-MR) analysis of 3,948 phenotypes for metabolites causally associated with IS

Phenome-Wide Association Study (PheWAS) is a method that examines associations between a specific SNP or phenotype and all phenotypes within a phenome. To extend the exploration of side effects and additional indications for the nineteen IS-associated metabolites to non-IS phenotypes, we conducted Phe-MR analyses across a broad spectrum of diseases. Using metabolites with positive MR results for IS and leveraging summary data from the IEU database encompassing 3,948 phenotypes in the UK Biobank, we performed PheWAS-MR analyses. Genetic instruments for the 19 significant metabolites were selected using the same criteria as described previously. MR analyses were performed as detailed in the Statistical Analysis section. Associations with p < 0.05 were considered suggestive.

Statistical analysis

MR estimates were based on five methods (MR Egger, weighted median, inverse variance weighted, simple mode, and weighted mode). The primary MR estimates were calculated using the inverse-variance weighted (IVW) method under multiplicative random effects. Instrumental variables (SNPs) were selected using significance thresholds of p < 1 × 10⁻⁵ for metabolites and p < 5 × 10⁻⁸ for risk factors, with linkage disequilibrium (LD) pruning (kb = 10,000, r² = 0.001) and exclusion of weak instruments (F-statistic < 10). The Benjamini-Hochberg method, which controls the false discovery rate (FDR), was applied to correct for multiple testing, and associations with a Benjamini-Hochberg adjusted p-value < 0.2 were deemed statistically significant35,36. MR results were retained if they met all of the following criteria: p < 0.05, FDR < 0.2, a consistent OR direction across the five MR methods, and a pleiotropy test p-value > 0.05. Sensitivity analyses were conducted to validate the robustness of the MR results, including MR-Egger, MR-PRESSO, pleiotropy, and heterogeneity tests. The mediation proportion was calculated as 1 − (direct effect / total effect), where the direct effect is the estimate after adjusting for the mediator and the total effect is the estimate from the univariable MR analysis. All analyses were two-sided and performed using the TwoSampleMR, MendelianRandomization, and MRPRESSO packages in R 4.2.1.
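
For readers less familiar with these steps, the following sketch shows one way the primary estimate and the filtering criteria could be computed from summary statistics. It is written in Python for illustration, whereas the study itself used R packages. The IVW standard error is scaled by the residual dispersion with a floor of 1, which is one common convention and an assumption here. All function names and numbers are hypothetical, and the pleiotropy test is not shown.

```python
import math


def ivw_mre(beta_x, beta_y, se_y):
    """IVW estimate with a multiplicative random-effects standard error.

    beta_x: SNP-exposure effects; beta_y, se_y: SNP-outcome effects and SEs.
    Needs at least two SNPs. The dispersion is floored at 1 (an assumption).
    """
    w = [1.0 / s ** 2 for s in se_y]
    sxx = sum(wi * bx * bx for wi, bx in zip(w, beta_x))
    sxy = sum(wi * bx * by for wi, bx, by in zip(w, beta_x, beta_y))
    beta = sxy / sxx                    # weighted regression through the origin
    se_fixed = math.sqrt(1.0 / sxx)     # fixed-effect standard error
    rss = sum(wi * (by - beta * bx) ** 2
              for wi, bx, by in zip(w, beta_x, beta_y))
    phi = rss / (len(beta_x) - 1)       # residual dispersion
    se = se_fixed * math.sqrt(max(1.0, phi))
    p = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided normal p-value
    return beta, se, p


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):        # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def passes_filters(p, fdr, betas_by_method):
    """p < 0.05, FDR < 0.2, and the same effect direction in every method."""
    same_sign = (all(b > 0 for b in betas_by_method)
                 or all(b < 0 for b in betas_by_method))
    return p < 0.05 and fdr < 0.2 and same_sign


# Illustrative inputs only; these are not taken from the study data.
beta, se, p = ivw_mre([0.1, 0.2, 0.3], [0.06, 0.09, 0.16], [0.01, 0.01, 0.01])
adjusted = bh_adjust([0.001, 0.02, 0.03, 0.5])
print(f"beta = {beta:.4f}, OR = {math.exp(beta):.3f}, SE = {se:.4f}")
print([round(a, 3) for a in adjusted])
```

Exponentiating the estimate gives the odds ratio per unit increase in the exposure when the outcome effects are on the log-odds scale.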

Results

Identification of IS-associated metabolites

We obtained GWAS data for 1,400 metabolites from the CLSA cohort29 and IS GWAS data (ebi-a-GCST005843) from the IEU database. We conducted an association analysis on the metabolite data, excluding SNPs in linkage disequilibrium. Five MR methods were applied to determine the causal relationships between these metabolites and IS, estimating ORs, heterogeneity, and pleiotropy. Additionally, scatter plots, forest plots, funnel plots, and leave-one-out sensitivity analysis forest plots were generated. We summarized the overall results in a circular clustering heatmap (Fig. 2). Based on p-value, FDR, and OR values, and after excluding metabolites with evidence of pleiotropy (including MR-PRESSO testing), we identified 19 metabolites causally associated with IS: levels of 1-linoleoylglycerol (18:2), 1-stearoyl-GPG (18:0), S-methylcysteine sulfoxide, 4-methylcatechol sulfate, 1-oleoyl-GPG (18:1), glycodeoxycholate 3-sulfate, 1-linoleoyl-GPG (18:2), 1-oleoyl-2-linoleoyl-GPE (18:1/18:2), 1-linoleoyl-2-linolenoyl-GPC (18:2/18:3), octadecenedioate (C18:1-DC), octadecadienedioate (C18:2-DC), N-succinyl-phenylalanine, 1-palmitoyl-2-linoleoyl-GPC (16:0/18:2), X-11,299, X-24,546, X-24,951, the arginine to phosphate ratio, the aspartate to mannose ratio, and the cholesterol to linoleoyl-arachidonoyl-glycerol (18:2 to 20:4) ratio. The integrated results are presented in a forest plot (Fig. 3). Among these metabolites, levels of 1-linoleoylglycerol (18:2), S-methylcysteine sulfoxide, and the aspartate to mannose ratio showed positive associations with IS, indicating them as risk factors. The remaining metabolites were negatively associated with IS, suggesting their roles as protective factors.

Fig. 2

Circular clustering heatmap of the causal relationship between 1,400 metabolites and IS based on five MR methods. MR Mendelian randomization, IVW inverse variance weighted. GWAS data for 1,400 metabolites from the CLSA cohort (Supplementary Table S1). IS GWAS data from the IEU database (ebi-a-GCST005843).

Fig. 3

Effects of 19 potential causal metabolites on IS outcomes. Primary MR estimates were calculated using the inverse-variance weighted method under multiplicative random effects. Using MR-PRESSO, we identified 19 metabolites that were causally associated with IS.

Identification of likely causal IS risk factors

For the 11 risk factors (BMI, LDL, HDL, non-HDL, SBP, DBP, smoking, alcohol, coffee, T2D, and CHD), we selected associated SNPs and removed those in linkage disequilibrium to obtain instrumental variables. Five MR methods were used to analyze the causal relationships between these 11 risk factors and IS. The results are depicted in forest plots, funnel plots, scatter plots, and leave-one-out sensitivity analysis forest plots. The analysis suggests a causal relationship between all 11 risk factors and IS. BMI, HDL, SBP, DBP, and T2D showed positive associations with IS, identifying them as risk factors (Fig. 4). In contrast, LDL, non-HDL, smoking, alcohol, coffee, and CHD showed negative associations with IS, indicating their roles as protective factors.

Fig. 4

Causal effects of five risk factors on IS outcomes. MR analyses of the effect of risk factors on stroke outcomes. GWAS data of IS risk factors from the IEU databases. Primary MR estimates were calculated using the inverse-variance weighted method under multiplicative random effects. BMI, HDL, SBP, DBP, and T2D demonstrated positive correlations with IS, identifying them as risk factors. BMI body mass index, HDL high-density lipoprotein, SBP systolic blood pressure, DBP diastolic blood pressure, T2D type 2 diabetes.

Identification of metabolites associated with IS risk factors

We conducted MR analysis on all 1,400 plasma metabolites with the eleven identified IS risk factors, calculating OR values, heterogeneity, and pleiotropy. The overall results are visualized with circular cluster heatmaps. Metabolites associated with the eleven risk factors were filtered based on p-values, FDR, OR values, and pleiotropy. After multiple testing correction, 136 metabolites were found to be associated with at least one IS risk factor.

We identified 44 metabolites associated with BMI, with 28 showing positive correlations and 16 showing negative correlations. For LDL-associated metabolites, 24 were identified, with 8 showing positive correlations and 16 showing negative correlations. For HDL, 42 metabolites were associated, with 17 positive correlations and 25 negative correlations. Twenty metabolites were associated with non-HDL, with an equal split of 10 positive and 10 negative correlations. Five metabolites were linked to SBP, with 3 positive correlations and 2 negative correlations. For DBP, 25 metabolites were associated, with 5 positive correlations and 20 negative correlations. Smoking was associated with 2 metabolites, both showing positive correlations. Only one metabolite was associated with alcohol, showing a negative correlation. Coffee was linked to 9 metabolites, with 5 positive correlations and 4 negative correlations. For T2D, 10 associated metabolites were found, with 6 positive and 4 negative correlations. Lastly, 5 metabolites were associated with CHD, with 3 positive correlations and 2 negative correlations. There was no evidence of horizontal pleiotropy, and sensitivity analyses provided consistent causal effect estimates. Among the 19 IS-associated metabolites, 4 were identified as being associated with one of the IS risk factors (Fig. 5). Of the 136 metabolites associated with at least one IS risk factor, 132 were associated with the risk factors but not with the IS outcome.

Fig. 5

Among the nineteen IS-associated metabolites, four were identified as being associated with one of the IS risk factors. MR analyses of the effect of metabolites on IS and IS risk factors. The leftmost column shows 19 IS-associated metabolites. X-24,951 was identified as being associated with BMI. 1-stearoyl-GPG (18:0) and 1-oleoyl-2-linoleoyl-GPE (18:1/18:2) were identified as being associated with DBP. Octadecadienedioate (C18:2-DC) was identified as being associated with coffee intake. *Indicates that the causal association is significant, which passed p < 0.05, false discovery rate (FDR) < 0.2, consistent OR direction across five MR methods, and pleiotropy > 0.05. BMI body mass index, LDL low-density lipoprotein, HDL high-density lipoprotein, non-HDL non-high-density lipoprotein, SBP systolic blood pressure, DBP diastolic blood pressure, T2D type 2 diabetes, CHD coronary heart disease.

Mediation effect of four metabolites on IS outcomes via risk factors

To investigate the indirect effects of metabolites on IS outcomes via risk factors, we conducted a mediation analysis using effect estimates from two-step MR and the total effect from primary MR. This analysis focused on four metabolites that showed evidence of an effect in both MR analyses on risk factors and IS outcomes: 1-stearoyl-GPG (18:0), 1-oleoyl-2-linoleoyl-GPE (18:1/18:2), octadecadienedioate (C18:2-DC), and X-24,951. Indirect effects were estimated using the product method, with standard errors (SE) and confidence intervals (CI) calculated using the delta method. The mediation effect of 1-stearoyl-GPG (18:0) via DBP accounted for 9.82% (Fig. 6a), while that of 1-oleoyl-2-linoleoyl-GPE (18:1/18:2) via DBP accounted for −12.4% (Fig. 6b). The indirect effect of octadecadienedioate (C18:2-DC) through coffee intake on IS risk accounted for −2.97% of the total effect (Fig. 6c). Similarly, the mediation effect of X-24,951 through BMI on IS was −2.85% (Fig. 6d).

Fig. 6

Mediation effects of metabolites on IS via risk factors. Mediation analyses to quantify the effects of four metabolites on IS outcomes via risk factors. (a) 1-stearoyl-GPG (18:0) levels effect on IS mediated by DBP. (b) 1-oleoyl-2-linoleoyl-GPE (18:1/18:2) effect on IS mediated by DBP. (c) Octadecadienedioate (C18:2-DC) effect on IS mediated by coffee intake. (d) X-24,951 effect on IS mediated by BMI. βEM: effects of exposure on mediator, βMO: effects of mediator on outcome, βEO: effects of exposure on outcome. IVs instrumental variables.

Phenome-wide MR (Phe-MR) analysis of IS-associated metabolites

To evaluate the effects of the nineteen IS-associated metabolites on other conditions, we conducted an extensive MR screen of 3,948 diseases and traits in the UK Biobank. Metabolites are classified as “deleterious” if their effect direction on a disease aligns with their effect on IS, and “beneficial” if the direction is opposite.

Notably, several metabolites showed significant associations with diseases known to be comorbid or mechanistically linked with IS. For instance, 1-stearoyl-GPG (18:0), a metabolite positively associated with IS, also showed risk associations with coronary atherosclerosis, heart disease, and hypercholesterolemia, indicating shared vascular-metabolic pathways. Similarly, 1-linoleoylglycerol (18:2) was positively associated with heart failure, asthma, and type 2 diabetes—all recognized IS risk factors—supporting its deleterious role in cerebrovascular health. In contrast, Octadecadienedioate (C18:2-DC) exhibited protective effects against hypertension, heart disease, and type 2 diabetes, suggesting its potential as a systemic cardiometabolic modulator. X-24,951 was linked to hypertrophic cardiomyopathy and depression, implying a role in heart-brain axis pathology, while also being inversely associated with hypercholesterolemia and valvular disease. Furthermore, 1-oleoyl-2-linoleoyl-GPE (18:1/18:2) demonstrated risk associations with hypertension and gastric ulcers but protected against respiratory conditions like bronchitis and allergic rhinitis. Finally, S-methylcysteine sulfoxide, although associated with digestive diseases, showed inverse associations with coronary atherosclerosis, asthma, and musculoskeletal conditions. Together, these findings highlight that IS-associated metabolites often influence diseases beyond stroke, particularly within cardiovascular, metabolic, and neuropsychiatric domains, offering insights into comorbidity mechanisms and translational implications.

Discussion

This study is the first large-scale analysis employing GWAS data on 1,400 plasma metabolites to systematically identify IS biomarkers. Using GWAS data on 1,091 plasma metabolites and 309 metabolite ratios, we provide robust evidence of nineteen causal metabolites for IS. Elevated levels of 1-linoleoylglycerol (18:2), S-methylcysteine sulfoxide, and the aspartate to mannose ratio were positively associated with IS, indicating that they are risk factors. The remaining sixteen metabolites showed negative associations, suggesting they act as protective factors. A prior study reported thirteen causal metabolites that did not overlap with ours, despite also focusing on a European population, which may reflect differences in recruitment, outcome definitions, and overall study design. In addition, we employed a more lenient significance threshold for instrument selection, enabling the detection of more subtle associations that might have been excluded under stricter criteria. Importantly, all 19 metabolites in our study met the screening criteria of p < 0.05, FDR < 0.2, consistent OR direction in five MR methods, and pleiotropy > 0.05, underscoring the reliability of our findings. We identified eleven risk factors causally related to IS. Of these, BMI, HDL, SBP, DBP, and T2D were positively associated with IS, indicating their crucial role in IS pathogenesis, consistent with classical epidemiological data9,37,38,39. Conversely, LDL, non-HDL, smoking, alcohol consumption, coffee intake, and CHD were negatively associated with IS, acting as protective factors, which contrasts with epidemiological data. Our analysis indicates that the associations between 1-stearoyl-GPG (18:0), 1-oleoyl-2-linoleoyl-GPE (18:1/18:2), octadecadienedioate (C18:2-DC), and X-24,951 levels with IS may be mediated by risk factors like BMI, DBP, and coffee intake. Additionally, we identified 132 other metabolites with causal relationships to these risk factors. The Phe-MR analysis highlighted the bidirectional effects of the 19 IS-associated metabolites on other conditions.

BMI, HDL, SBP, DBP, and T2D were causally associated with increased IS risk. Previous studies have shown that increased BMI significantly elevates stroke risk. Reducing BMI through lifestyle changes and dietary control can significantly lower this risk40. Changes in HDL composition appear to correlate with the severity and outcomes of acute ischemic stroke, potentially serving as biomarkers for risk stratification and management41. The China Hypertension Cohort Study revealed that elevated blood pressure (DBP/SBP) is positively associated with stroke incidence, supporting our results42. Other studies also support the causal relationship between T2D and IS risk, indicating that genetic susceptibility to T2D and higher HbA1c levels are linked to increased risk of large-artery and small-vessel ischemic strokes43. Conversely, LDL, non-HDL, smoking, alcohol consumption, coffee intake, and CHD were negatively associated with IS. Studies have shown that LDL is negatively associated with IS44, consistent with our findings. Most previous studies investigating the relationship between elevated non-HDL cholesterol levels and IS risk have found correlations, suggesting that elevated non-HDL cholesterol might be a better marker for IS risk45,46. In this study, we found for the first time that non-HDL cholesterol is negatively associated with IS risk, a result warranting further investigation. Smoking is a well-established risk factor for stroke occurrence and poor prognosis. The correlation between a family history of stroke and IS risk is more pronounced among smokers but is not observed among those who have quit smoking for over ten years or never smoked47. Recent data indicate that active smoking or exposure to environmental tobacco smoke (ETS) is linked to increased risks for all types of strokes, including their major pathological and etiological subtypes48. 
Despite our contrary findings showing that smoking is negatively associated with IS, we still encourage younger individuals to avoid smoking, promote quitting, and support smoke-free environments. Traditional observational epidemiological studies have found that, compared to non-drinkers, moderate alcohol consumers have a slightly reduced risk of IS, aligning with our findings. However, a recent study indicated that increased alcohol consumption consistently raises blood pressure levels and subsequently increases stroke risk49. This research, derived from the CKB project, included prospective follow-up and genetic data from 160,000 adults, suggesting that moderate alcohol consumption does not protect against stroke and that even low levels of alcohol intake may increase stroke risk. Large-scale cohort studies have previously found that drinking coffee may be linked to reduced risks of stroke, dementia, and post-stroke dementia50. This study showed that compared to non-coffee drinkers, people who consume 2–3 cups of coffee daily have a 32% reduced risk of stroke and a 28% reduced risk of dementia. Coffee contains polyphenols and other bioactive compounds with potentially beneficial health effects, such as neuroprotection, antioxidative stress, anti-inflammation, inhibition of β-amyloid accumulation, and anti-apoptotic properties50. Previous research has demonstrated that age-adjusted stroke incidence rates more than doubled among patients with CHD, including acute coronary syndrome (ACS), compared to those without CHD51. This contrasts with our findings. Our results partially contradict the existing literature. 
Possible reasons for identifying LDL and smoking as “protective factors” include the broad phenotype definitions in GWAS, which may overlook details such as smoking intensity or LDL subtypes; a “survivor bias” that excludes high-risk individuals from the final sample; unaddressed heterogeneity among stroke subtypes; and the inability to fully eliminate residual pleiotropy. Moreover, MR captures lifetime genetic exposure, differing from short-term or behavior-based effects. Future research employing larger cohorts, delineated stroke subtypes, and functional validation will be crucial to reconciling these discrepancies and strengthening causal inferences.

Metabolomics represents a highly promising approach for biomarker identification. Unlike other analytical techniques, its advantage lies in providing a comprehensive spectrum of low molecular weight metabolites rather than focusing on a single molecular profile. Alterations in metabolite concentrations during cerebral ischemia can profoundly affect primary neuronal function52. MR analysis of 1,400 plasma metabolites with IS revealed that 1-linoleoylglycerol (18:2), S-methylcysteine sulfoxide, and the aspartate-to-mannose ratio are positively correlated with IS, serving as risk factors. In contrast, the following metabolites were negatively associated with IS, acting as protective factors: 1-stearoyl-GPG (18:0), 4-methylcatechol sulfate, 1-oleoyl-GPG (18:1), glycodeoxycholate 3-sulfate, 1-linoleoyl-GPG (18:2), 1-oleoyl-2-linoleoyl-GPE (18:1/18:2), 1-linoleoyl-2-linolenoyl-GPC (18:2/18:3), octadecenedioate (C18:1-DC), octadecadienedioate (C18:2-DC), N-succinyl-phenylalanine, 1-palmitoyl-2-linoleoyl-GPC (16:0/18:2), X-11,299, X-24,546, X-24,951, arginine to phosphate ratio, and cholesterol-to-linoleoyl-arachidonoyl-glycerol (18:2 to 20:4) [2]. Among the 19 metabolites causally associated with IS, we identified two—1-stearoyl-GPG (18:0) and 1-oleoyl-2-linoleoyl-GPE (18:1/18:2)—that are potentially negatively associated with IS risk by modulating DBP. Notably, 1-stearoyl-GPG (18:0) was also identified in a study distinguishing elite and non-elite athletes, indicating genetic tendencies for elite athletic performance53. This metabolite might reduce IS risk by influencing DBP after intense exercise. 1-oleoyl-2-linoleoyl-GPE (18:1/18:2), known as PE (18:1(9Z)/18:2(9Z,12Z)), plays an essential regulatory role in central nervous system diseases like Parkinson’s and Alzheimer’s diseases, with DBP being a known risk factor for these conditions54,55. 
Additionally, we identified that octadecadienedioate (C18:2-DC) and X-24,951 are potentially negatively associated with IS risk by regulating coffee intake and BMI, respectively. Further research is needed to understand the relationship and mechanisms of these metabolites with IS. It is noteworthy that some analyses revealed a “negative mediation effect,” meaning the metabolite’s influence on the mediator opposes the mediator’s effect on IS. Rather than amplifying risk, this suggests the mediator may buffer or suppress the causal pathway. Such findings imply complex biological processes, possibly involving pleiotropy or compensatory responses. Further research, including functional assays or refined stratification by IS subtype, could clarify these mechanisms and inform targeted interventions. For metabolites identified in the MR analysis as positively associated with IS, we performed a multitrait colocalization analysis using the HyPrColoc package56. The positive metabolites identified through our MR analysis were not replicated through colocalization. Recent research indicates that a lack of colocalization evidence does not invalidate findings, as colocalization methods have a high false-negative rate, often around 60%57. Therefore, the absence of the same positive metabolites in our multitrait colocalization analysis does not by itself undermine the robustness of our study results. However, we acknowledge that our colocalization findings require validation in independent datasets, which limits the strength of causal interpretations. Current methods (e.g., coloc, HyPrColoc) assume a single causal variant and rely on accurate LD estimation, potentially introducing biases in regions with complex LD or small samples. Future work should confirm these results in larger cohorts, employ multiple-signal tools (e.g., SuSiE, eCAVIAR), and integrate functional assays to strengthen causal inferences.

The results of the phenome-wide MR analysis underscore the systemic influence of IS-associated metabolites, revealing interconnected pathways that extend beyond the cerebrovascular domain. Several metabolites, including 1-stearoyl-GPG (18:0) and 1-linoleoylglycerol (18:2), demonstrated consistent risk profiles for both IS and other cardiovascular, metabolic, or respiratory conditions, suggesting shared pathophysiological mechanisms. Conversely, metabolites such as octadecadienedioate (C18:2-DC) displayed protective effects against conditions frequently comorbid with IS, including hypertension and type 2 diabetes. These cross-disease associations emphasize the need to view IS not as an isolated entity but as part of a broader cardiometabolic continuum, wherein specific metabolites could serve as key mediators. In addition, the linkage of certain metabolites, like X-24,951, with both cardiovascular and neuropsychiatric conditions signals a heart-brain interplay that may inform future research on integrative preventive strategies. Protective or deleterious roles observed across multiple disease phenotypes highlight potential therapeutic targets, enabling a more personalized approach to intervention. Overall, this Phe-MR analysis advances our understanding of how IS-related metabolites influence a wide spectrum of disorders, providing valuable insights into shared molecular pathways and offering avenues for novel diagnostic, prognostic, and therapeutic developments.

Conclusion

In conclusion, we identified four human plasma metabolites with causal relationships to both IS and its risk factors. Our analysis revealed their mediating effects on IS through one or more risk factors. Using Phe-MR analysis of 3,948 phenotypes associated with the target metabolites, we found that these four metabolites influenced other conditions in the same direction as IS, all of which were protective. These findings provide new insights into screening, prevention, and treatment of IS using these metabolites as biomarkers. However, it is important to acknowledge the current lack of sufficient evidence supporting the associations between these metabolites and IS risk. We advocate for rigorously designed, large-sample, prospective cohort studies to confirm whether these metabolites can serve as definitive biomarkers for IS. Future research should focus on elucidating the pathways and biological processes involving these biomarkers to better understand the specific mechanisms by which they mediate IS.