Introduction

Liver cirrhosis is the 11th most common cause of death and is associated with high morbidity and mortality1. Approximately one million people die from cirrhosis each year1. Previous studies have shown that the development of cirrhosis is associated with various factors such as viral hepatitis, alcohol consumption, and non-alcoholic fatty liver disease2,3. Cirrhosis can lead to complications such as ascites, portal hypertension, and acute variceal hemorrhage4,5. Currently, treatment options for cirrhosis are very limited. The main treatment goals for diagnosed cirrhosis are to reduce the impact of causative factors (such as viruses or alcohol), delay the onset of decompensated cirrhosis as much as possible, and evaluate whether the patient is a candidate for liver transplantation1. For patients with decompensated cirrhosis, liver transplantation remains the mainstream recommended treatment4. Therefore, finding therapeutic targets for cirrhosis and reversing its progression is of paramount importance.

The plasma proteome is the largest protein reservoir in the human body, containing not only plasma proteins but also proteins from various tissues6. In clinical practice, it is very common to assess the physiological status of individuals by measuring the levels of circulating plasma proteins with immunoassays7. Additionally, proteins in the plasma proteome can serve as therapeutic targets for the treatment of diseases8. Because plasma proteins are dynamic and interact with each other, understanding the relationship between individual plasma proteins and diseases is exceptionally challenging7. In recent years, the rapid growth of protein quantitative trait locus (pQTL) studies has provided abundant resources for identifying disease-related proteins9,10. These data make it possible to study the association between plasma proteins and diseases through Mendelian randomization (MR) analysis. MR is an epidemiological method that exploits the random allocation of genetic variants at conception to reduce bias from confounding and reverse causation. To date, MR has identified potential therapeutic targets for various diseases11,12.

In this study, we systematically explored the association between plasma proteins and different subtypes of cirrhosis using summary-data-based Mendelian randomization (SMR) and the heterogeneity in dependent instruments (HEIDI) test. Additionally, we employed proteome-wide MR analysis (PWMR) and colocalization analysis to reveal more associated proteins. Subsequently, we conducted a druggability evaluation to investigate the therapeutic potential of the associated plasma proteins. Finally, we employed phenome-wide MR to investigate the potential side effects associated with these drug targets.

Methods

Study design

The workflow of the study is presented in Fig. 1. Initially, we conducted a preliminary analysis using data from Pan UKB (discovery cohort). Plasma proteins associated with 3 subtypes of cirrhosis were identified through SMR and the HEIDI test. Results were considered significant when the false discovery rate (FDR)-corrected P-value was less than 0.05 (PFDR of SMR < 0.05) and the protein passed the HEIDI test (PHEIDI > 0.05). Subsequently, significant proteins from the discovery cohort were extracted and validated in FinnGen data (PSMR < 0.05 and PHEIDI > 0.05)13. To uncover cirrhosis-relevant drug targets as comprehensively as possible, we also employed PWMR analysis and colocalization analysis14. Results with PFDR of PWMR < 0.05 and PPH4 > 80% were considered significant15. Proteins had to show a strong association in both the Pan UKB and FinnGen datasets to be considered significantly associated drug targets. Proteins that were strongly associated in only one dataset were regarded as suggestive associated drug targets.
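The decision rules above can be sketched in a few lines of Python. This is only an illustration: the text states that an FDR correction was used but does not name the procedure, so the Benjamini-Hochberg adjustment below is an assumption, and the function names (bh_adjust, passes_discovery, passes_replication, classify) are hypothetical helpers rather than part of any software used in the study.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (ASSUMED FDR procedure)."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def passes_discovery(p_fdr, p_heidi, alpha=0.05):
    """Discovery rule: FDR-adjusted SMR p-value < 0.05 and HEIDI p-value > 0.05."""
    return p_fdr < alpha and p_heidi > alpha


def passes_replication(p_smr, p_heidi, alpha=0.05):
    """Replication rule: nominal SMR p-value < 0.05 and HEIDI p-value > 0.05."""
    return p_smr < alpha and p_heidi > alpha


def classify(strong_in_pan_ukb, strong_in_finngen):
    """Label a protein by the number of datasets with a strong association."""
    if strong_in_pan_ukb and strong_in_finngen:
        return "significant"
    if strong_in_pan_ukb or strong_in_finngen:
        return "suggestive"
    return "not supported"
```

For example, a protein that meets the discovery rule in Pan UKB but fails the replication rule in FinnGen would be labeled "suggestive" by classify(True, False).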

Fig. 1

Flowchart of the study. Initially, a preliminary analysis was conducted using data from Pan UKB (discovery cohort). Plasma proteins associated with 3 subtypes of cirrhosis were identified through SMR and the HEIDI test. Results were considered significant when the false discovery rate (FDR)-corrected P-value was less than 0.05 (PFDR of SMR < 0.05) and the protein passed the HEIDI test (PHEIDI > 0.05). Subsequently, significant proteins from the discovery cohort were extracted and validated in FinnGen data (PSMR < 0.05 and PHEIDI > 0.05). Then, PWMR analysis and colocalization analysis were employed, and significant results (PFDR of PWMR < 0.05 and PPH4 > 80%) were obtained. Subsequently, protein-protein interaction analysis was conducted on these proteins, and drug information was searched in DrugBank and DGIdb. Finally, MR-PheWAS was employed to explore potential side effects of the drug targets.

Data source of proteomic data and cirrhosis

For pQTLs, summary GWAS data of plasma proteins were obtained from a large-scale proteomic study. The study by Ferkingstad et al. included 35,559 Icelandic individuals and measured 4,709 proteins with the SomaScan multiplex aptamer assay (version 4)9. Previous studies have indicated that trans-pQTLs are more susceptible to potential pleiotropic effects than cis-pQTLs16. Consequently, cis-pQTLs are widely employed in the screening of drug targets16,17. In our study, we selected 1,775 cis-pQTLs from a total of 4,709 pQTLs for subsequent SMR analysis. The three subtypes of cirrhosis were sourced from Pan UKB and FinnGen (Release 10), respectively18. Pan UKB uses data from the UK Biobank, which recruited 500,000 people aged between 40 and 69 years in 2006–201019. The project contains 420,531 individuals of European ancestry and 7,200 phenotypes; detailed information about Pan UKB can be found on its website (https://pan.ukbb.broadinstitute.org/docs/technical-overview). We selected cirrhosis-related GWAS summary data for the European population, including fibrosis and cirrhosis of liver (International Classification of Diseases, 10th Edition [ICD-10]: K74), primary biliary cirrhosis, and chronic liver disease and cirrhosis. FinnGen encompasses a total of 412,181 individuals of European ancestry in Release 1019. From FinnGen, we obtained fibrosis and cirrhosis of liver (ICD-10: K74) and primary biliary cholangitis (ICD-10: K74.3). Because FinnGen has no GWAS data matching the Pan UKB phenotype "chronic liver disease and cirrhosis", we consulted the Pan UKB database for the types of diseases included under this phenotype (phecode-571) using the following website: (https://phewascatalog.org/phecodes_icd10). We then substituted FinnGen data on all-cause cirrhosis, whose detailed definition is available in the study by Emdin et al.20. Accordingly, chronic liver disease and cirrhosis (Pan UKB) is referred to as all-cause cirrhosis in this study. Detailed information is displayed in Table 1.

Table 1 Data source for MR analysis in this study.

SMR and HEIDI test

First, SMR analysis was conducted to detect associations between plasma proteins (Ferkingstad et al.: 1,775 cis-pQTLs) and cirrhosis in the discovery cohort (Pan UKB). We then applied the HEIDI test to proteins associated with cirrhosis risk to ascertain whether each association was due to shared genetic variants rather than linkage disequilibrium (LD). SMR analysis and HEIDI tests were performed using SMR software (v1.3.1) with default parameters. A P-value greater than 0.05 in the HEIDI test indicated that the association was not due to LD. Subsequently, plasma proteins that passed the initial analysis (PFDR of SMR < 0.05, PHEIDI > 0.05) were examined in the FinnGen data (replication cohort) and validated using the criteria PSMR < 0.05 and PHEIDI > 0.05.
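To make the SMR step concrete, the sketch below computes the standard single-SNP SMR estimate and its approximate chi-square test from summary statistics: the effect of the protein on the outcome is b_xy = b_zy / b_zx, and the test statistic is T = z_zx² z_zy² / (z_zx² + z_zy²), compared against a chi-square distribution with 1 degree of freedom. This is an illustration of the published statistic, not the SMR software itself; the function names are hypothetical, and the HEIDI test (which needs multiple SNPs and an LD reference) is not implemented here.

```python
import math


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square(1 df): P(X > stat) = erfc(sqrt(stat / 2))."""
    return math.erfc(math.sqrt(stat / 2.0))


def smr_test(b_zx, se_zx, b_zy, se_zy):
    """Single-SNP SMR estimate.

    b_zx, se_zx: effect of the top cis-pQTL on the protein level.
    b_zy, se_zy: effect of the same SNP on the outcome (e.g. cirrhosis).
    Returns (b_xy, se_xy, p_smr).
    """
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy
    b_xy = b_zy / b_zx
    # Approximate chi-square(1) statistic from the two z-scores.
    t_smr = (z_zx ** 2 * z_zy ** 2) / (z_zx ** 2 + z_zy ** 2)
    if t_smr == 0.0:
        # No outcome effect at all: the estimate is 0 with p = 1.
        return b_xy, float("inf"), 1.0
    # Standard error implied by T = (b_xy / se_xy)^2.
    se_xy = abs(b_xy) / math.sqrt(t_smr)
    return b_xy, se_xy, chi2_1df_pvalue(t_smr)
```

With made-up inputs such as smr_test(0.5, 0.05, 0.1, 0.02), the z-scores are 10 and 5, so T = 20 and the estimated effect is 0.2.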

Proteome‑wide MR analysis and colocalization analysis

To explore potential associations of plasma proteins with cirrhosis as comprehensively as possible, we employed PWMR and colocalization analysis as supplementary analytical methods for screening cirrhosis-related plasma proteins. For PWMR, we extracted instrumental variables (IVs) based on the following criteria: (i) IVs must be strongly associated with plasma proteins (P < 5e-8); (ii) the Major Histocompatibility Complex (MHC) region (chr6:25.5–34.0 Mb) was excluded due to its complex linkage disequilibrium structure, which could significantly impact conclusions15; (iii) a threshold of r² < 0.001 and a window size of 10,000 kb were used to remove SNPs in LD; (iv) to prevent weak instrument bias, we calculated the F-statistic of each SNP (F = β²/SE²) based on previous literature21. An F-statistic > 10 indicated a low likelihood of weak instrument bias. After filtering, 4,510 proteins remained. The extracted IVs can be found in Table S1.
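As a rough illustration of criteria (i), (ii) and (iv), the following sketch filters a list of SNP records by P-value, MHC position and F-statistic. The record layout (keys "id", "chrom", "pos", "beta", "se", "p") and the function names are assumptions made for this example. LD clumping (criterion iii) requires a reference panel and is deliberately left out.

```python
def f_statistic(beta, se):
    """Per-SNP F-statistic, F = beta^2 / se^2."""
    return (beta / se) ** 2


def in_mhc(chrom, pos_bp, start_mb=25.5, end_mb=34.0):
    """True if the position lies in the MHC region chr6:25.5-34.0 Mb."""
    return str(chrom) == "6" and start_mb * 1e6 <= pos_bp <= end_mb * 1e6


def select_instruments(snps, p_threshold=5e-8, min_f=10.0):
    """Keep SNPs passing the P-value, MHC and F-statistic criteria.

    `snps` is a list of dicts with hypothetical keys:
    id, chrom, pos (base pairs), beta, se, p.
    LD clumping (r2 < 0.001 within 10,000 kb) is NOT performed here.
    """
    kept = []
    for snp in snps:
        if snp["p"] >= p_threshold:
            continue  # not strongly associated with the protein
        if in_mhc(snp["chrom"], snp["pos"]):
            continue  # inside the excluded MHC region
        if f_statistic(snp["beta"], snp["se"]) <= min_f:
            continue  # potential weak instrument
        kept.append(snp)
    return kept
```

In practice the clumping step would be run on the surviving SNPs before they are passed to the MR analysis.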

We conducted MR analysis using the ‘TwoSampleMR’ package (v0.5.8). When a protein had only one IV, we computed the association between that protein and cirrhosis using the Wald ratio method. When there were two or more IVs, we used the inverse-variance weighted (IVW) method as the primary analysis method22. Additionally, we employed the MR-Egger intercept test as a supplementary method to evaluate the presence of horizontal pleiotropy23. When the number of IVs exceeded three, we used the MRPRESSO package (v1.0) to test for horizontal pleiotropy (Global test)24. We used Cochran’s Q test to assess heterogeneity among IVs25.
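The two main estimators mentioned above can be written out directly. The sketch below implements the Wald ratio (with a first-order standard error) and a fixed-effect IVW estimate together with Cochran's Q. It is a simplified stand-in for illustration and is not the TwoSampleMR implementation, whose default IVW standard error may differ; the function names are hypothetical.

```python
import math


def wald_ratio(b_exp, b_out, se_out):
    """Wald ratio for a single instrument.

    b_exp: SNP effect on the protein; b_out, se_out: SNP effect on the outcome.
    The standard error uses the first-order approximation se_out / |b_exp|.
    """
    return b_out / b_exp, se_out / abs(b_exp)


def ivw(b_exp, b_out, se_out):
    """Fixed-effect inverse-variance weighted estimate and Cochran's Q.

    Each argument is a list with one entry per instrument.
    Returns (beta, se, q); q has len(b_exp) - 1 degrees of freedom.
    """
    ratios = [by / bx for bx, by in zip(b_exp, b_out)]
    # Weight of each ratio estimate is the inverse of its variance.
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(b_exp, se_out)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    return beta, se, q


def estimate(b_exp, b_out, se_out):
    """Use the Wald ratio for one instrument, IVW otherwise."""
    if len(b_exp) == 1:
        beta, se = wald_ratio(b_exp[0], b_out[0], se_out[0])
        return "wald_ratio", beta, se
    beta, se, _ = ivw(b_exp, b_out, se_out)
    return "ivw", beta, se
```

A large Q relative to its degrees of freedom suggests that the per-SNP ratio estimates disagree more than sampling error alone would explain.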

To determine whether cirrhosis and proteins share the same genetic variants, we conducted colocalization analysis using the ‘coloc’ package (v5.2.2)15. Colocalization analysis is based on five hypotheses: (i) cirrhosis and plasma proteins have no causal variants at the genetic locus (H0); (ii) only plasma proteins have causal variants (H1); (iii) only cirrhosis has causal variants (H2); (iv) plasma proteins and cirrhosis have two distinct causal variants (H3); (v) plasma proteins and cirrhosis share the same causal variant (H4)26. We selected SNPs within ±250 kb of each protein’s pQTL, and colocalization analysis was performed with default parameters26. A posterior probability of H4 (PPH4) higher than 80% was taken as strong evidence for colocalization15. The final significant proteins (PFDR of PWMR < 0.05 and PPH4 > 80%) were carried forward to the subsequent analyses.
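The five hypotheses can be turned into posterior probabilities once each SNP has a log approximate Bayes factor (log ABF) for each trait. The sketch below follows the usual single-causal-variant formulation and is only an illustration, not the coloc package. It assumes the log ABFs have already been computed, uses prior values p1 = 1e-4, p2 = 1e-4 and p12 = 1e-5 (assumed here to mirror the package defaults), and evaluates H3 by enumerating all pairs of distinct SNPs, which is simple but quadratic in the number of SNPs.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def coloc_posteriors(labf_protein, labf_disease, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP log ABFs.

    labf_protein[i] and labf_disease[i] refer to the same SNP i.
    Priors p1, p2, p12 are ASSUMED values, not taken from the text.
    """
    if len(labf_protein) != len(labf_disease) or len(labf_protein) < 2:
        raise ValueError("need the same SNPs for both traits, at least two")
    # H4: the same SNP is causal for both traits.
    shared = [a + b for a, b in zip(labf_protein, labf_disease)]
    # H3: two different SNPs are causal, one per trait (all pairs i != j).
    distinct = [a + b
                for i, a in enumerate(labf_protein)
                for j, b in enumerate(labf_disease) if i != j]
    log_h = {
        "H0": 0.0,
        "H1": math.log(p1) + logsumexp(labf_protein),
        "H2": math.log(p2) + logsumexp(labf_disease),
        "H3": math.log(p1) + math.log(p2) + logsumexp(distinct),
        "H4": math.log(p12) + logsumexp(shared),
    }
    norm = logsumexp(list(log_h.values()))
    return {name: math.exp(value - norm) for name, value in log_h.items()}
```

When the strongest signal for both traits sits on the same SNP, the H4 posterior dominates; when the signals sit on different SNPs, the mass shifts toward H3, which is the pattern the PPH4 > 80% threshold is meant to exclude.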

Protein‑protein interaction (PPI) and druggability evaluation

To elucidate the interaction between identified potential and significant proteins, we constructed a PPI network using the STRING database (https://string-db.org/). Additionally, to determine whether the identified proteins could serve as potential therapeutic targets, we queried DrugBank (https://go.drugbank.com/) and Drug–Gene Interaction Database (DGIdb, v4.2.0, https://www.dgidb.org/) for information on drugs associated with the identified proteins27.

Phenome-wide MR

To further investigate the potential side effects of pharmacological modulation of plasma proteins, we conducted phenome-wide MR using the top SNPs (rs28929474, rs429358, rs2228603, rs483082) from the SMR analysis and the causal SNPs (rs1229984, rs71794132) identified through colocalization analysis. The GWAS of disease phenotypes in the UK Biobank had been conducted using the Scalable and Accurate Implementation of a Generalized Mixed Model (SAIGE version 0.29) to address unbalanced case-control ratios28. Disease phenotypes with fewer than 500 cases were excluded to ensure adequate statistical power, and 782 non-cirrhotic phenotypes were selected (Table S2). Summary data related to the top SNPs and causal SNPs can be downloaded from the SAIGE GWAS resource (https://www.leelabsg.org/resources)29. Causal associations between six proteins (SERPINA1, PSG5, NCAN, APOE, ADH1B, GM2A) and other phenotypes were estimated using the Wald ratio method. Results were considered significant when P was less than 6.394e-05 (0.05/782).
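The significance threshold is a Bonferroni correction over the 782 phenotypes tested. A minimal sketch of the arithmetic and of how it would be applied to a table of P-values is given below; the function names and the phenotype labels in the usage note are hypothetical.

```python
def bonferroni_threshold(alpha=0.05, n_tests=782):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def significant_phenotypes(pvalues_by_phenotype, alpha=0.05, n_tests=782):
    """Return the phenotypes whose P-value is below alpha / n_tests."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [name for name, p in pvalues_by_phenotype.items() if p < threshold]
```

Here bonferroni_threshold() evaluates 0.05/782, which rounds to the 6.394e-05 quoted in the text, and a call such as significant_phenotypes({"pheno_a": 1e-6, "pheno_b": 1e-3}) keeps only "pheno_a".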

Results

SMR and HEIDI tests identified 4 circulating proteins for 2 subtypes of cirrhosis

The SMR analysis revealed four circulating proteins (SERPINA1, PSG5, NCAN, APOE) associated with two subtypes of cirrhosis (PFDR of SMR < 0.05). All four circulating proteins passed the HEIDI test (PHEIDI > 0.05) for their associated subtypes, namely fibrosis and cirrhosis of liver (ICD-10: K74) and all-cause cirrhosis (Figs. 2A, B and C and 3). Subsequently, we conducted the replication analysis. All four proteins passed the replication analysis (PSMR < 0.05 and PHEIDI > 0.05). Combining these results, we found that SERPINA1, PSG5, NCAN, and APOE might be significantly associated drug targets for all-cause cirrhosis, and that SERPINA1 was a significantly associated drug target for fibrosis and cirrhosis of liver (ICD-10: K74). The results of the SMR analysis and HEIDI test are displayed in Fig. 3. All SMR results can be found in Tables S3–S4.

Fig. 2

Results of the primary and supplementary analyses (Pan UKB). (A, B, C): Results of the SMR analysis. SERPINA1 was a potential drug target for cirrhosis (ICD-10: K74) and all-cause cirrhosis, and three proteins (PSG5, NCAN, APOE) were potential drug targets for all-cause cirrhosis. (D, E, F): Results of the PWMR analysis. Ten proteins (ADH1B, CRYZL1, EFNB1, GM2A, GPC6, HTR7, INS, RGS7, TP53I11, UXS1) were associated with all-cause cirrhosis. Three proteins (GPC6, HLA, HTR7) were associated with cirrhosis (ICD-10: K74). One protein (TNFRSF1B) was associated with primary biliary cirrhosis.

Fig. 3

Results of the SMR analysis and HEIDI test. The SMR analysis and HEIDI test showed that four proteins were associated with cirrhosis, including all-cause cirrhosis and cirrhosis (ICD-10: K74).

Proteome‑wide MR analysis and colocalization analysis

We identified associations between 13 proteins and liver cirrhosis, including GPC6, HLA, HTR7, ADH1B, CRYZL1, EFNB1, GM2A, HTR7, INS, RGS7, TP53I11, UXS1, and TNFRSF1B (Figs. 2D, E and F and 4). These proteins remained significant after FDR correction (PFDR of PWMR < 0.05). No significant heterogeneity or horizontal pleiotropy was observed for these proteins (P > 0.05 for Cochran’s Q test, the MR-Egger intercept test, and the MR-PRESSO Global test; Tables S5–S7). Subsequent colocalization analyses were conducted to determine whether these proteins share the same genetic variants with liver cirrhosis subtypes. Only ADH1B and GM2A shared genetic variants with all-cause liver cirrhosis (Fig. 5). Detailed colocalization information is available in Table S8. However, in the replication cohort, although PWMR indicated a causal relationship between these two proteins and all-cause liver cirrhosis, colocalization analyses did not support a shared genetic variant. Therefore, we consider ADH1B and GM2A to be suggestive pharmacological targets.

Fig. 4

Results of the PWMR and colocalization analyses. Two proteins (ADH1B, GM2A) were suggestively associated with all-cause cirrhosis.

Fig. 5

Significant results of the colocalization analysis. (A): In each plot, each dot represents a genetic variant. The causal SNP shared by cirrhosis and GM2A is rs72794132, while other SNPs are color-coded according to linkage disequilibrium (r²) in Europeans. SNPs lacking linkage disequilibrium information are coded as dark blue. On the left plot, the -log10(P) values associated with cirrhosis risk are on the y-axis, and those related to protein levels are on the x-axis. On the right plot, genomic positions are on the x-axis, and the -log10(P) values for GM2A are on the y-axis. The lower plot displays the -log10(P) values for cirrhosis in the corresponding region. (B): In each plot, each dot represents a genetic variant. The causal SNP shared by cirrhosis and ADH1B is rs1229984, while other SNPs are color-coded according to linkage disequilibrium (r²) in Europeans. SNPs lacking linkage disequilibrium information are coded as dark blue. On the left plot, the -log10(P) values associated with cirrhosis risk are on the y-axis, and those related to protein levels are on the x-axis. On the right plot, genomic positions are on the x-axis, and the -log10(P) values for ADH1B are on the y-axis. The lower plot displays the -log10(P) values for cirrhosis in the corresponding region.

PPI and druggability evaluation on potential and significant proteins

We used protein-protein interaction (PPI) analysis to explore the associations between the significantly associated proteins (SERPINA1, PSG5, NCAN, APOE) and the suggestive associated proteins (ADH1B and GM2A), as shown in Fig. 6. Subsequently, we searched DrugBank and DGIdb for drugs targeting these proteins. Detailed results are presented in Tables S9–S10.

Fig. 6

Results of the protein-protein interaction (PPI) analysis between 4 significantly associated proteins and 2 suggestive associated proteins. Lines represent interactions between proteins. Green lines indicate gene neighborhood and predicted interaction. Black lines indicate co-expression. Data were obtained from the STRING database.

Side effects of significantly associated drug targets

Six plasma proteins (SERPINA1, PSG5, NCAN, APOE, ADH1B, GM2A), under a threshold of P < 6.394e-05, were significantly associated with 12, 15, 1, 7, 5 and 0 disease phenotypes, respectively (Fig. 7, Tables S11–S12). High plasma concentrations of SERPINA1 were associated with a reduced risk of cirrhosis and of other digestive and respiratory diseases, such as cholelithiasis and emphysema. However, they were associated with an increased risk of cardiovascular disease, including ischemic heart disease, myocardial infarction, and coronary atherosclerosis. Notably, PSG5 and APOE were positively correlated with all-cause cirrhosis, indicating that lower plasma levels of PSG5 and APOE are protective factors for all-cause cirrhosis. However, PSG5 was inversely correlated with the occurrence of 15 diseases, and lower plasma PSG5 may lead to an increased risk of cardiovascular and neurological diseases, warranting careful consideration of its role as a therapeutic target for cirrhosis. Similarly, APOE was inversely associated with 7 diseases, and lower plasma APOE may likewise lead to the development of cardiovascular and neurological diseases. Additionally, it may reduce the risk of developing benign adrenal tumors. The development of cardiovascular and neurological diseases should therefore also be considered when using APOE as a therapeutic target for cirrhosis. High plasma NCAN was associated with a low risk of other chronic nonalcoholic liver disease. ADH1B was positively correlated with all-cause cirrhosis, and it also showed a positive correlation with alcohol-related disorders and hypertension. Therefore, inhibiting ADH1B could not only serve as a treatment for liver cirrhosis but also potentially reduce blood pressure and the risk of other associated diseases. It is worth mentioning that GM2A did not show an association with other diseases under the threshold of P < 6.394e-05.

Fig. 7

Manhattan plot derived from the MR-PheWAS results. Six plasma proteins (SERPINA1, PSG5, NCAN, APOE, ADH1B, GM2A), under a threshold of P < 6.394e-05, were significantly associated with 12, 15, 1, 7, 5 and 0 disease phenotypes, respectively; detailed information is displayed in Tables S11–S12.

Discussion

In this study, we explored the association between plasma proteins and cirrhosis using large-scale plasma pQTL databases. We examined potential causal relationships between 4,709 plasma proteins and three subtypes of cirrhosis: fibrosis and cirrhosis of liver (ICD-10: K74), primary biliary cirrhosis, and all-cause cirrhosis. Our results showed that SERPINA1 was significantly associated with two subtypes of cirrhosis, namely fibrosis and cirrhosis of liver (ICD-10: K74) and all-cause cirrhosis. PSG5, NCAN and APOE were associated with all-cause cirrhosis. Additionally, we identified 2 circulating proteins, ADH1B and GM2A, that were potentially associated with all-cause cirrhosis.

SERPINA1 encodes α1-antitrypsin (AAT), and mutations in this gene can lead to decreased circulating AAT. Approximately 95% of SERPINA1 coding abnormalities are associated with the PI ZZ genotype, where the Z-AAT encoded by this genotype accumulates in the endoplasmic reticulum of liver cells, causing liver cell damage and inflammatory responses, ultimately leading to cirrhosis29. Additionally, numerous studies have confirmed the genetic association between SERPINA1 and cirrhosis30,31. A recent study showed that fazirsiran (a new RNAi therapy) can promote degradation of Z-AAT messenger RNA, thereby improving liver fibrosis32. SERPINA1 is therefore already one of the candidate therapeutic targets for cirrhosis. Our study also identified SERPINA1 as a candidate target for 2 subtypes of cirrhosis, namely fibrosis and cirrhosis of liver (ICD-10: K74) and all-cause cirrhosis.

PSG5, also known as Pregnancy-Specific Beta-1-Glycoprotein 5, is mainly produced during pregnancy by syncytiotrophoblast cells33. PSG5 is associated with fetal weight during pregnancy34. The association between PSG5 and liver cirrhosis remains unknown. However, given the role of PSGs in immunosuppression, it may be inferred that they could achieve therapeutic effects by suppressing liver inflammation35.

NCAN is a member of the lectican family, and previous studies have shown that NCAN gene variants are closely associated with liver cirrhosis and inflammation in patients with non-alcoholic fatty liver disease36,37. NCAN variants also increase the risk of liver cancer in patients with alcoholic cirrhosis38. These findings suggest that NCAN plays an important role in liver fibrosis and progression to liver cancer. Our study expands the potential therapeutic scope by identifying NCAN as a potential therapeutic target for all-cause cirrhosis.

APOE is an important component of very low-density lipoproteins and is also a ligand for the low-density lipoprotein receptor39. It is mainly secreted by the liver and participates in the metabolism of triglycerides and cholesterol. A recent review has shown significant associations between apolipoprotein E genetic polymorphisms and a variety of liver diseases, such as hepatocellular carcinoma, alcoholic cirrhosis, and primary biliary cirrhosis40. Different genotypes play different roles in the development of cirrhosis; however, their reported effects on the risk of developing cirrhosis have been inconsistent across studies40. Our research demonstrates an association of APOE with all-cause cirrhosis. This suggests that APOE plays a role in cirrhosis caused by various etiologies.

Apart from GM2A, the other 5 proteins were found to be associated with various diseases, necessitating careful consideration of their potential benefits and risks when targeting them for cirrhosis treatment. Specifically, PSG5 and APOE, when considered as drug targets, present potential cardiovascular and neurological side effects. Higher SERPINA1 levels were associated with an increased risk of cardiovascular disease, including ischemic heart disease, myocardial infarction, and coronary atherosclerosis. In addition to these findings, ADH1B and GM2A emerged as suggestive pharmacological targets. Previous research has indicated that a variant in the gene encoding ADH1B (ADH1B rs1229984) is protective against alcoholic cirrhosis. However, a study focusing on a Turkish cohort of heavy drinkers revealed no association between ADH1B polymorphisms and alcoholic cirrhosis. As for GM2A, to date there is no reported evidence of its association with liver diseases or cirrhosis. Therefore, the potential links between these two proteins (ADH1B and GM2A) and cirrhosis require further investigation.

To our knowledge, our study is the first to analyze the association between cirrhosis and specific circulating proteins using large-scale circulating protein data. We also conducted a detailed analysis of the pharmaceutical value of these proteins and examined the causal relationship between these proteins as drug targets and potential side effects. However, our study has certain limitations. Firstly, we used GWAS and pQTL databases of European descent, so the generalizability of our conclusions to other populations is unclear. Secondly, our study only used circulating proteins and did not analyze intracellular proteins. A comprehensive analysis of intracellular proteins may provide more potential drug targets for cirrhosis treatment. Thirdly, due to the limitations of the SMR method and the small number of SNPs, we were unable to perform reverse causality analysis (Steiger test) for the identified proteins. Finally, the use of strict significance criteria (PFDR < 0.05 or P of phenome-wide MR < 0.05/782) may lead to false negatives, i.e., a protein may be associated with cirrhosis but not remain significant after correction.

Conclusion

Our study provides the first systematic elucidation of the association between cirrhosis and circulating plasma proteins. We identified four significantly associated circulating proteins (SERPINA1, PSG5, NCAN, APOE) and two suggestive associated proteins (ADH1B, GM2A) that exhibit genetic associations with cirrhosis, suggesting their potential as drug targets for cirrhosis. Phenome-wide MR identified the possible side effects of these drug targets.