Introduction

Gastrointestinal (GI) cancer refers to cancer of the digestive tract and its associated organs, including the esophagus, stomach, pancreas, liver, gallbladder, colon, rectum, and anus. These tumors often develop slowly and are difficult to recognize early. Treatment may include surgery, radiation therapy, chemotherapy, and targeted medications. According to GLOBOCAN 2022, there were approximately 19.9 million new cases of cancer and over 9.7 million cancer deaths, both figures including nonmelanoma skin cancers (NMSCs)1. By 2050, unless cancer prevention, diagnosis, and treatment advance, these numbers could rise to around 35 million new cases and around 17 million cancer-related deaths1. GI cancer is a major cause of death worldwide. Among all types of cancer, GI-tract cancer is the third most common cause of death worldwide, after lung cancer and breast cancer1. Despite their heterogeneity, many GI-tract cancers share common molecular pathways. For example, aberrations in the Wnt/β-catenin pathway, mutations in the TP53 gene, and alterations in the PI3K/AKT/mTOR pathway are observed across multiple GI-tract cancers2,3. Chronic inflammation is a known risk factor for many GI-tract cancers4. Biomarkers associated with inflammatory processes or components of the tumor microenvironment, such as cytokines and immune cells, might be shared among different cancer types. Therefore, it is crucial to identify a common biomarker for GI-tract cancers that leverages shared molecular features and pathways to improve early detection, streamline screening, and reduce costs.

Inhibin β A (INHBA), a member of the transforming growth factor-β (TGF-β) superfamily, is encoded by the activin/inhibin gene located at the 7p14.1 chromosomal locus5. It controls immunological responses, tissue homeostasis, cell proliferation, differentiation, and embryonic development. Transforming growth factor beta (TGF-β) is a multifunctional protein that regulates the immunological response, cell proliferation, and cell differentiation. It is essential for development, tissue repair, and the progression of disease6. It is frequently associated with metastasis, immune evasion, and tumor growth in malignancies such as gastric7,8, cervical9, ovarian10, breast11,12, pancreatic13, lung14, head and neck squamous cell carcinoma15, and colorectal cancer16. The roles of INHBA and TGF-β in cancer are complicated and interconnected. INHBA can act as a ligand for TGF-β receptors, activating the TGF-β signaling pathway when it binds to these receptors. The TGF-β signaling pathway can promote cancer progression through different mechanisms17, such as inhibiting immune responses and apoptosis or inducing an epithelial-mesenchymal transition (EMT). In a study of gastric and esophageal adenocarcinoma, high activin protein expression was detected in tumors, and patients with elevated activin levels had significantly longer overall survival compared to those with lower activin expression. This indicates that the activin subunit inhibin beta A might serve as a prognostic biomarker18. In another study of gastro-esophageal (GE) cancers, several genes were identified as differentially expressed compared to normal tissues. Among plasma-proteome-associated genes, the top overexpressed genes across all stages of GE cancers were CST1, INHBA, ACAN, HSP90AB1, and HSPD1. Machine learning techniques were used to evaluate the predictive accuracy of these biomarkers, with an area under the curve (AUC) > 0.9 reported in that study19.

The molecular mechanism through which INHBA contributes to GI-tract cancer progression remains to be elucidated, but enhanced INHBA expression has been associated with tumor progression, cell migration, and poor prognosis in various cancers, including colorectal, gastric, and esophageal cancer. Although the exact role of INHBA is not known, several studies have investigated its relationship with various cancers. One study reported that INHBA and circTHBS1 share the same miRNA response elements and that circTHBS1 serves as a sponge for miR-204-5p. A dual-luciferase reporter assay revealed that miR-204-5p could bind to the 3′UTR of INHBA; based on these results, the authors hypothesized that circTHBS1 might regulate INHBA expression and promote gastric cancer progression20. In colon cancer, INHBA upregulation is likewise positively correlated with the malignant progression of colon cancer cells under the influence of the transcription factor BHLHE40. siRNA-based silencing of INHBA in the HCT116 cell line markedly suppressed cell proliferation, and a wound-healing assay revealed suppressed migration of colon cancer cells. These results suggest that INHBA modulates cell proliferation as well as migration; however, the underlying mechanism is still unclear21. A comprehensive study also reported INHBA as a promising predictor of immunotherapy response: the authors evaluated INHBA expression and its correlation with gastric cancer (GC) prognosis and found that INHBA expression was positively correlated with markers of CD8+ T cells, neutrophils, macrophages, and dendritic cells22.
A recent study in colorectal cancer proposed, based on preliminary results, that an HNF1A-AS1/miR-214/INHBA axis regulates TGF-β/SMAD signaling: HNF1A-AS1 expression was negatively correlated with hsa-miR-214, hsa-miR-214 targets INHBA, and the expression of both INHBA and HNF1A-AS1 was increased in colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ)23.

The interaction of INHBA with other receptor subunits or molecular signatures may contribute to cancer, depending on their relative gene expression. The current study undertook a comprehensive bioinformatics analysis of publicly available datasets from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO), which provide expression data for many tumor types. Our study seeks to examine the role and mechanism of INHBA in malignancies of the GI tract, including liver cancer, cholangiocarcinoma, and stomach, colon, rectum, esophageal, and pancreatic cancers. Different bioinformatics approaches are used to characterize INHBA in GI-tract cancers. This study may be significant because it might aid in identifying patients at high risk of developing certain tumors and in developing novel medications that target INHBA to improve the effectiveness of cancer therapy.

Because the precise involvement of INHBA in diverse GI-tract malignancies is still largely unexplored, we additionally investigated the putative molecular processes of INHBA in GI-tract cancers using gene expression, patient survival analysis, gene mutation status, immunological infiltration, methylation, protein interactions, and pathway analysis. These results may improve our knowledge of how INHBA interacts biologically with various GI-tract cancer types.

Results

INHBA expression in GI-tract cancers and their pathological stages

In this study, we investigated the possible significance of INHBA in the initiation and development of gastrointestinal tract cancers. Figure 1 shows the total number of TCGA tumor, TCGA normal, and GTEx samples for (Fig. 1(A)) the 7 TCGA GI-tract cancers and (Fig. 1(B)) the 24 other TCGA cancers, obtained from the GEPIA2 online server. We found significantly higher INHBA expression in Bladder Urothelial Carcinoma (BLCA), Breast invasive carcinoma (BRCA), and Head and Neck squamous cell carcinoma (HNSC). Notably higher expression of INHBA was also found in Cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), Lymphoid Neoplasm Diffuse Large B-cell Lymphoma (DLBC), Glioblastoma multiforme (GBM), Kidney renal clear cell carcinoma (KIRC), Acute Myeloid Leukemia (LAML), Brain Lower Grade Glioma (LGG), Lung squamous cell carcinoma (LUSC), Pheochromocytoma and Paraganglioma (PCPG), Prostate adenocarcinoma (PRAD), Sarcoma (SARC), and Thyroid carcinoma (THCA) when compared with normal tissues, including GTEx normal tissues. INHBA expression was low in Adrenocortical carcinoma (ACC), Kidney Chromophobe (KICH), Kidney renal papillary cell carcinoma (KIRP), Ovarian serous cystadenocarcinoma (OV), Skin Cutaneous Melanoma (SKCM), Lung adenocarcinoma (LUAD), Testicular Germ Cell Tumors (TGCT), Thymoma (THYM), Uterine Corpus Endometrial Carcinoma (UCEC), and Uterine Carcinosarcoma (UCS) (Fig. 2(A)). Among GI-tract cancers, INHBA was significantly upregulated in Colon adenocarcinoma (COAD), Esophageal carcinoma (ESCA), Pancreatic adenocarcinoma (PAAD), Rectum adenocarcinoma (READ), and Stomach adenocarcinoma (STAD), whereas no significant difference was found in Cholangiocarcinoma (CHOL). However, it was downregulated in Liver hepatocellular carcinoma (LIHC) when compared with TCGA normal and GTEx normal tissues, which suggests a role for INHBA in GI-tract-related cancers (Fig. 2(B)).
This also indicates that INHBA might play an important role in GI-tract cancers. Each TCGA tumor (red) is accompanied by its matching normal data (blue). Here, the Y axis shows expression as log2(TPM + 1), where TPM is transcripts per million, and the X axis shows the various cancers, with a significance level of p-value < 0.05. We also analyzed INHBA expression levels in various cancer types using TCGA data through TIMER2, UALCAN, and TNMplot. TIMER2 and UALCAN gave consistent results, showing upregulation of INHBA in the GI-tract cancers except liver hepatocellular carcinoma (LIHC), in which it was downregulated. In TNMplot, cancers with significant differences are highlighted in red on the x-axis. Notably, data for cholangiocarcinoma were not available for this analysis, as shown in Supplementary Figure S1 (A-C). To further investigate INHBA at the protein level in cancer patients, we analyzed the CPTAC dataset using UALCAN. Our analysis revealed that INHBA protein expression levels were higher in breast cancer, lung cancer, HNSC, and PAAD, and lower in OV and UCEC compared to normal tissues, as shown in Supplementary Figure S1 (D). Among GI-tract cancers, proteomics data were available only for PAAD, where a significant difference was observed between cancer and normal samples (p-value < 0.001).
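The expression axis described above uses a log2(TPM + 1) scale. The following minimal Python sketch illustrates that transform only; the function name and the TPM values are hypothetical and are not taken from TCGA, GTEx, or any tool used in this study.

```python
import math


def log2_tpm_plus_one(tpm):
    """Return log2(TPM + 1); the +1 pseudocount keeps zero expression defined."""
    if tpm < 0:
        raise ValueError("TPM cannot be negative")
    return math.log2(tpm + 1.0)


# Hypothetical TPM values, for illustration only.
example_tpm = [0.0, 1.0, 7.0, 63.0]
transformed = [log2_tpm_plus_one(v) for v in example_tpm]
print(transformed)  # [0.0, 1.0, 3.0, 6.0]
```

On this scale, a one-unit difference corresponds to roughly a doubling of (TPM + 1), which is why modest visual gaps between tumor and normal boxes can reflect large fold changes.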

Fig. 1

The number of samples in (A) the 7 TCGA GI-tract cancers and (B) the 24 other TCGA cancers.

Fig. 2

INHBA expression across diverse cancers. (A) INHBA mRNA expression in various TCGA cancers and normal tissues, with GTEx data used as a control. (B) INHBA expression in GI cancers (CHOL, COAD, ESCA, LIHC, PAAD, READ, and STAD) and in normal tissues and the GTEx dataset, with p-value < 0.05.

GTEx data were used to measure INHBA expression in normal tissue; expression was higher in blood vessels and lungs and relatively low in blood, muscle, pancreas, and spleen (Fig. 3(A)). Moreover, we found significant differences (p-value < 0.001) between GI-tract TCGA cancers (colon, esophageal, pancreatic, liver, and stomach) and GTEx normal samples (Fig. 3(B)). For cholangiocarcinoma and rectum adenocarcinoma, no GTEx samples were available; we therefore compared TCGA tumor samples with TCGA normal samples for these two cancers. When INHBA expression was analyzed in the microarray datasets for Intrahepatic Cholangiocarcinoma (GSE32879), Colorectal Cancer (GSE110224), Esophageal Squamous Cell Carcinoma (GSE38129), Pancreatic Ductal Adenocarcinoma (GSE71989), Liver Cancer (GSE29721), and Gastric Cancer (GSE54129), the box plots indicated that INHBA expression was significantly upregulated (p-value < 0.05) in the tumor tissues of Intrahepatic Cholangiocarcinoma, Colorectal Cancer, Pancreatic Ductal Adenocarcinoma, Esophageal Squamous Cell Carcinoma, and Gastric Cancer compared to normal tissues. However, no significant difference was observed for liver cancer (Fig. 4). The number of samples in each dataset is reported in Supplementary Table 1. Next, we plotted receiver operating characteristic (ROC) curves to evaluate the diagnostic value of INHBA expression in the same six datasets.
This analysis revealed high AUC values for Intrahepatic Cholangiocarcinoma (AUC = 0.96), Colorectal Cancer (AUC = 0.79), Esophageal Squamous Cell Carcinoma (AUC = 0.89), Pancreatic Ductal Adenocarcinoma (AUC = 0.82), and Gastric Cancer (AUC = 0.81), indicating strong diagnostic potential for these cancers. However, a much lower AUC value was observed in Liver Cancer (AUC = 0.49), which is close to the 0.5 expected by chance. These findings suggest that INHBA expression could serve as a diagnostic indicator for GI-tract cancers, except for liver cancer, as shown in Supplementary Figure S2(A-F).
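To make the AUC values above easier to interpret, the sketch below computes an AUC as the probability that a randomly chosen tumor sample has higher expression than a randomly chosen normal sample (the Mann-Whitney formulation, which equals the area under the ROC curve). It is an illustration under stated assumptions, not the procedure used in this study: the function name and the expression values are hypothetical.

```python
def roc_auc(tumor_scores, normal_scores):
    """AUC as P(tumor > normal), counting ties as one half.

    This equals the Mann-Whitney U statistic divided by n_tumor * n_normal.
    """
    if not tumor_scores or not normal_scores:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for t in tumor_scores:
        for n in normal_scores:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_scores) * len(normal_scores))


# Hypothetical expression values, for illustration only.
tumor = [5.1, 6.3, 4.8, 7.0]
normal = [3.9, 4.8, 4.1, 5.0]
auc = roc_auc(tumor, normal)
print(round(auc, 3))  # 0.906
```

Under this reading, an AUC near 1 means tumor samples almost always show higher expression than normal samples, whereas an AUC near 0.5 means expression does not separate the two groups.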

Fig. 3

A measure of INHBA expression in GTEx normal tissue. (A) GTEx data showing INHBA expression in different tissues. (B) Comparison of TCGA normal and GTEx normal tissue with GI tract cancer tissue (p-value < 0.001 (***), p-value < 0.01 (**), and p-value < 0.05 (*)).

Fig. 4

Expression levels of INHBA in various microarray datasets for different cancers, including Intrahepatic Cholangiocarcinoma (GSE32879), Colorectal Cancer (GSE110224), Esophageal Squamous Cell Carcinoma (GSE38129), Pancreatic Ductal Adenocarcinoma (GSE71989), Liver Cancer (GSE29721), and Gastric Cancer (GSE54129), compared to normal tissues.

Further, we also calculated the AUC value across cancers to assess the accuracy of INHBA in tumors. Table 1 lists the AUC values in different cancers. Among the GI-tract cancers, the AUC value is higher than 0.7 in CHOL (AUC = 0.71), COAD (AUC = 0.99), ESCA (AUC = 0.86), READ (AUC = 0.98), and STAD (AUC = 0.96). The AUC values for LIHC (AUC = 0.61) and PAAD (AUC = 0.52) were lower than those of the other GI-tract cancers. Nevertheless, this suggests that INHBA could be used as a potential predictor of tumor prognosis. Using UALCAN, we also assessed the expression of INHBA in TCGA normal samples and across the pathological stages of GI-tract cancers (Fig. 5(A-G)). Additionally, Supplementary Figure S3 shows the expression of INHBA according to nodal metastasis status, weight, age, gender, tumor grade, and other patient characteristics. We observed a significant difference in INHBA expression between normal tissues and the various stages of COAD and READ, including the early stages, suggesting INHBA’s involvement in cancer initiation. We also observed a notable difference between late-stage and early-stage cancer in STAD, which might suggest a role for INHBA in cancer progression and invasion. We did not observe any significant difference between stages in CHOL, PAAD, and LIHC (Fig. 5, p-value < 0.05), with two exceptions: in CHOL, we found a significant difference between Normal and Stage 1, and in PAAD, a significant difference was observed between Stage 1 and Stage 2. Abutalebi et al. identified dysregulation of INHBA and PELATON in colorectal cancer (CRC) using bioinformatics, validated the upregulation of PELATON in CRC tissues through RT-qPCR, and explored associations between INHBA expression, tumor purity, and the response to immune checkpoint inhibitors in CRC, focusing on COAD and READ24. In our study, we also identified higher INHBA expression in COAD and READ.
Furthermore, we conducted a comprehensive analysis across all GI-tract cancers to uncover the broader mechanisms involving INHBA.

Table 1 The AUC values for INHBA in TCGA tumor expression and TCGA normal tissue expression in pan-cancer.
Fig. 5

Expression of INHBA across different stages of GI-tract cancers (A-G) where the Y-axis represents the transcript per million and the X-axis represents the cancer stages with the total number of samples in the individual stages.

Survival analysis

To further examine the relationship between INHBA expression and disease prognosis in patients with GI-tract cancers, we used the median INHBA expression as the cutoff to split patients into high- and low-expression groups (Cutoff-High (50%) and Cutoff-Low (50%)). The median was chosen because it is a common statistical approach and yields groups of equal size, allowing balanced comparisons. We observed that cases with higher INHBA expression had worse overall survival (OS) in COAD (p-value = 0.045, HR = 1.6) and STAD (p-value = 0.028, HR = 1.4) (Fig. 6(A)). In the disease-free survival (DFS) analysis, we observed poorer DFS for patients with higher INHBA expression in COAD (p-value = 0.04, HR = 1.6) and PAAD (p-value = 0.049, HR = 1.6) (Fig. 6(B)). This indicates that INHBA expression is associated with prognosis in COAD, STAD, and PAAD. We also plotted the OS of the other GI-tract cancers, including CHOL, ESCA, LIHC, PAAD, and READ, using GEPIA2 but did not observe statistically significant differences for these cancer types, as shown in Supplementary Figure S4 (A). Additionally, we compared these findings with the UALCAN tool, which indicated a significant difference (p-value < 0.05) for COAD and PAAD. We also observed a p-value of 0.06 for STAD in the UALCAN survival analysis (Supplementary Figure S4 (B)). Together, these results indicate that higher INHBA expression has prognostic significance for survival outcomes in patients with COAD, STAD, and PAAD.
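The median-cutoff grouping and the survival curves described above can be sketched as follows. This is a simplified illustration, not the GEPIA2 or UALCAN implementation: the function names, the rule that values strictly above the median count as "high", and the small cohort are all assumptions made for the example.

```python
from statistics import median


def split_by_median(expression):
    """Label each patient 'high' if expression is above the cohort median, else 'low'."""
    cutoff = median(expression)
    return ["high" if x > cutoff else "low" for x in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed event.

    events[i] is 1 if the event (death or relapse) was observed and 0 if censored.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        observed = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical cohort, for illustration only (time in months).
expression = [2.0, 8.0, 3.0, 9.0, 1.0, 7.0]
times = [30, 10, 40, 12, 50, 20]
events = [1, 1, 0, 1, 0, 1]

groups = split_by_median(expression)
for label in ("high", "low"):
    t = [x for x, g in zip(times, groups) if g == label]
    e = [x for x, g in zip(events, groups) if g == label]
    print(label, kaplan_meier(t, e))
```

In this toy cohort the high-expression group reaches zero estimated survival by month 20, while the low-expression group stays at about 0.67; a log-rank test and a hazard ratio, as reported above, would then quantify whether such a gap is statistically meaningful.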

Fig. 6

Relationship between INHBA gene expression and survival prognosis for patients with GI-tract cancer. (A) Overall survival analysis and (B) disease-free survival analysis.

Gene alteration analysis of INHBA in GI-tract cancers

Here, we analyzed gene alterations of INHBA in cholangiocarcinoma, liver hepatocellular carcinoma, and colorectal, esophageal, pancreatic, and stomach adenocarcinoma for the available TCGA samples using cBioPortal. We noted an overall INHBA alteration frequency of 3% across the GI-tract cancers mentioned above, as shown in Fig. 7(A), which details the genetic alterations by cancer type. Further, when the “cancer type detailed” option was selected, the highest alteration frequency of INHBA was found in mucinous adenocarcinoma of the colon and rectum (> 10%), followed by esophageal squamous cell carcinoma (> 6%), mainly of the “Mutation” type, as shown in Fig. 7(B-C). When “cancer study” was selected in cBioPortal, the highest alteration frequencies were found in esophageal (> 5%) and stomach adenocarcinoma (> 4%). In esophageal adenocarcinoma, we found mutation, amplification, and deep deletion, whereas in stomach adenocarcinoma we found only mutation and amplification. In colorectal adenocarcinoma, we found mutations and a very low frequency of amplification. In cholangiocarcinoma, we observed only mutations. Additionally, we analyzed the association between INHBA alteration status and clinical survival across all available TCGA GI-tract cancers from cBioPortal. We did not observe any significant difference in overall survival (OS), disease-specific survival (DSS), progression-free survival (PFS), or disease-free survival (DFS) between cases with and without INHBA alteration, as shown in the Kaplan–Meier plots in Fig. 7(D-G). For esophageal adenocarcinoma, however, a difference in OS was observed between cases with and without INHBA alteration (p-value = 0.05), as shown in Fig. 7(I), indicating poorer prognosis and shorter survival time in the altered group. We also show the gene alterations of INHBA in all GI-tract cancer studies other than TCGA from cBioPortal in Supplementary Figure S5 (A).
No statistically significant difference was observed in the survival analysis of other GI-tract cancers with or without alteration of INHBA, as shown in Supplementary Figure S5 (B). Figure 7(H) shows lollipop plots of the INHBA mutation sites and the cancer types of the corresponding cases. Most INHBA mutations were missense mutations. We observed the R308C alteration in INHBA in one case each of esophageal adenocarcinoma, hepatocellular carcinoma, and colon adenocarcinoma. This alteration is a missense mutation that changes arginine (R) to cysteine (C) at position 308 of INHBA (Fig. 7(H)). We also observed that the transforming growth factor beta-like domain (319–426 aa) of INHBA contained seven sites, mainly with missense mutations: one case of C350R (Mucinous adenocarcinoma of colon and rectum), one case of S356N (Stomach adenocarcinoma), two cases of R377C/H (Mucinous adenocarcinoma of colon and rectum), one case of R379W (Esophageal adenocarcinoma), one case of A385S (Colon Adenocarcinoma), one case of K395R (Mucinous adenocarcinoma of colon and rectum), and one case of N417S (Intestinal Type Stomach Adenocarcinoma). Figure 7(J) shows the most frequent site of genetic alteration (R308C) on the AlphaFold structure available for INHBA. Further studies into the functional consequences of INHBA gene alterations and their interactions with other molecular pathways could offer valuable insights for developing targeted cancer therapies.
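The mutation labels above follow the usual reference-residue, position, variant-residue notation. As a small illustration, the Python sketch below parses the labels quoted in the text and checks which positions fall inside the TGF-β-like domain range (319–426 aa) given above. The helper names are hypothetical, and the code is not part of the cBioPortal workflow.

```python
import re

# Residue range of the TGF-beta-like domain as quoted in the text above.
TGFB_LIKE_DOMAIN = (319, 426)


def parse_missense(label):
    """Split a label such as 'R308C' into (reference, position, variant residue(s))."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z](?:/[A-Z])*)", label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def in_domain(label, domain=TGFB_LIKE_DOMAIN):
    """True if the mutated position lies within the inclusive domain range."""
    _, pos, _ = parse_missense(label)
    return domain[0] <= pos <= domain[1]


# Labels exactly as reported in the text above.
labels = ["R308C", "C350R", "S356N", "R377C/H", "R379W", "A385S", "K395R", "N417S"]
inside = [m for m in labels if in_domain(m)]
print(len(inside), inside)
```

With the labels reported here, seven sites fall inside the domain, while the recurrent R308C site lies just upstream of it.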

Fig. 7

Mutation analysis of INHBA in GI-tract cancers. (A) Genetic alteration of INHBA by detailed cancer type. (B-C) Mutational frequency. (D-G) Overall, disease-specific, progression-free, and disease-free survival in patients with GI-tract cancers. (H) Lollipop plot showing the mutation sites in GI-tract cancers. (I) Overall survival in ESCA according to mutation status. (J) The AlphaFold structure of INHBA showing the mutation site that is common to COAD, ESCA, and LIHC.

Analysis of immune infiltrates with INHBA in GI-tract cancers

Tumor growth, progression, and metastasis are all impacted by the presence and activity of immune cells infiltrating the tumor microenvironment. We used TIMER2 to explore cancer-associated fibroblasts (CAFs) in detail. CAFs have an essential role in the tumor microenvironment, influencing immune infiltration and tumor progression, and they promote tumor progression by secreting inflammatory ligands, growth factors, and matrix proteins. Several algorithms (EPIC, MCPCOUNTER, XCELL, and TIDE) were used to assess the correlation between INHBA expression and CAF infiltration in different TCGA cancers, with emphasis on GI-tract cancers. We observed a positive correlation between INHBA expression and CAF infiltration in all GI-tract cancers. Interestingly, CAF infiltration was also positively correlated with INHBA expression in most of the other TCGA cancers, as shown in Fig. 8(A). Correlations between the infiltration levels of different immune cells and INHBA expression are shown in Supplementary Figure S6. Fig. 8(B-H) shows, for each GI-tract cancer, the correlation plot between CAF infiltration level and INHBA expression for the algorithm with the highest correlation and a significant p-value. We also used the earlier version of TIMER to analyze the correlation between the infiltration levels of different immune cells and INHBA expression (Fig. 9(A)). In CHOL, INHBA was positively correlated with B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells, whereas in COAD, INHBA was negatively correlated with B cells. Similarly, in ESCA and STAD, INHBA was negatively correlated with B cells and positively correlated with the other immune cell infiltrates.
In LIHC and READ, we observed a correlation pattern similar to that in CHOL, with all infiltrates positively correlated, while in PAAD, CD4+ T cells were negatively correlated and the other infiltrates were positively correlated. Higher INHBA expression was positively correlated with increased immune cell infiltration, particularly of macrophages, in all the GI-tract cancers, which may contribute to cancer progression. The p-values and partial correlation values are provided in Supplementary Table 2. These findings imply that INHBA might contribute to the immune response during the development and progression of cancer. According to the cumulative survival analysis, neutrophils were significantly related to INHBA in CHOL (p-value < 0.05), while macrophages were associated with INHBA in STAD (p-value < 0.05), as shown in Fig. 9(B). We did not observe any statistically significant difference in the other GI-tract cancers in the cumulative survival analysis. As shown in Supplementary Figure S7, we performed a somatic copy number alteration analysis using TIMER to understand the distribution of each immune cell subgroup in GI-tract cancers according to INHBA copy number status (deep deletion, arm-level deletion, diploid/normal, arm-level gain, and high amplification). This analysis was performed to understand how changes in the copy number of genomic regions are associated with different immune cell populations within the tumor microenvironment. In COAD, significant differences were observed in B cells with arm-level deletion (p < 0.001), arm-level gain (p < 0.01), and high amplification (p < 0.01); in CD8+ T cells with arm-level gain (p < 0.001); in neutrophils with arm-level gain (p < 0.01); and in dendritic cells with arm-level gain (p < 0.001).
In ESCA, significant differences were found in B cells with high amplification (p < 0.05), CD8+ T cells with arm-level deletion (p < 0.05), and CD4+ T cells with both arm-level deletion and arm-level gain (p < 0.05). In PAAD, significant differences were seen in B cells with arm-level gain (p < 0.01) and CD4+ T cells with arm-level gain (p < 0.01). In READ, significant differences were observed in macrophages with arm-level gain (p < 0.05). In STAD, significant differences were found in B cells with arm-level deletion (p < 0.01), arm-level gain (p < 0.01), and high amplification (p < 0.05); in CD8+ T cells with arm-level gain (p < 0.001); in CD4+ T cells with arm-level deletion (p < 0.01), arm-level gain (p < 0.01), and high amplification (p < 0.05); in macrophages with arm-level gain (p < 0.001); in neutrophils with arm-level deletion (p < 0.05) and arm-level gain (p < 0.001); and in dendritic cells with arm-level gain (p < 0.001), as also reported by Liu et al. in their gastric cancer study19. Here, however, we carried out a systematic analysis of somatic alterations across all GI-tract cancers to understand the underlying mechanism of INHBA in the other GI-tract cancers as well.
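The correlations between INHBA expression and infiltration levels discussed above are rank-based in spirit. The sketch below computes a plain Spearman rank correlation in Python as an illustration. It is an assumption-laden simplification: it omits the tumor-purity adjustment behind the partial correlation values in Supplementary Table 2, and the function names and data points are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical expression and infiltration estimates, for illustration only.
inhba_expression = [1.2, 2.5, 3.1, 4.8, 5.0]
infiltration = [0.10, 0.30, 0.20, 0.50, 0.40]
rho = spearman(inhba_expression, infiltration)
print(round(rho, 2))  # 0.8
```

A positive rho, as in this toy example, corresponds to the pattern reported for most immune cell types; a negative rho would correspond to the B-cell pattern described for COAD, ESCA, and STAD.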

Fig. 8

Correlation of INHBA and immune infiltration of Cancer-associated fibroblast in GI tract cancers (A) Heatmap (B-H) Correlation with CHOL, COAD, ESCA, LIHC, PAAD, STAD and READ.

Fig. 9

Correlation analysis with immune cells (A) Correlation of INHBA and immune infiltration of B cells, T cells CD8+, CD4+, Macrophages, Neutrophils and Dendritic cells in GI tract cancers (B) Survival plot of immune infiltrates of INHBA expression.

DNA methylation analysis

Gene methylation has been extensively studied in various human cancers. We therefore analyzed DNA methylation of INHBA using the SMART App tool, examining the genomic information of INHBA along with its associated CpG probes. At a genomic length scale of 1000, the probes were distributed mostly in the N shore and N shelf regions, as shown in Fig. 10(A) and (B). We also observed a significant difference between tumor and normal methylation levels in GI-tract cancers when aggregated CpG methylation was considered for CHOL, COAD, ESCA, LIHC, and READ, but no significant difference was observed for PAAD and STAD, as shown in Fig. 10(C). Supplementary Figure S8(A-G) shows INHBA promoter methylation levels in tumor and normal TCGA samples: tumor methylation is higher in CHOL, whereas it is lower in the other GI-tract cancers. Significant p-values are shown in red (p < 0.05). Using “Methylation DIY,” we analyzed the difference in methylation between tumor and normal samples for each probe ID in each GI-tract cancer, together with methylation by pathological stage based on TCGA clinical data (Supplementary Figure S9). Among the 18 CpG probes, cg18413237 showed lower methylation in tumors, with a statistically significant difference (p < 0.05) in all gastrointestinal tract cancers, as shown in Supplementary Figure S9(A). These methylation observations suggest that INHBA might play different roles in different GI-tract cancers.

Fig. 10

DNA methylation study of INHBA (A) Genomic information of INHBA & the name of transcript are given (B) CpG shelves and shores (C) CpG methylation aggregation of INHBA in tumor and normal samples.

Gene enrichment analysis of INHBA-associated genes

To further study the role of INHBA in tumorigenesis, we used the STRING database to find INHBA-binding proteins. We identified 50 genes interacting with INHBA by setting the “max number of interactors to show” to 50 and the “minimum required interaction score” to 0.4. These interactions, depicted in Fig. 11(A), were determined based on text mining, experiments, databases, co-expression, gene fusion, and co-occurrence. The analysis identified 51 interacting protein nodes, including INHBA, and 377 edges. The network exhibited an average node degree of 14.8, an average local clustering coefficient of 0.716, and significant protein-protein interaction (PPI) enrichment (p-value < 1.0e-16). We also repeated the analysis with a more stringent configuration in the STRING database. We set “active interaction sources” to include experiments, text mining, databases, co-expression, neighborhood, gene fusion, and co-occurrence, the “minimum required interaction score” to 0.7, and the “maximum number of interactors to show” to 50. This analysis identified 17 interacting protein nodes and 64 edges, with an average node degree of 7.53, an average local clustering coefficient of 0.821, and significant PPI enrichment (p-value < 1.0e-16). Further, to find any overlapping genes, we compared these results with 50 genes screened from GEPIA2 based on expression similarity to INHBA across all tumor expression data. No common genes were found, as illustrated in Supplementary Figure S10 (A-B).
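The average node degrees reported above follow directly from the node and edge counts, because each undirected edge contributes to the degree of two nodes. The short Python check below reproduces both values from the counts stated in the text; the function name is hypothetical.

```python
def average_node_degree(n_nodes, n_edges):
    """Mean degree of an undirected network: every edge touches two nodes."""
    if n_nodes <= 0:
        raise ValueError("the network needs at least one node")
    return 2.0 * n_edges / n_nodes


# Node and edge counts as reported in the text above.
score_04 = average_node_degree(51, 377)  # minimum interaction score 0.4
score_07 = average_node_degree(17, 64)   # minimum interaction score 0.7
print(round(score_04, 1), round(score_07, 2))  # 14.8 7.53
```

Raising the minimum interaction score from 0.4 to 0.7 therefore roughly halves the average number of partners per protein, leaving a smaller, higher-confidence network.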
We took the 51 screened proteins and performed gene ontology (GO) and pathway analysis using ShinyGO (version 0.77). The KEGG pathways (https://www.kegg.jp/kegg/kegg1.html) associated with the screened proteins included the TGF-beta signaling pathway, signaling pathways regulating pluripotency of stem cells, colorectal cancer, and pancreatic cancer, as well as the AGE-RAGE signaling pathway, inflammatory bowel disease, Th17 cell differentiation, cytokine-cytokine receptor interaction, the Hippo signaling pathway, and diabetic cardiomyopathy, as shown in Fig. 11(B). These pathways collectively suggest that INHBA may influence tumorigenesis through various mechanisms, such as altering cell signaling, affecting the tumor microenvironment, and promoting inflammation. The GO analysis revealed that most of these genes are involved in the activin receptor signaling pathway and its regulatory mechanisms. This pathway is essential for cell growth, differentiation, and programmed cell death. The observed positive regulation and phosphorylation terms indicate active signaling, which can influence tumor progression. Signaling through transmembrane receptor protein serine/threonine kinases, which carry signals from outside the cell to the inside, affects cellular responses such as proliferation and survival. When this signaling is disrupted, it can lead to tumor growth and metastasis. The involvement of the extracellular matrix and cell surface receptor complexes suggests that INHBA may interact with the tumor microenvironment and cell surface receptors, impacting cell communication and growth. Functions such as activin receptor activity, SMAD binding, and growth factor binding highlight INHBA’s role in interacting with key signaling molecules and receptors, as shown in Fig. 11(C-E).
Furthermore, we screened 50 genes from GEPIA2 on the basis of expression similarity to INHBA across all tumor expression data. Overlapping these with the 51 genes screened from the STRING database identified five common genes that both bind INHBA and share its expression pattern, as shown in Fig. 11(F). All five genes were positively correlated with INHBA in CHOL, COADREAD (colorectal), LIHC, PAAD, ESCA, and STAD, as shown in Fig. 11(G). The five overlapping genes, namely Biglycan (BGN)25,26,27,28, Sulfatase 1 (SULF1)29,30,31, Thrombospondin 2 (THBS2)32,33, Collagen Type I Alpha 2 Chain (COL1A2)34,35, and Collagen Type III Alpha 1 Chain (COL3A1)36,37,38, have been previously studied in various GI-tract cancers.

Fig. 11

INHBA-related gene enrichment analysis. (A) PPI network of the top 50 genes associated with INHBA. (B-E) KEGG pathway, BP, CC, and MF analysis of the top 50 INHBA-associated proteins using ShinyGO (version 0.77) (http://bioinformatics.sdstate.edu/go77/). (F) Common genes from the STRING database and GEPIA2. (G) Heatmap of the correlation between INHBA and the shared genes.

We utilized LinkedOmics to investigate gene correlations in the CHOL, COADREAD, LIHC, ESCA, PAAD, and STAD cohorts. Employing the Pearson correlation method via the “LinkFinder” module, we analyzed associations between INHBA and differentially expressed genes. Heat maps in Supplementary Figure S11 illustrate the top 50 genes positively and negatively correlated with INHBA. Additionally, using the “LinkInterpreter” module, we conducted Gene Set Enrichment Analysis (GSEA) for KEGG pathways, biological processes, cellular components, and molecular functions, detailed in Supplementary Figure S11. In the CHOL cohort, the top positively and negatively correlated genes were COL8A1 and BLOC1S1, respectively. In COADREAD, LIHC, ESCA, PAAD, and STAD, the top positively correlated genes were COL5A2, KIAA0754, TNFAIP6, ADAMTS12, and COL10A1, and the top negatively correlated genes were C6orf136, ATP5G2, TJP3, TRMU, and ALDH9A1, respectively.

When we compared the GSEA results across all the GI-tract cancers, the main shared biological processes were extracellular structure organization (in all GI-tract cancers), muscle cell migration (CHOL, COADREAD, PAAD, STAD, and LIHC), and collagen metabolic process (ESCA and STAD). The negatively enriched processes were mitochondrial and related pathways. In the KEGG pathway analysis, the commonly regulated pathways were ECM-receptor interaction and focal adhesion (positively enriched) and mainly fatty acid degradation (negatively enriched). For cellular components, the correlated genes were enriched in collagen trimers and the extracellular matrix and negatively enriched in the NADH dehydrogenase complex. In terms of molecular function, they were mainly involved in extracellular matrix structural constituents, extracellular matrix binding, and collagen binding, among others. The GO and KEGG results are shown individually for each GI-tract cancer in Supplementary Figure S11.

Discussion

Cancers of the gastrointestinal tract carry a significant risk of recurrence and a poor prognosis. Despite the availability of different treatments, gastrointestinal tract cancer remains a leading cause of death worldwide39. Accurate biomarkers are therefore needed for early detection and for monitoring disease progression. INHBA has previously been investigated across various cancers. However, no systematic study has yet been made of the role of INHBA in cancers of the GI tract. In this study, we performed a bioinformatics analysis of the role and mechanism of INHBA in the GI tract cancers for which TCGA data are available. Several biomarkers are already available for these cancers, but it would be valuable to identify one that could serve as both a marker and a therapeutic target across GI-associated cancers.

INHBA, a member of the TGF-β superfamily, encodes a subunit of activin and inhibin and plays a role in cancer progression5. In our study, INHBA was significantly upregulated in many cancers, in agreement with previous reports in gastric cancer7, esophageal squamous cell carcinoma40, colon adenocarcinoma41, and breast cancer12. We focused mainly on GI tract cancers and observed overexpression of INHBA in COAD, ESCA, PAAD, READ, and STAD. One previous study found that INHBA acts as a senescence-related regulator, a prognostic predictor in CRC, and a mediator of immune evasion42. In pancreatic cancer, knockdown of two genes, INHBA and MMP14, inhibited cancer invasion, and overexpression of INHBA has been reported to play a role in cancers13. In vitro studies have shown that INHBA enhances breast cancer cell proliferation and invasion and induces epithelial-mesenchymal transition (EMT), thereby enhancing cell motility by activating TGF-β-regulated genes. These findings suggest that INHBA might be employed as a diagnostic marker in GI tract cancers. A study on colon adenocarcinoma examined the role of INHBA in predicting the prognosis of patients with COAD. The authors developed a nomogram that combines genetic information with clinical data to predict survival and recurrence in COAD patients. Using data from TCGA, they identified differentially expressed genes (DEGs) and created a new model that incorporates INHBA expression alongside clinical factors. This model was compared to a TNM nomogram based on the pathologic TNM stage, and the results suggest that the combined model outperforms the TNM model in predicting survival and recurrence, highlighting the clinical utility of integrating genetic and clinical data43.
In our study, we found that higher expression of INHBA correlates with poorer overall survival (OS) in COAD. In another study on colorectal cancer (CRC), two genes, CTHRC1 and INHBA, were identified as key drivers of disease progression. These genes are highly expressed in cancer-associated fibroblasts (CAFs) within the tumor microenvironment. Single-cell sequencing analysis revealed that CTHRC1 and INHBA are significantly upregulated in fibroblasts within CRC tumor tissues compared to normal tissues. The study also found that fibroblasts are significantly more abundant in CRC tissues than in normal tissues, and a high proportion of fibroblasts is associated with poor prognosis in CRC patients. Consequently, CTHRC1, INHBA, and CAFs are identified as crucial contributors to CRC progression and could serve as promising therapeutic targets44. In our study, we observed a positive correlation between INHBA expression and CAFs in all GI tract cancers.

In our study, the prognostic analysis of GI-tract tumors was visualized with the “Survival heatmap” and Kaplan-Meier plots generated in GEPIA2 from the TCGA dataset. The study used two primary survival analysis criteria: (1) overall survival (OS) and (2) disease-free survival (DFS). The Kaplan-Meier plots for OS in COAD and STAD and for DFS in COAD and PAAD indicated a correlation between INHBA expression and poorer prognosis. INHBA showed higher expression in several gastrointestinal cancers (COAD, ESCA, PAAD, READ, and STAD) but not in CHOL or LIHC. High INHBA expression correlated with adverse outcomes specifically in COAD, PAAD, and STAD, while its role varied across different GI cancer types, suggesting a complex involvement in cancer progression. Using cBioPortal, we observed that the R308C mutation occurred frequently in COAD, ESCA, and LIHC. We also found the R377C/H mutation in the TGF-beta domain in mucinous adenocarcinoma of the colon and rectum. Further detailed investigations using molecular dynamics analysis are required to assess the impact of these INHBA mutations and provide insights into their atomic-level implications. When we analyzed genetic alterations of INHBA in the combined TCGA GI-tract cancers using the cBioPortal KM plots, we did not find any statistically significant differences in disease-specific interval, disease-specific survival, overall survival, or progression-free survival between patients with and without INHBA alterations. In esophageal adenocarcinoma, however, genetic alterations of INHBA were associated with poor overall survival.

The tumor microenvironment (TME) consists of various cellular and non-cellular components, including non-cancerous cells, nerves, lymphoid tissues, blood vessels, and metabolites, all located within or near the tumor site, whether at its core, edges, or adjacent areas. It is an environment that has an impact on all aspects of tumor biology45. The TME relies heavily on fibroblasts and immune cells. Our findings show that INHBA expression in GI tract cancers is positively correlated with cancer-associated fibroblasts, which suggests that cancer-associated fibroblasts might be exploited as a potential avenue for immune treatment of GI tract-associated cancers.

The DNA methylation study showed that aggregate CpG methylation of INHBA differed significantly between tumor and normal samples in the different GI tract cancers, which points to a novel approach for studying the individual CpG probes and their correlation with INHBA expression. The KEGG pathways associated with INHBA-interacting proteins were mainly enriched in the “TGF-beta signalling pathway”, “Signalling pathways regulating pluripotency of stem cells”, “Colorectal cancer”, and “Pancreatic cancer”, while GO enrichment analysis revealed that INHBA mainly functions in the “Activin receptor signalling pathway”, “SMAD associated pathways”, “SMAD protein complex”, “Activin receptor activity”, and “Activin receptor complex”. In addition, we found that Biglycan (BGN), Sulfatase 1 (SULF1), Collagen Type III Alpha 1 Chain (COL3A1), Thrombospondin 2 (THBS2), and Collagen Type I Alpha 2 Chain (COL1A2) were positively correlated with INHBA, which suggests that the relationship between these genes might regulate pathways involved in GI-tract cancer progression.

Conclusion

In this study, we conducted a comprehensive bioinformatics analysis of INHBA in GI-tract cancers (CHOL, COAD, READ, ESCA, PAAD, LIHC, and STAD) and explored the relationship between INHBA expression level and disease prognosis, mutations (gene alterations), immune cell infiltration, integrative gene analysis, INHBA DNA methylation, and associated pathways. INHBA is overexpressed in COAD, ESCA, PAAD, READ, and STAD and is significantly associated with poor overall survival in COAD and STAD. The results also suggest a role for INHBA in immune infiltration in GI tract cancers. INHBA DNA methylation also differs significantly in all GI tract cancers. Therefore, INHBA might be used as a promising biomarker for GI tract cancers. A molecular dynamics study of the mutation sites might provide insight into the TGF-beta domain and into the structural impact of the mutations on INHBA and its stability. The above findings indicate that INHBA might have different biological functions in different GI tract cancers, but they still need clinical validation in more cancer patients.

Finally, because our study is based on publicly available data and bioinformatics analysis, further research on INHBA is needed; it might yield a more clinically significant and potentially therapeutic gene target for GI tract cancers.

Materials and methods

For the current study, we used different bioinformatics approaches, online web servers, and datasets: GEPIA2 (http://gepia2.cancer-pku.cn/#analysis), a web server that collects data from the TCGA data portal and the GTEx normal tissue database; TIMER2 (http://timer.cistrome.org/), a web resource for systematic analysis of immune infiltrates across a wide range of cancer types, used here for INHBA; SMART (http://www.bioinfo-zs.com/smartapp/), an interactive web tool for DNA methylation analysis and visualisation, used here for INHBA; UALCAN (https://ualcan.path.uab.edu/), a web server used for the stage analysis of INHBA in various GI tract cancers; cBioPortal (https://www.cbioportal.org/), an online web server for exploring INHBA and its mutations in data from different GI tract cancers; ShinyGO (version 0.77) (http://bioinformatics.sdstate.edu/go77/), used for gene ontology and pathway analysis; and UCSC Xena (https://xena.ucsc.edu/).

Gene expression data

The gene expression of INHBA was studied across all TCGA GI-tract tumors using the GEPIA2 tool46 with a log2FC threshold of 1 and a p-value significance level of 0.01. We used GEPIA2 to examine the differential expression pattern of INHBA in tumor samples versus TCGA normal and GTEx tissues for 31 cancer types (24 other cancer types and seven gastrointestinal cancers), and pan-cancer data were obtained from UCSC Xena47.
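The thresholds above can be illustrated with a short sketch. It assumes, as we understand GEPIA2 to do, that the log2 fold change is the difference between the median tumor and median normal expression on the log2(TPM + 1) scale; the p-value is taken as an input because it is computed by the server. The function names and the toy TPM values are illustrative and are not data from this study.

```python
import math
from statistics import median


def log2_tpm(values):
    """Transform raw TPM values to the log2(TPM + 1) scale."""
    return [math.log2(v + 1.0) for v in values]


def log2_fold_change(tumor_tpm, normal_tpm):
    """Assumed definition: median(tumor) - median(normal) on the log2(TPM + 1) scale."""
    return median(log2_tpm(tumor_tpm)) - median(log2_tpm(normal_tpm))


def is_differentially_expressed(log2fc, p_value, fc_cutoff=1.0, p_cutoff=0.01):
    """Apply the cutoffs stated in the text: |log2FC| > 1 and p < 0.01."""
    return abs(log2fc) > fc_cutoff and p_value < p_cutoff


# Toy TPM values (hypothetical), chosen so the arithmetic is easy to follow.
tumor = [7, 15, 31]
normal = [0, 1, 3]
fc = log2_fold_change(tumor, normal)
print(fc, is_differentially_expressed(fc, p_value=0.001))
```

A gene passes only when both conditions hold, so a large fold change with a p-value of 0.05 would not be called differentially expressed under these settings.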

We used UALCAN48 for pathological stage analysis to examine INHBA expression in different stages of GI tract cancer. All the box plots in the figure were created using log2(TPM + 1)-transformed expression data. Using the UALCAN “Clinical Proteomic Tumor Analysis Consortium” (CPTAC) module, we also explored changes in total INHBA protein levels between GI-tract cancers and normal tissues. Among all GI tract tumors, a protein expression dataset was available only for PAAD.

Survival analysis

We used the GEPIA2 online tool to obtain the survival heat map. We also performed survival analysis49 for (1) overall survival (OS) and (2) disease-free survival (DFS) in relation to INHBA expression, restricted to samples from GI tract-related cancers. Survival time was expressed in months rather than days. Samples were divided into high- and low-expression groups using a 50% high cutoff and a 50% low cutoff, and a p-value < 0.05 was taken as the level of significance for the Kaplan-Meier50 plot. Then, using the GEPIA2 survival analysis module, we reported the cancers with statistically significant differences in patient survival using a heatmap and obtained the corresponding survival plots.
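The grouping and the survival estimate described above can be sketched as follows. This is a minimal, self-contained illustration of a median split followed by the Kaplan-Meier product-limit estimator, not the GEPIA2 implementation; the function names and the toy follow-up data are hypothetical.

```python
from statistics import median


def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the median (50% cutoff)."""
    cutoff = median(expression)
    return ["high" if e > cutoff else "low" for e in expression]


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up in months
    events : 1 if death was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve


# Toy cohort (hypothetical): four patients, one censored at 10 months.
groups = split_by_median([1.2, 3.4, 5.6, 7.8])
curve = kaplan_meier([5, 10, 10, 20], [1, 1, 0, 1])
print(groups)
print(curve)
```

In practice one curve is estimated per expression group, and the two curves are compared with a log-rank test, which is omitted here for brevity.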

Gene mutation analysis

We utilised the cBioPortal tool51 to study the prevalence, type, and location of INHBA mutations in all GI tract tumors. We also assessed survival in relation to INHBA genetic mutations. The INHBA mutation site information is presented in the ‘Mutations’ segment. Then, using the survival/comparison analysis module, we analysed (1) overall survival (OS), (2) progression-free survival (PFS), (3) disease-specific survival (DSS), and (4) disease-free survival (DFS) data from TCGA cancer patients with and without INHBA gene alterations and obtained the relevant KM plots with log-rank test p-values for the combined GI tract cancers. The structure of INHBA was generated using a 3D protein imaging online tool52.

Analysis of the correlation between INHBA expression and immune cell infiltration

For immune infiltration analysis, we used the TIMER53 and TIMER254 websites to assess the relationship between INHBA expression and immune cell infiltration across all GI tract cancers. For further investigation, we chose (1) B cells, (2) macrophages, (3) natural killer cells, (4) neutrophils, (5) CD8+ T cells, and (6) CD4+ T cells from TIMER, and we also assessed cancer-associated fibroblasts (CAFs) from TIMER2. For the CAF analysis, the p-values and partial correlation values of four algorithms, namely EPIC, MCPCOUNTER, TIDE, and XCELL, were obtained from TIMER2 using Spearman’s rank (purity-adjusted) correlation analysis. After the data were visualized, a correlation plot, survival plot, and mutation plot were generated using TIMER for the different GI-tract cancers.
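A purity-adjusted Spearman correlation can be understood as a partial rank correlation between expression and infiltration with tumor purity as the controlled variable. The sketch below shows the standard first-order partial correlation formula applied to Spearman coefficients; it is an illustration under that assumption rather than TIMER2's own code, and the function names and toy vectors are hypothetical.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after adjusting for z (e.g. tumor purity).

    Undefined (division by zero) if z is perfectly rank-correlated with x or y.
    """
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy vectors (hypothetical): gene expression, CAF score, tumor purity.
expression = [1, 2, 3, 4, 5]
caf_score = [2, 1, 4, 3, 5]
purity = [5, 3, 4, 1, 2]
print(spearman(expression, caf_score))
print(partial_spearman(expression, caf_score, purity))
```

Adjusting for purity matters because stromal and immune cell signals are diluted in purer tumors, which by itself can induce a correlation between a stromal gene and an infiltration estimate.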

INHBA-related gene enrichment analysis

The online STRING tool55 was used to analyse the molecular interaction network for INHBA and related genes. We defined the network with a minimum required interaction score of 0.400 (medium confidence) and a maximum of 50 interactors to show. This yielded the top 50 INHBA-interacting proteins and the accompanying network diagrams. We then used ShinyGO (version 0.77)56,57,58,59,60,61 to perform an enrichment analysis of INHBA and the associated genes, covering gene ontology (biological processes (BP), cellular components (CC), and molecular functions (MF)) and KEGG pathways. Further, GEPIA2 was used to find the top 50 genes with expression similar to INHBA, and Venny2.0 was used to find the genes shared between STRING and GEPIA2. Finally, we used the TIMER2 gene correlation module to obtain heatmap data for the five common genes in different cancer types, and the degree of association was quantified by the Spearman rho value (adjusted for purity).
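The overlap step performed with Venny2.0 amounts to a set intersection of two gene lists. The sketch below illustrates it with the five shared genes named in this study; the remaining entries are explicit placeholders rather than real members of either list, and the function name is ours.

```python
def shared_genes(list_a, list_b):
    """Return the sorted intersection of two gene lists, ignoring case and whitespace."""
    set_a = {g.strip().upper() for g in list_a}
    set_b = {g.strip().upper() for g in list_b}
    return sorted(set_a & set_b)


# The five overlapping genes come from the text; the other names are placeholders.
string_interactors = [
    "BGN", "SULF1", "THBS2", "COL1A2", "COL3A1",
    "STRING_ONLY_1", "STRING_ONLY_2",
]
gepia2_similar = [
    "BGN", "SULF1", "THBS2", "COL1A2", "COL3A1",
    "GEPIA2_ONLY_1", "GEPIA2_ONLY_2",
]

common = shared_genes(string_interactors, gepia2_similar)
print(common)
```

Normalising case and whitespace before intersecting guards against spurious mismatches when gene symbols are pasted from different web tools.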

LinkedOmics analysis

We used LinkedOmics62, an online platform for analyzing and evaluating cancer data across all cancer projects. Its analytical modules were used to identify genes significantly positively and negatively correlated with INHBA in each GI tract cancer. Functional enrichment of the association results was then analysed and compared using Gene Set Enrichment Analysis (GSEA)63. In this investigation, we used Pearson correlation with a significant p-value to highlight the top positively and negatively correlated genes in the different GI tract tumors.

SMART app

We employed SMART (Shiny Methylation Analysis Resource Tool), a web app64, to facilitate the DNA methylation analysis of INHBA. It is a user-accessible web tool designed to study DNA methylation data from various cancer projects within TCGA. The SMART app was used to investigate the methylation probes associated with INHBA. We analysed the different probe IDs associated with INHBA in normal and tumor tissue from several GI-tract malignancies. Methylation across distinct clinical stages of cancer was also investigated.