Introduction

Chronic obstructive pulmonary disease (COPD) is a prevalent chronic airway disease characterized by persistent airflow obstruction and progressive respiratory symptoms, with high mortality and significant socio-economic impacts1,2,3. Its pathogenesis involves gene-environment interactions, with cigarette smoke (CS) exposure being the primary risk factor4,5. Despite current therapeutic advancements, COPD management remains challenging due to disease heterogeneity and limited biomarkers for early diagnosis6,7,8. Recent studies highlight mitochondrial dysfunction as a critical player in COPD progression, linking it to inflammation, epithelial damage, and steroid resistance9.

Mitochondrial impairment in COPD manifests through multiple mechanisms: CS exposure disrupts mitochondrial dynamics (fusion/fission as well as transport) and autophagy pathways10,11, while acrolein (a major CS-derived toxin) induces bioenergetic collapse and inflammation activation12. This mitochondrial impairment is thought to contribute to heightened inflammatory responses, hindered epithelial repair processes, reduced corticosteroid sensitivity in lung tissue, and impaired macrophage phagocytosis, all of which are implicated in COPD progression13. Animal models demonstrate that the mitochondria-specific antioxidant MitoQ mitigates lung injury by inhibiting mitochondrial autophagy and NLRP3 inflammasome signaling14. Clinical studies have further revealed mitochondrial dysfunction in lung tissues of COPD patients, including disrupted mitochondrial fusion, increased autophagy, and accelerated cell senescence15. These changes are closely related to aggravated airflow limitation and decreased exercise tolerance in patients16. These findings underscore the therapeutic potential of targeting mitochondrial pathways in COPD, and a number of mitochondria-specific antioxidants have been proposed as therapeutic options17,18.

However, the molecular interplay between mitochondria-related genes and immune microenvironment regulation in COPD remains poorly understood. This study aims to identify key mitochondria-related differentially expressed genes (MitoDEGs) in COPD by using multi-omics analysis of Gene Expression Omnibus (GEO) datasets, and elucidate their regulatory roles in immune infiltration and disease progression. By integrating machine learning with network biology approaches, we expect to uncover novel biomarkers and therapeutic targets for COPD management.

Materials and methods

Microarray data collection and analysis of differentially expressed genes (DEGs)

We obtained lung tissue transcriptomic datasets from the NCBI GEO database (https://www.ncbi.nlm.nih.gov/geo). The training set (GSE57148) and validation sets (GSE76925, GSE151052, GSE239897) were selected through systematic filtering by “COPD”, Homo sapiens, and lung tissue criteria. The cases retained from each dataset after quality control are listed in Table 1. Data preprocessing used the “GEOquery” (version 2.70.0)19, “limma” (version 3.58.1)20, “ggplot2” (version 3.4.4) and “ComplexHeatmap” (version 2.18.0) packages in R. Genes with adjusted P < 0.05 and |log2 fold change| > 0.58 were considered differentially expressed.
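
As an illustration of the filtering rule above, the following minimal Python sketch applies an adjusted P < 0.05 cutoff together with a log2 fold change cutoff of 0.58, taking the absolute value so that both up- and downregulated genes are retained. It is not the limma pipeline used in this study: the Benjamini-Hochberg adjustment is an assumption (the adjustment method is not stated in the text), and the gene names and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, lfc_cutoff=0.58, alpha=0.05):
    """Label genes passing both cutoffs as 'up' or 'down'."""
    adjusted = benjamini_hochberg(pvalues)
    selected = {}
    for gene, lfc, q in zip(genes, log2fc, adjusted):
        if q < alpha and abs(lfc) > lfc_cutoff:
            selected[gene] = "up" if lfc > 0 else "down"
    return selected


# Hypothetical toy input, not taken from GSE57148.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2fc = [1.20, -0.90, 0.30, 0.70]
pvalues = [0.001, 0.004, 0.0001, 0.20]

degs = select_degs(genes, log2fc, pvalues)
# A log2 cutoff of 0.58 corresponds to roughly a 1.5-fold change.
fold_change_cutoff = 2 ** 0.58
print(degs, round(fold_change_cutoff, 2))
```

In this toy input, GENE_C is highly significant but fails the fold change cutoff, and GENE_D passes the fold change cutoff but not the significance cutoff, so only two genes are retained.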

Table 1 GSE datasets referenced in this study.

Gene set variation analysis (GSVA) and functional enrichment analysis

GSVA, an extension of the Gene Set Enrichment Analysis (GSEA) framework, transforms the gene-by-sample expression matrix into a gene set-by-sample matrix so that pathway enrichment can be evaluated per sample21. Using the “msigdbr” package (version 7.5.1) in R, we retrieved the “c2.cp.kegg.v7.4.symbols.gmt” gene set from the Molecular Signatures Database (MSigDB) (https://www.gsea-msigdb.org/gsea/msigdb/index.jsp)22. Enrichment analysis was subsequently performed on COPD samples from GSE57148 using the “GSVA” package (version 1.50.0). Functional annotation was conducted through Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses using the “clusterProfiler” package (version 4.10.2)23. GO terms categorized molecular functions, biological processes, and cellular components, while KEGG enrichment analyses contextualized pathway information24,25. Significance thresholds were set at P-value < 0.05 or Q-value < 0.05. Results were visualized with the “enrichplot” (version 1.22.0) and “ggplot2” (version 3.4.4) packages.
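
GO and KEGG over-representation analyses of this kind are commonly based on a hypergeometric test. The sketch below shows that calculation for a single gene set in plain Python. It illustrates the statistic only, not the clusterProfiler implementation, and the function name and counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe, pathway, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `pathway` belong to the gene set."""
    upper = min(pathway, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 in the pathway,
# 4 genes of interest, 3 of which fall in the pathway.
p_value = hypergeom_enrichment_p(universe=20, pathway=5, selected=4, overlap=3)
print(round(p_value, 4))
```

In practice one such P value is computed per gene set and then corrected for multiple testing, which is where a Q-value threshold comes in.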

WGCNA and key modules gene identification

We applied WGCNA (version 1.72-5) to construct a co-expression network from GSE57148 transcriptomic data26. Initial sample clustering removed outliers to enhance network stability. A soft-thresholding power was selected to weight gene–gene correlations, and the dynamic tree-cutting method was then used to partition genes into 23 co-expression modules (Supplementary Table S1 and Supplementary Figure S1). Module eigengenes were computed to identify associations with COPD clinical features. Key modules showing significant correlations (|GS| > 0.7) were prioritized for further analysis. Hub genes within these modules were determined through intra-module connectivity analysis, highlighting candidates potentially involved in the pathogenesis of COPD.
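
The soft-thresholding step can be illustrated as follows: each pairwise correlation is raised to a power so that weak correlations are suppressed, and the resulting adjacencies are summed per gene to give its connectivity. The sketch assumes an unsigned network (absolute Pearson correlation), which is not stated in the text; the power of 7 is the value reported in the Results, and the expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def soft_threshold_adjacency(expr, power):
    """Unsigned adjacency a_ij = |cor(i, j)| ** power for every gene pair."""
    genes = list(expr)
    adjacency = {g: {} for g in genes}
    for g in genes:
        for h in genes:
            if g != h:
                adjacency[g][h] = abs(pearson(expr[g], expr[h])) ** power
    return adjacency


def connectivity(adjacency):
    """Sum of adjacencies of each gene to all other genes."""
    return {g: sum(row.values()) for g, row in adjacency.items()}


# Hypothetical expression profiles across four samples.
expr = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.0],
    "G3": [4.0, 3.0, 2.0, 1.0],
    "G4": [1.0, 3.0, 2.0, 4.0],
}
adjacency = soft_threshold_adjacency(expr, power=7)
k = connectivity(adjacency)
print(round(adjacency["G1"]["G4"], 4), round(k["G1"], 4))
```

Here a correlation of 0.8 between G1 and G4 shrinks to about 0.21 after thresholding, whereas perfectly correlated or anti-correlated pairs keep an adjacency of 1.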

Identification of MitoDEGs

Following Fu et al.'s protocol27, we identified 2030 human MitoDEGs through integrated analysis of three datasets: mitochondrial genes from MitoCarta3.0 (http://www.broadinstitute.org/mitocarta), DEGs detected by limma (v3.58.1) in GSE57148 transcriptomic data, and co-expression modules associated with COPD pathology via WGCNA (Supplementary Table S2)28,29. The intersection of these datasets was visualized as a Venn diagram to identify mitochondria-related genes linked to COPD.
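
The three-way intersection described above amounts to a set operation. A minimal sketch, with placeholder gene identifiers and hypothetical memberships standing in for the MitoCarta3.0, limma and WGCNA gene lists:

```python
def intersect_gene_sets(**gene_sets):
    """Return the sorted genes shared by every named gene set."""
    names = list(gene_sets)
    shared = set(gene_sets[names[0]])
    for name in names[1:]:
        shared &= set(gene_sets[name])
    return sorted(shared)


# Placeholder identifiers; real inputs would be the three gene lists.
mitochondrial = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
differential = {"GENE_A", "GENE_C", "GENE_E", "GENE_F"}
module_genes = {"GENE_A", "GENE_C", "GENE_E"}

shared_genes = intersect_gene_sets(
    mitocarta=mitochondrial, limma=differential, wgcna=module_genes
)
print(shared_genes)
```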

Development and evaluation of a classification predictive model using machine learning

To identify mitochondrial gene-based biomarkers and develop a robust COPD classification model, we adapted Maimaiti et al.'s framework30, training 143 predictive models using 12 machine learning algorithms (Lasso, Ridge, Enet, Stepglm, SVM, glmBoost, LDA, plsRglm, RF, GBM, XGBoost and NaiveBayes) to evaluate 14 MitoDEGs. Model performance was assessed via receiver operating characteristic (ROC) curves. The area under the curve (AUC) values and the corresponding 95% confidence intervals (CIs) were calculated using the “pROC” package (version 1.18.5)31. Heatmaps were used to visualize the performance of the algorithm combinations tested.
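
For reference, the AUC can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition. It is not the pROC implementation, the confidence interval step is not reproduced, and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of case/control pairs ranked correctly.

    labels: 1 for cases (COPD), 0 for controls; ties count as 0.5.
    """
    cases = [s for l, s in zip(labels, scores) if l == 1]
    controls = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical model scores for three cases and three controls.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy example eight of the nine case/control pairs are ordered correctly, so the AUC is 8/9.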

Immune infiltration analysis

We applied CIBERSORT (version 0.1.0) to quantify immune cell infiltration in COPD samples, analyzing 22 immune cell types using gene expression data32. Immune cell abundance and proportions were visualized as bar graphs (ggplot2 version 3.4.4). Correlations between cell types were assessed via Pearson correlation matrices (corrplot version 0.92). Differences in cell proportions between COPD and control groups were identified by the Wilcoxon test (P < 0.05), with results displayed as boxplots (ggplot2 version 3.4.4). Finally, Spearman correlation analysis was used to relate the five hub genes to immune cells, visualized using ggplot2 (version 3.4.4).
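
The Spearman correlation used for the gene–immune cell associations is the Pearson correlation of the ranks, with tied values given their average rank. A minimal sketch with hypothetical expression values and cell fractions:

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


# Hypothetical hub gene expression and neutrophil fractions in five samples.
gene_expression = [5.1, 6.3, 7.0, 8.2, 9.5]
neutrophil_fraction = [0.02, 0.05, 0.04, 0.09, 0.12]
rho = spearman(gene_expression, neutrophil_fraction)
print(round(rho, 3))
```

Because only the ordering of values matters, the statistic is insensitive to the skewed distributions typical of estimated cell fractions.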

Connectivity map (cMAP) analysis

We employed cMAP (https://clue.io), a platform that uses differential gene expression signatures to connect diseases, genes and small molecules33. Using the mitochondria-associated module genes from COPD lung samples as input, we identified the top 10 small molecules ranked by normalized connectivity score as candidates with therapeutic potential. Structural data were retrieved to characterize these compounds and their possible molecular interactions with the target genes.

Construction of COPD mice model and Ethical Approval

Following an established protocol34, we developed a COPD mouse model using twenty male C57BL/6 mice (6–8 weeks old) from Hunan Slake Jinda Laboratory Animal Co. Mice were housed at the Department of Laboratory Animal Science at Central South University and divided into two groups: control and COPD. The COPD group received LPS (1 mg/kg, i.t.) on days 1 and 30, plus 10 cigarettes/day (each containing 1.0 mg nicotine and 11 mg tar; Furong brand, China) in a homemade fume box twice a day, 6 days a week, for 16 weeks. Controls were exposed to air. Lung tissues were collected after euthanasia via pentobarbital anesthesia. The protocol was approved by the Ethics Committee for Laboratory Animal Welfare at Central South University (approval number CSU-2023-0191) and complied with ARRIVE guidelines.

Histological analysis

Left lung tissues were fixed in 4% paraformaldehyde overnight, paraffin-embedded, and sectioned into 4-μm slices. Hematoxylin–eosin staining was used to identify histopathological changes. Images were captured with a Leica microscope, and blinded histological scoring assessed inflammatory infiltration and alveolar damage across three random fields per sample. Inflammatory infiltrates were evaluated on a four-point scale (0–3), with 0 indicating no or occasional inflammatory cells; 1 indicating a few loosely arranged inflammatory cells; 2 indicating many cells in the interstitial and intra-alveolar spaces; and 3 indicating numerous inflammatory cells in the perivascular space35.

Reverse transcription quantitative polymerase chain reaction (RT-qPCR) analysis

Total RNA was extracted from lung tissue with Trizol reagent (Invitrogen) and reverse transcribed into cDNA with the PrimeScript™ II 1st Strand cDNA Synthesis Kit (Takara, Shiga, Japan). The 2 × ChamQ Universal SYBR qPCR Master Mix (Vazyme, China) was used to amplify each sample in a 20 μl reaction mixture. Fold changes were calculated using the 2^−ΔΔCt method, with expression levels normalized to endogenous GAPDH. Primer sequences are listed in Table 2.
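
As a worked example of the 2^−ΔΔCt calculation: the target gene Ct is first normalized to GAPDH within each sample (ΔCt), the control ΔCt is then subtracted from the COPD ΔCt (ΔΔCt), and the fold change is 2 raised to the negative of that difference. The Ct values below are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    'ref' is the reference gene (GAPDH in this study).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    ddct = delta_sample - delta_control
    return 2 ** (-ddct)


# Hypothetical Ct values: the target amplifies one cycle later
# (relative to GAPDH) in the COPD sample than in the control.
fold_change = fold_change_ddct(
    ct_target_sample=26.0, ct_ref_sample=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(fold_change)
```

A ΔΔCt of 1 therefore corresponds to a fold change of 0.5, that is, a halving of expression relative to control.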

Table 2 Detailed primer sequences for this study.

Study approval

All animal experiments were performed in accordance with relevant guidelines and regulations, including the ARRIVE guidelines. This study was approved by the Ethics Committee for Laboratory Animal Welfare at Central South University (approval number: CSU-2023-0191).

Statistical analyses

All statistical analyses were performed using R software (version 4.2.1, accessed on January 25, 2025, https://www.r-project.org) and GraphPad Prism (version 8.0, accessed on January 25, 2025). All results were expressed as mean ± standard error of the mean (SEM) of at least 3 independent experiments. T-tests were used for comparisons between two groups, while one-way ANOVA was used for three or more groups. P < 0.05 was considered statistically significant (* P < 0.05, ** P < 0.01, *** P < 0.001, **** P < 0.0001).

Results

Identification of DEGs between healthy controls and patients with COPD

Figure 1 illustrates the study workflow. After data normalization, 174 DEGs were identified in GSE57148 between COPD and healthy lungs (Supplementary Table S3), with 114 upregulated and 60 downregulated genes (Fig. 2A,B). Pathway enrichment analysis via GSVA revealed significant COPD-specific alterations: upregulated pathways included the ErbB signaling pathway, prostate cancer, type 2 diabetes mellitus, phosphatidylinositol signaling system, pathways in cancer, adherens junction, neurotrophin signaling pathway, glioma, aldosterone regulated sodium reabsorption, and O-glycan biosynthesis; downregulated pathways included proteasome, Alzheimer’s disease, the peroxisome, regulation of autophagy, the ribosome, Huntington’s disease, glutathione metabolism, base excision repair, Parkinson’s disease, and oxidative phosphorylation (Fig. 2C). Notably, mitochondrial gene set scores differed markedly between the two groups (P = 5.7e−11) (Fig. 2D), consistent with a role for mitochondrial dysfunction in COPD pathogenesis.

Fig. 1

Flowchart of the Entire Study.

Fig. 2

Detection of DEGs in COPD and control groups. (A) Volcano plot of DEGs; (B) Heat map of top 20 DEGs; (C) GSVA analysis based on the KEGG database; (D) Comparison of mitochondrial gene scores between two groups.

Weighted gene co-expression network construction

WGCNA identified key COPD-associated gene modules using the GSE57148 dataset. After filtering low-quality samples, 91 normal and 98 COPD samples were clustered. A soft-threshold (power = 7) with R² > 0.8 and average connectivity > 0.8 was applied, resulting in 32 modules (Fig. 3A–C). The clustering dendrogram shows the initial and merged modules. Key modules showed significant correlations with COPD: yellow module (r = 0.57, P = 2e−17), pink module (r = − 0.58, P = 8e−18) and blue module (r = − 0.48, P = 3e−12) (Fig. 3D). Strong module-gene associations were observed in all three modules: yellow module (621 genes, r = 0.73, P = 2.1e−104), pink module (351 genes, r = 0.72, P = 2.5e−57), and blue module (2634 genes, r = 0.61, P < 1e−200) (Fig. 3E–G), totaling 3606 genes for downstream analysis.

Fig. 3

Module genes in normal and COPD groups detected using WGCNA. (A) Determination of the optimal soft-threshold power for the dataset; (B) Initial and merged modules in the clustering dendrogram; (C) Gene co-expression modules indicated in different colors under the cluster tree; (D) Correlations between modules and the normal and COPD groups; (E–G) Scatterplots of the yellow (E), pink (F) and blue (G) modules.

Identification and functional analysis of MitoDEGs in the context of COPD

MitoDEGs were identified through the intersection of WGCNA modules, limma DEGs, and MitoCarta3.0 mitochondrial genes (Fig. 4A), yielding 14 key genes (COMTD1, ELK3, ERN1, ETFB, FASTK, HIGD1B, IFI27L2, MRPL41, MRPL55, NDUFA7, NDUFB7, NME3, PLA2G4B and ZBED3). GO enrichment analysis revealed that MitoDEGs are involved in the regulation of mitochondrial gene expression, purine ribonucleoside triphosphate biosynthetic process, mitochondrial protein-containing complexes, organellar ribosome, electron transfer activity, and structural constituent of ribosome (Fig. 4B). KEGG analysis suggested that these MitoDEGs are related to various disease pathways, including non-alcoholic fatty liver disease, Parkinson disease, Huntington disease, amyotrophic lateral sclerosis, Alzheimer disease, and others (Fig. 4C).

Fig. 4

Identification and functional analysis of MitoDEGs in COPD. (A) Venn diagram of key modular genes, DEGs and mitochondria-related genes; (B, C) GO and KEGG analysis of MitoDEGs.

Selection of candidate hub genes using machine learning algorithms

We trained 143 machine learning models using 12 machine learning algorithms to select hub genes from the 14 candidate MitoDEGs (Fig. 5A). Evaluated on the GSE57148 training cohort and three external validation cohorts (GSE76925, GSE151052, and GSE239897), the combination of glmBoost and RF achieved optimal performance (AUC = 0.72, accuracy = 0.662, recall = 0.667, precision = 0.738, and F1 score = 0.696) (Supplementary Figure S2), identifying five key genes: ERN1, FASTK, HIGD1B, NDUFA7, and NDUFB7. Compared to controls, COPD samples showed decreased expression of FASTK, HIGD1B, NDUFA7, and NDUFB7 and increased expression of ERN1 (Fig. 5B). The diagnostic model built from these genes reached an AUC of 0.881 (95% CI: 0.832–0.930) in training (Fig. 5C), and 0.613–0.881 in validation (Fig. 5D–F). Individual gene AUCs ranged from 0.774 (HIGD1B) to 0.855 (ERN1) in training (Fig. 5G–K), supporting their diagnostic value for COPD.
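
The accuracy, recall, precision and F1 score reported above are all derived from a confusion matrix. The sketch below shows the definitions using hypothetical counts; it is not intended to reproduce the values reported for the glmBoost and RF combination.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    tp/fn: COPD samples predicted as COPD/control;
    tn/fp: control samples predicted as control/COPD.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for 60 COPD and 40 control samples.
metrics = classification_metrics(tp=40, fp=15, tn=25, fn=20)
print({name: round(value, 3) for name, value in metrics.items()})
```

The F1 score is the harmonic mean of precision and recall, so it is pulled toward whichever of the two is lower.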

Fig. 5

Construction and testing of mitochondria-related genes score. (A) AUC values of 143 machine learning algorithm combinations in 4 sets; (B) Expression of the 5 candidate genes (ERN1, FASTK, HIGD1B, NDUFA7, and NDUFB7) in the training set; (C–F) ROC curves for the diagnostic model in the training set GSE57148, and validation sets (GSE76925, GSE151052, GSE239897); (G–K) ROC curves for the 5 candidate genes (ERN1, FASTK, HIGD1B, NDUFA7 and NDUFB7).

Immune cell infiltration in COPD

CIBERSORT analysis of the GSE57148 dataset revealed distinct immune cell infiltration patterns between COPD and control groups (Fig. 6A). Monocyte and neutrophil infiltration were positively correlated (r = 0.379, P = 0.0008), while activated and resting mast cells showed a significant negative correlation (r = − 0.668, P = 2.72e−07) (Fig. 6B). Differential analysis highlighted enrichment of activated dendritic cells (DCs), macrophages M0, activated mast cells, and neutrophils in COPD patients, in contrast to the dominance of macrophages M2, resting natural killer (NK) cells and T follicular helper (Tfh) cells in controls (P < 0.05) (Fig. 6C). The five key MitoDEGs displayed distinct immune associations: upregulated ERN1 correlated with activated DCs, eosinophils, activated mast cells, monocytes, neutrophils, and resting CD4 memory T cells (P < 0.01), while downregulated FASTK, HIGD1B, NDUFA7, and NDUFB7 were preferentially linked to memory B cells, macrophages M2, resting NK cells, and Tfh cells (P < 0.01) (Fig. 6D). Notably, neutrophil infiltration was significantly higher in COPD (Fig. 6C), with ERN1 showing the strongest positive correlation (r = 0.52, P = 2.04e−14). Conversely, FASTK (r = − 0.55, P = 5.87e−16), HIGD1B (r = − 0.42, P = 3.43e−09), NDUFA7 (r = − 0.46, P = 5.19e−11), and NDUFB7 (r = − 0.51, P = 5.94e−14) exhibited marked negative correlations with neutrophil levels. These findings point to immune dysregulation and mitochondrial-gene interactions in COPD pathogenesis, particularly neutrophil-driven inflammation associated with key MitoDEGs such as ERN1 and NDUFA7.

Fig. 6

Analysis of immune cell infiltration in COPD and control groups. (A) Stacked bar chart of immune cell proportions; (B) Heatmap of the correlations between 22 immune cell types; (C) Boxplots showing the differences in immune cells between the COPD and healthy control groups; (D) Correlation between 5 hub genes and 22 immune cell types.

Prediction of candidate drugs

We identified 10 mitochondria-related compounds with potential therapeutic effects on COPD via cMAP analysis (Fig. 7A). Key candidates included mitomycin-c (a DNA alkylating drug), RITA (a thioredoxin reductase inhibitor), SN-38 (a topoisomerase inhibitor), celastrol (an anti-inflammatory agent), spectinomycin (a 30S ribosomal subunit inhibitor), SB-216763 (a glycogen synthase kinase inhibitor), PTB1 (an AMPK activator/tyrosine phosphatase inhibitor), androstenol (a GABA receptor modulator), PKC beta-inhibitor, and 2-aminopurine (a serine/threonine protein kinase inhibitor). Among these, androstenol emerged as the most promising candidate due to its high score and association with GABA-mediated airway mucus regulation36. Pathways linked to glycogen synthase kinase and thioredoxin reductase were prioritized for their roles in COPD-related inflammation and oxidative stress37,38 (Fig. 7B). Fig. 7C–L shows the chemical structures of these 10 small molecule compounds.

Fig. 7

Screening of potential small molecule compounds for mitochondria-related COPD genes by cMAP analysis. (A) The top 10 compounds with the highest negative enrichment scores based on cMAP analysis; (B) Pathway descriptions associated with the 10 compounds; (C–L) Chemical structures of the 10 compounds.

Validation of core gene expression in lung tissue of COPD mice

To validate the bioinformatics findings, we established a COPD mouse model using CS/LPS induction. Lung pathological scores and mean linear intercept (MLI) were significantly elevated in COPD mice vs. controls (Fig. 8), confirming successful emphysema modeling. qPCR analysis of ERN1, FASTK, HIGD1B, NDUFA7, and NDUFB7 in lung tissues showed that HIGD1B and NDUFB7 were significantly downregulated (P < 0.05), while the changes in FASTK, NDUFA7, and ERN1 expression were not significant (P > 0.05) (Fig. 9). These results only partially confirmed the bioinformatics predictions: the downregulation of HIGD1B and NDUFB7 was reproduced in vivo, making them the best-supported candidates, whereas the other three genes did not reach statistical significance.

Fig. 8

Histological changes in the lungs of control (CTL) and COPD (CS + LPS) mice. (A) Hematoxylin–eosin staining of lung tissues. Lung pathological scores (B) and MLI analysis (C) of mice in the control and COPD groups. n = 3 mice per group, data are expressed as mean ± SEM, Student’s t-test. Compared with the control group, * P < 0.05; ** P < 0.01.

Fig. 9

Expression verification of 5 hub genes in lung tissues of COPD mice. (A) ERN1 (B) FASTK (C) HIGD1B (D) NDUFA7 (E) NDUFB7. n = 5 mice per group, data are expressed as mean ± SEM, Student’s t-test. Compared with the control group, * P < 0.05; ** P < 0.01; ns: not significant.

Discussion

COPD is a common chronic inflammatory airway disease linked to mitochondrial dysfunction39, which involves dysregulation of mitochondrial protein expression, structure, and function triggered by external stimuli such as inflammation, infection, and CS40. This dysfunction perpetuates chronic airway inflammation and immune injury41. Using MitoCarta 3.0, we identified mitochondria-related genes and explored their correlation with COPD pathogenesis and the immune microenvironment via bioinformatics methods. Significant differences in mitochondrial gene scores were observed between COPD patients and healthy controls, and a diagnostic model based on these genes showed good accuracy in the training set (AUC = 0.881) and more variable performance in validation.

Machine learning identified 5 hub MitoDEGs (ERN1, FASTK, HIGD1B, NDUFA7, and NDUFB7) associated with COPD. Animal experiments confirmed downregulation of HIGD1B and NDUFB7 in COPD, aligning with bioinformatics predictions. The hypoxia-inducible ___domain (HIGD) gene family comprises five genes (HIGD-1A, -1B, -1C, -2A, and -2B) that are crucial for mitochondrial integrity and respiratory chain complex IV42,43. Limited studies on HIGD1B suggest its involvement in tumorigenesis and pituitary adenoma progression44. Pang et al.45 demonstrated that HIGD1B maintains mitochondrial integrity under hypoxia, promoting cell survival via caspase-3/9 inhibition, while its knockdown induces mitochondrial fragmentation. NDUF (NADH-ubiquinone oxidoreductase), the first enzyme of the mitochondrial electron transport chain, plays a critical role in oxidative phosphorylation46. Its subunits NDUFA7 and NDUFB7 are linked to inflammatory diseases: elevated levels were observed in patients with COVID-19, acute lung injury (ALI) and psoriatic arthritis47,48, reflecting tissue damage. These findings highlight connections between the HIGD and NDUF families and inflammation. Combined with our data, we propose that HIGD1B and NDUFB7 downregulation in COPD disrupts mitochondrial function, exacerbating airway inflammation and disease progression, though further validation is required.

Although ERN1, FASTK, and NDUFA7 showed non-significant expression differences between groups in the animal model, their trends may still be biologically relevant given COPD’s complexity and the limited sample size49. To offset the small validation sample, we integrated multiple datasets to strengthen the findings. Existing studies link these molecules to inflammation. The endoplasmic reticulum to nucleus signaling 1 (ERN1) gene encodes the transmembrane protein kinase inositol-requiring enzyme 1 (IRE1), a key mediator of endoplasmic reticulum (ER) stress that regulates neutrophil activation and contributes to inflammatory diseases. In a mouse model of ALI, C5a receptor-mediated ER stress induces neutrophil granule release and lung injury via the IRE1α-TRAF2-NF-κB pathway50,51. Similarly, IRE1α overactivation in non-alcoholic fatty liver disease promotes liver inflammation by recruiting macrophages through extracellular vesicle release52, while in asthma models it exacerbates neutrophilic airway inflammation via Th17 cell activation53. IRE1α inhibitors reduce apoptosis and fibrosis in alveolar epithelial cells54. The Fas-activated serine/threonine kinase (FASTK) gene encodes a serine/threonine protein kinase that acts as a key post-transcriptional regulator of mitochondrial gene expression55,56 and is linked to liver and cardiac diseases. Genetic deletion of FASTK attenuated hepatic steatosis and inflammation in chronic alcoholic liver disease57, while in alcoholic cardiomyopathy FASTK is downregulated because ethanol-induced reactive oxygen species destabilize FASTK mRNA58. These studies suggest a potential role for FASTK in COPD-associated inflammation, though further large-scale studies are needed to clarify its mechanistic involvement.

Mitochondria-related genes hold significant potential for COPD prevention and treatment. Our results suggest that their expression levels may serve as biomarkers for early COPD risk prediction, while specific genes correlate with immune cell infiltration patterns, which may help guide immunotherapy strategies59. Therapeutically, targeting mitochondrial dysfunction has already made some progress in COPD: SUL-151, a novel compound with mitochondria-protective properties, reduced neutrophil infiltration and lung inflammation in COPD mice60; iron chelators, which target the dysregulated mitochondrial iron homeostasis closely related to mitochondrial dysfunction, significantly alleviated CS-induced mucociliary clearance impairment and pulmonary inflammation in COPD mice61; and taurine/3-methyladenine restored mitochondrial gene expression in emphysema models62. We have also identified 10 small molecule compounds with potential therapeutic effects in COPD. Notably, androstenol, a volatile steroid found in male sweat, emerges as a candidate compound63. Although direct COPD efficacy data are lacking, its possible biological activity as a volatile steroid and its influence on physiological responses warrant further investigation. Exogenous supplementation of androgens has been reported to alleviate pulmonary artery hypertension, increase serum insulin-like growth factor (IGF)-1 and IGF-binding protein-1, reverse the loss in diaphragm force-generating capacity, improve mitochondrial and muscle function, increase myosin expression, and attenuate pulmonary epithelial inflammation in COPD64,65. Future studies should explore the potential of androstenol in COPD, particularly its ability to modulate inflammatory pathways, enhance muscle function, and interact with the endocrine system. Investigating its mechanism and therapeutic efficacy could provide novel insights into COPD management and expand the scope of steroid-based interventions.

This study uncovered mitochondria-related genes and COPD immune microenvironment through bioinformatics analysis. However, there are some limitations. Firstly, although rigorous bioinformatics analysis was conducted, the data were sourced from public databases with limited sample sizes, and the variations in data composition, geographical distribution, and collection methods can introduce bias. Secondly, due to the specific characteristics of the datasets used, the lack of relevant clinical data limited our ability to adjust or correct for common covariates such as smoking status, ethnicity, and comorbidities, which can lead to confounding bias. Additionally, while direct validation in human tissue would be ideal, limitations in sample access made us rely on computational methods and indirect validation using available COPD animal tissues. Finally, we identified five key genes, but their exact functional roles in the pathogenesis of COPD remain speculative. Further cellular and animal experiments are needed to explore the specific roles of these genes in COPD. We acknowledge these constraints, highlight the preliminary nature of our findings, and suggest that future studies should consider incorporating larger and more diverse datasets as well as additional in vivo and in vitro experiments to more effectively address these potential confounders and enhance the robustness of the research findings.

Conclusions

This study revealed distinct mitochondrial gene expression and immune infiltration patterns between COPD and healthy controls, pointing to an interplay between mitochondrial metabolism and immune responses. Machine learning identified five key MitoDEGs (ERN1, FASTK, HIGD1B, NDUFA7, and NDUFB7), which were used to build a diagnostic classification model for early COPD diagnosis. ERN1 expression positively correlated with neutrophil infiltration, while FASTK, HIGD1B, NDUFA7 and NDUFB7 showed negative correlations. Animal validation confirmed differential expression of HIGD1B and NDUFB7 in COPD, supporting their roles in mitochondrial metabolism and immune crosstalk.