Abstract
Breast cancer, a complex global health concern, has predominantly been studied for nuclear DNA variations. However, the role of mitochondrial DNA (mtDNA) haplogroups in breast cancer susceptibility, especially in Pakistan, remains underexplored. This case-control study investigates the association between mtDNA haplogroups and breast cancer in Pakistan. Sequencing of the mitochondrial control region in breast cancer patients and healthy controls revealed a significantly higher frequency of haplogroup M in breast cancer cases (p < 0.001). Increased frequencies of haplogroups M, H, and R in patients compared to controls suggest their potential role in breast cancer susceptibility. Triple-Negative Breast Cancer (TNBC) cases were also linked to haplogroup M, with a statistically significant association (p = 0.002), suggesting a potentially meaningful association between haplogroup M and the occurrence of TNBC in the studied population. These findings emphasize the importance of mitochondrial genetics in breast cancer risk among the Pakistani population, offering insights for biomarker discovery and targeted interventions. Recognizing mitochondrial genetics in breast cancer risk assessment holds promise for tailored medicine strategies and may impact global breast cancer research and prevention efforts.
Introduction
Breast cancer, a leading cause of female mortality for over a decade1, affects individuals across diverse societal strata, particularly between the fourth and sixth decades of life2. Characterized by uncontrolled cell proliferation in breast tissue, this multifaceted disease is influenced by a combination of genetic and environmental factors3. Established risk factors encompass age, gender, family history, early menstruation, late menopause, and exposure to environmental elements like radiation and hormonal drugs4. Key genetic contributors include mutations in high-risk genes such as BRCA1, BRCA2, and TP53 and other implicated genes like ATM, CHEK2, PALB2, and PTEN5,6. Recent insights from genome-wide association studies (GWAS) highlight over 170 genetic loci, mainly in non-coding regions, influencing breast cancer risk, emphasizing the interplay between genetic variants and environmental factors7. The Warburg hypothesis posits cancer as a metabolic disorder stemming from mitochondrial impairment. Variations in the mitochondrial control region are scrutinized as potential indicators of breast cancer susceptibility8. In 2020, Pakistan ranked 26th globally in breast cancer (BC) incidence, with a mortality rate of 25.28 cases per 100,000 people. With 178,388 new cases reported in 2020, the high prevalence of BC in Pakistan underscores the need to investigate the link between mtDNA haplogroups and BC susceptibility. Studies suggest that mitochondrial haplogroups may impact breast cancer risk, especially among individuals with BRCA1 and BRCA2 mutations, emphasizing the need for comprehensive research to inform prevention and treatment strategies in high-risk regions like Pakistan9.
The mitochondrion, a double-membraned cellular organelle, produces most cellular energy through oxidative phosphorylation (OXPHOS) and contains its own maternally inherited DNA of 16,569 base pairs. The control region of mitochondrial DNA (mtDNA) is particularly variable and corresponds to the displacement loop (D-loop), a region of approximately 1122 base pairs (bp) spanning positions 16,024–16,569 and 1–576. This region is known to be a hotspot for mtDNA alterations, as demonstrated by numerous studies10. mtDNA encodes essential polypeptides for OXPHOS and differs from nuclear DNA11. Specific single nucleotide polymorphisms (SNPs) define mtDNA haplogroups, which trace human origins and are also linked to several diseases, including cancer. For instance, hereditary breast cancer is commonly linked to mutations in nuclear genes such as BRCA1 or BRCA2, but variations in mtDNA have also been implicated, leading to heightened oxidative stress and an increased risk of breast cancer12. Numerous investigations in diverse populations have examined possible associations between mitochondrial haplogroups and the likelihood of developing breast cancer. Haplogroup H was linked to a higher incidence of breast cancer in a European sample; 164 cases and controls showed this relationship13. Haplogroup N showed increased breast cancer susceptibility in the Indian population12. Haplogroup M was found to be a substantial risk factor for breast cancer in southern China14. In Sulaimaniyah, Iraq, the whole mitochondrial genomes of 20 breast cancer samples and controls were sequenced, and the HV haplogroup was found to have a significantly higher odds ratio (OR) of 28 than the normal H haplogroup15.
Among Sinhalese women, possessing the M65a haplogroup and a mutation at position 16,311 was associated with a slightly elevated risk of sporadic breast cancer, as indicated by a p-value of 0.07718. These findings collectively highlight the potential role of mitochondrial haplogroups in influencing breast cancer risk across diverse ethnic groups.
Pakistan has a high incidence of breast cancer, yet data on breast cancer susceptibility and mtDNA are scarce, which underscores the importance of examining the relationship between mitochondrial DNA and disease susceptibility. Understanding the role of mtDNA haplogroups may provide important information for prevention, diagnosis, and treatment approaches in high-risk areas.
Methods
The sample size was calculated using the G*Power sample size calculator, taking level of significance 90% population14. A total of 184 breast cancer patients and 184 healthy participants were initially recruited in this study.
Collection of samples
All participants completed a detailed consent form. Blood samples (3–5 ml) were collected in EDTA vials. Because maternal relatives share the same haplogroup, 71 breast cancer patients and 60 healthy individuals were finally subjected to mtDNA analysis. The ethical committee of the University of Lahore (UOL) approved the sample collection.
Sample selection
Women with primary breast cancer (BC) aged between 18 and 40 years were included in the study. BC patients of all four stages who had no distant metastases, voluntarily agreed to take part in the research, and signed a written consent form were enrolled. Women aged below 18 or above 40 years, or who were pregnant, were excluded. Patients with autoimmune disease, those undergoing organ transplantation, and those with ischemic stroke or brain hemorrhage were also excluded.
Clinical profiling of patients
The laboratory diagnosis of patients and healthy individuals encompasses hematological and biochemical assessments. A Sysmex XP-300TM automated analyzer conducted a complete blood count (CBC), quantifying blood elements such as Red Blood Cells (RBCs), White Blood Cells (leukocytes), and Platelets (thrombocytes). Additionally, liver function tests (LFTs), assessing ALT, AST, ALP, and total bilirubin, and renal function tests (RFTs), measuring urea and creatinine, were performed using the Roche Cobas c-311 automated biochemistry analyzer.
DNA extraction
DNA was extracted using the BloodZol kit. Whole blood was first incubated with Red Cell Lysis Buffer and centrifuged to isolate white cell pellets. Lysis Buffer and Proteinase K were added to the pellets, and after incubation and isopropanol addition, filamentous DNA was observed. The DNA was then purified through ethanol vortexing, centrifugation, and air-drying. The dried DNA pellets were dissolved in Elution Buffer, vortexed, and incubated before storage at -40 °C for subsequent mtDNA analysis.
Quantification
To evaluate the size of DNA fragments, agarose gel electrophoresis was employed to assess DNA (mtDNA) qualitatively (Supplementary Figs. 1,2). For quantitative analysis, the Thermo-Scientific Nanodrop One C utilized a spectrophotometric technique, measuring light absorption at 260 nm to quantify the isolated mtDNA (Supplementary Figs. 3,4). This method provided a comprehensive assessment of both the quality and quantity of nucleic acids, ensuring a reliable characterization of the mitochondrial DNA samples. The extraction of genomic DNA from blood samples of breast cancer patients and healthy individuals is demonstrated in Fig. 1, illustrating the quality of the DNA obtained.
Amplification
Next, real-time PCR on the 7500 SDS Real-Time PCR System was used to measure the concentration of genomic DNA. The full control region (about 1122 base pairs long) was the target of the analysis. Table 1 lists the primer sequences used for amplification. Each 25 µl polymerase chain reaction (PCR) contained 1 ng of genomic DNA, 0.4 µM each of forward and reverse primers, and 0.4 µl of Taq polymerase (Thermo Scientific #EP0402). Cycling consisted of denaturation at 94 °C, followed by 30 s of annealing at 54 °C, 60 s of extension at 72 °C, and a final 60 s of extension at 72 °C. The mitochondrial DNA (mtDNA) control region was amplified in all samples, yielding PCR products of approximately 1122 base pairs (bp), as determined by agarose gel electrophoresis (Fig. 1).
Sequencing
The whole mtDNA control region of breast cancer cases and healthy controls, spanning nucleotide positions 16,024–16,569 and 1–576, was sequenced commercially at West China Hospital, Sichuan University, Chengdu, China.
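Because the control-region coordinates wrap around the origin of the circular mitochondrial genome, the sequenced length is the sum of two segments. The following is a minimal Python sketch of that arithmetic, using only the coordinates and genome length stated in the text; the function name `circular_span` is illustrative and not part of any tool used in the study.

```python
MTDNA_LENGTH = 16_569  # length of the human mtDNA genome, as stated in the text


def circular_span(start: int, end: int, genome_length: int = MTDNA_LENGTH) -> int:
    """Count bases from start to end (1-based, inclusive) on a circular genome."""
    if start <= end:
        return end - start + 1
    # The interval crosses the origin: count to the last base, then from base 1.
    return (genome_length - start + 1) + end


# Control region: positions 16,024-16,569 followed by 1-576.
control_region_bp = circular_span(16_024, 576)
print(control_region_bp)  # 546 + 576 = 1122
```

The result agrees with the approximately 1122 bp amplicon size reported for the PCR products.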
Data analysis
Mitochondrial DNA (mtDNA) analysis involved a multi-step approach using several bioinformatics tools. Chromas was used to export sequences in FASTA format, and MtDNA Profiler aligned them against the revised Cambridge Reference Sequence (rCRS), enabling SNP analysis and haplotype assignment. Haplogroups were determined using MitoTool (http://www.mitotool.org)16, Mitomap (https://mitomap.org)17, and Haplogrep (https://haplogrep.i-med.ac.at)18. DnaSP (http://www.ub.edu/dnasp) was used to derive forensically important statistical parameters such as nucleotide diversity and Tajima’s D. MEGA (http://www.megasoftware.net/mega.php), a comprehensive software package, was applied for evolutionary distance estimation, phylogenetic tree reconstruction19, and computation of basic statistical quantities from molecular data. Data derived from statistical analysis are presented as means and standard deviations. The Kolmogorov–Smirnov test was used to assess the normality of continuous variables. Categorical variables were compared using the chi-square test. Means of laboratory data and other quantitative parameters were compared between groups using Student’s t-test. Odds ratios with 95% confidence intervals (95% CI) were calculated by logistic regression to assess the association of individual haplogroups with breast cancer. Haplogroup M was used as the reference group for odds ratio (OR) calculation since it was the major haplogroup in our study. Graphs were plotted in R (version 4.4.1, 64-bit) using the ggplot2 package. One-way ANOVA was used to assess differences in mean laboratory values between haplogroups. The point-biserial correlation coefficient was used to assess correlations between group membership and laboratory parameters. The level of significance was set at p < 0.05.
Moreover, computational techniques were used to calculate population statistics including Random Match Probability (RMP), Power of Discrimination (PD), and Genetic Diversity (GD).
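The text does not give the formulas behind these population statistics. As a hedged illustration, the sketch below uses definitions commonly applied to haplotype data: RMP as the sum of squared haplotype frequencies, PD as one minus that sum, and GD as the sample-size-corrected diversity n/(n − 1) × (1 − Σp²). These definitions, the function name `forensic_stats`, and the input list are assumptions made for illustration; the values in the paper may have been computed differently.

```python
from collections import Counter


def forensic_stats(haplotypes):
    """Return (RMP, PD, GD) for a list of haplotype labels, one label per sample."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    rmp = sum_sq                        # random match probability
    pd = 1.0 - sum_sq                   # power of discrimination
    gd = n / (n - 1) * (1.0 - sum_sq)   # sample-size-corrected genetic diversity
    return rmp, pd, gd


# Hypothetical input: 71 samples carrying 70 distinct haplotypes (one shared pair).
samples = [f"hap{i}" for i in range(69)] + ["shared", "shared"]
rmp, pd, gd = forensic_stats(samples)
print(f"RMP={rmp:.4f}  PD={pd:.4f}  GD={gd:.4f}")
```

The hypothetical input only mirrors the reported count of 70 haplotypes among 71 patient sequences; it does not reproduce the actual haplotype assignments.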
Results
Breast cancer patients and control group samples underwent sequencing, and haplotypes were found by comparing the results to the revised Cambridge Reference Sequence (rCRS). The sequencing analysis results demonstrated the unique genetic differences present in each sample, highlighting the population’s diversified gene pool. Next, the frequencies of each haplogroup were determined individually. Remarkably, 6% of the samples each belonged to haplogroups M5a2a1a and R5a2 in Breast cancer cases. Table 2 provides specifics on the haplogroup distribution of mitochondrial DNA in the Breast cancer (BC) samples. There is a mixed mitochondrial DNA pool in BC samples. Of the samples, 45.07% belong to South Asian haplogroup M and 28.16% to West Eurasian haplogroups U and H. While the Middle East haplogroup J percentage reaches 5.63%, Southeast Asian haplogroup R accounts for 9.86% of all haplogroups. While haplogroups F, W, I, and HV only have 1.41% of samples each, haplogroups T and G have 2.82% each.
The control group, consisting of 60 sequences, underwent analysis to uncover potential genetic variations, as outlined in Table 3. Among the control group sequences, the predominant haplogroup was West Eurasian haplogroup U, representing 49% of all sequences. Following this, South Asian haplogroup M accounted for 22%, and Southeast Asian haplogroup R constituted 10% of the control cases. These findings shed light on the distribution of haplogroups within the control group, providing insights into the genetic diversity present in this population.
Genetic diversity and population dynamics
The genetic diversity and population dynamics of breast cancer patients were investigated using DnaSP, dividing the patient samples into two sets with 33 and 38 sequences, respectively. The analysis revealed a high haplotype diversity (Hd) of 0.99960, indicative of extensive genetic variability, encompassing 121 segregating sites and 70 haplotypes. Nucleotide diversity (PiT) was calculated at 0.01195, and the average number of nucleotide differences (Kt) was 10.69135. The non-significant p-value (0.4777) from the polymorphic site associations chi-square test suggested a stable genetic structure. Subsequent sequence and haplotype data analysis using Nei’s methods indicated DeltaSt of 0.00017, GammaSt of 0.04766, Gst of −0.00033, and Fst of 0.00098, reflecting low genetic differentiation and a high degree of gene flow among the patient populations. The overall genetic diversity of breast cancer cases was notably high (1.014), accompanied by a low random match probability (0.014) and a correspondingly high power of discrimination (0.999), underscoring a rich gene pool within the studied breast cancer patient population.
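To make the diversity summaries above concrete, the sketch below computes the average number of pairwise nucleotide differences (k) and the nucleotide diversity (π, the same quantity per site) from a set of aligned sequences. It assumes a gap-free alignment of equal-length sequences; the toy sequences and function names are hypothetical, and DnaSP applies additional handling (for example, of gaps and missing data) that is not reproduced here.

```python
from itertools import combinations


def pairwise_differences(a: str, b: str) -> int:
    """Count positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def nucleotide_diversity(seqs):
    """Return (k, pi): mean pairwise differences and the same value per site."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    pairs = list(combinations(seqs, 2))
    k = sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)
    pi = k / len(seqs[0])
    return k, pi


# Hypothetical toy alignment, not data from the study.
toy = ["ACGT", "ACGA", "TCGA"]
k, pi = nucleotide_diversity(toy)
print(k, pi)
```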
In the control group (CG), comprising 60 individuals, genetic analysis revealed 52 unique haplotypes, with two shared haplotypes observed among this group. The probability of two individuals having the same haplotype by chance was slightly higher, at 0.019, than in the breast cancer (BC) cases. Despite this, the Power of Discrimination remained notably high at 0.9997, indicating a robust ability to differentiate between individuals based on their haplotypes. The overall genetic diversity within the CG cases was measured at 1.058, and although Haplotype Diversity was slightly lower at 0.9032, Nucleotide Diversity was observed to be 0.0108. The chi-square test demonstrated a significant difference among haplotypes (p = 0.0469), suggesting diversity within the control group. Furthermore, the Fst value indicated a higher level of genetic differentiation among the CG cases, reaching 0.17330. These findings underscore the genetic complexity and diversity within the control group, providing valuable insights into the population’s genetic structure and differentiation. A summary of the BC and CG cases is shown in Table 4.
MEGA analysis: neighbor joining tree
The evolutionary history of the taxa under study was inferred using the Neighbor-Joining method, and the resultant evolutionary history was represented by the bootstrap consensus tree derived from 1000 replicates. Branches corresponding to partitions reproduced in fewer than 50% of the bootstrap repetitions were collapsed to ensure accuracy. The evolutionary distances reported in base substitutions per site were calculated using the Maximum Composite Likelihood technique. This analysis encompassed 71 nucleotide sequences in total, as shown in Supplementary Fig. 5.
Using the pairwise deletion option, ambiguous positions were removed from each sequence pair. The final dataset included 1031 positions in total. MEGA11 was used for the evolutionary analysis, yielding a dendrogram that illustrates the genetic relationships and variation among Lahore-based female breast cancer patients. The dendrogram of the 60 nucleotide sequences in the control group, showing the samples’ evolutionary history, is presented in Supplementary Fig. 6.
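The pairwise deletion rule mentioned above can be illustrated with a short sketch: for each pair of sequences, only sites where both carry an unambiguous base are compared. The study used the Maximum Composite Likelihood distance in MEGA; the simpler p-distance (proportion of differing sites) is swapped in here purely to show the deletion rule, and the function name and example sequences are hypothetical.

```python
VALID_BASES = set("ACGT")


def p_distance_pairwise_deletion(a: str, b: str) -> float:
    """Proportion of differing sites, skipping sites ambiguous in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        # Gaps, N, and other ambiguity codes are dropped for this pair only.
        if x in VALID_BASES and y in VALID_BASES:
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


# Hypothetical pair: sites 5 (N) and 6 (gap) are excluded for this comparison.
d = p_distance_pairwise_deletion("ACGTN-A", "ACTTAAA")
print(d)
```

With pairwise deletion, a site that is ambiguous in one sequence is still used when comparing two other sequences that are both resolved at that site, unlike complete deletion, which removes the site for all pairs.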
Median joining networks
To examine potential links among haplotypes in BC samples, median-joining (MJ) networks were constructed using all control region haplotypes of breast cancer samples (Fig. 2) and control group samples (Fig. 3). The haplotypes of the BC samples showed strong clustering, with the most frequently occurring rCRS haplotype located in the network’s core while the control group network showed a shared haplotype cluster that didn’t spread over a large area. Significant divergence was observed between haplotypes in the population, exhibiting multiple branches containing significant haplotypes. There were a lot of independent branches in the MJ network, which led to many sub-branches that were divided by many mutations.
Statistical analysis
Among the 131 study participants, 71 (54.2%) were breast cancer patients and 60 (45.8%) were healthy participants. Of the breast cancer patients, 33 (46.5%) had haplogroup M. The average hemoglobin of the breast cancer patients was 53.13 ± 6.58 (unit), and the average age was 53.13 ± 6.58 years. The mean WBC count for breast cancer patients was 7.23 ± 2.43 units, while the mean PLT count was 312.34 ± 135.29 units. Among healthy participants, 13 (21.6%) had haplogroup M. The average hemoglobin of the healthy participants was 53.49 ± 6.92 (unit), and the average age was 53.49 ± 6.92 years. The mean WBC count for healthy participants was 7.30 ± 0.82 units, while the mean PLT count was 293.60 ± 86.33 units. The biochemical and clinical data of breast cancer patients and healthy participants are shown in Table 5. In breast cancer patients, the mean values of urea, creatinine, and T. bilirubin were 25.45, 0.76, and 0.53 units, respectively, while in healthy participants they were 16.33, 0.69, and 0.76 units. The mean values of ALT, AST, and ALP in breast cancer patients were 39.61, 39.93, and 140.94, respectively. The two groups differed significantly in the biochemical parameters, while age, hemoglobin, WBCs, and PLTs showed no significant difference. Of the 10 Triple-Negative Breast Cancer (TNBC) cases analyzed, 7 carried haplogroup M. The statistical analysis yielded a significant association with a p-value of 0.002, indicating that the prevalence of haplogroup M in these TNBC cases is unlikely to be due to chance. This suggests a potentially meaningful association between haplogroup M and the occurrence of TNBC in the studied population.
The frequencies of the haplogroups found in study participants are presented in Table 6 and Supplementary Fig. 7. Among the study participants, 11 haplogroups (F, G, H, J, M, N, R, S, T, U, and W) were recognized. Of the 131 participants, haplogroup M had the highest frequency, 46 (35.1%), followed by haplogroup U, 41 (31.3%); H, 14 (10.7%); R, 12 (9.2%); J, 7 (5.3%); T, 4 (3.1%); W, 3 (2.3%); and N, 2 (1.5%). Haplogroup F was observed only in healthy participants, while haplogroup S was detected in breast cancer patients at the lowest frequency, 1 (0.8%). Logistic regression analysis demonstrated that, after haplogroup M, haplogroup U is also significantly associated with the risk of breast cancer in the current study (OR: 7.87, CI: 3.02–20.54).
Haplogroup M had the highest frequency in our study population, and the difference was significant (p < 0.001). As delineated in Table 7, HER2-, ER-, and PR-positive cases were not significantly associated with haplogroup M, while TNBC had a significant association (p-value = 0.016). TPBC rates among haplogroup M carriers were not significantly different from those among non-M carriers, even though haplogroup M carriers with TPBC showed a higher estimated risk (OR = 1.24, CI: 0.36–4.28).
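As a worked illustration of how a haplogroup can be compared between cases and controls, the sketch below builds a 2×2 table from the haplogroup M counts given in the text (33 of 71 patients and 13 of 60 healthy participants) and computes a crude odds ratio with a Wald 95% confidence interval and an uncorrected Pearson chi-square test. This is an unadjusted calculation under stated assumptions, not the logistic regression used in the study, so it is not expected to reproduce the published odds ratios or p-values. The function names are illustrative.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald CI for a 2x2 table.

    a, b: cases with / without the haplogroup
    c, d: controls with / without the haplogroup
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value, 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Haplogroup M: 33 of 71 patients, 13 of 60 healthy participants (counts from the text).
or_m, ci_low, ci_high = odds_ratio_wald(33, 71 - 33, 13, 60 - 13)
chi2_stat, p_value = chi_square_2x2(33, 71 - 33, 13, 60 - 13)
print(f"OR={or_m:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), chi2={chi2_stat:.2f}, p={p_value:.4f}")
```

The Wald interval is a common large-sample approximation; with small cell counts, an exact method would usually be preferred.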
Healthy participants and breast cancer patients were stratified into groups H, M, R, U, and others to assess associations with biochemical characteristics; no significant variation was observed in hemoglobin, WBCs, PLTs, creatinine, and ALP (p = 0.694, 0.210, 0.120, 0.899, and 0.203, respectively). Patients in haplogroups H, M, R, and U exhibited a significant mean difference in age (p < 0.001), urea (p = 0.002), T. bilirubin (p = 0.013), ALT (p = 0.010), and AST (p = 0.038) compared with participants in the other haplogroups (Table 8).
Point-biserial correlation coefficients with p values between laboratory parameters of breast cancer patients and healthy participants are presented in Table 9. Haplogroups showed no significant correlations with the laboratory parameters except for WBC (correlation coefficient: 0.366; p < 0.001), while PLTs, urea, creatinine, ALT, AST, and ALP showed negative correlations with non-significant p-values.
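For readers unfamiliar with the point-biserial coefficient, the sketch below shows the standard formula, which is equivalent to the Pearson correlation between a binary group indicator and a continuous measurement. The group coding (0/1), the function name, and the toy values are assumptions for illustration and are not data from the study.

```python
import math


def point_biserial(groups, values):
    """Point-biserial correlation between a 0/1 group indicator and a continuous variable."""
    if len(groups) != len(values):
        raise ValueError("groups and values must have the same length")
    n = len(values)
    ones = [v for g, v in zip(groups, values) if g == 1]
    zeros = [v for g, v in zip(groups, values) if g == 0]
    if not ones or not zeros:
        raise ValueError("both groups must contain at least one observation")
    mean_all = sum(values) / n
    # Population standard deviation (divide by n), as the formula requires.
    sd = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("values have zero variance")
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * math.sqrt(len(ones) * len(zeros) / n ** 2)


# Hypothetical toy data: group 1 has higher values than group 0.
r_pb = point_biserial([0, 0, 1, 1], [1.0, 2.0, 3.0, 4.0])
print(round(r_pb, 4))
```

A positive coefficient means the group coded 1 tends to have higher values; the sign therefore depends on which group is coded 1.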
The median age of Breast cancer patients was 55 years (Inter Quartile Range (IQR) = 11) which did not differ significantly from that of healthy participants, which was 54 years (IQR = 9) (Fig. 4). The study population’s lower quartile and upper quartile of age were (48, 59) and (49, 58) respectively.
Among breast cancer patients, the median age was 53 years (IQR = 11.5) in haplogroup H, 56 years (IQR = 10) in haplogroup M, 51 years (IQR = 13.5) in haplogroup R, 55.5 years (IQR = 4.75) in haplogroup U, and 52 years (IQR = 13.5) in the other haplogroups. The age distributions of healthy participants and patients in the different groups are presented in Fig. 5.
A heatmap was drawn to compare the average ages of participants from five haplogroups (U, R, N, M, and H) among breast cancer patients and healthy participants. The color gradient, from lighter to darker, indicates average ages from 50 to 55 years. Notably, the case group has a higher mean age in haplogroups U and R than the control group, which has a more uniform mean age across all haplogroups. A correlation matrix of all quantitative variables of BC patients was also prepared (Supplementary Figs. 5 and 6); it shows that most variables are negatively or moderately positively correlated with each other.
For this case-control study, the AUC was 0.442 (95% CI: 0.294–0.590) with a sensitivity of 0.406 and a specificity of 0.536 for breast cancer patients, while the AUC was 0.417 (95% CI: 0.271–0.563) for healthy participants with a sensitivity of 0.594 and a specificity of 0.714. The AUC differed significantly between the haplogroups (p < 0.001) (Fig. 6).
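The text does not specify which score was used to build the ROC curves. As a general illustration only, the sketch below computes an AUC through its rank interpretation: the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half. The scores and the function name are hypothetical.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical scores, not data from the study.
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.4])
print(round(auc, 3))
```

Under this interpretation, an AUC of 0.5 corresponds to chance-level ranking, and values below 0.5 indicate that the score tends to rank the two groups in the opposite direction.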
Discussion
The study of the influence of mitochondrial DNA (mtDNA) haplogroups on breast cancer has provided valuable insights into the complex interplay between genetic variations and susceptibility to this prevalent disease. The current study aimed to elucidate the potential associations between mtDNA haplogroup M and the risk of developing breast cancer, shedding light on the mitochondrial genome’s contribution to the pathogenesis of this multifactorial condition. One of the key findings of this study is the observed distribution of mtDNA haplogroups in breast cancer patients compared to healthy controls. The prevalence of certain haplogroups among breast cancer cases suggests a potential correlation between mitochondrial genetic variations and disease susceptibility. This aligns with previous research indicating that mtDNA haplogroups may influence the bioenergetics capacity of cells and impact various cellular processes, including those implicated in carcinogenesis. Beyond energy production, mitochondria are involved in various cellular processes, including cell growth and differentiation, apoptosis, and maintaining redox balance, which is linked to the generation of reactive oxygen species (ROS) as a byproduct of OXPHOS. Mitochondrial haplogroups, defined by ancestral single nucleotide polymorphisms (SNPs) within the mitochondrial genome, may lead to variations in energy metabolism and ROS production, potentially influencing disease susceptibility20,21. By analysis of the mtDNA haplogroups of 71 patients battling breast cancer and 60 healthy controls, we discovered that haplogroup M is more prevalent in cases of breast cancer, with statistical significance (p < 0.001), while haplogroup U is more prevalent in the control group. 
The frequencies of haplogroups M, H, and R were higher in patients than in controls (odds ratio = 2.03, 95% CI: 0.66–6.29; odds ratio = 2.93, 95% CI: 0.61–14.23; and odds ratio = 1.12, 95% CI: 0.26–4.92, respectively). In Southern China, a comprehensive case-control study and cohort analysis demonstrated a significant association between the D5 haplogroup and an increased risk of breast cancer (odds ratio = 2.789; 95% CI [1.318, 5.901]; p = 0.007)22. These findings inform our discussion of mitochondrial haplogroups and breast cancer susceptibility. They also suggest that mitochondrial genetics contributes to breast cancer susceptibility in the Pakistani population, which may help identify new biomarkers and treatment targets.
This current study has an admixed mitochondrial DNA pool. South Asian haplogroup M contains 45.07% (M13ʹ46ʹ61ʹ16362, M18a, M2a1a, M3, M30, M30 + 16234, M30c1, M34, M3a1 + 204, M4, M4a, M4b, M5, M5a1a, M5a2a1a, M5b2, M5b2b*, M5c1, M6) of the samples, West Eurasian haplogroup U and H contains 28.16% (H14a, H1b, H1ba, H1e1a1, H1f1a, H2a2a, H6, H7a1, U2a1, U2a1, U2b2, U2c1a, U2’cd, U4b1a1a1, U7, U7a) of samples. While Southeast Asian haplogroup R is 9.86% (R2, R30a1b, R5a2, R8) of all and also exhibits Middle East haplogroup J percentage as high as 5.63% (J1b, J1b1b, J1b8, J1c2). Haplogroup T and G (T1a1’3, G2b2*, G3b1) contain 5.64% while haplogroup F W I and HV (F1c1a, I1b*2, W6, HV2) have just 4% of samples. In Mexico City, a study of haplogroup analysis of 564 germline variants in normal tissues, focusing on tumor subtypes, revealed a predominant occurrence of haplogroup A (44.6%) followed by B (22.8%)23. In this study, it was observed that haplogroup M covers approximately 45.07% of the samples, indicating a significant proportion of individuals at high risk for developing breast cancer. This finding aligns with a previous study conducted in southern China, where the presence of haplogroup M was identified as a noteworthy risk factor for breast cancer development14. The consistent correlation between the prevalence of haplogroup M and an increased risk of breast cancer in both populations suggests a potential genetic susceptibility to the disease in regions where this haplogroup is prevalent. Moreover, M65a haplogroup was associated with a minor increase in the risk of sporadic breast cancer among Sinhalese women8. A study conducted in Sulaimaniyah, Iraq, analyzed the entire mitochondrial genome of 20 breast cancer samples and controls and found that the HV haplogroup was associated with a higher risk of developing breast cancer15. 
In the current study, haplogroup HV2 accounted for about 1% of the BC samples, which suggests that this haplogroup contributes little to breast cancer risk in the Pakistani population.
The analysis of mitochondrial DNA haplogroups has unveiled intriguing associations with cancer risks in diverse populations. Notably, haplogroup H has emerged as a potential contributor to an elevated risk of breast cancer13. Concurrently, the presence of haplogroup D4 significantly increases cases of Acute Myeloid Leukemia (AML)24, suggesting its potential role as a risk factor for this hematologic malignancy. Furthermore, in the context of colorectal cancer (CRC), individuals with mitochondrial haplogroup M7 face significantly heightened mortality risks, particularly observed in Northwestern China25. These findings underscore the intricate relationship between specific mitochondrial haplogroups and distinct cancer risks, shedding light on the importance of mitochondrial genetics in cancer susceptibility and outcomes in various populations. Macro-haplogroup M, tracing its ancestry back to L3, shows high frequencies in Asia. The study emphasizes the significance of haplogroup M in understanding ancient human migrations and genetic diversity, particularly in Asian populations26,27. In this study, the frequency of haplogroup M is significantly higher in breast cancer patients (45.07%) compared to healthy individuals. In the present study, 70% of Triple-Negative Breast Cancer (TNBC) patients were also associated with haplogroup M, signifying a significant correlation with a p-value of 0.016. This finding aligns with a prior study identifying substantial disparities among ethnic groups in TNBC patients. Furthermore, genetic analysis revealed prevalent mitochondrial DNA patterns linked to Nigerian, Cameroon, or Sierra Leone ancestry and haplogroups A, U, H, and B28. Reduced mitochondrial DNA copy number and diminished cellular respiration are more prevalent in triple-negative breast tumors, suggesting a potential link to the observed aggressiveness in TNBCs. 
These findings highlight the significance of mitochondrial defects in understanding the underlying mechanisms of TNBC aggressiveness29,30,31,32,33. Conversely, haplogroups H and N have been linked to elevated breast cancer risk in some populations, emphasizing the need for deeper exploration into the functional consequences of mtDNA variants associated with these haplogroups12,13. The associations between mtDNA variants and specific haplogroups, particularly haplogroup M, warrant a deeper investigation into their functional implications in breast cancer development. It is crucial to understand the molecular pathways by which these mtDNA haplogroups may contribute to cancer, especially when compared to other diagnostic and therapeutic markers, such as circRNA34 and nuclear DNA. However, due to financial constraints and reliance on Sanger sequencing, the study’s limited sample size may not adequately represent the diversity of mtDNA control region variants in Pakistan. Future research with larger sample sizes and whole mtDNA genome analysis could provide significant insights into the role of mtDNA in breast cancer. Identifying haplogroup M as a breast cancer risk factor within the Pakistani population enables more accurate risk profiling.
Incorporating genetic screening for haplogroup M into routine assessments, particularly for women with a family history of breast cancer, could enhance risk evaluation accuracy and facilitate earlier detection, improving outcomes by identifying the disease at a more manageable stage14. Understanding an individual’s genetic predisposition to breast cancer can also guide personalized prevention strategies. Women with haplogroup M might benefit more from specific lifestyle modifications, chemoprevention, or even prophylactic surgeries, such as mastectomy, compared to those without this genetic marker35,36.
Conclusion
In conclusion, our research highlights the significant role of mitochondrial genetics in breast cancer predisposition, particularly the strong association between haplogroup M and an increased risk of this disease. By analyzing breast cancer patients and healthy controls, we found that haplogroup M, along with haplogroups H and R, is more prevalent in breast cancer cases, suggesting its potential role in susceptibility. Additionally, haplogroup M showed a significant link to Triple-Negative Breast Cancer (TNBC), reinforcing its relevance in this population. These findings underscore the importance of mitochondrial genetics in breast cancer risk assessment, offering valuable insights for biomarker discovery and the development of targeted, personalized interventions. These insights could lead to the development of targeted preventive and therapeutic strategies, potentially transforming cancer care and prognosis. However, further experimental and clinical validation is necessary to fully harness these findings and integrate them into effective treatment regimens.
Data availability
The datasets generated and/or analyzed during the current study are available in the manuscript and its supplementary material file. The sequence data are available from NCBI under accession PP809513: https://www.ncbi.nlm.nih.gov/nuccore/PP809513.1/.
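For readers who prefer programmatic access, the deposited record can also be requested through the NCBI E-utilities `efetch` endpoint. The sketch below is one possible way to do this and is not part of the study's workflow: the helper names `efetch_url` and `fetch_record` are hypothetical, and the choice of FASTA output is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for retrieving records.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for a nucleotide (nuccore) accession."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


def fetch_record(accession, rettype="fasta"):
    """Download the record as text (requires network access)."""
    with urlopen(efetch_url(accession, rettype)) as response:
        return response.read().decode("utf-8")


url = efetch_url("PP809513.1")
print(url)
# To download the sequence itself, call: fetch_record("PP809513.1")
```

Only the URL is built when the block runs; the download is left as an explicit call so the example does not depend on network availability.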
References
Khurram, I. et al. Efficacy of cell-free DNA as a diagnostic biomarker in breast cancer patients. Sci. Rep. 13, 15347. https://doi.org/10.1038/s41598-023-42726-6 (2023).
Vega Avalos, J. H. et al. Mitochondrial control region variants related to breast cancer. Genes (Basel). 13, 1962. https://doi.org/10.3390/genes13111962 (2022).
Blein, S. et al. An original phylogenetic approach identified mitochondrial haplogroup T1a1 as inversely associated with breast cancer risk in BRCA2 mutation carriers. Breast Cancer Res. 17. https://doi.org/10.1186/s13058-015-0567-2 (2015).
Shafi, H. et al. Mutational Analysis of Exons 5–9 of TP53 Gene in Breast Cancer Patients of Punjabi Ethnicity (2022).
Shahbandi, A., Nguyen, H. D. & Jackson, J. G. TP53 mutations and outcomes in breast cancer: reading beyond the headlines. Trends Cancer. 6, 98–110. https://doi.org/10.1016/j.trecan.2020.01.007 (2020).
Wengner, A. M., Scholz, A. & Haendler, B. Targeting DNA damage response in prostate and breast cancer. Int. J. Mol. Sci. 21, 8273. https://doi.org/10.3390/ijms21218273 (2020).
Wang, Y., Li, Y., Liu, B. & Song, A. Identifying breast cancer subtypes associated modules and biomarkers by integrated bioinformatics analysis. Biosci. Rep. 41, BSR20203200. https://doi.org/10.1042/bsr20203200 (2021).
Tiphania Kotelawala, J., Ranasinghe, R., Rodrigo, C., Tennekoon, K. H. & Silva, K. Evaluation of non-coding region sequence variants and mitochondrial haplogroups as potential biomarkers of sporadic breast cancer in individuals of Sri Lankan sinhalese ethnicity. Biomed. Rep. 12, 339–347. https://doi.org/10.3892/br.2020.1292 (2020).
Tufail, M. & Wu, C. Exploring the burden of cancer in Pakistan: an analysis of 2019 data. J. Epidemiol. Global Health. 13, 333–343. https://doi.org/10.1007/s44197-023-00104-5 (2023).
Khan, M. U. Forensic and genetic characterization of mtDNA lineages of Shin, a unique ethnic group in Pakistan. Pakistan J. Zool. 53. https://doi.org/10.17582/journal.pjz/20191024091047 (2020).
Malik, A. et al. Role of oxidative stress and immune response alterations in asthmatic pregnant females. Bull. Biol. Allied Sci. Res. 2024 (1), 85. https://doi.org/10.54112/bbasr.v2024i1.85 (2024).
Li, Y. et al. Association between mitochondrial genetic variation and breast cancer risk: the multiethnic cohort. PLoS ONE. 14, e0222284. https://doi.org/10.1371/journal.pone.0222284 (2019).
Bonilla, C. et al. Breast cancer risk and genetic ancestry: a case-control study in Uruguay. BMC Womens Health. 15. https://doi.org/10.1186/s12905-015-0171-8 (2015).
Fang, H. et al. Cancer type-specific modulation of mitochondrial haplogroups in breast, colorectal and thyroid cancer. BMC Cancer. 10, 421. https://doi.org/10.1186/1471-2407-10-421 (2010).
Fadhl, H. N. M. & Abdulkarim, F. M. Potential association of mitochondrial haplogroups and A8860G mutation with breast cancer risk. MedRxiv. https://doi.org/10.1101/2021.02.12.21249541 (2021).
Fan, L. & Yao, Y. G. MitoTool: a web server for the analysis and retrieval of human mitochondrial DNA sequence variations. Mitochondrion 11, 351–356. https://doi.org/10.1016/j.mito.2010.09.013 (2011).
Lott, M. T. et al. mtDNA variation and analysis using Mitomap and Mitomaster. Curr. Protoc. Bioinform. 44, 1.23.1–1.23.26 (2013).
Schönherr, S., Weissensteiner, H., Kronenberg, F. & Forer, L. Haplogrep 3—an interactive haplogroup classification and analysis platform. Nucleic Acids Res. 51, W263–W268. https://doi.org/10.1093/nar/gkad284 (2023).
van Oven, M. PhyloTree Build 17: growing the human mitochondrial DNA tree. Forensic Sci. Int. Genet. Supplement Ser. 5, e392–e394. https://doi.org/10.1016/j.fsigss.2015.09.155 (2015).
Malik, A. et al. Correlation of oxidative stress markers in multiple biofluids of end-stage renal disease patients. Bull. Biol. Allied Sci. Res. 2024, 86. https://doi.org/10.54112/bbasr.v2024i1.86 (2024).
Kaneva, K. et al. Mitochondrial DNA haplogroup, genetic ancestry, and susceptibility to Ewing sarcoma. Mitochondrion 67, 6–14. https://doi.org/10.1016/j.mito.2022.09.002 (2022).
Ma, L. et al. Breast cancer-associated mitochondrial DNA haplogroup promotes neoplastic growth via ROS-mediated AKT activation. Int. J. Cancer. 142, 1786–1796. https://doi.org/10.1002/ijc.31207 (2018).
Pérez-Amado, C. J. et al. Mitochondrial DNA mutation analysis in breast cancer: shifting from germline heteroplasmy toward homoplasmy in tumors. Front. Oncol. 10, 572954. https://doi.org/10.3389/fonc.2020.572954 (2020).
Kim, H. R. et al. Spectrum of mitochondrial genome instability and implication of mitochondrial haplogroups in Korean patients with acute myeloid leukemia. Blood Res. 53, 240–249. https://doi.org/10.5045/br.2018.53.3.240 (2018).
Yan, Z. et al. Mitochondrial DNA haplogroup M7: a predictor of poor prognosis for colorectal cancer patients in Chinese population. Cancer Sci. 114, 1056–1066. https://doi.org/10.1111/cas.15654 (2023).
Rajkumar, R. et al. Phylogeny and antiquity of M macrohaplogroup inferred from complete mt DNA sequence of Indian specific lineages. BMC Evol. Biol. 5. https://doi.org/10.1186/1471-2148-5-26 (2005).
Thangaraj, K. et al. In situ origin of deep rooting lineages of mitochondrial macrohaplogroup ‘M’ in India. BMC Genom. 7, 151 (2006).
Rao, R. et al. Genetic ancestry using mitochondrial DNA in patients with triple-negative breast cancer (GAMiT study). Cancer 123, 107–113. https://doi.org/10.1002/cncr.30267 (2017).
Guha, M. et al. Aggressive triple negative breast cancers have unique molecular signature on the basis of mitochondrial genetic and functional defects. Biochim. Biophys. Acta Mol. Basis Dis. 1864, 1060–1071. https://doi.org/10.1016/j.bbadis.2018.01.002 (2018).
Hassan, N. et al. Antiviral response of drugs used against HBV patients of Khyber Pakhtunkhwa, Pakistan. Bull. Biol. Allied Sci. Res. 2023, 49. https://doi.org/10.54112/bbasr.v2023i1.49 (2023).
Gohar, M. et al. Prevalence of hepatitis B virus and genotypes in the region of Khyber Pakhtunkhwa Pakistan. Bull. Biol. Allied Sci. Res. 2023, 53. https://doi.org/10.54112/bbasr.v2023i1.53 (2023).
Sheema et al. Molecular identification of HCV genotypes among injecting drug users having HCV and HIV co-infection. Bull. Biol. Allied Sci. Res. 2024 (1), 71. https://doi.org/10.54112/bbasr.v2024i1.71 (2024).
Rehman, A. et al. Molecular analysis of aminoglycosides and β-lactams resistant genes among urinary tract infections. Bull. Biol. Allied Sci. Res. 2023 (1). https://doi.org/10.54112/bbasr.v2023i1.56 (2023).
Saleem, A. et al. Biological role and regulation of circular RNA as an emerging biomarker and potential therapeutic target for cancer. Mol. Biol. Rep. 10 (1), 296. https://doi.org/10.1007/s11033-024-09211-3 (2024).
Akram, A. et al. Silibinins and curcumin as promising ligands against mutant cystic fibrosis transmembrane regulator protein. AMB Express. 14, 1. https://doi.org/10.1186/s13568-024-01742-z (2024).
Ahmed, M. A. et al. Investigating the inheritance patterns and potential associations of selected human morphogenetic traits. Bull. Biol. Allied Sci. Res. 2024 (1), 77. https://doi.org/10.54112/bbasr.v2024i1.77 (2024).
Acknowledgements
This work was supported by the Institute of Molecular Biology and Biotechnology, The University of Lahore, Lahore, Pakistan.
Author information
Contributions
NK wrote the manuscript draft, designed the protocol, applied the method, and collected the data; MUK revised the manuscript and supervised the project; RR, AI, and MUG provided guidance in the cell culture lab and revised the manuscript; SK and TZ assisted in revising and editing the manuscript; MAJ supervised the methodology and manuscript; GS reviewed the data and analyzed them statistically; QA reviewed and edited the manuscript; MAJ and QA reviewed the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval and consent to participate
Informed consent was obtained from all participants included in this study. The research project was approved by the Institutional Review Board/Ethics Committee of the Institute of Molecular Biology and Biotechnology, The University of Lahore, 54000, Lahore, Pakistan, and complies with the relevant institutional, national, and international guidelines and legislation.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Khalid, N., Khan, M.U., Rehman, R. et al. Unraveling the genetic connections for mitochondrial DNA control region and breast cancer susceptibility. Sci Rep 15, 4821 (2025). https://doi.org/10.1038/s41598-025-89115-9
DOI: https://doi.org/10.1038/s41598-025-89115-9