Introduction

The anterior cruciate ligament (ACL) connects the femur and the tibia and is composed of dense tissue made of collagen and elastin proteins. It is one of the structures that maintain the stability of the knee joint1,2. ACL injury is a common knee joint injury, accounting for 78% of all sports-related knee joint lesions3,4, and its incidence continues to rise5. ACL injury results in the loss of the extracellular matrix of chondrocytes, thereby hastening the progression of osteoarthritis6,7,8. ACL injury can be caused by various risk factors, and because the ligament has poor intrinsic healing ability, injury often requires surgical treatment. Research has shown that individuals with a family history of ACL rupture have twice the risk of experiencing an ACL rupture compared to those without a family history9. Genetic factors play a role in ACL injuries, but it remains unclear how genes are involved in susceptibility to ACL injury.

Previous studies have primarily focused on germline mutations, obtaining genetic information from blood cells or oral mucosal epithelial cells, but have not yielded consistent positive results, with variations observed among different racial and ethnic groups10,11,12,13,14,15,16. Most prior research has been limited to germline mutations, and few studies have targeted somatic mutations. Additionally, some researchers have focused only on specific pedigrees, which may not be representative9,17,18.

There are about 10^14 cells in the human body, and approximately 10^16 cells are produced throughout one’s lifetime. The large number of cell divisions required to build and maintain the human body inevitably leads to the accumulation of somatic mutations19; it is therefore also necessary to study somatic mutations in tissues. In contrast, studies of germline mutations sample only certain cells and pay less attention to somatic cells within tissues. Currently, research on diseases related to somatic mutations mainly focuses on tumors such as lung cancer and colorectal cancer20,21,22, with little research on the impact of somatic mutations on ACL injury.

Whole exome sequencing (WES) allows the sequencing of all protein-coding regions of the human genome (the exome) and is particularly useful for monogenic diseases23. Approximately 85% of disease-related mutations are located in the exome, and about 60% of disease mutations are missense or nonsense mutations24,25,26. Therefore, exome sequencing of the ligament may identify mutations related to ACL injury. We extracted genetic information from blood cells and affected tissues, used WES to identify genomic alterations in ACL cells, and analyzed the differences between ligament tissue and matched blood from patients with ACL injury to explore the possible influence of these alterations.

Methods

Study subjects and patients

Patients with ACL injury who met the inclusion criteria were selected for this study. From each patient, 10 ml of whole blood was obtained, together with approximately 0.3 cm^3 of tissue from the injured ligament end and from the healthy side of the same ligament (healthy tissue was taken 1–1.5 cm from the injured ligament end). The normal ACL is shown in Fig. 1a, and the sampling site is shown in Fig. 1b. Inclusion criteria were: 20–40 years old; non-contact injury (occurring without physical contact with an external force, e.g. trauma); no combined injury to other knee structures (e.g. other ligaments or menisci); no history of rheumatoid arthritis, pyogenic arthritis, knee tuberculosis, or other knee diseases; no history of intra-articular drug injection; no prior history of ligament rupture; previously healthy with no other diseases. Seven ACL samples were collected (Supplemental Table S1), all of which had a mid-substance ligament rupture. All samples were collected and isolated by experienced surgeons and were not contaminated during the collection process. All samples were stored in a biorepository at -80 °C until submitted for WES. The study was approved by the Ethics Committee of Qingdao Municipal Hospital [approval number 2024-KY-012]. All participants signed informed consent forms.

Fig. 1 Schematic workflow of the experiment. (a) Normal ACL and PCL. (b) ACL injury site and sample location. (c) The workflow for collection and processing of ACL samples for WES. ACL: anterior cruciate ligament; PCL: posterior cruciate ligament; I: Injured ligament ends; H: Healthy ligament ends.

Exome sequencing

All samples were sent to the sequencing facility of Nanjing Geneseeq Biotechnology Inc. (Nanjing, China) for NGS analyses. Genomic DNA for whole exome sequencing and downstream analyses was isolated with the commercially available DNeasy Blood and Tissue Kit (Qiagen) following established protocols. The concentration and purity of the DNA samples were measured with a Qubit 3.0 fluorometer. The QIAamp DNA Mini Kit was used for library construction. First, 1–2 μg of genomic DNA from each sample was fragmented to 350 bp using the Covaris M220 sonication system, followed by end repair, A-tailing, and adaptor ligation using the KAPA Hyper Prep Kit (KAPA Biosystems, KK8504).

Libraries were amplified by PCR and purified using Agencourt AMPure XP beads. Sample quality was assessed from the A260/280 and A260/230 ratios measured on a NanoDrop 2000 (Thermo Fisher Scientific), the size distribution was checked, and Qubit 3.0 dsDNA HS Assays (Life Technology) were used for sample and library quantification. All procedures followed the manufacturers’ recommended protocols.

The libraries were then subjected to WES target region capture, after which only fragments containing the target gene sequences were retained. Another round of PCR amplification was performed to obtain the final DNA library, which was sequenced on the DNBSEQ-T7 sequencing platform. The generated sequencing data were then processed and analyzed bioinformatically. Reads were processed with fastp v0.20.0 (https://github.com/OpenGene/fastp) for sample demultiplexing, adapter removal, and low-quality trimming, and Sentieon v0.7.17 (https://www.sentieon.com/) was used to align the paired-end sequencing reads to the human reference genome hs37d. Bamdst v1.0.8 (https://github.com/shiquan/bamdst) and VarDict v1.5.4 (https://github.com/AstraZeneca-NGS/VarDict) were used for quality control and sequence comparison. Mutation sites were then annotated with vcf2maf v1.6.16 (https://github.com/mskcc/vcf2maf). Finally, the sequencing depth, coverage, single nucleotide variants (SNVs), insertions/deletions (InDels), copy number variations (CNVs), etc., relative to the reference genome were calculated. We then assessed the potential pathogenicity of variants based on allele frequency (AF), Sorting Intolerant From Tolerant (SIFT), and Combined Annotation Dependent Depletion (CADD) scores.

Mutation site analysis

Somatic mutations in the ACL were obtained by comparing the mutation sites in the ACL tissue with the nucleotide sequence in the matched blood sample. Then, by comparing the injured end with the healthy end of the ACL tissue, the mutation sites that exist only at the injured end were obtained. Finally, the two sets of mutation data were compared, and all mutation data are presented in Tables 1, 2 and 3.
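
The comparisons described in this section amount to set differences over variant calls. The following is a minimal sketch, assuming each variant is keyed by chromosome, position, reference allele, and alternate allele; the names `Variant` and `tissue_specific` and all example variants are hypothetical and are not taken from the study's pipeline or data.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """A variant call keyed by genomic coordinate and alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tissue_specific(tissue: Set[Variant], control: Set[Variant]) -> Set[Variant]:
    """Return variants present in the tissue but absent from the control."""
    return tissue - control


# Hypothetical calls for one patient (illustrative values only).
blood = {Variant("1", 1000, "A", "G")}
healthy_end = {Variant("1", 1000, "A", "G"), Variant("9", 2000, "C", "T")}
injured_end = {
    Variant("1", 1000, "A", "G"),   # also in blood, so treated as germline
    Variant("9", 2000, "C", "T"),   # somatic, shared by both ligament ends
    Variant("16", 3000, "G", "A"),  # somatic, injured end only
}

# Injured end versus matched blood: somatic variants in the injured ligament.
vs_blood = tissue_specific(injured_end, blood)
# Injured end versus healthy end: variants found only at the injured end.
vs_healthy = tissue_specific(injured_end, healthy_end)
# Variants reported by both comparisons.
shared = vs_blood & vs_healthy

print(sorted(vs_blood))
print(sorted(vs_healthy))
print(sorted(shared))
```

A real comparison would also need to account for sequencing depth and allele frequency at each site, since absence of a call in the control can reflect low coverage rather than a true difference.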

Table 1 The somatic mutations compared to the healthy ligament ends.
Table 2 The somatic mutations compared to the blood.
Table 3 All gene deletions detected in the ACL.

Enrichment analysis

Mutation data were imported into Metascape (Metascape v3.5.20240101, http://metascape.org) to explore the functions and pathways of the mutated genes. The species was restricted to “Homo sapiens,” with a p-value cutoff of 0.01 and a minimum overlap of 3 for enrichment analysis. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases were used for enrichment analysis.
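
To illustrate how a p-value cutoff and a minimum overlap act together in an over-representation analysis, the sketch below computes a one-sided hypergeometric p-value. It is a sketch under stated assumptions: Metascape's internal implementation is not described here, multiple-testing correction is omitted, and the function names and all gene counts are hypothetical. Only the two thresholds (0.01 and 3) come from the text.

```python
from math import comb


def hypergeom_pvalue(population: int, term_genes: int, list_genes: int, overlap: int) -> float:
    """P(X >= overlap) when list_genes are drawn from population without replacement
    and term_genes of the population are annotated to the term."""
    upper = min(term_genes, list_genes)
    total = comb(population, list_genes)
    tail = sum(
        comb(term_genes, i) * comb(population - term_genes, list_genes - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def is_enriched(population: int, term_genes: int, list_genes: int, overlap: int,
                p_cutoff: float = 0.01, min_overlap: int = 3) -> bool:
    """Apply both filters: enough overlapping genes and a small enough p-value."""
    if overlap < min_overlap:
        return False
    return hypergeom_pvalue(population, term_genes, list_genes, overlap) < p_cutoff


# Hypothetical example: 20,000 background genes, a term with 150 genes,
# a mutated-gene list of 30 genes, 3 of which belong to the term.
p = hypergeom_pvalue(20000, 150, 30, 3)
print(f"p = {p:.4g}, enriched = {is_enriched(20000, 150, 30, 3)}")
```

The minimum-overlap filter matters for short gene lists: with only a few dozen input genes, a term can reach a small p-value on the strength of one or two genes, which the overlap requirement excludes.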

A graphical depiction of experimental methods is displayed in Fig. 1c.

Ethics statements

All of the human tissues used in the present study were obtained with written informed consent from all subjects and their legal guardians. All experimental protocols were approved by the Ethics Committee of Qingdao Municipal Hospital [approval number 2024-KY-012]. The study followed both the Declaration of Helsinki and the Good Clinical Practice Guidelines, and informed consent was granted by all participants.

Results

This study included 7 samples (Supplemental Table S1) for WES analysis, from patients aged between 20 and 39 years with a median age of 32 years. Four patients had left ACL injuries and 3 had right ACL injuries. Only one of the 7 participants was female. All patients were diagnosed with ACL injury by MRI and arthroscopy and underwent ligament reconstruction surgery.

Compared to the healthy ligament ends of the ACL tissue, 21 differential sites were detected, including 14 missense mutations (67%), 2 insertions, and 1 deletion (Table 1). Among these, 10 mutation sites had a Combined Annotation Dependent Depletion (CADD) score > 15, and 9 sites had a Sorting Intolerant From Tolerant (SIFT) score < 0.05. THAP4 had a CADD score of 12.58 and a SIFT score of 0.002. FLYWCH1 had a CADD score of 23 and a SIFT score of 0.203. LAMC3 had a CADD score of 21.4 and a SIFT score of 0.51.

Compared to the blood, 24 differential loci were detected in the injured ligaments (Table 2). Among these, 75% (18 sites) were missense mutations, 8% (2 sites) were nonsense mutations, 8% (2 sites) were insertions, 4% (1 site) was a deletion, and 4% (1 site) was a splice_acceptor_variant & intron_variant. One sample showed a single gene deletion, another had 5 gene deletions, and only one sample exhibited 5 gene deletions together with partial chromosome loss (19p), reaching up to 73%.
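
As a quick arithmetic check, the reported percentages can be reproduced from the site counts. The sketch below uses only the counts stated for the comparison with blood; the variable names are illustrative.

```python
from collections import Counter

# Site counts by variant class, as reported for the comparison with blood.
counts = Counter({
    "missense": 18,
    "nonsense": 2,
    "insertion": 2,
    "deletion": 1,
    "splice_acceptor_variant & intron_variant": 1,
})

total = sum(counts.values())
# Share of each class, rounded to the nearest whole percent.
percentages = {kind: round(100 * n / total) for kind, n in counts.items()}

print(total)
for kind, pct in percentages.items():
    print(f"{kind}: {pct}%")
```

Because each share is rounded independently, the rounded percentages sum to 99% rather than 100%, while the underlying counts add up to all 24 loci.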

The gene deletions in the two sample groups were all located on chromosomes 16 or 19 (Table 3). Fourteen mutation sites had a CADD score > 15, and 8 sites had a SIFT score < 0.05. For the same mutation sites, different scoring methods (such as CADD, Poly, SIFT) yielded different results. THAP4 had a CADD score of 12.58 and a SIFT score of 0.02. LAMC3 had a CADD score of 21.34 and a SIFT score of 0.051, and ZNF492 had a CADD score of 15.34 and a SIFT score of 0.521. FLYWCH1 had a CADD score of 23.1 and a SIFT score of 0.203. HDLBP had a CADD score of 25.8 and a SIFT score of 0.086, while RHPN2 had a CADD score of 18.06 and a SIFT score of 0.33.
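
The disagreement between scoring methods can be made explicit by applying both thresholds to the scores quoted in this paragraph. This is a minimal sketch that assumes CADD > 15 and SIFT < 0.05 are treated as the cutoffs for a potentially deleterious variant; the helper name `classify` is hypothetical.

```python
CADD_THRESHOLD = 15.0  # flagged when the CADD score is above this value
SIFT_THRESHOLD = 0.05  # flagged when the SIFT score is below this value

# (CADD, SIFT) scores as quoted in the paragraph above.
scores = {
    "THAP4": (12.58, 0.02),
    "LAMC3": (21.34, 0.051),
    "ZNF492": (15.34, 0.521),
    "FLYWCH1": (23.1, 0.203),
    "HDLBP": (25.8, 0.086),
    "RHPN2": (18.06, 0.33),
}


def classify(cadd: float, sift: float) -> tuple:
    """Return (flagged_by_cadd, flagged_by_sift) for one variant."""
    return cadd > CADD_THRESHOLD, sift < SIFT_THRESHOLD


# Genes for which the two predictors reach different conclusions.
discordant = []
for gene, (cadd, sift) in scores.items():
    by_cadd, by_sift = classify(cadd, sift)
    if by_cadd != by_sift:
        discordant.append(gene)
    print(f"{gene}: CADD flag={by_cadd}, SIFT flag={by_sift}")

print(discordant)
```

Under these cutoffs, THAP4 is flagged by SIFT only and the other five genes by CADD only, so requiring agreement between the two predictors would retain none of the six.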

Mutation data were imported into Metascape for enrichment analysis. GO enrichment analysis revealed that the mutated genes were enriched in gliogenesis (ABL1, LAMB1, LAMC3), cell adhesion (ABL1, CD5, LAMB1), transmembrane transport of substances (ABL1, CACNB4, CYBA), post-translational protein modification (ABL1, GATA2, FBXO2), axon guidance (ABL1, CACNB4, LAMB1), RHO GTPase Effectors (ABL1, CYBA, RHPN2), and signaling by Receptor Tyrosine Kinases (CYBA, LAMB1, LAMC3), among others (Fig. 2).

Fig. 2 Enrichment analysis of mutated genes.

The GO enrichment results for the deleted genes are shown in Fig. 3a and include positive regulation of the transforming growth factor beta receptor signaling pathway (CREBBP, STK11, AXIN1) and negative regulation of TOR signaling (STK11, TSC2, NPRL3). The KEGG enrichment results are summarized in Fig. 3b and include the thyroid hormone signaling pathway (CREBBP, NOTCH3, TSC2), the mTOR signaling pathway (STK11, TSC2, NPRL3), and the AMPK signaling pathway (PPP2R1A, STK11, TSC2).

Fig. 3 Enrichment analysis of deleted genes. (a) GO enrichment analysis. (b) The enrichment results of deleted genes in the KEGG.

Discussion

In recent years, there has been limited progress in understanding the genetic susceptibility to ACL injury. Current research focuses on specific SNP sites, whose effects may vary with ethnicity. This study used WES to analyze differences between injured ligament ends and both blood cells and healthy ligament ends, but the results were not satisfactory and lacked consistency. The observed differences in mutation sites may be associated with ligament injuries, or they could simply be individual mutations unrelated to ligament damage.

The extracellular matrix (ECM) is a non-cellular three-dimensional macromolecular network that usually contains collagens, proteoglycans/glycosaminoglycans, laminins, and several other glycoproteins27. The properties of the ECM are not static, and processes such as collagen turnover, transcriptional and post-translational modifications, and the release of growth factors can promote ECM remodeling28,29. The ECM is an important component of tendons, ligaments, and other tissues, and its basic structure is collagen, which plays an important role in maintaining force transmission, stabilizing tissue structure, and influencing cell signaling28,30,31.

Genetic variants related to ACL injury show associations with the ECM, for example variants in the LAMB1, LAMC3, ABL1, and GATA2 genes. LAMB1 and LAMC3 are members of the ECM glycoproteins and are likely critical for the maintenance of ECM composition32,33,34,35. It has been found that COL1A2, COL4A1, COL4A2, etc., as ECM components, have a strong correlation with LAMB136. LAMC3 also harmonizes TGF-β signaling, and disruption of the TGF-β signaling pathway has been shown to result in the loss of ligaments37. Studies also suggest that GATA2 interacts with Smad4 to inhibit the TGF-β signaling pathway38,39,40. Loss of TGF-β signaling would lead to defective development of ligaments and ligamentous laxity41,42. Treatment of fibroblasts in the uterosacral ligament with TGF-β1 attenuates ECM loss, and similarly, TGF-β stimulates collagen synthesis, which stimulates ligament healing by binding to epidermal growth factor43,44. Notably, in this study, missense mutations in the LAMB1 and LAMC3 genes were identified in ACL injury. We hypothesize that the mutated genes interfere with the initiation of ligament repair via TGF-β signaling45, affecting ECM function and ultimately reducing ligament rupture strength. Other genes related to ECM components, such as ABL1, can modulate the ECM and are active in a range of biological processes46,47.

The molecular mechanisms regulating ACL injury are not yet fully understood. We therefore examined the remaining genes identified in the current study. It has been reported that KDM4B can promote osteogenic differentiation by regulating the function of MSCs and thus interacting with p65 to promote bone loss48,49; ARSD and FBXO2 have been reported to be associated with JAK2/STAT3 pathway regulation and with local inflammation and repair in osteoarticular joints50,51,52; and Zhang et al. found that DDX-11 can regulate osteosarcoma development53. Because existing studies associate these genes with bone repair and destruction, we hypothesize that they may regulate physiological changes after ACL injury through certain pathways. ACSF3 has been reported to be associated with the oxidative respiratory chain of fibroblasts, and its deletion leads to impaired mitochondrial respiration, reduced lipid acylation, and decreased glycolytic flux in fibroblasts54. It may therefore affect the components of the ACL by interfering with oxidative processes.

Some of the genes of interest in this study, such as KATNIP, NBPF10, and FLYWCH1, have been studied mainly in the field of tumors and have not yet been investigated in ligaments. Most of the genes of interest in this study therefore still need further experimental verification to clarify the specific mechanisms by which they regulate ACL injury.

GO enrichment analysis revealed that these genes are enriched in RHO GTPase Effectors (Fig. 2). The RHO (RAS homologous) family belongs to the Ras superfamily of guanine nucleotide-binding proteins and can regulate or directly modulate the expression of approximately 1% of proteins, controlling almost all fundamental cellular processes55,56. CD5 has been reported to modulate cellular behavior57,58 and to serve as a receptor and substrate for the protein tyrosine kinase p56lck59, which is in line with the Signaling by Receptor Tyrosine Kinases term identified in the enrichment analysis. This supports the possibility that CD5 mutations have an important role in ACL injury.

This study has some limitations that need to be acknowledged. First, the sample size was small, with analysis conducted on only 7 samples, so caution should be exercised when interpreting the results. Larger sample sizes for exome sequencing analysis are needed to further understand the role of gene mutations in ACL injuries, and it may be necessary to establish more stringent criteria for sample selection. Furthermore, previous studies have not provided conclusive evidence linking these genes to ligament or muscle injuries, and this experiment did not include validation using cellular or animal models.