Introduction

Leonurus japonicus Houtt, a member of the Lamiaceae family, is an annual or biennial herbaceous plant native to various parts of Asia but now widely distributed around the world1. L. japonicus is a traditional Chinese medicinal herb first recorded in China’s earliest pharmacopoeia, Shennong Baicao Jing. Its aerial parts are recognized in traditional Chinese medicine for their effects in promoting blood circulation, regulating menstruation, removing blood stasis, reducing swelling, and treating female-related diseases2. Currently, over 280 compounds have been extracted from L. japonicus, including flavonoids, alkaloids, diterpenes, phenylpropanoids, and phenolic acids, which possess significant potential for various applications3. In recent years, it has been discovered that L. japonicus possesses important pharmacological activities such as antibacterial, anti-inflammatory, anti-apoptotic, antioxidant, immune regulation, neuroprotection, and protection of the uterus and cardiovascular system4. These effects are all derived from the secondary metabolites contained in L. japonicus. The flowers and stems of L. japonicus possess an aesthetic appeal, showcasing a fresh and pure natural simplicity. Additionally, L. japonicus is favored by the people in southern China as an edible vegetable. Therefore, it is considered a resource with significant ornamental value and dual uses in medicine and food. Conducting in-depth research on this plant is beneficial for exploring its potential applications. The extensive research on L. japonicus has already provided information on its morphology, genetic diversity, metabolomics, transcriptomics, and genomics5,6,7,8,9,10,11. However, due to the lack of the mitochondrial genome of L. japonicus, elucidating the biological functions of key genes associated with mitochondrial traits remains highly challenging.

Mitochondria are important energy-producing organelles that supply the ATP required for life, primarily through oxidative phosphorylation12. Further study has shown that they also act as processors of the cell: together with the nucleus and other organelles, they form an information-processing system and are involved in all phases of the cell cycle13,14,15,16,17. Recent reports suggest that all mitochondria originated from a common ancestral organelle, which was formed through a symbiotic relationship between an α-proteobacterium and a host cell related to Asgard archaea18. Mitochondrial DNA (mtDNA) exists independently of the nuclear genome within the cell. During evolution, mtDNA and the nuclear genome have continuously exchanged genetic material, and the former has come under the control of the latter, which has made mitochondria semi-autonomous organelles19. The number of genes in mtDNA is relatively small, with fewer than 40 genes. It also possesses several characteristics, including a relatively fast evolutionary rate, relative conservation, and maternal inheritance20. These properties make it highly advantageous for the study of genetic mechanisms. Mitochondria are also important in plant stress tolerance: under abiotic stress, mitochondria in plant cells undergo a variety of metabolic reactions that enhance the plant’s ability to adapt21. Genome-level studies of plant mitochondria can therefore provide important references for molecular genetics, with significance for species classification, adaptation to adversity, and systematic evolution.

To date, although the organelle genomes of many species have been extensively studied, reports on chloroplast genomes (cpDNA) and mtDNA are unbalanced. The NCBI database holds nearly 13,000 published cpDNAs, while the number of published plant mtDNAs is less than 10% of that figure22. This imbalance may be due to the higher complexity of plant mtDNAs, which are often influenced by the nuclear genome, making their assembly more challenging. Although the chloroplast and nuclear genomes of L. japonicus have been reported, its mitochondrial genome has not been reported to date, which severely limits molecular biology research on this species. In this study, we sequenced, assembled, and annotated the mitochondrial genome of L. japonicus and analyzed its genomic features, codon usage, RNA editing, and repetitive sequences. These results provide important reference data for molecular genetics research on L. japonicus.

Results

Features of the L. japonicus mtDNA

The assembly of the L. japonicus mtDNA reveals (Fig. 1) that its total length is 382,905 bp, with a base composition of 27.31% A, 27.56% T, 22.39% C, and 22.74% G, resulting in a GC content of 45.13%. As shown in Table 1, a total of 51 genes were identified in the complete mtDNA, including 15 tRNA genes, 32 protein-coding genes (PCGs), and 4 rRNA genes. The 32 PCGs are classified into 10 categories: NADH dehydrogenase (9), ATP synthase (5), Small ribosomal subunit proteins (5), Cytochrome c biogenesis (4), Cytochrome c oxidase (3), Large ribosomal subunit proteins (2), Ubiquinol cytochrome c reductase (1), Succinate dehydrogenase (1), Transport membrane protein (1), and Maturases (1). During the annotation of this mtDNA, we found that trnI-GAT, cox1, ccmFc, cox2, rps3, and rps10 each contain one intron; nad7 and nad4 each contain three introns; and nad5, nad2, and nad1 each contain four introns. Additionally, there are two copies of rrn18 and five copies of trnM-CAT in the genome.
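
The base composition and GC content reported above follow directly from nucleotide counts. A minimal Python sketch of that arithmetic is given here; the function name and the toy sequence are hypothetical illustrations and are not part of the authors' pipeline.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the percentage of A, T, C and G plus the GC content of a DNA sequence."""
    seq = seq.upper()
    # Count only unambiguous bases; anything else (e.g. N) is ignored.
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    pct = {b: 100.0 * counts[b] / total for b in "ATCG"}
    pct["GC"] = pct["G"] + pct["C"]
    return pct


# Toy sequence for illustration only (not taken from the assembly).
toy = "ATGCGCATTA"
comp = base_composition(toy)
print(comp)
```

Applied to the full assembled sequence, the same calculation would be expected to reproduce the percentages listed in the text, since the GC content is simply the sum of the G and C percentages.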

Fig. 1

Leonurus japonicus mt genome map. The outer and inner sides of the largest circle in the diagram represent genes that are transcribed in the reverse and forward directions, respectively. The dark portions of the inner circle indicate GC content. Genes are represented in different colored blocks based on their functions.

Table 1 Classification of genes in the L. japonicus mtDNA.

Prediction of RNA editing sites

RNA editing primarily occurs in plant organelles and is closely related to their functions; it is an important molecular mechanism in mtDNA. As shown in Fig. 2, we predicted a total of 480 RNA editing sites among the 32 PCGs in the L. japonicus mtDNA. The majority of these RNA editing sites (84.79%) are of the cytosine-to-uracil (C-U) type. Of the 480 sites, 184 (38.33%) are located at the first position of the codon, while 296 (61.67%) are at the second position. The gene ccmB has the most RNA editing sites, with 36, followed by ccmFn with 35. Genes ccmC, nad4, and nad6 each have more than 30 RNA editing sites. Conversely, atp9 has the fewest RNA editing sites, with only one. Additionally, we speculate that RNA editing may lead to premature termination of the rps10 gene. RNA editing results in 7.80% of amino acids changing from hydrophobic to hydrophilic, 49.27% changing from hydrophilic to hydrophobic, and 42.69% of hydrophilic amino acids remaining unchanged (Table 2).

Fig. 2

RNA editing sites in the L. japonicus mtDNA. The x-axis represents different gene names, while the y-axis indicates the number of RNA editing sites.

Table 2 Analysis of RNA editing sites and types in the L. japonicus mt genome.

Repeat sequence analysis

In the L. japonicus mtDNA, there are numerous dispersed, tandem, and simple sequence repeats (DSR, TSR, and SSR), which are widely distributed throughout various regions of the genome (Fig. 3A). A total of 49 DSRs were identified in this mtDNA (Fig. 3B), including 17 forward repeats, 21 palindromic repeats, and 11 reverse repeats. The majority of these DSRs (67.35%) are between 30 and 70 bp in length. These DSRs may be closely related to the function of the mtDNA. A total of 10 TSRs were predicted in this mtDNA (Table 3), all with a match rate higher than 90% and lengths ranging from 6 to 41 bp. Additionally, 74 SSRs were identified in this mtDNA (Fig. 3C,D), with di-nucleotide repeats being the most common (27%), followed by mono-nucleotide repeats (23%), tetra-nucleotide repeats (23%), and tri-nucleotide repeats (21.6%). The least common were penta-nucleotide and hexa-nucleotide repeats, each accounting for only 2.7% of the total. Among mono-nucleotide SSRs, adenine (A) repeats and thymine (T) repeats are the most prevalent, each constituting 47.06% of the total. In di-nucleotide SSRs, AG and CT repeats are the most common, each making up 30% of the total. These SSRs can serve as molecular markers, providing important genetic information for plant classification and identification.

Fig. 3

Repeat sequence analysis of the L. japonicus mt genome. (A) The distribution of repeat sequences. The outer circle displays different genes. The blue inner arc indicates repeat sequences that are greater than or equal to 30 bp in length. (B) Histograms of lengths and classes of different DSRs. Different colors represent different types of DSRs. The x-axis and y-axis indicate the length and quantity of DSRs, respectively. (C) Proportion of different types of SSRs. Different colors represent different types of SSRs. (D) Statistical histograms of various SSRs. The x-axis displays different SSRs, while the y-axis shows the length of each SSR.

Table 3 Distribution of perfect tandem repeats in L. japonicus mtDNA.

Homologous sequence analysis between organellar genomes

Mitochondria and chloroplasts are semi-autonomous organelles of endosymbiotic origin that reside in the cytoplasm, and they share many similarities, including a number of identical genes. Previous studies have shown that during plant evolution, frequent DNA sequence transfers occur between the mitochondria, chloroplasts, and the nuclear genome23,24,25,26. In this study, we found that the mtDNA of L. japonicus (382,905 bp) is approximately 2.5 times the size of its cpDNA (151,654 bp). Because the mtDNA contains relatively few genes, the genes in its homologous regions are more dispersed, whereas the chloroplast genes are more compactly distributed (Fig. 4). As shown in Supplementary Table 1, we detected 28 segments in the L. japonicus mtDNA that may be involved in DNA migration between the two organelles, with a total length of 21,113 bp (5.51% of the mtDNA). These homologous segments range in length from 30 to 5541 bp. The mtDNA of L. japonicus contains 5 genes that are fully matched with the cpDNA (trnI-GAT, trnW-CCA, trnD-GTC, trnH-GTG, and trnM-CAT) and 5 genes with partial matches (nad2, nad1, rrn18, nad5, and rrn26). The cpDNA of L. japonicus contains 7 genes that are fully matched with the mtDNA (rrn23S, trnA-TGC, trnD-GTC, trnH-GTG, trnI-CAT, trnI-GAT, and trnP-TGG) and 12 genes with partial matches (psaA, rbcL, rpoB, rpoC1, rpoC2, rps3, rrn16S, rrn23S, rrn4.5S, trnM-CAT, trnW-CCA, and ycf2).

Fig. 4

Schematic diagram of homologous fragments between organelle genomes in L. japonicus. In the diagram, the blue and orange bands in the inner circle represent cpDNA and mtDNA, respectively. The outer circle displays different genes, while the green inner arc indicates homologous DNA segments.

Relative synonymous codon usage (RSCU) analysis

The RSCU value is the ratio of the observed frequency of a codon to the frequency expected if all synonymous codons for the same amino acid were used equally; an RSCU of 1 indicates no usage bias, and values above 1 indicate that the codon is used more often than expected. The RSCU values of L. japonicus mtDNA are shown in Fig. 5. In this mtDNA, 28 codons were identified with RSCU > 1, indicating that these codons are preferentially used. The GCU codon encoding Ala has the highest RSCU value (1.63), indicating the strongest preference among synonymous codons. Moreover, over 90% (28) of the high-frequency codons end with A/U, which further indicates a preference for A/U at the third position of these codons.

Fig. 5

Analysis of RSCU in the L. japonicus mtDNA. The x-axis and y-axis represent the types of amino acids and their RSCU values, respectively. Each amino acid, which can be encoded by multiple codons, is presented in different colored histograms.

Ka/Ks analysis

The synonymous substitution rate (Ks) and non-synonymous substitution rate (Ka) are useful for analyzing the evolutionary dynamics and phylogenetic relationships of protein-coding sequences between closely related species27. The ratio of non-synonymous to synonymous substitution rates (Ka/Ks) indicates the type of selective pressure a gene has experienced: Ka/Ks > 1 indicates positive selection, Ka/Ks = 1 suggests neutral evolution, and Ka/Ks < 1 implies negative (purifying) selection. Comparing L. japonicus with five other Lamiaceae mtDNAs (Fig. 6), the Ka/Ks values for ccmB, mttB, and rps10 all exceed 1, suggesting that these genes are likely under positive selection. In contrast, the majority of genes (90%) have Ka/Ks values below 1, indicating that they are under negative selection. This suggests that most PCGs in the L. japonicus mtDNA have been highly conserved during evolution.

Fig. 6

Boxplots of the Ka/Ks ratios for L. japonicus and five other plants. The x-axis represents different gene names, while the y-axis indicates Ka/Ks values.

Phylogenetic analysis

The high conservation of PCGs in mtDNA makes them well suited to constructing phylogenetic trees and resolving evolutionary relationships among plants. To determine the phylogenetic position of L. japonicus, a phylogenetic tree was constructed using 21 shared PCGs from the mtDNAs of 24 species, including 9 Lamiaceae plants, 3 Cucurbitaceae plants, 2 Poaceae plants, 2 Solanaceae plants, 1 Caricaceae plant, 1 Bataceae plant, 1 Apiaceae plant, 1 Asparagaceae plant, 1 Araceae plant, 1 Arecaceae plant, 1 Ginkgoaceae plant (outgroup), and 1 Marchantiaceae plant (outgroup). The resulting tree is consistent with traditional taxonomic relationships and strongly supports the separation of different plant families (Fig. 7). Apart from the outgroups, the tree divides the plants into three major groups: Lamiaceae, Solanaceae, and Apiaceae are grouped together; Cucurbitaceae, Caricaceae, and Bataceae are grouped together; and Poaceae, Asparagaceae, Araceae, and Arecaceae are grouped together. Among the families sampled, Lamiaceae is most closely related to Solanaceae. In addition, within Lamiaceae, the genera Leonurus and Scutellaria have the closest relationship, indicating that these two genera are sister groups.

Fig. 7

Phylogenetic tree of 24 species (1000 bootstrap replicates). The numbers in the diagram represent the bootstrap values of each node. The different colors on the right indicate the families to which each species belongs.

Discussion

Both chloroplasts and mitochondria are crucial organelles that play essential roles in the growth, development, and life activities of plant cells28,29. Both are semi-autonomous organelles, and their genomes exhibit complex structures, including circular, branched, linear, and mixed multi-ring forms22,30,31. This structural diversity suggests functional potential that remains to be explored. In this study, we report for the first time the mitochondrial DNA (mtDNA) of the important medicinal and edible plant L. japonicus. Similar to Perilla frutescens, the mtDNA of L. japonicus exhibits a typical circular structure (Fig. 1)32. This conserved topological feature may be associated with a relatively low frequency of homologous recombination, which would help maintain the stability of the genomic structure. In previous studies, the mitochondrial genomes of some Lamiaceae species have been shown to possess complex multimeric structures, as observed in Prunella vulgaris, Salvia officinalis, and Scutellaria tsinyunensis33,34,35. The mtDNAs of these closely related species thus exhibit remarkable conformational diversity, which reflects the adaptability and structural complexity of Lamiaceae mtDNA during evolution34. The full length of the L. japonicus mtDNA is 382,905 bp, which is similar to the mtDNA size of S. tsinyunensis (354,073 bp). GC content is an important factor in studies of biological evolution and genomic adaptation36. The GC content of L. japonicus mtDNA is 45.13%, which is similar to that of other Lamiaceae plants, such as Vitex rotundifolia (45.54%)37, S. tsinyunensis (45.26%)35, and P. frutescens (45.23%)32. The GC content of mtDNA in Lamiaceae plants therefore appears to have been relatively conserved over long-term evolution.
Angiosperm mtDNA contains 24 core protein-coding genes, which are primarily associated with functional categories including ATPase subunits, cytochrome c biogenesis, and cytochrome c oxidase subunits35. A total of 51 genes were identified in L. japonicus mtDNA, including 15 tRNA genes, 32 protein-coding genes (PCGs), and 4 rRNA genes. The types and numbers of important core protein-coding genes in the mtDNA of L. japonicus are generally consistent with those of most Lamiaceae plants, but its tRNA genes show partial deletions38. This significant feature may be due to the combined action of various molecular mechanisms during the long-term evolution of L. japonicus mtDNA, including homologous recombination, horizontal gene transfer, and retrograde signaling from the nuclear genome39,40,41. These mechanisms have collectively promoted more frequent gene exchange between the mtDNA, nuclear genome, and cpDNA. The sequence transfers between chloroplasts and mitochondria within plant cells may also be a dynamically evolving endosymbiotic process42. By comparison we identified 28 sequence fragments that may be involved in the migration of DNA between the genomes of the two organelles. This sequence migration may be the primary reason for the differences in gene numbers among the mtDNAs of different plants.

After transcription is complete, RNA editing occurs, a process that modifies the RNA sequence. This can result in differences in protein function and expression compared to what is defined in the genome43. This process plays a crucial regulatory role in the life cycle of plants. In this study, we identified a total of 480 RNA editing sites in the PCGs of mtDNA, with the vast majority (84.79%) classified as C-U type. The preferential selection of G/C codons during RNA editing may be attributed to their relatively high binding free energy, a molecular characteristic that contributes to the maintenance of translational accuracy44. Interestingly, RNA editing events in the mitochondrial DNA (mtDNA) of Leonurus japonicus lead to the premature introduction of a stop codon in the rps10 gene and generate start codons in three genes: cox2, nad4L, and rps10. The premature termination of translation or abnormal generation of start codons could potentially result from either the organism’s adaptive response to environmental conditions or the occurrence of nonsense mutations45,46. These editing events are typically associated with the production of highly conserved homologous proteins, a mechanism that may contribute to the optimization of gene expression efficiency within mitochondria47. Additionally, the frequency of editing sites in different PCGs varies significantly among species. In Abelmoschus esculentus, Siberian larch, and Fritillaria ussuriensis, the genes rpl2, nad5, and nad4 have the most editing sites, respectively48,49,50. In the mtDNA of L. japonicus, the gene ccmB has the highest number of editing sites (Fig. 2). Additionally, we predicted that certain genes related to cytochrome c biogenesis and NADH dehydrogenase (ccmB, ccmC, ccmFn, nad2, nad4, and nad5) in L. japonicus mtDNA have a relatively higher number of editing sites. 
Previous studies have shown that RNA editing events in the mtDNA of many Lamiaceae species are strongly biased towards genes associated with cytochrome c biogenesis and NADH dehydrogenase33,37,38,51, and L. japonicus follows the same pattern. This editing preference likely reflects the Lamiaceae family’s critical dependence on energy metabolism and redox homeostasis, which is closely linked to biological characteristics such as high essential oil content, rapid growth rates, and environmental adaptability52,53,54. This editing strategy may serve a dual purpose: it acts as a compensatory mechanism for the high mutation rate in mitochondrial genomes, while simultaneously optimizing metabolic efficiency and enhancing stress resistance43,44,55,56. Hydrophilic amino acids interact with water molecules, guiding the proper folding of protein structures30. When a large number of hydrophilic amino acids are replaced by hydrophobic ones, forming a more stable hydrophobic core, the overall stability of the protein structure increases. In L. japonicus mtDNA, the most common editing outcome (49.27%) is a change from hydrophilic to hydrophobic amino acids, which may enhance protein stability. In the genomes of species, codon usage bias often arises from the combined effects of dynamic changes in the external environment and internal factors. Based on RSCU analysis, we identified 28 highly frequent codons in the mtDNA of L. japonicus, the majority of which end with A/U. This phenomenon is also observed in the mtDNAs of various other plants19,57,58. Additionally, the RSCU results can provide a reference for exploring the genetic mapping of L. japonicus during the evolutionary process.
Ka/Ks analysis is commonly used to determine whether protein-coding genes (PCGs) are influenced by selective pressure during genetic evolution. In this study, we found that the majority of PCGs in the mtDNA of L. japonicus are affected by purifying selection. However, genes under positive selection, such as ccmB, mttB, and rps10, have already been identified, indicating that they may play a significant role in the evolution of the species.

Large numbers of repeat sequences are present in the mtDNAs of angiosperms, and they play an important role in genetic molecular markers, genomic structural variation, and genetic stability studies across species59,60. DSRs are identical or similar DNA sequences that are scattered at different locations within the genome. In mtDNA, these sequences are often regarded as segmental duplications involved in the DNA double-strand break repair process61. DSRs include four types: forward, reverse, complement, and palindromic repeats. According to Li et al., DSRs in the mtDNAs of nine Lamiales plants mainly take the form of forward and reverse repeats35. In contrast, L. japonicus mtDNA also contains a notable number of DSRs in the form of Complement repeats. This phenomenon may suggest that specific evolutionary pressures or changes in genomic repair mechanisms have led to the accumulation of these Complement repeats. SSRs (simple sequence repeats) possess high polymorphism, ease of detection, extensive genomic coverage, relative abundance, and codominant inheritance, making them uniquely valuable for genetic marker development, cultivar breeding, and population genetics research62. In 1997, Szibor et al. discovered dinucleotide polymorphisms in the D-loop region of human mtDNA, which can be used to distinguish between different human populations63. In 1999, Soranzo et al. were the first to discover SSR polymorphism in the mtDNA of 15 species of conifers64. Khera et al. also developed SSR markers in the mtDNA of Cajanus cajan, and they noted in their report that these SSR markers can distinguish genotypes based on mtDNA type in phylogenetic studies65. In this study, we identified 74 SSRs distributed throughout the mtDNA of L. japonicus.
In contrast to the predominance of tetranucleotide repeats reported in most Lamiaceae species, dinucleotide repeats constitute the most abundant SSR type in L. japonicus mtDNA, accounting for 27% of the total35,51. This distinctive pattern may reflect unique selective pressures acting on the mitochondrial genome during its evolutionary history or could be associated with its specific ecological adaptations. These identified SSRs will aid in the identification and conservation of important species resources.

The mtDNAs are conserved and slow to mutate during species evolution, and they often play an important role in phylogenetic studies19. To investigate the systematic evolutionary relationships of L. japonicus, we collected high-quality mtDNAs from various plants within the Lamiaceae family and other families. Phylogenetic analysis based on their shared conserved PCGs showed that the Lamiaceae family is more closely related to the Solanaceae family among the numerous families studied. Additionally, within the Lamiaceae family, Leonurus and Scutellaria are likely to be sister genera. Compared to cpDNA, plant mtDNA is generally larger and more structurally variable, which means that, in certain cases, mtDNA may provide richer phylogenetic information than cpDNA66. Currently, research on plant mtDNA is receiving increasing attention, and it is believed that it will play an important role in future systematic evolutionary studies. The mtDNA of L. japonicus provides important reference data for exploring the evolutionary history of Lamiaceae species and further contributes valuable information for molecular genetic studies of L. japonicus.

Conclusions

In this study, we have for the first time assembled the complete mtDNA of L. japonicus using both second-generation and third-generation sequencing data. This assembly strategy has resulted in a more complete and reliable genome assembly. Through genomic characterization of the complete mtDNA and its annotation results, we found that the mtDNA of L. japonicus is 382,905 bp in length, including 15 tRNA genes, 32 PCGs, and 4 rRNA genes. By combining RNA editing site prediction results with previous studies, we observed a higher frequency of editing sites in genes related to cytochrome c biogenesis and NADH dehydrogenase. The Ka/Ks results also indicate that the majority of PCGs in mtDNA are highly conserved. Additionally, phylogenetic analysis based on shared PCGs from 24 mtDNAs revealed the evolutionary positioning of L. japonicus. Our study provides important reference data for the molecular genetics, dynamic evolution, and species identification of L. japonicus, thereby advancing the conservation and development of this important medicinal and edible plant resource.

Materials & methods

Plant collection, DNA extraction, and sequencing

L. japonicus plants cultivated for 5 months at the Baiyun Experimental Base of Guangdong Academy of Agricultural Sciences (N23°07ʹ29ʺ, E113°10ʹ31ʺ) were selected as experimental materials. In June 2024, fresh leaves of L. japonicus were collected using liquid nitrogen for quick freezing. The collected samples were stored in a deep freezer (− 80 °C). The plant material used in this study was formally identified by Professor Keming Liu, a retired professor of plant taxonomy from Hunan Normal University. Specimens were preserved in the Key Laboratory Building of the College of Forestry and Horticulture at Hubei Minzu University (N30°17ʹ51ʺ, E109°29ʹ54ʺ, person: Qun Hu, email: [email protected]) with the voucher number 20240627009. Total DNA was extracted using a plant genome DNA extraction kit (Tsingke Biotech, Beijing, China). The quantity and quality of extracted DNA were evaluated using a NanoDrop One spectrophotometer and agarose gel electrophoresis. Qualified samples were then sent to Guangzhou Yuda Biotechnology Company Limited (Guangzhou, China) for Illumina and SMRT sequencing to obtain high-accuracy reads.

Assembly and annotation of mtDNA

We used SMRT Analysis v2.3.0 to assess the quality of the sequencing data generated by the PacBio RS II system and to identify and filter out low-quality reads. We then extracted the mtDNA sequences of L. japonicus from the filtered reads. The obtained sequences were aligned to plastid genome data in the NCBI database, and sequences with alignment rates greater than 90% were filtered out to maximize the quality and accuracy of the mitochondrial genome assembly. Subsequently, we used Canu v2.2 to correct the third-generation data and Bowtie 2 v2.5.2 to align the second-generation data to the corrected sequences67,68. Finally, we used Unicycler v0.5.0 with default parameters to assemble the second-generation and third-generation data, resulting in the complete circular mtDNA of L. japonicus69.

We used BLAST to align the rRNA and protein-coding genes of the assembled mtDNA against published plant mitochondrial sequences, and the resulting annotations were manually curated using closely related species as the primary reference. tRNAs and ORFs were annotated using tRNAscan-SE (https://lowelab.ucsc.edu/tRNAscan-SE/) and Open Reading Frame Finder (https://www.ncbi.nlm.nih.gov/orffinder/)70,71. Finally, we used OGDRAW software (https://chlorobox.mpimp-golm.mpg.de/OGDraw.html) to draw the mtDNA map of L. japonicus72.

Prediction of RNA editing sites

To predict the RNA editing sites of L. japonicus mtDNA, we imported the sequence files of the PCGs into the PREPACT online website (http://www.prepact.de/prepact-main.php), then chose to use the complete mtDNA of Arabidopsis thaliana as a template for predicting the editing sites73. Finally, we compiled the predicted results from the comparison and plotted a histogram.

Repetitive sequence analysis

Repeated sequences include SSRs, TSRs, and DSRs. We used MISA software to predict SSRs in the L. japonicus mtDNA (parameters: 10 5 4 3 3 3)74. We predicted TSRs in the L. japonicus mtDNA using Tandem Repeats Finder software (TRF-v4.09.1, parameters: 2 7 7 80 10 50 2000, − f − d − m)75. Subsequently, we predicted DSRs in the mtDNA using the REPuter website (https://bibiserv.cebitec.uni-bielefeld.de/reputer)76. We used the Advanced Circos program in TBtools software to visualize the repeated sequences77.
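
To illustrate how the SSR thresholds above work, the following Python sketch scans a sequence for perfect repeats using minimum repeat counts of 10, 5, 4, 3, 3, and 3 for mono- to hexa-nucleotide motifs. It is a simplified stand-in written for this explanation, not the MISA implementation, and the function names and the toy sequence are hypothetical.

```python
# Minimum number of repeats per motif length (mono- to hexa-nucleotide),
# mirroring the "10 5 4 3 3 3" setting quoted in the text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_redundant(motif: str) -> bool:
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    k = len(motif)
    return any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq: str, min_repeats=MIN_REPEATS):
    """Return (1-based start, motif, repeat count) for each perfect SSR found."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        i = 0
        while i + k <= n:
            motif = seq[i:i + k]
            # Skip ambiguous bases and motifs built from a shorter unit.
            if set(motif) - set("ACGT") or is_redundant(motif):
                i += 1
                continue
            reps = 1
            while seq[i + reps * k:i + (reps + 1) * k] == motif:
                reps += 1
            if reps >= min_rep:
                hits.append((i + 1, motif, reps))
                i += reps * k  # continue after the reported repeat
            else:
                i += 1
    return sorted(hits)


# Toy sequence for illustration only: a 12-bp A run and an (AG)6 repeat.
toy = "CCG" + "A" * 12 + "CGC" + "AG" * 6 + "TTC"
ssrs = find_ssrs(toy)
print(ssrs)
```

The sketch treats the sequence as linear and reports only perfect repeats; a repeat spanning the junction of a circular genome would need the sequence to be extended across the origin before scanning.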

DNA migration between mitochondria and chloroplasts

We downloaded the complete cpDNA sequence of L. japonicus from the NCBI database (https://www.ncbi.nlm.nih.gov/nuccore/OQ417592.1). We used BLAST software to compare homologous sequences between the cpDNA and mtDNA of L. japonicus, using length greater than 30 bp, e-value less than 1e−5, and alignment rate greater than 70% as screening criteria78. Subsequently, we used the Advanced Circos program in TBtools software to create plots of the homologous sequences77.
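
The screening criteria can be expressed as a simple filter over BLAST tabular output. The Python sketch below assumes the standard 12-column tabular format (BLAST+ `-outfmt 6`) and interprets "alignment rate" as percent identity; both are assumptions made for illustration, as are the function name and the toy hit lines, none of which come from the study's data.

```python
def filter_blast_hits(lines, min_length=30, max_evalue=1e-5, min_identity=70.0):
    """Keep hits longer than min_length bp, with e-value below max_evalue
    and percent identity above min_identity.

    Assumes tab-separated columns in the standard order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        f = line.rstrip("\n").split("\t")
        pident, length, evalue = float(f[2]), int(f[3]), float(f[10])
        if length > min_length and evalue < max_evalue and pident > min_identity:
            kept.append({"query": f[0], "subject": f[1],
                         "pident": pident, "length": length, "evalue": evalue})
    return kept


# Toy hits for illustration only (hypothetical values).
toy_hits = [
    "mt\tcp\t99.2\t500\t4\t0\t1\t500\t1001\t1500\t0.0\t900",
    "mt\tcp\t85.0\t120\t18\t0\t2001\t2120\t3001\t3120\t2e-30\t110",
    "mt\tcp\t65.0\t200\t70\t0\t4001\t4200\t5001\t5200\t1e-12\t60",   # identity too low
    "mt\tcp\t100.0\t25\t0\t0\t6001\t6025\t7001\t7025\t1e-6\t48",     # too short
    "mt\tcp\t90.0\t40\t4\t0\t8001\t8040\t9001\t9040\t0.01\t30",      # e-value too high
]
kept = filter_blast_hits(toy_hits)
total_length = sum(h["length"] for h in kept)
print(len(kept), total_length)
```

Summing the lengths of the retained hits in this way gives the kind of total homologous length reported in the Results, although overlapping hits would need to be merged first to avoid double counting.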

Codon usage patterns of mtDNA

To understand the codon usage pattern of the mtDNA of L. japonicus, we first used a Perl script to filter out all CDS sequences from its mtDNA and import them into a text file. Then, we uploaded this text file to an online cloud platform (http://112.86.217.82:9919/#/) to calculate the RSCU values. Finally, we compiled the results and plotted a stacked chart and a histogram.
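
For reference, RSCU can be computed from codon counts as the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. The Python sketch below is a minimal illustration of that definition, not the cloud platform's implementation; it assumes the standard genetic code and in-frame CDS input, and the function name and toy sequence are hypothetical. Codons are written in DNA form (T rather than U).

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} for every codon whose amino acid occurs in the input."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        # Read complete in-frame codons only; codons with ambiguous bases are skipped.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    # Group synonymous codons by the amino acid (or stop signal) they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        expected = total / len(codons)  # count expected under equal usage
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy CDS for illustration only: three GCT codons and one GCA codon (all Ala).
vals = rscu(["GCTGCTGCTGCA"])
print(vals)
```

With this definition, the RSCU values within one amino acid family always sum to the number of synonymous codons in that family, which is why a value of 1 corresponds to unbiased usage.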

Ka/Ks analysis

We imported the mtDNAs of five Lamiaceae plants (Ajuga reptans KF709392.1, Pogostemon heyneanus MK728874.1, S. tsinyunensis MW553042.1, Scutellaria barbata MZ127834.1, and Scutellaria franchetiana MZ127835.1) along with the mtDNA of L. japonicus into the Genepioneer cloud platform (http://112.86.217.82:9919/#/) to calculate the Ka/Ks values. Subsequently, a box plot was created using Origin 2021 software based on the conserved proteins.

Prediction of the systematic evolutionary relationships

We selected Ginkgo biloba and Marchantia paleacea as outgroups and performed systematic evolutionary analysis using species from the Lamiaceae family and other plant families included in the NCBI database. The mtDNAs of 24 plant species were imported into PhyloSuite v1.2.2 software, where duplicate gene copies and non-shared genes were removed79. Subsequently, we performed multiple sequence alignments of the filtered shared genes using MAFFT v7.313 software80. The concatenated multiple sequence alignment results were used to construct the phylogenetic tree. The optimal substitution model (GTR + F + I + G4) was selected based on the Bayesian Information Criterion (BIC). Maximum Likelihood (ML) analysis with 1000 bootstrap replicates was performed using IQ-TREE v1.6.8 to assess nodal support81. Finally, we imported the constructed phylogenetic tree into the ITOL online website (https://itol.embl.de/) for optimization82.