Introduction

Lung cancer is the leading cause of cancer death worldwide1. Multiple primary lung cancer (MPLC) is defined as multiple tumor nodules in the same or different lung lobes, which can be synchronous or metachronous2. The widespread application of computed tomography (CT) screening has caused a significant increase in the detection of MPLCs. In 1975, Martini and Melamed proposed the first diagnostic criteria for distinguishing MPLCs from intrapulmonary metastases. The Martini and Melamed (M-M) criteria includes the histological type and the involved anatomical areas of each lesion3. However, the M-M standard is found unable to derive a correct diagnosis sometimes, due to the increasing detection of MPLC. Along with the development and wildly usage of next-generation sequencing (NGS), the American College of Chest Physicians (ACCP) guidelines and the American Joint Committee on Cancer (AJCC) Cancer Stage Manual (eighth edition) have added different molecular genetic characteristics as a diagnostic criterion, but no specific genetic characteristic was mentioned4,5. Thus, it is necessary to reveal the genetic characteristics of MPLCs.

As for the treatment of MPLCs, systemic treatments including targeted therapy and immune checkpoint inhibitors (ICIs), besides radical surgery and radiotherapy, are playing increasingly critical roles. However, the efficacy of targeted therapy and ICIs on MPLCs is unclear6,7,8. Several interventional clinical trials have been started for applying EGFR tyrosine kinase inhibitors (EGFR-TKIs) and anti-Programmed Death 1 (anti-PD-1) therapy on MPLC patients (NCT04026841, NCT04982900, NCT05053802, NCT04840758)2. Revealing the frequent mutations, critical pathways, and immunotherapy-related targets in MPLCs can help guide treatment strategies for MPLC patients.

Recent research gradually revealed the genetic characteristics of MPLCs and found that adding molecular genetic characteristics into the criteria can improve the diagnosis accuracy9,10,11,12,13,14,15,16,17,18. However, previous research always collected MPLC patients with fewer than 3 lesions. Finley et al. collected 175 MPLC patients and most patients (125 of 175, 71%) had two lesions19. In the cohort of 26 patients with multifocal SSNs (subsolid nodules), Li et al. found that 17 (65.4%) patients have 2 lesions, 6 (23.1%) patients have 3 lesions, and only 2 (7.7%) patients have no fewer than 5 lesions20. Qu et al. specifically collected MPLC patients with no fewer than three lesions, and they also found that only 10.7% of patients have no fewer than 5 lesions21.

Based on lesions confirmed as independent primary lesions through genetic and clinical characteristics, we defined super MPLC as the MPLC with no fewer than 5 lesions. Previous studies only included patients with 2, 3, and 4 lesions22,23, or grouped patients according to having 2, 3, or greater than or equal to 4 lesions16,20. Focusing on patients with 5 or more lesions can fill the blank of previous research. Also, based on our previous research24, we found that the patients with 5 or more lesions tend to become a separate group (Supplementary Figure 1A). These patients with super MPLCs are more likely to be misdiagnosed as infections and metastases. Besides, super MPLCs often bring the requirement for more precise comprehensive treatment strategies, combined with more times of surgery therapies, medical treatment, minimally invasive ablation, and other methods. Also, the response to the same treatment may vary among different lesions due to molecular genetic differences.

For exploring MPLCs, we have documented a case involving a 56-year-old male patient who presented with six simultaneous invasive adenocarcinomas in 2020, which demonstrated the role of whole-exome sequencing (WES) in elucidating the relationship between these lesions very well. We detected that some lesions derived from aerogenous metastasis, while others originated independently25. Building upon these findings, we then conducted a subsequent investigation involving the collection of 115 multifocal tumor tissues from 32 patients diagnosed with synchronous multiple primary lung cancers. In this study, we initially characterized the involvement of the MAPK/ERK pathway in the development of MPLC using a large-scale sample size. Based on this cohort, in which the mean lesion number of patients is 3.59, we detected that EGFR is the primary driver mutation with the highest proportion in the MAPK/ERK pathway24. Here, for further elucidating the differences between super MPLCs and other MPLCs, and explore the change of mutation landscapes as the lesion number increases, we established a specialized cohort (PUMCH-ssesMPLC) specifically for super MPLCs and performed whole exome sequencing (WES) on these samples. Our results can provide treatment guidance for super MPLCs and bring new insights into the molecular evolution of tumors.

Results

Clinical information

We collected 130 lesions from 18 patients, all of whom were stage I (17/18 is stage IA), with a mean age of 57.7 years, 12 (66.7%) were female, and 3 (16.7%) had a smoking history. Of these patients, 15 (83.3%) patients had multiple lesions in different pulmonary lobes. All patients presented synchronous MPLCs with 5 or more lesions, which are the so-called super MPLCs. The pathologic type of these 130 lesions was adenocarcinoma. Among the four pathological subtypes, the subtype with the highest percentage was atypical adenomatous hyperplasia (AAH, 42.3%), and 77.8% of patients have more than one lesion in invasive stages (minimally invasive adenocarcinoma (MIA) and invasive adenocarcinoma (IAC)) (Supplementary Figure 1B, Supplementary Table 1). Lesions are more prevalent in the upper lobes (67.7%), compared with the middle (3.8%) and lower lobes (28.5%), while lesions have a similar incidence in the left (41.5%) and right (58.5%) lungs (Supplementary Table 2). Other clinical information, such as family history and career, is included in Supplementary Table 3. We found super MPLCs with high prevalence in non-smoking, relatively younger women, and the lesions are mostly located in the upper lobes and classified as adenocarcinoma. The clinicopathological features of these patients are demonstrated in Table 1. These features are similar to the features of MPLCs previously reported in the literature16,26.

Table 1 Clinicopathological features of patients enrolled (N = 18)

All patients underwent surgery using video-assisted thoracoscopic surgery (VATS). Two patients only underwent wedge resection, while most patients (83.3%, 15/18) underwent resection of at least one lung lobe (Supplementary Table 4). Among the 18 patients, 14 only underwent surgical treatment, while the remaining 4 patients underwent targeted therapy and chemotherapy. We conducted a recent follow-up of patients on February 27, 2024. All patients survived at follow-up, with no recurrence or metastasis. Based on follow-up CT examinations, only two patients had imaging changes of unresected nodules that existed before surgery, but both did not meet the criteria for determining malignant nodules (Supplementary Table 5).

The mutational landscape of super MPLCs

We performed whole exome sequencing (WES) on these samples. The mutational landscape of super MPLCs is summarized in Fig. 1A. Among these 130 lesions, BRAF mutations occurred with the highest frequency of 31.5%, followed by EGFR (13.8%) and KRAS (18.5%), while TP53 mutation frequency was low (5.4%). We also detected MET exon 14 skipping events in two lesions (Supplementary Table 6). The analysis based on each patient yielded similar results, and we found that 90% of patients had mutations in BRAF (Supplementary Figure 1C). This result was significantly different from MPLCs with fewer lesions. Previous studies found that in MPLCs with 2.63 lesions on average, the proportions of EGFR and BRAF mutations were 56.1% and 11.0%, respectively16. The proportions in our PUMCH-ssesMPLC cohort differed significantly from theirs. Similarly, compared to subsolid nodules (SSNs) and lung adenocarcinomas, we also found super MPLCs had a significantly higher frequency of BRAF mutations and a lower frequency of EGFR mutations20,27 (Table 2). For validation, we also collected 11 cohorts as independent controls. In Table 2, we summarized the 11 published datasets into 3 types, which include 5 cohorts mainly consisting of early-stage lung cancer with single lesions, 4 cohorts of MPLCs with fewer lesions, and 2 late-stage lung cancer cohorts. Compared with early-stage and metastatic lung cancer, and MPLCs with fewer lesions, the mutation frequency of BRAF in our super-MPLC cohort is significantly higher and the EGFR mutation frequency is significantly lower. Considering the relatively high proportion of preinvasive lesions in our cohort, we extracted preinvasive lesions from the early-stage lung cancer cohort of Xiang et al.28. We found that our cohort with a significantly higher BRAF mutation frequency and a significantly lower EGFR mutation frequency than the preinvasive early-stage lung cancer (Table 2). This further confirms the uniqueness of super MPLC. Furthermore, interestingly, we found that as the mean lesion number of MPLCs increases16,24, the BRAF mutation frequency gradually increases, while the EGFR mutation frequency gradually decreases (Fig. 1B). In this tendency, the comparisons between our cohort and control cohorts all met the statistical significance criteria. Some comparisons among control cohorts are also supported by statistical significance (Fig. 1B). This is also consistent with our findings. In Supplementary Table 7, we compared the clinical information of super MPLCs with published cohorts. We found that compared to single-lesion lung cancer, super MPLCs tend to occur in younger, non-smoking populations, and females, although most do not reach statistical significance.

Fig. 1: The mutational landscape of super MPLCs.
figure 1

A Mutational landscape of nonsynonymous somatic mutations, including various alternation types like frameshift, deletions, and insertions. The columns represent samples, and there are 130 samples in total. Mutated genes are ordered by the mutation frequencies. Pathologic subtypes are annotated in the upper panel. AAH: atypical adenomatous hyperplasia; AIS: adenocarcinoma in situ; MIA: minimally invasive adenocarcinoma; IAC: invasive adenocarcinoma. B Comparison of the mutation frequencies of several driver genes among our PUMCH-ssesMPLC cohort, two cohorts of MPLC with fewer lesions, and an early-stage single-lesion lung cancer cohort. C Functional enrichment analyses on mutated genes. D Co-occurrence or exclusivity of high-frequency mutations. We include mutations occurring in no fewer than 5 samples. The dot plot displays the odds ratios (OR) (Fisher’s exact test) of exclusivity (red) and co-occurrence (blue). The size of the dots also represents the logarithm of the odds ratios. The value of log (OR) is assigned 1 when the OR ratio is positive infinity (since no samples had a common mutation of the two genes). The significances (asterisks) are calculated by the DISCOVER method56. (*** < 0.001; ** < 0.01; * < 0.05).

Table 2 Comparing the mutation rates of several key driver genes in our PUMCH-ssesMPLC cohort with other public datasets

On the other hand, similar to previous studies20,24,29, we found that mutations in super MPLCs were enriched in mitogen-activated protein kinases (MAPK), PI3k-Akt, and Phospholipase D signaling pathways (Fig. 1C). The MAPK signaling pathway is one of the critical pathways in the eukaryotic signaling network, which is crucial for cell proliferation, differentiation, and apoptosis. Multiple key driver mutations in this article are associated with the MAPK signaling pathway. In the MAPK signaling pathway, tyrosine kinase receptors (such as EGFR) are activated by extracellular growth factors, recruiting Sos proteins to the cell membrane, and then activating Ras proteins (such as KRAS). Activated Ras proteins can activate multiple downstream proteins, including the Raf protein family (such as BRAF). BRAF is a crucial MAP kinase kinase kinase that, upon activation, can initiate downstream cascade phosphorylation, producing activated MAPK that enters the nucleus to regulate transcriptional regulation.

Similar to the previous study30, correlation analysis showed that BRAF mutations are mutually exclusive with EGFR mutations (\(p=0.013\)), while there are also rare co-mutations of BRAF and EGFR (Methods, Fig. 1D). We detected that the BRAF mutations are also significantly mutually exclusive with KRAS and MAP2K1 mutations. The EGFR and KRAS also exhibited exclusivity (Fig. 1D). Interestingly, in early-stage lung cancers or SSNs, TP53 was found to be co-altered in 19-60% of the EGFR mutant population20,31,32. But no co-occurrence of EGFR and TP53 mutation was found in our PUMCH-ssesMPLC cohort (Fig. 1A). Also, we found that in different lung lobes (upper lobe vs lower lobe, left lung vs right lung), the frequencies of these mutations have no significant differences. (Supplementary Figures 1D-F). Also, there is no significant difference between the results generated by different mutation calling methods (Supplementary Figure 1G).

Selection preferences for driver gene mutation sites in super MPLCs

We further analyzed the mutation sites of several major driver genes (Fig. 2A). Among the 18 EGFR mutations, two classical mutations, EGFR 19del and L858R, have the highest incidence (33.3% and 38.9%, respectively), but still around 30% of patients carried lesions with rare EGFR mutations (Supplementary Figure 2A). Rare mutations mainly include EGFR exon 20 insertions (20INS), L861Q, and M277E, which, especially 20INS, are insensitive to traditional EGFR-TKIs and require the latest targeted drugs to achieve the ideal therapeutic effect33. Also, we found that the EGFR-19del mutation deletion insertion sites were different in a single patient (Supplementary Figure 2D), which is consistent with the molecular characterization of multiple primary cancers. Compared with other studies, we found that the proportion of 19del in super MPLCs is relatively higher than other MPLCs (21%) and SSNs (20%), while the proportion of L858R is similar16,20.

Fig. 2: Mutation sites and subtypes of key driver genes in super MPLCs.
figure 2

A Mutational landscape of nonsynonymous somatic mutations, separately showing mutations at different sites of BRAF, EGFR, KRAS, MAP2K1 and TP53 genes. The BRAF mutations are classified into I, II, and III three subtypes. The columns represent samples and are ordered by their corresponding patients. Pathologic subtypes are annotated in the upper panel. B Counting the mutation frequency of each BRAF mutation site and classify these BRAF mutation sites into Class I, II, and III, as well as activating and non-activating mutation. C Statistical tests on the occurrence number of three classes of BRAF mutations.

As for BRAF, all 41 BRAF mutations are in the kinase structural domains. BRAF mutations contain three classes, which are class I (V600D/E/K/R), classII (G464V, G469A/S/V, L597Q/R, K601X), and class III (G466E, D594G/N, G596R, N581S/I). Class I V600E is the predominant lung adenocarcinoma27, while we found that the most common BRAF mutation class is Class II in our PUMCH-ssesMPLC cohort (Fig. 2B). The BRAF mutation with the highest prevalence was G469A/V/S, which was detected in 10 samples, followed by K601. Both the mutations belonged to Class II, whereas Class I V600E was only detected in 5 (5.4%) samples (Fig. 2B and Supplementary Figure 2B). The prevalence of BRAF Class II mutations in our cohort is significantly higher than in Class I and III (Fig. 2C). As for KRAS, the mutation sites with the highest incidences are G12D (46%) and G12V (23%) (Fig. 2A and Supplementary Figure 2C). Also, in patients like P03 and P06, we detected multiple KRAS mutations (Fig. 2A). Among the mutation subtypes of MAP2K1, E102_I103del, a small indel located in the protein kinase ___domain, is detected in 5.4% of lesions and serves as the predominant subtype (Fig. 2A), which is similar to the finding of Xiang et al.28. The mutation frequencies of other MAP2K1 mutation subtypes (such as K57N and Q56P) are all relatively low, only detected in 1-2 patients. Xiang et al. also found that indels accounted for 92.3% of MAP2K1 mutations in AIS, 94.1% in MIA, and the proportion in IAC is relatively low. As for TP53 mutations, TP53 R151A is the most detected mutation subtype (in 3.8% of lesions), while TP53 K373R and N30kfs*13 are detected in only one patient (Fig. 2A).

Comparing mutation patterns in different pathologic subtypes

We analyzed the differences in mutations among the four pathological subtypes. Generally, we found that the numbers of unique mutated genes in the four pathological types (AAH, AIS, MIA, and IAC) are 470, 407, 85, and 1250, respectively (Fig. 3A). The average number of unique mutated genes in each lesion of the four pathological types is 8.5, 11.3, 5.7, and 52.1, respectively. Also, the lesions in invasive stages (MIA and IAC) have significantly higher tumor mutation burden (TMB, \(p=0.0088\)) and chromosomal instability (CIS, \(p=0.0003\)) than those in preinvasive stages (AAH and AIS) (Fig. 3B, C). This indicates that as the tumor progresses, mutations are constantly accumulating.

Fig. 3: Comparing the mutational landscapes among different pathologic subtypes of super MPLC lesions.
figure 3

A Venn diagram showing the unique and shared mutations of the four pathologic subtypes (AAH, AIS, MIA, and IAC). B, C Comparing the value of tumor mutation burden (TMB), and chromosomal instability (CIS) among invasive (IAC and MIA) and preinvasive (AAH and AIS) stages. C Comparing the mutation proportions of several key driver genes in different pathologic subtypes. D Comparing the mutation proportions of BRAF Class I, II, and III mutations among four pathologic subtypes.

As for the mutations of essential driver genes, we found that as the tumor progresses, the proportion of BRAF mutations continues to increase (Fig. 3C). The incidence of BRAF mutations in IAC was significantly higher than the incidence in AAH (\(p=0.014\), Fig. 3A). Focusing on the mutation sites of essential driver genes, we found that in BRAF mutations, the proportion of Class II mutations gradually increases as the invasiveness of the tumor increases (Fig. 3D). We also performed the analysis on CNV, neoantigen burden, aneuploid, and mutation signatures (Supplementary Fig. 3A–3D). We found that the invasive lesions have significantly higher aneuploidy scores than non-invasive lesions. In brief, we found that with the tumor progression, the mutations of super MPLCs gradually accumulate, and form critical and typical mutation patterns.

In addition, we performed immunohistochemistry (IHC) analysis of PDL1 and CD8. We found that 91.7% (100/109) of lesions are PDL1 negative and other lesions also have very low TPS (Supplementary Figure 3E), which is similar to previous research16,24 and also indicates that super MPLCs may not respond to anti‐PD‐L1 treatment. We also didn’t find significant associations between PDL1 levels and BRAF mutations (Supplementary Figure 3G). As for CD8, we found that CD8 lymphocyte counts in invasive lesions were significantly higher than in non-invasive lesions (Supplementary Figure 3F). This reflects the gradual attraction of immune cells during the tumor initiation process.

The clone evolution pattern of super MPLCs

To study the clonal relationships among various lesions of a single patient with super MPLC, we constructed a phylogenetic tree based on mutation profiles. We found that in the same patient, there is no significant overlap in mutation sets among lesion pairs, and the phylogenetic tree also indicated that the lesions have no or only a short common trunk. This confirms the independent origin of lesions in super MPLCs. All 18 patients had no lymph node metastasis. In the case of N0, we considered a pair of lesions that met the following criteria as possibly originating from intrapulmonary metastasis: 1. Located in the same lung lobe; 2. There are more than 4 common mutations; 3. The common mutation number is more than 20% of the total mutation number of any lesion. Among these 18 patients, each patient has over 5 independent primary lesions (Fig. 4, Supplementary Figures 46).

Fig. 4: Clonal architecture of super MPLCs.
figure 4

A, D Three-dimensional reconstruction of lung CT images before and after surgery, display the lesions (yellow blocks). B, H Heatmaps showed the presence (blue) or absence (gray) of somatic mutations. C, F, G Phylogenetic trees indicated the clonal structure of lesions in each patient. The table below the phylogenetic tree shows the common mutation number of each pair of lesions. The green mark indicates that these two lesions may originate from intrapulmonary metastasis. E The excised lesions and their corresponding locations. I The ___location diagram and CT images of lesions.

We have highlighted several cases here. In P09, we detected over 20 suspicious malignant nodules before surgery, and we removed most of them one by one during the surgery (Fig. 4A). We performed WES on the malignant nodules confirmed by pathological reports, and the results showed that BRAF mutations were detected in 6 of 11 lesions (Fig. 4B). These 6 mutations are not located at the same site, namely V600G, K601E, N581I, T599dup, G469A, and G596R. Combined with other mutations and the phylogenetic tree, it has been confirmed that these lesions are independent primary lesions (Fig. 4C).

Also, we detected more than 30 suspicious malignant nodules in the P09 patient (Fig. 4D). After excision surgery and carefully separating the excised specimens, we identified over 30 nodules (Fig. 4E). Based on the WES results, we confirmed that there are at least 10 lesions with different genetic characteristics and matching the diagnostic criteria for MPLC (Fig. 4F).

Of the 18 patients, almost every patient has different genetic characteristics for any two lesions, and there are only 3 patients (P04, P05, and P11) with one pair of lesions possibly originating from intrapulmonary metastasis. P04-T3 and P04-T4 are both located in the left upper lobe, and their histological appearance is both AAH. The two lesions share mutations in EGFR, both of which are EGFR 20ins (p.D770-N771insG). The two lesions also share other three mutations, but they have 10 and 13 unique mutations, respectively (Fig. 4G-I). This suggests that they may have some homology. However, their histological appearances are both preinvasive, which may indicate that the homology here is an accidental phenomenon or that lesions have abilities to metastasize in the early stages. Similarly, both P05-T4 and P05-T5 are in the right upper lobe, and their histological appearances are also preinvasive, which are AIS and AAH, respectively. They shared 30 mutations, but there were also 49 and 48 unique mutations, respectively (Supplementary Fig. 4D).

Discussion

In the study, we defined MPLCs with five or more lesions as super MPLCs and collected samples from 18 super MPLC patients. Considering the lower incidence rate (only 5-10% in MPLC), higher misdiagnosis rate, and lower surgical reception rate, we built these precious samples into a specialized cohort (PUMCH-ssesMPLC). The 5 lesions used for inclusion criteria have all been confirmed as independent primary malignant lesions through genetic and clinical characteristics (Fig. 4, Supplementary Figures 46). In addition, we performed careful surgery on these patients, removed and identified most of the lesions (Fig. 4A and D), avoiding possible sampling bias towards larger and more invasive lesions. We performed WES for further exploring the rare and special subtype of MPLC (Supplementary Tables 8-9). Also, it is worth noting that the actual number of malignant lesions in patients often exceeds the number of lesions with high-quality WES results. We provided a comprehensive genomic landscape of super MPLCs and compared the landscape with a total of 11 control cohorts for 3 types (Table 2). We revealed the similar or opposite characteristics of these super MPLCs compared to MPLCs with fewer lesions.

Super MPLC has some characteristics similar to MPLC with fewer lesions. (1) They have similar clinical features: prevalence in non-smoking, relatively younger women, preference on the upper lobes, and being classified as adenocarcinoma. (2) The mutations in most of their lesions were enriched in PI3k-Akt and MAPK signaling pathways, because of the high frequency of BRAF, EGFR, and KRAS mutations. (3) In both of them, BRAF mutations were mutually exclusive with EGFR mutations. (4) The mutation sites and subtypes of a single driver gene have high heterogeneity across lesions in a single patient. (5) Class II mutation, rather than Class I V600E, is the predominant mutation subtype of BRAF16.

There are also some opposite characteristics between super MPLC and MPLC with fewer lesions. (1) Super MPLCs have significantly higher mutation frequency of BRAF and significantly lower mutation frequency of EGFR than MPLCs with fewer lesions (18.9% to 56.1% in EGFR, and 31.5% to 11.0% in BRAF)16 (Table 2). Also, we found that as the mean lesion number of MPLCs increases16,24, the BRAF mutation frequency gradually increases, while the EGFR mutation frequency gradually decreases. (2) In EGFR mutations, the proportion of 19del in super MPLCs is higher than in MPLCs with fewer lesions (33.3% to 21%). Comparing different pathologic subtypes of super MPLC lesions, we found that along with the increase of lesion invasiveness, mutations are accumulating, while chromosomal instability increases and mutation patterns more incline to a classic pattern of super MPLCs. For example, BRAF mutations, especially class II BRAF mutations, have higher incidences in invasive stages than in preinvasive stages.

Most importantly, this study found that compared to early lung cancer, MPLCs with fewer lesions, and advanced lung cancer, super MPLCs have significantly higher mutation frequency of BRAF and significantly lower mutation frequency of EGFR, while mutations are also enriched in the MAPK signaling pathway. The core finding was summarized in Fig. 5A.

Fig. 5: The hypothesized mechanism of observed higher BRAF mutation rate and low EGFR mutation rate in super MPLCs.
figure 5

A Summary of the main phenomena found in this article: super MPLCs have significantly higher mutation frequency of BRAF and lower mutation frequency of EGFR than MPLCs with fewer lesions. B The hypothesized mechanism contains three parts: Firstly, through the convergent evolutionary mechanism of multiple primary cancers, mutations in each lesion are limited to the MAPK pathway containing EGFR and BRAF. Then, based on the theories of functional redundancy and synthetic lethality, the mutations of two driven genes, EGFR and BRAF, exhibit mutual exclusion. Third, because of the mechanism of concomitant tumor resistance, the lesion with EGFR mutations has higher invasiveness and will inhibit the growth of smaller lesions, and thus, lesions in the observed super MPLCs tend to keep with low invasiveness and with BRAF mutation rather than EGFR mutation.

In clinical practice, this suggests possible different treatment strategies. (1) BRAF-targeted therapy drugs such as dalafenib and velofenib both target the BRAF class I V600 mutation. There is no suitable targeted therapeutic drug for super-MPLC patients because of the low occurrence rates of EGFR and BRAF Class I mutations. (2) Thoracic surgeons should collaborate with radiologists, pathologists, radiologists, oncologists, and other physicians to form a multiple-disciplinary team (MDT) to better diagnose super MPLCs. The surgical resection effect is good after clarifying that it is super MPLC. The follow-up of our PUMCH-ssesMPLC cohort confirms this viewpoint. All patients survived, and at least 8 patients only underwent surgery without any postoperative adjuvant treatment.

Regarding the mechanism, we have developed hypotheses based on existing literature. Firstly, in this study, we found the driver mutations of lesions in super MPLCs are mainly enriched into the MAPK/ERK pathway. Ma et al. developed a “convergent evolution” theory, which states that multiple tumors in the same patient are in parallel evolution and that driver mutations in multiple tumors may be constrained on the same key signaling pathways29,34. Kasper et al. also found predictability in the earliest stages of tumorigenesis and phenotypic convergence on common dominant pathways using their powerful platform for experimental evolution35. Our results and previous studies in MPLCs support this theory16,24,36.

Next, in the MAPK/ERK pathway, we found that EGFR and BRAF mutations exhibit mutual exclusion. There are currently two common hypotheses for the mechanisms of mutual exclusion37. One hypothesis is functional redundancy, which means that a mutation in one gene is sufficient to eliminate the selection pressure of other gene mutations38,39,40. The second is synthetic lethality, which means that the co-occurrence of two mutations can lead to cell death41,42. In super MPLCs, the simultaneous mutations in EGFR and BRAF can lead to excessive activation of the MAPK/ERK pathway, leading to cell death.

Furthermore, in the BRAF-EGFR mutually exclusive pair, we found a significant increase in the mutation rate of BRAF in super MPLCs compared to MPLCs with fewer lesions. Previous research reported that EGFR-mutant GGO nodules have the strongest proliferative ability, while nodules carrying BRAF mutations are more indolent than nodules with EGFR mutations16,43. Combined with our findings, this suggests that the lesions in super MPLCs are in the early stages and with low invasiveness. After that, we will explain why these lesions have low invasiveness. Through literature research, we found that various tumor types, such as melanoma and breast cancer, widely exhibit a phenomenon named as concomitant tumor resistance (CTR), which is the primary lesion that inhibits the growth of other lesions44. One of the CTR mechanisms is concomitant immunity (CI), which may be mainly mediated by T cells. The primary tumor has an immunosuppressive environment, but the secondary tumor does not45,46. For super MPLCs, if some lesions carry EGFR mutations, these lesions may exhibit strong invasiveness, which can strengthen the immune response of the body, break the tumor immune balance, and lead to the removal of other poorly growing lesions. This makes us only able to observe MPLCs with a small number of lesions or single-lesion lung cancer.

This series of mechanism hypotheses is summarized in Fig. 5B. To verify the mechanisms, especially the high incidence of BRAF mutations, we also need a mouse model of MPLCs to verify the existence of this multi-lesion interaction and explore its detailed mechanisms.

As the limitation of our study, firstly, our results mainly contain the SNVs, indels, and structural variation of super MPLCs. For gene fusion and alternative splicing, many alternative splicing sites and gene fusion sites occur in the intron region, but WES cannot cover the intron region, which leads to inaccurate estimation in the frequencies of fusion and alternative splicing events. Thus, we only listed the detected fusion and alternative splicing events in Supplementary Table 6, without fully describing their landscapes in super MPLCs. Secondly, although we considered super MPLCs to be a rare subtype of lung cancer and have drawn some significant conclusions based on a limited sample size and one race, our findings still need to be further validated in larger samples and broader patient populations. And the negative results of this study may turn into new findings in larger-scale data.

In conclusion, based on the existing results of our cohort, the low occurrence rates of EGFR and BRAF Class I mutations may limit the application of targeted therapeutic drugs. This study emphasizes the importance of surgery in the treatment of super MPLCs. We also further emphasize the importance of MDT, which includes thoracic surgeons, radiologists, pathologists, and other physicians, as well as the significance of comprehensive utilization of WES results for treatment decision-making in super MPLC patients. In addition, exploring the mechanism behind the specific mutation landscape can help us understand the progression of multiple primary cancers.

Methods

Patients and sample details

Eighteen patients are diagnosed with synchronous multiple primary lung cancer (sMPLC) according to the Martini and Melamed classification system3 and genetic characteristics. They underwent surgical resection at Peking Union Medical College Hospital between 2018 and 2023. A total of 130 resected lung tumor specimens from 18 patients with sMPLC were collected from distinct primary tumors at the Peking Union Medical College Hospital. This study involved human participants and was approved by the ethical committee of Peking Union Medical College Hospital (K5723). The study was performed in compliance with the Declaration of Helsinki Principles. All patients have signed written informed consent forms for specimen collection and genomic profiling.

Whole-exome library preparation and sequencing

Whole-exome sequencing was performed on 130 resected lung tumor specimens as previously described47,48. Genomic DNA from formalin-fixed and paraffin-embedded (FFPE) samples was extracted using QIAamp DNA FFPE Tissue Kit (Qiagen), and fragmented by M220 Focused-ultrasonicator (Covaris) into ~250 bp. A whole-genome library was prepared using the KAPA Hyper Prep Kit (KAPAxGen™ Exome Hybridization Panel (Integrated DNA Technologies, Inc.) according to the manufacturer’s protocol. Captured libraries were amplified in KAPA HiFi HotStart ReadyMix (KAPA Biosystems) and purified using Agencourt AMPure XP beads. Enriched libraries were sequenced using the Illumina HiSeq 4000 platform as paired 125 bp reads according to the manufacturer’s instructions. The mean sequencing depth is 541.2 (194.7-1281.2).

Variant filtering and mutation calling

The sequence data was processed as previously described47. For quality control procedures, Trimmomatic was used to remove low-quality samples (quality reading below 20) or N bases from the FASTQ files. Sequencing reads were mapped to the reference human genome (hg19) using the Sentieon. Duplicate reads were removed using Sentieon, followed by realignment around known indels. Base quality recalibration was performed using GATK4 and samples with Total QScores smaller than 35 or contamination rates greater than 0.02 were excluded. Fingerprint was used to remove non-matching samples from matched tumor-normal pairs. GATK4 was used to estimate cross-sample contamination. Somatic single nucleotide variant (SNV) calling and insertion/deletions (indels) calling were performed using Vardict49. SNVs in the 1000 Genomes Project and dbSNP with frequency >1% were excluded. The SNVs and indels were further filtered as previously described47.

Copy number analysis

Copy number analysis was performed using FACETS (Ver 0.5.13). Somatic CNA events were assigned based on sample-ploidy values calculated in the FACETS algorithm50. For arm-level CNV events, 39 autosomal chromosome arms were assessed. Specifically, arm-level CNA gain (>sample average ploidy +1) was defined if segments of amplification and deep amplification events account for more than 60% of total segments for the corresponding chromosome arm. Similarly, arm-level CNA loss (<sample average ploidy -1) was identified if segments of deletion and deep deletion events account for more than 60% of total segments for the given chromosome. For gene-level CNV events, only deep amplifications and deep deletion segments were considered.

Reconstruction of clonal and sub-clonal architecture and tumor evolution

Tumor purity was estimated using ABSOLUTE51. Pyclone was used to estimate major clones and subclones and reconstruct the phylogenetic tree. Cellular prevalence (CP) was estimated based on allele frequency and copy numbers to adjust for tumor purities and was used for mutation clustering. For each dissected or single tumor region, mutations with ccf >0.6 were considered as clonal events and otherwise sub-clonal. SCHISM52 was then used to reconstruct clonal and sub-clonal hierarchies.

Pathway analysis

Pathway-centric analyses of somatic mutations from different developmental stages were performed using the software RStudio 4.053. To examine the biological functions of the most frequently mutated genes in each stage, we performed functional enrichment analyses. REACTOME54 is an integrated database containing advanced functional information for systematically analyzing gene functions, biological pathways, and other research. The most frequently mutated 30 genes in each stage were used as input to the REACTOME pathway enrichment analysis R package ReactomePA55 (version 1.38). Pathways with Benjamini-Hochberg (BH) procedure adjusted p-value smaller than 0.05 were highlighted. Only the top 20 pathways enriched in all four stages samples were displayed for visualization.

Co-occurrence and mutual exclusivity

We used the DISCOVER package56 to calculate significant mutual exclusivity and co-occurrence. We plotted the odds ratio of the pairwise mutations using the “corrplot” R package, referring to strategies of Jeffrey et al. and only including the mutations occurring in more than 10 samples57. The odds ratios are calculated using Fisher’s exact test.

Three-dimensional reconstruction of the lungs

We imported the DICOM format data of chest CT from the Department of Radiology into Mimics 25.0 software. Mimics 25.0 is a tool that can directly read raw continuous CT images in DICOM format, and achieve 3D visualization, and segmentation processing of CT scan images. Mimics uses a semi-automatic lung reconstruction module with a built-in fast centerline calculation function, and automatically marks the anatomical structure of the lung and trachea using the internationally recognized lung and trachea labeling method. By adjusting the thresholds of CT values, different pulmonary nodules (solid, ground glass, mixed, calcified, etc.) are marked in pixel blocks in DICOM images. Mimics measured and analyzed all nodules, removed calcified and undersized nodules, and then converted the labeled pixel blocks of the nodules into a 3D model and optimized the boundaries. Similarly, Mimics marked the pixels of blood vessels, reconstructed the pulmonary artery and pulmonary vein, and then segmented the lung segments and completed reconstruction based on the analysis of variations in pulmonary vascular structure.

Neoantigen burden analysis

Neoantigens are a class of tumor-specific antigens caused by nonsynonymous mutations in tumor cells58,59. The neoantigen burden analysis is based on the website (https://github.com/MathOnco/NeoPredPipe)60 using ANNOVAR and netMHCpan61 and the mutation types include SNVs and Indels concurrently.

Aneuploidy score analysis

Aneuploidy score (AS) was defined as the total number of chromosome arms that were gained or lost, based on the published ABSOLUTE copy number data of the CCLE dataset62. The median total modal copy number (sum of allelic copy numbers) across segments was estimated for each chromosome arm (weighted for segment length), and compared to the cell line’s background ploidy to call the chromosome-arm copy number status (gain, loss, or neutral). Aneuploidy score (AS) was defined as the total number of chromosome arms that were gained or lost63.

Mutational signatures analysis

All the 30 mutational signatures in COSMIC are publicly and freely available. The profile of each signature is displayed using the six substitution subtypes: C > A, C > G, C > T, T > A, T > C, and T > G and generating 96 possible mutation types. Mutational signatures are displayed and reported based on the observed trinucleotide frequency of the human genome, i.e., representing the relative proportions of mutations generated by each signature based on the actual trinucleotide frequencies of the reference human genome version GRCh37 (https://cancer.sanger.ac.uk/signatures/signatures_v2/).

PDL1 and CD8 Immunohistochemistry (IHC) Staining Assay

We obtained the FFPE specimen sections from the Department of Pathology to analyze IHC. Immunohistochemical staining was performed according to standard procedures: (1) Paraffin section dewaxing (2) Antigen repair: microwave heated for 8 minutes in EDTA (pH 9.0), then stopped for 8 minutes, and turned to medium-low heat for 7 minutes. (3) Block endogenous peroxidase: 3% hydrogen peroxide solution incubating for 25 minutes. (4) Serum blocking: 3% BSA block at room temperature for 30 minutes. (5) Add the primary antibody, and incubate overnight at 4 °C in a wet box. The primary rabbit antibody against PDL1 (1:9000 dilution, GB11339A, Servicebio, wuhan, china) and primary mouse antibody against CD8 (1:2000 dilution, GB12068, Servicebio, wuhan, china) were used. (6) Add secondary antibody: secondary antibody incubating at room temperature for 50 minutes. The goat anti-rabbit (for PDL1, 1:200 dilution, GB23303, Servicebio, Wuhan, China) and goat anti-mouse (for CD8, 1:200 dilution, GB23301, Servicebio, Wuhan, China) HRP-labeled lgG was conducted as the secondary antibody. (7) DAB color rendering. (8) Hematoxylin staining cell nucleus. (9) Dehydration and sealing. (10) Analysis: Aipathwell software provided by Servicebio for analysis and interpretation. We defined PDL1 positivity as a ≥1% membranous tumor proportion score (TPS).

Statistical analysis

The unpaired Student’s t-test was used to compare mutational burdens and chromosome instability between tumors. Fisher’s exact test was used to compare proportions in the article. Tumor mutational burden (TMB) was defined as the total number of coding mutations in each sampled lung nodule.