Introduction

Thyroid cancer is one of the most common cancers with an estimated 586,202 new cases reported worldwide in 20201. Follicular thyroid cancer (FTC), medullary thyroid cancer (MTC), papillary thyroid cancer (PTC), poorly differentiated thyroid cancer (PDTC), and anaplastic thyroid cancer (ATC) are the major histological subtypes of thyroid cancer. PTC accounts for ~80% of all thyroid cancers and >90% of patients have a 10-year survival. In sharp contrast, though ATC and PDTC make up 5%–10% of thyroid cancers, they are extremely aggressive subtypes with a median survival of only 0.5 and 3.2 years, respectively. The tumor-node-metastasis (TNM) classification system generally classifies patients with ATC as stage IV (A, B, or C) due to their universally poor prognosis2, and most of them have unresectable or metastatic disease. ATC is featured with extensive local invasion and 15%–50% of patients show distant metastasis at diagnosis. Unfortunately, there are no effective treatments for prolonging the survival of patients with ATC.

Fundamental findings have delineated the molecular basis of ATC and PDTC aggressiveness. BRAF and RAS were demonstrated to be the predominant drivers of PDTC, while higher mutation frequencies of TP53, TERT promoter, and PI3K/AKT signaling were genomic hallmarks of ATC3. Compared with PDTC or other thyroid cancer subtypes, the histology of ATC is heterogeneous and often displays a mixture of giant cells, spindle cells, and epithelioid pathology rather than a pure pattern. Based on targeted next-generation sequencing, three genetically distinct subtypes of ATCs have been identified, and these subtypes have been proposed to evolve from PTC, FTC, and HCTC, mainly driven by BRAFV600E, NRAS, and PTEN/NF1/RB1 mutations, respectively4. Although the genomic and transcriptomic profiles of ATC and PDTC have been thoroughly characterized, their protein and post-modification landscapes are poorly understood and are pivotal for cancer biology and drug discovery.

ATC is an undifferentiated malignancy that loses thyroid-differentiation markers and gains cancer stemness, leading to a poor response to conventional therapies, including chemotherapy, radiation, and radioactive iodine ablation5. Targeted therapeutic options are preferred for patients with ATC when systemic therapy is indicated. Dabrafenib/trametinib was approved by FDA in 2018 for ATC with BRAFV600E mutation6. Although the objective response rate (ORR) was 69% in a phase II trial, only 20%–50% of ATCs harbored BRAFV600E mutations, which greatly limits their application3,7. Immunotherapy that blocks the PD-1/PD-L1 immune checkpoint is beneficial for patients with solid tumors. In patients with ATC, the ORR was 19% after treatment with anti-PD-1 antibody in a phase II study8. Although molecular testing is performed to help inform clinical decisions, the lack of sufficient druggable targets and heterogeneity among ATCs have raised the urgent need for precision therapy.

In this work, we construct a multicenter thyroid cancer cohort consisting of 467 formalin-fixed paraffin-embedded (FFPE) samples, for which whole-exome sequencing (WES), proteome, and phosphorylated proteome analyses are conducted. The molecular signatures and activated kinases of different thyroid cancer subtypes are systematically delineated at the genome, protein, and post-modification level. Proteomic clustering identifies three ATC/PDTC molecular subtypes with distinct prognoses and immune landscapes. The therapeutic vulnerability of ATC is identified in ATC/PDTC subtypes. The use of multi-omics potentiates the discovery of druggable targets and personalized therapies for virulent thyroid cancers.

Results

Overview of proteogenomic characterization in thyroid cancer cohort

To identify the molecular signature and therapeutic vulnerability of aggressive thyroid cancer, we performed a deep-scale molecular analysis spanning three different data types (genomic, proteomic, and phosphoproteomic) of thyroid tumors and non-cancerous adjacent tissues (NATs) from 345 patients (Fig. 1A, B). 467 FFPE samples were subjected to TMT labeling-based proteomic analysis. Thyroid tumors consisted of 124 ATC, 47 PDTC, and 177 PTC samples (including three biological replicates). The 119 NATs were derived from tumor-adjacent tissues of 20 patients with ATC and 99 patients with PTC. To focus on a homogeneous histological cohort, 11 ATCs and one PDTC that co-occurred with PTC were excluded from subsequent analysis. Among the 159 patients (113 ATC and 46 PDTC samples), follow-up survival information was available for 101 (75 ATC and 26 PDTC samples). Phosphoproteomic analysis was performed on 63 ATCs, 25 PDTCs, 81 PTCs, and 48 NAT samples (tumor-adjacent tissues from 20 patients with ATC and 28 patients with PTC). WES was conducted on 51 ATCs, 28 PDTC, and 19 NAT samples (Fig. 1B). Clinicopathological characteristics are summarized in Table 1 and Supplementary Data 1.

Fig. 1: Multi-omics landscape of thyroid cancer.
figure 1

A Workflow and sample types for molecular profiling of thyroid cancer. In this study, three types of thyroid cancer tissue samples (ATC, PTC, and PDTC) and two corresponding adjacent non-cancerous tissue samples (ATC NAT and PTC NAT) were collected. Whole-exome sequencing (WES) and TMT-labeled LC-MS/MS were respectively used for gene mutation and protein quantification profiling. 467 FFPE samples (124 ATC, 47 PDTC, 177 PTC (including 3 biological replicates), and 119 NAT samples) were detected in proteomic profiling. 217 FFPE samples (63 ATC, 25 PDTC, 81 PTC, and 48 NAT samples) were collected for phosphoproteomic profiling, and 97 FFPE samples for WES profiling (51 ATC, 28 PDTC, and 18 NAT samples). B An overview of detected samples in three molecular profiles (WES, proteomics, and phosphoproteomics). C The mutational landscape and associated clinical information of 51 ATC and 28 PDTC patients. Top 20 significantly mutant genes based on MutSig analysis (q value < 0.05). D Relationship between pathological features and gene mutations in 51 ATC and 28 PDTC patients. In boxplot, two-sided T-test is used, the central line represents median, the bounds of boxes represent the first and third quantiles and the upper and lower whiskers extend to the highest or the smallest values with 1.5 interquartile range. In contingency tables, the value represents the number of samples. Statistical significance was examined by Fisher’s exact test. E Kaplan–Meier curves showing the relationship between gene mutation and patient survival (n = 51 samples). Source data are provided as a Source Data file.

Table 1 The clinicopathological features of 345 patients with thyroid cancer

A total of 9491 proteins were detected, with an average protein number of 7307 per sample (Supplementary Fig. S1A). The relative quantitative density of proteins exhibited a normal distribution, and all samples passed the quantitative quality control (Supplementary Fig. S1B). Scatter plots based on UMAP showed that samples with the same pathological subtype were closer to each other in both PC1 and PC2 (Supplementary Fig. S1C). In phosphorylation measurements, 47,376 phosphosites were identified using a 1% global false discovery rate (FDR), which was based on the target-decoy strategy. The quantitative distribution and UMAP results of the phosphorylation profiling were consistent with those of the proteome (Supplementary Fig. S1D–F). However, compared with other sample types, more phosphorylation sites were detected in ATC and PDTC (Supplementary Fig. S1D).

Proteogenomic landscape of ATC and PDTC

Genomic profiling by WES identified high-frequency mutated genes in ATC and PDTC, such as TP53 (48%), TERT promoter mutation (36.5%), BRAF (23%), and NRAS (18%) (Fig. 1C). Consistent with most genome studies4,9,10, TP53 (58.8%) and TERT promoter mutations (46.3%) were two most frequently somatic mutated gene in ATC (Supplementary Fig. S1H, I), which were higher than those in PDTC (TP53: 28.6%, TERT promoter: 18.2%). DICER1, an oncogene that causes the hereditary cancer predisposition syndrome11, was mutated higher in PDTC (25%) than ATC (7.8%) (Fig. 1D). In addition, the mutation status of the top 20 tumor-associated genes was associated with clinicopathological characteristics. Older patients were more likely to acquire BRAF mutations (P = 0.0011), and the proportion of NRAS mutation in males (3.2%) was significantly lower than that in females (28.6%) (Fig. 1D). Mutations in TP53 (P = 0.0065), BRAF (P = 0.00027), TERT promoter (P < 0.0001), and LRP1B (P = 0.0027) were significantly correlated with poorer overall survival in patients with ATC and PDTC (Fig. 1E).

We performed whole-exome-based somatic copy number alteration (CNA) analyses and examined the regulatory effects of CNAs on protein expression. In PDTC, amplification of chromosome 17q was the only frequent CNA, with an occurrence of 39% (Supplementary Fig. S2A). Among ATC patients, the 20q amplification was the most frequent CNA with an occurrence of 43% (Supplementary Fig. S2B). Among all the CNAs identified, the amplification of 5q13.2 was the most significant CNA event with the lowest hazard ratio (Supplementary Fig. S2C, D). Three genes (OCLN, MCCC2, and MRPS36) within this region showed significantly positive cis-effects (Supplementary Fig. S2E). The protein abundance of OCLN, a mesenchymal marker, was significantly different between the ATC and PDTC groups (P = 4.40e-10), and patients with high OCLN expression had a significantly better survival probability than those with low OCLN expression (P = 0.026) (Supplementary Fig. S2F, G). Results of single sample Gene Set Enrichment Analysis (ssGSEA) suggested that nine pathways were positively associated with OCLN abundance, and 18 pathways were negatively associated with OCLN expression, respectively (Supplementary Fig. S2H). The abundance of OCLN was negatively correlated with the activation of the epithelial-mesenchymal transition (EMT) pathway (Supplementary Fig. S2I). In addition, 12 genes participating in the EMT pathway demonstrated significant positive associations with hazard ratio (Supplementary Fig. S2J).

In principle, we propose that the gain of chromosome 5q13.2, resulting in the accumulation of OCLN DNA and protein abundance, thereby restraining the activity of EMT pathways, which might explain the better prognosis of patients with PDTC than those with ATC (Supplementary Fig. S2K).

Proteomic and phosphoproteomic characteristics in ATC

Proteomic and phosphoproteomic analyses identified a total 9491 protein in our thyroid cancer cohort, among which 3342 proteins were phosphorylated. A total of 893 proteins were differentially expressed between ATC and NATs (Wilcoxon rank-sum test, BH P < 0.05, T/N > 1.5, or < 0.67), including 363 upregulated and 530 downregulated proteins (Fig. 2A). The upregulated proteins in ATC were associated with cell cycle, ribosome biogenesis, and innate immune responses, whereas ATC-downregulated proteins were linked to tight junction, thyroid hormone generation, and epithelial cell differentiation (Fig. 2B). Similar enriched pathways were found when the threshold of protein fold change was set to 1.2 (Supplementary Fig. S3A, B). Notably, pathways related to immune response, including neutrophil extracellular trap (NET) formation, neutrophil chemotaxis, and negative regulation of T cell proliferation, were frequently enriched in ATC-upregulated proteins (Supplementary Fig. S3C). In parallel, the same comparative analysis was carried out between the 177 PTC samples and 99 NATs, and the results are shown in Supplementary Fig. S3D, E. Further analysis revealed that ATCs and PTCs shared certain commonalities, as both were enriched in pathways related to cell proliferation and response to hypoxia (Fig. 2C). Nevertheless, they also exhibited distinct patterns, as the upregulated proteins in ATCs were specifically enriched in cell migration, intracellular protein transport, and small GTPase signaling, whereas those in PTCs were mainly involved in PPAR signaling and nucleotide metabolism (Fig. 2C). Interestingly, we discovered that high expression of ARF4 and ARHGAP9, two ATC-specific upregulated proteins associated with cell migration and intracellular protein transport, respectively, were both significantly correlated with poor survival (P = 0.0025 and P = 0.019, respectively).

Fig. 2: Proteomics and phosphoproteomics analyses in tumor tissues compared with NATs.
figure 2

A Volcano plot showing the differential protein abundance between 113 ATC and 20 NAT samples (FDR < 0.05, T/N > 1.5 or <0.67). Statistical significance was examined by two-sided Wilcoxon rank-sum test. P values were adjusted by Benjamini–Hochberg method. B Heatmap showing the abundance of significantly different proteins between 113 ATC and 20 NAT samples. Pathways significantly enriched with upregulated and downregulated proteins are labeled on the right side of the heatmap. C Scatter plot showing the common and specific protein changes in ATC (n = 113) and PTC (n = 177) compared to NAT (n = 119) samples. Statistical significance was examined by two-sided Wilcoxon rank-sum test. D Kaplan–Meier curves (Log-rank test) showing the relationship between the protein abundance of ARF4 and ARHGAP9 and patient overall survival (OS) (n = 101 samples). E Scatter plot showing the relationship between protein abundance changes and phosphorylation abundance changes in ATC compared to NAT samples. F Bubble plot by Hypergeometric test showing significantly enriched pathways for proteins with fold changes greater than 1.5 in both phosphorylation and protein levels. G Bar plot showing significantly changed kinase activity in ATC compared to NAT samples. Kinase activity is estimated depending on the substrate phosphorylation abundance and calculated using the KSEA method. H The changes of potential substrate phosphorylation abundance in patients with BRAFV600E or non-BRAFV600E mutation. I Network diagram showing the relationship between potential substrates influenced by BRAFV600E mutation and the pathways they participate in. J Kaplan–Meier curves showing the OS of patient with ATC based on BRAFV600E mutation and phosphorylation abundance of ERBB2, respectively. K Multivariate analyses (Cox proportional hazards models) of gene mutation and phosphorylated ERBB2 in patients with ATC. Whiskers represent 95% confidence interval. In results of (EG), 113 ATCs and 20 NATs in proteomics and 63 ATCs and 20 NATs in phosphoproteomics were analyzed. In results of (HK), n = 51 samples. *P < 0.05, **P < 0.01. Source data are provided as a Source Data file.

Integration of phosphoproteomic and quantitative proteomic data revealed that proteins with upregulated phosphorylation levels were primarily implicated in cell cycle processes (Fig. 2E, F). Moreover, kinase-substrate enrichment analysis (KSEA) identified altered kinase activity in thyroid cancer, with kinases including CSNK2A1, CSNK2A2, and AURKA significantly activated in ATC (Fig. 2G), whereas SRC and EPHA4 were significantly activated in PTC (Supplementary Fig. S3F), thereby shedding light on potential therapeutic targets.

BRAFV600E mutation is fundamental for driving thyroid cancer progression by constitutively activating BRAF kinase and downstream signaling pathways. Therefore, we explored the downstream phosphorylated substrates that were potentially regulated by the BRAFV600E mutation. In this study, we found that 11 of the 51 patients with ATC harbored BRAFV600E mutation (Fig. 2H). Based on differential analysis, we identified 18 phosphorylation sites that were potentially affected by the BRAFV600E mutation (Fig. 2H). These regulated substrates are closely associated with oncogenic signaling pathways, among which ERBB2 and EGFR are important representatives (Fig. 2I). The phosphorylation abundance of ERBB2 at Serine1054 (Phospho-ERBB2 (Ser1054)) was significantly downregulated in BRAFV600E mutation samples. BRAFV600E mutation was correlated with poorer survival (P = 0.00043), while a high Phospho-ERBB2 (Ser1054) abundance was associated with better outcomes (P = 0.0053) (Fig. 2J). Multivariate analyses revealed that mutations in TP53, LRP1B, and TERT promoter, were predictors of overall survival (OS) in patients with ATC and PDTC, although Phospho-ERBB2 (Ser1054) and BRAFV600E mutations were not identified as independent prognostic factors in the current analysis (Fig. 2K).

Molecular signatures of thyroid cancer progression

Differential expression and principal component analysis (PCA) indicated different proteomic landscapes among the different thyroid cancer subtypes (Fig. 3A and Supplementary Fig. S4A). Pathway enrichment analysis of subgroup-specific proteins revealed that ECM-receptor interaction, thyroid hormone synthesis, and cellular oxidant detoxification were enriched in NAT, whereas extracellular matrix disassembly, ribosomal large subunit biogenesis, DNA replication, and cell cycle were involved in ATC (Fig. 3A). The dedifferentiation and anaplastic transformation of thyroid cancer cells confer an extremely aggressive phenotype. We then explored the molecular signature and active kinases that potentially drive thyroid cancer progression. A total of 1586 proteins sequentially increased or decreased during the progression from NAT, PTC, to PDTC or ATC (Fig. 3B). Similarly, we found that consecutively upregulated proteins were mainly enriched in cell cycle, DNA repair, and ribosome biogenesis, whereas downregulated proteins were mostly involved in ECM-receptor interaction and thyroid hormone synthesis (Fig. 3C, D and Supplementary Fig. S4B).

Fig. 3: Molecular signature of different thyroid cancer subtypes.
figure 3

A Heatmap showing the abundance changes of proteins significantly associated with thyroid cancer progression. Pathways significantly enriched with proteins specifically highly expressed at different stages of the disease are labeled on the right side of the heatmap. B Boxplot showing proteins gradually increasing or decreasing in abundance as thyroid cancer progresses. The central line represents median, the bounds of boxes represent the first and third quantiles and the upper and lower whiskers extend to the highest or the smallest values with 1.5 interquartile range. N indicates number of increasing or decreasing proteins. C Displaying protein abundance changes in pathways significantly enriched with downregulated proteins as cancer progresses from normal thyroid towards PTC (differentiated phenotype), PDTC (poorly differentiated phenotype), and ATC (dedifferentiated phenotype). D Displaying protein abundance changes in pathways significantly enriched with upregulated proteins as thyroid cancer progresses. E Kaplan–Meier curves (Log-rank test) showing the relationship between the estimated activity changes of the ribosome biogenesis pathway based on ssGSEA and patient overall survival (n = 101 samples). F Cell viability after knockdown of RRP9 in 8505C (n = 3) and CAL62 (n = 3), respectively. The experiments were repeated for three times. G Colony formation assay after silence of RRP9 in 8505C and CAL62, respectively. Data are a representative of three separate experiments. HJ The tumor growth curve, tumor size, and tumor weight of 8505C-derived ATC xenograft in nude mice (n = 8). In results of (AD), 113 ATCs, 177 PTCs, 46 PDTCs, and 119 NATs were analyzed. Results (F, H, J) show the mean of biological replicates. Error bars indicate SD for in vitro experiments and SEM for in vivo experiments. Statistical significance was determined by two-sided T-test. Source data are provided as a Source Data file.

The strengthened activity of ribosome biogenesis attracted our interest because of its oncogenic role in supporting tumorigenesis and tumor progression. ssGSEA confirmed the significantly increased ribosome biogenesis activity in ATC compared to that in other thyroid cancer subtypes (Supplementary Fig. S4C). Moreover, patients with high ribosome biogenesis activity showed significantly shorter overall survival (P = 0.015, 95% confidence interval [CI]:1.08–3.18) (Fig. 3E). Therefore, we suggest that the ribosome signaling pathway is extremely active in ATC tissues, which is consistent with the hallmarks of rapid growth and unrestrained proliferation. Further studies identified 26 proteins involved in ribosome biogenesis as risk factors significantly associated with poor prognosis. Based on the DepMap database established using CRISPR and siRNA screening data, six genes (RRP9, RIOK1, PA2G4, BRIX1, NAT10, and DDX21) were found to be essential in both CRISPR knockout and RNAi datasets, among which RRP9 demonstrated the highest fold change in our study (Supplementary Fig. S4D). High RRP9 expression correlated with poor prognosis (P = 0.014) (Supplementary Fig. S4E). Notably, RRP9 knockdown significantly inhibited cell viability and colony formation in different ATC cell lines (Fig. 3F, G). Strikingly, the loss of RRP9 remarkably reduced the growth and tumor weight of 8505C-derived xenografts in nude mice compared to those in the control (Fig. 3H–J), indicating that RRP9 potentially serves as a therapeutic vulnerability for ATC.

To screen active kinases as alternative therapeutic targets in ATC, we analyzed the differential abundance of phosphorylation sites among the four types of tissues. Sixteen phospho-substrates corresponding to seven kinases (CDK1, CDK2, CDK4, NEK2, PLK1, AURKA, and AURKB) were stratified, for which inhibitors from DGIdb database are shown in Supplementary Fig. S4F. Patients with higher CDK1 expression had poorer prognostic outcomes (P = 0.037), and those with high levels of AURKB showed significantly shorter OS than those patients with low levels of AURKB (P = 0.037).

Proteomic clustering identifies three ATC/PDTC molecular subtypes

Consensus clustering-based unsupervised clustering of proteomic datasets from 113 ATC and 46 PDTC tumors demonstrated three resulting subtypes, termed Pro-I, Pro-II, and Pro-III (Fig. 4A and Supplementary Fig. S5A, B). Significant associations between the three subtypes and the cohort metadata are presented in Supplementary Data 5. These three proteomic subtypes exhibited significantly distinct prognoses, among which patients with Pro-III subtype had the worst survival rates (Fig. 4B). The molecular profiles, including the genomic landscape, protein signature, and active kinases, were distinctive among the three subtypes and provided subtype-specific targets that are valuable for ATC/PDTC.

Fig. 4: Proteomic Clustering and molecular characteristics in ATC/PDTC tumors.
figure 4

A Three subtypes (Pro-I, Pro-II, and Pro-II) of ATC and PDTC patients were divided based on protein expression profiles. The bottom heatmap shows the abundance of featured proteins in each subtype. The top heatmap shows the pathological characteristics of patients in three subtypes. B Kaplan–Meier curves (Log-rank test) showing significant differences in patient overall survival among different protein subtypes (Pro-I: 38 samples; Pro-II: 18 samples; Pro-III: 45 samples). C Bubble plot by Hypergeometric test showing pathways significantly enriched with differentially expressed proteins of three subtypes. D Heatmap showed the abundance of characteristic proteins under key regulatory pathways in three protein subtypes. E Radar plot showed differences in kinase family activity among three protein subtypes of patients. F Bar plot showed the activity of key regulatory kinases in three protein subtypes. G In the three protein subtypes, downstream regulatory substrates of significantly activated kinases involved in different signaling pathways. Source data are provided as a Source Data file.

Mutation of NEB, a recently defined pathogenic variant accounting for nemaline myopathy12, frequently occurred in Pro-I subtype (45.5%) compared to Pro-II (8.3%) and Pro-III (17.6%) (Supplementary Fig. S5C). The upregulated proteins in Pro-I tumors were mainly associated with insulin secretion, glucagon signaling pathway and transcriptional misregulation in cancer (Fig. 4C). IGF1R, encoding insulin-like growth factor 1, was specifically upregulated in Pro-I tumors and involved in transcriptional misregulation in cancer (Fig. 4D). Comparison of the phosphoproteome among the three proteomic subtypes revealed subtype-specific activated kinases by KSEA, including the CMGC and AGC kinase family for Pro-I, the CAMK kinase family for Pro-II, and CMGC and the TK kinase family for Pro-III (Fig. 4E). MAPK3 (ERK1) and MAPK1 (ERK2), implicated in oncogenic processes, such as cell motility, proliferation, and survival, were most active kinases in Pro-I (Fig. 4F). Interestingly, the kinase activity of the IGF1R was also increased in Pro-I tumors. In addition, GSK3A and AKT1 were the main kinases in Pro-I that phosphorylated substrates from the glucagon and insulin signaling pathways (Fig. 4G), indicating aberrant activation of the glucagon and insulin signaling pathways in the Pro-I subtype.

The molecular signatures of the Pro-II subtype are associated with cell division, chromatin remodeling, and the regulation of DNA repair. Cyclin family members (CCNK and CCNA2) and NIMA-related serine/threonine kinase family member (NEK2) were specifically upregulated in Pro-II samples. The Pro-II subtype was characterized by the overactivation of ATM and ATR, which are tightly associated with DNA repair signaling via the phosphorylation of their substrates including BRCA1 and RAD50 (Fig. 4F, G). ATM is a part of the BRCA1-associated genome surveillance complex (BASC) and is involved in the recognition and repair of aberrant DNA structures13. Hence, active cell cycle and DNA repair signaling may contribute to the aggressiveness of the Pro-II subtype.

Both Pro-III (64.7%) and Pro-II (66.7%) were characterized with higher mutant frequency of TP53 than Pro-I subtype (24.2%). It is noteworthy that the mutations of BRAF and BRAFV600E in Pro-III were higher than those in other subtypes (Supplementary Fig. S5C). Pro-III featured an enriched tumor microenvironment (TME), such as inflammatory response, extracellular matrix disassembly, and complement coagulation cascades (Fig. 4C). The immune checkpoints CD274 and IDO1 were specifically increased in the Pro-III subtype. In addition, the C5a receptor (C5AR1) was upregulated in Pro-III. Recent research has shown that the inhibition of C5AR1 improves radiotherapy even in tumors displaying immunosuppressive features and poor CD8+ T cell infiltration. Analysis of the kinase landscape in Pro-III identified microtubule affinity-regulating kinase family members, including MARK1, MARK4, and MARK3, were frequently activated in Pro-III (Fig. 4F, G). These kinases phosphorylate microtubule-associated proteins and thereby regulate microtubule dynamics and cell motility. Interestingly, PCC0208017, a small-molecule inhibitor of MARK3/MARK4 with blood-brain barrier permeability, has shown potent antitumor activity in glioma14.

Given that ribosome biogenesis is active in aggressive thyroid cancers, we evaluated its specificity for the three proteomic subtypes. Both ribosome biogenesis activity and RRP9 expression showed no significant differences among the three subtypes, although their scores were significantly higher than those of the NAT (Supplementary Fig. S5D, E), indicating that ribosome biogenesis is a common hallmark of aggressive thyroid cancer.

Immune landscape and regulation in ATC/PDTC

Consensus clustering based on xCell signatures identified three immune subtypes (Cluster1, Cluster2 and Cluster3) in the 159 ATC and PDTC cohorts (Fig. 5A and Supplementary Fig. S6A, B). Our analysis revealed three distinct immune subtypes with diverse clinical and prognostic characteristics. Cluster 1, comprising both ATC and PDTC patients, exhibited the lowest immune score and the most favorable prognosis (Fig. 5B, C, P = 0.0044). This subtype was enriched in pathways associated with oxidative phosphorylation and DNA replication. In contrast, Clusters 2 and 3, predominantly consisting of ATC patients, displayed higher immune scores and were associated with a poorer prognosis (Fig. 5A–E). Notably, in contrast to Cluster 2, Cluster 3 demonstrated a reduced prevalence of metastatic samples and elevated immune cell infiltration, particularly of CD8+ T cells, macrophages, and neutrophils (Fig. 5A, P < 0.05). Integrative analysis showed that the Cluster 1 subtype mainly comprised Pro-I samples, while most Pro-III and Pro-II samples belonged to Cluster 3. Interestingly, we noticed that the Cluster 2 subtype primarily constituted of Pro-III samples, suggesting potential heterogeneity of the TME in Pro-III subtype.

Fig. 5: Immune subtyping and regulatory factors in ATC/PDTC.
figure 5

A The immune subtypes of ATC and PDTC patients were divided into three based on immune cell scores calculated using xCell. The bottom heatmap shows the abundance of immune cell markers and the activity of pathways (estimated by the GSEA method based on protein abundance) in patients with three immune subtypes. B Heatmap of immune score and stromal score in three immune subtypes. C Kaplan–Meier curves (Log-rank test) showing differences in patient overall survival among three immune subtypes (Cluster1: n = 46 samples, Cluster2: n = 9 samples, Cluster3: n = 46 samples). D Sanky plot showing the composition of ATC and PDTC samples in immune subtypes and proteomic subtype. E Boxplot of the immune score in three immune subtypes (Cluster1: n = 74 samples, Cluster2: n = 11 samples, Cluster3: n = 74 samples). F The survival risk scores (Cox proportional hazards models) of 343 transcription factors (TFs) collected from the ENCODE Project Consortium database based on protein abundance. G Bubble plot of transcription factors in terms of patient survival risk scores and differentially changed in Cluster 3 compared to Cluster 1. Two-sided Kruskal–Wallis test is used to estimate statistical significance. H Boxplot showed the differences in protein abundance of transcription factor STAT1 in three immune subtypes (Cluster1: n = 74 samples, Cluster2: n = 11 samples, Cluster3: n = 74 samples). In boxplot (E, H), two-sided Kruskal–Wallis test is used, the central line represents median, the bounds of boxes represent the first and third quantiles and the upper and lower whiskers extend to the highest or the smallest values with 1.5 interquartile range. Source data are provided as a Source Data file.

To identify the key transcription factors that account for this immunophenotype, an integrative screening of transcription factors was conducted. The transcription factor analysis revealed a negative correlation of CEBPB, CSDE1, STAT1, ZBTB5, among others, with prognosis outcomes (Fig. 5F). Further transcription factor differential analysis between Clusters 1 and 3 identified a significant upregulation of transcription factors, including STAT1, RUNX2, and CEBPB, in Cluster3 (Fig. 5G). A specific comparison of STAT1 expression across the three clusters revealed that the expression level of STAT1 was the highest in Cluster 3. Conversely, in Cluster 1, the expression level of STAT1 was the lowest (Fig. 5H, P < 0.05). Consistently, JAK-STAT signaling was activated in the Cluster 3 subtype (Fig. 5A), indicating the essential role of STAT1 in shaping the TME of Cluster 3.

Dysfunction of neutrophil is associated with poor prognosis of ATC

Given that ATC is characterized by immune signaling (Supplementary Fig. S3C) and high immune score was found in the proteomic and immune subtypes with poorer survival (Figs. 4 and 5), we sought to explore potential regulators accounting for the immune microenvironment. CNA analyses revealed the significant amplifications were on chromosomes 1p, 1q, 5p, 7p, 7q, 17q, 20q and 21p, and deletions were on 13q, 16q, 17p and 22q in 79 ATC and PDTC patients, among which the 1q amplification was the most frequent CNA with an occurrence of 37% (Fig. 6A). Positive correlations of CNA-protein pairs along the diagonal lines in Fig. 6B were more prominent on chromosome 1q, suggesting strong cis-effects of CNAs on protein abundance for 1q. Compared with patients in PDTC and ATC groups, amplification of 1q (41%) primarily occurred in ATC (Supplementary Fig. S2A, B).

Fig. 6: Infiltration of neutrophil resulted in poor prognosis in ATC.
figure 6

A Bar plot showing significantly altered (q value < 0.1) chromosomal arms based on copy number alterations (CNA) in ATC and PDTC samples. B Heatmap showing the correlation between copy number alterations and protein abundance in ATC and PDTC samples. C Scatter plot showing the Spearman correlation between CNA of genes in the 1q chromosomal arm region and their protein abundance. D Venn diagram showing the overlap between cis-effect gene in the 1q region and significantly upregulated protein in ATC. E Expression and survival risk scores of top three proteins from the overlapping section. Heatmap showing protein expression in 20 NT and 113 ATC samples. Whiskers represent 95% confidence interval. F Immunofluorescence staining showing representative micrographs of MPO and citrullinated histone H3 (cit-H3) in normal tissues (NT), PTC, and ATC. n = 3 samples for each group. G Scatter plot showing the Spearman correlation (two-sided) between the protein abundance of NCF2 and neutrophil score calculated based on xCell (n = 159 samples). H, I Spearman correlation (two-sided) of NCF2, MPO, and PAD4 abundance in ATC/PDTC samples (n = 159). J, K Immunofluorescence staining of C5AR1 in NT (n = 30), PTC (n = 35), and ATC (n = 19). ATCs comprised of 9 ATCs that included in the proteomic analysis and belonged to 3 proteomic subtypes (3 samples for each subtype), and 10 ATCs that were not included in the proteomic analysis. Data are presented as mean ± SD, and statistical analysis is performed via one-way ANOVA. L The relevance of C5AR1 and overall survival (Log-rank test) in patients with ATC or PDTC. M, N Spearman correlation (two-sided) of C5AR1, MPO, and PAD4 abundance in ATC/PDTC samples (n = 159). O Heatmap illustrates the expression and activity of STAT1 and regulatory effect on its downstream substrate proteins correlated with neutrophil infiltration. P Flowchart illustrating the potential biological mechanisms associated with poor prognosis in ATC patients. Source data are provided as a Source Data file.

We then inspected the influence of CNA on proteins within chromosome 1q in patients with ATC. Among the 447 cis-effect genes, the abundance of 89 proteins was significantly correlated with CNA events (Fig. 6C). Furthermore, differential protein expression analysis between the ATC and NAT groups showed that 12 genes were highly expressed in ATC patients in this genomic region (Fig. 6D). Of these 12 genes, we identified three (NCF2, FCER1G and FCGR1A) that were high-risk factors for the survival of patients with ATC (Fig. 6E). NCF2, encoding neutrophil cytosolic factor 2, regulates neutrophil NADPH oxidase function and reactive oxygen species (ROS) production, which are associated with formation of NETs. We examined NET markers (myeloperoxidase (MPO) and citrullinated histone H3 (cit-H3)) in patient tumor tissues. Compared with NAT or PTC tissues, ATC showed profoundly increased NET formation (Fig. 6F). Neutrophils positively correlated with the abundance of NCF2 (R = 0.65, P < 2.2e-16) (Supplementary Fig. S6G and Fig. 6G). Notably, increased neutrophil infiltration was significantly correlated with shorter OS in patients with ATC/PDTC (Supplementary Fig. S6H). Studies have reported that NETs are generated by the orchestrated participation of several proteins, such as peptidyl arginine deiminase 4 (PAD4) and MPO. We found that NCF expression significantly correlated with MPO and PAD4 (Fig. 6H, I), respectively.

In ATC tissues, neutrophil chemotaxis and NETs were frequently enriched, and C5AR1 commonly participated in these pathways (Supplementary Fig. S3C), implying a fundamental role of C5AR1 in neutrophil function. Immunofluorescence staining supported a significant increase of C5AR1 expression in ATC tissues compared with NAT and PTC tissues (Fig. 6J, K). High expression of C5AR1 was significantly correlated with poor prognosis (P = 0.00046) (Fig. 6L). In addition, C5AR1 significantly correlated with MPO and PAD4 (Fig. 6M, N), respectively. Further analysis demonstrated that expression of STAT1 was correlated with neutrophil infiltration (Supplementary Fig. S6I), and C5AR1 expression positively correlated with STAT1 abundance (Supplementary Fig. S6J, P < 0.05). Notably, we found a close association between the high expression and activity of STAT1 and elevated expression of C5AR1 (Fig. 6O).

Collectively, the amplification of chromosomal region 1q in patients with ATC increased the abundance of NCF2 DNA fragment, therefore facilitating the protein expression of NCF2. Elevated STAT1 activity promoted the expression of C5AR1. The upregulation of NCF2 and C5AR1 coordinately triggered neutrophil chemotaxis and the formation of NETs (Fig. 6P).

Precision treatment for proteome-based ATC/PDTC subtyping

The molecular characteristics of the ATC/PDTC subtypes provide vulnerable targets for personalized therapy. Based on the results of phosphoproteome, the dominant targets that are druggable (approved or in clinical trials) were proposed in Fig. 7A. For patients with the Pro-I subtype, active kinase, including IGF1R, AKT1, and ERK1/2, that involve in insulin signaling and cell survival, are expected to be blocked by Figitumumab (phase III), everolimus (approved), and GDC-0994 (phase I). Notably, BCL2 has been suggested as a potential therapeutic target because of its participation in the platinum drug resistance pathway. Immunohistochemistry staining of BCL2 confirmed heterogenic expression of BCL2 in the three proteomic subtypes. BCL2 was strongly expressed in Pro-I, especially in tumor cells instead of stroma. In contrast, its abundance was low in the Pro-III subtype (Fig. 7B, C). BCL2 inhibitors such as venetoclax have been approved for the treatment of lymphoma; however, their value in treating ATC/PDTC requires more evidence. The upregulation of PD-L2 and DNA repair signals in patients with Pro-II subtype implied their antitumor activity following by PD-1-based immunotherapy and ATM/ATR inhibitors (Fig. 7A). However, combination therapy, including chemoradiotherapy, should be considered due to the limited antitumor outcomes of single-agent treatment against ATC8. Pro-III was the most aggressive subtype, characterized with frequent BRAFV600E mutation and high expression of CD274 and C5AR1 (Fig. 7A). Immunofluorescence staining supported a remarkable increase of C5AR1+ cells in Pro-III (31.1% ± 5.3%) comparing with those of Pro-I (0.6% ±  0.5%) and Pro-II (9.7% ± 2.4%) (Fig. 7D, E). In addition, 71.1% of C5AR1+ cells were C5AR1+ CD11b+ myeloid cells in Pro-III, indicating C5AR1 primarily expressed in myeloid cells.

Fig. 7: Potential targets for three molecular subtypes of ATC/PDTC.
figure 7

A Proposal of potential targets depending on proteome-based ATC/PDTC subtyping. B, C Immunohistochemistry staining of BCL2 in three proteomic subtypes (n = 4 samples in each group). Data are presented as mean ± SD, and statistical analysis was performed via one-way ANOVA, two-sided. D, E Immunofluorescence staining of C5AR1 and myeloid marker CD11b in three proteomic subtypes (n = 3 samples in each group). Data are presented as mean ± SD, and statistical analysis was performed via one-way ANOVA, two-sided. F, G The tumor growth curve and tumor size in an immunocompetent model of murine ATC after treating with avacopan, and/or anti-PD-1 as single agent and combination therapy (n = 6). Results show the mean of biological replicates. Error bars indicate SEM. H Immunofluorescence staining showing representative micrographs of MPO and cit-H3 in mATC-derived tumors after avacopan treatment. Experiment was repeated independently using 3 samples for each group with similar results. Source data are provided as a Source Data file.

C5AR1 has been developed as a novel target for antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis15. Among the C5a receptor inhibitors, avacopan has recently been approved for the treatment of ANCA; however, its value as an anticancer therapeutic against the Pro-III subtype remains unclear. Given that Pro-III subtype was characterized with high mutant frequency of BRAFV600E and TP53, a spontaneous ATC transgenic mouse model (TPOCreER; BrafV600E/WT; Trp53−/−, termed TBP below) that phenocopied the histological characteristics of human ATCs was introduced into our study. The primary cell line mATC was constructed by isolating tumor from TBP mice. Administration of avacopan or anti-PD-1 antibody alone achieved moderate antitumor effects, whereas the combination of avacopan and anti-PD-1 antibody improved their antitumor activity and attenuated the growth of mATC-derived tumors in an immunocompetent murine model (Fig. 7F, G). In addition, treatment with avacopan dramatically reduced the formation of NETs in the tumor tissues (Fig. 7H).

Discussion

Anaplastic transformation of thyroid cancer underscores its extreme virulence. However, the rare incidence and lack of surgical indications in most patients hinder systematic research on ATC based on clinical samples. In the present study, we comprehensively delineated the molecular basis of ATC by a large-scale cohort. Integrative analyses of WES, proteomics, and phosphoproteomics data revealed somatic mutations, the mechanism underlying aggressiveness of ATC, proteomic subtypes, immune landscape, and individual therapeutics. This multi-omics study provides comprehensive insight into the heterogeneity of aggressive thyroid cancer, which sheds light on vulnerable targets for patients with ATC/PDTC.

In the last few decades, proteomics and phosphoproteomics have been applied to determine the molecular signatures and metastatic biomarkers in thyroid cancer16,17. However, their application in clinical research is limited due to the requirement of fresh/frozen specimens with large quantities. Proteomic classification using conventional strategies are unfeasible in large-scale cohorts, particularly for rare cancers. The development of PCT protocol in proteomics enables the utility of minimal amounts (0.2–1 mg) of specimens18. This breakthrough was extended by the generation of high-quality proteome data from FFPE samples19. In the current study, 467 FFPE tissue samples were collected from patients with thyroid cancer from 2010 to 2021 for TMT-MS/MS detection leveraging the PCT protocol. A total of 9491 proteins were identified and the median coefficient of variation was 0.07, supporting the high quality of data. Overall, FFPE-based proteome overcame the lack of rare thyroid cancer samples and facilitated the construction of a large-scale multi-omics cohort.

ATC shows characteristic genomic abnormalities, such as widespread loss of TP53, RB1, BRCA2, ELF1, CDKN2A, and ADAM3A, which was also reported by Zeng et al. 9. However, alterations of TP53, BRCA1, NF1, and VGLL3 were obviously higher in this study. In the present study, genomic profiling confirmed gene alterations, including TP53 (58.8%), TERT promoter mutation (46.3%), and BRAF (33.3%), were most commonly mutated genes in ATC and correlated with poor survival. Though these mutations are comparable to previous reports, the mutation frequency of TP53 (36.9%–65%), TERT promoter mutation (48%–65%), and BRAF (21.3%–41%) varied in different studies4,9,10. In addition, discrepancies of NRAS, KRAS, and EIF1AX mutations were found when compared to two other independent studies9,10. Nonetheless, BRAFV600E mutation has been recognized as a pivotal mutation in ATC owing to its universal functions in tumor dedifferentiation, occurrence, and progression. Our study found a close association between BRAFV600E mutation and EGFR phosphorylation. Phosphorylation of EGFR is tightly connected to cancer metastasis, stemness, and drug resistance20,21,22, implying an oncogenic role for BRAFV600E-linked EGFR signaling in ATC. Interestingly, the phosphorylation of ERBB2 (HER2) was increased in non-BRAFV600E mutant samples, which provides an attractive target for patients with ATC due to the impressive outcome of anti-HER2 therapeutics in breast cancer.

Fundamental findings based on the genome and single-cell transcriptome uncovered the drivers of anaplastic transformation of thyroid cancer23. Conditional mutation of Trp53 in BrafV600E mice constrains PTC progression towards ATC24. However, evidence based on a systematic proteome to delineate the aggressiveness of ATC remains limited. In the present study, molecular signatures based on the proteome unveiled an increased enrichment of cell cycle, DNA repair, and ribosome biogenesis in ATC. Robust activation of ribosome biogenesis in ATC is significantly correlated with poor overall survival. Hyperactive ribosome biogenesis is essential for sustaining aberrant tumor cell growth and proliferation by maintaining protein synthesis and translation25. Accumulating evidence suggests that the ribosome machinery can be specialized in cancer cell to function as onco-ribosomes, resulting in oncogenic translation, cancer metastasis, and therapeutic resistance26,27. We identified a specialized class of ribosomal factors that are essential for the survival of ATC cells. RRP9, a U3 small nucleolar RNAs (snoRNAs)-associated protein required for 18S rRNA production during ribosome biogenesis28,29, is significantly upregulated in ATC and acts as a prognostic indicator. Notably, silencing of RRP9 not only remarkably inhibited the proliferation of ATC cells, but also profoundly reduced tumor growth, indicating the potential vulnerability of ATC. Although ribosome biogenesis and RRP9 were elevated in ATC compared to NAT, both showed no differences among the three proteomic subtypes, suggesting that ribosome biogenesis is likely to be a common hallmark of aggressive thyroid cancer.

The high heterogeneity of ATC and PDTC poses a significant challenge for clinical therapy. However, precise therapeutics based on molecular classification remain lagging owing to the rarity of clinical samples. Efforts have been made to identify the molecular subtypes instead of histopathology in aggressive thyroid cancers. Three clusters of ATCs were previously proposed using targeted next-generation sequencing, which showed distinct genetic phenotypes with BRAFV600E, NRAS, and PTEN/NF1/RB1 mutations. Nonetheless, the limited druggable targets in most ATC subtypes based on genomic analysis restricts their clinical translation. Our study reports the proteomic classification of ATC and PDTC, and three subtypes with distinct molecular characteristics linked to clinical, prognostic, and vulnerable targets are demonstrated. Briefly, the Pro-I subtype is characterized by overactivation of ERK1/2, AKT, and BCL2 signaling, and low immune cell infiltration. Pro-II is involved in DNA repair and cell division signaling. Pro-III is the most aggressive subtype, harboring a high frequency of TP53 and BRAF mutations, exhibiting active microtubule dynamics, and shaping an immunosuppressive TME.

Active kinases (such as AKT1, ERK1/2, and IGR1R,) and phosphorylation of insulin receptor substrate (IRS) in the Pro-I subtype are tightly correlated with insulin and insulin-like growth factor signaling. Engagement of IGR1R by phosphorylated IRS triggers AKT and mitogen-activated protein kinase signaling cascades, which are essential for cell proliferation, protein synthesis, and even cancer progression30. The activation of IRS2-PI3K/AKT signaling potentiates the self-renewal of breast cancer stem cells and liver tumorigenesis31,32. Interestingly, we found that the kinase activity of AKT1 was specifically increased in Pro-I and responsible for the phosphorylation of IRS2, indicating potential positive feedback in IGF1R-IRS2-AKT signaling axis. In addition, BCL2, an antiapoptotic factor specifically upregulated in Pro-I, was maintained by IGF1 signaling33, suggesting an oncogenic role for the IGF1R-IRS2-AKT signaling axis in Pro-I. Notably, a close association between hyperinsulinemia and cancer progression has been recently reported34. Cancer cells maintain their ability to respond to insulin through several mechanisms, even when metabolic tissues become insensitive to elevated circulating insulin in the setting of obesity and early type 2 diabetes. Importantly, accumulating evidence has demonstrated that aberrant insulin signaling contributes to therapeutic failure against cancers, and preclinical and clinical trials that target insulin signaling against cancers have been conducted35. Hence, abrogating insulin signaling may be a potential strategy in the Pro-I subtype.

DNA repair and cell division signals were enriched in Pro-II. ATM and ATR, kinases that orchestrate DNA damage response36, are overactivated in Pro-II, and phosphorylate substrates including TOP, BRCA1, and RAD50. In a preclinical study, the ATR inhibitor ceralasertib in combination with carboplatin, irinotecan, or a PARP inhibitor showed impressive antitumor activities in triple-negative breast cancer37. A phase II study of ceralasertib in combination with durvalumab demonstrated an ORR of 22.6% in patients with previously treated advanced gastric cancer38. In a recent study, we reported an evident increase in DNA damage repair signaling in ATC, accounting for cisplatin resistance, which is mediated by the overproduction of lactate, resulting the recognition and repair of DNA damage induced by cisplatin. We developed nucleus-targeting platinum nanoclusters that can simultaneously disrupt glycolysis and DNA repair in ATC cells, thereby enhancing tumor chemotherapy39. Thus, the activation of DNA repair in Pro-II offers potential vulnerable targets for precision therapy.

Immunosuppressive molecules such as CD274 (encoding PD-L1), metabolic immune checkpoint IDO1, and complement 5a receptor C5AR1 were significantly increased in Pro-III tumors, indicating the potential value of antitumor immunity in this subtype. In particular, intensive infiltration of C5AR1+ myeloid cells, especially neutrophils, is a unique molecular signature of the Pro-III subtype. Nonetheless, the mechanism by which neutrophils were shaped in ATC/PDTC tissues remained to be elucidated. Integrative proteogenomic analysis uncovered the amplification of chromosome region 1q exerted cis-effects on NCF2 abundance, and increased STAT1 activity facilitated C5AR1 expression, which may coordinately induce neutrophil chemotaxis and NET formation. During NETosis, NCF2 acts as a cytosolic subunit of the multi-protein NADPH oxidase complex producing ROS, thereby triggering an MPO-mediated chromatin decondensation pathway and activating PAD4 to citrullinate histones40. The binding of complement protein C5a to C5AR1 promotes cytoplasmic and mitochondrial Ca2+ production and activates downstream MPO/PAD4-mediated NETosis. Another key point is how these neutrophils intensively infiltrate the Pro-III subtype. Loss of p53 in breast cancer cell can induce the secretion of WNT ligands that triggers tumor-associated macrophages to generate IL-1β, subsequently induces systemic neutrophilia to potentiate cancer metastasis41. It would be interesting to explore the relationship between genomic mutation and neutrophil infiltration or phenotype shaping, due to a high frequency of TP53 mutation found in Pro-III subtype.

Neutrophils are emerging as vital regulators of cancer initiation and progression through various mechanisms, including tumor-initiating genotoxicity, angiogenesis, immunosuppression, NET DNA-dependent cancer cell growth, and ECM remodeling42,43,44,45. Accumulating evidence indicates that neutrophils are unfavorable for immunotherapy. An increased neutrophil-to-lymphocyte ratio is associated with poor immune status and resistance to immune checkpoint inhibitors46. In pancreatic cancer, IL-17 triggers NETosis formation, which mediates resistance to checkpoint blockade47. Intriguingly, we found that the inhibition of C5AR1 by avacopan, an FDA-approved oral drug for the treatment of antineutrophil cytoplasmic antibody-associated vasculitis15, showed synergistic antitumor effects against ATC in combination with PD-1 blockade. Given that neutrophils intensively infiltrate Pro-III subtype, targeting C5AR1 is a promising strategy for enhancing the clinical outcomes of anti-PD-1 therapy.

In conclusion, this integrative analysis based on a multi-omics thyroid cancer cohort provides a comprehensive understanding of ATC and PDTC. The molecular signatures of different thyroid cancers revealed that RRP9-mediated onco-ribosomes are vulnerable to ATC. The proteome-based classification of ATC/PDTC combined with multi-omics data, demonstrated three distinct subtypes correlating to clinical therapy and prognosis. A translational strategy leveraging a C5AR1 inhibitor to boost the antitumor immunity of the PD-1 blockade has been proposed to treat the most aggressive ATC/PDTC subtype. Overall, our study offers a molecular landscape and druggable targets for ATC/PDTC, which would accelerate precision therapy for aggressive thyroid cancer.

Methods

Sample selection and preparation

The procedures performed in our study were in accordance with the ethical standards of the Ethics Committee of Zhejiang Provincial People’s Hospital. Thyroid cancer samples and adjacent thyroid tissues were collected from the Zhejiang Provincial Hospital, Zhejiang Cancer Hospital, and Sir Run Run Shaw Hospital. Data of patients who had not received any anticancer treatment before surgery were randomly retrieved from January 2010 to January 2022 during the first diagnosis. FFPE tissue specimens were used, and 467 surgically resected samples were collected. Sample histopathology was systematically reviewed by three expert pathologists using the World Health Organization (WHO) classification system. Furthermore, tumor specimens were evaluated by hematoxylin and eosin (H&E) staining to ensure tumor content more than 70% and necrosis equal to or less than 20%. Clinical information including smoking status, drinking status, histological subtype, lymph node metastasis, TNM stage, and overall survival time, is summarized in Supplementary Data 1. Individual information of sex and age was not provided due to restrictions on ethics. Each patient provided a written informed consent for his/her samples and clinical information to be used for research, and study protocol was permitted by the Ethics Committee of Zhejiang Provincial People’s Hospital.

Whole-exome sequencing

DNA extraction

DNA extracted from FFPE samples (51 ATCs, 28 PDTCs, and 18 NAT samples) was processed using a QIAamp DNA Mini Kit (QIAGEN, Hilden, Germany) following the manufacturer’s protocol. The quality of the isolated DNA and the presence of contamination were assessed for DNA degradation and contamination through agarose gel electrophoresis, and DNA concentration was determined by a Qubit® 2.0 Fluorimeter (Invitrogen, CA, USA) with the Qubit® DNA Assay Kit.

Library preparation

A total of 0.6 µg of genomic DNA per sample was utilized, and sequencing libraries were generated using an Agilent SureSelect Human All Exon kit (Agilent Technologies, CA, USA) as per the manufacturer’s instructions. Fragmentation was achieved by a hydrodynamic shearing system (Covaris, Massachusetts, USA), yielding fragments 180–280 bp. Blunt ends were generated through exonuclease/polymerase activities, followed by adenylation of the 3′ ends and ligation of adapter oligonucleotides. Adapter-ligated DNA fragments were selectively enriched via PCR, and libraries were hybridized with a biotin-labeled probe and captured using streptomycin-coated magnetic beads. The captured libraries were PCR-enriched to add index tags for sequencing, purified using an AMPure XP system (Beckman Coulter, Beverly, USA), and quantified by an Agilent Bioanalyzer 2100 system.

Clustering, sequencing, and quality control

The index-coded samples were clustered using a cBot Cluster Generation System with a HiSeq PE Cluster Kit (Illumina) following the manufacturer’s protocols. Subsequently, the DNA libraries were sequenced on the Illumina HiSeq platform to produce paired-end reads of 150 bp. The target region for the whole-exome sequencing (WES) encompassed 33 Mb.

The initial fluorescence image files acquired from the HiSeq platform underwent base calling to convert them into short reads (raw data), containing sequence and sequencing quality information, which were stored in the FASTQ format. Rigorous quality control measures are imperative to ensure meaningful downstream analysis. The steps of data processing include (i) discarding the paired reads if one read contained adapter contamination (>10 nucleotides aligned to the adapter, allowing ≤10% mismatches); (ii) discarding the paired reads if more than 10% of bases are uncertain; (iii) discarding the paired reads if the proportion of low-quality (Phred quality <5) bases was over 50% in either one read.

Subsequent bioinformatic analyses relied on meticulously filtered high-quality data, which were retained following the aforementioned procedures. Concurrently, quality control metrics, encompassing parameters such as total read count, raw data metrics, sequencing depth, error rate, proportion of reads with Q30 scores (representing bases with Phred-scaled quality scores exceeding 30), and distribution of GC content, were computed and summarized for assessment.

Reads mapping to reference sequence

The deemed valid sequencing data were aligned to the reference human genome (UCSC hg19) by the Burrows-Wheeler Aligner (BWA, v0.7.12) software48, resulting in original mapping outcomes stored as BAM format. When one or both paired reads were mapped to multiple positions, BWA employed a strategy to select the most probable alignment. If multiple equally probable alignments were identified, BWA randomly selected one. Subsequently, SAMtools (v1.9)49 and Picard (http://broadinstitute.github.io/picard/) were employed to organize BAM files and execute tasks such as duplicate marking, base quality recalibration, and local realignment. The generated final BAM files were used to compute the sequence coverage and depth.

Somatic mutation calling

Somatic variants were determined by MuTect (version 2) on the exome data of tumor samples from the NAT samples. Eighteen NAT samples were used to generate pools of normal and filtered germline mutations. To obtain high quality somatic mutations, stringent downstream filtration criteria were set as following50: (i) A minimum of 10× coverage; (ii) Variant allele fraction ≥5% and at least 5 variant supporting reads in the tumor sample; (iii) 1000G and gnomAD population frequency <1%. Subsequently, mutations in the non-coding regions including 3’UTR, 5’UTR, intron, and gene intergenic were removed, and variants with high consequences were retained. Somatic insertions and deletions (indels) were called using Strelka27 (version.1.0.17).

Analysis of significantly mutated genes

Significantly mutated genes were identified by MutSigCV (https://software.broadinstitute.org/cancer/cga/mutsig, version 1.4) with default parameters. P values were then adjusted by the method of Benjamini and Hochberg51, and mutated genes with a q value < 0.05 were considered significant.

Copy-number analysis

Copy Number Alterations (CNAs) were performed using the somatic CNA calling pipeline in the GATK (GATK 4) Best Practice. The resulting segment files for every 1000 were input into GISTIC252 to identify significantly amplified or deleted focal- and arm-level events. Q value < 0.1 was considered significant. A log2 ratio cutoff of 1 was set to determine SCNA amplification and deletion. Weighted sum approach was utilized to summarize the arm-level copy number change53. The segment-level log2 copy ratios for all segments located in the given arm were added to the length of each weighted segment. The cutoff thresholds were set as -ta 0.1 -tb 0.1 -brlen 0.98 -conf 0.9 to exclude false positives.

Detection of TERT promoter mutation

TERT promoter mutations (C228T/C250T) were detected by Human TERT Promoter Mutation Detection Kit (RPQ008, Genetron) basing on amplification-refractory mutation system real-time PCR and TaqMan probes. A 2 μL DNA exacted from FFPE samples were mixed with 18 μL reaction solution according to the manufacturer’s instruction, and determined by CFX96 real-time PCR instrument (Bio-Rad, CA, USA).

Proteomic analysis

Dewaxing treatment

FFPE samples were placed in the oven at 37°C for 30 min and submerged with heptane for 10 min. Samples were then covered with 100%, 90%, and 75% ethanol in sequence for 5 min. After rinsed by deionized water, samples were transferred into pressure cycling technology (PCT) tubes and 12.5 μL tris-HCl (100 mM, pH 10) were added. Samples were shaken at 600 rpm for 30 min at 95 °C.

PCT-based protein digestion

We initiated the reduction and alkylation process by adding 30 μL of a solution containing 2 M thiourea, 6 M Urea, and 100 mM triethylammonium bicarbonate (TEAB), along with 5 μL of 200 mM Tris (2-carboxyethyl) phosphine (TCEP) and 2.5 μL of 800 mM iodoacetamide (IAA). The reaction was carried out under a pressure of 45,000 psi, with 30 s of high pressure followed by 10 s of ambient pressure per cycle, at 30 °C for a total of 90 cycles. To adjust the urea concentration to below 1.5 M, 75 μL of TEAB was subsequently added to the PCT tube. Subsequently, 5 μL Lys-C (0.25 μg/μL) and 10 μL Trypsin (0.5 μg/μL) were added for protein digestion. The conditions were set to 20,000 psi, with 50 s of high pressure and 10 s of ambient pressure per cycle, at 30 °C for a total of 120 cycles. The tryptic peptides were then transferred into 1.5 mL tubes and 25 μL 10% TFA was used to terminate digestion.

Desalting and TMT labeling

The pH of the samples was maintained within the range 2–3. Desalting was conducted using SOLAμ columns (Thermo Fisher Scientific™, San Jose, USA), and TMT labeling was performed using the TMTpro 16plex Isobaric Label Reagent Set following the manufacturer’s instructions.

High pH fractionation

We utilized a Waters XBridge Peptide BEH C18 column (300 Å, 5 μm × 4.6 mm × 250 mm) coupled with a DIONEX UltiMate 3000 Liquid Chromatogram for fractionation. The mobile phases comprised 10 mM ammonium hydroxide (pH 10) for mobile phase A and 98% ACN, 10 mM ammonium hydroxide (pH 10) for mobile phase B. Peptides were collected at 1-min intervals during a 60-min gradient from 5% to 35% ACN at a flow rate of 0.5 mL/min, resulting in the formation of 30 fractions. Subsequently, the fraction samples were dried using a SpeedVac and reconstituted in 0.1% Formic Acid and 2% ACN prior to LC-MS/MS analysis.

MS analysis

467 thyroid cancer samples were performed by proteomic analysis, including three biological replicates of PTC. We employed LC-MS/MS technology utilizing the nanoflow DIONEX UltiMate 3000 RSLCnano System coupled with an Orbitrap Exploris 480 mass spectrometer (Thermo Scientific™, San Jose, USA), which was equipped with a FAIMS Pro™ (Thermo Scientific™, San Jose, USA), operating in data-dependent acquisition (DDA) mode. For chromatographic separation, we utilized Buffer A (2% ACN, 98% H2O with 0.1% formic acid (FA)) and Buffer B (80% ACN in H2O with 0.1% FA). All reagents were of MS grade. Peptides were initially subjected to a precolumn (3 µm, 100 Å, 20 mm, 75 µm i.d.) at a flow rate of 6 μL/min for 4 min, followed by injection into the analytical column (1.9 µm, 120 Å, 150 mm, 75 µm i.d.) using a 30-min LC gradient (7% to 30% Buffer B) at a flow rate of 300 nL/min. The MS1 scan range was set to 375–1800 m/z with a resolution of 60,000, an intensity threshold of 2e4, a normalized automatic gain control (AGC) target of 300%, and a maximum ion injection time (max IT) of 50 ms. MS/MS experiments were conducted at a resolution of 30,000, a normalized AGC target of 200%, and a maximum IT of 86 ms. The isolation window was set at 0.7 m/z with the first mass set at 110 m/z.

Phosphoproteomic analysis

Protein extraction and digestion

TCEP buffer supplemented with phosphatase and protease inhibitors was used to lyse samples at 99 °C for 30 min. Subsequently, trypsin was added with 1:50 of total protein amount and digested for 18 h at 37 °C. A 10% FA was then added in each sample and vortexed for 3 min, followed by sedimentation at 12,000 × g for 5 min. Next, the supernatant was absorbed to a new 1.5 mL tube with extraction (50% ACN, 0.1% FA), followed by sedimentation at 12,000 × g for 5 min. Supernatant was then collected for drying by a speed-vac at 60 °C.

The enrichment of phosphorylated peptides

High SelectTM Fe-NTA Magnetic Phosphopeptide Enrichment Kit (Thermo Fisher Scientific, Rockford, IL, USA, #A32992) was used for phosphopeptide enrichment according to a previous study50. Briefly, peptides were suspended in 200 μL Binding/Wash Buffer and mixed with equilibrated resins every 10 min for 30 min at room temperature. The resins were subsequently washed three times with Binding/Wash Buffer and twice with LC-MS-grade water. The Elution Buffer was used to elute enriched peptides and samples were immediately dried by a speed-vac at 45 °C.

LC-MS/MS analysis

For phosphoproteomic detection, 217 samples were analyzed using an Orbitrap Fusion Lumos Hybrid Quadrupole-Orbitrap Mass Spectrometer (Thermo Fisher Scientific) coupled with a high-performance liquid chromatography system (EASY nLC 1200, Thermo Fisher Scientific). Peptides, re-dissolved in 0.1% FA, were subjected to a 2-cm self-packed trap column using 0.1% FA and separated on a 150 μm-inner-diameter column with a length of 30 cm over a 150-min gradient (Solvent A: 0.1% FA; Solvent B: 80% ACN, 0.1% FA) at a constant flow rate of 600 nL/min. The gradient was as follows: 0–150 min, 0–4% Solvent B; 0–10 min, 4%–15% Solvent B; 10–125 min, 15%–30% Solvent B; 125–140 min, 30%–50% Solvent B; 140–141 min, 50%–100% Solvent B; 141–150 min, 100% Solvent B. Peptides were ionized at 2 kV and loaded into the mass spectrometer. Mass spectrometry was conducted by DDA mode. MS1 spectra were acquired by an Orbitrap mass analyzer at a resolution of 120,000, with ions in the m/z range of 300–1400 and an AGC target value set to 3E6. The maximal ion injection time was set as 80 ms. MS2 spectra acquisition was conducted in the ion trap in rapid speed mode with a cycle time of 1.5 s. Precursor ions were fragmented using higher energy collision dissociation with 30% normalized collision energy, and analyzed by an ion trap mass analyzer with an AGC target set to 5E4. The maximum ion injection time for MS2 was set to 20 ms. Peptides that triggered the MS/MS scans were dynamically excluded from MS/MS scans for 18 s. The coefficients of variation values on FAIMS were set as −45 and −65 V.

For quality control of performance of mass spectrometry, the HEK293T cell lysates were measured every 3 days to set the quality-control standard. The quality-control standard was digested and analyzed using the same method and conditions as the 30 samples.

MS database searching

Phosphoproteome MS raw files were processed through “Firmiana” (a one-stop proteomic cloud platform) software using Mascot search engine against the human NCBI reference proteome database. Carbamidomethyl cysteine was selected as the fixed modification, and protein N-terminal acetylation, oxidized methionine, and phospho (S/T/Y) were chosen as variable modifications. The cutoff of false discovery rate (FDR), depending on the target-decoy strategy, was set at 1% for peptides. Detected peptides were retrieved either as unique peptides to a particular protein group or as razor peptides to a single peptide group with the most peptide evidence. Tolerance for spectral search was set to 20 ppm for the precursors. The maximum number of missing cleavage site was set at 2. In the phosphopeptide-enriched data, sites were computed for localization probability by ptmRS54 algorithm. Sites probability ≥0.75 were considered as confident localization.

Quantification and statistical analysis

Effects of copy number alterations

The R package multiOmicsViz was used to visualize protein abundance in either “cis” (within the same aberrant locus) or “trans” (remote locus) mode affected by SCNAs55. Spearman’s correlation coefficients and related adjusted P values were subsequently evaluated for each CNA-protein pair.

Differential protein analysis

The Wilcoxon rank-sum test was utilized to examine significant differences in protein expression between NATs, ATCs, PDTCs and PTCs, or between patients with distinct mutation statuses and CNA statuses. Proteins were considered significantly altered if they exhibited differential expression in one group compared to another (Wilcoxon rank-sum test, BH adjusted P value < 0.05, fold change >1.5, or <2/3).

Pathway enrichment and correlation analysis

DAVID database (https://david.ncifcrf.gov/) was used to perform pathway enrichment analysis, and the cutoff for significant pathways was P value/FDR of <0.05. Gene sets of molecular pathways from the KEGG or Hallmark or Reactome or GO databases were retrieved to compute pathways. Fisher’s exact test was applied to examine the significance of the pathway enrichment.

Gene Set Enrichment Analysis (GSEA) was performed using the GSEA software (http://software.broadinstitute.org/gsea/ index.jsp). Gene sets were downloaded from the Molecular Signatures Database (v7.1).

Spearman’s correlation coefficients were calculated using the Hmisc (v4.5-0) R package to estimate correlations between pathway scores and other features. The ggplot2 (v3.3.5) R package was used to generate the scatter plots.

Kinase activity prediction

Kinase activities were calculated using the previously reported Kinase-Substrate Enrichment Analysis (KSEA) algorithm depending on the abundance of phosphosites. Kinase-substrate relationships were acquired from publicly available databases, including PhosphoSitePlus (PSP) and NetworKIN 3.0. Substrate motifs were retrieved from the KSEA dataset using Motif-X.

Consensus clustering analysis

Proteins that expressed in more than 30% of specimens were chosen for clustering analysis. To explore new proteomic subtypes of ATC and PDTC, the R package ConsensusClusterPlus (v.1.50.0) was used to conduct consensus clustering using reconciling clustering algorithm in 6145 proteins. The clustering algorithm utilized k-means with Euclidean distance. Consensus Cluster Plus parameters were set as repetitions = 1000 bootstraps, pItem = 0.8 (resampling 80% of any sample), pFeature = 0.8 (resampling 80% of any protein), and K-means clustering with up to 7 clusters. Consensus matrices for k = 3, 4, and 5 clusters were illustrated in Fig. S5A. The resulting k = 3 showed the clearest separation among these clusters and significantly correlated with patient survival, and was selected for further analysis.

Subtype-specific expressed proteins analysis

Differentially expressed proteins (DEPs) across three proteomic subtypes were determined by Kruskal–Wallis test. DEPs in different subtypes were identified as proteins differentially expressed in one subtype compared to other subtypes. The cutoff was set at P < 0.05, fold change >1.5, or <2/3.

Correlation between proteomic subtypes and clinicopathological features

To determine the correlations between proteomic subtypes and clinicopathological features, Fisher’s exact test was conducted on categorical variables, including group, BRAF mutation status, sex, age, smoking, lymph node metastasis, and TNM stage.

Survival analysis

Kaplan–Meier survival curves by log-rank test were utilized to plot overall survival (OS) of both proteomic subtypes and individual patients. Before conducting the log-rank test, the R package survminer (0.2.4) along with maxstat (http://r-addict.com/2016/11/21/Optimal-Cutpoint-maxstat.html) were utilized to identify the optimal cutoff point of selected samples. For multivariate survival analyses, hazard ratios (HRs) and 95% confidence intervals (CI) were calculated by R package survival based on Cox proportional hazard models.

Immune subtype analysis

xCell integrates the advantages of gene set enrichment with deconvolution approaches to estimate immune and stromal microenvironment56. The scores of 64 different cell types in 159 tumors (ATC and PDTC) were computed using xCell depending on protein expression values. The resulting scores computed by xCell are listed in Supplementary Data 6. Consensus clustering was conducted using xCell signatures to identify samples with similar immune and stromal characteristics. The R package ESTIMATE was used to estimate tumor purity based on immune and stromal scores derived from proteomic data50. To explore the different biological process enriched in immune clusters, the “ssGSEA” R package (v1.42.0) was used to define the pathway score of each sample. A one-versus-all test was employed to select cell types in different immune clusters, followed by Wilcoxon rank-sum test to screen for pathways that were differentially enriched among different subgroups.

Cell culture

Human ATC cell line 8505C (Catalog No. ACC-219) and CAL62 (Catalog No. ACC-448) were from Deutsche Sammlung von Mikroorganismen und Zellkulturen. Both cells were cultured with RPMI 1640 medium (C11875500BT, Cytiva, USA) supplemented with 10% fetal bovine serum (FBS) (SH30087.04, HyClone, USA), 100 μg/mL streptomycin, and 100 U/mL penicillin. The murine ATC cell line mATC (storage no. C2024159, China Center For Type Culture Collection) was isolated from genetically engineered TBP (TPOCreER; Braftm1Mmcm/WZ, Trp53tm1Brn/tm1Brn) mice57. The mATC cells were cultured with Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with 10% FBS, 10 ng/mL hydrocortisone, 100 μg/mL streptomycin, and 100 U/mL penicillin. All cell lines were cultured in the cell incubator at 5% CO2 and 37 °C. Cell lines were verified by short tandem repeat profiling and confirmed as mycoplasma negative before experiments.

siRNA-mediated gene knockdown

For siRNA-mediated gene knockdown of RPP9, CAL62 and 8505C cells were seeded in 6-well plates (5 × 105/well) at 30–40% confluence, respectively. Cells were transfected with siRNA using jetPRIME (Ployplus, 114–15) according to the manufacturer’s instructions. The sense sequence of RRP9 siRNA (stB0008845B, RiboBio, Guangzhou, China) was GCAGCAAGCTCATCTAT.

Cell proliferation and colony formation assay

The transfected ATC cells were seeded in 96-well plates (3 × 103/well) and cultured for 48 h. Cell viability was determined by CCK-8 assay (FD3788, Fdbio, China) according to the manufacturer’s protocol. For colony formation assay, transfected CAL62 and 8505C cells were seeded at a density of 1 × 103 cells/well in 6-well plates and cultured for 7 and 15 days, respectively. Then, colonies were stained with 0.1% crystal violet and photographed.

Immunohistochemistry and immunofluorescence staining

The 4-μm paraffin-embedded sections of thyroid cancer tissues and mATC-derived tumor tissues were first dewaxed and then subjected to antigen retrieval using 1 mM EDTA (pH 8.0). The endogenous peroxidase was subsequently inactivated by 3% hydrogen peroxide for 15 min, and 5 % goat serum in PBS was used to block nonspecific binding sites. The samples were then incubated with primary antibodies against BCL2 (1:6000, 60178-1-Ig, Proteintech, China,), CD11b (1:1000, ab133357, Abcam, UK), C5AR1 (1:400, 21316-1-AP, Proteintech), MPO (1:800, A12109, Abclonal, China), cit-H3 (1:4000, ab5103, Abcam) at 4 °C overnight. After washing three times by PBS, sections were incubated with DAPI and secondary antibodies. For IHC staining, the signal was detected using Development Kits and the nuclei were stained with hematoxylin. Images were acquired by Pannoramic Scan (3D HISTECH, Hungary).

Animal models

To examine the effects of RRP9 expression on tumor growth, 8505C cells were transfected by recombinant lentiviruses containing RRP9-targeted shRNA in the presence of 5 μg/mL polybrene. Stable knockdown of RRP9 in 8505C cells were then generated by 5 μg/mL puromycin. BALB/C nude female mice at 6–8 weeks were randomly divided into two groups, and 4 × 106/mL negative control (RPP9-WT) or RRP9-konckdown (RPP9-KD) 8505C cells in 100 μL PBS containing 10% Matrigel (v/v) were subcutaneously implanted into nude mice. Tumor volume and body weight were recorded every 3 days. Tumor volume was evaluated as Length × Width2/2. Mice were euthanized and tumors were harvested, weighed, and photographed.

To determine the synergistic effects by targeting C5AR1 and PD-1, 4 × 106/mL murine ATC cell line mATC cells suspension in 100 μL PBS were implanted subcutaneously in 6-week-old female C57BL/6J mice. Subsequently, mice were randomly divided into four groups and treated on day 7. The C5AR1 inhibitor, avacopan (MedMol, S87952), was suspended in 1% sodium carboxymethyl cellulose and administered via oral gavage once daily at a dose of 30 mg/kg as single agent or in combination with anti-PD-1 antibody. A 10 mg/kg of anti-PD-1 antibody was injected intraperitoneally every 2 days. Tumor volume and body weight were recorded every 3 days. After sacrifice, tumor tissues were collected, weighed, and imaged.

All experiments were performed according to the “Guide for the Care and Use of Laboratory Animals (8th Edition)” and “Guidelines for Endpoints in Animal Study Proposals (2019)” by National Institutes of Health. Mice were housed in a specific pathogen-free environment with a 12-h light/dark cycle and received food and water ad libitum. Mice with tumor burden greater than 10% body weight or maximal tumor size exceed 20 mm constitute an endpoint. Isoflurane was used for euthanasia. The maximal tumor size/burden was not exceeded in this study. The protocols were approved by the Animal Ethics Committee of Zhejiang Provincial People’s Hospital.

Statistics and reproducibility

The analysis was conducted using R (version 4.1.2) and Python (version 3.8.6). Fisher’s exact test was employed for categorical variables versus categorical variables; otherwise, the chi-square test was used. The Wilcoxon rank-sum test was used to assess differential protein expression between tumors and NAT, as well as between patients with different mutations and CNA statuses. The Kruskal–Wallis test was used to examine differences in protein expression among proteomic subtypes and other subtypes. A one-versus-all test was utilized to select cell types from different immune clusters. Benjamini–Hochberg FDR correction was applied to adjust P values for multiple testing. Kaplan–Meier plots (log-rank test) depicted OS, and univariate Cox proportional hazards regression models identified variables associated with OS. Significant factors from univariate analysis were subjected to multivariate Cox regression analysis using the forward LR approach. For in vitro experiments, three biological replicates were conducted independently, and data are presented as mean ± standard error (SD). For in vivo experiments, results were shown as mean ± standard error of mean (SEM). Statistical significance of differences between two or multiple groups was examined by two-sided Student’s T-test and one-way analysis of variance (ANOVA), respectively. P < 0.05, 0.01, 0.001 represented by *, **, ***, respectively. No statistical method was used to predetermine sample size. No data were excluded from the analyses.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.