Extended Data Fig. 2: Epigenetic Correlations and Filtering with TCGA, GTEx, and FANTOM5 samples.
From: Pan-cancer analysis identifies tumor-specific antigens derived from transposable elements

a, Spearman's rho correlation of global methylation versus the number of all 26,816 TE-chimeric transcripts, per cancer type and across all samples. Purple bars represent significant correlations (adjusted P-value < 0.05). Exact p-values are the following: STAD: 2.93E-04, HNSC: 1.27E-05, LUSC: 1.58E-03, BLCA: 2.33E-02, UCEC: 3.20E-02, KIRP: 1.23E-01, SKCM: 5.73E-02, LUAD: 2.65E-01, READ: 6.69E-01, LIHC: 2.65E-01, KIRC: 4.47E-01, PCPG: 6.69E-01, ESCA: 8.485E-01, LGG: 6.69E-01, PAAD: 8.48E-01, COAD: 8.48E-01, SARC: 8.72E-01, BRCA: 8.66E-01, PRAD: 9.22E-01, CESC: 9.22E-01, THCA: 6.69E-01, All: 9.24E-01. b, Dot plot of the difference in the number of all TE-chimeric transcripts between samples with and without a particular driver mutation in a specific cancer type. Dots are ordered by difference. A two-sided Wilcoxon rank sum test was used with Benjamini-Hochberg correction. Exact p-values for significant differences are the following: COAD-APC: 1.17E-03, COAD-TP53: 3.15E-06, READ-TP53: 9.01E-04, STAD-TP53: 3.70E-02, HNSC-CASP8: 7.80E-04, HNSC-NOTCH1: 4.37E-02, HNSC-NSD1: 3.08E-04, BRCA-TP53: 4.37E-02, LIHC-TP53: 7.11E-03. c, Number of tumor and normal samples in which each of the TE-chimeric transcripts was present. Those highlighted in blue passed our threshold for tumor-specificity. The bottom graph is a zoomed-in view of the section of the top graph enclosed by the dotted box. d, Number of TCGA tumor and GTEx adult normal samples in which each of the TE-chimeric transcripts was present. Those highlighted in blue passed our threshold for tumor-specificity. e, Number of samples in each tissue type profiled by FANTOM5. f, Expression of candidate promoters in FANTOM5. The dashed box highlights candidates removed due to high expression in adult tissues.
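
For readers unfamiliar with the multiple-testing correction named in panel b, the following is a minimal sketch of the standard Benjamini-Hochberg adjustment. It is not the authors' code: the function name `benjamini_hochberg` and the raw p-values in `raw` are hypothetical and chosen only for illustration, and the 0.05 cutoff simply mirrors the threshold stated for panel a.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the tests, sorted by ascending raw p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, scaling each by m / rank
    # and keeping a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values (not taken from the figure).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
# Flag tests whose adjusted p-value falls below the 0.05 cutoff.
significant = [p < 0.05 for p in adj]
```

Because the adjustment enforces monotonicity, several tests can end up sharing the same adjusted value, which is one common reason repeated p-values appear in lists of corrected results.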