Fig. 1: Overview of the pan-cancer study.
From: Next generation pan-cancer blood proteome profiling using proximity extension assay

a Age distribution and number of patients included for each cancer type and the healthy cohort. b Levels of four example proteins across the 12 cancer types. Boxplots show the median, the first and third quartiles (lower and upper hinges), and whiskers extending to the minimum and maximum values within 1.5 times the interquartile range (IQR). Individual data points are shown for each cancer group, with n = 1462, n = 1402, n = 1462, and n = 1399 independent samples for CD79B, FLT3, LY9, and SLAMF7, respectively. c Schematic representation of the workflow used in this study. Blood plasma from 1477 cancer patients and 74 healthy individuals was analyzed using the Proximity Extension Assay. Differential expression analysis and classification models were used to compare each cancer with all other cancers and to identify cancer-associated proteins. The cancer classification models were built using machine learning techniques, with 70% of the data as the training set. The resulting pan-cancer protein panel was used in a pan-cancer multiclassification strategy, and its performance was evaluated on a test set (30% of the data) and ultimately compared against healthy individuals. Source data are provided as a Source data file. AML acute myeloid leukemia, CLL chronic lymphocytic leukemia, DLBCL diffuse large B-cell lymphoma.
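
The boxplot convention described for panel b can be illustrated with a minimal sketch. It is not the authors' plotting code: the function names `quantile` and `boxplot_summary` are hypothetical, the linear-interpolation quartile method is an assumption (plotting libraries differ on this), and the input values are made up for illustration only.

```python
def quantile(sorted_vals, q):
    """Quantile of a pre-sorted list by linear interpolation (assumed method)."""
    pos = (len(sorted_vals) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def boxplot_summary(values):
    """Return the median, hinges, and whisker ends for one group of values."""
    vals = sorted(values)
    q1, median, q3 = (quantile(vals, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    # Whiskers reach the most extreme observations that still lie
    # within 1.5 * IQR of the hinges; points beyond are outliers.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in vals if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
    }


# Made-up values, not data from the study.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(summary)
```

In this toy input, the value 100 lies beyond the upper fence, so the upper whisker stops at 8 and 100 would be drawn as an individual point.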