Fig. 2: A clinically concordant morphological signature of malignant myeloma cells.

a, t-Stochastic neighbor embedding (t-SNE) of the CNN latent space of high-confidence cells colored by CNN class (n = 489,753 cells from 97 patient samples). b, Marker expression levels per cell projected onto the embedding of a. c, Example cropped microscopy images showing representative morphologies of myeloma cells (top) and small CD138+/CD319+ plasma cell-marker-positive cells (bottom). Scale bar, 10 µm. Box-plots of cell diameter of myeloma cells (n = 1,828 cells from 55 patient samples) and small plasma cell-marker-positive cells (n = 1,162 cells from 55 patient samples) (right). Box-plots indicate the median (horizontal line) and 25% and 75% ranges (box) and whiskers indicate the 1.5 × interquartile range above or below the box. Outliers beyond this range are shown as individual data points. In this case no outliers are present. P values from unpaired two-tailed Student’s t-test. d, Plasma cell class morphology projected onto the embedding of a. e, DNA-fluorescence in-situ hybridization (FISH) results assessing hyperdiploidy of FACS-sorted plasma cells (CD138+ or CD319+) that were further subdivided by size (see also Extended Data Fig. 2e). Bar graphs represent 100 cells per class for four patient samples. Example FISH-image of sample MM147 indicating hyperdiploidy for three nuclei (right). Blue indicates 4,6-diamidino-2-phenylindole (DAPI) stain. Scale bar, 10 µm. f, Scatter-plot of percentage myeloma cells by PCY compared to evaluation by clinical cytology (n = 82 patient samples). Spearman’s rank and Pearson’s correlations and P values are indicated. g, Box-plot of percentage myeloma cells by PCY stratified by treatment stage (n = 86 patient samples). P values from multiple pairwise comparison of the group means using Tukey’s honestly significant difference criterion. Data are not adjusted for multiple comparisons. Box-plots as in c. h, Difference in percentage myeloma cells in longitudinal patient samples, normalized to the first sampling. 
Red indicates patients with less than PR; blue shows patients with PR or better. Box-plots as in c. P values from paired two-tailed t-test. AUC, area under the receiver operating characteristic curve; PD, progressive disease; SD, stable disease; MR, minimal response; PR, partial response; VGPR, very good partial response; CR, complete remission, as defined by the International Myeloma Working Group.