Extended Data Fig. 8: DeepGlioma molecular subgroup analysis.

Multiclass classification performance for molecular subgroup prediction by DeepGlioma, stratified by patient demographic information and prospective testing site, is shown. Results stratified by (a) age, (b) race, and (c) sex are shown. Multiclass classification performance remained high in each demographic subgroup relative to the entire cohort. DeepGlioma was trained to generalize to all adult patients and to be agnostic to patient demographic information. d, Confusion matrix of our benchmark multiclass model trained using categorical cross-entropy. DeepGlioma outperformed the multiclass model by +4.6% in overall patient-level diagnostic accuracy, with a substantial improvement in differentiating molecular astrocytomas and oligodendrogliomas. e, Direct comparison of subgrouping performance for our benchmark multiclass model, IDH1-R132H IHC, and DeepGlioma. Performance metric values are displayed. Means and standard deviations are plotted for both IDH subgrouping and molecular subgrouping. These results provide evidence that multimodal training and multi-label prediction improve performance over multiclass modeling. f, DeepGlioma molecular subgroup classification performance for each of the prospective testing medical centers is shown. Accuracy values with 95% confidence intervals (in parentheses) are shown above the confusion matrices. Overall performance was stable across the three largest contributors of prospective patients. Performance on the MUV dataset was comparatively lower; however, some improvement was observed during the LIOCV experiments. Red indicates the best performance.
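
As a rough illustration of how an accuracy value with a 95% confidence interval can be derived from a confusion matrix such as those in panel f, the sketch below uses a Wilson score interval. The caption does not state which interval method was used, so the Wilson choice, the function name `accuracy_with_ci`, and the example counts are all assumptions made for illustration, not the authors' procedure or data.

```python
import math


def accuracy_with_ci(confusion, z=1.96):
    """Return (accuracy, lower, upper) for a square confusion matrix.

    Assumption: a Wilson score interval with z = 1.96 (about 95%).
    The source caption does not specify the interval method.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    # Correct predictions lie on the diagonal.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    p = correct / total

    # Wilson score interval for a binomial proportion.
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    return p, centre - half_width, centre + half_width


# Hypothetical three-class counts (rows: true subgroup, columns: predicted).
example = [
    [50, 2, 1],
    [3, 40, 2],
    [1, 1, 50],
]
acc, lower, upper = accuracy_with_ci(example)
print(f"accuracy = {acc:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

The Wilson interval is used here because it stays within [0, 1] and behaves reasonably for the small per-site sample sizes typical of multicenter testing; other methods, such as Clopper-Pearson or bootstrapping, would give somewhat different bounds.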