Extended Data Fig. 2: Quality control metrics across our data after downsampling to account for 10x chemistry differences. | Nature Neuroscience

From: A cross-disease resource of living human microglia identifies disease-enriched subsets and tool compounds recapitulating microglial states

(A-F) Violin plots showing the distribution of our cellular data, with overlaid boxplots. The center line of each boxplot is the median, and the hinges of the box span the 25th to 75th percentiles. Whiskers extend 1.5 × the interquartile range (IQR) from the nearest hinge. Outliers, minima, and maxima are not shown in this visualization. Further information about metadata traits and the number of cells included in each violin plot may be found in Supplementary Table 1 under the ‘QC_’ tabs. After downsampling, the per-cell distributions of unique molecular identifiers (UMIs) and genes detected are similar across donors (A), clusters (B), genders (C), 10x chemistry versions (D), regions (E), and diagnoses (F). Notably, after downsampling, differences between 10x chemistry versions in these metrics are largely eliminated. (G) Validation of population stability by resampling and reclustering demonstrates that overlap of gene expression is largely observed for clusters from closely related families, such as 2 and 4, or for intermediate subsets such as 5 and 3. To evaluate clustering stability, we randomly sampled ¾ of the cells from our dataset and ran our clustering pipeline with identical parameters. We recorded the frequency of ‘misclassification’, where cells were re-clustered into a cluster different from the one that contained most cells with the same original classification. This process was repeated between pairs of cells, and repeated 50 times for each comparison. Cells were considered to be classified into the ‘correct’ class if they were assigned correctly in ¾ of classification runs. Otherwise, they were considered ‘misclassified’ into a different cluster. Classification frequency is visualized here as a heatmap.
LOAD late-onset Alzheimer’s disease, EOAD early-onset Alzheimer’s disease, MCI mild cognitive impairment, CNTRL control, DLBD-PD diffuse Lewy body disease-Parkinson’s disease, PSP progressive supranuclear palsy, TLE temporal lobe epilepsy, MS multiple sclerosis, ALS amyotrophic lateral sclerosis, FTD frontotemporal dementia, HD Huntington’s disease, DNET dysembryoplastic neuroepithelial tumor, BA Brodmann area, AWS anterior watershed, OC occipital cortex, TNC temporal neocortex, H hippocampus, TH thalamus, SC spinal cord, SN substantia nigra, FN facial nucleus.
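
The resampling check described for panel G can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`majority_map`, `stability`), the `cluster_fn` stand-in, and the toy data are all hypothetical, and reading "in ¾ of classification runs" as "in at least ¾ of the runs in which the cell was sampled" is an assumption. The sketch draws ¾ of the cells without replacement, reclusters them, maps each original cluster to the new cluster holding most of its cells, and counts how often each cell lands in that majority cluster.

```python
import numpy as np

def majority_map(original, new):
    """Map each original cluster to the new cluster that holds most of its cells."""
    mapping = {}
    for c in np.unique(original):
        labels, counts = np.unique(new[original == c], return_counts=True)
        mapping[c] = labels[np.argmax(counts)]
    return mapping

def stability(features, original, cluster_fn, n_runs=50, frac=0.75,
              threshold=0.75, seed=0):
    """Return per-cell 'correct' rate and a boolean 'stable' flag.

    cluster_fn is a hypothetical stand-in for a clustering pipeline:
    it takes a feature matrix and returns one label per row.
    """
    rng = np.random.default_rng(seed)
    n = len(original)
    sampled = np.zeros(n, dtype=int)   # runs in which each cell was drawn
    correct = np.zeros(n, dtype=int)   # runs in which it joined the majority cluster
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        new = np.asarray(cluster_fn(features[idx]))
        mapping = majority_map(original[idx], new)
        expected = np.array([mapping[c] for c in original[idx]])
        sampled[idx] += 1
        correct[idx] += (new == expected)
    # Cells never sampled get rate 0 (assumption: treated as not stable).
    rate = correct / np.maximum(sampled, 1)
    return rate, rate >= threshold

# Toy data: two well-separated groups and a trivial threshold "clusterer".
features = np.r_[np.full(40, -5.0), np.full(40, 5.0)].reshape(-1, 1)
original = np.r_[np.zeros(40, dtype=int), np.ones(40, dtype=int)]
rate, stable = stability(features, original,
                         cluster_fn=lambda X: (X[:, 0] > 0).astype(int))
```

Averaging `rate` within each original cluster, or tabulating original against reassigned labels, would give the kind of classification-frequency matrix that a heatmap can display.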
