Extended Data Fig. 10: Scanorama batch correction using 10x and non-10x scRNA-seq datasets from two different studies. | Nature Biotechnology


From: A multicenter study benchmarking single-cell RNA sequencing technologies using reference samples


(a, uncorrected) UMAP of 10 datasets (10x: PBMCs 68K, PBMCs 3K, CD19+ B cells, CD14+ monocytes, CD4+ helper T cells, CD56+ NK cells, CD8+ cytotoxic T cells, CD4+ CD45RO+ memory T cells, CD4+ CD25+ regulatory T cells; Drop-seq: PBMCs) out of the 26 datasets from Hie et al. (ref. 8), before batch correction by Scanorama. (b, corrected, colored by dataset) UMAP of the 10 datasets shown in (a) after batch correction by Scanorama, colored to identify the datasets. (c, corrected, colored by platform) UMAP of the 10 datasets shown in (a) after batch correction by Scanorama, colored to identify the two platforms used (10x Genomics and Drop-seq); note the poor results for Drop-seq. (d, uncorrected) UMAP of 8 datasets (breast cancer cells: C1_FDA_HT_A, C1_LLU_A, ICELL8_SE_A and ICELL8_PE_A; B lymphocytes: C1_FDA_HT_B, C1_LLU_B, ICELL8_SE_B and ICELL8_PE_B) out of the 20 datasets in our study, generated on three different non-10x platforms, before batch correction by Scanorama. (e, corrected, colored by dataset) UMAP of the 8 datasets shown in (d) after batch correction by Scanorama, colored to identify the datasets. Note the lack of discrimination between the different cell types. (f, corrected, colored by platform) UMAP of the 8 datasets shown in (d) after batch correction by Scanorama, colored to identify the platforms (C1_FDA_HT, blue; C1, purple; ICELL8, pink). The PBMC datasets were downloaded from http://scanorama.csail.mit.edu/data_light.tar.gz. Our eight datasets were preprocessed using the featureCounts pipeline, and batch-effect correction was performed using Scanorama v1.4.
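For readers unfamiliar with the correction step named in the legend, the following is a minimal sketch of how a list of cell-by-gene matrices might be passed to the Scanorama Python package. It is not the authors' pipeline. The toy matrices, the helper names make_toy_batches and correct_with_scanorama, and all parameter values are assumptions made for illustration, and the scanorama.correct call reflects the package's documented interface, which may differ in v1.4.

```python
import numpy as np


def make_toy_batches(n_cells=(150, 120), n_genes=200, shift=3.0, seed=0):
    """Build synthetic cell-by-gene matrices that differ by a constant offset.

    Hypothetical stand-in for real expression matrices; not data from the study.
    """
    rng = np.random.default_rng(seed)
    genes = [f"gene{i}" for i in range(n_genes)]
    batches = []
    for k, n in enumerate(n_cells):
        # Same underlying distribution, plus a per-batch offset mimicking a batch effect.
        x = rng.poisson(lam=5.0, size=(n, n_genes)).astype(float) + k * shift
        batches.append(x)
    # Scanorama takes one gene list per dataset and works on the shared genes.
    genes_list = [list(genes) for _ in batches]
    return batches, genes_list


def correct_with_scanorama(batches, genes_list):
    """Run Scanorama on a list of matrices (requires the `scanorama` package)."""
    import scanorama  # imported lazily so the toy-data helper works without it

    # With return_dimred=True the call also returns low-dimensional integrated
    # embeddings alongside the corrected expression matrices.
    integrated, corrected, genes = scanorama.correct(
        batches, genes_list, return_dimred=True
    )
    return integrated, corrected, genes


if __name__ == "__main__":
    toy_batches, toy_genes = make_toy_batches()
    print([b.shape for b in toy_batches])
```

The integrated embeddings returned by correct_with_scanorama could then be passed to a UMAP implementation and colored by dataset or by platform, as in panels (b), (c), (e) and (f).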
