Extended Data Fig. 8: fastMNN batch-effect correction depends on the order of importing scRNA-seq data into the pipeline.
From: A multicenter study benchmarking single-cell RNA sequencing technologies using reference samples

Panels (a-c) show fastMNN results when the spiked-in (mixed) datasets (that is, 10X_LLU_Mix10, 10X_NCI_Mix5, 10X_NCI_Mix5_F, 10X_NCI_M_Mix5, 10X_NCI_M_Mix5_F, and 10X_NCI_M_Mix5_F2) were imported into the pipeline before the non-mixed datasets among the 20 scRNA-seq datasets of Scenario 1. (a) t-SNE and UMAP plots colored by dataset; (b) t-SNE and UMAP plots colored by cell type (HCC1395, red; HCC1395BL, blue); (c) a silhouette score of 0.52, indicating that fastMNN correctly separated the two cell types into two clusters representing breast cancer cells and B lymphocytes. Panels (d-f) show fastMNN results when the non-mixed datasets were imported into the pipeline before the mixed datasets. (d) t-SNE and UMAP plots colored by dataset; (e) t-SNE and UMAP plots colored by cell type; (f) a low silhouette score of 0.22, indicating that fastMNN had difficulty separating the two cell types in this case. Batch-effect correction was performed using fastMNN (SeuratWrappers v0.1.0), and silhouette width scores were calculated using the silhouette function from the R package cluster (v2.0.8). Datasets from 10x were down-sampled to 1,200 cells per dataset. The order of dataset input is shown at the top of each panel group (a-c or d-f).
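
To make the silhouette scores in panels (c) and (f) easier to interpret, the following is a minimal Python sketch of the standard average silhouette width. It is not the silhouette function from the R package cluster that was used for the figure. The function name `mean_silhouette`, the use of Euclidean distance, and the toy coordinates are all assumptions made for illustration. For each cell, the width is (b - a) / max(a, b), where a is the mean distance to cells with the same label and b is the mean distance to the cells of the nearest other label. Values near 1 indicate well-separated cell types, and values near 0 or below indicate mixing.

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette width over all points, using Euclidean distance."""
    # Group point indices by label (here, a label would be a cell type).
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # singleton cluster: width is defined as 0
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Toy 2-D embedding (hypothetical coordinates, not data from the study).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Labels that match the two spatial groups: cell types are well separated.
separated = ["HCC1395"] * 3 + ["HCC1395BL"] * 3
# Labels that alternate across the two groups: cell types are mixed.
mixed = ["HCC1395", "HCC1395BL"] * 3

score_separated = mean_silhouette(points, separated)
score_mixed = mean_silhouette(points, mixed)
```

In this toy example, the labeling that follows the two spatial groups gives a much higher average width than the alternating labeling. This is the same direction of difference as between the scores reported for panels (c) and (f).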