Extended Data Fig. 1: Quality control of snATAC-seq data. | Nature Genetics


From: Integrating genetics with single-cell multiomic measurements across disease states identifies mechanisms of beta cell dysfunction in type 2 diabetes


(a) Steps for snATAC-seq data processing and quality control. (b) Representative quality control metrics for each donor. Distributions of log10 total reads, fraction of reads overlapping promoters, fraction of reads overlapping peaks, and fraction of reads overlapping mitochondrial DNA for cells from library JYH809, shown as an example. Blue vertical lines denote thresholds of a minimum of 1,000 fragments, 15% of fragments overlapping promoters, 30% of fragments overlapping peaks, and 10% of reads overlapping mitochondrial DNA, respectively. Red vertical lines denote thresholds used to identify the top 1% of barcodes with extremely high total fragment number and fraction of reads overlapping promoters and peaks, respectively. (c) Representative cell clustering from library JYH809. (d) Promoter chromatin accessibility in a 5 kb window around the TSS for endocrine marker genes in individual nuclei from library JYH809. Total counts normalization and log-transformation were applied. (e) Cell clustering of chromatin accessibility profiles from all donors. (f) Representative low-quality cluster and subcluster. Cells in cluster 14 (top, highlighted in red) have significantly fewer unique fragments than cells in other clusters (p = 2.3e-9, n = 255,598 cells). Cells in subcluster 1 (bottom, highlighted in red) have a significantly lower fraction of reads overlapping peaks than cells in other clusters (p = 4.8e-5, n = 16,296 cells). Data are shown as mean ± S.E.M.; ANOVA with sex, age, BMI and disease status as covariates. (g) Log10 total reads, fraction of reads overlapping peaks and fraction of reads in promoters for cells from each cluster in Fig. 1b. Data are shown as mean ± S.E.M. (h) Promoter chromatin accessibility in a 5 kb window around the TSS for selected endocrine and non-endocrine marker genes for each profiled cell (alpha: GCG, beta: INS-IGF2, delta: SST, gamma: PPY, acinar: REG1A, ductal: CFTR, stellate: PDGFRB, endothelial: CLEC14A, immune: CCL3). The UMAP projection is the same as in main Fig. 1b. (i) Genome browser tracks showing aggregate read density (scaled to a uniform read depth of 1 × 10⁶) for cells within each cell type cluster at hormone gene loci for endocrine islet cell types. The gene body of each gene is highlighted.
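
As an illustration only, the per-barcode thresholds listed for panel (b) could be applied along the lines of the following Python sketch. The `Barcode` class, its field names, and the helper functions are hypothetical and are not taken from the authors' pipeline. The direction of each comparison (a minimum for fragment number, promoter fraction and peak fraction; a maximum for mitochondrial fraction) is an assumption, since the legend gives only the threshold values. The legend also does not say how the top 1% barcodes were handled, so the sketch only computes a nearest-rank 99th-percentile cutoff.

```python
from dataclasses import dataclass


@dataclass
class Barcode:
    """Hypothetical per-barcode QC summary; field names are illustrative."""
    name: str
    n_fragments: int
    frac_promoter: float
    frac_peaks: float
    frac_mito: float


# Threshold values quoted in panel (b); comparison directions are assumed.
MIN_FRAGMENTS = 1000
MIN_FRAC_PROMOTER = 0.15
MIN_FRAC_PEAKS = 0.30
MAX_FRAC_MITO = 0.10


def passes_qc(b: Barcode) -> bool:
    """Return True if the barcode meets all four thresholds."""
    return (
        b.n_fragments >= MIN_FRAGMENTS
        and b.frac_promoter >= MIN_FRAC_PROMOTER
        and b.frac_peaks >= MIN_FRAC_PEAKS
        and b.frac_mito <= MAX_FRAC_MITO
    )


def percentile_cutoff(values, percentile=99):
    """Nearest-rank percentile: values strictly above the result are the top tail."""
    ordered = sorted(values)
    # Integer ceiling of len * percentile / 100, avoiding float rounding.
    rank = max(1, -(-len(ordered) * percentile // 100))
    return ordered[rank - 1]


barcodes = [
    Barcode("AAAC", 5200, 0.22, 0.45, 0.02),  # meets every threshold
    Barcode("AAAG", 800, 0.25, 0.50, 0.01),   # too few fragments
    Barcode("AACT", 3100, 0.20, 0.40, 0.35),  # high mitochondrial fraction
]
kept = [b.name for b in barcodes if passes_qc(b)]
print(kept)
```

A barcode is kept only if it meets all four criteria at once; the percentile helper can be called separately on fragment counts, promoter fractions, or peak fractions to locate the equivalent of the red lines.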

