Extended Data Fig. 2: Dataset, individual, and subtype representation in meta-modules.

a) All seven datasets are represented in the generation of meta-modules. Clusters within datasets display similar gene score distributions (left), resulting in at least 40% of the clusters in each dataset being represented in the meta-modules (middle). Datasets also show comparable distributions of module activity overall (right).
b) Developmental stages are equally represented in the meta-modules, with limited bias from pre-natal vs post-natal stages. Almost 50% of the cluster markers of each developmental stage are represented in the meta-modules (left), with almost 96% overlap between the markers of post-natal and pre-natal stages (middle). Only 18 of 225 modules harbor post-natal-specific markers (right), and notably none of these 18 is one of our example modules (highlighted in red).
c) Cluster markers of individuals are well represented in meta-modules. While individuals contributed clusters to meta-module generation to varying degrees (left), the cluster markers of each individual were represented in meta-modules at comparable levels (middle). Although 9 individuals from Nowakowski 2017 did not produce cluster markers, potentially due in part to their relatively low number of cells (< 103), over 75% of cluster markers for the remaining individuals were represented in modules (right), suggesting that no features of the 48 individuals in this dataset were lost due to merging and subsequent module construction.
d) Subtype diversity among module-positive cells, defined as cells in the top 90th percentile of module activity within their respective dataset. A histogram of the subtypes represented across module-positive cells (left) and a bar plot showing the subtype proportions within each module (right) demonstrate the large diversity of cell subtypes within module-positive cells, and the ability of modules to reflect gene programs orthogonal to conventional cell type-centric annotations.
e) Sankey plot demonstrating the representation (if present) of integrated meta-atlas cluster markers (left) among meta-modules (right). Genes marking the same subtype were often included in different meta-modules, or not included in meta-module generation at all, further highlighting the ability of meta-modules to elucidate biological features not fully encapsulated by cluster markers.
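
The definition of module-positive cells used in panel d (top 90th percentile of module activity within each dataset) can be illustrated with a minimal sketch. This is not the authors' code: the function name `module_positive_cells`, the dictionary-based inputs, the inclusive percentile method, and the use of `>=` at the threshold are all assumptions made for illustration, since the legend does not specify how the percentile was computed.

```python
from collections import defaultdict
from statistics import quantiles


def module_positive_cells(activity, dataset_of, percentile=90):
    """Return IDs of cells at or above the given percentile of module
    activity, computed separately within each dataset.

    activity   : dict, cell ID -> module activity score (hypothetical input)
    dataset_of : dict, cell ID -> dataset label (hypothetical input)
    """
    # Group scores by dataset so that each threshold is dataset-specific.
    by_dataset = defaultdict(list)
    for cell, score in activity.items():
        by_dataset[dataset_of[cell]].append(score)

    # Assumed percentile rule: inclusive method with linear interpolation.
    # quantiles(..., n=100) returns 99 cut points; index 89 is the 90th.
    thresholds = {}
    for name, scores in by_dataset.items():
        cuts = quantiles(scores, n=100, method="inclusive")
        thresholds[name] = cuts[percentile - 1]

    # A cell is module-positive if it reaches its own dataset's threshold.
    return {
        cell
        for cell, score in activity.items()
        if score >= thresholds[dataset_of[cell]]
    }
```

Computing the threshold per dataset, rather than once over all cells, keeps a dataset with systematically higher activity scores from contributing most of the module-positive cells.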