Systematic reconstruction of molecular pathway signatures using scalable single-cell perturbation screens

Abstract

Recent advancements in functional genomics have provided an unprecedented ability to measure diverse molecular modalities, but predicting causal regulatory relationships from observational data remains challenging. Here, we leverage pooled genetic screens and single-cell sequencing (Perturb-seq) to systematically identify the targets of signalling regulators in diverse biological contexts. We demonstrate how Perturb-seq is compatible with recent and commercially available advances in combinatorial indexing and next-generation sequencing, and perform more than 1,500 perturbations split across six cell lines and five biological signalling contexts. We introduce an improved computational framework (Mixscale) to address cellular variation in perturbation efficiency, alongside optimized statistical methods to learn differentially expressed gene lists and conserved molecular signatures. Finally, we demonstrate how our Perturb-seq derived gene lists can be used to precisely infer changes in signalling pathway activation for in vivo and in situ samples. Our work enhances our understanding of signalling regulators and their targets, and lays a computational framework towards the data-driven inference of an ‘atlas’ of perturbation signatures.

Fig. 1: Performing large-scale Perturb-seq across six cell lines and five biological signalling contexts.
Fig. 2: Mixscale for weighting and analysing CRISPRi perturbation effects.
Fig. 3: Cross-cell line analysis of perturbation effects to extract pathway gene signatures.
Fig. 4: Validation and application of Perturb-seq derived interferon pathway signatures.
Fig. 5: Immune-related pathway activation in PBMCs from patients with COVID-19.
Fig. 6: Spatial enrichment of TGFβ pathway activation in mouse healing intestine.

Data availability

Raw sequencing data that support the findings of this study have been deposited in GEO under accession code GSE281048. Processed data are available at Zenodo at https://zenodo.org/records/14518762 (ref. 119). Previously published datasets that were reanalysed here are available via either GEO (under accession codes GSE132080, GSE178429, GSE147405, GSE218033, GSE129390 and GSE169749) or the CZ CELLxGENE data portal (ref. 117). Source data are provided with this paper.

Code availability

Software implementing our approach is freely available as an open-source R package Mixscale (https://github.com/satijalab/Mixscale). A vignette demonstrating the application of Mixscale is also available as an online resource (https://satijalab.github.io/Mixscale/).

References

  1. Tang, F. et al. mRNA-seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).

  2. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).

  3. Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).

  4. Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21.29.1–21.29.9 (2015).

  5. Farlik, M. et al. Single-cell DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics. Cell Rep. 10, 1386–1397 (2015).

  6. Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).

  7. Rotem, A. et al. Single-cell ChIP-seq reveals cell subpopulations defined by chromatin state. Nat. Biotechnol. 33, 1165–1172 (2015).

  8. Mimitou, E. P. et al. Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells. Nat. Biotechnol. 39, 1246–1258 (2021).

  9. Stubbington, M. J. T., Rozenblatt-Rosen, O., Regev, A. & Teichmann, S. A. Single-cell transcriptomics to explore the immune system in health and disease. Science 358, 58–63 (2017).

  10. Van Hove, H. et al. A single-cell atlas of mouse brain macrophages reveals unique transcriptional identities shaped by ontogeny and tissue environment. Nat. Neurosci. 22, 1021–1035 (2019).

  11. Velmeshev, D. et al. Single-cell genomics identifies cell type-specific molecular changes in autism. Science 364, 685–689 (2019).

  12. Srivatsan, S. R. et al. Massively multiplex chemical transcriptomics at single-cell resolution. Science 367, 45–51 (2020).

  13. Mulder, K. et al. Cross-tissue single-cell landscape of human monocytes and macrophages in health and disease. Immunity 54, 1883–1900.e5 (2021).

  14. Tabula Sapiens Consortium* et al. The Tabula Sapiens: a multiple-organ, single-cell transcriptomic atlas of humans. Science 376, eabl4896 (2022).

  15. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).

  16. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).

  17. Shalem, O. et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science 343, 84–87 (2014).

  18. Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic screens in human cells using the CRISPR-Cas9 system. Science 343, 80–84 (2014).

  19. Li, W. et al. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol. 15, 554 (2014).

  20. Dixit, A. et al. Perturb-seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866.e17 (2016).

  21. Adamson, B. et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882.e21 (2016).

  22. Jaitin, D. A. et al. Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell 167, 1883–1896.e15 (2016).

  23. Shifrut, E. et al. Genome-wide CRISPR screens in primary human T cells reveal key regulators of immune function. Cell 175, 1958–1971.e15 (2018).

  24. Gasperini, M. et al. A genome-wide framework for mapping gene regulation via cellular genetic screens. Cell 176, 377–390.e19 (2019).

  25. Replogle, J. M. et al. Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq. Cell 185, 2559–2575.e28 (2022).

  26. The Gene Ontology Consortium. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res. 45, D331–D338 (2017).

  27. Jassal, B. et al. The reactome pathway knowledgebase. Nucleic Acids Res. 48, D498–D503 (2020).

  28. Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y. & Morishima, K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45, D353–D361 (2017).

  29. Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).

  30. Morris, J. A. et al. Discovery of target genes and pathways at GWAS loci by pooled single-cell CRISPR screens. Science 380, eadh7699 (2023).

  31. Yeo, N. C. et al. An enhanced CRISPR repressor for targeted mammalian gene regulation. Nat. Methods 15, 611–616 (2018).

  32. Börold, J. et al. BRD9 is a druggable component of interferon-stimulated gene expression and antiviral activity. EMBO Rep. 22, e52823 (2021).

  33. Banno, T., Gazel, A. & Blumenberg, M. Effects of tumor necrosis factor-α (TNFα) in epidermal keratinocytes revealed using global transcriptional profiling. J. Biol. Chem. 279, 32633–32642 (2004).

  34. Zeng, C.-M., Chen, Z. & Fu, L. Frizzled receptors as potential therapeutic targets in human cancers. Int. J. Mol. Sci. https://doi.org/10.3390/ijms19051543 (2018).

  35. Sato, M. et al. Distinct and essential roles of transcription factors IRF-3 and IRF-7 in response to viruses for IFN-α/β gene induction. Immunity 13, 539–548 (2000).

  36. Kubiczkova, L., Sedlarikova, L., Hajek, R. & Sevcikova, S. TGF-β - an excellent servant but a bad master. J. Transl. Med. 10, 183 (2012).

  37. Barkett, M. & Gilmore, T. D. Control of apoptosis by Rel/NF-κB transcription factors. Oncogene 18, 6910–6924 (1999).

  38. Honda, K., Takaoka, A. & Taniguchi, T. Type I interferon [corrected] gene induction by the interferon regulatory factor family of transcription factors. Immunity 25, 349–360 (2006).

  39. Gordon, M. D. & Nusse, R. Wnt signaling: multiple pathways, multiple receptors, and multiple transcription factors. J. Biol. Chem. 281, 22429–22433 (2006).

  40. Pico, A. R. et al. WikiPathways: pathway editing for the people. PLoS Biol. 6, e184 (2008).

  41. Sanson, K. R. et al. Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities. Nat. Commun. 9, 5416 (2018).

  42. Balliu, B. et al. An integrated approach to identify environmental modulators of genetic risk factors for complex traits. Am. J. Hum. Genet. 108, 1866–1879 (2021).

  43. Papalexi, E. et al. Characterizing the molecular regulation of inhibitory immune checkpoints with multimodal single-cell screens. Nat. Genet. 53, 322–331 (2021).

  44. Cook, D. P. & Vanderhyden, B. C. Context specificity of the EMT transcriptional response. Nat. Commun. 11, 2142 (2020).

  45. Rhead, B. et al. Global expression and CpG methylation analysis of primary endothelial cells before and after TNFa stimulation reveals gene modules enriched in inflammatory and infectious diseases and associated DMRs. PLoS ONE 15, e0230884 (2020).

  46. Hall, J. C. et al. Precise probes of type II interferon activity define the origin of interferon signatures in target tissues in rheumatic diseases. Proc. Natl Acad. Sci. USA 109, 17609–17614 (2012).

  47. Kumaran Satyanarayanan, S. et al. IFN-β is a macrophage-derived effector cytokine facilitating the resolution of bacterial inflammation. Nat. Commun. 10, 1–16 (2019).

  48. Tran, V. et al. High sensitivity single cell RNA sequencing with split pool barcoding. Preprint at bioRxiv https://doi.org/10.1101/2022.08.27.505512 (2022).

  49. Almogy, G. et al. Cost-efficient whole genome-sequencing using novel mostly natural sequencing-by-synthesis chemistry and open fluidics platform. Preprint at bioRxiv https://doi.org/10.1101/2022.05.29.493900 (2022).

  50. Squair, J. W. et al. Confronting false discoveries in single-cell differential expression. Nat. Commun. 12, 5692 (2021).

  51. Wang, T., Li, B., Nelson, C. E. & Nabavi, S. Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data. BMC Bioinform. 20, 40 (2019).

  52. Replogle, J. M. et al. Maximizing CRISPRi efficacy and accessibility with dual-sgRNA libraries and optimal effectors. eLife https://doi.org/10.7554/elife.81856 (2022).

  53. Alerasool, N., Segal, D., Lee, H. & Taipale, M. An efficient KRAB domain for CRISPRi applications in human cells. Nat. Methods 17, 1093–1096 (2020).

  54. Gilbert, L. A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).

  55. Xu, H. et al. Sequence determinants of improved CRISPR sgRNA design. Genome Res. 25, 1147–1157 (2015).

  56. Bosch, B. et al. Genome-wide gene expression tuning reveals diverse vulnerabilities of M. tuberculosis. Cell 184, 4579–4592.e24 (2021).

  57. Hawkins, J. S. et al. Mismatch-CRISPRi reveals the co-varying expression-fitness relationships of essential genes in Escherichia coli and Bacillus subtilis. Cell Syst. 11, 523–535.e9 (2020).

  58. Jost, M. et al. Titrating gene expression using libraries of systematically attenuated CRISPR guide RNAs. Nat. Biotechnol. 38, 355–364 (2020).

  59. Simmons, S. K. et al. Mostly natural sequencing-by-synthesis for scRNA-seq using Ultima sequencing. Nat. Biotechnol. 41, 204–211 (2023).

  60. Ivashkiv, L. B. IFNγ: signalling, epigenetics and roles in immunity, metabolism, disease and cancer immunotherapy. Nat. Rev. Immunol. 18, 545–558 (2018).

  61. Witten, D. M., Tibshirani, R. & Hastie, T. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10, 515–534 (2009).

  62. Jerby-Arnon, L. & Regev, A. DIALOGUE maps multicellular programs in tissue from single-cell or spatial transcriptomics data. Nat. Biotechnol. 40, 1467–1477 (2022).

  63. François-Newton, V. et al. USP18-based negative feedback control is induced by type I and type III interferons and specifically inactivates interferon α response. PLoS ONE 6, e22200 (2011).

  64. Basters, A., Knobeloch, K.-P. & Fritz, G. USP18 - a multifunctional component in the interferon response. Biosci. Rep. https://doi.org/10.1042/bsr20180250 (2018).

  65. Oshima, S. et al. Interferon regulatory factor 1 (IRF-1) and IRF-2 distinctively up-regulate gene expression and production of interleukin-7 in human intestinal epithelial cells. Mol. Cell. Biol. 24, 6298–6310 (2004).

  66. Harada, H. et al. Structurally similar but functionally distinct factors, IRF-1 and IRF-2, bind to the same regulatory elements of IFN and IFN-inducible genes. Cell 58, 729–739 (1989).

  67. Bien, J. & Tibshirani, R. Hierarchical clustering with prototypes via minimax linkage. J. Am. Stat. Assoc. 106, 1075–1084 (2011).

  68. Zhu, Q. et al. Single cell multi-omics reveal intra-cell-line heterogeneity across human cancer cell lines. Nat. Commun. 14, 8170 (2023).

  69. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  70. Alsamman, K. & El-Masry, O. S. Interferon regulatory factor 1 inactivation in human cancer. Biosci. Rep. https://doi.org/10.1042/bsr20171672 (2018).

  71. Pollaci, G. et al. Novel multifaceted roles for RNF213 protein. Int. J. Mol. Sci. https://doi.org/10.3390/ijms23094492 (2022).

  72. Grünvogel, O. et al. DDX60L is an interferon-stimulated gene product restricting hepatitis C virus replication in cell culture. J. Virol. 89, 10548–10568 (2015).

  73. Karlsson, M. et al. A single-cell type transcriptomics map of human tissues. Sci. Adv. https://doi.org/10.1126/sciadv.abh2169 (2021).

  74. Kang, H. M. et al. Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat. Biotechnol. 36, 89–94 (2018).

  75. Kartha, V. K. et al. Functional inference of gene regulation using single-cell multi-omics. Cell Genom. https://doi.org/10.1016/j.xgen.2022.100166 (2022).

  76. Rosain, J. et al. Human IRF1 governs macrophagic IFN-γ immunity to mycobacteria. Cell 186, 621–645.e33 (2023).

  77. Irving, A. T. et al. Interferon regulatory factors IRF1 and IRF7 directly regulate gene expression in bats in response to viral infection. Cell Rep. 33, 108345 (2020).

  78. Lei, X. et al. Activation and evasion of type I interferon responses by SARS-CoV-2. Nat. Commun. 11, 3810 (2020).

  79. Hadjadj, J. et al. Impaired type I interferon activity and inflammatory responses in severe COVID-19 patients. Science 369, 718–724 (2020).

  80. Lee, J. S. et al. Immunophenotyping of COVID-19 and influenza highlights the role of type I interferons in development of severe COVID-19. Sci. Immunol. https://doi.org/10.1126/sciimmunol.abd1554 (2020).

  81. Toro, A. et al. Pin-pointing the key hubs in the IFN-γ pathway responding to SARS-CoV-2 infection. Viruses https://doi.org/10.3390/v14102180 (2022).

  82. Gadotti, A. C. et al. IFN-γ is an independent risk factor associated with mortality in patients with moderate and severe COVID-19 infection. Virus Res. 289, 198171 (2020).

  83. Karki, R. et al. Synergism of TNF-α and IFN-γ triggers inflammatory cell death, tissue damage, and mortality in SARS-CoV-2 infection and cytokine shock syndromes. Cell 184, 149–168.e17 (2021).

  84. COvid-19 Multi-omics Blood ATlas (COMBAT) Consortium. A blood atlas of COVID-19 defines hallmarks of disease severity and specificity. Cell 185, 916–938.e58 (2022).

  85. Kong, L. et al. The landscape of immune dysregulation in Crohn’s disease revealed through single-cell transcriptomic profiling in the ileum and colon. Immunity 56, 444–458.e5 (2023).

  86. Parigi, S. M. et al. The spatial transcriptomic landscape of the healing mouse intestine following damage. Nat. Commun. 13, 828 (2022).

  87. Schubert, M. et al. Perturbation-response genes reveal signaling footprints in cancer gene expression. Nat. Commun. 9, 20 (2018).

  88. Beck, P. L. et al. Transforming growth factor-β mediates intestinal healing and susceptibility to injury in vitro and in vivo through epithelial cells. Am. J. Pathol. 162, 597–608 (2003).

  89. Oshima, H. et al. Suppressing TGFβ signaling in regenerating epithelia in an inflammatory microenvironment is sufficient to cause invasive intestinal cancer. Cancer Res. 75, 766–776 (2015).

  90. Penn, J. W., Grobbelaar, A. O. & Rolfe, K. J. The role of the TGF-β family in wound healing, burns and scarring: a review. Int. J. Burns Trauma 2, 18–28 (2012).

  91. Datlinger, P. et al. Ultra-high-throughput single-cell RNA sequencing and perturbation screening with combinatorial fluidic indexing. Nat. Methods 18, 635–642 (2021).

  92. Xu, Z., Sziraki, A., Lee, J., Zhou, W. & Cao, J. Dissecting key regulators of transcriptome kinetics through scalable single-cell RNA profiling of pooled CRISPR screens. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-01948-9 (2023).

  93. Lamb, J. et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313, 1929–1935 (2006).

  94. Calderon, D. et al. Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat. Genet. 51, 1494–1505 (2019).

  95. Urrutia, A. et al. Standardized whole-blood transcriptional profiling enables the deconvolution of complex induced immune responses. Cell Rep. 16, 2777–2791 (2016).

  96. Jiang, P. et al. Systematic investigation of cytokine signaling activity at the tissue and single-cell levels. Nat. Methods 18, 1181–1191 (2021).

  97. Cui, A. et al. Dictionary of immune responses to cytokines at single-cell resolution. Nature https://doi.org/10.1038/s41586-023-06816-9 (2023).

  98. Goeva, A. et al. HiDDEN: a machine learning method for detection of disease-relevant populations in case-control single-cell transcriptomics data. Nat. Commun. 15, 1–15 (2024).

  99. Song, B. et al. Decoding heterogenous single-cell perturbation responses. Nat. Cell Biol. https://doi.org/10.1038/s41556-025-01626-9 (2025).

  100. Xie, Y. et al. Comparative analysis of single-cell RNA sequencing methods with and without sample multiplexing. Int. J. Mol. Sci. https://doi.org/10.3390/ijms25073828 (2024).

  101. Rubin, A. J. et al. Coupled single-cell CRISPR screening and epigenomic profiling reveals causal gene regulatory networks. Cell 176, 361–376.e17 (2019).

  102. Liscovitch-Brauer, N. et al. Profiling the genetic determinants of chromatin accessibility with scalable single-cell CRISPR screens. Nat. Biotechnol. 39, 1270–1277 (2021).

  103. Pierce, S. E., Granja, J. M. & Greenleaf, W. J. High-throughput single-cell chromatin accessibility CRISPR screens enable unbiased identification of regulatory networks in cancer. Nat. Commun. 12, 2969 (2021).

  104. Mimitou, E. P. et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. Nat. Methods 16, 409–412 (2019).

  105. Wessels, H.-H. et al. Efficient combinatorial targeting of RNA transcripts in single cells with Cas13 RNA Perturb-seq. Nat. Methods 20, 86–94 (2023).

  106. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29 (2021).

  107. McGinnis, C. S. et al. MULTI-seq: sample multiplexing for single-cell RNA sequencing using lipid-tagged indices. Nat. Methods 16, 619–626 (2019).

  108. Satija, R. Barcoded plate-based single cell RNA-seq version 1. protocols.io https://doi.org/10.17504/protocols.io.nkgdctw (2018).

  109. Ahlmann-Eltze, C. & Huber, W. glmGamPoi: fitting Gamma-Poisson generalized linear models on single cell count data. Bioinformatics 36, 5701–5702 (2021).

  110. Mann, H. B. & Whitney, D. R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 18, 50–60 (1947).

  111. Marigorta, U. M., Rodríguez, J. A., Gibson, G. & Navarro, A. Replicability and prediction: lessons and challenges from GWAS. Trends Genet. 34, 504–517 (2018).

  112. Seshadri, S. et al. Genome-wide analysis of genetic loci associated with Alzheimer disease. JAMA 303, 1832–1840 (2010).

  113. Yasuda, K. et al. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat. Genet. 40, 1092–1097 (2008).

  114. Hosack, D. A., Dennis, G. Jr, Sherman, B. T., Lane, H. C. & Lempicki, R. A. Identifying biological themes within lists of genes with EASE. Genome Biol. 4, R70 (2003).

  115. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).

  116. Andreatta, M. & Carmona, S. J. UCell: robust and scalable single-cell gene signature scoring. Comput. Struct. Biotechnol. J. 19, 3796–3798 (2021).

  117. CZI Single-Cell Biology Program et al. CZ CELL×GENE Discover: a single-cell data platform for scalable exploration, analysis and modeling of aggregated data. Nucleic Acids Res. 53, D886–D900 (2025).

  118. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).

  119. Jiang, L. et al. Systematic reconstruction of molecular pathway signatures using scalable single-cell perturbation screens. Zenodo https://doi.org/10.5281/zenodo.14518762 (2024).

Acknowledgements

We thank all members of the Satija Laboratory at New York Genome Centre for useful discussion. We acknowledge the authors of the external datasets used in this study for making their valuable resources publicly available. This work was supported by the Chan Zuckerberg Initiative (EOSS5-0000000381 and HCA-A-1704-01895 to R.S.) and the National Institutes of Health (RM1HG011014-02 and 1OT2OD033760-01 and 5R01HD096770 to R.S.).

Author information

Contributions

L.J., C.D., E.P. and R.S. conceived the research. C.D., E.P., I.M., H.-H.W. and H.Y. performed experimental work. L.J. performed the computational work and developed the software tool with guidance from R.S. N.I., G.L.-Y. and D.L. performed the Ultima sequencing and generated the simulated paired-end fastq data. All authors participated in interpretation and in writing the manuscript.

Corresponding author

Correspondence to Rahul Satija.

Ethics declarations

Competing interests

In the past 3 years, R.S. has received compensation from Bristol Myers Squibb, ImmunAI, Resolve Biosciences, Nanostring, 10x Genomics, Parse Biosciences and Neptune Bio. R.S. and H.-H.W. are co-founders and equity holders of Neptune Bio. H.-H.W. has been an employee at Neptune Bio since August 2023. N.I., G.L.-Y. and D.L. are employees and shareholders of Ultima Genomics. E.P. has been an employee at Parse Biosciences since December 2021 and owns stock in the company. The other authors declare no competing interests.

Peer review

Peer review information

Nature Cell Biology thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Optimization of a guide capture method compatible with Parse Biosciences Evercode Whole Transcriptome kits.

(a). Schematic of sgRNA capture, barcoding, and library preparation. Please see https://zenodo.org/records/14518762 for a full protocol. (b). Schematic showing the location of tested guide additive primer binding sites. (c). Primer efficiency for cDNA amplification, measured by qPCR with annealing temperature 65 °C. Guide additive with phosphorothioate (*) bonds was ultimately chosen. (d). Primer efficiency of chosen guide additive primer for cDNA amplification, measured by qPCR at different annealing temperatures. (e). Percentage of cells assigned to each guide classification when using the original Parse cDNA amplification annealing temperature (some cycles at 67 °C and some cycles at 65 °C) compared to our modified annealing temperature (all cycles at 65 °C). (f). RNA UMI counts and genes per cell when using the original Parse annealing temperature compared to our modified annealing temperature.

Source data

Extended Data Fig. 2 Validation of Mixscale perturbation scores across different sgRNAs and datasets.

(a) Scatter plots illustrating the relationship between the expression level of the perturbation targets (y axis) and the perturbation scores (x axis) in each cell. This plot is analogous to Fig. 2b, but with cells stratified by guide RNA identity. (b) Single-cell heatmap for STAT1 perturbation in three cell lines after IFNγ stimulation, split by sgRNA identity. (c) Comparison of Mixscale score and target gene expression estimated in an external CRISPRi dataset (Jost et al., 2020, Nat. Biotechnol.). The figure displays the Mixscale score (y axis on the left) using black dots and the degree of knockdown of the target gene (y axis on the right) marked by red triangles. The x axis represents different sgRNAs, including the perfectly matched sgRNA (“_00”) and those with varying numbers of mismatched nucleotides. The plot shows that sgRNAs that result in more effective knockdown also result in cells with higher Mixscale scores. (d) Comparison of Mixscale score and relative activity of sgRNAs. Similar plot as in (c), but contrasting the Mixscale score (black dots) with the relative activity of the sgRNA (y axis on the right) marked by blue diamonds, a phenotypic measure of cellular growth defects measured from a viability screen. The plot shows that sgRNAs with the highest phenotypic activity also yield cells with the highest Mixscale scores. Source numerical data are available in Source Data.

Extended Data Fig. 3 Mixscale perturbation scores correlate with target gene knockdown across varying perturbation strengths in IFNβ pathway.

Scatter plots illustrating the relationship between the expression level of the perturbation targets (y axis) and the perturbation scores (x axis) in each cell from the IFNβ pathway experiment. Plots are similar to Fig. 2b, but the perturbations are ordered based on the number of DEGs identified by a standard Wilcoxon rank-sum test (shown in parentheses for each panel) between the perturbed cells and non-targeting controls. Plots show that Mixscale perturbation scores correlate with the degree of knockdown for the target regulator for strong perturbations (with a large number of DEGs), and also for weaker perturbations (with fewer DEGs). Source numerical data are available in Source Data.
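The DEG counts used to order the panels can be illustrated with a minimal sketch of a two-sided Wilcoxon rank-sum test (ref. 110) applied gene by gene. This is not the authors' pipeline: the function names, the normal approximation with tie correction (no continuity correction), the uncorrected threshold `alpha` and the toy expression values are all assumptions made for illustration.

```python
import math


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_pvalue(x, y):
    """Two-sided rank-sum P value via the tie-corrected normal approximation."""
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    ranks = midranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of (t^3 - t) over groups of t tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return 1.0  # all values identical: no evidence of a shift
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


def count_degs(perturbed, control, alpha=0.05):
    """Count genes with P < alpha (hypothetical, uncorrected threshold).

    perturbed, control: dicts mapping gene name -> list of expression values.
    """
    shared = [g for g in perturbed if g in control]
    return sum(rank_sum_pvalue(perturbed[g], control[g]) < alpha for g in shared)


# Toy values only; they are not taken from the study.
toy_perturbed = {"GENE_A": list(range(11, 21)), "GENE_B": [1, 2, 3, 4]}
toy_control = {"GENE_A": list(range(1, 11)), "GENE_B": [1, 2, 3, 4]}
print(count_degs(toy_perturbed, toy_control))
```

In practice a multiple-testing correction would normally be applied before counting DEGs; it is omitted here to keep the sketch short.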

Source data

Extended Data Fig. 4 False positive rate control and replication analysis of Mixscale differential expression testing.

(a) Comparison of false positive rates (FPRs) for the Mixscale weighted DE test (wmvReg) and the same test performed without weights (‘standard test’). FPRs are calculated at alpha = 0.05, 0.01, and 0.005. Tests were performed on ‘null’ perturbations, generated by shuffling cell sgRNA labels after calculating Mixscale scores (Methods). Number of perturbations used for each pathway: IFNβ (n = 52), IFNγ (n = 32), TGFβ (n = 15), and TNF (n = 29). (b) FPR calculations when shuffling is performed before calculating Mixscale scores. This situation is a conservative control. In a real dataset, the lack of an initially identified DEG set would abort the procedure (Methods) and assign uniform weights. Here, we force the assignment of Mixscale scores even though no DEGs are identified after shuffling. Number of perturbations used for each pathway: IFNβ (n = 52), IFNγ (n = 32), TGFβ (n = 15), and TNF (n = 29). (c) Boxplots for Mixscale DE test scores \(\chi^{2}\) (= \(\text{Z-score}^{2}\)) of the genes used in the calculation of these mis-specified scores, comparing methods with and without a leave-one-out (LOO) strategy (Methods). The red dashed line indicates the expected \(\chi^{2}\) under the null. Number of observations (genes) used for each panel: n10 = 3,206, n30 = 10,179, n50 = 16,975, and n100 = 33,956. In (a-c), each boxplot represents the distribution of FPR across all simulation replicates. The line inside each box indicates the median value, notches indicate the 95% confidence interval, the central box indicates the interquartile range (IQR), whiskers indicate data up to 1.5 times the IQR, and all data points are shown as separate dots. (d) Replication rate of DE genes that were uniquely identified by Mixscale, compared with the standard unweighted test. For example, in the IFNβ pathway dataset (Replicate 1), wmvReg and the unweighted test uniquely identified 3,656 and 744 DEGs, respectively, across regulators, and the plot shows the percentage that reproduce (P value cutoff for replication = 0.01) in the second replicate. (e) As in (d), but using the Wilcoxon rank-sum test instead of the unweighted test. Source numerical data are available in Source Data.

Source data
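
The false positive rates in Extended Data Fig. 4a,b can be read as the fraction of tests on label-shuffled (‘null’) perturbations whose P value falls below alpha. The following is a minimal sketch of that calculation, not the authors' code; the helper names `shuffle_labels` and `false_positive_rate` and the toy P values are assumptions introduced here for illustration.

```python
import random


def shuffle_labels(labels, seed=0):
    """Return a shuffled copy of per-cell sgRNA labels (hypothetical helper).

    Shuffling breaks the link between cells and guides, so any gene called
    differentially expressed afterwards is a false positive.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


def false_positive_rate(null_p_values, alpha):
    """Fraction of null-test P values strictly below alpha."""
    if not null_p_values:
        raise ValueError("need at least one P value")
    hits = sum(1 for p in null_p_values if p < alpha)
    return hits / len(null_p_values)


# Made-up P values standing in for DE tests run on shuffled labels.
null_p = [0.001, 0.03, 0.2, 0.5, 0.04, 0.8, 0.6, 0.009, 0.3, 0.95]
fpr_by_alpha = {
    alpha: false_positive_rate(null_p, alpha) for alpha in (0.05, 0.01, 0.005)
}
```

For a well-calibrated test, the value returned for each alpha should be close to alpha itself when computed over many null tests.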

Extended Data Fig. 5 Power analysis of differentially expressed genes and their reproducibility across different numbers of cells.

Analysis of how cell number affects DEG detection in the IFNβ and IFNγ pathways. For each pathway, we downsampled our Perturb-seq data (replicate 1) to k cells (k = 30–1,000) per cell line per condition and performed differential expression analysis (Bonferroni-corrected significance threshold, P < 0.05/30,000). We conducted parallel analyses on replicate 2 (after downsampling to the same number of cells as in replicate 1) and calculated the percentage of DEGs from replicate 1 that were reproduced in replicate 2 (P < 0.01; see Methods). (a,c) Numbers and reproducibility of DEGs with |fold change| ≥ 0.5 for the IFNβ pathway (a) and the IFNγ pathway (c). (b,d) Corresponding analyses for DEGs with |fold change| < 0.5 in the IFNβ pathway (b) and the IFNγ pathway (d).

Source data
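
The downsampling and replication procedure in Extended Data Fig. 5 combines three simple steps: subsample cells, call DEGs at the Bonferroni threshold 0.05/30,000, and count how many of those DEGs reach P < 0.01 in the second replicate. The sketch below illustrates these steps under stated assumptions; the function names, the toy gene labels and P values are hypothetical, and treating a gene that is absent from replicate 2 as not reproduced is an assumption (the exact criteria are in Methods).

```python
import random


def downsample(cells, k, seed=0):
    """Randomly keep at most k cells, sampling without replacement."""
    cells = list(cells)
    if len(cells) <= k:
        return cells
    return random.Random(seed).sample(cells, k)


def bonferroni_threshold(alpha, n_tests):
    """Per-test P value cutoff after Bonferroni correction."""
    return alpha / n_tests


def replication_percentage(discovery_p, replication_p,
                           discovery_cutoff, replication_cutoff=0.01):
    """Return (number of discovery DEGs, percentage reproduced).

    Genes missing from the replicate are counted as not reproduced
    (an assumption made for this sketch).
    """
    degs = [g for g, p in discovery_p.items() if p < discovery_cutoff]
    if not degs:
        return 0, None
    reproduced = sum(
        1 for g in degs if replication_p.get(g, 1.0) < replication_cutoff
    )
    return len(degs), 100.0 * reproduced / len(degs)


# Toy P values for four placeholder genes in two replicates (made-up numbers).
cutoff = bonferroni_threshold(0.05, 30000)
rep1 = {"G1": 1e-9, "G2": 5e-7, "G3": 1e-4, "G4": 1e-12}
rep2 = {"G1": 0.002, "G2": 0.2, "G4": 1e-5}
n_degs, pct_reproduced = replication_percentage(rep1, rep2, cutoff)
```

Here three genes pass the discovery threshold and two of them replicate, so the reproduced percentage is two thirds.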

Extended Data Fig. 6 Validation of Mixscale-specific differentially expressed genes using bulk RNA-seq data.

Comparison of DE genes identified by the Mixscale wmvReg test and by the standard unweighted test. (a-b) We generated independent bulk RNA-seq data of control and stimulated A549 cells (without genetic perturbations) for both the IFNγ and TGFβ pathways, with three technical replicates (n = 3) for each condition. In our Perturb-seq data, DEG analysis was run on the IFNGR1 target gene (a) and the TGFBR1 target gene (b). Heatmaps show the expression of shared genes (identified by both DE methods) and of genes that were uniquely identified by Mixscale+wmvReg. For both pathways, <5 DEGs were uniquely identified by the unweighted test. (c-d) Expression module score boxplots for both gene sets, quantifying the results in (a-b). These results indicate that the genes uniquely identified by Mixscale+wmvReg are indeed targets of the IFNγ (a,c) or TGFβ (b,d) pathway. Each boxplot represents the distribution of module scores across experimental replicates. The line inside each box indicates the median value, notches indicate the 95% confidence interval, the central box indicates the interquartile range (IQR), whiskers indicate data up to 1.5 times the IQR, and all data points are shown as separate dots. Source numerical data are available in Source Data.

Source data
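
The module scores in Extended Data Fig. 6c,d summarise the expression of a gene set in each sample as a single number. One common definition, used here purely as an illustration, is the mean expression of the gene set minus the mean expression of a set of control genes. The sketch below assumes this simplified definition; the legend does not state which scoring procedure was used, and the function name, the control gene labels and the expression values are hypothetical.

```python
from statistics import mean


def module_score(expression, signature_genes, control_genes):
    """Mean signature expression minus mean control expression for one sample.

    `expression` maps gene name -> normalised expression value. Genes that
    are absent from the sample are skipped.
    """
    signature = [expression[g] for g in signature_genes if g in expression]
    control = [expression[g] for g in control_genes if g in expression]
    if not signature or not control:
        raise ValueError("signature and control sets must overlap the sample")
    return mean(signature) - mean(control)


# Toy samples: two signature genes and two placeholder control genes.
signature_genes = ["IRF1", "STAT1"]
control_genes = ["CTRL1", "CTRL2"]
unstimulated = {"IRF1": 1.0, "STAT1": 2.0, "CTRL1": 1.0, "CTRL2": 2.0}
stimulated = {"IRF1": 4.0, "STAT1": 5.0, "CTRL1": 1.0, "CTRL2": 2.0}

score_unstimulated = module_score(unstimulated, signature_genes, control_genes)
score_stimulated = module_score(stimulated, signature_genes, control_genes)
```

Subtracting a control set keeps the score from simply tracking overall expression depth, so a higher score in the stimulated sample reflects the signature genes specifically.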

Extended Data Fig. 7 MultiCCA-derived perturbation programmes for IFNβ, TGFβ, and TNF pathways.

The first and second perturbation programmes for the IFNβ, TGFβ, and TNF pathways, identified by MultiCCA. Each panel, (a) IFNβ, (b) TGFβ, and (c) TNF, shows a heatmap in which columns represent correlated perturbations within and across cell lines, and rows list the programme’s top 15 downregulated genes and top 5 upregulated genes. As in Fig. 3g, the colour gradient in the heatmap cells reflects the DE test Z-scores for each gene under each perturbation. See Table 3 for a complete list of pathway programmes and the corresponding programme genes.

Source data

Extended Data Fig. 8 Comparative analysis of Perturb-seq and MSigDB pathway signatures.

(a) Venn diagrams showing the overlap between the MultiCCA programme 1 genes we identified and the MSigDB Hallmark gene lists for the IFNγ, TGFβ, and TNF pathways (Methods). (b) Density plot for the log10(counts per million) of MSigDB-unique, Perturb-seq-unique, and shared genes for each pathway, calculated in our Perturb-seq data (pseudobulk for all cells). The red dashed line indicates log10(CPM) at CPM = 20. The plot shows that genes unique to MSigDB have a very different expression profile from those either uniquely identified by Perturb-seq or shared between the two databases. (c) IFNβ module score comparing unstimulated and stimulated cells using the MSigDB-unique gene set. The plot shows that the MSigDB-unique gene sets effectively discriminate stimulated and control cells in only some cell types, in contrast with the Perturb-seq gene sets in Fig. 4c. Source numerical data are available in Source Data.

Source data
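
The expression measure in Extended Data Fig. 8b, log10(counts per million) on a pseudobulk of all cells, can be sketched as follows: sum the raw counts of each gene over all cells, scale by the total count, and take log10. This is an illustrative sketch rather than the authors' pipeline; the function name, the gene labels and the counts are hypothetical, and returning negative infinity for genes with zero counts (instead of adding a pseudocount) is an assumption.

```python
import math


def pseudobulk_log10_cpm(counts_per_cell):
    """Compute log10(CPM) per gene after summing counts over all cells.

    `counts_per_cell` is a list of dicts mapping gene -> raw count,
    one dict per cell.
    """
    totals = {}
    for cell in counts_per_cell:
        for gene, count in cell.items():
            totals[gene] = totals.get(gene, 0) + count
    library_size = sum(totals.values())
    if library_size == 0:
        raise ValueError("no counts to normalise")
    result = {}
    for gene, total in totals.items():
        if total > 0:
            result[gene] = math.log10(total * 1e6 / library_size)
        else:
            # Assumption: undetected genes map to -inf rather than a pseudocount.
            result[gene] = float("-inf")
    return result


# Two toy cells; gene A ends up at exactly 20 counts per million.
cells = [
    {"A": 10, "B": 499_990, "C": 0},
    {"A": 10, "B": 499_990, "C": 0},
]
log_cpm = pseudobulk_log10_cpm(cells)
threshold = math.log10(20)  # position of the dashed line at CPM = 20
```

In this toy input, gene A sits on the dashed line and gene B lies far above it.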

Extended Data Fig. 9 Validation of Perturb-seq derived pathway signatures using cytokine-stimulated datasets.

(a-b) Evaluating complete pathway and pathway-exclusive gene sets. Plots are as in Fig. 4b, but run on datasets of cells that are stimulated with a single cytokine as a positive control. (a) shows the results from human CD14 monocytes stimulated with IFNγ, and (b) shows results from the DU145 cell line stimulated with TNF and from OVCA420 cells stimulated with TGFβ (Methods). In each case, our Perturb-seq pathway lists show enrichment, but there is also enriched signal for alternative pathways because pathway gene sets include overlapping genes. Once we restrict the analysis to pathway-exclusive gene sets, only the correct pathway exhibits evidence of enrichment. The enrichment tests used in (a-b) were two-sided Fisher’s exact tests. (c) The module score for IFNγ pathway genes, IRF1-associated genes, and IRF1-independent genes calculated in an IRF1-KO bat PakiT03 cell dataset (Methods). The IRF1-associated and IRF1-independent genes were identified using IFNγ programmes 1 and 2 in our Perturb-seq data (Methods). Source numerical data are available in Source Data.

Source data

Extended Data Fig. 10 Perturb-seq derived gene signatures show cell-type-specific enrichment in Crohn’s disease.

Gene set enrichment tests for DEGs identified in patients with Crohn’s disease (CD) in an external dataset (Kong et al. 2023, Immunity) (Methods). The analysis includes inflamed and non-inflamed tissues from patients with CD. Each row indicates a gene set from our Perturb-seq data, and each column indicates a cell type from which the DEGs were obtained. Two-sided Fisher’s exact tests were used for the enrichment test. The odds ratio of the enrichment test is represented by the size of the dot, and the adjusted P value (after Benjamini–Hochberg correction) is represented by the colour gradient. Adjusted P values less than 0.01 are labelled with an asterisk. Source numerical data are available in Source Data.

Source data
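
The enrichment statistics in Extended Data Fig. 10 rest on two standard pieces: a two-sided Fisher’s exact test on a 2×2 overlap table (gene in set or not, gene a DEG or not) and Benjamini–Hochberg adjustment of the resulting P values. The sketch below implements both with the Python standard library as an illustration; the function names and the example tables are assumptions, and it is not the authors' code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns (sample odds ratio, P value). The P value sums the probabilities
    of all tables with the same margins that are no more likely than the
    observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Integer weights proportional to the hypergeometric probabilities.
    weights = {
        x: comb(row1, x) * comb(row2, col1 - x)
        for x in range(lowest, highest + 1)
    }
    observed = weights[a]
    extreme = sum(w for w in weights.values() if w <= observed)
    p_value = extreme / sum(weights.values())
    # Sample odds ratio; reported as infinity when b * c is zero.
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")
    return odds_ratio, p_value


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy overlap table: 8 set genes are DEGs, 2 are not; 1 non-set gene is a DEG, 5 are not.
odds, p = fisher_exact_two_sided(8, 2, 1, 5)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.01 for q in adjusted]  # cells that would receive an asterisk
```

Working with integer weights keeps the comparison against the observed table exact, which avoids the floating-point tolerance that would otherwise be needed when deciding which tables count as equally or less likely.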

Supplementary information

Supplementary Information

Supplementary Figs. 1–15.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–5.

Source data

Source Data All Figures

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jiang, L., Dalgarno, C., Papalexi, E. et al. Systematic reconstruction of molecular pathway signatures using scalable single-cell perturbation screens. Nat Cell Biol 27, 505–517 (2025). https://doi.org/10.1038/s41556-025-01622-z

  • DOI: https://doi.org/10.1038/s41556-025-01622-z
