Main

Acute lymphoblastic leukemia (ALL) is the most common pediatric cancer and leading cause of mortality1. Outcomes in B cell ALL (B-ALL) have improved drastically because of optimization of chemotherapy2, development of targeted therapies3,4,5,6,7 and genetically guided risk stratification8. In contrast, while outcomes have improved in T cell ALL (T-ALL), most persons who relapse are considered with low or favorable risk at diagnosis and few targeted therapies have successfully translated into the clinic9. Outcomes for persons with relapsed T-ALL are dismal as there are no effective salvage options. Accordingly, T-ALL is a disease where the goal is to use the most effective therapy at diagnosis. There is an urgent need to identify biologic risk factors to inform the development of targeted therapeutics and enable early identification of high-risk persons who need alternative treatment strategies10.

Systematic T-ALL classification used by the World Health Organization11 and International Consensus Classification12 relies on immunophenotype to characterize the maturation stage of T-ALL blasts and has not reproducibly been associated with clinical importance. Recent T-ALL genomic initiatives have refined the transcriptomic classification of T-ALL, providing a unifying framework to link genetic alterations with outcome13. Among the emerging biomarkers for high-risk disease are the high prevalence of tumor subclones that could contribute to relapse13,14,15 and clonal selection16,17 that occurs with treatment failure. Here, we used single-cell multiomics to map the tumor landscape of >595,000 T-ALL blasts to the full hierarchy of pediatric hematopoiesis. We identify and characterize a chemotherapy-resistant and steroid-resistant bone marrow progenitor-like (BMP-like) tumor population shared between high-risk persons across the immunophenotypic spectrum of T-ALL. We used single-cell multiomics, large-cohort bulk genomics and primary patient-derived xenograft (PDX) models to establish multiomic signatures for rapid risk assessment and test the sensitivities of BMP-like blasts to currently available targeted therapeutics.

Results

Multiomic developmental atlases define T-ALL cellular arrest

We studied 25 participants with early T cell precursor (ETP)-ALL, 5 participants with near-ETP-ALL and 10 participants with non-ETP-ALL with varied clinical response to treatment with AALL0434 (NCT00408005), an international phase 3 randomized Children’s Oncology Group (COG) trial that reported the best published outcomes for children and young adults with T-ALL10 (Fig. 1a,b). We selected participants who quickly went into minimal residual disease (MRD)-negative remission and were cured (n = 16), participants who had intrinsic chemotherapy resistance (refractory, enriched for cases with induction failure; n = 10) and participants who relapsed (n = 14) (Supplementary Tables 1 and 2). We had nine genetic subtypes represented, with the most common being the ‘ETP-like’ transcriptional subtype13 (n = 24), followed by T cell leukemia homeobox 3 (n = 5) and T-ALL protein 1 αβ-like (n = 3). We performed cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) and single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) on live cells sorted from cryopreserved diagnostic bone marrow aspirate (n = 32) or peripheral blood mononuclear cells (PBMCs; n = 8), capturing over 600,000 high-quality cells across both modalities (Fig. 1c,d, Extended Data Fig. 1a–j and Supplementary Tables 3 and 4). To robustly phenotype our single-cell dataset within the context of normal hematopoietic development (Fig. 1e,f), we assembled a multiomic reference map of healthy pediatric hematopoiesis, using normal thymus and bone marrow tissues collected from children (Fig. 2a–h and Supplementary Tables 5 and 6).

Fig. 1: Arrest states of T-ALL subtypes in reference to human hematopoiesis.
figure 1

a, Selection of n = 25 participants with ETP-ALL, n = 5 participants with near-ETP-ALL and n = 10 participants with non-ETP-ALL from the COG AALL0434 cohort (n = 1,411) based on response to induction therapy (day 29 MRD). b, UMAP representation of bulk RNA-seq data from n = 1,335 diagnostic T-ALL samples from COG AALL0434. Each point represents the bulk RNA-seq transcriptome for one participant. Participants selected for single-cell study are indicated by circular points. All participants with ETP are colored red. c, UMAP representation of CITE-seq (n = 271,603 cells) and scATAC-seq datasets (n = 332,663 cells; because of the size of the peak × cell matrix, 60,000 randomly downsampled cells are plotted). d, UMAP representation of healthy human hematopoiesis development reference trajectories, based on scRNA-seq (left; n = 49,623 cells) and scATAC-seq (right; n = 23,618 cells) data. The key stages of T cell development implicated are labeled. α/β, alpha–beta; γδ, gamma–delta. e, Arrest states of leukemic cells from 40 participants with T-ALL based on projection to healthy scRNA-seq (left) and scATAC-seq (right) reference. The D value from a two-sample Kolmogorov–Smirnov test is indicated to the side of brackets (*P < 2.2 × 10−16). Ten participants with T/M MPAL and ten participants with AML sequenced using identical assays are included as comparator groups (n = 60: ETP-ALL, 25; near-ETP-ALL, 5; non-ETP-ALL, 10; T/M MPAL, 10; AML, 10). f, Left, proportion of ETP blasts in four key T cell developmental stages, as compared to other participants with T-ALL. Right: proportion of ETP blasts in three key myeloid developmental stages, as compared to participants with T/M MPAL and AML. P values are based on a two-sided Mann–Whitney test. Results based on scRNA-seq projection are shown. The BMP stage encapsulates multipotent progenitors: HSPC, LMPP, CLP or ETP. The α/β stage encapsulates all cells that have moved past T cell commitment: DP, α/β, α/β (mature) or naive T. The box includes the median, hinges mark the 25th and 75th percentiles and whiskers extend 1.5 times the interquartile range (n = 60: ETP-ALL, 25; near-ETP-ALL, 5; non-ETP-ALL, 10; T/M MPAL, 10; AML, 10). Mono, monocyte; NK, natural killer.

Source data

Fig. 2: Multiomic developmental atlases to define cellular arrest state of leukemic blasts.
figure 2

a, Sample composition of scRNA-seq (n = 49,623 cells) and scATAC-seq (n = 23,618 cells) reference maps. b,c, UMAP representation of developmental reference trajectories, based on scRNA-seq (b; n = 49,623 cells) and scATAC-seq (c; n = 23,618 cells) data. The key stages of T cell development implicated in T-ALL and low-frequency thymic populations are labeled. d, Cell type composition of scRNA-seq (n = 49,623 cells) and scATAC-seq (n = 23,618 cells) reference maps. e,f, UMAP representation of scRNA-seq (n = 49,623 cells) and scATAC-seq (n = 23,618 cells) developmental reference trajectories. A total of 26 distinct cell populations defined by clustering and marker gene expression are labeled. g, Stem, T and S-phase marker gene expression (scRNA-seq, log-normalized counts) within developing T cell populations (n = 49,623 cells). h, Stem, T cell lineage and effector TF motif accessibility (scATAC-seq and chromVAR-Z) within developing T cell populations. Cells were randomly downsampled to n = 100 cells per group.

Source data

We mapped T-ALL blasts to the hierarchy of human hematopoietic development, overcoming the limitations of using references restricted by bone marrow18, thymus19 or murine thymocyte signatures15. To assess the integrity of our reference mapping method, we additionally projected single-cell RNA sequencing (scRNA-seq) and scATAC-seq data from ten acute myeloid leukemia (AML) and ten T–myeloid mixed-phenotype acute leukemia (T/M MPAL) samples onto our reference, finding that AML blasts projected to the monocytic lineage and T/M MPAL blasts projected to both monocytic and T cell lineages, as expected (Fig. 1e). Notably, all subtypes of leukemia showed a spectrum of developmental arrest states with notable overlap at the subpopulation level. Arrest in a multipotent progenitor-like (hematopoietic stem and progenitor cell (HSPC) or lymphomyeloid primed progenitor (LMPP)) state represented a shared cell state in ETP-ALL, T/M MPAL and AML. Pro-T cell-like arrested blasts were shared between T/M MPAL and all three subtypes of T-ALL, while pre-T cell-like arrested blasts were shared between near-ETP-ALL and non-ETP-ALL.

A primary hypothesis for ETP-ALL treatment resistance has been that ETP-ALL retains myeloid populations that confer resistance to ALL-directed therapy20,21. We found the median frequency of myeloid-projecting blasts (granulocyte macrophage progenitor (GMP), dendritic cell (DC) progenitor, plasmacytoid DC, conventional DC or monocyte) to be 0.167% in participants with ETP-ALL, in contrast to 16% and 82.5% in participants with T/M MPAL and AML, respectively (Fig. 1f). GMP-like and monocytic-like populations comprised <1% of blasts in 18/25 participants with ETP-ALL and were not detected in 5/25 participants, strongly supporting a lymphoid progenitor origin of ETP-ALL blasts and use of ALL-directed therapy.

We hypothesized that the developmental arrest spectrum of near-ETP-ALL would be similar to that of ETP-ALL. Therefore, we enriched our cohort with near-ETP cases from participants who relapsed (5/5 participants). The developmental arrest spectrum of near-ETP-ALL was closer to non-ETP-ALL, an unexpected finding given that near-ETP-ALL is defined by the ETP immunophenotype with the exception of high (>75%) CD5 expression11,12. To assess whether the divergence of ETP and near-ETP developmental arrest spectra could explain the diverging clinical responses of near-ETP and ETP cases observed within AALL0434 (Extended Data Fig. 2a), we performed differential expression (DE) and differential chromatin accessibility (DA) analyses (Extended Data Fig. 2b,c).

Near-ETP blasts had downregulation of stem and myeloid markers (SPINK2, C1QTNF4 and HLA-DRA) and upregulation of T cell receptor (TCR)-related machinery (LAT, CD3E, CD28, LCK and PTCRA) as compared to ETP-ALL, in line with commitment to the T cell lineage22,23. Near-ETP blasts also had increased expression and motif accessibility of TCF7 and LEF1, two transcription factors (TFs) central to T cell lineage commitment in healthy thymocytes24. Within healthy thymocytes, we observed the expression and accessibility of TCF7 and LEF1 to peak at the pre-T cell stage (Extended Data Fig. 2d). To ask whether TCF7 and LEF1 have an analogous T cell lineage-specific function in T-ALL blasts, we computationally inferred the regulators and targets of TCF7 and LEF1 in ETP-ALL, near-ETP-ALL and non-ETP-ALL by constructing subtype-specific transcriptional regulatory networks (Extended Data Fig. 2e–g). Our analysis predicted the repression of TCF7 and LEF1 expression by stem-related TFs25,26 (MEF2C, IRF1 and LYL1) in ETP-ALL and activation of TCF7 and LEF1 by core TFs of T cell commitment27 (BCL11B, SIX6 and TCF7L2) in near-ETP-ALL and non-ETP-ALL. Our analysis also predicted the direct regulation of CD5 expression and TCR-related genes (LAT, DNTT, MAL and CD3E) by TCF7 and LEF1 in near-ETP blasts, connecting TCF7 and LEF1 regulation to the CD5-bright phenotype observed within diagnostic flow cytometry. Elevated expression of our predicted TCF7 and LEF1 regulon was observed in bulk RNA-seq data of n = 110 participants with ETP and n = 168 participants with near-ETP (Extended Data Fig. 2h). Participants with ETP and near-ETP with higher expression of the TCF7 and LEF1 signature had more favorable outcomes within AALL0434 (Extended Data Fig. 2i–k), with the TCF7 and LEF1 signature having prognostic significance independent of MRD and central nervous system (CNS) status in participants with ETP-ALL (92.7% versus 79.3% 5-year overall survival (OS), P = 0.024; Extended Data Fig. 2j). Taken together, our results highlight functionally significant transcriptional regulatory circuits that underlie minute immunophenotypical differences.

BMP-like blasts are highly enriched in refractory ETP-ALL

High rates of treatment resistance, rather than eventual relapse, contribute to poor outcome in ETP-ALL28. Within AALL0434, participants with ETP-ALL were 7.1-fold less likely to achieve remission (<5% bone marrow blasts by morphology) after the first month of chemotherapy (day 29 end of induction (EOI)) and >2-fold more likely to have flow-detectable MRD compared to participants without ETP.

We enriched our single-cell (SC) cohort for treatment-refractory ETP cases to identify tumor cell states associated with initial treatment resistance. We first compared the initial developmental arrest state of ten participants with high-EOI-MRD (>20%) ETP to ten participants with EOI-MRD-negative (0.1%) ETP (Fig. 3a,b) and asked whether response to chemotherapy was correlated with the fraction of actively cycling tumor cells. Although we observed a small enrichment of cycling cells in MRD-negative participants (20% versus 16%; Fig. 3c), we observed all treatment-refractory participants to have significant proportions of cycling cells, prompting us to investigate whether treatment response could be explained by differences in cell arrest state. Within scRNA-seq and scATAC-seq, we observed that participants with high-EOI-MRD ETP had an enrichment of blasts at the HSPC, LMPP, common lymphoid progenitor (CLP) or ETP developmental stages (Fig. 3d). Multilineage potency is retained29,30 in these states; we termed this cell state ‘BMP-like’. In contrast, participants with MRD-negative ETP had an enrichment of blasts in the pro-T cell and pre-T cell stages. These states represent specification to the T cell lineage31,32; we henceforth refer to them as ‘T-specified’. We observed the proportion of tumor blasts in BMP-like and T-specified developmental stages to associate with day 29 MRD (Fig. 3d,e), event-free survival (EFS; Fig. 3f) and OS (Fig. 3f–h) and these populations were inversely correlated (Extended Data Fig. 3a–e) in single-cell-sequenced participants with ETP-ALL.

Fig. 3: Treatment resistance in ETP-ALL is associated with a BMP-like population.
figure 3

a, Selection of ten high-MRD and ten MRD-negative (control) participants from n = 123 participants with ETP-ALL diagnosed within COG AALL0434. b, OS of n = 10 high-MRD versus n = 10 MRD-negative participants profiled using single-cell genomics. The P value for the log-likelihood statistic of a Cox proportional hazard test with day 29 MRD as a covariate is shown. c, Proportion of non-cycling (G1: MRD-negative, n = 23,913 cells; high-MRD, n = 25,727 cells), cycling (S: MRD-negative, n = 2,862 cells; high-MRD, n = 2,274 cells) and dividing (G2M: MRD-negative, n = 6,125 cells; high-MRD, n = 5,499 cells) cells in n = 10 high-MRD versus n = 10 MRD-negative participants. The chi-squared test statistic and P value were computed by comparing the proportion of cells in the G1 versus non-G1 phase in each group (n = 3,500 cells per participant, 33,500 cells per group). d,e, Arrest states of leukemic cells from n = 10 high-MRD and n = 10 MRD-negative participants with ETP-ALL based on projection to the healthy scRNA-seq (left) and scATAC-seq (right) reference trajectory: proportion ranges from 0 to 0.3 (d) and from 0 to 0.5 (e). The D values from a two-sample Kolmogorov–Smirnov (K–S) test are indicated by the brackets (*P < 2.2 × 10−16; n = 25 participants with ETP-ALL: n = 6 induction failure, n = 4 high risk, n = 7 intermediate risk and n = 3 low risk). f, Proportion of ETP blasts in BMP-like and T-specified (T-spec) developmental stages in n = 25 single-cell-sequenced participants with ETP-ALL. The P values from two-sided t-tests are shown above the brackets. Alive indicates participants (n = 16/25) who were alive at last known follow-up (mean = 2,091 days). No event indicates participants (n = 13/25) who had no event at last known follow-up (mean = 2,108 days). g, Proportions of BMP-like blasts in n = 25 single-cell-sequenced ETP-ALL blasts were stratified into high (n = 11 participants) and low (n = 14 participants) using k-means clustering. h, Stratification of n = 25 single-cell-sequenced participants with ETP-ALL by BMP-like proportion (high: >30%, n = 11; low: <30%, n = 14) determined through k-means clustering (k = 2). i, DE surface markers between BMP-like blasts from non-responding participants and T-specified blasts from responding participants. The input matrix to DE analysis was a matrix of G1-phase ETP-ALL blasts with an equal number of cells per participant (n = 1,711 cells per participant and n = 42,775 cells in total). j, DEGs between BMP-like blasts from non-responding participants and T-specified blasts from responding participants. The input matrix to DE analysis was a matrix of G1-phase ETP-ALL blasts with an equal number of cells per participant (n = 1,711 cells per participant and n = 42,775 cells in total). k,l, Normalized gene (k) and surface marker (l) expression for DE BMP-like genes across cell subpopulations in T-ALL, AML, MPAL and healthy donors (HD). Cells were randomly downsampled to n = 200 in each comparison group. m, Stratification of n = 113 participants with ETP from AALL0434 using 119 DEGs between BMP-like and T-specified blasts obtained in d. The prognostic value of the BMP-like signature (BMP-DE-sig) in multivariate analysis (with day 29 MRD, CNS status, WBC count and age at diagnosis) is shown below the Cox proportional hazard log-likelihood P value with day 29 MRD as a covariate. Left: stratification with signature alone. Right: stratification with signature and EOI MRD status.

Source data

To test whether the molecular signatures of BMP-like and T-specified blasts could be used to stratify other participants with ETP within AALL0434, we performed DE and DA analyses to generate cell-type-specific molecular signatures (Supplementary Tables 79). We found that BMP-like blasts from non-responding participants with ETP had a high surface protein expression of myeloid and stem cell markers, including CD33, CD123, HLA-DR and CD34, and a low expression of T cell lineage surface protein markers, including CD3, CD4, CD2 and CD10 (Fig. 3i). The top DE genes (DEGs) for BMP-like blasts included stem cell31,33,34 (C1QTNF4, CD44, LGALS1 and HOPX), B cell lineage (IGLL1, IGKC and IGHM) and myeloid (S100A4 and SPINK2) lineage markers (Fig. 3j). Although relatively enriched in BMP-like blasts, the myeloid lineage genes and surface markers were expressed at lower levels than myeloid blasts and healthy myeloid cells (Fig. 3k–l). The top DE TFs in BMP-like blasts included TFs associated with self-renewal (MEF2C, HOXA3HOXA6, HOXA9HOXA11, MEIS1, HHEX and SPI1)26,35,36 and recovery from genotoxic stress37 (BCL11A) (Extended Data Fig. 3f). Several of the BMP-like genes involved in self-renewal (MEF2C, HOXA9 and FLI1) or T cell developmental block38,39 (SPI1) also had increased motif accessibility (Extended Data Fig. 3g,h). In contrast, the top DEGs for T-specified blasts included T cell differentiation proteins (MAL), TCR components (LAT, CD3E, LCK and TRGC2) and thymic honing molecules (CD99). The top DE TFs in T-specified blasts included NOTCH targets40 (HES4 and HES1) and core T cell commitment TFs22,23 (BCL11B, TCF7 and RUNX1). Interestingly, the top T-specified DEG by fold change was PRSS2, a trypsin gene lying adjacent to the TCRβ locus and becomes highly expressed as the TCRβ locus prepares for rearrangement41. We also assessed how the analogous populations differ between responding and non-responding participants through DEG and DE TF motif analyses (Extended Data Fig. 4a–d) and gene set enrichment analysis (GSEA) (Extended Data Fig. 5a–e), revealing that BMP-like non-responding blasts exhibited a similar differential phenotype compared to both T-specified responding and T-specified non-responding blasts.

Previous single-cell studies have also suggested that a portion of thymic seeding progenitors (TSPs) are more stem-like than others, retaining similar markers of multipotency seen in BMP-like ETP blasts19,29,42. We transcriptomically matched BMP-like ETP-ALL blasts to the most stem-like subset of TSPs (Extended Data Fig. 6a–c), revealing two putative mechanisms of treatment resistance. We found corticosteroid receptor (NR3C1) expression to be directly correlated with T cell differentiation state and NR3C1 expression highly predictive of EOI MRD (Extended Data Fig. 6d,e). BMP-like blasts had a significantly lower expression of NR3C1, rendering them highly resistant to prednisolone (>80-fold increase in half-maximal inhibitory concentration (IC50)) in vitro (Extended Data Fig. 6f). We also predicted that BMP-like blasts would have high self-renewal capacity, much like TSPs43. We observed the upregulation of leukemic stem-cell-related transcriptional programs in BMP-like blasts, associated with >100-fold resistance in vitro to T-ALL induction agents (Extended Data Fig. 6g,h).

Next, we sought to understand the transcriptional regulation of the BMP-like and T-specified states. We examined the TF motifs that were enriched in regions correlated with promoter accessibility of the BMP-like and T-specified DEGs (Supplementary Table 10). Motifs for TFs involved in the maintenance of progenitors (SPI1 and GABPA) were enriched in the BMP-like state and TFs involved in T cell specification (TCFL2 and LEF1) were enriched in the T-specified state (Extended Data Fig. 7a). We constructed transcriptional regulatory networks using SCENIC+ (ref. 44), which nominated MEF2C as the top activator of the BMP-like state. We found that TCF7L2 and BCL11B were the most specific activators of the T-specified state but also served repressive roles in the BMP-like state, suggesting reciprocal regulation by these TFs (Extended Data Fig. 7b,c). Transcriptional activation of these regulons in the bulk RNA-seq cohort demonstrated that they are associated with OS across subtypes (Extended Data Fig. 7d–f).

Lastly, we applied the BMP-like DEG signature to stratify n = 110 bulk-sequenced participants with AALL0434 ETP-ALL. Our 119-gene BMP-like signature was predictive of OS (66.4% versus 94.3% 5-year OS, P = 5 × 10−6) and EFS (68.2% versus 94.3% 5-year EFS, P = 2 × 109) independent of EOI MRD and CNS status (Fig. 3m, left). Our BMP-like DEG signature provided robust stratification when MRD status was considered as a binary variable (Fig. 3m, right), identifying a subset of EOI-MRD-negative participants with inferior survival (high BMP-like and MRD-negative, 75.0% 5-year OS) and a group of EOI-MRD-positive participants with favorable survival outcomes (low BMP-like and MRD-positive, 94.9% 5-year OS).

Distinct mutations underlie BMP-like and T-specified states

We next hypothesized that BMP-like and T-specified ETP blasts would harbor distinct mutational spectra. We leveraged the intersection of single-cell derived phenotypes with structural variant and single-nucleotide variant (SNV) calls (Supplementary Table 11) to identify associated drivers of these cell states (Fig. 4a,b and Extended Data Fig. 8a–c). BMP-like leukemias harbored fusion products known to drive high HOXA cluster expression, including MLLT10, KMT2A, NUP214 and direct HOXA::TCR fusions. T-specified leukemias had ZFP36L2 fusions (involved in cell-cycle control during T cell β selection), TLX1 and TLX3. Notably, all participants with ETP who died from disease on AALL0434 had BMP-like associated fusions (Fig. 4b). On the SNV level, BMP-like high participants had recurrent mutations in TF and signaling pathways (that is, JAK3, NRAS, WT1, ETV6 and SATB1) while T-specified high participants had mutations in T cell lineage regulators (that is, IL7R, NOTCH1 and RUNX1). Top BMP-like associated SNVs were associated with inferior outcome, while top T-specified SNVs showed the opposite trend (Fig. 4c–e and Extended Data Fig. 8d,e).

Fig. 4: Divergent mutational spectra associated with T-specified and BMP-like state.
figure 4

a, Recurrently mutated genes (left) and fusion drivers (right) among single-cell-sequenced participants with ETP-ALL versus tumor BMP-like and T-specified proportions among 25 participants with ETP-ALL. b, Recurrent driver fusions (n = 16, recurrent in >2 samples; left) and recurrently mutated genes (n = 26, recurrent in >5 samples; right) among 113 bulk-sequenced participants with ETP-ALL13 and associated BMP-like and T-specified signature scores. c, Top recurrently mutated genes associated with BMP-like and T-specified cell states in n = 110 ETP-ALL samples from AALL0434. Significance was assessed using a two-sided Wilcoxon rank-sum test. d,e, OS (d) and single-cell signature scores (e) among n = 110 bulk-sequenced participants with ETP-ALL grouped by mutation status within BMP-like (ETV6, NRAS, HLA-C and SAT1B; mutant (mut), n = 39; WT, n = 71) and T-specified (NOTCH1, IL7R, RUNX1 and SUZ12; mut, n = 60; WT, n = 50) associated genes. The P value for the log-likelihood statistic of a Cox proportional hazard test with day 29 MRD as a covariate is shown at the bottom left of the Kaplan–Meier curves. The box includes the median, hinges mark the 25th and 75th percentiles and whiskers extend 1.5 times the interquartile range. f, Arrest states of leukemic cells from NOTCH1-WT (n = 19) and NOTCH1-activated (n = 6) leukemias based on scRNA-seq and scATAC-seq developmental trajectories. The D value from a two-sample Kolmogorov–Smirnov test is indicated to the side of brackets (*P < 2.2 × 10−16). g, Proportion of leukemic cells in BMP-like and T cell lineage (pro-T cell to αβ) cell states in NOTCH1-WT and NOTCH1-mut in single-cell-profiled participants (n = 19 NOTCH1-WT and n = 6 NOTCH1-mut). Significance was assessed using a two-sided Wilcoxon rank-sum test. h, T-specified signature score among 110 bulk-sequenced participants with ETP-ALL from AALL0434. Participants are divided into three groups by NOTCH1 mutation status. P values from a two-sided Mann–Whitney test are shown. The box includes the median, hinges mark the 25th and 75th percentiles and whiskers extend 1.5× the interquartile range (n = 69 NOTCH1-WT, n = 23 NOTCH1-mut (single) and n = 18 NOTCH1-mut (two or more)). i, Summed VAF of NOTCH1 mutations in participants with AALL0434 ETP with 0 (WT, n = 69), 1 (n = 23) and two or more NOTCH1 (n = 18) mutations. P values from a two-sided Mann–Whitney test are indicated. The box includes the median, hinges mark the 25th and 75th percentiles and whiskers extend 1.5× the interquartile range. j, OS of n = 110 participants with AALL0434 ETP by NOTCH mutation status. The P value for the log-likelihood statistic of a Cox proportional hazard test with day 29 MRD as a covariate is shown to the bottom left.

Source data

In line with its essential role in T cell lineage commitment, the most recurrently altered gene associated with either cell state was NOTCH1. Within our single-cell cohort, we observed that NOTCH1-mutant participants (n = 6) had divergent arrest spectra compared to participants with wild-type (WT) NOTCH1 (n = 19) (Fig. 4f), with a notable depletion of BMP-like blasts (Fig. 4g). We found that T-specified blasts had DE NOTCH activation pathways compared to BMP-like blasts (Extended Data Fig. 8f) and NOTCH1 activation represented a universal marker of treatment sensitivity within the larger bulk-sequenced cohort. Of 110 bulk-sequenced participants with ETP-ALL, 41 harbored NOTCH1 mutations, with 18 having two or more mutations (range: 2–5). Participants with two or more NOTCH1 mutations had higher T-specified signature scores (Fig. 4h), aligning with an elevated NOTCH1 variant allele frequency (VAF; >50% versus <50%) (Fig. 4i). Remarkably, all 18 participants with two or more NOTCH1 mutations in the bulk-sequenced cohort were alive at last known follow-up, outperforming participants with single NOTCH1 mutations and WT NOTCH1 (Fig. 4j).

Interestingly, in EOI-MRD-negative participants with BMP-like associated drivers (KMT2A and MLLT10 fusions), two of three harbored tumors with multiple activating NOTCH1 mutations, suggesting that pan-tumor NOTCH1 activation can drive differentiation away from the BMP-like state (Fig. 5a). To study NOTCH1-mutant subclones at the single-cell level, we performed genotyping of transcriptomes (GoT) on two participants harboring a total of seven unique activating NOTCH1 mutations (Fig. 5b). We successfully detected 7/7 NOTCH1 mutations in scRNA-seq libraries, corroborating bulk-derived variant calls (Fig. 5c,d). We found that NOTCH1-mutant cells were predominantly in the T-specified state and identified hundreds of leukemic blasts carrying two distinct NOTCH1 mutations, likely resulting from selection for NOTCH1 mutation in separate alleles (Fig. 5e,f). We found a direct connection between NOTCH1 mutation dosage and the T-specified cell state, whereby blasts with two unique mutations had the highest expression of T-specified genes and lowest expression of BMP-like genes (Fig. 5g,h). Lastly, we analyzed the RNA-seq data of DND-41 and THP-6 cell lines treated with NOTCH pathway γ-secretase inhibitors45,46 (GSIs), finding that NOTCH pathway inhibition induced exposure-dependent transcriptomic shifts toward the BMP-like cell state (Fig. 5i,j). Taken together, these data offer high-resolution insight into how NOTCH1 mutations alter T-ALL developmental arrest and cement NOTCH1 mutation status as a critical biomarker for response to conventional therapy.

Fig. 5: Subclonal NOTCH1 mutations additively contribute to differentiation toward the T-specified state.
figure 5

a, Identification of n = 7 subclonal NOTCH1 mutations in n = 3 MRD-negative participants with BMP-like associated fusion drivers. b, Left: experimental workflow for detection of NOTCH1-mut leukemic blasts in scRNA-seq libraries through GoT. cb, cell bar code; umi, unique molecular identifier. c, Summary statistics from n = 9,314 genotyped cells from seven independent GoT experiments. P values from a chi-square test are shown (***P < 2.2 × 10−16). Right: expression of NOTCH1-mut UMIs amongst ETP blast and non-ETP blast populations in scRNA-seq data. P values from a two-sided Mann–Whitney test are shown (***P < 2.2 × 10−16). The median of both populations and 90th percentile of UMI reads in non-blast populations are indicated within each violin plot. frac, fraction. d, GoT detection of 7/7 mut transcripts determined from bulk WES and WGS (n = 7,754 cells; n = 5,625 NOTCH1-mut: PAVLKA, n = 3,352; PAWGWD, n = 2,273). e, Fraction of BMP-like and T-specified cells within NOTCH1-mut cells (n = 5,625 cells: PAVLKA, n = 3,352; PAWGWD, n = 2,273). f, Detection of n = 1,971 cells harboring two unique NOTCH1 mutations within leukemic blasts from PAVLKA (n = 1,649) and PAWGWD (n = 322). g, Association of BMP-like and T-specified signature scores with NOTCH1 mutation dosage in single cells. Cells are binarized into zero (WT; PAVKLA, n = 134; PAWGWD, n = 2,365), one (PAVKLA, n = 1,545; PAWGWD, n = 1,739) and two (PAVKLA, n = 1,649; PAWGWD, n = 322) mutations. P values from a two-sided Mann–Whitney test are shown. h, Association of NOTCH1-mut transcript expression and BMP-like and T-specified signatures (n = 7,754 genotyped single cells from PAWGWD and PAVLKA are plotted). The Pearson correlation coefficient and P value for Pearson’s product moment are indicated to the top right. i, Transcriptome similarity of GSI-treated DND-41 T-ALL cells (n = 3 per condition) with T cell developmental stages identified in scRNA-seq data. j, Upregulation of BMP-like transcriptomic signature in GSI-treated DND-41 (GSE173872, n = 3 per condition) and THP-6 (near-ETP; GSE138659, n = 3 per condition) T-ALL cell lines. Cell lines were scored using the 119 BMP-like signatures established in Fig. 2. P values from a two-sided t-test are shown.

Source data

BMP-like genetic signatures risk-stratify non-ETP-ALL

Relapsed and refractory T-ALL is nearly universally fatal. Given that BMP-like blasts are highly resistant to conventional T-ALL therapy, we wondered whether an analogous, less differentiated subpopulation could be responsible for treatment resistance and relapse in non-ETP-ALL. Analysis of bulk RNA-seq data from the AALL0434 non-ETP cohort supported this hypothesis, with DE between 355 MRD-positive and 714 MRD-negative non-ETP-ALL cases revealing gross differences in differentiation state (Supplementary Table 12). Within this analysis, MRD-negative participants overexpressed markers of the α/β stage (CD1B, CD1E, MAL, CD8A, PTCRA, RAG1 and RAG2), while MRD-positive participants expressed immature forms of the TCR (TRGC1 and TRGC2) and stem-related TFs (HHEX and LYL1). We used single-cell multiomics data to determine whether these differences were mediated by cell state or cell proportion differences.

We stratified our ten participants without ETP into two groups: six EOI-MRD-negative participants with complete response (CR) and four EOI-MRD-positive participants (EOI MRD > 0.1%). We observed that non-CR participants had an enrichment of cell states before T cell commitment (pro-T cell, CLP, LMPP, megakaryocyte–erythroid progenitor (MEP) and HSPC), while CR participants had an enrichment of cells in postcommitment states (CD4+CD8+ double-positive (DP) or α/β) (Fig. 6a,b). Because precommitted blasts represent a continuum of cell states from BMP-like to pro-T cell-like, we compared the distribution of precommittment blasts in CR participants and non-CR participants. We found that MRD-positive participants harbored a strong enrichment of BMP-like blasts, which were nearly absent in CR participants (Fig. 6c). DE analysis generated signatures for non-ETP precommitted and BMP-like blasts (Supplementary Tables 13 and 14), which both stratified participants without AALL0434 ETP independent of MRD, with the BMP-like signature having slightly better stratification (Fig. 6d).

Fig. 6: A consensus 17-gene BMP-like signature predicts OS across all subtypes of T-ALL.
figure 6

a, Arrest states of leukemic cells from CR (n = 6) and MRD-positive (n = 4) participants with non-ETP-ALL based on projection to healthy scRNA-seq (left) and scATAC-seq (right) reference trajectory. b, Proportion of non-ETP blasts in precommitment (pre-commit, all cells before the pre-T cell stage) and postcommitment (post-commit, after the pre-T cell stage) developmental arrest state. P values from a two-sided Mann–Whitney test are indicated (n = 68,801 cells; total cells: CR, 52,971; MRD-positive, 15,830; precomitted blasts: CR, 7,152; MRD-positive, 11,047). c, Arrest states of precommitted non-ETP blasts in CR (n = 6) and MRD-positive (n = 4) participants. BMP-like encapsulates all cells that possess multipotent potential (HSPC, LMPP, CLP or ETP). d, Kaplan–Meier plot showing OS of participants with non-ETP-ALL in AALL0434 when binarized using signatures (sig) derived from precommitted non-ETP blasts and BMP-like non-ETP blasts. The P value for the log-likelihood statistic of a Cox proportional hazard test with day 29 MRD as a covariate is shown to the bottom left. e, Overlap of ETP BMP-like and non-ETP BMP-like DEGs to create consensus signature for risk stratification in AALL0434 (fully sequenced) and AALL1231 (partially sequenced). BMP-like DEGs were filtered for mean log2 fold change (FC) > 0.9 between ETP and non-ETP comparisons. f, Expression score of BMP-17 signature score and BMP-17 marker genes within bone marrow and thymus scRNA-seq reference (n = 49,623 cells). Multipotent BMP populations with high BMP-17 expression are circled. g,h, Kaplan–Meier plot showing OS of bulk RNA-seq participants with T-ALL in AALL0434 (n = 1,335 participants) and AALL1231 (n = 75 participants) binarized using the BMP-17 signature. The prognostic value of the BMP-17 signature in multivariate analysis (with day 29 MRD, CNS status, WBC count and age at diagnosis) is shown below the Cox proportional hazard log-likelihood P value with day 29 MRD as covariate. i,j, Change in BMP-like and T-specified signature scores in AALL0434 diagnosis and relapse sample pairs (n = 27: near-ETP, 4; non-ETP, 23). P values from a two-sided paired t-test are shown. The box includes the median, hinges mark the 25th and 75th percentiles and whiskers extend 1.5× the interquartile range.

Source data

We next intersected the molecular signature of BMP-like blasts obtained from participants with ETP-ALL and non-ETP-ALL, revealing a shared BMP-like gene set (‘BMP-17’) composed of 17 marker genes typically expressed in stem, myeloid and B cell progenitors (Fig. 6e,f and Supplementary Table 15). We applied BMP-17 to five different clinical scenarios for risk stratification, observing robust risk stratification in all instances (Fig. 6g,h). BMP-17 powered risk stratification within the smaller, partially sequenced AALL1231 cohort and was prognostic independent of EOI MRD and CNS status within the fully sequenced AALL0434 cohort, including participants stratified by ETP status. To further test whether BMP-like subpopulations were enriched at relapse time points, we performed RNA-seq on 27 relapsed T-ALL cases within AALL0434. We identified a consistent enrichment of BMP-like gene signatures and decrease in T-specified signatures within relapse cases, further supporting the use of BMP-like signatures for risk stratification (Fig. 6i,j).

We next determined the surface immunophenotype of BMP-like blasts using CITE-seq data, revealing a nine-marker phenotype (‘BMP-surface-9’) that reflected similar lineage aberrancy to BMP-17 (Fig. 7a–c and Supplementary Table 16). We found that RNA expression of BMP-surface-9 genes robustly stratified AALL0434 and AALL1231 T-ALL cohorts (Fig. 7d–h) and further validated BMP-surface-9 in AALL0434 diagnostic flow cytometry cases (n = 99), finding a robust correlation of clinically used surface markers with the BMP-17 gene signature (Fig. 7i,j).

Fig. 7: Clinical utility of the BMP-like surface immunophenotype in risk stratifying participants with T-ALL.
figure 7

a, Overlap of ETP BMP-like and non-ETP BMP-like DE surface markers to create consensus surface marker signature for risk stratification in AALL0434 (fully sequenced) and AALL1231 (partially sequenced). Positive surface markers were filtered for log2FC > 0.5 and adjusted P value < 0.001. Negative surface markers were filtered for log2FC < −0.5 and adjusted P value < 0.001. b, Aggregate signature score of BMP-surface-9 signature (AUC of positive markers − AUC of negative markers) calculated using AUCell in scRNA-seq reference (n = 49,623 cells). The T cell developmental trajectory is indicated with an arrow. The progenitor populations are circled. c, RNA expression of BMP-surface-9 marker genes within scRNA-seq reference of normal hematopoiesis (n = 49,623 cells). Positive marker genes are shown in the top row; negative marker genes are shown in the bottom row. Left: the AUC of positive and negative surface marker genes within healthy hematopoiesis. dh, Kaplan–Meier plot showing the OS of bulk RNA-seq participants with T-ALL in AALL0434 and AALL1231. Participants in each analysis were binarized using RNA-seq-derived expression of BMP-surface-9 signature genes. Participants are grouped by subtype, with non-subtyped participants (n = 194) grouped with participants without ETP. The prognostic value of the BMP-surface-9 signature in multivariate analysis (with day 29 MRD, CNS status, WBC count and age at diagnosis) is shown below the Cox proportional hazard log-likelihood P value with day 29 MRD as a covariate. i, Pearson correlation of percentage expression of T, pan, stem and myeloid flow cytometric markers among n = 99 participants without ETP. P values from a correlation test are shown. j, Average percentage expression of stem or myeloid (CD117, CD34, HLA-DR, CD13 and CD33) and T (CD4, CD1A, CD3, CD5, CD8 and CD2) in BMP-high versus BMP-low non-ETP cases. BMP-high and BMP-low were defined using bulk RNA-seq and matched with flow cytometric values by unique sequence index. P values from a two-sided Mann–Whitney test are shown. The box includes the median, hinges mark the 25th and 75th percentiles and whiskers extend 1.5× the interquartile range (n = 56 BMP high; n = 43 BMP low).

Source data

Because non-pediatric T-ALL is enriched for both treatment-refractory cases and ETP phenotype, we next applied BMP-17 to young adult (age ≥ 18) cases on AALL0434. This analysis isolated a subset of high-BMP cases (Extended Data Fig. 9a–c) with high rates of EOI MRD and induction failure (Extended Data Fig. 9d,e) and reduced OS and EFS (Extended Data Fig. 9f). Lastly, we sought to determine whether we could make these signatures more parsimonious using machine learning regression, narrowing the BMP-17, BMP-surface-9 and BMP-119 signatures to 6, 3 and 20 genes, respectively, while maintaining or improving prognostic stratification (Extended Data Fig. 9g–i). Together, our results support a treatment paradigm for T-ALL involving early genetic screening for the chemotherapy-refractory BMP-like phenotype.

BMP-like blasts are uniquely sensitive to BCL-2 inhibition

The universal existence of BMP-like populations across treatment-refractory T-ALL cases prompted us to develop a pipeline for the discovery of BMP-like directed targeted therapy. To support the modeling of BMP-like therapy response, we first expanded blasts from 22 single-cell-sequenced participants in NOD scid gamma (NSG) mice (Fig. 8a). scRNA-seq on engrafted blasts from 16 participants indicated strong retention of specific features (Fig. 8b), with BMP-high and BMP-low participants maintaining their respective phenotypes after engraftment (Fig. 8c–f).

Fig. 8: Nomination and preclinical validation of targeted therapy against BMP-like blasts.
figure 8

a, Total engrafted PDX by subtype, ETP-ALL PDX by MRD status and BMP-like proportion. P values from a two-sided proportion test are shown (n = 22 PDX models: ETP, 9; near-ETP, 5; non-ETP, 6). b, UMAP representation of n = 16 primary participant samples and n = 16 corresponding PDX models profiled using scRNA-seq (n = 131,168 cells: primary, 93,458; PDX, 37,710). PDX engrafted blasts are connected to their primary sample by arrows. c, Proportion of BMP-projected blasts (HSPC, LMPP or CLP) in n = 16 participant–PDX pairs. Left: samples are ordered by the proportion of BMP-projected blasts in the primary sample. Participants with detectable blasts (>1%) are considered BMP-positive, while participants with >25% blasts are considered BMP high. USI, unique study identifier. d, Proportion of T-specified projected blasts (pro-T cell and pre-T cell) in each participant–PDX pair (n = 16). e, Difference between BMP-like and T-specified signatures scored using AUCell on bulk RNA-seq samples from n = 16 participants. f, OS and EFS swimmer plot of n = 16 participants with paired PDX models. Events are labeled. Relapse is indicated by R. g, Computational screening approach used to identify targetable genes within BMP-like blasts. A total of 552 BMP-like blast-specific DEGs (FDR < 0.05) were overlapped with drug target and dependency databases and ranked on the basis of the number of database hits and DE scores. The top ten targets by aggregate score are highlighted in red. h, A panel of 40 drugs was tested on PDX engrafted blasts from n = 5 BMP-high and n = 5 BMP-low participants (n = 10 participants: ETP, 4; near-ETP, 2; non-ETP, 4). i, Representative dose–response curves for n = 4 nominated therapeutics that showed differential activity in BMP-high (n = 3) versus BMP-low (n = 2) leukemias. j, Relative activity of drugs active in BMP-high versus BMP-low leukemias (n = 5 each). Mean values are shown. k, Correlations between drug sensitivity (−log2IC50) and the scRNA-seq derived BMP-like percentage (top) and the BMP-like signature score computed using 119 DEGs on bulk RNA-seq data (bottom). Bottom: the bulk RNA-seq correlations include the data from this study (n = 10) and data from Lee et al.47 (venetoclax, n = 28; nelarabine, n = 25; prednisolone, n = 107; mercaptopurine, n = 101). Spearman’s correlations and significance are shown.

Source data

We performed computational screening for targets specific to BMP-like blasts (Fig. 8g). We queried 552 BMP-like genes against three drug target databases (TTD, DrugIDB and OpenTargets), one transcriptomic-based compound screening database (LINCS1000) and one cancer gene vulnerability database (DepMap). The consensus results nominated four druggable surface proteins (CD44, LGALS1, ITGA4 and CD74), three homeostatic enzymes (S100A4, BCL-2 and Hsp90), two signal transduction molecules (Syk and Btk) and one TF (BCL-11A) (Extended Data Fig. 10a–d and Supplementary Table 17).

To test these computational predictions, we first performed in vitro drug screening using an established panel of 40 leukemia active drugs (Fig. 8h and Supplementary Table 18). PDX-expanded blasts from five of the 16 participants were screened using a stromal cell coculture system and dose–response curves were generated for each compound (Fig. 8i). Of 40 compounds, nine were active in all five PDX models and eight were active in at least one PDX model (Extended Data Fig. 10e,f). These compounds had different activity across BMP-high participants (n = 3) and BMP-low participants (n = 2) (Fig. 8i and Supplementary Table 19). After adding an additional five PDX models (n = 5 BMP-high and n = 5 BMP-low), we confirmed that BMP-high participants had increased sensitivity to venetoclax and navitoclax, while BMP-low participants were more sensitive to conventional cytotoxics (prednisolone, mercaptopurine and daunorubicin) (Fig. 8j). These associations were observed using bulk transcriptomic signatures across an integrated cohort47 (Fig. 8k and Extended Data Fig. 10g). Although BTK returned as a four-database hit, comparable ibrutinib sensitivity was seen in BMP-like and T-specified blasts, likely resulting from DE of Tec kinases48 between immature and mature T cells (Extended Data Fig. 10h).

The strong in vitro activity of BCL-2 inhibitors against BMP-like blasts prompted us to initiate in vivo efficacy studies in BMP-high (PATTDP, n = 6) and BMP-low (PAUNDK, n = 9) PDX models (Extended Data Fig. 10i–m). In the peripheral blood, venetoclax treatment resulted in the halting of disease progression in BMP-low models compared to control (Extended Data Fig. 10j). However, after the conclusion of the study, BMP-low PDX models still harbored notable residual disease within the bone marrow (>38% blasts) and spleen (>4% blasts) (Extended Data Fig. 10k). In contrast, venetoclax treatment resulted in the robust clearance of disease in BMP-high PDX (Extended Data Fig. 10k–l), with a reduction in blasts beyond our limit of detection (<0.01%) in the majority of PATTDP PDX models of the bone marrow and spleen (Extended Data Fig. 10m and Supplementary Table 21). Our results support further clinical testing of BCL-2 inhibitors in refractory BMP-like T-ALL.

Discussion

Our study reports a comprehensive mapping of T-ALL to healthy human hematopoiesis. We report the surprising discovery that T cell leukemias differing drastically by bulk immunophenotype are linked at the subpopulation level. Our integrated analysis identifies a shared BMP-like population tightly associated with treatment failure in ETP-ALL, near-ETP-ALL and non-ETP-ALL. This subpopulation can represent <5% of blasts at diagnosis, illustrating the limitations of current bulk-level tumor classification schemes. The cell of origin of ETP-ALL remains unknown and it is widely hypothesized that the transformation occurs in a T cell progenitor that maintains its immature cell state49. However, experimental work in mice demonstrated that dedifferentiation of T cell-committed blasts can also lead to murine ETP-ALL50. Interestingly, BMP-like blasts are also found in myeloid and mixed-phenotype leukemias, raising the possibility of one common cell of origin, which could be inferred through linage-tracing approaches. Supporting evidence indicating a shared progenitor cell of origin includes an enrichment of TF and signaling gene mutations within BMP-like T-ALL blasts (similar to myeloid leukemias), non-T cell lineage marker expression and shared drug sensitivity profiles with myeloid leukemia stem cells. The unique opportunity to intersect our single-cell data with large-cohort bulk whole-exome sequencing (WES) and whole-genome sequencing (WGS) data allowed us to the associate BMP-like phenotype and genotype. Within AALL0434, the NOTCH pathway was mutated in >70% of cases. Most NOTCH1-activating mutations (indels, SNVs and stop, frameshift and splice variants) were associated with superior 5-year OS and EFS; however, intronic NOTCH1 SNVs and intragenic deletions were associated with worse outcomes. Our data isolate transcriptomic signatures from leukemic blasts carrying two co-occurring NOTCH1 mutations, revealing a direct connection between NOTCH1 mutation dosage and T cell lineage differentiation and refining our understanding of the functional transcriptomic changes that occur with NOTCH mutation.

Although our study was heavily focused on pediatric T-ALL, our findings are perhaps even more relevant in the adult setting, where the ETP phenotype represents up to 52% of cases and 5-year survival rates are <50%. Detection of high-risk cases within young adults treated on AALL0434 using BMP-like signatures supports the hypothesis that BMP-like blast-mediated treatment failure extends beyond pediatric T-ALL cases. Multiple studies of pediatric51 (ALL-BFM 2000) and adult52 (LALA-94 and GRAALL-2003) trials found that NOTCH pathway activation was associated with favorable outcomes, further supporting a common mechanism of treatment resistance mediated by NOTCH1-WT BMP-like subpopulations. Lastly, our single-cell multiomic reference maps of human hematopoiesis present a valuable resource to further delineate the impact of developmental heterogeneity in human leukemia by enabling a higher-resolution dissection of rare phenotypes beyond what is possible through the deconvolution of bulk transcriptomic data. Use of these reference maps within five subtypes of acute leukemia underlined a BMP-like arrest state shared among lymphoid, myeloid and mixed-phenotype leukemic disease, highlighting an opportunity for further study. For instance, these reference maps and gene signatures could be used to study tumor evolution in the context of relapsed tumors and serial samples. Likewise, additional profiling is needed to identify the characteristics of blast states among the recently established transcriptionally defined T-ALL subtypes such as the broader category of ETP-like leukemia13 in relation to the ETP, near-ETP and non-ETP immunophenotypic classification.

Collectively, our study identifies a rare but clinically important BMP-like subpopulation, which represents a promising therapeutic target for relapsed and refractory persons with T-ALL. Single-cell approaches on a carefully selected cohort were uniquely powered to isolate the BMP-like gene signature for risk stratification and therapeutic targeting, illustrating how high-resolution single-cell analyses are needed to supplement high-throughput bulk genomic approaches for understanding clinically relevant tumor biology. Further studies are needed to demonstrate the clinical utility of the prognostic gene signatures across a range of populations and mutational subtypes. Lastly, future mechanistic studies are expected to clarify the actions of the drugs identified to understand their specificity and the subsequent alteration of leukemic phenotypes to demonstrate their translational applicability.

Methods

AALL0434 participant identification and clinical annotation

COG studies AALL0434 (NCT00408005) and AALL1231 (NCT02112916) were approved by the National Cancer Institute Cancer Evaluation and Therapeutic Program, the US Food and Drug Administration, the Pediatric Central Institutional Review Board (IRB) and local IRBs at all participating centers. Written informed consent was obtained from study participants and, when appropriate, their legal guardians, in accordance with the Declaration of Helsinki. All participant data were deidentified and written informed consent was obtained to publish the indirect identifiers in the present manuscript. Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Secondary genomic studies were approved by the Children’s Hospital of Philadelphia (CHOP) IRB. In total, 40 cases from AALL0434 (Supplementary Table 1) and eight healthy thymus and bone marrow controls (Supplementary Tables 5 and 6) were selected for single-cell study. The healthy thymus and bone marrow used for this work were residual tissues after collection for clinical care. Leukemia samples were bone marrow or blood samples collected and banked for COG trials. Within AALL0434, ETP status was centrally assessed in diagnostic bone marrow or peripheral blood samples using 8–9-color multiparameter flow cytometry26. ETP was defined as having lymphoblasts that were CD8CD1a (<5% positive), weakly expressed CD5 (either <75% positive or median intensity more than one log less than mature T cells) and expressed one or more myeloid or stem cell markers (>25% positive) including CD13, CD33, CD34, CD117 and HLA-DR (ref. 13). Subjects meeting the ETP immunophenotypic criteria but with stronger expression of CD5 were classified as near-ETP. Subjects with neither ETP nor near-ETP were defined as non-ETP. MRD was assessed using 8–9-color flow cytometry and was performed using established methods at a COG flow cytometry reference laboratory (University of Washington or Johns Hopkins University).

Processing of T-ALL diagnosis samples

Peripheral blood or bone marrow aspirate samples were thawed at 37 °C, treated with 1:10 (v/v) 1 mg ml−1 DNase I (Sigma-Aldrich, D4513) for 90 s at 37 °C, resuspended in 10 ml of Iscove’s modified Dulbecco’s medium (IMDM) + 2% FBS and centrifuged (160g for 5 min). Samples were retreated with DNase I and resuspended in fluorescence-activated cell sorting (FACS) buffer (Ca2+-free and Mg2+-free PBS + 1% BSA). Cell number and viability were recorded using a Countess II cell counter (Invitrogen). More than 1 million live cells were aliquoted for tail-vein injection into NSG mice, with the remaining stained with DAPI (Invitrogen, D1306) and subjected to FACS sorting (FACSAria Fusion, BD).

scRNA-seq and CITE-seq library preparation

FACS-sorted DAPI-negative live cells were centrifuged and resuspended in cell staining buffer (BioLegend, 420201) at 45 μl per million cells. Cells were blocked with Human TruStain FcX (BioLegend, 422301) at 5 μl per million cells (4 °C, 15 min). After blocking, cells were stained with a TotalSeq-A antibody cocktail (30 min, 4 °C). Cells were washed three times using cell staining buffer (BioLegend, 420201) and resuspended in PBS + 0.04% BSA. Cells were counted using a Countess II cell counter. A total of 17,000 cells per sample were then loaded onto 10x Genomics Chromium controller and processed with the Chromium NEXT GEM single-cell 3′ reagent kit (version 3.1). GEX libraries were constructed using the 10x Genomics library preparation kit following the instructions. Antibody-derived tag (ADT) libraries were constructed using the KAPA HiFi HotStart ReadyMix kit (Kapa Biosystems, KK2601). The following program was used for ADT library PCR: 98 °C for 2 min, 14–15 cycles of 98 °C for 20 s, 60 °C for 30 s and 72 °C for 20 s, followed by 72 °C for 5 min and a hold at 4 °C. Library quality was checked using the Agilent high-sensitivity DNA kit (Agilent, 5067-4626) and Bioanalyzer 2100. Libraries were quantified using the dsDNA high-sensitivity assay kit (Invitrogen, Q33231) on a Qubit fluorometer and quantified using the qPCR-based KAPA quantification kit (Kapa Biosystems, KK4844). Libraries were sequenced on an Illumina NovaSeq 6000 with 28:8:0:87 paired-end format.

scATAC-seq library preparation

DAPI-negative live cells were centrifuged at 300g (5 min at 4 °C), mixed in 45 μl of lysis buffer and incubated (3 min on ice). Next, 50 μl of prechilled wash buffer was added without mixing and centrifuged immediately at 300g (5 min at 4 °C). Then, 95 μl of supernatant was discarded, 45 μl of diluted nuclei buffer (10x Genomics) was added and the sample was centrifuged (300g; 5 min at 4 °C). The nuclear pellet was then resuspended in 7 μl of prechilled diluted nuclei buffer and the nuclear concentration was determined using a Countess II cell counter. A total of 7,000–20,000 nuclei were used for the transposition reaction in bulk, loaded onto the 10x Genomics Chromium controller and processed with the Chromium NEXT GEM scATAC reagent kit (version 1.1). Library quality was checked using the Agilent high-sensitivity DNA kit and Bioanalyzer 2100. Libraries were quantified using the dsDNA high-sensitivity assay kit on a Qubit fluorometer and quantified using the qPCR-based KAPA quantification kit. Libraries were sequenced on an Illumina NovaSeq 6000 with 49:8:16:49 paired-end format.

Expansion and profiling of T-ALL blasts in PDX

NSG mice (RRID:IMSR_JSX:005557) were used for all experiments. For the development of PDX models, we injected ~106 blasts from viably frozen participant samples (bone marrow or blood) per mouse to develop primagrafts (Supplemental Table 22). PDX-expanded blasts were isolated from the spleen or bone marrow. Frozen samples were thawed (37 °C), resuspended in IMDM + 2% FBS and treated with DNase I twice. Cells were washed twice with RPMI medium, resuspended in flow buffer, stained with DAPI and anti-human CD45 antibody (BD Pharmingen, 555485) and subjected to FACS sorting (FACSAria Fusion, BD). DAPI-negative hCD45+ sorted cells were stained with 10x Genomics 3′ CellPlex multiplexing solution, washed three times and immediately processed using the 10x Genomics Chromium controller and the Chromium NEXT GEM single-cell 3′ reagent kit (version 3.1). The 3′ GEX libraries were constructed using the 10x Genomics library preparation kit. CellPlex libraries were constructed using the 10x Genomics 3′ CellPlex kit. Library quality was checked using the Agilent high-sensitivity DNA kit and Bioanalyzer 2100. Libraries were quantified using the dsDNA high-sensitivity assay kit on a Qubit fluorometer and quantified using the qPCR-based KAPA quantification kit. Libraries were sequenced on an Illumina NovaSeq 6000 with 28:8:0:87 paired-end format.

CD34+ progenitor isolation from infant or pediatric thymi

Pediatric thymi were obtained and used according to and with the approval of the CHOP IRB. Thymus tissue was mechanically disrupted and treated with liberase (0.2 mg ml−1, 30 min at 37 °C; Roche) with intermittent shaking, as previously described19. Thymocytes were resuspended into flow buffer, sorted into DAPI-negative lineage-negative CD34+CD1A fractions and subjected to scRNA-seq and scATAC-seq.

Projection onto healthy reference trajectory

Participant-derived cells were projected onto the healthy reference trajectory using the MapQuery function in Seurat 4.0.5. For scRNA-seq data, participant and healthy control data were coembedded into a low-dimensional space using the default anchor-based canonical correlation analysis (CCA) method in Seurat 4.0.5 (30 dimensions, 2,000 anchor features) and cell type label transfer was performed on a sample-by-sample basis using the TransferData function. For scATAC-seq data, peaks from participant and healthy reference data were merged using the mergePeaks module from scATAC-pro53 and peak × cell matrices with merged peaks were reconstructed for each participant with the scATAC-pro reConstMtx module. This allowed for participant and healthy control data to be coembedded into a low-dimensional space analogous to the scRNA-seq data.

AALL0434 ETP-ALL stratification using BMP-119 signature

BMP-like and T-specified DEGs were stringently filtered using cutoffs of false discovery rate (FDR) < 0.001 and average log2 fold change (log2FC) > 0.5, leaving 66 BMP-like DEGs and 53 T-specified DEGs. The z-score-based signature scoring was performed on 110 bulk-sequenced participants with ETP-ALL with BMP-like DEGs as positive features and T-specified DEGs as negative features. For each participant, the mean T-specified feature z-score was subtracted from the mean BMP-like feature z-score, with a score of >0 being interpreted as more BMP-like than T-specified. This cutoff was selected to compare participants on the basis of a relative enrichment of either phenotype. Participants were binarized by BMP-like signature score (BMP-like > T-specified versus T-specified > BMP-like) and OS and EFS were compared using the Cox proportional hazard model with day 29 MRD and CNS status taken as covariates using the survfit function from survival 3.2-13 (‘survfit(Surv(time.survival, status.survival) ~ high.BMP + D29.MRD + D29.CNS.status’).

Integration of single-cell signatures with mutation calls

Bulk RNA-seq data for n = 110 ETP samples with corresponding WES and WGS mutation calls were scored using 66 BMP-like DEGs and 53 T-specified DEGs using AUCell 1.12.0. For 1,490 mutant genes in 110 ETP samples, the number of samples carrying mutations was quantified and the mean BMP-like area under the curve (AUC) and T-specified AUC were calculated. Mutant genes observed in ≥5 samples with mean VAF > 0.05 were plotted for visualization. Classification of genes was derived from a previous bulk genomics study on ETP-ALL. For fusion drivers, the mean BMP-like AUC, the mean T-specified AUC, the percentage of participants with positive EOI MRD, the percentage of participants that died during the trial and the number of unique fusion partners were calculated.

Identification of a consensus BMP-like gene signature

BMP-like DEGs from participants with ETP-ALL (n = 56 BMP-like versus T-specified) and participants with non-ETP-ALL (n = 445 BMP-like versus postcommitment) were overlapped and the average log2FC was calculated. A total of 17 genes with average log2FC > 0.9 were retained as a consensus BMP-17 signature. We performed AUC-based signature scoring using AUCell 1.12.0 (with the top 25% of expressed genes) on bulk RNA-seq diagnostic T-ALL samples from two independent COG trials using BMP-17 DEGs. We then binarized participants on the basis of AUCell signature score and used the Cox proportional hazard model with EOI MRD and CNS status taken as covariates using the survfit function from survival 3.2-13 (‘survfit(Surv(time.OS, status.OS) ~ BMP-17 + high.BMP-17 + D29.MRD + CNS.status). In each case, the top half of participants was compared to the bottom half of participants.

In silico drug screening against BMP-like blasts

Drug–target data from two independent drug target databases (TTD54 and DrugIDB55) and a third database (OpenTargets56) that focuses on next-generation targets were overlapped with BMP-like DEGs (log2FC > 0.2; adjusted P < 0.01). Targetable gene products were given a score of 1 for each database in which a resulting hit was obtained. To search for drugs that could specifically modify the BMP-like state, we inputted top BMP-like DEGs and TFs (n = 56) and top T-specified DEGs and TFs (n = 62) into the LINCS1000 (ref. 57) database under default parameters. Perturbation results were filtered in R to filter compound-mediated perturbations for compounds with defined targets, statistical significance (log10FDR > 1), effect size (normalized connectivity score > 0.8), specificity to BMP-like state (raw connectivity score > 0) and activity in two or more leukemia cell lines. Non-compound perturbations were filtered for statistical significance (log10FDR > 1) and effect size (normalized connectivity score > 0.8) and further separated into gene overexpression and gene knockdown (including short hairpin RNA knockdown, clustered regularly interspaced short palindromic repeats knockout and ligand-based perturbation) classes. BMP-like DEGs targeted by top compound perturbations and/or genes with overexpression or knockdown were given a score of 1. Lastly, we identified BMP-like DEGs that showed increased dependency in leukemia cell lines (n = 59) compared to non-leukemia and non-lymphoma cell lines (n = 1,052) in the cancer dependency map58 (DepMap) portal. Genes with negative dependency scores in leukemia cell lines (mean dependency score < −0.1), dependency FC > 2 and >25% expression in BMP-like blasts were assigned a score of 1. Next, BMP-like DEGs with log2FC > 1 were assigned a score of 1 and genes with log2FC between 0.5 and 1 were assigned a score of 0.5. Beyond high expression change, we prioritized BMP-like DEGs with high percentage expression in BMP-like blasts; genes with >80% expression were assigned a value of 1, whereas genes with 50–80% expression were assigned a value of 0.5. Finally, we ranked genes with high statistical significance (adjusted P < 1 × 10−100 was given a score of 1; adjusted P < 1 × 10−50 was given a score of 0.5). The sum of DE evidence (ranging from 0 to 3) and database evidence (ranging from 0 to 5) was taken to rank BMP-like DEGs for follow-up experimental studies.

In vitro drug screening with leukemia active drug panel

Human leukemia blasts were collected from mouse spleen and enriched using an immunomagnetic isolation kit (StemCell Technologies, 19849) and screened with a panel of 40 leukemia active drugs (Supplementary Table 18) using a previously described imaging-based assay with a stromal cell coculture system59.

Nomination of BMP-like specific drugs from drug screening

PDX-expanded blasts from five participants were screened in a stromal coculture system and dose–response curves were generated for each compound with the primary readout being cell viability (as a percentage of control). We defined ETP active drugs as compounds with IC50 < 1,000 nM and categorized each compound as not active (active in 0/4 participants with ETP), partially active (active in 1–3 participants with ETP) or active (active in 4/4 participants with ETP). We then compared IC50 values for ETP active compounds and used three comparisons to nominate drugs that were differentially active in high-BMP participants: BMP-high and MRD-positive (n = 3) versus BMP-low and MRD-negative (n = 2: one ETP and one non-ETP); BMP-high and MRD-positive (n = 2) versus BMP-low (n = 1); BMP-high and MRD-positive (n = 3) versus BMP-low (n = 1). Drugs with differential activity in BMP-high participants in all three comparisons were nominated as BMP-specific drugs. The sensitivity of these drugs was confirmed using PDX-expanded blasts from five additional participants, for a total of five BMP-high models and five BMP-low models.

scRNA-seq and CITE-seq data processing

Demultiplexing and alignment of RNA and ADT sequences were performed with Cell Ranger 3.1.0. Low-quality cells and red blood cells were then filtered by retaining only cells with between 300 and 2,500 genes in the scRNA-seq data, greater than 1,500 RNA counts, less than 10% mitochondrial RNA and fewer than three unique molecular identifiers (UMIs) mapping to hemoglobin B. To remove cell doublets in scRNA-seq data, DoubletFinder 2.0.3 was run with 5% of the expected rate of doublets. Participant cell × gene and cell × ADT count matrices were individually saved and subsequently concatenated using Seurat 4.0.5 for downstream analyses. For some analyses as specified, cell × gene and cell × ADT matrices for each participant were subset for G1 cells (representing the phenotype most resistant to conventional therapy) and were downsampled to match the lowest value in the cohort. After log-normalization, the FindVariableFeatures function in Seurat 4.0.5 was used to identify the top 5,000 features with greater than expected variance. Variable features with expression in >1% of cells were kept as the input to principal component analysis (PCA), with subsequent visualization being performed using uniform manifold approximation and projection (UMAP) of the top 50 principal components (PCs), 30 neighbors and two PCs. For visualization, we used the IntegrateLayers function in Seurat 5.0.3 with the RPCAIntegration method and default parameters.

scATAC-seq data processing

Demultiplexing of scATAC-seq reads was performed with Cell Ranger-ATAC 1.1.0 (alignment to hg38) and peak calling was performed with BWA and MACS2 using the scATAC-pro pipeline53 with default parameters. Low-quality cells were filtered for those cells with <3,000 (low quality) and >50,000 unique fragments (doublets), <40% reads in peaks (fraction of reads in peaks < 0.4) and >20% reads mapping to mitochondria. To construct a common peak set, the top 100,000 peaks (defined by MACS2 MapQ score) were selected for downstream merging, alongside 1,500 randomly selected cell barcodes from each participant. We defined two sample sets for merging peaks: one with 25 participants with ETP-ALL and one with 40 participants with T-ALL. For each sample set, peaks were merged with the scATAC-pro mergePeaks module and peak × cell matrices with merged peaks and downsampled cell barcodes were reconstructed with the scATAC-pro reConstMtx module.

Bulk RNA-seq analysis and visualization

Sequencing read adaptors were removed using Trim Galore 0.4.4 (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) with parameters ‘-q 20 --phred 33 --paired’. Reads were aligned to the human genome GRCh38 using STAR. The resulting gene counts for each sample were estimated by RSEM, and combined as gene count matrix. RSEM expected counts were processed and filtered. First, samples were required to exhibit expression of over one count per million in ≥5 samples. Second, batch correction was performed using the sva R package 3.46 function ComBat_seq. Batches were defined on the basis of library type as stranded or unstranded and on the basis of cohort as TARGET or X01 sequenced samples. Third, the DESeq2 R package 1.38.3 vst function was used for data normalization. Limma 3.46.0 was used for DE analysis. For visualization, the raw counts were transformed into transcripts per million and visualized using UMAP with the top ten PCs and k = 30 neighbors.

Construction of healthy reference trajectory

Construction of the healthy reference trajectory began with sample-by-sample cell annotation followed by consensus clustering and annotation. Annotations from previously published bone marrow samples were kept60. Cell × gene matrices from each thymus sample were log-normalized and subjected to dimensionality reduction. Cells were clustered at multiple resolutions (k = 1, 2 and 3) and clusters were given preliminary labels on the basis of marker gene expression. Cell × gene matrices from all thymus donors (n = 3) were then concatenated, log-normalized and subjected to dimensionality reduction. Cells were reclustered at high resolutions (k = 3) and clusters were relabeled on the basis of marker gene expression and prior labels.

scRNA-seq

Cell × gene matrices from healthy thymus donors were then concatenated with cell × gene matrices from healthy bone marrow donors (n = 5), log-normalized and subjected to dimensionality reduction using the top 25 PCs. The FindVariableFeatures function in Seurat was used to identify top 2,000 variable genes. These 2,000 genes were then filtered in two iterations on the basis of the Gini coefficient61. Briefly, a shared nearest neighbor graph was constructed using 50 and 20 PCs; cells were clustered at k = 0.1 resolution and the Gini coefficient was calculated for each variable gene. Genes with a low Gini coefficient (bottom 10% percentile) and cluster level expression < 10% were removed in each iteration. The 134 cell-cycle-related genes previously described19 were removed. The remaining 931 variable features were used as input to PCA and UMAP dimension reductions (25 PCs). Trajectory analysis was performed using Slingshot 1.8.0 with HSPC as the start cluster and effector T, mature B and monocyte as the end clusters for T, B and myeloid trajectories, respectively. Principal curves were selected for T cell and myeloid cell trajectories and values were scaled to a maximum of 1 in each curve. Pseudotime values of shared cell states that occurred in both myeloid and T cell development (multipotent progenitors: HSPC and LMPP) were then averaged. Pseudotime values were scaled for myeloid development (0 to −1) and T cell development (0 to 1). Statistical comparisons in the overall arrest state were made using a two-sample Kolmogorov–Smirnov test, as previously described62.

scATAC-seq

Gene–activity matrices for scATAC-seq were constructed by summing counts within the gene body and 2 kb upstream, as previously described60. Integration of scATAC-seq samples was performed using gene–activity matrices and Seurat 4.0.5 using the default anchor-based CCA method with 30 dimensions, 2,000 anchor features and k.filter = 100. To learn labels for scATAC-seq data from scRNA-seq data, transfer anchors were computed using CCA with scRNA-seq as the reference and cell type label transfer was performed on a sample-by-sample basis using the TransferData function. Cell × peak matrices from all thymus donors were then concatenated. Cells were reclustered at high resolution (k = 3) and clusters were reannotated according to consensus labels. Dimensionality reduction was performed using UMAP of the top ten PCs of the concatenated scATAC-seq data and trajectory analysis was performed as described above.

Level 1 annotation of CITE-seq data

To distinguish malignant blasts from non-malignant cells, we first used a cluster-based statistic, Shannon entropy, to identify clusters of cells at multiple clustering resolutions (k = 1, 2 and 3) to identify four cell populations that had contribution from every participant. The Shannon entropy statistic was calculated using the formula \(-\sum {{p}}\!\left({{x}}\right)\times \log {{p}}\!\left({{x}}\right)\), where p(x) is the frequency of cells arising from any one participant in any one cluster, ranging from 0 to 1. Second, we concatenated and clustered participant-derived single-cell data with healthy bone marrow and thymus controls. Third, we calculated a similarity score to healthy controls across all participant-derived cells. Participant data and healthy control data were coembedded into a low-dimensional space using the default anchor-based CCA method in Seurat 4.0.5 (30 dimensions and 2,000 anchor features) and a k = 30 mutual nearest neighbor score was assigned for each cell using the TransferData function. Copy number profiles were analyzed using InferCNV 1.6.0 on a randomly downsampled (1:10) subset of participant data. We then compared blast percentages calculated in scRNA-seq to pathology reports of blast percentage obtained from diagnostic aspirate (mean absolute deviation = 8.9%; non-significant difference according to paired two-tailed t-test).

Level 1 annotation of single-nucleus (sn)ATAC-seq data

Firstly, annotated scRNA-seq data were used as a reference to annotate participant scATAC-seq data on a paired, sample-by-sample basis. For each participant, gene–activity matrices for scATAC-seq were constructed by summing counts within the gene body and 2 kb upstream, as previously described63. Integration of scATAC-seq samples with scRNA-seq data was performed using gene–activity matrices and Seurat 4.2.0 using the default anchor-based CCA method with 30 dimensions, 2,000 anchor features and k.filter = 100 using the TransferData function on a sample-by-sample basis. Then, participant data and healthy control data were coembedded into a low-dimensional space using the default anchor-based CCA method in Seurat 4.0.5 (30 dimensions, 2,000 anchor features and k.filter = 100) and a k = 30 mutual nearest neighbor score was assigned using the TransferData function to assess their similarity. Lastly, blast percentages calculated in scATAC-seq were compared to blast percentages calculated in scRNA-seq, showing high concordance (median deviation = 1.2%; non-significant difference according to paired two-tailed t-test).

Differential activity analyses

For TF motif enrichment analysis, cell × deviation score matrices were generated using the addGCBias, matchMotifs, getBackgroundPeaks and computeDeviations functions in chromVAR 1.12.0 with hg38 as the reference genome. Differential activity analysis was performed using the Wilcoxon rank-sum test with Benjamini–Hochberg multiple-testing correction with downsampling. For each motif in any particular comparison, we calculated the Δ median chromVAR deviation score, Δ mean chromVAR deviation score, adjusted P value, percentage expression of corresponding TF in paired scRNA-seq data and the ratio of median and mean chromVAR deviation score. DA TF motifs were defined by Δ median chromVAR deviation score > 0.0025, FDR < 0.001, >20% cell expression of corresponding TF and a ratio of median and mean chromVAR deviation score between 0.7 and 1.3, unless otherwise specified.

Subtype-specific transcriptional regulatory analysis

As we previously described60, for each cell in the scRNA-seq dataset, an scRNA-seq and scATAC-seq ‘metacell’ was defined by pooling counts for each gene or peak from the 30 nearest neighbors in the PC space by cosine distance. Metacell counts were log-normalized and scaled. For a gene of interest, we ran a linear regression model using metacell gene expression as the dependent variable and putative enhancer peaks within 500 kb of the transcription start site as regressors. Bonferroni-adjusted P values < 0.01 with a positive coefficient were considered significant. Top induced targets of TCF7 and LEF1 were defined by high-confidence EP regression (regression coefficient > 0.3) and log2FC > 0.5 for z-scoring on bulk RNA-seq data.

Promoter–enhancer coaccessibility networks (CCANs)

Cicero64, which identifies coaccessible pairs of DNA elements, was implemented in Signac65 through the make_cicero_cds function followed by the run_cicero function with the following parameters: sample_num = 100, window = 500,000. These links were aggregated into cis-coaccessible networks using the generate_ccans function with default parameters. The BMP-like and T-specified CCANs were isolated by identifying links to regions within 2,000 bp of the transcription start sites for the 66 and 53 DEGs for each state, respectively. Any peaks that overlapped regions within that coaccessibility group were then subset as potential regulators. This yielded 660 peaks in the BMP-like CCAN and 1,011 peaks in the T-specified CCAN. Those peaks were then used as input to HOMER with parameters ‘-size 200 -mask’ to identify motifs enriched in coaccessible regions. Motifs with a q value < 0.05 were considered significant.

Transcriptional regulatory analysis of developmental states

An integrated enhancer-driven transcriptional regulatory analysis was conducted using SCENIC+ 1.0a1 (ref. 44) following the standard vignettes with minor modifications. BMP-like and T-specified states from the scRNA-seq data and scATAC-seq data were extracted and 35 topics were empirically selected. The SCENIC+ pipeline was then run in non-multiome mode, using five cells per metacell. The search space was defined as 0–500 kb. Regulons were filtered with the following parameters: rho_threshold = 0.03, min_regions_per_gene = 0 and min_target_genes = 10. All other parameters were maintained as the defaults. Region-based and gene-based specificity scores for the BMP-like and T-specified states were calculated using the regulon_specificity_scores function.

AUCell pathway analysis and GSEA

Pathway analysis was conducted using two methods. First, pathway enrichment scores for gene signatures were defined from our single-cell analysis, including the gene sets defined through our transcriptional regulatory analyses or downloaded from the Molecular Signatures Database66, and were determined using AUCell 1.12.0 with the top 5% of genes. Additional gene sets for NOTCH activation were previously published67,68. GSEA was conducted to compare DE pathways between blast populations. A full gene list was constructed using the FindMarkers function in Seurat 5.0.3 with the following parameters: min.pct = 0.001, logfc.threshold = 0, only.pos = FALSE, max.cells.per.ident = 1,500. This gene list was sorted by log2FC to use as input to preranked GSEA using the fgsea package69.

Cell-cycle analyses in single-cell-sequenced participants with ETP-ALL

Cell-cycle signature scoring and phase classification was performed on ten high-MRD and ten MRD-negative participants with ETP using the CellCycleScoring function in Seurat 4.0.5 with default parameters. A total of 43 S-phase and 54 G2M-phase signature genes70 were provided as input. Cells were then randomly downsampled so that each participant would be represented by an equal cell number (3,350 per participant and 33,500 per group).

BMP-like DE analyses

BMP-like and T-specified DEGs were computed using the FindMarkers function in Seurat 4.0.5 with the following parameters: assay = RNA, logfc.threshold = 0, ident.1 = T-specified-R (T-specified blasts from ten MRD-negative participants), ident.2 = BMP-like-NR (BMP-like blasts from 15 MRD-positive participants) and max.cells.per.ident = 1,500. The input matrix to DE analysis was a matrix of G1-phase ETP-ALL blasts with an equal number of cells per participant (1,711 per participant and 42,775 cells total). To identify DE TFs and DE surface markers, the same process was repeated with using genes encoding human TFs71,72 (feature = TFs) and a change of assay to normalized ADT count matrix (assay = ADT).

Intersection of DE TF and DA motifs

DE TFs from scRNA-seq data, defined by average log2FC > 0.15 and FDR < 0.001, were intersected with DA TF motifs from scATAC-seq data. DA TF motifs were defined by Δ median chromVAR deviation score > 0.0025, FDR < 0.001, >20% cell expression of corresponding TF and a ratio of median and mean chromVAR deviation score between 0.7 and 1.3.

Identification of NOTCH1 mutations in scRNA-seq

Samples were demultiplexed into FASTQ files using bcl2fastq. FASTQ files were then processed using IronThrone 2.1 with the default parameters and inputs for 10x version 3.1 scRNA-seq data. Specifically, for each variant, IronThrone was run in circularization mode (--run = circ) with UMI length 12 (--umilen 12) and cell barcodes from each sample’s Cell Ranger output (--whitelist sample.specific.barcodes.tsv), following the configuration set within IronThrone 2.1 documentation (https://github.com/dan-landau/IronThrone-GoT).

Identification of BMP-like blasts in participants without ETP

Precommitted blasts in ten participants without ETP (six EOI-MRD-negative participants, 7,152 precommitted blasts; four EOI-MRD-positive participants, 11,047 precommitted blasts) were subsetted (total of 52,971 blasts and 15,830 blasts, respectively) and mean proportions for corresponding cell fractions (BMP-like, MEP-like and pro-T cell-like) were quantified in each participant. The mean proportion of each cell type for participants of each group was plotted, with the proportion of BMP-like blasts in MRD-negative versus MRD-positive participants being compared using the prop.test function.

Single-cell signature-based stratification of non-ETP cases

Precommitment and postcommitment DEGs were computed as described above. The input matrix to DE analysis was a matrix of G1-phase non-ETP-ALL blasts with a maximum of n = 1,500 cells per participant (34,384 cells in total). DEGs located on the X and Y chromosomes were filtered out to retain the core biology of both cell fractions. A z-score-based signature scoring was performed on 1,051 bulk-sequenced diagnostic participants with non-ETP-ALL with BMP-like DEGs as positive features and T-specified DEGs as negative features. Human TFs were previously curated71,72. Survival was analyzed using Cox proportional hazards as described above.

Identification of a BMP-like surface marker signature

DE ADTs from ETP-ALL BMP-like blasts and non-ETP BMP-like blasts were overlapped and the average log2FC was calculated. Nine genes with |log2FC| > 0.5 (five with positive expression and four with negative expression) were retained as a consensus BMP-surface-9 signature. Gene signature scoring and survival analysis were conducted as described above.

LASSO (least absolute shrinkage and selection operator) optimization of prognostic gene signatures

Refined gene sets were found using LASSO penalized regression. The Cox proportional hazards model was used with gene z-scores as features (glmnet and survival R packages). The model was adjusted for participant age, white blood cell (WBC) count at diagnosis, CNS status and treatment protocol, by including these as covariates on which no penalty was applied. The penalty was only applied to the gene features but the range of the predicted coefficients was bounded such that genes enriched in the BMP signature were given positive coefficients (increased hazard) and genes enriched in the T-specified signature were given negative coefficients (decreased hazard). The models were fit to the entire AALL0434 RNA-seq dataset and ETP status was used to stratify the survival allowing for different baseline hazards, followed by 100-fold cross-validation to determine the penalty parameter with the lowest root-mean-squared error.

Identification of BMP-like signature expression patterns

To contextualize BMP-17 and BMP-surface-9 marker genes within normal hematopoiesis, we performed AUC-based signature scoring of healthy donor scRNA-seq reference maps using AUCell with the top 10% of expressed genes considered for computational efficiency (aucMaxRank = 0.1). BMP-surface-9 marker genes were divided into positive DEGs (n = 4) and negative DEGs (n = 5) and AUCs were calculated for each gene set. Overall enrichment (that is, the aggregate AUC) of the BMP-surface-9 signature was calculated by taking the difference in AUC between positive and negative surface markers.

Integration of AALL0434 and Lee et al.’s bulk RNA-seq data

In vitro drug sensitivity data were integrated from the current study (ten PDX models) with previously published drug screening data47. To generate a consensus BMP-like signature across data, we first converted the AALL0434 bulk transcriptomic data to fragments per kilobase of transcript per million mapped reads format using the convertCounts function in the DGEobj.utils package. A BMP-like gene signature was scored using the 119 BMP-like and T-specified DEGs with a robust z-score. The negative features (T-specified DEGs) were subtracted from the positive features (BMP-like DEGs).

Statistics and reproducibility

No statistical method was used to predetermine sample size. All data meeting the quality control threshold were included. The investigators were not blinded to allocation during genomic profiling and assessment of participant data. No data points were excluded from analyses related to single-cell and bulk genomics. No animal models were excluded from PDX-related analyses. Randomization and blinding were used for all in vitro and in vivo experiments. Statistical comparisons were made using a two-sided Wilcoxon rank-sum test unless otherwise specified in the figure legend. As the Wilcoxon test is non-parametric, we did not formally test for normality of the data. The chi-squared test was used to compare cell type proportions. The Cox proportional hazards model was used for the assessment of survival outcomes.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.