Introduction

Breast cancer is a highly heterogeneous collection of tumors that are characterized by distinct molecular subtypes1. Advances in the subtype-specific therapies have yielded significant survival benefits for patients, primarily through the administration of hormone and HER2-targeted therapy2. In recent years, there has been a growing interest in the combination of immune checkpoint inhibitors (ICIs) and chemotherapy as a treatment for breast cancer3. The success of several clinical trials, including IMpassion1304, KEYNOTE-3555, and KEYNOTE-522 trials6, has led to the approval of atezolizumab and pembrolizumab for triple-negative breast cancer patients with PD-L1 expression. Despite these promising achievements, the efficacy of ICIs in treating other molecular subtypes and PD-L1-negative patients remains limited, partly due to inadequate immune infiltration in the tumor microenvironment (TME)3,7. A deeper understanding of TME heterogeneity and the mechanisms dictating immune infiltration is required to optimize immunotherapy efficacy.

The post-capillary venules have been identified as the major channel for lymphocytes extravasation through the endothelial barrier8,9. Mechanistically, circulating lymphocytes engage venular endothelial cells (ECs) under shear stress via the binding of endothelial selectins, P-selectin and E-selectin, leading to the tethering and rolling of lymphocytes along the vessel wall8. Rolling lymphocytes are then activated through the interaction between CCR7 and homeostatic chemokines, including CCL21 and CCL19, that are immobilized on the EC surface10. Subsequently, activation of lymphocytes mediates firm adhesion and enables lymphocytes to crawl on the luminal surface of venules until they transmigrate through the barriers of ECs, associated pericytes/vascular smooth muscle cells (SMCs), and the basement membrane11.

Previous single-cell and experimental studies have provided valuable insights into the interaction between ECs and lymphocytes12,13,14,15, but ECs present significant plasticity and spatial heterogeneity that are coordinated with different functional states12. The spatial organization of venular ECs and other EC functional subtypes within intact tumors from patients in a clinical setting remains unclear. More importantly, our understanding of their interaction and coordination with perivascular cells in mediating immune infiltration is still limited, primarily due to limitations in technical precision and a lack of suitable experimental models11,16. A high-resolution, unbiased spatial map of the tumor ecosystem and its vasculature could help to elucidate the complex spatial and functional networks between the immune and vascular systems.

The application of spatial transcriptomics (ST) technologies has revolutionized our ability to retrieve spatial information that is lost in single-cell RNA sequencing (scRNA-seq)17. Wu et al. performed ST on six breast cancers to reveal the coordination of tumor and host cell phenotypes18. However, conventional ST methods, such as Visium, exhibit insufficient resolution due to their reliance on 55 μm barcoded spots, with a center-to-center distance of 100 μm, that encompass dozens of cells, thus impeding precise localization of dispersed ECs and perivascular cells within the TME. An innovative technology, SpaTial Enhanced REsolution Omics-sequencing (Stereo-seq)19, has recently been developed that combines 220 nm DNA nanoball (DNB)-patterned arrays and in situ tissue RNA capture to create spatial maps at cellular resolution, making it an ideal tool to depict the intricate tumor vasculature20.

In this study, we applied Stereo-seq to characterize the geospatial architecture and cell-cell communication in 30 surgically resected specimens of primary breast tumors and paired lymph node metastases. We unveiled that tumor vasculature displayed distinct heterogeneity of biological phenotype, spatial organization, and interactions with surrounding cells. CCL21+ venular SMCs and ACKR1+ venular ECs cooperated through multiple chemokines to promote lymphocyte infiltration in breast cancer.

Results

Depicting a high-resolution spatial atlas of human breast cancer

To comprehensively characterize the spatial landscape of breast cancer, we employed Stereo-seq on 30 fresh frozen tissues derived from 23 breast cancer patients, including 23 primary breast cancers and 7 paired metastatic lymph nodes that collectively covered all four molecular subtypes (luminal A: n = 8; luminal B: n = 9; HER2: n = 8; triple-negative: n = 5). Adjacent tumor tissues were subjected to scRNA-seq to facilitate cell type identification in the ST slides. We further applied immunohistochemistry (IHC) and multiplex immunofluorescence (mIF) on the adjacent slides of all 30 ST slides and on the validation cohort to confirm our discovery (Fig. 1a). Detailed clinical and pathological information of patients is provided in the Supplementary Data 1.

Fig. 1: A spatially resolved atlas of human breast cancer slides across different molecular subtypes and tumor sites.
figure 1

a Schematic diagram illustrating the acquisition and analysis workflow of the ST maps. Stereo-seq was performed on 30 slides of primary breast tumors and paired lymph node metastases from 23 patients, yielding a total of 4,894,305,510 DNBs. ScRNA-seq was performed on the fresh tumor samples from 6 out of 23 patients and identified 78,848 qualified cells. Each block represents a ST slide and is colored by molecular subtype. Red, blue, orange, and green spots represent venular endothelial cells (vECs), venular smooth muscle cells (vSMCs), tumor cells, and immune cells, respectively, in the projection of scRNA-seq and ST data. Schematic plots were created using bioRender. b Uniform manifold approximation and projection (UMAP) plot of the scRNA-seq dataset. Each cluster is colored by its corresponding cell type. EC: endothelial cell; SMC: smooth muscle cell, including vascular SMC and pericyte; T/NK: T or natural killer cell. c Differential expression of canonical markers in each cell type cluster. Macro: macrophage; Fibro: fibroblast; Melano: melanocyte; Granulo: granulocyte; PB: plasmablast. df Hematoxylin and eosin (H&E) staining (d), gene count maps (e), and cell type annotation maps (f) of the ST slides of the primary tumor (PT) and metastatic lymph node (LN) from the patient NCC-P05. ST spots are mapped to the H&E staining of adjacent slides. Raw spatial expression matrix of each bin/spot is convoluted into pseudo 50 spot with 25 µm squares (50 × 50 bins/spot, bin50 spot for short). IDC: invasive ductal carcinoma; DCIS: ductal carcinoma in situ. g The cell type annotation, H&E staining, gene counts, and unique molecular identifiers (UMI) counts in the selected area of NCC-BR5 slide highlighted in Fig. 1d–f. h Relative proportions of cell types across 30 ST slides. ST slides are grouped according to the molecular subtype and tumor site.

After quality control (Supplementary Fig. 1a-c), the scRNA-seq analysis of 6 breast cancer samples identified 78,848 qualified cells, which were categorized into 13 dominant cell types based on the canonical cell markers, including breast tumor cell (EPCAM), T or natural killer (T/NK) cell (TRBC2), B cell (CD79A), plasma cell (JCHAIN), plasmablast (GZMB and JCHAIN), macrophage (CD163), fibroblast (DCN), EC (VWF), vascular SMC/pericyte (referred to as SMC; NOTCH3), basal cell (COL17A1), neuron (CNTN5), granulocyte (CPA3), and melanocyte (CDH19) (Fig. 1b, c). In order to validate the robustness of cell classification, we conducted cell clustering and cell type annotation on an external scRNA-seq dataset of 26 breast cancer samples consisting of 88,876 cells18, which produced similar clustering results and feature genes for the dominant cell types (Supplementary Fig. 1d, e).

For the ST data, Stereo-seq of 30 slides yielded a total of 4,894,305,510 DNBs (220 nm × 220 nm), with an average of 163,143,517 DNBs per slide. Utilizing a resolution similar to that of Visium, Stereo-seq captured an average of 7565 pseudo-spots (bin200 spot, 200 × 200 DNB bins) per slide. Each spot, on average, quantified approximately 6120 genes (Supplementary Fig. 1f). To gain a more detailed view of slides with sufficient genes for spatial annotation, we convoluted the raw spatial expression data of each slide into smaller pseudo-spots (bin50 spot, 50 × 50 DNB bins, 25 μm diameter) (Fig. 1d-g), with each bin50 spot covering approximately 3-5 tumor cells and having an average detection of 921 genes and 2058 UMIs per spot (Fig. 1g) (Supplementary Fig. 1g). Random sampling of 10,000 bin50 spots taken from each slide showed few batch effects (Supplementary Fig. 1h).

Using the internal and external scRNA-seq results as independent reference datasets, we utilized the Robust Cell Type Decomposition (RCTD) algorithm to infer the cell composition for each bin50 spot21. Annotation of eight major cell types, including tumor cell, T/NK cell, B cell, macrophage, fibroblast, EC, SMC, and basal cell, produced highly correlated results in both references (Supplementary Fig. 2a, b). In addition, we calculated signature scores of cell type for each spot using signature gene sets identified from our scRNA-seq dataset (Supplementary Fig. 2c) (Supplementary Data 2), which also showed high consistency with the RCTD annotation (Supplementary Fig. 2b, d). On the UMAP projection of ST data, spots of the same cell type label clustered with the expression of their canonical cell markers (Supplementary Fig. 1i, j), indicating the reliability of the inference strategy.

We then projected the cell composition information onto H&E-stained slides that had been annotated by pathological experts (Fig. 1d-g). The integration of transcriptomic map and histological annotation revealed a detailed distribution of various cells in primary tumors and metastatic lymph nodes, enabling further characterization of the subtle structure of the TME. Across 30 slides, tumor cells dominated the vast majority of analyzed spots, comprising an average proportion of 56% (Supplementary Fig. 1k). Stromal cells (including ECs, SMCs, and fibroblasts) and immune cells (including B cells, T/NK cells, and macrophages) displayed comparable proportions, typically ranging from 5% to 10%, albeit with extensive variation observed in different slides (Fig. 1h). A significant increase in tumor-infiltrating B and T/NK cells was found within the cancerous areas of the paired lymph node metastases compared to the primary tumors (Supplementary Fig. 1l).

Characterization of endothelial heterogeneity uncovered the immunomodulatory function of venular ECs

ECs play a crucial role in angiogenesis, immune infiltration, and the development of the breast cancer microenvironment22. Notably, tumoral ECs exhibited extensive phenotypic divergence that aligns with their anatomic localization23. To gain insights into their heterogeneity of biological phenotype and spatial distribution, we clustered ECs in our single-cell data using two independent clustering methods (SNN and K-means) (Fig. 2a) (Supplementary Fig. 3a, b), and robustly classified PECAM1 (CD31)-positive ECs into three categories characterized by distinct endothelium identity markers: venular ECs (ACKR1, PLVAP), arterial ECs (IGFBP3), and capillary ECs (PLVAP) (Fig. 2b). Other genes related to endothelial function varied among three subtypes of ECs, such as CCL14, SELP (P-selectin), SELE (E-selectin), and HLA family genes in venular ECs, VEGFC, JAG2, CLDN5 (Claudin-5) in arterial ECs, ESM1, KDR, FLT1 (VEGFR1) in capillary ECs, indicating functional divergence compatible with their anatomic localization (Fig. 2c). We confirmed the transcriptomic sub-clustering of ECs using external single-cell datasets and found similar marker genes among endothelial subtypes as well as the whole dataset (Supplementary Fig. 3c, d).

Fig. 2: Characterization of endothelial subtypes revealed the immunomodulatory function of venular ECs.
figure 2

a The sub-clustering of ECs in the in-house scRNA-seq dataset, with each EC subtype being distinguished by a different color. b Differential expression of canonical endothelium identity markers (PECAM1, IGFBP3, ACKR1, PLVAP) in three EC subtypes. c Heatmap showing the DEGs between venular ECs (vECs), arterial ECs (aECs), and capillary ECs (cECs). Top fifteen DEGs for each EC subtype are shown based on their fold change value. d The GO BP-terms (biological process terms) enriched in the DEGs between vECs, aECs, and cECs. Top six GO BP-terms for each EC subtype are shown based on their adjusted p-value. e Heatmap displaying smoothed pseudotime-dependent differential gene expression in vECs and cECs, with each row indicating a gene and each column indicating a cell. The genes are grouped into three clusters based on their pseudotime-dependent expression patterns. f Differential gene expression analysis of HEVs or non-HEV venular ECs, in comparison to cECs and aECs. The x-axis and y-axis, respectively, indicate the log2(fold change) value of the DEGs in HEVs and non-HEV venules. g Spatial maps of ECs annotated by RCTD and IHC staining of CD31 illustrating the highly consistent ___location of ECs in the representative areas on the ST slides of NCC-BR5. h UMAP plot showing the bin50 spots labeled as ECs across 30 ST slides. Three subtypes of ECs are identified by their signature genes and annotated by different colors. i Relative proportions of vECs, aECs, and cECs in the ST slides of 23 primary breast tumors. Red, green, and yellow indicate vECs, aECs, and cECs, respectively. j Relative proportions of vECs, aECs, and cECs in the ST slides of paired primary tumors and lymph node metastases. Red, green, and yellow indicate vECs, aECs, and cECs, respectively.

Gene Ontology (GO) enrichment analysis of differentially expressed genes (DEGs) revealed that, besides the enrichment for classical vascular functions, such as cell migration and adhesion, venular ECs showed characteristic upregulation of immune activation pathways, including MHC class II protein complex assembly and peptide antigen assembly (Fig. 2d), consistent with their high level of HLA family gene expression (Fig. 2c). These gene sets associated with MHC protein assembly and antigen presentation (C3 gene sets) were also observed to increase along pseudotime as endothelium development progressed from capillary to venule (Fig. 2e) (Supplementary Fig. 3f). On the other hand, high endothelial venules (HEVs) are specialized post-capillary venules restricted to various secondary lymphoid organs to efficiently mediate infiltration of naïve lymphocytes for education. They can sometimes arise spontaneously in human tumors and display an immature phenotype24,25. Using P-selectin as an immature HEV marker, we identified a number of SELP-positive HEV-like venular ECs within the cluster of venules (Supplementary Fig. 3e). Nevertheless, compared to arterial and capillary ECs, both HEV-like and non-HEV-like venular ECs were enriched for molecules involved in antigen presentation and immune activation signaling, such as HLA-DRB1, HLA-DRA, CCL14, and CD74, suggesting a common immunomodulatory role for venules in the breast tumor environment (Fig. 2f).

Next, we employed our ST data to identify and spatially locate ECs and their subtypes. We used a threshold value of 0.1 for the RCTD cell type score of ECs and successfully identified sporadic ECs on the ST slides. These ECs corresponded well with the CD31-positive ECs recognized by IHC on the adjacent slide (Fig. 2g). Using feature genes identified in the single-cell analysis (Supplementary Table 1), we were able to further identify three subtypes of ECs. ECs of the same subtypes aggregated together on the UMAP layout and displayed functional distinctions similar to our scRNA-seq results (Fig. 2h) (Supplementary Fig. 3g, h). Arterial ECs were characterized by muscle contraction and extracellular matrix organization, while capillary ECs showed enrichment in metabolic pathways such as oxidative phosphorylation and glycolysis, indicating high metabolic activity in the local microcirculation environment. Venular ECs, on the other hand, displayed strong antigen presentation and lymphocyte immune pathways (Supplementary Fig. 3g).

We assessed the proportions of three EC subtypes in the entire ST slide of each sample. Capillaries were the most abundant, constituting roughly 50% of all blood vessels in the primary tumors, with venule and artery accounting for an average of 35% and 15%, respectively (Fig. 2i). Meanwhile, an increasing proportion of venules and a decreasing proportion of capillaries were observed in metastatic lymph nodes (arterial: 11.9%; venular: 55.4%; capillary: 29.8%) as compared to their paired primary tumors (arterial: 21.3%; venular: 29.9%; capillary: 44.4%) (Fig. 2j).

Distinct microenvironment of different endothelial subtypes

To investigate the milieu of different EC subtypes, we determined the cell composition in each bin50 spot containing ECs (bin50 model) and in its neighboring eight bin50 spots (bin150 model), respectively (Fig. 3a), thereby obtaining the composition of various neighbor cells surrounding ECs across 30 ST slides (Fig. 3b). We quantified the proximity relationship between ECs and other major cell types by the enrichment score (ES) and the normalized enrichment score (NES) (see “Methods” for the detailed calculation algorithm). Within each bin50 spot, we noticed an enrichment of tumor cells around capillaries compared to other types of blood vessels, and this distribution pattern widely existed across different molecular subtypes and in the lymph node metastases (Fig. 3c). Moreover, when the observation space extended from the bin50 model to the bin150 model, the co-localization of capillary ECs and tumor cells remained significant (Supplementary Fig. 4a), consistent with the extensive angiogenesis reported in breast cancer22.

Fig. 3: Distinct microenvironment of different endothelial subtypes.
figure 3

a Schematic plot illustrating cell type co-localization analysis of ST data. A square represents a bin50 spot that covers a 25 µm area in the ST slide. In the bin50 model, the strength of co-localization between different cell types is explored in a single bin50 spot, while in the bin150 model, co-localization between cell types is explored between a bin50 spot and its eight neighboring bin50 spots. b Heatmap displaying the scaled NES between different cells in the bin50 model across thirty slides. c, d The NES between target cells and EC subtypes in the bin50 model across ST slides from PT (left panel) and from LN (right panel). The target cells include tumor cells (c) and T/NK cells (d). The NES between target cells and vECs, aECs, cECs, and ECs are compared by the Wilcoxon rank sum test (paired samples, two-tailed); ***p < 0.001. e, f The ES between vECs, aECs, cECs, all ECs and various target cells in the bin50 model across ST slides from PT (e) and from LN (f). The ES values between vECs, aECs, cECs, all ECs and target cells in the ST slides are calculated and each dot represents the median value of an ES set. The four ES sets are compared using a Wilcoxon rank sum test (paired samples, two-tailed). The ES set that is significantly higher than the other three ES sets is highlighted; *p < 0.05; ***p < 0.001. g, h Spatial in situ projection illustrating the distribution of different cells in ST slides of paired PT and LN from NCC-P18 and NCC-P12 patients. i Distribution of vECs, aECs and cECs on the representative ST slides of samples containing tertiary lymphoid structures (TLS), with TLS area labeled in black. j Proportion of vECs, aECs and cECs within the TLS areas across seven samples that contained TLS. k Proportion of vECs, aECs, and cECs with the closest Euclidean distance to TLS across all samples containing TLS (n = 7). P-values are determined by the two-tailed Wilcoxon rank sum test.

Meanwhile, venular ECs were located in close proximity to T/NK cells in the bin50/bin150 models of primary tumors and lymph node metastases, respectively (Fig. 3d) (Supplementary Fig. 4b). To obtain a comprehensive insight into the distinct immune milieu of venular ECs, we, in turn, calculated the ES to compare the prevalence of three EC subtypes surrounding various cells in the breast cancer microenvironment (Fig. 3e, f). Compared to capillaries or arteries, only venules were spatially close to T/NK cells and B cells, exhibiting remarkable stability across different tumor subtypes (molecular subtypes of breast cancer), tumor locations (primary tumor and lymph node metastasis), observation spaces (bin50 and bin150 model), or proximity models (ES and NES) (Fig. 3d-f) (Supplementary Fig. 4b-f). Nevertheless, T and NK cells constituted a minor fraction of the cellular landscape in ST slides. To accurately assess the spatial relationship between different EC subtypes and each immune cell, we calculated the Euclidean distances between each T or NK cell and the surrounding ECs, respectively, thereby identifying the subtype of the nearest EC to each immune cell. Consequently, by comparing the prevalence of different EC subtypes as the nearest cells, we validated that both T and NK cells are spatially closest to venular ECs (Supplementary Fig. 4g, h).

Through the in situ projection on the ST slides, we also visually confirmed the unique co-localization patterns of EC subtypes with tumor cells and immune cells: most capillaries were distributed inside the tumors (Fig. 3g), while venules were largely clustered around T/NK and B cells (Fig. 3h). Furthermore, mIF results from the adjacent sections corroborated these observations, demonstrating substantial aggregation of T, B, and NK cells around venules, which resembled tertiary lymphoid structures (TLS) (Supplementary Fig. 4i). TLS has been widely reported to promote adaptive immune response and play a pivotal role in enhancing tumor immune infiltration, thereby modulating the efficacy of immunotherapy26. By integrating H&E staining results with spatial cell annotation maps, we identified TLS-like organization in the ST slides of seven primary tumors (Fig. 3i). Notably, venular ECs were found to almost entirely constitute the vascular network within these TLSs (Fig. 3j). In addition, outside of the TLS, a higher proportion of venular ECs was in closest Euclidean distance to the TLS (Fig. 3k). These findings suggest a significant correlation between venular ECs and the aggregation of immune cells, as well as the formation of TLS.

Venular SMCs exhibited spatial positioning and biological phenotypes coordinated with venular ECs

The intima ECs and the adventitia SMCs together constitute the main body of blood vessels. These two types of cells are closely related both in space and function, and are involved in various physiological processes27. To elucidate the synergistic interaction between venular ECs and SMCs in the regulation of immune infiltration, we further characterized the heterogeneity of SMCs and categorized them into three distinct groups: SMC.1, SMC.2, and SMC.3 (Fig. 4a). All three groups highly expressed SMC markers, such as TAGLN (transgelin) and ACTA (alpha-actin), but SMC.2 exclusively expressed skeletal muscle markers, such as MUSTN1 and PLN, which are involved in vascular smooth muscle contraction and cell adhesion (Fig. 4b). In contrast, SMC.1 was enriched for viral immune and TNF signaling pathways and distinguished by the homeostatic chemokines, CCL21 and CCL19 (Fig. 4c, d). Similar clustering and subtype markers between SMC subtypes were confirmed in the external cohort (Supplementary Fig. 5a, b).

Fig. 4: Phenotypic and spatial heterogeneity of SMC subtypes.
figure 4

a The sub-clustering of SMCs in the in-house scRNA-seq dataset, with each SMC subtype distinctly color-coded. b Differential expression of feature genes (TAGLN, KCNJ8, MUSTN1, CCL21) among three SMC subtypes. c The KEGG pathways enriched in the DEGs between the SMC subtypes. Top six KEGG pathways for each SMC subtype are selected based on their adjusted p-value. d The DEGs between three SMC subtypes. Top fifteen DEGs for each SMC subtype are shown based on their fold change value. e, f The NES between target cells and vECs, aECs, cECs, all ECs in the bin50 model across ST slides of primary tumors. The target cells represent SMC.1 (e) and SMC.2 (f), respectively. The NES are compared by the Wilcoxon rank sum test (paired samples, two-tailed); ***p < 0.001. g Multiplex immunostaining for ACKR1 (vEC marker; orange), CCL21 (vSMC marker; green), CD31 (classical EC marker; red), and SMA (classical SMC marker; gray) validates the cell-type-specific expression of ACKR1 and CCL21, and further highlights the co-localization of vECs and vSMCs. h Relative proportions of bin50 spots that are labeled as venular SMCs (vSMCs) or arterial SMCs (aSMCs) in ST slides from PT and LN. i The ES between vECs/aECs/cECs and vSMCs/aSMCs in the bin50 model in ST slides from PT. ***p < 0.001, by the Wilcoxon rank sum test (paired samples, two-tailed). j The ES between vSMCs/aSMCs/SMCs and various target cells (including T/NK cells, B cells, macrophages, fibroblasts, and tumor cells) in ST slides from PT. ***p < 0.001, by the Wilcoxon rank sum test (paired samples, two-tailed). k Spatial map of NCC-LN13 ST slide showing the localization of vSMCs, aSMCs, and three EC subtypes, revealing the co-localization relationship between vSMCs and vECs and between aSMCs and aECs. Colors in each subgraph represent the corresponding cell type of bin50 spots. l Spatial map of NCC-LN13 ST slide showing the localization of vSMCs, aSMCs, immune cells, and tumor cells, revealing the co-localization relationship between vSMCs and different immune cells. Colors in each subgraph represent the corresponding cell type of bin50 spots.

Based on the feature genes determined from single-cell data, we identified SMC.1 and SMC.2 clusters in the ST datasets (Supplementary Fig. 5c, d) (Supplementary Table 1). Enrichment analysis revealed a similar phenotypic divergence of two SMC subtypes observed in scRNA-seq data (Supplementary Fig. 5e). To clarify the spatial relationship between SMC subtypes and EC subtypes, we calculated the NES of SMC.1 and SMC.2, respectively, within the neighborhood of ECs and three EC subtypes in primary tumors and metastatic lymph nodes. SMC.1 was found in close proximity to venular ECs (Fig. 4e) (Supplementary Fig. 5f), while SMC.2 was significantly adjacent to arterial ECs (Fig. 4f) (Supplementary Fig. 5g). Accordingly, on the basis of their characteristic spatial localization and biological function, SMC.1 and SMC.2 were recognized as venular SMCs and arterial SMCs, respectively.

We selected CCL21 and ACKR1 as distinct markers for venular SMCs and venular ECs, respectively. By integrating mIF staining results for SMA and CD31, we identified cell-type-specific expression patterns of these two markers and further revealed a spatial co-localization between CCL21-positive SMCs and ACKR1-positive ECs (Fig. 4g) (Supplementary Fig. 5h). Historically, CCL21 has long been regarded as a marker of certain post-capillary venules and HEVs24,25. Together with CCL19, it functions as one of the only two ligands for CCR7. In the context of tumor immune infiltration, CCL21 is often broadly described as being expressed by venules and presented on the surface of venular ECs to facilitate CCR7+ lymphocyte trafficking10,16. However, the precise cellular source of CCL21 has not been definitively characterized. Here, we provided evidence that venular SMCs are the primary source of CCL21 expression.

Quantification of SMC subtypes in each sample revealed a higher proportion of venular SMCs over arterial SMCs across ST slides (Supplementary Fig. 5i). In addition, lymph node metastases exhibited a significantly higher content of venular SMCs than primary lesions, following the same trend as was observed in venular ECs (Fig. 4h).

When looking into the immune cells in the neighborhood of venular SMCs, we discovered an enrichment of lymphocytes surrounding venular SMCs. Specifically, CCL21+ venular SMCs were found to be in close proximity to T/NK cells, B cells, and macrophages (Fig. 4i, j) (Supplementary Fig. 5j, k). As expected, the in situ projection on the slides visually exhibited the spatial proximity relationships between two pairs of vascular EC/SMC: venular EC/SMC and arterial EC/SMC. In the spatial map, we observed a characteristic annular distribution of arterial SMCs along the outer perimeter of the arterial ECs, highlighting the precise localization achieved through Stereo-seq (Fig. 4k) (Supplementary Fig. 5l). On the other hand, venular SMCs, similar to venular ECs, were abundantly present in regions of immune cell aggregation (Fig. 4l) (Supplementary Fig. 5m).

Venular components correlated with more lymphocyte infiltration and better clinical outcome in breast cancer patients

To explore the clinical significance of EC/SMC subtyping, we defined subtype-specific signature genes using single-cell datasets and then applied these markers to two large breast cancer cohorts to estimate the proportion of venular components in patients (Supplementary Fig. 6a) (Supplementary Table 2-3). The results indicated that a high proportion of venular ECs or SMCs was significantly correlated with improved overall survival (OS), as observed in both the METABRIC and the TCGA cohort (Fig. 5a, b) (Supplementary Fig. 6b, c)28,29.

Fig. 5: High numbers of venular ECs and SMCs were associated with better clinical outcomes.
figure 5

a, b Kaplan-Meier analysis of overall survival (OS) of breast cancer patients from the METABRIC cohort. Patients are stratified by their vECs signature scores (a) and vSMCs signature scores (b). P-values are determined using the log-rank test. c, d Forest plot demonstrating the prognostic values of vECs (c) and vSMCs (d) in patients from the METABRIC cohort (n = 1980). Multivariate analysis of OS was executed, wherein age, ER status, PR status, HER2 status, tumor stage, lymph node metastatic status, and the signature score of vECs or vSMCs are considered as covariates. Hazard ratios (HR) with 95% confidence intervals are calculated using Cox proportional hazards regression. e Heatmap illustrating the correlation between EC subtypes, SMC subtypes and other cells using bulk RNA-seq data of primary breast cancers from the METABRIC cohort. The Pearson correlation coefficients between the estimated proportions of different cell types were calculated. f Representative IHC images and recognition plots of tumors with high or low density of ACKR1 venules. Scale bar: 100 μm. g The distribution of the density of ACKR1+ venules in the tumors of recurrent and recurrent-free patients. Patients were stratified into two groups (venule-high and venule-low) using a value of 3 positive venules per 20× magnification field as the threshold. P-value is calculated by the two-sided Pearson’s Chi-square test.

We further examined the receptor expression in the venule-high group patients and found that the proportion of venular ECs was not associated with ER or PR status, but was negatively correlated with HER2 receptor expression (Supplementary Fig. 6d). Nevertheless, subgroup survival analysis for patients with different receptor statuses revealed that venular ECs consistently demonstrated the ability to predict a favorable prognosis across all subgroups (Supplementary Fig. 6e). Moreover, after conducting multivariate regression analysis that took age, tumor stage, lymph node metastasis, and ER/PR/HER2 receptor status into the calculation, venular ECs or venular SMCs still emerged as an independent factor influencing patient prognosis (Fig. 5c, d).

Intriguingly, when the multivariate analysis included both venular ECs and SMCs as covariates, their protective effects disappeared (Supplementary Fig. 6f), indicating an interdependent relationship between these two venular components in improving OS. In agreement, correlation analysis of bulk data revealed a significantly positive correlation between venular ECs and venular SMCs, and these two venular components also exhibited the highest correlation with T/NK and B lymphocytes compared to other subtypes of ECs and SMCs (Fig. 5e) (Supplementary Fig. 6g).

In addition, we established a recurrent breast cancer cohort, matching 148 individuals who experienced recurrence within five years with 148 recurrence-free patients based on their clinical characteristics, including age, tumor stage, and molecular subtype (Supplementary Fig. 6h). Through quantitative IHC analysis of the intratumoral ACKR1+ venules density, tumors were stratified into venule-high and venule-low groups (Fig. 5f). Notably, the recurrence-free patients exhibited a higher proportion of venule-high classification (Fig. 5g), despite having comparable overall levels of vascularization to the recurrent patients (Supplementary Fig. 6i).

Intercellular machinery of the pro-inflammatory vascular niche orchestrated by venular ECs and SMCs

We have demonstrated the unique spatial co-localization of ACKR1+ ECs and CCL21+ SMCs in relation to immune cells and the shared functionality of these two venular components in promoting immune infiltration. Next, we conducted differential analysis on spots containing only venular ECs or only venular SMCs versus spots containing both cell types, yielding a total of 194 DEGs (Fig. 6a). Co-expression analysis of these genes identified four modules in both scRNA-seq datasets (Fig. 6b) (Supplementary Fig. 7a) (Supplementary Data 3), with M3 and M4 modules showing the strongest expression in T/NK cells and B/macrophage cells, respectively (Fig. 6c) (Supplementary Fig. 7b). The M3 module was primarily involved in T cell receptor signaling and antigen receptor-mediated signaling pathways, while the M4 module mainly participated in the MHC-dependent antigen presentation pathway (Fig. 6d). We calculated the ES of spots with high expression of M3 and M4 modules and observed significant enrichment in spots where venular ECs and venular SMCs coexisted, as opposed to spots containing only venular ECs, venular SMCs, T/NK cells, or B cells (Fig. 6e), suggesting that the coexistence of venular ECs and venular SMCs was a prerequisite for the immune activation of T/NK cells and B cells surrounding blood vessels and this is independent of molecular subtype (p > 0.5, by the Wilcoxon rank sum test).

Fig. 6: Spatial cell-cell communication analysis identified a pro-inflammatory vascular niche.
figure 6

a Venn diagram showing the DEGs between bin200 spots labeled as different cell types. Set 1 (blue circle) represents DEGs between bin200 spots labeled as both vSMCs and vECs and bin200 spots labeled as vSMCs only. Set 2 (red circle) represents DEGs between bin200 spots labeled as both vSMCs and vECs and bin200 spots labeled as vECs only. A total of 194 DEGs are identified, of which 135 DEGs are shared by the two sets. b The co-expression modules detected from the scRNA-seq dataset through hierarchical clustering analysis of the gene co-expression relationships between the 194 DEGs listed in Fig. 6a. c Visualization of the M1, M2, M3, and M4 module expression in the UMAP plot of the in-house scRNA-seq dataset. d The GO BP-terms enriched in the gene modules of M1, M2, M3, and M4, with the top six BP-terms for each module selected based on their adjusted p-value. e Comparison of the ES of bin200 spots with high signature scores of M3 and M4 modules among five bin200 spot groups, by the Wilcoxon rank sum test (paired samples, two-tailed); ***p < 0.001. f The cell-type-pair specific LRIs inferred from scRNA-seq data using CellChat. g Scatter plot showing the enrichment of LRIs in the spatial locations containing the corresponding cell type pairs across the thirty ST slides. The plot examines the enrichment of LRI co-localization spots in the cell-type co-localization spots, with vE referring to vECs, aE to aECs, vS to vSMCs, and aS to aSMCs. P-values are determined using two-sided paired Wilcoxon rank-sum test (See “Methods”). h Spatial maps illustrating the cell distribution and gene expression in the NCC-BR9 ST slide. i The human umbilical vein endothelial cell (HUVEC) monolayers with ACKR1 overexpression promote trans-endothelial migration of PBMCs under CCL2 or CCL5 chemotactic stimulation. Each dot represents the number of cells in one well of the plate (n = 4 per group). Data are presented as mean ± SD. P-values are determined by the two-tailed Student’s t test.

To further elucidate the intercellular crosstalk between ACKR1+ ECs and CCL21+ SMCs, as well as the mechanism underlying their collaborative promotion of lymphocyte infiltration, we utilized CellChat to conduct cell communication analysis on both the in-house and external scRNA-seq datasets30, yielding a large number of cell-type-pair specific ligand-receptor interactions (LRIs) (Fig. 6f). We then tested the co-existence of these LRIs in the spatial locations containing the corresponding cell types and finally filtered out eight spatially co-localized LRIs (see “Methods”) (Fig. 6g). These eight interaction pairs included CCL21/CCL19-CCR7 interaction between venular SMCs and T/NK/B cells, CCL2-ACKR1 interaction between venular SMCs and venular ECs, CCL5-ACKR1 interaction between venular ECs and T/NK cells, and JAG1/2-NOTCH3 interaction between arterial ECs and arterial SMCs (Fig. 6g) (Supplementary Fig. 7c). These LRIs were significantly co-expressed in spots where corresponding sender and receiver cells coexisted (Supplementary Fig. 7d-g), such as venular SMC-expressed CCL2 ligand binding to ACKR1 receptor on venular ECs (Supplementary Fig. 7d, f), and T/NK cell-expressed CCL5 ligand binding to ACKR1 receptor on venular ECs (Supplementary Fig. 7e, g). The in situ projection of cell type and LR expression on slides also confirmed that the expression of LR pairs was markedly consistent with the corresponding cell-enriched area (Fig. 6h) (Supplementary Fig. 7h). To investigate whether ACKR1 receptor mediated CCL2/CCL5-induced lymphocyte transendothelial migration, we further established an endothelial-peripheral blood mononuclear cell (PBMC) co-culture system. As expected, we observed that EC monolayers with ACKR1 overexpression significantly elevated the transmigrating ability of PBMCs under CCL2 or CCL5 chemotactic stimulation (Fig. 6i). This highlights the crucial role of the ACKR1-mediated intercellular signal communication within the venular niche in promoting immune infiltration.

Based on the synergistic relationships among the venular niche and its surrounding cells in terms of their biological phenotype, spatial organization, and intercellular communication, we can ultimately speculate on the molecular mechanism by which ACKR1+ ECs and CCL21+ SMCs coordinate to promote immune infiltration (Fig. 7). The CCR7 receptor on the surface of lymphocytes plays a critical role in mediating immune cell migration and infiltration in response to its chemokine ligands, CCL21 and CCL1931. Here, CCL21+ venular SMCs are identified as the true source of these chemokines to recruit circulating lymphocytes in the blood vessel. Meanwhile, the ACKR1 receptor on the surface of venular ECs differs from typical chemokine receptors; upon binding with chemokine ligands, ACKR1 does not activate intracellular signaling but rather internalizes and transports these ligands, thus facilitating intercellular signaling between ECs and perivascular cells31. ACKR1-mediated binding and presentation of ligands contribute to a chemokine “depot” within the endothelial junction, leading to the luminal-to-abluminal migration of lymphocytes32. In this context, when lymphocytes are attracted to the venular niche by CCL21/CCL19 released from venular SMCs, ACKR1+ venular ECs bind CCL5 ligands from T/NK or B cells and CCL2 ligands from venular SMCs via ACKR1, promoting close-range communication between venular SMCs and lymphocytes and creating a chemokine-rich vascular niche, where lymphocytes efficiently become activated and migrate through venular walls to participate in the anti-tumor response within tumors (Fig. 7).

Fig. 7: Hypothesized model of the collaborative function between venular SMCs and ECs to augment lymphocyte extravasation.
figure 7

In the breast cancer microenvironment, circulating CCR7+ lymphocytes in the blood are first guided to the venules under the chemotaxis of CCL19 and CCL21 secreted by venular SMCs. Venular ECs then bind CCL5 ligands from T/NK or B cells and CCL2 ligands from venular SMCs via atypical chemokine receptor ACKR1, which mediates the transcytosis and presentation of these chemokines. This process promotes cell-cell communication between venular SMCs and lymphocytes and creates a chemokine-rich vascular niche, where lymphocytes are activated to transmigrate across the endothelium into the extravascular space and participate in the anti-tumor response. Schematic plot was created using bioRender.

Discussion

In this study, by integrating scRNA-seq with Stereo-seq data, we established a spatially resolved transcriptomic atlas of 30 human breast tumors across four molecular subtypes, providing a unique resource that comprehensively delineates the ecosystem of primary breast tumors and metastatic lymph nodes. We utilized the DNB-based genome-wide sequencing technology to characterize TME in an unbiased manner. Compared with other ST methods, the nanoscale resolution mapping enabled an accurate characterization of tumor vasculature. We focused on illuminating the spatial and functional heterogeneity of vascular ECs and SMCs and found that venular ECs and SMCs created a distinct pro-inflammatory niche in which they cooperated through multiple chemokines to augment lymphocyte chemotaxis and entry into tumors.

The infiltration of activated lymphocytes is deemed a prerequisite for successful anti-tumor response following immune checkpoint blockade16,33. Here, a synthesis of spatial and transcriptional evidence confirmed the critical role that venules play in facilitating lymphocyte infiltration and activation. Both venular components displayed upregulation of genes involved in antigen presentation or immune activation and exhibited significant spatial proximity to lymphocytes in primary tumors and metastatic lymph nodes. Compared to lymphocytes positioned away from venules, those located adjacent to venules possessed higher activity of immune activation signaling. Consistently, a high number of venules in breast cancer was correlated with increased immune infiltration, less tumor recurrence, and superior overall survival. Nevertheless, compared to capillaries, venules tended to be located at the interphase or periphery of primary and metastatic tumors, which can curtail T and B cells accumulation in the vicinity of tumors and limit the immune-stimulating function of venules. Therefore, exploring therapies that can induce the formation of venules within the tumor could be a promising direction for future research. Recent studies have demonstrated that the triple combination of anti-VEGFR2, anti-PD-L1, and lymphotoxin-beta-receptor (LTβR) agonist can effectively induce the formation of tumor HEVs13.

Over the past decade, significant progress has been made in elucidating the mechanisms underlying the migration of immune cells across the vascular barrier16,34. However, a lack of scRNA-seq resources for tumor HEVs and high-resolution spatial information on tumor vasculature remains to date25. Current investigations typically treat venules as a singular entity and focus on the interaction between ECs and immune or tumor cells. The role of SMCs, as the closest partners of ECs, has largely been overlooked due to technical limitations. In this study, the integration of scRNA-seq and Stereo-seq enabled in-depth documentation of the EC-SMC interactome and revealed their synergistic effect in promoting immune infiltration (Fig. 7). We have identified the specific cell type responsible for CCL21 and CCL19 production within the TME, which are actually secreted by venular SMCs. This discovery emphasizes the pivotal role of venular SMCs in regulating immune infiltration through venules, reminiscent of the established function of fibroblastic reticular cells (FRCs) in the secondary lymphoid organs35. CCL21 and CCL19 are then under transcytosis through ECs and enter the lumen of venules to attract CCR7 + T/NK cells31. Our findings also revealed CCL2-ACKR1 interaction between venular SMCs and ECs and CCL5-ACKR1 interaction between T/NK cells and venular ECs, indicating active paracrine signaling loops between lymphocyte and the venular components within the chemokine-rich vascular niche. The co-existence of ACKR1+ ECs and CCL21+ SMCs was found to be indispensable for the immunomodulatory function of venules and effective activation of lymphocytes. This presents the possibility of targeting SMC to overcome endothelial anergy in tumors16, where lymphocyte-endothelial cell interaction becomes unproductive in an angiogenic TME. Indeed, it has been reported that tumors can remodel HEVs in the lymph node by triggering the dysregulation of CCL21-expressing FRCs and depriving HEVs of LTβR signaling36,37.

Finally, the prognostic significance and spatial localization/co-localization pattern of venules in TME are independent of molecular subtypes or other clinical variables of breast cancer patients. This immunostimulatory function of ACKR1+ ECs or associated CCL19/21+ SMCs/fibroblasts has also been demonstrated in other tumors, including colorectal38,39, gastric40, and non-small cell lung cancer41, as well as other immune-related diseases, such as thyroid-associated ophthalmopathy15, Hashimoto’s thyroiditis14, and cardiovascular disease42, highlighting a shared responsibility for the venular niche, wherein venular ECs and SMCs cooperate to mediate immune infiltration during inflammation and immune response throughout the body.

In conclusion, this study presents the most comprehensive investigation to date into the geospatial structure of human breast cancer across different molecular subtypes. High-resolution transcriptomic maps showcase the phenotypic and spatial heterogeneity of tumor vasculature and characterize an immune-boosting venular niche driven by the coordination between venular ECs and SMCs. Further functional validation of various paracrine signaling loops may lead to effective vascular-targeted therapies to enhance the efficacy of immunotherapy for patients with breast cancer.

Methods

Ethics statement

This study conformed to the ethical guidelines of the Declaration of Helsinki and was approved by the Ethics Committee of National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences (23/226-3968). Written informed consent was obtained from all participants prior to their inclusion in the study. All patient-derived data were anonymously analyzed.

Patients and samples

The fresh specimens of the 30 primary breast cancers and the paired metastatic lymph nodes in the ST cohort and the FFPE specimens of 296 patients in the validation cohort were obtained from Cancer Hospital, Chinese Academy of Medical Sciences. Clinical and pathological information of patients was collected by the review of electronic medical record. The follow-up was performed by regular patient visits or telephone calls (Supplementary Data 1).

Preparation, sequencing, and alignment of scRNA-seq libraries

The scRNA-seq libraries were prepared using the DNBelab C4 system as previously described43. To prepare scRNA-seq libraries, fresh tumor bulk tissues were rinsed with PBS and minced into small pieces using mechanical dissociation. The tissues were then incubated in DS-LT buffer (0.2 mg/ml CaCl2, 5 μM MgCl2, 0.2% BSA, and 0.2 mg/ml Liberase in HBSS) at 37 °C for 1 hour, followed by the addition of FB to stop the tissue digestion. Subsequently, the tissue samples were filtered through a 100-μm cell strainer and centrifuged at 500 g for 5 minutes at 4 °C. The resulting pellets were then resuspended in cell resuspension buffer at a concentration of 1,000 cells per μl to prepare the library.

The libraries were sequenced using a DNBSEQ-T1 or DNBSEQ-T7 sequencer at the China National GeneBank. Each library had a 41-bp read length for read 1 and a 100-bp read length for read 2. Raw sequencing reads were filtered and demultiplexed using PISA (v0.2). We then aligned the reads to the reference human genome using STAR (v2.7.4a) and sorted them using sambamba (v0.7.0). Finally, we generated a cell versus gene UMI count matrix for each library using PISA.

Preparation, sequencing, and alignment for ST libraries

The ST libraries were prepared following the Stereo-seq protocol19. Fresh bulk tumor tissues were collected, snap-frozen in optimum cutting temperature compound (OCT) (Tissue-Tek; Sakura Finetek USA, Torrance, CA, USA), and stored at -80 °C. We sliced these pre-frozen tissues transversely in OCT at a thickness of 10 μm using a CM1950 cryostat (Leica, Wetzlar, Germany) and then affixed the tissue sections to the Stereo-seq chip surface. After incubating the tissue sections for 3 min at 37 °C, they were fixed in methanol and incubated at -20 °C for 40 minutes. H&E staining was then performed on these sections along with the adjacent tissue sections. We used a Ti-7 Eclipse microscope (Nikon, Tokyo, Japan) to image both procedures.

The libraries were sequenced on an MGI DNBSEQ-Tx sequencer. Read 1 contained the coordinate identity (CID) bar code and molecular identifiers (MIDs) (CID: 1-25 bp, MID: 26-35 bp), while read 2 consisted of the cDNA sequences. Raw sequencing reads were filtered and demultiplexed using the SAW pipeline (https://github.com/BGIResearch/SAW). We generated a CID-containing expression profile matrix for each library.

ScRNA-seq data analysis

Raw data processing

Raw scRNA-seq data from either the in-house scRNA-seq dataset or Wu et al.'s scRNA-seq dataset obtained from GEO (https://www.ncbi.nlm.nih.gov/geo/, accession: GSE176078) were processed using the R package Seurat44 (v4.1.1). We merged the gene expression matrix of libraries from each sample, calculated the percentage of reads that mapped to the mitochondrial genome (percent.mt), and counted the genes in each cell. We recorded the gene number as Nexpress based on the cell with the most genes. Finally, we filtered out cells with less than 400 genes or with more than 0.9 * Nexpress genes or with a percent.mt larger than 0.2.

To exclude reads from ambient RNA, we applied the R package DecontX45 to each sample with default parameters. For the removal of doublet, we employed the R package DoubletFinder with default parameters46, and removed the 7.5% of cells that most closely resembled pseudo-doublets.

Integration, clustering, and annotation of scRNA-seq data

We conducted independent data integration, cell clustering, and cell type annotation analysis on both the in-house dataset and Wu et al.'s dataset using Seurat. In each dataset, we performed log-normalization using the NormalizeData function with a scale factor of 10,000 and identified highly variable features using the FindVariableFeatures function (nfeatures = 2000). Data in each dataset was integrated using the FindIntegrationAnchors and Integrate Data functions, and the counts were scaled and centered in the integrated dataset using the ScaleData function. For dimensional reduction, we used RunPCA with the first 50 principal components and performed graph-based clustering with a resolution of 0.1 using the FindClusters function. We then employed RunUMAP to execute uniform manifold approximation and projection (UMAP). The FindAllMarkers function was used to identify cluster marker genes. Cell types were annotated by overlapping the cluster markers with their canonical markers.

For the sub-clustering of ECs and SMCs, we conducted cell type sub-clustering analysis in the in-house and Wu et al.'s datasets using the analysis process described above. We used the FindClusters function with a resolution parameter of 0.8 for the in-house dataset and a resolution parameter of 0.8 (for ECs) and 1.6 (for SMCs) for Wu et al.'s dataset. The FindAllMarkers function was used to find cluster marker genes for the sub-clusters of ECs and SMCs, and the cell subtypes were annotated by overlapping the cluster markers with the signature genes. We merged cell sub-clusters that shared the same signature genes.

In addition, we also conducted sub-clustering of ECs using the k-means algorithm on the in-house dataset. After data integration and cell clustering analyzes, we obtained the embedding values of cells from the Seurat object. We then clustered the cells using the k-means algorithm with a value of k = 17 based on the embedding values. The FindAllMarkers function was used to capture cluster marker genes for the sub-clusters, and the EC subtypes were annotated by overlapping the cluster markers with the signature genes. We merged cell sub-clusters that shared the same signature genes.

Differential gene expression, gene set enrichment, and cell trajectory analyzes

We utilized the FindMarkers function built in Seurat to conduct differential gene expression analysis between subtypes of ECs and SMCs and identified genes with adjusted P-values < 0.05 as DEGs. The R package clusterProfiler47 (v4.4.4) was used for gene set enrichment analysis, with the enrichKEGG and enrichGO functions applied. Intra-cell type trajectories of venular and capillary ECs were constructed using monocle2 in the in-house dataset48.

Uncovering cell-type-pair specific LRIs in scRNA-seq data

We employed the R package CellChat30 (v1.5.0) to detect significant ligand-receptor interactions (LRIs) between specific cell type pairs, including venular ECs and venular SMCs, venular ECs and T/NK cells, venular SMCs and T/NK cells, venular ECs and B cells, venular SMCs and B cells, tumor cells and capillary ECs, and arterial ECs and arterial SMCs, within both the in-house dataset and Wu et al.'s dataset. We performed data integration, clustering, and annotation analysis on the in-house dataset and then extracted the log-normalized read count matrix from the Seurat object. Using the default CellChatDB as the LRI database, we employed the identifyOverExpressedGenes function with a thresh.pc parameter of 0.2 and a thresh.fc parameter of 1.5 to filter genes. We then utilized the functions identifyOverExpressedInteractions, computeCommunProb, filterCommunication, and subsetCommunication step by step to identify potential LRIs on the in-house dataset. The analysis process used to obtain potential LRIs on Wu et al.'s dataset was similar.

LRIs were chosen for each dataset between different cell type pairs. We considered LRIs as the cell-type-pair specific LRIs if they had a “prob” value greater than 0.01 in both datasets.

ST data analysis

Raw data processing

In order to address the issue of low RNA capture efficiency on single DNB spots with a resolution of 500 nm, we convolved the raw spatial expression matrix of each of the thirty samples into larger pseudo-spots of a 50 × 50 window size (bin50 model). We then filtered out spots containing less than 200 genes, which resulted in thirty bin50 stereo-seq datasets. Additionally, we convolved the raw spatial expression matrix of each sample into larger pseudo-spots of a 200 × 200 window size (bin200 model) using a similar approach, resulting in thirty bin200 stereo-seq datasets.

Inferring cell composition in a bin50 ST dataset using the RCTD algorithm

To determine the composition of cell types in each bin50 spot, we applied the RCTD algorithm to two independent reference scRNA-seq datasets, including our in-house dataset and Wu et al.'s dataset, using the R package spacexr21 (v2.2.0). The RCTD algorithm was applied using the full model to each bin50 dataset with both reference scRNA-seq datasets. Cell type scores for each bin50 spot were normalized using the normalize_weights function, and cell type labels were assigned to a spot if the scores exceeded the corresponding threshold value. The threshold value differed among cell types, with a value of 0.2 for fibroblast, 0.5 for tumor cell, 0.05 for SMC, and 0.1 for EC, basal cell, macrophage, B cell, and T/NK cell.

Inferring cell composition in a bin50 ST dataset with cell type signature genes

To determine the composition of cell types in a bin50 dataset using gene signature scores, we first extracted the gene signatures for fibroblasts, tumor cells, SMCs, ECs, basal cells, macrophages, B cells, and T/NK cells from scRNA-seq data (Supplementary Data 2). We then performed log-normalization on a bin50 dataset using the NormalizeData function of Seurat. We summed the normalized expression values of the corresponding gene signature in a bin50 spot as the gene signature score of a cell type for the bin50 spot. Subsequently, we calculated the gene signature scores of the eight cell types for each bin50 spot in thirty bin50 datasets.

Cellular spatial distance analysis

To elucidate the spatial juxtaposition of tertiary lymphoid structure (TLS) and lymphocytes (T and NK cells) with three endothelial subtypes, we firstly performed manual annotation of TLS based on the H&E staining images adjacent to ST slides and spatial cell type maps. TLS was defined as a dense, round or oval-shaped lymphoid aggregation with or without germinal center. Then, we quantified the mean Euclidean distances from each endothelial subtype to the nearest TLS. For each slide containing TLS, we calculated the Euclidean distance from the first cell within the TLS to each endothelial cell located outside the TLS. The subtype of the nearest endothelial cell was recorded and this process was repeated for each cell within the TLS. Once the distances were determined, we counted the numbers of venular EC (vEC), arterial EC (aEC), and capillary EC (cEC) closest to the TLS, then calculated their proportions relative to the total number of cells for each subtype, respectively. The same methodology was applied to calculate distances to lymphocytes.

Integration and clustering of bin50 data

We applied R packages Seurat and Harmony49,50 (v0.1.1) to integrate and cluster the thirty bin50 datasets. We sampled 10,000 spots from each bin50 dataset and integrated them with Harmony. We employed RunPCA to conduct dimensional reduction using the initial 30 PCs and FindClusters to perform graph-based clustering with a resolution of 0.1. Next, we employed RunUMAP to execute UMAP.

Inferring EC subtypes and SMC subtypes in bin50 datasets

We employed Seurat and Harmony to assign EC subtype labels to spots comprising these cells in the thirty bin50 datasets. Initially, we extracted the spots labeled as ECs from each bin50 dataset and integrated them with Harmony. We then employed RunPCA to conduct dimensional reduction using the initial 30 PCs and FindClusters to perform graph-based clustering with a resolution of 0.7. Next, we employed RunUMAP to execute UMAP. Clusters with significant expression of the feature genes of EC subtypes (Supplementary Table 1) were labeled as corresponding subtypes, respectively. We employed a similar procedure to assign SMC subtypes labels to spots using a resolution of 0.8.

Differential gene expression and gene set enrichment analysis

The differential gene expression and gene set enrichment analysis performed on the ST data were identical to those performed on the scRNA-seq data.

Co-localization analysis of cell type/subtype

We have introduced two proximity models, namely the Enrichment Score (\({ES}\)) and the Normalized Enrichment Score (\({NES}\)), to measure the co-localization strength between distinct cell types within two distinct observation spaces: bin50 and bin150 models. We illustrated the algorithm by utilizing \({ES}\) and \({NES}\) to evaluate the co-localization strength between T/NK cells and target cells (ECs, vECs, aECs, and cECs) in the 50 × 50 DNB window (bin50 model).

In the first step, we enumerated the spots within a bin50 dataset labeled as T/NK (\(x\)), EC (\({y}_{1}\)), vEC (\({{{{\rm{y}}}}}_{2}\)), aEC (\({y}_{3}\)) and cEC (\({y}_{4}\)). For the dataset satisfying \(\min\) \(({y}_{1},{y}_{2},{y}_{3},{y}_{4}) \, > \, 99\), we further computed the spots with both T/NK and EC labels (\({z}_{1}\)), both T/NK and vEC labels (\({z}_{2}\)), both T/NK and aEC labels (\({z}_{3}\)), and both T/NK and cEC labels (\({z}_{4}\)) within the same dataset. \({ES}\) between T/NK and EC was obtained by dividing \({z}_{1}\) by \({y}_{1}\). Similarly, we calculated ESs between T/NK and vEC, aEC, cEC as \({z}_{2}/{y}_{2}\), \({z}_{3}/{y}_{3}\), and \({z}_{4}/{y}_{4}\), respectively. \({NES}\) between T/NK and EC was estimated as \(({z}_{1}/x)/\left({y}_{1}/\min ({y}_{1},{y}_{2},{y}_{3},{y}_{4})\right.\). Likewise, we computed \({NESs}\) between T/NK and vEC, aEC, and cEC as \(({z}_{2}/x)/\left({y}_{2}/\min ({y}_{1},{y}_{2},{y}_{3},{y}_{4})\right.\), \(({z}_{3}/x)/\left({y}_{3}/\min ({y}_{1},{y}_{2},{y}_{3},{y}_{4})\right.\), and \(({z}_{4}/x)/\left({y}_{4}/\min ({y}_{1},{y}_{2},{y}_{3},{y}_{4})\right.\), respectively. In this way, we measured \({ES}\) and \({NES}\) between T/NK and EC, vEC, aEC, and cEC in thirty bin50 datasets.

In the second step, we employed Wilcoxon rank-sum test (paired samples and two-tailed model) to compare four sets of \({ES}\). Specifically, the four \({ES}\) sets were composed of the ESs between T/NK and vEC, aEC, cEC obtained from 23 bin50 datasets of primary tumor samples. If one \({ES}\) set was considerably higher than the remaining three ES sets (p-value < 0.05), this would suggest that the specific EC type (or subtype) maintained the strongest co-localization strength with T/NK using enrichment score model in the bin50 model of primary tumor samples. An analogous procedure was conducted to compare the four \({ES}\) sets in 7 bin50 datasets of samples from lymph node metastases. Likewise, the four sets of \({NES}\) were compared within bin50 datasets derived from either primary tumor samples or lymph node metastases samples.

We also utilized \({ES}\) and \({NES}\) to estimate the co-localization strength between T/NK cells and target cells (ECs, vECs, aECs, and cECs) within the 150 × 150 DNB windows (bin150 model). To enhance the precision of cell type characterization in bin150 spots, we leveraged the higher resolution bin50 spots. By considering a bin150 spot as an aggregate of nine bin50 spots, we initiated our analysis by selecting a central bin50 spot and identifying its eight immediate neighbors within the bin50 dataset. Subsequently, we cataloged the cell type labels for the ensemble of nine spots. These labels were then extrapolated to the central bin50 spot, providing an inferred cell type designation for the corresponding bin150 spot. This inferred label, deemed a “succeed” label, denotes the cell type presence within a bin150 spot, centered upon the chosen bin50 spot. This process was repeated for every spot in the thirty bin50 datasets.

In the second step, we enumerated the spots within a bin50 dataset labeled as T/NK (\(x\)), EC (\({y}_{1}\)), vEC (\({y}_{2}\)), aEC (\({y}_{3}\)) and cEC (\({y}_{4}\)). For the dataset satisfying \(\min ({y}_{1},{y}_{2},{y}_{3},{y}_{4}) \, > \, 99\), we further computed the spots with “succeed” T/NK label and EC label (\({q}_{1}\)), “succeed” T/NK label and vEC label (\({q}_{2}\)), “succeed” T/NK label and aEC label (\({q}_{3}\)), and “succeed” T/NK label and cEC label (\({q}_{4}\)) within the same dataset. \({ES}\) between T/NK and EC was obtained by dividing \({q}_{1}\) by \({y}_{1}\). Analogously, we calculated ESs between T/NK and vEC, aEC, cEC as \({q}_{2}/{y}_{2}\), \({q}_{3}/{y}_{3}\), and \({q}_{4}/{y}_{4}\), respectively. \({NES}\) between T/NK and EC was estimated as \(({q}_{1}/x)/\left({y}_{1}/\min ({y}_{1},{y}_{2},{y}_{3},{y}_{4})\right.\). Likewise, we computed \({NESs}\) between T/NK and vEC, aEC, and cEC as \(({q}_{2}/x)/\left({y}_{2}/\min ({y}_{1},{y}_{2},{y}_{3},{y}_{4})\right.\), \(({q}_{3}/x)/\left({y}_{3}/\min ({y}_{1},{y}_{2},{y}_{3},{y}_{4})\right.\), and \(({q}_{4}/x)/\left({y}_{4}/\min ({y}_{1},{y}_{2},{y}_{3},{y}_{4})\right.\), respectively. The bin150 model was utilized to estimate the co-localization strength between bin50 spots and the “succeed” label, which represents the cell type associated with a bin150 spot. In this way, we evaluated \({ES}\) and \({NES}\) between T/NK and EC, vEC, aEC, and cEC with bin150 model in thirty datasets.

In the third step, we executed the same Wilcoxon rank-sum test (paired samples and two-tailed model) as described above to compare the four sets of \({ES}\) and four sets of \({NES}\).

Co-localization analysis of gene modules

In the initial step, we designated cell type labels for spots in a bin200 dataset by utilizing the cell type labels of spots present in the bin50 dataset of the same ST slide. Bin50 spots and bin200 spots were linked if they shared spatial areas on the ST slide. Subsequently, the cell type labels of bin200 spots were determined based on the cell type labels of their linked bin50 spots.

In the second step, we merged the thirty bin200 datasets into one using Seurat’s merge function. Within the dataset, we identified three spot groups: 1) spots with both vEC and vSMC labels, 2) spots with vEC label but not vSMC label, and 3) spots with vSMC label but not vEC label. Using the FindMarkers function built in the Seurat package, we conducted differential analysis on 2) spots containing only vECs or 3) only vSMCs versus 1) spots containing both cell types, yielding a total of 194 DEGs, of which 135 DEGs were shared by the two sets.

In the third step, we identified co-expression gene modules for the DEGs on the in-house scRNA-seq dataset. We first enumerated the gene pairs within the DEGs and calculated the co-expression value of each pair. As an example, for the CCL5-TRBC1 pair, we created Set1 with cells expressing the top 5% of log-normalized CCL5, and Set2 with cells expressing the top 5% of log-normalized TRBC1. We counted the number of cells shared between Set1 and Set2 (\({a}_{1}\)) and the number of unique cells in Set1 and Set2 (\({a}_{2}\)). We then determined the co-expression value of CCL5 and TRBC1 as \({a}_{1}/{a}_{2}\). Finally, we utilized hierarchical clustering analysis on the co-expression values of gene pairs to manually identify four distinct gene modules (M1, M2, M3, and M4) (Supplementary Data 3).

In the fourth step, we focused on gene module M1 in the in-house dataset, where the module score of a cell was determined by calculating the sum of its log-normalized gene expression values within the module. We then computed the module scores for all cells within the in-house dataset, and similarly for modules M2, M3, and M4.

In the fifth step, we conducted co-localization analysis between modules M3 and M4, utilizing the enrichment score (\({ES}\)). Using the top 10% of M3 module scores as the M3-high spot set, we identified the M4-high spot set in a similar manner within a bin200 dataset. The M3-M4 co-localization spots were defined as the spots that held high scores in both M3 and M4 in the bin200 dataset. We then counted the number of bin200 spots with vEC and vSMC label (\({i}_{1}\)), vEC label not vSMC label (\({i}_{2}\)), vSMC label not vEC label (\({i}_{3}\)), T/NK label not vEC or vSMC label (\({i}_{4}\)), and B label not vEC or vSMC label (\({i}_{5}\)). Additionally, we counted the number of M3-M4 co-localization spots in the five spot groups (\({j}_{1}\), \({j}_{2}\), \({j}_{3}\), \({j}_{4}\), and \({j}_{5}\), respectively). Using this data, we calculated the \({ES}\) of M3-M4 co-localization spots on the five spot sets as \({j}_{1}/{i}_{1}\), \({j}_{2}/{i}_{2}\), \({j}_{3}/{i}_{3}\), \({j}_{4}/{i}_{4}\) and \({j}_{5}/{i}_{5}\). We calculated the \({ES}\) of M3 and M4 in the thirty bin200 datasets in a similar fashion.

In the sixth step, we executed the same Wilcoxon rank-sum test (paired samples and two-tailed model) as described above to compare the \({ES}\) sets.

Co-localization analysis of LRIs

In the first step, as described above, we employed the cell type labels of spots found within the bin50 dataset of the same ST slide to designate cell type labels for spots within the bin200 dataset.

In the second step, we determined the co-localization of cell-type-pair specific LRIs within each spot of a bin200 dataset. For example, to determine the co-localization state of the CCL2-ACKR1 LRI, we established “co-localization” if a spot contained more than one CCL2 UMI and one ACKR1 UMI. The co-localization state of all cell-type-pair specific LRIs was similarly determined within the thirty bin200 datasets.

In the third step, we conducted co-localization analysis of the cell-type-pair specific LRIs in the corresponding cell types, utilizing the enrichment score (\({ES}\)). For instance, using “vSMCs-vECs: CCL2-ACKR1” as an example, we counted the number of spots with venular EC and venular SMC label (\({n}_{1}\)), venular EC label not venular SMC label (\({n}_{2}\)), venular SMC label not venular EC label (\({n}_{3}\)) within a given bin200 dataset. Additionally, we counted the number of CCL2-ACKR1 co-localization spots within the three spot groups (\({m}_{1}\), \({m}_{2}\) and \({m}_{3}\), respectively). From this information, we calculated the \({ES}\) of CCL2-ACKR1 co-localization spots within the three spot groups as \({m}_{1}/{n}_{1}\), \({m}_{2}/{n}_{2}\) and \({m}_{3}/{n}_{3}\). We calculated the \({ES}\) of other cell-type-pair specific LRIs in the same manner within the thirty bin200 datasets.

In the fourth step, we calculated the fold change value and p-value for each cell-type-pair specific LRI. For instance, using the example of “vSMCs-vECs: CCL2-ACKR1,” we calculated the median value of the three \({ES}\) sets representing the enrichment of CCL2-ACKR1 on the above three spot groups (\({{es}}_{1}\), \({{es}}_{2}\), and \({{es}}_{3}\) respectively). The fold change value was then computed as \({{es}}_{1}/\max ({{es}}_{2},{{es}}_{3})\). We employed the Wilcoxon rank-sum test (paired samples and two-tailed model) to compare the \({ES}\) of CCL2-ACKR1 in spots with venular EC and venular SMC label versus those with venular EC label not venular SMC label to derive a p-value (\({p}_{1}\)). We also compared \({ES}\) of CCL2-ACKR1 in spots with venular EC and venular SMC label versus those with venular SMC label not venular EC label to obtain a p-value (\({p}_{2}\)). From these results, the p-value of “vSMCs-vECs: CCL2-ACKR1” was determined as \(\max ({p}_{1},{p}_{2})\). Using this methodology, we calculated the fold change value and p-value of other cell-type-pair specific LRIs within the thirty bin200 datasets.

Bulk RNA-seq data analysis

Data acquisition and processing

The bulk RNA-seq data and clinical information of TCGA (The Cancer Genome Atlas) and METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) projects were downloaded from cBioPortal (https://www.cbioportal.org/). The z-score normalized expression values of genes were extracted from primary breast tumor samples to create both the TCGA and METABRIC datasets.

Survival and correlation analysis

To define the signature genes for venular ECs and SMCs, we performed differential gene expression analysis on venular ECs (Supplementary Table 2) and venular SMCs (Supplementary Table 3) compared to all other cell populations, respectively. Genes with a log2FC greater than 1.5 and a pct.2 less than 0.1 were retained and non-specific expression genes were excluded based on the expression bubble plot (Supplementary Fig. 6a). Ultimately, a set of venular EC (ACKR1, CCL14, SELE, ZNF385D, CAVIN2, SELP, CADM3-AS1, LIFR, IL33) and venular SMC (CCL19, CCL21, SYNPO2, UBD, FGF7, GGT5, RARRES2) signature genes were identified for survival analysis.

Using the signature scores of venular ECs and SMCs, we predicted the prognosis of samples in both the TCGA dataset and the METABRIC dataset. The Kaplan-Meier plotter analysis and the Cox regression analysis were performed using the R packages survival (v3.4.0) and survminer (v0.4.9). Bisque (v1.0.5) was utilized to estimate the cell type proportion in the TCGA and the METABRIC dataset. The Pearson correlation coefficients between the proportions of different cell types/subtypes were then calculated.

Immunohistochemistry (IHC) staining

Formalin-fixed paraffin-embedded (FFPE) tissue blocks of the validation cohort (296 cases) were serially cut into 4-μm sections. Sections were then deparaffinized, rehydrated, and subjected to pretreatment for antigen retrieval. Primary antibodies for ACKR1/DARC (Clone EPR5205, ab137044, Abcam, Shanghai, China) and for CD31 (Clone EP78, Zhongshan Golden Bridge Biotechnologies, Beijing, China) were applied to the sections using the Ventana BenchMark Ultra detection system and automatic staining device. The adjacent frozen slides of all 30 ST slides were also stained with CD31 using the same protocol.

Scoring of ACKR1-positive venules

The stained tissue sections were scanned using the KFBIO digital pathology scanner (KF-PRO-040) to obtain images of the whole slides. These scanned images were then viewed and captured under 20× magnification with the K-Viewer software. Quantification of ACKR1-positive venules on tissue sections was conducted by evaluating the number of positive venules with diameters ranging from 50 μm to 200 μm across five 20× magnification fields. This counting process was performed using a combination of image analysis software (Image-Pro Plus 6.0, Media Cybernetics, U.S.A) recognition and manual review. The validation cohort cases were then classified into two groups according to their average ACKR1-positive venules, using a median value of 3 positive venules per 20× magnification field as the threshold for stratification. To compare the vascularization status of these two groups, we also quantified the CD31-positive vessels on the tissue sections by evaluating their average number in the tumor region across five 40× magnification fields.

Multiplex immunofluorescence (mIF) staining

Thirty fresh OCT-embedded tissue blocks of the ST cohort were sectioned serially for mIF staining. Polychromatic immunofluorescence staining was performed by using the five-color multiple fluorescent immunohistochemical staining kit (PR001, abcarta, Suzhou, China) based on the Tyramide signal amplification (TSA) technique following the manufacturer’s manual. Briefly, the tissue sections were subjected to the blocking protocol using the 3% Hydrogen peroxide for 10 minutes and incubated for 30 minutes at ambient temperature with the primary target antibody, then slides were washed with buffer solution (BOND Wash Solution 10× Concentrate, 1:10). Next, sections were incubated for 20 minutes with Horseradish peroxidase-labeled goat anti-rabbit/mouse, which was developed with fluorescent dye diluted by the signal amplification reagent provided with the kit. Then, the buffer solution was used to remove binding antibodies, and the whole process was repeated until all antigens were stained. We designed two mIF staining panels including a co-localization panel and an immune panel. The co-localization panel included ACKR1/DARC (Clone EPR5205, ab137044, Abcam, Shanghai, China; 1:500), CCL21 (ab9852, Abcam, Shanghai, China; 1:400), CD31 (Clone EP78, Zhongshan Golden Bridge Biotechnologies, Beijing, China), and SMA (Clone 1A4, MXB Biotechnologies, Fuzhou, China); The immune panel included ACKR1/DARC (Clone EPR5205, ab137044, Abcam, Shanghai, China; 1:500), CD8 (Clone SP16, MXB Biotechnologies, Fuzhou, China), CD20 (Clone L26, MXB Biotechnologies, Fuzhou, China), and CD56 (Clone MX039, MXB Biotechnologies, Fuzhou, China). The corresponding fluorescent dyes were TSA520 (ABT00021), TSA570 (ABT00025), TSA620 (ABT00027), TSA650 (ABT000029), and TSA690 (ABT00031). Lastly, fluorescent mounting medium with DAPI (4,6-diamidino-2-phenylindole) was used to stain nuclei. Control slides were run for reproducibility alongside each experimental slide-staining run. The stained tissue sections were scanned using the KFBIO digital pathology scanner (fluorescence; KF-FL-005) to obtain fluorescence images of the whole slides. The scanned images were viewed and captured with the KF-Viewer software.

Trans-endothelial migration experiment

Human umbilical vein endothelial cells (HUVECs) and peripheral blood mononuclear cells (PBMCs) were obtained, respectively, from ATCC and Milestone Biological Science & Technology Co., Ltd. After transfected with pHS-AVC-5297 vector (pLV-hef1a-mNeongreen-P2A-Puro-WPRE-CMV-flag; Hesheng Biotechnology) encoding human ACKR1 or the empty vector, HUVECs were seeded on transwell inserts and stimulated by 2 ng/ml IL-1β after they grew confluently. Then, DMEM media containing human CCL2 or CCL5 (10 ng/ml) was added to the lower chamber, and 1 × 106 PBMCs were added to the inserts in the absence of chemokines. After 20 h co-culture, migrated PBMCs were stained with trypan blue and counted by cell counter.

Statistics and reproducibility

Data were analyzed and processed using GraphPad Prism (v8.0) and R (v4.1.2). Error bars show mean ± SD unless stated otherwise. Boxplots show the mean with interquartile range (error bars: 95% confidence intervals). All cell-based experiments were independently repeated three times with consistent results. For IHC assays, due to the high consistency of staining patterns observed in preliminary studies, each marker was stained once per patient, and representative images are shown. No statistical method was used to predetermine sample size. No data were excluded from the analyzes.

For significance determination, the two-sided Wilcoxon rank-sum test (continuous data) and Pearson’s Chi-square test (qualitative data) were used to calculate significance unless stated otherwise. GO and KEGG pathway enrichment analysis was performed using the one-sided Fisher’s Exact test with multiple comparisons correction using false discovery rate (FDR). DEG was determined using the two-sided Wilcoxon rank-sum test with Bonferroni correction for multiple comparisons. P values: NS. not significant, p ≥ 0.05; * p < 0.05; ** p < 0.01; *** p < 0.001.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.