Introduction

Mammalian pre-implantation development is distinguished by two keystone events: zygotic genome activation (ZGA) and the first lineage specification away from the totipotent zygote1. In mice, ZGA consists of minor and major waves, with the minor wave taking place in the zygote and early two-cell embryo, resulting in transcription of about 100 genes1. The major wave primarily occurs in the late two-cell embryo, leading to transcription of thousands of genes2. Owing to the markedly different organization of the sperm and oocyte genomes, global epigenetic events reprogram the highly differentiated gametes to a totipotent embryo that undergoes substantial transformations in the course of pre-implantation development. Past investigations have demonstrated that pre-implantation development is associated with profound changes in DNA methylation, histone modifications, chromatin accessibility, and chromatin 3D structure, which contrast sharply with those observed in somatic cells3,4,5.

Nucleosomes, the fundamental units of chromatin in eukaryotes, consist of an octamer of histone proteins around which genomic DNA is wrapped6,14. These structures are ubiquitously distributed across DNA but are conspicuously absent at regulatory regions such as promoters, enhancers, and transcription factor binding sites in somatic cells. Nucleosome occupancy is inversely correlated with gene expression changes, particularly around transcription start sites (TSS)7. Transcription factors can displace nucleosomes during binding, and nucleosome occupancy can inhibit transcription factor interactions with promoters. Proximal promoters, especially the + 1 nucleosome, facilitate the pausing of RNA polymerase II8. The organization of nucleosomes, by regulating the accessibility of DNA to transcriptional machinery, plays a critical role in modulating gene activity.

Studies in in vitro fertilization have shown that the patterns of nucleosome positioning in the male and female pronuclei during ZGA are distinctly different from those observed in somatic cells8. Additionally, compared to somatic cells, embryonic stem cells (ESCs) exhibit an open chromatin structure and heightened chromatin accessibility, supporting their pluripotency and differentiation potential9. Observations by Lilien N. Vong et al. have suggested that fragile nucleosomes occupy previously designated nucleosome-depleted regions near the transcriptional start and end points of nearly all mouse genes. Nucleosome targeting DNA binding proteins subgroups, such as the CTCF and pluripotency factors, are examples of this interaction7. Teif et al. found that chromatin remodeling and DNA-sensitive regions commonly manifest as nucleosome deletions in ESCs, although they may become nucleosome dense during differentiation, and regions of nucleosome deletions near transcription initiation and termination sites are more extensive and pronounced for active genes10. These findings indicate that the patterns of nucleosome positioning during ZGA and within ESCs display significant differences from those in somatic cells, highlighting unique characteristics. However, the re-organizations of nucleosome positioning during pre-implantation development of naturally fertilized mouse embryos and their potential regulatory effects on gene expression remain largely unknown.

Genome-wide chromatin accessibility assays include DNase-seq, MNase-seq and ATAC-seq11,12,13. For chromatin accessibility exploration, DNase-seq and ATAC-seq have been reported to have good performance, which can obtain signals enriched in nucleosomal depleted regions. While the MNase-seq method has lower background, previous profiling work using this method required millions of cells as input materials, which prevents its effective application on embryo samples composed of small numbers of cells. Recently, we have developed a single-cell sequencing method which could simultaneously measure the genome-wide nucleosome positioning and chromatin accessibility in single cells called scMNase-seq14,15. scDNase-seq has been employed to detect genome-wide DNase I hypersensitive sites (DHS) in single cells, revealing a progressive establishment of DHS during embryonic development, with a sharp increase observed in 8-cell embryos16, suggesting the efficacy of single-cell strategies in studying embryogenesis. Thus, in this study, we used scMNase-seq to profile genome-wide nucleosome positioning from hundreds of mouse embryos to elucidate the dynamic organization process of nucleosomes during pre-implantation development. We investigated the re-modeling dynamics of nucleosomes from the oocyte to the morula stage, and generated maps of TSS, TES, CTCF, H3K4me3, H3K27me3, and DHS sites during each stage of embryonic development. We found that both nucleosome size and positioning at TSS were drastically switched at the 2-cell stage after natural fertilization, while nucleosome positioning and phasing at CTCF binding sites were gradually re-built until morula stage. At the same time, we observed significant heterogeneity in nucleosome enrichment at H3K4me3 modified genes during pre-implantation development, which suggests that these nucleosome distribution patterns may have potential regulatory functions. Overall, our scMNase-seq based nucleosome dataset of mouse embryos firstly provide a comprehensive perspective of nucleosome re-modeling during pre-implantation.

Materials and methods

Female mouse superovulation and pairing

30 adult female KM mice (8–10 weeks old) and 15 adult male KM mice (8–10 weeks old) were purchased from the Animal Experiment Management Center of the Army Medical University. Prior to the experiment, mice were acclimated to standard animal housing conditions for 1 week. Female mice were first injected with 10IU of PMSG, followed by an injection of HCG 48 h later. They were then paired with male mice 1:1 for 16–18 h to collect zygote (n = 8). Two-cell embryos were collected 24 h later (n = 5), followed by collection of 4-cell embryos at 42 h (n = 5), 8-cell embryos at 54 h (n = 5), and morula at 66 h (n = 5). Throughout the experiment, all mice were housed individually (without other animals outside of pairing), provided with the same food and water ad libitum, and maintained in a 12-h light–dark cycle with a temperature of 18 °C under air-conditioning. The procedures for care and use of animals were approved by the Ethics Committee of the Laboratory Animal Welfare and Ethics Committee of the Army Medical University (AMUWEC20224182) and all applicable institutional and governmental regulations concerning the ethical use of animals were followed. Ethical considerations were paramount in the design and execution of this study. Measures were taken to reduce pain, suffering, and distress to the lowest possible levels.

All experiments were performed in accordance with relevant guidelines and regulations. The study adheres to the ARRIVE guidelines for reporting animal experiments (https://arriveguidelines.org).

Preimplantation embryo collection

Female mouse fallopian tubes were collected in PBS prewarmed at 37 °C. The procedure was performed on a heating plate at 37 °C. The fallopian tubes of the mice were flushed with embryo culture medium using ophthalmic forceps and needles to flush out the embryos. Single embryos with uniform size, transparency, and intact zona pellucida were selected and placed into individual 200 μl EP tubes and stored at − 80 °C. For oocytes, hyaluronidase treatment was used to remove the surrounding cumulus cells.

scMNase library construction

The frozen single-cell embryos were thawed and subjected to titration with different concentrations of MNase enzyme to determine the optimal concentration for embryo digestion at each stage. Over-digested samples, characterized by excessive DNA fragmentation, were identified to ensure that the digestion conditions were within the desired range for accurate analysis (sFig. 1A). The samples that were underdigested by MNase were significantly larger, and the small number of fragments indicated insufficient enzyme activity (sFig. 1B). After MNase digestion was completed, the cells were terminated by the addition of EGTA, incubated overnight at 65° C with 10% SDS and proteinase K, and digested embryos were then subjected to DNA purification using spin columns. The DNA ends were repaired, an A base was added to 3’ ends, Y adaptors were ligated, and the ligated DNA was amplified using multiplexed QPCR primers. The PCR products were visualized and separated using Invitrogen 2% E-Gel to obtain DNA fragments between 150 and 320 bp as the library.

Data alignment

The paired-end reads from scMNase-seq were mapped to the reference genome (mm9) using Bowtie2. Parameters described in the study by Lai et al.14 were applied to further remove redundant fragments. Fragments within the range of 140–180 bp were selected as canonical nucleosomes, while fragments shorter than 120 bp were classified as subnucleosome-sized particles for further analysis.

Principal component analysis

Genome-wide nucleosome positioning patterns were analyzed using principal component analysis (PCA). We extracted mean nucleosome occupancy values in 10-kb bins across chromosomes from MNase-seq datasets covering six developmental stages (Oocyte, Zygote, 2-cell, 4-cell, 8-cell, and Morula). Data were standardized (z-score normalization) prior to PCA. The first two principal components (PC1 and PC2) were computed using sklearn’s PCA implementation (v1.2.2) and visualized in a two-dimensional scatterplot with points colored by developmental stage.

TSS nucleosome positioning analysis

Stage-specific gene sets of embryos from previous studies17 were utilized, with mESCs substituted for morula (Table 1). The script described in the study by Lai et al. was used for the analysis. The resulting nucleosome positioning probabilities were then visualized using R, with a 2000 bp extension on both sides of the TSS sites. When FPKM > 10 was taken as the number of specific genes in each stage, they were: oocyte, 4143 genes; zygote, 3896 genes; 2-cells, 4000 genes; 4-cells, 4059 genes; 8-cells, 3898 genes; morula, 3898 genes.

Table 1 Referance data.

Analysis of gene expression trends

Temporal trend analysis was performed using the MFuzz R package (v2.58.0)25 on differentially expressed genes across developmental stages in the GSE66582 dataset, ordered sequentially from oocyte to mESC. Genes exhibiting dynamic expression patterns during embryogenesis were clustered into 8 distinct temporal groups based on their expression trajectories. The cluster containing Ctcf was systematically identified, and genes within this cluster annotated as epigenetic regulators (e.g., chromatin remodelers, histone-modifying enzymes, and DNA methylation-associated factors) were extracted. Stage-specific expression profiles of these genes were visualized via a heatmap.

V-plots generated

V-plots are generated using Plot2DO26, with specific parameters set to “–file = input.bed --sites = peaks_center.bed --align = center --upstream = 2000 &1000 --downstream = 2000 & 1000”.

Heatmap of nucleosome positioning

computeMatrix in Deeptools was used to calculate the upstream and downstream nucleosome arrangement of specific sites, and plotHeatmap was used to visualize the arrangement to obtain the nucleosome arrangement heat map27.

CTCF binding site nucleosome positioning analysis

CTCF ChIP-seq data (GSM918748)20 generated using the ES-Bruce4 (embryonic cell line), which contains a total of 33,880 CTCF binding site peaks, sourced primarily from the GEO database. Specifically, these peaks are identified within the “GSM918748_mm9_wgEncodeLicrTfbsEsb4CtcfME0C57bl6StdPk.broadPeak.gz” BROADPEAK file within the dataset. In addition to this, we used a similar strategy to obtain data sets of CTCF binding site detection in other ESC studies, mainly including: GSM691014(44,424 peaks)9,21, ENSCR000CCB.CTCF.ES-Bruce4(43,085 peaks)20, GSM2533853(121406 peaks)22, GSM3771259(52,089 peaks)23. Data on unique CTCF binding sites in preimplantation embryos at each stage were obtained from the dataset PRJCA01812424. Following the provided data processing protocols from the original study which included alignment with Bowtie2, filtering, and peak calling with MACS2, selecting peaks with a fold enrichment greater than 5 and -log10(qvalue) greater than 5 as CTCF binding sites for pre-implantation embryonic stages. We obtained CTCF binding sites for MII oocyte (1138), zygote (17,782), 2-cell (53,471), 4-cell (66,919), 8-cell (65,791), and morula (76,962). Extract peaks at the intersection of CTCF motifs for analyzing the nucleosome positioning within 1000 bp upstream and downstream of these points. Bedtools jaccard was used to calculate correlations between CTCF binding sites in ESCs datasets.

DHS Site nucleosome positioning analysis

Stage-specific DHS gene lists from previously published embryonic DNase-seq studies18 were obtained (Table 1). Analysis of the starting sites of these genes allowed for an examination of the nucleosome positioning at the DHS gene starting sites at each stage of mouse preimplantation embryos (oocyte, 866 sites; zygote, 1065 sites; 2-cells, 1006 sites; 4-cells, 2957 sites; 8-cells, 12,323 sites; morula, 19,476 sites).

H3K4me3 and H3K27me3 nucleosome positioning analysis on gene bodies

The gene lists with of promoters enriched for H3K4me3 and H3K27me3 modifications in embryos were obtained from Liu et al. study19 (Table 1). The nucleosome positioning patterns on these genes were analyzed, and potential associations between nucleosome positioning patterns and H3K4me3 modifications were considered. Additionally, associations between nucleosome positioning and gene expression were evaluated using RNA-seq data.

GO enrichment analysis

Intersection gene sets were obtained by analyzing the Venn diagram of genes with differential FPKM values and those marked by H3K4me3. GO enrichment analysis was performed using the clusterProfiler package (v4.6.2) in R28 to identify biological functions significantly enriched in the target gene sets. The workflow included the following steps: Input gene sets were defined, with the entire genome of Mus musculus serving as the background gene set (annotated using the org.Mm.eg.db database). The enrichGO function was employed to conduct a hypergeometric test, filtering significantly enriched GO terms based on predefined p-value and q-value thresholds. The top 3 enriched terms associated with embryonic development-related signaling pathways were selected for visualization (details provided in Supplementary Table 2).

Nucleosome sparsity calculationand statistical Evaluation

The sparsity index was operationally defined as the integrated tag density within a symmetrically centered 30-bp window (± 15 bp relative to TSS coordinates). This metric quantifies nucleosome depletion intensity at core regulatory regions. Expression-level stratification was performed based on FPKM thresholds: low (0.1–1), medium (1–10), and high (> 10). Monotonic trends in nucleosome sparsity across expression tiers were assessed using the Jonckheere-Terpstra test (R clinfun package v2.0.1), with Benjamini–Hochberg correction for multi-testing (FDR < 0.05).

Results

scMNase-seq library construction of mouse pre-implantation embryos

Pre-implantation embryos were processed according to the operation protocol of scMNase-seq (Fig. 1A). Samples in which the majority of fragments are concentrated around 140-180 bp are considered as valid samples. Finally, we obtained scMnase-seq data from samples of embryos at each stage, containing 6 oocytes (1,607,343 unique reads), 5 zygotes (2,939,202 unique reads), 20 2-cells (8,167,859 unique reads), 13 4-cells (22,232,434 unique reads), 22 8-cells(18,663,373 unique reads), 31 morulae (38,702,661 unique reads) (sFig. 2). Nucleosomes primarily occupy DNA sites and make them inaccessible, whereas subnucleosome-sized particles represent chromatin accessibility14. We identified subnucleosome-sized particles (< 120 bp, blue lines to the left) and nucleosomes (140–180 bp, between red lines) according to the fragment occupation (Fig. 1B). Interestingly, nucleosome size of zygote showed a majority of less than 50 bp length while oocyte and other embryos showed normal nucleosome size length. This is consistent with expectations that the DNA wrapped around protamine in sperm is small (about 12 bp for protamine 1 and about 15 bp for protamine 2). Additionally, research on mouse sperm has revealed that the protamine-DNA and the displaced transitional protein-histone complexes constitute a relatively stable transitional state, producing small subnucleosomal structures post-MNase digestion29. As such, we suspected that zygote contained both sperm and oocyte DNA, sperm’s DNA was wrapped by protamine and its size distribution was intrinsically different from histone-wrapped nucleosome observed at later stages.

Fig. 1
figure 1

Preimplantation embryos scMNase-seq library construction. (A) Schematic diagram of preimplantation embryo acquisition and scMNase-seq library construction. (B) Fragment-length density of scMNase-seq data in preimplantation embryos.

PCA revealed distinct nucleosome positioning heterogeneity. Intra-stage: Morula exhibited pronounced PC2 dispersion, indicating dynamic remodeling; Zygote showed constrained PC1 but broad PC2 spread, suggesting reprogramming variability. Inter-stage: PC1 values increased progressively (Oocyte, Zygote, 2-cell), aligning with embryonic genome activation. The 8-cell stage uniquely displayed a positive mean PC2 shift, potentially reflecting chromatin reorganization during compaction initiation (sFig. 3).

The classical nucleosome organization in mouse preimplantation embryos began at the 4-cell stage of DHS and TSS

The accurate determination of TSS is crucial for precisely locating the initiation position of gene transcription and investigating the intricate mechanisms of gene regulation14. To further clarify the occupation of nucleosomes and subnucleosome-sized particles around TSS of genes at different stages of preimplantation embryos, we first analyzed the nucleosome distribution near the TSS region across all genes. In oocytes, zygote, and 2-cell embryos, we observed that the occupation of nucleosomes did not exhibit the typical “nucleosome depletion” state, but rather showed a lower enrichment trend in the TSS region. Instead, the typical nucleosome occupation pattern was only observed starting from the 4-cell stage (Fig. 2). Furthermore, the typical enrichment of subnucleosome-sized particles in the TSS region was not prominent in oocytes and 2-cell embryos but was observed in zygote (Fig. 2). The lack of typical nucleosome and subnucleosome-sized particles occupation patterns in the TSS region of 2-cell embryos may be related to the dominance of the ZGA process at this stage. In subsequent developmental stages, there was a gradual increase in the occupation of subnucleosome-sized particles verge on the TSS region of somatic cells (Fig. 2). To clarify the impact of ZGA on nucleosome organization, we analyzed stage-specific genes expressed in early and late 2-cell embryos. No significant enrichment of subnucleosomal particles was observed at transcription start sites (TSSs) in either early or late 2-cell stages. However, nucleosome positioning at TSSs exhibited a nucleosome-depleted region (NDR)-like pattern in both stages, though less pronounced than in somatic cells. Furthermore, nucleosome depletion at TSSs became more evident with increasing gene expression levels (sFig. 4A–D). Notably, canonical nucleosome phasing was absent near TSSs of maternal genes, minor ZGA-associated genes, and major ZGA-associated genes across all preimplantation stages (sFig. 4E).

Fig. 2
figure 2

Profiles of nucleosome and subnucleosome-sized particles around the TSS of all genes in preimplantation embryos.

To address potential confounding effects of sequencing depth, we performed random downsampling (20%, 40%, 60%, and 80%) on 4-cell, 8-cell, and morula-stage embryos (which display well-phased nucleosome patterns). Nucleosome positioning remained stable across all subsampled datasets (sFig. 5), confirming that sequencing depth did not bias our observations.

In order to further elucidate the occupation of nucleosomes and subnucleosome-sized particles near the TSS of stage-specific expressed genes, we conducted correlation analysis and found that the results for stage-specific genes were similar to those for all genes. Indicating that this occupation of nucleosomes is not influenced by gene expression and represents changes in the overall genomic occupation state driven by embryonic stage (sFig. 6).

DHS are specific regions on DNA that are easily recognized by nucleases such as DNase I or Tn5 transposase. DHS typically corresponds to more relaxed regions on chromosomes, providing the physical conditions for TF and transcription complex to bind to them. DHSs are associated with cellular destiny and gene expression profiles30, we further investigated the occupation of nucleosomes and subnucleosome-sized particles around the DHS sites in pre-implantation embryos. We selected DHS sites from different developmental stages as reported in previous studies for analysis18. Our data showed that during early embryonic development, the occupation of nucleosomes around the 0–2 cell DHS sites was atypical (sFig. 7), which may be due to low detected DHSs number. As fewer DHS sites have been reported in the initial stages of pre-implantation development, we were also unable to identify a smooth occupation pattern of nucleosomes. However, we found that starting from the 4-cell stage until the 16-cell stage, the occupation of nucleosomes began to exhibit the typical DHS occupation pattern, with nucleosomes depletion at the midpoint of DHS but stacked in regular intervals upstream and downstream of the DHS sites (sFig. 7). In contrast, subnucleosome-sized particles started to accumulate at the midpoint of DHS from the fertilization stage, indicating a high abundance of subnucleosome-sized particles binding in the relaxed chromatin regions (sFig. 7). These results suggest that the typical nucleosome occupation at DHS sites starts from the post-fertilization stage until the four-cell stage, while the positioning of subnucleosome-sized particles at the midpoint of DHS occurs earlier than nucleosomes. This pattern of subnucleosome-sized particles alterations preceding nucleosomes indicated that pioneer TF(s) may regulate the correct occupation pattern and development of nucleosomes in preimplantation embryos.

The classical nucleosome organization in mouse preimplantation embryos of CTCF binding site emerges at the morula stage

CTCF is considered to be a chromatin structural protein and plays an important role in the complex with Cohesin in regulating long-distance chromatin interaction and maintaining chromatin “looping”. Based on previously reported results of CTCF in somatic cells, nucleosomes are absent at CTCF sites31.

To explore the nucleosome organization around CTCF sites during preimplantation embryonic development, we utilized CTCF ChIP-seq data from respective stages to detail CTCF binding sites(PRJCA018124), further mapping our scMNase-seq data around these central points. The findings indicate an enrichment of nucleosomes and subnucleosome-sized particles at CTCF midpoints prior to the two-cell stage. From the two-cell stage onwards, nucleosomes are absent at the midpoints, whereas regular distribution on both flanks of the midpoints starts to be evident only by the morula stage with subnucleosome-sized particles presenting an alternating pattern with nucleosomes (Fig. 3). To elucidate the stage-specific dynamics of CTCF expression, we performed trend analysis to categorize genes based on their expression patterns during embryonic development. Notably, Ctcf was classified into cluster 6 (sFig. 8A), a group predominantly comprising epigenetic modifier enzymes. Genes in this cluster exhibited a characteristic expression peak at the 2-cell stage followed by subsequent downregulation (sFig. 8B). This expression dynamics of Ctcf coincides with the initiation of nucleosome reorganization from the 2-cell stage, suggesting its potential involvement in establishing the primary nucleosome architecture during this critical developmental window. Nevertheless, the complete maturation of nucleosome positioning patterns observed in later stages likely requires synergistic regulation by additional factors. We observed that during the 4-cell stage, nucleosome positioning exhibited a distinct pattern, with pronounced nucleosome enrichment at CTCF binding sites (Fig. 3). Given prior reports of replication ___domain emergence starting abruptly at the 4-cell stage32,33, we investigated nucleosome organization at replication initiation regions across preimplantation stages. We found that nucleosomes began forming high-density regions at replication initiation sites as early as the 2-cell stage, which may act as physical barriers to restrict the spread of replication initiation signals, ensuring precise localization of replication origins (e.g., CTCF-bound loci). However, subnucleosomal particle enrichment at replication initiation regions remained low in the 2-cell stage, followed by a sharp increase at the 4-cell stage (sFig. 9). The heightened subnucleosomal peaks likely reflect local chromatin decompaction during replication fork progression, with their abrupt amplification suggesting a marked increase in replication-associated helicase activity at the 4-cell stage, driving the establishment of replication domains.

Fig. 3
figure 3

Profiles of nucleosome and subnucleosome-sized particles profiles around the stage specifc CTCF binding sites in preimplantation embryos. ChipSeq sequencing data (PRJCA018124) of preimplantation embryos at each stage were used to obtain the midpoint information of stage-specific CTCF binding sites. Plot2DO was used to generate V-plots to determine the distribution of scMNase-seq fragments at different developmental stages before implantation. The arrangement of nucleosomes and subnucleosome-sized particles is highlighted, with red lines representing nucleosomes and blue subnucleosome-sized particles illustrating their respective arrangement near the midpoint of the phase-specific CTCF binding site.

However, it has been shown that the 3D structure of embryonic chromatin does not begin to develop until the 8-cell stage530. To determine patterns of establishment of 3D chromatin structures similar to those observed in ESCs, we used multiple CTCF ChIP-seq datasets from various references20,23,24. These data sets were used to identify CTCF-binding sites in ESCs, and combined with our nucleosome localization data, the distribution of nucleosomes around ESCs CTCF-binding sites was evaluated in different data sets. We found that the typical NDR of CTCF binding sites was not visible and that the regular arrangement of nucleosomes around CTCF binding sites in some ESCs began as early as 4 cells, while others did not appear until morula (sFig. 1014). To further clarify the source of these differences, our analysis found that the peaks of CTCF ChIP-seq data from different ESCs were of low similarity (Fig. 4A). In order to avoid the influence of different ESC cell types and CTCF binding site detection methods on the final CTCF binding site, we selected 14,936 peaks shared in the five data sets as the research object (Fig. 4B). The organization of nucleosomes and subnucleosome-sized particles around these sites is mapped. Notably, at the 4-cell stage, we observed a high enrichment of nucleosomes at CTCF binding sites that persisted until the 8-cell stage, and canonical NDR did not appear until the morula stage (Fig. 4C). We speculate that CTCF is expressed in large amounts at the 2-cell stage, but because most sites are occupied by nucleosomes and subnucleosome-sized particles at this stage, its normal 3D structure cannot be effectively formed (sFig. 8A–B). In contrast, recruitment of CTCF precursor TFs to these regions at the 4-cell stage may require gradual establishment of regular nucleosome positioning and spacing until the morula stage. To address potential confounding effects of sequencing depth on nucleosome positioning near CTCF binding sites, downsampling analysis was performed on 4-cell, 8-cell, and morula embryo samples. It was confirmed that sequencing depth did not alter nucleosome positioning patterns at these loci (sFig. 15). These results suggest that the chromatin folding state differs from that of ES cells at the pre-16-cell stage and that nucleosomes at the CTCF locus may be associated with the transition from totipotency (pre-4-cell) to pluripotency (post-4-cell).

Fig. 4
figure 4

Profiles of nucleosome and subnucleosome-sized particles around the common CTCF binding sites of ESC in preimplantation embryos. Heat map of correlation of ChipSeq sequencing results of ESCs CTCF from GSM3771259, ENCSR000CCB, GSM2533853, GSM691014, GSM918748. Venn diagram of CTCF peaks in GSM3771259, ENCSR000CCB, GSM2533853, GSM691014 and GSM918748 to obtain shared peaks. Plot2DO was used to generate V-plots to clarify the distribution of scMNase-seq fragments around the center of these common CTCF binding sites at different stages of preimplantation embryo development. The arrangement of nucleosomes and subnucleosome-sized particles is highlighted, with red lines representing nucleosomes and blue subnucleosome-sized particles, illustrating their respective arrangement near the midpoint of the CTCF binding site.

Relationship between nucleosome occupancy patterns and bivalent gene activation status in early embryos

Histone modifications play a key role in regulating the expression of pivotal genes during mammalian embryonic development, and genome-wide profiling of histone H3 lysine 4 trimethylation (H3K4me3) and histone H3 lysine 27 trimethylation (H3K27me3) have been shown associated with gene activation and repression19. By analyzing the occupation of nucleosomes around specific histone modification sites, we found that nucleosome exhibited a distinct occupation pattern around around stage-specific genes that have been reported for H3K4me3 modification: nucleosomes around genes with H3K4me3 modifications showed a disordered state in oocytes, and only started to exhibit a depleted state near the TSS and TES at the 2-cell stage, and subsequently maintained a relatively stable depletion throughout the later developmental stages (Fig. 5A). In contrast, nucleosomes on genes with H3K27me3 modifications showed occupation at TSS and TES, besides, nucleosome occupation accumulated on the gene body from the 4-cell stage, covering the entire gene body (Fig. 5B). In addition to these observations, nucleosomes near the bivalent region genes showed a depletion of nucleosomes at the TSS and TES but exhibited a distinct pattern from H3K4me3 and H3K27me3 on the gene body, where nucleosomes gradually accumulated from the TSS to the TES from the 4-cell stage (Fig. 5C). Furthermore, near the non-marked region genes, the occupation of nucleosomes resembled the state of genes with H3K27me3 modifications (Fig. 5D). These results indicate that nucleosome re-organization of genes with H3K4me3 modifications precedes that of H3K27me3 sites and consistent with the stage of activation of the epigenetic enzymes for H3K4me3 (sFig. 8B). At the same time, the bivalent region may correspond to a state where genes are priming to be expressed that is regulated by nucleosome occupancy.

Fig. 5
figure 5

Nucleosome profiles around H3K4me3/H3K27me3 marked genes. (A) Nucleosome profiles around H3K4me3 marked genes in preimplantation embryos. (B) Nucleosome profiles around H3K27me3 marked genes in preimplantation embryos. (C) Nucleosome profiles around bivalent marked genes in preimplantation embryos. (D) Nucleosome profiles around nonmarked genes in preimplantation embryos.

Consistent with the nucleosomes occupation patterns observed above, subnucleosome-sized particles also exhibited different occupation patterns across the four types of H3K4me3/H3K27me3 marked genes (sFig. 16). Specifically, starting from the 2-cell stage, subnucleosome-sized particles are enriched in the TSS region of genes with H3K4me3 modifications, while the enrichment of subnucleosome-sized particles on H3K27me3 modified genes was not clearly observed at any stage (sFig. 16). In the bivalent region, subnucleosome-sized particles began to accumulate in the TSS region from the 4-cell stage, although the enrichment is lower compared to genes with H3K4me3 modifications in the TSS region. The occupation of subnucleosome-sized particles near the unmarked region is similar to that of genes with H3K27me3 modifications (sFig. 16). The observation of subnucleosome-sized particles occupation at 2-cell stage may be correlated with ZGA process. The organization of subnucleosome-sized particles in the TSS region of the bivalent region is likely related to the aforementioned balanced state, where the enrichment level of H3K4me3 and H3K27me3 is in flux, although both show nucleosome depletion at TSS of H3K4me3 modification genes.

Broad and narrow H3K4me3 domains correspond to the distribution density of nucleosomes in TSS and TES

H3K4me3 marks have been considered to be indicators of promoter activation in somatic cells, and studies have shown that genes with broad H3K4me3 domains exhibit higher levels of RNA expression before embryonic implantation, while genes with narrow H3K4me3 domains or lacking H3K4me3 signals tend to be silenced19. To investigate whether the expression of genes with narrow H3K4me3 domains is related to the localization of their nucleosomes, we further analyzed the nucleosome occupation patterns and H3K4me3 ___domain widths across marked genes. We found no clear associations between nucleosome occupancy and H3K4me3 ___domain width in oocytes (Fig. 6). However, on normal and wide H3K4me3 genes at the 2-cell stage, a depletion state of nucleosomes at the TSS and TES was observed, while genes with narrow H3K4me3 domains showed a disordered occupation of nucleosomes (Fig. 6). From the 4-cell stage onwards, the occupation of nucleosomes on H3K4me3 domains gradually became fixed, with nucleosomes being depleted at the TSS and TES for all genes with H3K4me3 domains (Fig. 6). However, compared to genes with broad and normal H3K4me3 domains, the occupation of nucleosomes upstream and downstream of the TSS was denser in genes with narrower H3K4me3 domains (Fig. 6). These results showed that broad and narrow H3K4me3 domains correspond to the distribution density of nucleosomes in TSS and TES and suggest that the density of nucleosome occupation at the TSS and TES may be correlated with gene expression levels.

Fig. 6
figure 6

Nucleosome profiles around genes with different length of the H3K4me3 domains. H3K4me3 ULI-NChIP–seq datasets of pre-implantation embryos (GSE73952) were employed to obtain the gene information of the control, normal, narrow, broad H3K4me3 peaks, The arrangement of nucleosome fragments within the 300 bp range upstream and downstream of these genes in the corresponding stage of pre-implantation embryos was shown.

H3K4me3 marks key genes in embryonic development, and the sparsity of TSS and TES nucleosome organization is positively correlated with their expression levels

Since our results above show heterogeneity of nucleosome occupation pattern at the TSS and TES of H3K4me3 modified genes, we hypothesized that the heterogeneity of nucleosome occupancy may be an indicator of heterogeneous transcriptional activity. In order to explore the relationship between H3K4me3-marked genes and their RNA expression level during embryonic development, we stratified genes according to their FPKM values from previous transcriptome sequencing studies17 (0 < FPKM < 0.1, silent genes; 0.1 < FPKM < 1, low expression genes; 1 < FPKM < 10, moderately expressed genes; 10 < FPKM, highly expressed genes) (sTable 1). Interestingly, we found that the expression level of H3K4me3-marked genes was significantly correlated with nucleosome sparsity at TSS (see “Methods”). However, for other genes without H3K4me3 marks, the nucleosome occupation of silenced genes showed a similar occupation pattern to H3K27me3-marked genes but became increasingly disordered with increasing expression levels (Fig. 7).

Fig. 7
figure 7

Nucleosome profiles around genes with different FPKM values and H3K4me3 modification. The H3K4me3 ULI-NChIP–seq datasets of pre-implantation embryos (GSE73952) were utilized to obtain the gene information of the H3K4me3 peaks, and the RNA-seq datasets of pre-implantation embryos (GSE66582) were employed to obtain the gene expression levels of pre-implantation embryos. The genes at each stage were classified according to the FPKM values in the RNA-seq data, and the intersection with the genes marked by H3K4me3 was taken to draw the Venn diagram. The arrangement of nucleosome fragments within the 300 bp range upstream and downstream of genes with or without H3K4me3 modification at different expression levels was analyzed respectively. Gene body was normalized to a length of 300 bp. Blue/red text as results of GO analyses.

As such, we speculated that the heterogeneity of nucleosome occupation with respect to gene expression may be related to biological function of each gene in embryonic development. Indeed, we observed that the H3K4me3 mark is present on genes related to embryonic development across each stage. Genes with lower expression levels predominantly encompass signaling pathway genes associated with late-stage development, whereas those with higher expression levels are primarily concentrated in pathways related to pre-implantation embryo vitality and RNA synthesis. Moreover, our analysis revealed a positive correlation between the expression levels of these marker genes and the sparse positioning of nucleosomes in the TSS region. Notably, this correlation was initially observed in oocytes. However, during the 2-cell stage, no clear pattern was evident. Starting from the 4-cell stage, this trend began to emerge and became statistically significant by the morula stage (p = 0.0244) (sFig. 17). Therefore, we hypothesize that H3K4me3 in pre-implantation embryos may primarily serve to mark key genes during embryonic development, and the degree of nucleosome sparsity determines their expression levels. This type of mechanism may enable the rapid gene expression upregulation observed during later stages of embryonic development.

Discussion

The formation of totipotent embryos from terminally differentiated oocyte and sperm is always accompanied by drastic epigenetic remodeling. The precise spatiotemporal gene regulation of epigenetics ensures the successful development of embryos, Nonetheless, due to long-standing technical constraints, the epigenomic changes and global chromatin remodeling that occur during early embryo development have remained obscure. Fortunately, recent advancements in single-cell and low-input sequencing technologies have begun to unveil specific aspects of epigenetic regulation during embryogenesis and embryo development, mainly including transcriptome, DNA methylation, various types of histone modifications, chromatin accessibility and the dynamic changes of three-dimensional chromatin structure, which has greatly enriched our understanding of early embryonic development. Existing research has reported the phenomenon that during the early stages of murine embryonic development, the organizational structure of nucleosomes un dergoes significant remodeling. Globally, the localization of nucleosomes within the zygote appears indistinct; however, as development progresses, their positioning becomes more precise. Furthermore, a reduction in chromatin assembly within embryonic stem cells can partially reverse the organizational structure of nucleosomes, rendering them similar to that observed in zygotes34. This suggests a potential linkage between the chromatin assembly pathways and the indistinct nucleosome positioning in zygotes. However, the localization and dynamics of nucleosomes during early embryonic development are still not fully understood. Therefore, it is of great significance to explore the dynamic changes and localization of nucleosomes in naturally fertilized preimplantation embryos, which can provide us with a deeper understanding of the mechanism of embryonic chromatin remodeling and help to resolve the clinical defects during embryonic arrest. In this study, we employed scMNase-seq to concurrently assess genome-wide nucleosome positioning and chromatin accessibility in single cells14,15, aiming to unravel the dynamic organization of nucleosomes during naturally fertilized mice pre-implantation embryos development.

Nucleosome-depleted regions (NDRs), hallmark features of regulatory elements in eukaryotic genomes35,36, exhibit dynamic establishment coupled with chromatin relaxation at DNase hypersensitive sites (DHSs). Our analysis reveals that in naturally fertilized embryos, DHS centers display nucleosome depletion starting from the 4-cell stage, accompanied by progressive nucleosome condensation within limited flanking regions (sFig. 6). Notably, the preferential positioning of subnucleosome-sized particles at DHS midpoints suggests their potential role in fine-tuning transcription factor binding through partial DNA occlusion. This contrasts starkly with the spatiotemporal NDR establishment patterns observed in ICSI-derived embryos—paternal genome NDRs emerge as early as 1.5 hpf, while maternal NDRs are delayed until 3 hpf8. Furthermore, SCNT embryos exhibit prolonged NDR reconstruction around transcription start sites (TSS), requiring 6 h post-activation compared to fertilized counterparts6. These temporal disparities highlight that natural fertilization likely coordinates parental genome reprogramming through unique chromatin reorganization mechanisms, potentially circumventing developmental perturbations associated with artificial fertilization techniques37.While PCA successfully captured hallmarks of intra—and inter—stage nucleosome dynamics (e.g., Morula’s broad PC2 dispersion reflecting dynamic chromatin remodeling, and the stepwise PC1 increase paralleling embryonic genome activation), the lack of discrete clustering between developmental stages in the PCA plot raises critical interpretive considerations (sFig. 2). This pattern suggests that intra—stage cellular heterogeneity may outweigh inter—stage differences in early embryonic chromatin states. Biologically, this is consistent with the highly dynamic and cell—autonomous nature of chromatin reprogramming during preimplantation development: individual blastomeres or embryos may undergo epigenetic remodeling at distinct kinetics, generating substantial variability within a nominal stage that overshadows canonical stage—specific signatures.

After fertilization, the relaxation of chromatin is a prominent feature of mammalian zygotes38. This relaxation is also evident from Hi-C results, which show significant weakening of topologically associating domains (TADs) and A/B compartments in zygotes and 2-cell stage embryos36. Extensive decompaction of chromatin after fertilization has also been observed in other sexually reproducing species39,40,41,42. What factors contribute to the lack of organization in the early embryo and which factors subsequently lead to its reconstitution remain unclear5.

CTCF-mediated chromatin reorganization exhibits parent-of-origin asynchrony, with paternal genome NDRs emerging at CTCF motifs by 1.5 hpf, while maternal counterparts delay until 3 hpf8. Our CTCF ChIP-seq analysis reveals stage-specific nucleosome patterning: pre-2-cell embryos show co-occupancy of nucleosomes and subnucleosome-sized particles at these sites, whereas a progressive midzone depletion emerges post-2-cell stage (Fig. 3). This transition coincides with Ctcf transcriptional upregulation (sFig. 7) and precedes full TAD establishment at the 8-cell stage31, suggesting a stepwise stabilization mechanism for chromatin architecture. The protracted chromatin remodeling timeline—from CTCF-guided nucleosome displacement to compartmentalization—highlights hierarchical control of 3D genome reorganization during embryogenesis.

Furthermore, our integrative analysis of CTCF binding sites across embryonic stages reveals stage-specific chromatin organization: classical nucleosome positioning at these loci only emerges by the morula stage, with persistent nucleosome occupation observed in earlier embryos (Fig. 3). Notably, 4-cell embryos exhibit transient nucleosome fragment enrichment at ESC CTCF sites, consistent with their intrinsic histone octamer affinity10. This phased chromatin restructuring aligns with hierarchical genome organization—CTCF-guided nucleosome positioning initiates at 4-cell stage, followed by replication ___domain establishment (post-4-cell) and culminating in TAD formation (8-cell)32,33. This is further supported by coordinated nucleosome/subnucleosome dynamics at replication origins facilitating replisome assembly (sFig. 8). This gradual definition of replication domains may reflect the progressive loading of replication machinery or the maturation of epigenetic marks, which we found to be reorganized from the 4-cell stage. Notably, while CTCF binding motifs are recognized prerequisites for topologically associating ___domain (TAD) boundary formation, their functional implementation as architectural elements strictly depends on active cohesin-mediated loop extrusion processes43,44.

H3K4me3 is a mark that recruits chromatin remodeling proteins and transcriptional machinery, allowing for the activation of promoters45. In mature oocytes, non-canonical forms of H3K4me3 have been observed as broad domains at promoters and DHS46,47. The non-canonical H3K4me3 mark is inherited in the zygote and cleared at the 2-cell stage, a process essential for ZGA and full developmental potential47. Canonical sharp H3K4me3 domains at promoters are rapidly established after fertilization48. Further studies suggest that the width of H3K4me3 domains should be precisely controlled, as expanding H3K4me3 domains by knocking down genes encoding H3K4me3 demethylases, particularly Kdm5b, disrupts lineage differentiation during the blastocyst stage47. Wider H3K4me3 domains at promoter regions may indicate higher transcriptional levels post-fertilization19, and the transition from non-canonical to canonical H3K4me3 during the 2-cell stage requires co-activator transcription47. Our results indicate that genes with H3K4me3 modifications show depletion of nucleosomes at both TSS and TES from the 2-cell stage onwards, with a greater loss at TSS during the morula stage. Furthermore, at TSS of genes with H3K4me3 modifications, there is a significant enrichment of subnucleosome-sized particles from the 4-cell stage onwards, possibly indicating widespread accessibility at these sites. Additionally, we found that nucleosome occupation at H3K4me3-marked sites do not always correlate with gene expression levels, as some genes with H3K4me3 modifications are in a silent state and are enriched in pathways related to late embryo developmental processes. Studies also reveal that genes with broad H3K4me3 domains exhibit higher RNA expression levels, while genes with narrow H3K4me3 domains or lacking H3K4me3 signals are mostly silent19. In our findings, we observed that nucleosome positioning is depleted at the TSS region of genes with narrow H3K4me3 domains but shows a closer occupation upstream and downstream, suggesting that H3K4me3 may mark a large number of genes related to embryo development post-ZGA, and the accessibility of these genes may be achieved through sparse or tight nucleosome occupation.

It is noteworthy that the MII oocytes analyzed in this study exist in a transcriptionally silent state and are arrested at metaphase II of meiosis. This transcriptional quiescence represents a defining feature of mature oocytes, as post-fertilization events prior to ZGA are predominantly governed by maternal mRNA translation rather than de novo transcription49. Consequently, our interpretation of nucleosome positioning patterns at transcription start sites (TSS) and gene body regions must be contextualized within this transcriptional dormancy. The observed nucleosome depletion at TSS regions of highly expressed genes marked by H3K4me3 in early embryos (Fig. 7) likely reflects pre-established epigenetic signatures inherited from oocytes, which may prime these loci for rapid transcriptional activation during ZGA. Emerging evidence demonstrates that maternal mRNAs and chromatin modifications in MII oocytes—including non-canonical H3K4me3 domains—undergo dynamic remodeling following fertilization to orchestrate embryonic chromatin reprogramming49.These findings underscore the critical importance of cytoplasmic and epigenetic maturation in oocytes for shaping embryonic chromatin architecture. Our nucleosome profiling data may therefore capture both the enduring legacy of oocyte-derived epigenetic programming and the initiation of zygotic chromatin reorganization.

Research has shown that in comparison to the rapid establishment of H3K4me3, the overall gain of H3K27me3 temporally lags behind. The number of promoters marked by H3K27me3 is significantly reduced in the 2-cell stage compared to MII oocytes. Particularly, a substantial loss of H3K27me3 at developmental genes was observed at the time of fertilization19. Furthermore, the co-existence of H3K4me3 and H3K27me3 is often mutually exclusive in the early cleavage stages, but this non-overlapping occupation pattern significantly diminishes at the blastocyst stage during the occurrence of the first lineage differentiation19. Our findings reveal a high enrichment of nucleosome positioning on genes within H3K27me3 domains from the 4-cell stage onward, suggesting the difficulty in accessing the genes. Additionally, we observed a distinctive nucleosome positioning in the bivalent region, with the absence of nucleosomes at the gene TSS and TES positions. Unlike the conventional occupation of H3K4me3, these genes exhibit a gradually accumulating nucleosome positioning along the gene body, with a sparse occupation in the front half and a dense occupation extending downstream (Fig. 6). Studies show that nearly 72% of ICM bivalent genes maintain their bivalency in ES cells. Approximately 22% of genes lose H3K27me3 and return to an active state. Less than 1% of genes lose H3K4me3 and return to a repressed state19. Bivalency is thought to be a “localizing” mechanism that suspends genes in inactive or low-expressed states while maintaining the potential for rapid activation in response to developmental cues19,47,48. The gradual accumulation of nucleosomes in the bivalent regions may contribute to the dynamic changes in gene expression, and the nucleosome in the latter half could rapidly respond to complete gene expression upon binding with responsive transcription factors.

In summary, combining previous research findings with our data, we present a genome-wide map of nucleosome positioning in pre-implantation mouse embryos under natural fertilization conditions. In our current research, it was found that nucleosomes exhibit a classical occupation pattern at the TSS region starting from the 4-cell stage. It is not until the morula stage that nucleosomes display an occupation pattern governed by CTCF, indicating that nucleosome occupancy of CTCF may contribute to chromatin remodeling barriers. Furthermore, during pre-implantation embryo development, the H3K4me3 modification likely serves as a mark for development-related genes, and the sparse occupation of nucleosomes at the TSS and its surrounding regions may determine the expression levels of these genes. Such regulatory mechanisms may facilitate the coordinated opening and expression of a large number of genes specific to embryonic development stages, allowing for a prompt response to the dynamic pre-implantation embryo developmental process.

Limitations of the study

There are several caveats in our study. Firstly, because mouse preimplantation embryos were harvested by in vivo fertilization, it was difficult to observe the changes in nucleosome occupation during minor and major ZGA. Secondly, due to the limitation of the reference data set, there are many more representative nucleosomes near special sites that are not included. Third, although we used scMNase-seq counts for library construction, we could not obtain as much nucleosome fragmentation data as possible. It is necessary to improve the scMNase library construction to obtain more fragments to mine further nucleosome occupation information.