Introduction

In mammals, genomic evolution entailed multiple innovations in gene regulation, including genomic imprinting, an epigenetic mechanism that results in parent-of-origin specific monoallelic expression of certain genes. Genomic imprinting is involved in the normal mammalian development, fetal growth, metabolism, and adult behavior and is also implicated in human diseases1,2,3. During mammalian evolution, the acquisition of the placenta is closely related to genomic imprinting4, making it a key target for studying the quantity and function of imprinted genes. Most of the imprinted genes are controlled by parent-of-origin specific gametic differentially methylated regions (gDMRs), also called canonical imprinting, while a subset of placenta-specific imprinted genes is controlled by maternally derived histone modifications (termed non-canonical imprinting)5,6. Through a comparative analysis of rats and mice, a recent study found that the expression of canonical imprinted genes was highly conserved in embryonic tissues, while non-canonical imprinting showed evolutionary divergence in extraembryonic tissues of murids7. Another recent study found that the non-canonical imprinting was absent in the human placenta8. Interestingly, the human placenta-specific imprinting associated with gDMRs has been reported to be poorly conserved in mice9,10,11. A previous study reported that the primate-specific imprinted miRNA cluster on chromosome 19 (C19MC) was a primary driver of human embryonic stem cell differentiation into trophoblast lineages12. These studies underscored the ongoing evolution of imprinted genes in mammalian placentas and their essential function in placentation.

Hybrid mapping is a powerful approach for identifying imprinted genes, as polymorphisms between genetically distinct individuals can serve as genetic markers to differentiate paternal and maternal alleles. To date, many studies have leveraged allele specific expression in hybrids to identify imprinted genes in the placenta and other tissues at the global tissue level across different mammalian species, including mice13,14, rats7, humans15, and pigs16,17,18. However, the full extent of the imprinted genes is still lacking. Additionally, whether imprinted genes have developed alternative or shared imprinting mechanisms between mammalian species remain controversial. Emerging evidence suggests that the parent-of-origin effects on gene expression are regulated in a complex cell-type- and developmental-stage-dependent manner. Using nascent RNA in situ hybridization in the mouse brain, a previous study uncovered that individual imprinted genes were monoallelic in some cells but biallelic in others19. Another study reported the cell-type- and developmental-stage-dependent allelic effects for the mouse imprinted gene Ube3a by using the reporter fusion proteins20. These pioneering efforts provided some evidence for individual genes but lacked the high-throughput capacity for a systematic analysis for imprinted genes. Recent advances in single-nucleus and single-cell RNA sequencing (snRNA-­seq and scRNA­-seq) have enabled the capture of cell types and elucidation of cell-type transcriptomic dynamics across many tissues in various mammals, such as mice21, humans22, cows23,24, and pigs25,26. The integration of snRNA­-seq or scRNA­-seq with hybrid mapping offers an efficient approach for the high-throughput screening of imprinted genes at single cell level.

Here, we use the reciprocal crosses between two distinct pig breeds and develop a framework for integrating snRNA-seq and whole-genome sequencing to decode the landscape of placental genomic imprinting. We construct a high-resolution porcine placental cell atlas (PPCA) to identify imprinted genes in different developmental phases at cellular level. Through cross-species analysis, we characterize a general feature of imprinting regulation, wherein the integration and modification of signaling pathways into a cell-type-specific autocrine network plays a critical role in parent-of-origin effects on gene expression.

Results

Single-cell atlas of the pig placenta

To delineate the cell-type compositions of the porcine placenta and determine cell-type-specific allelic expression at cellular resolution, we firstly performed snRNA-­seq on the porcine placental tissues of F1 progeny from reciprocal crosses between two breeds with distinct genetic background, Duroc and Lulai (a local Chinese pig breed) (Fig. 1a). Placental tissue samples were collected at embryonic day 40 (E40) and day 70 (E70), representing stages of rapid and complete development in terms of chorion surface area and weight. We totally obtained over 120,000 nuclei profiles at the E40 and E70 stages using the 10x Genomics Chromium platform. After quality control, 53,610 and 52,407 nuclei profiles from the E40 and E70 stages were retained for further analyses, respectively (Fig. 1b, c). Through the cell clustering analysis of nuclei profiles from two stages separately, we identified six major cell types shared between the E40 and E70 stages: trophoblast cells (TCs), blood vascular endothelial cells (BVECs), vascular smooth muscle cells (VSMCs), fibroblast (Fib), Hofbauer cells (HBCs), and decidual cells (Fig. 1b, c).

Fig. 1: Construction of single-cell landscapes of the pig placenta.
figure 1

a Schematic diagram of the experimental design and analysis framework. b UMAP projection of cells from snRNA-seq of the E40 porcine placental tissues (n = 53,610 nuclei) and the cell type identities of cell clusters. c UMAP projection of cells from snRNA-seq of the E70 porcine placental tissues (n = 52,407 nuclei) and the cell type identities of cell clusters. d, e Dot plots showing the expression of representative marker genes for the five major cell types in the snRNA-seq of the E40 (d) and E70 (e) porcine placental tissues. snRNA-seq single-nucleus RNA sequencing, BVECs blood vascular endothelial cells, Fib fibroblast, HBCs Hofbauer cells, TCs trophoblast cells, VSMCs vascular smooth muscle cells, E40 embryonic day 40, E70 embryonic day 70.

The TCs were identified based on the high expression of canonical marker genes such as GATA2, PPARG, and AKR1B1 (Fig. 1d, e)27,28,29. We discovered the BVECs (highly expressing PECAM1, EGFL7, and KDR)25,30 and VSMCs (highly expressing ACTA2, MYH11, and NOTCH3)23,31 in the porcine placenta (Fig. 1d, e). We identified the placental fibroblast which specifically highly expressed marker genes PDGFRA, DCN, and COL3A1 (Fig. 1d, e)30,32. The HBCs, the placental resident macrophage, showed high expression of CD163 and CSF1R33,34. Although the placental tissues were meticulously dissected from the junction between the trophectoderm and maternal decidua, we investigated the expression of the female-specific XIST gene in both male and female fetal placental tissues to further distinguish fetal and maternal cellular origin (see methods). The maternal decidual cells (Cluster 5, 8, 13, 15, and 18 at the E40 stage, and Cluster 8, 16, and 17 at the E70 stage) were identified by the absence of XIST expression in male fetal placental tissues (Supplementary Fig. 1). The subsequent cell clustering analysis excluding decidual cells reaffirmed the presence of these five major transcriptionally distinct cell types (TCs, BVECs, VSMCs, Fib, and HBCs) across placental tissues (Supplementary Figs. 2 and 3).

The TCs are exclusive to the placental tissues and exhibit numerous subtypes, yet their diversity remains poorly understood in the porcine placenta. To gain deeper insights into the trophoblast subtypes, we performed re-clustering analysis of the TCs and discovered a diversity of cell subtypes that have not previously been characterized. In total, we found 12 distinct trophoblast subtypes at both E40 and E70 stages (Fig. 2a, b), based on their expression of specific marker genes and conserved transcriptome patterns (Fig. 2c, d; Supplementary Fig. 4 and Supplementary Data 1). Ten of these cell subtypes were shared between the two developmental phases, including two trophoblast precursor cell types (TPC1 and TPC2; high canonical marker gene EPHA4 expression)35, three areolar epithelial cell types (AEC1, AEC2, and AEC3; high canonical marker genes CTSL or SPINT1 expression)36,37, and five other trophoblast subtypes (TC1, TC2, TC3, TC4, and HEPACAM2+ TC). In addition, we also identified four cell types were presented in a stage-dependent manner. For instance, the AEC precursor cells (co-expressing markers of AEC and trophoblast precursor cells) and AEC-like cells (high SLC2A2 expression) were only identified at the E40 stage, whereas the CLCN4+ TC and TPC3 subtypes were only found at the E70 stage. The cell number of each trophoblast subtype at the E40 and E70 stages was available in the Supplementary Data 2. The cell-type similarity analysis showed that the AEC-like cell subtypes were weakly correlated with all of trophoblast subtypes of the E70 stage (Supplementary Fig. 4; area under the receiver operating characteristics (AUROC) score <0.95). The gene expression patterns of TPC3 and CLCN4+ TC subtypes did not align with all of trophoblast subtypes of the E40 stage (Supplementary Fig. 4; AUROC score <0.95). Furthermore, we performed immunofluorescence assays to validate these findings, confirming the presence of the TPC1 (DIAPH3), HEPACAM2+ TC (HEPACAM2), AEC1 (THRB), and AEC2 (CHST3) (Fig. 2e–h). Together, these results underscore the widely existence of cell heterogeneity in the developing porcine placenta, suggesting the complexity of the temporal dynamic changes of the cell populations and their gene transcriptional patterns.

Fig. 2: Cell subtypes of placental trophoblast.
figure 2

a, c UMAP projection of trophoblast subtypes of the E40 (a) and E70 (c) porcine placental tissues. b, d Dot plots showing the expression of representative marker genes for each trophoblast subtype of the E40 (b) and E70 (d) porcine placental tissues. e–h Representative immunofluorescence imaging of sections of placenta stained by DIAPH3 (e), HEPACAM2 (f), THRB (g), and CHST3 (h). Each experiment was conducted three time (n = 3 biological replicates) with similar results. Scale bars, 20 μm. AEC areolar epithelial cell, TPC trophoblast precursor cell, TC trophoblast cell, E40 embryonic day 40, E70 embryonic day 70.

To further examine whether trophoblast subtypes of pigs are conserved with those of other mammalian species, we incorporated the human38 and mouse35 placental snRNA-seq datasets with well-defined trophoblast subtypes and performed cross-species analysis. We found that, among the top 50 marker genes, except for certain genes associated with cell cycle and stemness that were shared by progenitor and stem cell populations, almost all porcine trophoblast subtypes exhibited no genes or at most one shared gene with human and mouse trophoblast subtypes (Supplementary Data 3 and Source data 1). In addition, aside from a few canonical maker genes, the top 50 marker genes of all trophoblast subtypes in pigs did not contain the orthologous canonical maker genes of human and mouse trophoblast subtypes, including the syncytiotrophoblast (SCT) markers TFAP2A, CYP19A1, and MFSD2A, the villous cytotrophoblast (VCT) markers TP63, EGFR, and YAP1, the VCT-cytotrophoblast cell columns (VCT-CCC) markers ITGB6, CDH5, LPCAT1, NOTCH1, and ITGA2, the VCT_fusing markers ERVW-1, ERVFRD-1, and GREM2, the extravillous trophoblast cells (EVTs) markers ADAMTS20, ITGA1, and PLAC8, the interstitial EVTs (iEVTs) markers ERBB2, SERPINE1, SERPINE2, and PLAU, the giant cells (GCs) markers AOC1, RAC1, CD81, PRG2, and PAPPA, the endovascular trophoblast cells (eEVTs) markers FLT4, NCAM1, GGT1, PPFIA4, MMP12, and EIF4E1B, the syncytiotrophoblast I (SynTI) markers Tgfa, Tfrc, Glis1, and Stra6, the syncytiotrophoblast II (SynTII) markers Synb and Vegfa, the SynTII precursor markers Lgr5 and Ror2, the labyrinth trophoblast progenitor 1 (LaTP_1) markers Tcf7l1, Met, and Pvt1, the LaTP_2 marker Egfr, the junctional zone precursor 1 (JZP_1) marker Cdh4, the JZP_2 markers Prune2 and Ncam1, the glycogen cells markers Pla2g4d, Plac8, and Pcdh12, the spongiotrophoblasts (SpT) markers Flt1 and Prl8a9, and the sinusoidal trophoblast giant cells (S-TGC) markers Ctsq, Lepr, and Podxl35,38. These results imply that the trophoblast subtypes of pigs poorly share conserved gene expression patterns with the trophoblast subtypes of humans and mice.

To confirm this, we further performed the cell-type similarity analysis on the trophoblast subtypes among the pigs, humans, and mice using the MetaNeighbour algorithm39. Except for progenitor cells, we observed that the gene expression patterns of almost all trophoblast subtypes of pigs were not conserved (AUROC score <0.95) with those of humans and mice (Supplementary Fig. 5). The observed divergent features of cell identity may be attributed to the species-specific placental characteristics, with porcine placentas exhibiting an epitheliochorial and diffused architecture, in contrast to the hemochorial and discoid morphology characteristic of human and mouse placentas40,41,42.

Cell-type-specific genomic imprinting landscape

Our high-resolution PPCA offers a powerful resource for investigating the cell-type-specific map of genomic imprinting. By combining snRNA-seq and whole-genome sequencing, we achieved the precise detection of genomic imprinting across various placental cell types. Leveraging more than 4.9 million informative single-nucleotide variations (SNVs) between Duroc and Lulai pig breeds, we identified parental origin of allelic expression in reciprocal F1 progeny. Here, we defined p1 (maternal ratio) as the expression ratio of Duroc alleles in Duroc mother × Lulai father cross and p2 (paternal ratio) as the Duroc expression ratio in Duroc father × Lulai mother cross. Using relatively stringent criteria (average p1 > 0.70, p2 < 0.30 for maternal expression (MAT) and average p1 < 0.30, p2 > 0.70 for paternal expression (PAT), each F1 replicate meeting cutoff of p1 > 0.70, p2 < 0.30 or p1 < 0.30, p2 > 0.70), and excluding genes on the sex chromosomes, we identified a total of 118 candidate imprinted genes expressed in at least one cell type or one developmental stage (Fig. 3a, b; Supplementary Figs. 6 and 7 and Source data 2). Of the 118 candidate imprinted genes, 97 and their orthologous genes have not been previously reported in mammalian genomic imprinting studies (Fig. 3b and Supplementary Figs. 6 and 7)13,14,15,16,17,18,43,44,45. Previous research has reported that certain imprinted genes showed variability in their imprinting status among individuals, being imprinted in some individuals while exhibiting near-biallelic expression in others46. Therefore, we further used a cutoff of average p1 > 0.70, p2 < 0.30 and p1 < 0.30, p2 > 0.70, but a conservative cutoff for each replicate (p1 > 0.60, p2 < 0.40 or p1 < 0.40, p2 > 0.60), to identify potential imprinted candidates. Additional 116 potential imprinted candidates were identified in the porcine placenta (Supplementary Data 4), including a subset of known imprinted genes (such as ZNF791, TRIOBP, GPR158, CCDC24, CHRAC1, ENSSSCG00000049216, TTC9C, PRR16, and SHMT1).

Fig. 3: Cell-type-specific genomic imprinting landscape.
figure 3

a The number of the maternally and paternally expressed candidate imprinted genes in each cell type. b The imprinting status of candidate imprinted genes in Chromosome 1 and 2 of pigs across cell types. The average parental allele-specific expression ratio is shown for each candidate imprinted gene (0 means 100% biased paternal expression and 1 means 100% biased maternal expression). The stringent criteria were defined as average p1 > 0.70, p2 < 0.30 for maternal expression (MAT) and average p1 < 0.30, p2 > 0.70 for paternal expression (PAT), and each F1 replicate meeting cutoff of p1 > 0.70, p2 < 0.30 or p1 < 0.30, p2 > 0.70. The standard criteria were defined as the average allelic ratio > 0.70 for MAT or <0.30 for PAT. c Correlation between the detectable candidate imprinted genes and cell number of cell types. Pearson’s r and corresponding P values (two-sided t-test) are indicated. Source data are provided as Source Data file. d Bar plot showing the number of the candidate imprinted genes grouped by their imprinting status across different cell types and development stages. MAT maternal expression, PAT paternal expression, BAE biallelic expression, BVECs blood vascular endothelial cells, Fib fibroblast, HBCs Hofbauer cells, VSMCs vascular smooth muscle cells, AEC areolar epithelial cell, TPC trophoblast precursor cell, TC trophoblast cell, E40 embryonic day 40, E70 embryonic day 70.

For the 118 candidate imprinted genes, we calculated the distribution of their allelic ratios. The results showed that 47 genes (40%) had a 0.95/0.05 average allelic bias and 74 genes (62.7%) had a 0.90/0.10 average allelic bias in at least one cell type or one developmental stage (Supplementary Fig. 8). Signals from mixed cellular population (i.e., monoallelic in some cells but biallelic in others) and variability in imprinting status among individuals likely lead to the observation of biased allelic expression in certain genes, rather than strict monoallelic expression46,47. We selected certain candidate imprinted genes including ENSSSCG00000052363, MMP16, ENSSSCG00000058181, ENSSSCG00000056411, SLC19A1, EIF3E, SYT14, PGM2L1, ABCA6, AMPH, and DPPA5 to validate their parent-of-origin-specific expression using the reverse transcription polymerase chain reaction (RT-PCR) and Sanger sequencing (Supplementary Fig. 9). Although most of these imprinted candidates were verified in all tested individuals, five genes including MMP16, ENSSSCG00000058181, EIF3E, ENSSSCG00000056411, and ABCA6 displayed variability in imprinting status among individuals. This variability may lead to a unique epigenetic signature for each individual and thus provides the opportunity for natural selection to operate under the regulation of parent-of-origin differences46. It is noteworthy that we employed Sanger sequencing of whole tissue samples to validate the imprinting status of specific genes. However, the overall imprinting status observed in the whole tissue may obscure cell type-specific variations in imprinting, which could be one of the reasons for the interindividual variability in imprinting status of a given gene. The extent of such variability likely depends on the ratio of cells exhibiting imprinting to those showing biallelic expression of the gene within the tissue, as well as on the expression levels of the gene in these distinct cellular populations. These findings underscore the necessity of validating imprinting status at the level of cell types or subtypes in future studies.

Among the 40 known pig autosomal imprinted genes in the Geneimprint database (http://www.geneimprint.com/), 28 of them had sufficient allele-specific snRNA-seq reads to assess their imprinting status in at least one cell type or one developmental stage. We found that 10 (IGF2R, IGF2, H19, KCNQ1, DLK1, NAP1L5, SGCE, PGM2L1, KBTBD6, and COPG2) of these 28 genes had a parent-of-origin expression using the stringent criteria mentioned above (Fig. 3b and Supplementary Figs. 6, 7) and 13 genes (MEST, DLX5, PON2, PEG10, INPP5F, PLAGL1, THRB, DHCR7, MEG3, PPP1R9A, TACC2, RASGRF1, and OSBPL1A) exhibited parental biasedly expression using the standard criteria (the average allelic ratio >0.70 (MAT) or <0.30 (PAT)) (Supplementary Fig. 10). By systematically summarizing the reported imprinting status and tissue origins of the 18 (13 meeting the standard criteria plus 5 showing biallelic expression in this study) imprinted genes in pigs from previous literature, we found that half of these genes were originally reported to exhibit imprinting in non-placental tissues (e.g., brain, liver, kidney, or skeletal muscle) rather than in placental tissues (Supplementary Data 5)16,48,49,50,51,52,53,54,55,56,57,58. This suggests that tissue-specific imprinting regulation and differences in developmental stages may contribute to the discrepancy between the imprinting status reported in databases and the findings of this study. Furthermore, we also acknowledged that factor such as limited SNVs coverage of 10x Genomics 3’ snRNA-seq approach, which may prevent the distinction of isoform-specific imprinting, may also account for the observed inconsistencies.

We further conducted a comparative analysis of candidate imprinted genes in the porcine placenta between the bulk RNA-seq study18 and the present snRNA-seq study. In this study, 16 and 25 candidate imprinted genes identified by Quan and co-workers18 had sufficient allele-specific snRNA-seq reads to assess their imprinting status in at least one cell type at the E40 and E70 stages, respectively. Of these, 16 out of 16 and 19 out of 25 exhibited parental biasedly expression at the E40 and E70 stages, respectively (Supplementary Data 6). Taken together, the predominance of imprinting status observed among the known imprinted genes in the porcine placental tissue underscores the sensitivity and precision of our single-cell integrative omics approach.

Notably, we observed a positive correlation between the number of the candidate imprinted genes identified and the number of cells detected within each cell type (Fig. 3c and Source data 3). This correlation is expected as our approach identifies imprinted genes through the entire cell clusters rather than individual cells. However, it should be noted that certain cell types can deviate from this pattern, as observed in BVECs and TPC1 at the E70 stage (Fig. 3c). Despite BVECs having approximately three times more cells than TPC1, both showed similar numbers of detected genes. Conversely, TPC1 exhibited approximately twice as many identified imprinted genes as BVECs. This highlights the specificity of gene regulatory programs and allelic expression patterns across different cell types.

Only approximately 25% of the candidate imprinted genes maintained consistent parent-of-origin effects across different cell types and developmental stages (Fig. 3d). This finding indicates that the imprinting status of nearly 75% of candidate imprinted genes in our study is dependent on cell types or developmental stages. For example, the maternal biased expression of IGF2R at cellular resolution was detected in BVECs, Fib, VSMCs, HBCs, and HEPACAM2+ TC, while biallelic expression was observed in almost all trophoblast subtypes (Fig. 3b). In pigs, previous studies reported that imprinting statuses of IGF2R were inconsistent and conflicting, even within the same tissue16,48,59. It is noteworthy that these studies relied on analyses of bulk tissue. As mentioned above, the overall imprinting status observed in bulk tissue might mask cell-type-specific variations in imprinting, which is likely to be one of culprits of the conflicting parent-of-origin effects previously reported for IGF2R or other imprinted genes.

In addition, we identified candidate imprinted genes whose imprinting statuses were developmental-stage-specific in many cell types. For instance, at the E70 stage, the candidate imprinted gene AMPH exhibited paternal expression in cell types such as TPC2, TC1, TC2, and TC3, with a tendency towards paternal expression in TPC1, while it was biallelic expressed in these cell types at the E40 stage (Supplementary Fig. 7). The potential imprinted candidate DPPA5 exhibited imprinting patterns similar to those of AMPH across trophoblast subtypes (Supplementary Data 4). The RT-PCR and Sanger sequencing validation experiments further confirmed that AMPH and DPPA5 were biallelic expressed at the E40 stage but exhibited paternal expression at the E70 stage (Supplementary Fig. 9). Additionally, our investigation identified a small subset of the 118 candidate imprinted genes, including PGM2L1, ABCA6, KYAT1, HTR4, PKDCC, CERK, ATP2C2, ENSSSCG00000055179, ALDH5A1, FOLH1, LASP1, ENSSSCG00000054715, SGSM2, and PDE1A, that exhibit opposite allelic expression across developmental stages. It is not uncommon for certain genes to display opposite allelic expression in the same tissue, often due to alterations in the regulation of gene expression during development or when contextual factors change60. To further validate this, we selected the known imprinted gene PGM2L1 and the candidate imprinted gene ABCA6 for RT-PCR and Sanger sequencing. The results demonstrated that both were maternally expressed at the E40 stage and exhibited paternal expression at the E70 stage (Supplementary Fig. 9). Allelic switches of imprinted genes were observed in the same cell types across different developmental stages, suggesting their regulatory roles in placental development and function.

Imprinted genes involved in trophoblast development

To further investigate the roles of imprinted genes in placental trophoblast development, we employed the partition-based graph abstraction (PAGA) method61 to infer the lineage trees from snRNA-seq data of trophoblast subtypes in the E40 and E70 placental tissues. We obtained the landscapes showing projections from trophoblast precursor cells (TPC1 and TPC2) towards the mature trophoblast cell subtypes (Fig. 4a, b). Notably, at both E40 and E70 stages, TPC1 and TPC2 were positioned at the starting point of the differentiation trajectories. However, their connected patterns with other trophoblast subtypes undergo significant changes between these two developmental phases, indicating dynamic gene transcriptional plasticity and differentiation patterns. We then identified differentially expressed genes (DEGs) between the E40 and E70 stages in TPC1 and TPC2 (Fig. 4c, d and Supplementary Data 7). Interestingly, the majority of DEGs in TPC1 and TPC2 were shared. Among these, the candidate imprinted genes DPPA5 and AMPH were more abundant at the E40 stage compared to the E70 stage. The DPPA5, a specific marker for pluripotent stem cells, encodes a protein involved in maintaining cell pluripotency and early embryogenesis62. AMPH encodes a cytoplasmic substrate for cyclin-dependent kinase-like 5 that is crucial for neuronal development63. These results suggest that the candidate imprinted genes DPPA5 and AMPH may contribute to the development of placental trophoblast.

Fig. 4: Involvement of imprinted genes in the development of trophoblast cells.
figure 4

a, b Partition-based graph abstraction (PAGA) showing the trajectory of trophoblast cells of the E40 (a) and E70 (b) porcine placental tissues. c Volcano plot of TPC1 transcriptomic comparison between the E40 and E70 stages. d Venn diagram represents the overlaps of the upregulated and downregulated DEGs between TPC1 and TPC2. e Gene set scoring analysis for TPC1 and TPC2 at the E40 and E70 stages using the DNA methylation and demethylation gene sets. Individual data points are shown (n = 1854 nuclei for TPC1 at the E40 stage from 6 porcine placental samples; n = 622 nuclei for TPC1 at the E70 stage from 6 porcine placental samples; n = 5926 nuclei for TPC2 at the E40 stage from 6 porcine placental samples; n = 8761 nuclei for TPC2 at the E70 stage from 6 porcine placental samples). Box plots show the median (center line) and the 25th and 75th percentiles (box edges) and whiskers indicate the 1.5x inter-quartile range above or below the box. The two-sided Wilcoxon rank-sum test P values and Cohen’s d effect size are displayed. P values < 0.05 and |d | > 0.2 are considered significant. f Correlation between the expression of DNA demethylation gene sets and DPPA5-AMPH gene set. Pearson’s r and corresponding P values (two-sided t-test) are indicated. The solid line and shaded area represent the linear regression fit and 95% confidence interval, respectively. AEC areolar epithelial cell, TPC trophoblast precursor cell, TC trophoblast cell, FC fold changes, E40 embryonic day 40, E70 embryonic day 70.

Epigenetic modifications, particularly DNA methylation, are recognized as an important role in regulating imprinted gene expression, cellular differentiation, and the maintenance of a functional placenta64,65,66. Therefore, we performed the gene set scoring (see methods) to analyze the expression profiles of genes involved in DNA methylation (DNMT1, DNMT3A, DNMT3B, DNMT3L, UHRF1, CXXC1, KMT2A, and SETDB1) and demethylation (GADD45A, RNF4, TET3, GADD45B, TET2, PARP1, TDG, TET1, XRCC1, MBD4, SMUG1, and AICDA) (Supplementary Data 8)66. We observed that genes responsible for maintaining DNA methylation showed no significant expression variations between different development stages in both TPC1 and TPC2 (Fig. 4e). However, genes involved in DNA demethylation were highly expressed in TPC1 and TPC2 at the E40 stage compared to the E70 stage (Fig. 4e). We sought to determine whether the DNA demethylation was associated with the expression of candidate imprinted genes DPPA5 and AMPH. We performed a Pearson correlation analysis and found that the expression levels of genes involved in DNA demethylation, particularly AICDA, which encodes a cytidine deaminase involved in DNA demethylation, were positively correlated with the expression levels of DPPA5 and AMPH in both TPC1 and TPC2 (Fig. 4f). These results provide valuable insights into gene expression patterns that may be influenced by the epigenetic regulatory mechanisms. Nevertheless, the expression analysis of DNA methylation-related genes suggests only a potential association with biased allelic expression, rather than establishing causal mechanisms. Future studies directly examining allelic DNA methylation or histone modifications at these genes will provide the most conclusive evidence for their regulatory mechanisms.

Cross-species analysis reveals cell-cell communication driving cell-type-specific allelic expression

Genomic imprinting, as a unique form of epigenetic regulation, has inspired several evolutionary theories, such as the kinship theory67,68, sexual antagonism theory69, and maternal-offspring coadaptation theory70. Despite their varied foundations, these theories share a critical common feature: selective asymmetry between maternally and paternally inherited allele copies drives the preferential expression of certain alleles. Our preceding results show that parent-of-origin effects are cell-type-dependent, which suggests that the selective forces may also be cell-type-specific. However, this important principle remains untested.

The cell-cell communication through ligand-receptor signaling pathways plays a vital role in adaptive responses of cell types to internal or external factors. Herein, we hypothesize that the integration and modification of signaling pathways into a cell-type-specific autocrine or paracrine network is an important mode of imprinting regulation. To address this point, we conducted a systematic analysis of cell-cell communication among all the cell types in the E70 porcine placenta using CellChat, a tool that quantitatively infers and analyzes intercellular communications based on ligand-receptor interactions71. Our results showed that the IGF signaling is the primary communicating pathway among the placental cell types in pigs (Fig. 5a). Network centrality analysis identified that BVECs were the dominant mediator, suggesting its role as a gatekeeper of cell-cell communication (Fig. 5b). Interestingly, the majority of IGF interactions among trophoblast subtypes were paracrine, with significant autocrine signaling observed only in BVECs, Fib, and VSMCs (Fig. 5b). Notably, among all known ligand-receptor pairs, we observed that all placental cell types expressed IGF2. However, IGF signaling was primarily mediated by IGF2 ligand and its IGF2R receptor in BVECs, Fib, and VSMCs (Fig. 5c, d). Consistent findings were also observed in the E40 porcine placenta (Supplementary Fig. 11). IGF2 consistently exhibited paternal expression across all cell types, whereas IGF2R primarily maternally expressed in BVECs, Fib, and VSMCs. This pattern suggests a non-coincidental relationship between the imprinting status of IGF2 and IGF2R and their integration into autocrine signaling networks within these specific cell types.

Fig. 5: Comparison of cell-cell communication among pigs, mice, and humans.
figure 5

a, e, i The IGF signaling pathway networks in the placental tissues of pigs (a), mice (e), and humans (i). Edge width represents the communication probability. b, f, j Heatmap shows the relative importance of each cell type based on the computed four network centrality measures of IGF signaling pathways in pigs (b), mice (f), and humans (j). c, g, k Relative contribution of each ligand-receptor pair to the IGF signaling pathways in pigs (c), mice (g), and humans (k). d, h, l Violin plot shows the expression distribution of ligands and receptors involved in IGF signaling pathways across cell types in the placental tissues of pigs (d), mice (h), and humans (l). m–o The allelic expression of IGF2 and IGF2R in the human placental Fib1 (m), BVECs (n), and dM2 (o). BVECs blood vascular endothelial cells, Fib fibroblast, HBCs Hofbauer cells, VSMCs vascular smooth muscle cells, AEC areolar epithelial cell, TPC trophoblast precursor cell, TC trophoblast cell, BVECs (d) decidual blood vascular endothelial cells, SpA-TGC spiral artery-associated trophoblast giant cell, PrE primitive endoderm, BVECs (p) placental blood vascular endothelial cells, SCT syncytiotrophoblast, VCT villous cytotrophoblast, EVT extravillous trophoblast, dM decidual macrophages, E40 embryonic day 40, E70 embryonic day 70.

To further elucidate the relationship between the intercellular communication and imprinting regulatory evolution in the placenta among different species, we compared our findings with scRNA-seq datasets of mouse and human placental tissues (see methods). We identified eight cell types (trophoblast cell, spiral artery-associated trophoblast giant cell (SpA-TGC), BVECs, stromal cell, immune cell, primitive endoderm (PrE) cell, hematopoietic cell, and erythrocyte) in the mouse placenta (Supplementary Data 9) and ten cell types (VCT, SCT, EVT, HBCs, BVECs, two subtypes of Fib (Fib1 and Fib2), decidual BVECs, and two subtypes of decidual macrophages (dM1 and dM2)) in the human placenta (Supplementary Data 10). Similar to pigs, in the mouse placenta, BVECs were the dominant mediator in the IGF signaling network and the ligand-receptor interaction between Igf2 and Igf2r was integrated into autocrine network (Fig. 5e–h and Supplementary Fig. 11). A previous study showing that paternally expressed Igf2 and maternally expressed Igf2r were highly expressed and engaged in a robust ligand-receptor interaction in the mouse placental BVECs72. Together with our results, this supports the hypothesis that the integration of signaling pathways into cell-type-specific autocrine networks plays a crucial role in the regulatory evolution of imprinted genes.

Surprisingly, we did not observe the formation of an autocrine network involving human orthologues of these genes, and although IGF2 was highly expressed in human placental Fib and BVECs, the expression level of IGF2R was low (Fig. 5i–l and Supplementary Fig. 11). We investigated whether the absence of an autocrine network in specific cell types would affect the imprinting status of the associated genes. By evaluating the allelic expression of IGF2 and IGF2R with matched placental whole-genome sequencing datasets (see methods), we observed that IGF2 was biasedly allelic expressed (allelic bias = 0.99), whereas IGF2R was biallelic expressed (allelic bias = 0.63) in human placental Fib1 (Fig. 5m). There were insufficient allele-specific scRNA-seq reads of heterozygous SNVs over the IGF2R in human placental BVECs. Thus, we collected RNA-seq data for human placental endothelial cells from ENCODE to rigorously evaluate the allelic expression of IGF2 and IGF2R. Similar to placental Fib1, IGF2 was biasedly allelic expressed (allelic bias = 0.85), but IGF2R was biallelic expressed (allelic bias = 0.54) in human placental endothelial cells (Fig. 5n). We identified an autocrine network formed by these genes in human dM2 (Fig. 5l and Supplementary Fig. 11). Thus, we further evaluated the allelic expression of IGF2 and IGF2R using matched decidual whole-genome sequencing datasets (see methods). Interestingly, both IGF2 and IGF2R showed biased allelic expression in human dM2 (Fig. 5o). Previous studies reported paternal monoallelic expression of IGF2 and predominantly biallelic expression of IGF2R in the human placenta4,73,74. Similarly, except for dM2, our results indicate that IGF2R is not imprinted in human placental cell types. Taken together, these results suggest that cell-cell communication may mediate biased allelic expression of certain imprinted genes in a cell-type-specific manner.

In addition to IGF2 and IGF2R, we investigated whether the regulatory evolution of other imprinted genes could be explained by our proposed principle. Although CellChat provides a comprehensive database reflecting the current understanding of ligand-receptor interactions, most candidate imprinted genes identified in this study were not assignable to ligands or receptors, except for IGF2, IGF2R, DLK1, and TNFSF8. This is probably due to a lack of prior knowledge. However, fundamentally, ligand-receptor interactions rely on physical and functional protein-protein interactions. Therefore, we assessed the protein-protein interactions among the 234 (118 imprinted candidates plus 116 potential imprinted candidates) candidate imprinted genes using the STRING database, which systematically collects and integrates protein-protein interactions75. A total of 170 candidate imprinted genes could be mapped to the STRING database. We found that 90 out of the 170 candidate imprinted genes exhibited protein-protein interactions, resulting in 82 interaction pairs (Supplementary Fig. 12 and Supplementary Data 11). We selected a subset of predicted protein-protein interactions, including DLK1-COPG2, LSG1-POLR1D, and DLK1-KCNQ1, representing both known and candidate imprinted genes in this study, to validate their direct interactions through co-immunoprecipitation assays and yeast two-hybrid assays. The results confirmed that DLK1 directly interacts with COPG2 and KCNQ1, and LSG1 directly interacts with POLR1D (Supplementary Fig. 13). Future efforts are needed to validate all predicted protein-protein interactions. Additionally, certain predicted interactions may involve indirect rather than direct physical interactions, necessitating the application of more advanced validation methods in future studies.

To further corroborate our hypothesis, we calculated a cell-type specificity index of gene co-expression (see methods). This analysis revealed that most gene-gene interaction pairs were co-expressed in a cell-type-specific manner, exhibiting higher co-expression levels in the cell types where their imprinting was observed (Supplementary Fig. 12 and Supplementary Data 11). In both mice and humans, 82 known imprinted genes from the Geneimprint database could be mapped to the STRING database, resulting in 444 and 399 interaction pairs, respectively (Supplementary Fig. 14 and Supplementary Data 12). These imprinted gene-gene pairs also displayed cell-type-specific co-expression (Supplementary Fig. 14 and Supplementary Data 12). Together, our results indicate that the cell-type-specific parent-of-origin effects may be largely driven by the cell-cell communication.

Discussion

Genomic imprinting is an epigenetic regulation that regulates the parent-of-origin-specific expression of certain genes and it has been supposed to be one of the most crucial regulatory pathways involved in the placental development of eutherian mammals. Previous pioneering work aimed at identifying imprinted genes at the organ or tissue level in eutherian mammals such as humans, mice, and pigs has advanced our understanding of genomic imprinting14,15,18,44. However, mounting evidence indicates that cell-type-specific imprinted transcription is more pervasive than previously understood47. Given the heterogeneous nature of placental tissue, our study utilizes reciprocal crosses of genetically distinct Duroc and Lulai pig breeds, combined with snRNA-seq and whole-genome sequencing, to accurately assess parent-of-origin effects on gene transcription. We decipher a high-resolution single-cell atlas of porcine placenta development, which offers an important resource for investigating imprinted genes at cellular level. This approach reveals the genomic imprinting of a series of well-defined cell types in the pig placenta, allowing us to validate nearly all known imprinted genes and discover 97 previously uncharacterized candidate imprinted genes. This demonstrates that the proposed single-cell integrative omics methodology is a powerful and robust tool for constructing a cell-type-specific landscape of genomic imprinting.

A previous study calculated allelic ratios (MAT/PAT) of 25 known imprinted genes and found uniform canonical imprinted gene expression across five cell types in mouse cerebral cortex76. In the present study, we also found that a small subset of candidate imprinted genes (30/118) exhibited consistent parent-of-origin expression regardless of cell types and developmental stages. Although lacking effective high-throughput screening for imprinted genes, pioneering efforts have provided evidence that individual genes exhibit imprinting status in a cell-type- and developmental-stage-dependent manner by using the fluorescence in situ hybridization and luciferase transgenic reporter strategy19,77. This is in line with our results which reveal that highly cell-type- and developmental-stage-dependent allelic effects for imprinted genes. In 1991, Igf2r, encoding a multiple ligand-binding cell surface receptor, was one of the first reported maternally expressed imprinted genes in mice78. However, several studies have shown the conflicting parent-of-origin effects for IGF2R in other mammals. In pigs, Bischoff and colleagues reported that IGF2R was identified as maternally expressed imprinted genes in brain and liver tissues, but not in placental tissues48. Braunschweig reported that IGF2R was expressed from both parental alleles in kidney and muscle tissues of fetal, newborn, and adult pigs as well as fetal brain and liver tissues59. Another study reported that IGF2R was paternally expressed in porcine brain tissue samples16. The discordance observed in the overall imprinting status of IGF2R in bulk tissue is likely attributable to differences in number of cells expressing both alleles or only one allele, As the present study reported, IGF2R was maternally expressed in certain placental cell types, but exhibited biallelic expression in others. Our results demonstrate that a full capture of cell-type-specific imprinted genes using a single-cell transcriptome-based approach is conceivable and superior to bulk RNA-seq methods.

In 1984, pronuclear transplantations, where embryos with uniparental diploid genomes developed abnormally, inspired the concept that the two parentally inherited genomes in mammalian cells are functionally distinct79,80, revealing the need for distinct epigenetic imprints acquired in parental gametes. Following the discovery of the first imprinted genes (Igf2r, Igf2, and H19) in 199178,81,82, there has been a longstanding interest in understanding how genomic imprinting evolves in response to selective pressure. Several theories, such as the kinship theory67,68, sexual antagonism theory69, and maternal-offspring coadaptation theory70, have dedicated to explain its existence. The most widely accepted theory explaining the evolution of imprinting, known as the kinship theory, posits that imprinting occurs when there is a conflict of interest between the maternal and paternal genomes due to their differential investment in the offspring (e.g., in the allocation of nutrients)67,68. Strictly speaking, imprinting possibly arise whenever there is an asymmetric interaction between an individual and its maternal and paternal relatives. For example, Igf2 and Igf2r interact with each other in signaling pathways that regulate the expression of genes related to fetal growth (Igf2 promotes fetal growth, whereas Igf2r inhibits excessive fetal growth)72. This cell-cell communication through ligand-receptor signaling pathways mediates adaptive responses to maternal and paternal asymmetries in resource allocation. This may explain the opposite imprinting patterns of IGF2 and IGF2R in the placentas of pigs and mice. In humans, however, IGF2R predominantly exhibits biallelic expression in the placenta, likely due to the fact that most human pregnancies are singleton. The lack of competition between the parental genomes may explain the relatively relaxed necessity for placenta-specific imprinting in humans4.

Cell-cell communication plays an important role in mediating the adaptive responses of cell types to both internal and external factors. Thus, cell-type-specific imprinting may occur whenever the integration and modification of signaling pathways into a cell-type-specific autocrine or paracrine network influences the phenotype or fitness resulting from asymmetric interactions between an individual and its maternal and paternal relatives. Our hypothesis regarding cell-cell communication-induced imprinting mode provides an additional molecular mechanism to explain the well-known discrepancies in IGF2R imprinting observed between humans, mice, and pigs. Furthermore, to better understand the biological functions of predicted imprinted gene-gene interaction pairs in this study, we performed the gene ontology (GO) enrichment analysis. We found that these genes were enriched in processes related to nutrient metabolism (e.g., fatty acids, amino acids, and organic acids metabolic processes), ion transport, and cell proliferation and growth (Supplementary Data 13). These findings suggest that placental imprinted genes are likely involved in processes critical for balancing maternal-offspring resource allocation. Future efforts connecting cellular transcription activity to phenotype and fitness in the context of intercellular interactions will further validate the generality of our findings in eutherian mammals. That being said, the mechanisms by which the cell-type-specific selective asymmetry drives the evolution of epigenetic regulation to control parent-of-origin-specific allele expression in the mammalian placenta remain to be further explored.

In summary, our data and results provide an essential resource to the genetics community and molecular insights into parent-of-origin effects on gene expression in mammals. In addition to these notable findings (Fig. 6), our current study is not without limitations. Our single-nucleus poly-A-mRNA data does not capture many noncoding RNAs and low snRNA-seq reads coverage of certain lowly expressed genes limits the statistical power. In addition, the limited coverage of SNVs using the 10x Genomics 3’ snRNA-seq approach may not be able to distinguish isoform-specific imprinting. Future researches should seek to complement our single-nucleus transcriptomic data to overcome its limitations. Placental tissue samples should be collected and analyzed from more time points to better understand the mechanisms by which imprinted genes regulate placental development. Additional assays and functional validations are required to obtain a complete understanding of the genomic imprinting and the mechanisms of its variation across cell types and developmental stages.

Fig. 6: Schematic model summarizing the findings of this study.
figure 6

snRNA-seq single-nucleus RNA sequencing, BVECs blood vascular endothelial cells.

Methods

Animal experiments and sample collection

To decode the genomic imprinting in the pig placenta, we selected two distantly related pig breeds (Duroc pig (Western breed) and Lulai pig (Chinese local breed)) to perform two reciprocal crosses. The pregnant sows from each cross were humanely euthanized to collect placental tissue samples. In the first Duroc mother × Lulai father cross and Duroc father × Lulai mother cross, we collected three placental tissue samples of F1 progeny from each cross at the E40 stage, along with ear tissue samples from all parents. Thus, a total of six placental tissue samples and four parental ear tissue samples were collected at the E40 stage. In the second Duroc mother × Lulai father cross and Duroc father × Lulai mother cross, the same number of placental tissue samples and parental ear tissue samples were collected at the E70 stage. In total, we collected 12 placental tissue samples (n = 6 (four males and two females) each from the E40 and E70 stages) and 8 parental ear tissue samples. All tissue samples were frozen in liquid nitrogen and promptly stored at −80 °C for subsequent analysis. All experiments conducted in the present study adhered to the ethical policies and procedures approved by the South China Agricultural University Animal Care and Animal Use Committee (approval number SCAU#2014-0136).

Sample preparation for snRNA­-seq

The frozen tissue samples were cut into 2 mm pieces and homogenized using a Dounce homogenizer (885302-0002, Kimble Chase) in 2 ml of ice-cold EZ lysis buffer (NUC-101, Sigma-Aldrich) supplemented with protease inhibitor (5892791001, Roche) and RNase inhibitor (Promega N2615 and AM2696, Life Technologies). Then, 2 ml of ice-cold EZ lysis buffer was add, and samples were incubated on ice for 5 min. Samples were filtered through a 40 mm cell strainer (43-50040-51, pluriSelect) and then centrifuged at 500 x g for 5 min at 4 °C. The nucleus pellet was resuspended and washed with 4 ml of ice-cold EZ lysis buffer and incubated on ice for 5 min. After another centrifugation, supernatant was removed and the nucleus pellet was resuspended in Nuclei Suspension Buffer (1x PBS, 0.07% BSA, and 0.1% RNase inhibitor) and filtered through a 20 mm cell strainer (43-50020-50, pluriSelect). The isolated nuclei were stained with 7-Aminoactinomycin D (AP104, MultiSciences) for 5 min and then sorted to remove debris and impurities using a BD FACSAria II flow cytometer. A final concentration of 700-1200 nuclei/μl was used for loading on a 10x V3 channel. Single-nucleus suspension was loaded on the 10x Chromium chip according to the manufacturer’s protocol for the Chromium Single Cell 3’ Reagent Kits v3 (1000075, 10x Genomics). Libraries were checked for quality and then sequenced on an Illumina NovaSeq 6000 platform (double-end sequencing, 150 bp).

snRNA-seq data analysis

The Cell Ranger (version 7.1.0) was used to perform sample demultiplexing and barcode processing. The reads of snRNA-seq data were aligned to the Sscrofa11.1 reference genome. We removed the low-quality nucleus profiles defined as those with either <500 detected genes, >4500 detected genes, or >50,000 unique molecular identifiers using the Seurat R package (version 4.3.0.1)83. The doublets in each dataset were further removed using the DoubletFinder package (version 2.0.3)84.

The individual aforementioned filtered, non-scaled Seurat objects of the E40 or E70 placental tissues were firstly merged into an object. The merged object was normalized and performed the dimensional reduction analysis using the “NormalizeData”, “FindVariableFeatures”, “ScaleData”, and “RunPCA” functions implemented in the Seurat. Then, the batch effects between samples were corrected using the “RunHarmony” function of Harmony package (version 1.0.3)85. The “FindNeighbors” and “FindClusters” functions of Seurat were used to accomplish cell clustering. The “RunUMAP” or “RunTSNE” function was used to visualize cells. Finally, the “FindAllMarkers” function of Seurat was used to determine the marker genes (log2 fold changes (log2FC) >0.25 and adjusted P value < 0.05).

The above approach enabled us to identify the origin of cells in each cluster, distinguishing whether they were from male or female placental samples. To further eliminate maternally derived nuclei from the placental snRNA-seq datasets, we utilized the expression of XIST, a gene expressed exclusively in female cells, as a marker to identify maternal contaminating cell clusters in placental samples. These maternal contaminating cell clusters (Cluster 5, 8, 13, 15, and 18 at the E40 stage, and Cluster 8, 16, and 17 at the E70 stage) were shared across both male and female placental samples, and were subsequently removed from male and female placental snRNA-seq datasets. Next, we carried out the dimensional reduction, cell clustering, and identification of marker genes for each dataset. The cell cluster identities were assigned as five broad cell types (TCs, BVECs, VSMCs, Fib, and HBCs) by manual annotation based on the expression of known marker genes. To further identify subtypes of the TCs at the E40 and E70 stages separately, the TCs were selected in each dataset and then used for integrating into Seurat objects. Each of these merged objects was performed the dimensional reduction, cell clustering, and identification of marker genes (Supplementary Data 1) as described above. To assess the similarities of subtypes of TCs between the E40 and E70 stages, we performed the MetaNeighbour analysis39 with default parameters. The AUROC scores were obtained, and if AUROC scores between cell subtypes were higher than 0.95, they were considered to be conserved or similar.

To compare whether subtypes of TCs of pigs are conserved with those of humans and mice, we download the placental snRNA-seq datasets of humans (Donor P13, https://www.reproductivecellatlas.org/mfi.html)38 and mice35 with well-defined trophoblast subtypes. We used the cell-type annotation already defined by the authors to classify each cell and then determined the marker genes (log2FC > 0.25 and adjusted P value < 0.05) for each trophoblast subtype using Seurat R package. The detected orthologous genes of these three species were retained to perform the cell-type similarity analysis using the MetaNeighbour algorithm with default parameters.

Immunofluorescence

Immunofluorescence was performed on 4 μm-thick, 4% PFA-fixed, and paraffin-embedded placental tissue sections. Firstly, antigen retrieval was accomplished by microwaving the sections at 95 °C for 15 min in Tris-EDTA (pH 8.0; B0035, Powerful Biology). Sections were cooled at room temperature and washed three times (5 min each time) with PBS (pH 7.4; B0040, Powerful Biology). After blocking with 3% BSA (A8010, Solarbio) for 30 min, the primary antibody was added for incubation at 4 °C overnight. Sections were washed three times (5 min each time) and incubated with corresponding secondary antibody for 50 min. Sections were washed three times (5 min each time) with PBS (pH 7.4). Nuclei were counterstained using 10 min incubation with DAPI (B0025, Powerful Biology). Finally, sections were embedded in mounting medium and stored at 4 °C. The primary antibodies used were anti-DIAPH3 (ER1802-54, HUABIO), anti-HEPACAM2 (ER1910-46, HUABIO), anti-THRB (AF0357, Affinity Biosciences), and CHST3 (ER1906-28, HUABIO).

Whole-genome sequencing and construction of parental genome

Ear tissue samples from 8 parental individuals were used for DNA extraction based on the protocol outlined in Sambrook and Russell (2006). The TruSeq Nano DNA LT Library Prep Kit (Illumina Inc., USA) was used to construct the DNA libraries and then the libraries were checked for quality using the Agilent Bioanalyzer 2100. The libraries were sequenced on the Illumina HiSeq X10 platform in a 150-bp paired-ended manner.

Details of the pipeline to construct each parental genome have been described in our previous study18. In brief, the reads from whole-genome sequencing were mapped to Sscrofa11.1 reference genome using the BWA MEM software (version 0.7.16a)86. The SNVs were called using the Genome Analysis Toolkit (GATK, version 4.1.10)87 and removed the SNVs that showed heterogeneous calls. The individualized parental genomes were created by replacing the reference allele with homozygous alternative allele.

Identification of imprinted genes

To identify imprinted genes in well-defined cell types of each snRNA-seq dataset, the Seqtk tool (https://github.com/lh3/seqtk) was used to extract sequences in the FASTQ format based on the cell barcodes of each cell type. Then, extracted snRNA-seq reads of cell types from each hybrid animal were mapped to each of the two individualized parental genomes using the HISAT2 (version 2-2.1.0)88. Only a read that contained more alleles matching the genome than its mate was assigned to a genome. After assignment of reads, number of reads originated from each gene was counted using the BEDtools (version 2.30.0)89 and a read count table of genes for each of the two parental genomes was generated. The details of the analytics pipeline have been deposited to the GitHub repository (https://github.com/JianpingQuan/ASE-analysis).

To detect significant parent-of-origin expression in reciprocal F1 of Duroc and Lulai pigs, the Duroc expression ratio in Duroc mother × Lulai father cross (maternal ratio) and the Duroc expression ratio in Duroc father × Lulai mother cross (paternal ratio) were defined as p1 and p2, respectively. Previous studies have shown that most known imprinted genes have an allelic ratio over 0.7 or below 0.3 and confirmed the generality of this criteria13,76,90. Additionally, the sequencing read counts is a factor that can statistically influence allele bias, as lowly expressed transcripts may lead to false positives in the detection of allele bias. Previous studies have reported that a minimum coverage of 10 reads for heterozygous SNVs was required to accurately calculate allele-specific expression76,91. Therefore, to identify imprinted candidates, we used relatively stringent criteria cutoff of average p1 > 0.70, p2 < 0.30 for maternally expressed candidate imprinted genes and average p1 < 0.30, p2 > 0.70 for paternally expressed candidate imprinted genes (at lead 15 reads of coverage for heterozygous SNVs in more than half of replicates), and the allele-specific expression of each F1 replicate was required to show the same parent-biased expression direction (cutoff of p1 > 0.70, p2 < 0.30 or p1 < 0.30, p2 > 0.70). The previous study reported that imprinted genes likely exhibited interindividual variability in their imprinting status, being imprinted in some individuals while exhibiting near-biallelic expression in others46. Thus, we also called the potential imprinted candidates using the criteria cutoff of average p1 > 0.70, p2 < 0.30 average p1 < 0.30, p2 > 0.70, but conservative cutoff for each replicate (p1 > 0.60, p2 < 0.40 or p1 < 0.60, p2 > 0.40). By the same token, for the candidate imprinted genes that met stringent criteria defined above, we also assessed their parental-biased expression in the cell types where these genes failed to meet stringent criteria. We firstly calculated the allelic ratio for each replicate and then computed the mean of allelic ratios across replicates to derive the average allelic ratio. Finally, the bias in parental expression levels of these candidate imprinted genes in the cell types was assessed using the standard criteria (the average allelic ratio >0.70 for MAT or <0.30 for PAT).

Trajectory analysis and differential gene expression analysis

We used the PAGA in scanpy toolkit (version 1.8.1)61 to infer the trajectory of trophoblast differentiation of snRNA-seq data from the E40 and E70 placental tissues. Firstly, the Seurat objects were converted to AnnData using the Convert function of the SeuratDisk package (version 0.0.0.90). Then, the data was processed following the steps suggested by scanpy, including the computation of neighbourhood graph among data points and the generation of a topologically faithful embedding. Finally, the PAGA was performed with the default parameters.

To identify DEGs within the cell type pairs between the E40 and E70 stages, we conducted the “FindMarkers” function in Seurat for the differential gene expression analysis. The Wilcoxon rank-sum test implemented in the “FindMarkers” function was used to calculate the Log2FC and P value adjustment. Genes with |log2FC | >1.0, adjusted P value < 0.05 were considered as DEGs (Supplementary Data 7).

Gene set scoring analysis

The genes of DNA methylation and demethylation gene sets are listed in the Supplementary Data 3. The signature scores of DNA methylation and demethylation gene sets as well as the DPPA5-AMPH gene set were computed using the “AddModuleScore” function of the Seurat R package. The differences in signature scores between the E40 and E70 stages were evaluated using the two-side Wilcoxon rank sum test. Additionally, the Pearson correlation analysis between these gene sets based on the signature scores was performed using the cor.test function implemented in R software (version 4.2.3).

Cross-species ligand-receptor interaction comparison

We downloaded the scRNA-seq dataset of mouse placenta from GSE156125 (samples labeled E10.5 h, E11.5 h, and E12.5 h) in the Gene Expression Omnibus (GEO) database92 and the scRNA-seq dataset of human placenta from E-MTAB-6701 in the ArrayExpress database30. Firstly, we processed and performed the cell clustering analysis on the mouse and human scRNA-seq datasets as described above and assigned the cell type identities by manual annotation based on the expression of known marker genes (Supplementary Data 9 and 10).

Next, the Seurat objects of the pig, mouse, human placentas were converted into CellChat objects to perform cell-cell communication analysis using CellChat (version 1.6.1)71. The data was processed following the steps suggested by CellChat, including the identification of over-expressed signaling genes and ligand-receptor interactions associated with each cell type, computation of the communication strength between any interacting cell types, inference of the cell-cell communication at a signaling pathway level, and calculation of the aggregated cell-cell communication network. The “netVisual_aggregate” function was used to visualize the cell-cell communication network. The centrality scores were computed and visualized using the “netAnalysis_computeCentrality” and “netAnalysis_signalingRole_network” functions. The “netAnalysis_contribution” function was used to compute and visualize the contribution of each ligand-receptor pair to the overall signaling pathway. The gene expression distribution of signaling genes related to ligand-receptor pairs was plotted using a Seurat wrapper function “plotGeneExpression”.

Assessment of allelic expression of IGF2 and IGF2R in human cell types

To identify mono- or biallelic expression of IGF2 and IGF2R in the human placental Fib, BVECs, and dM2, the RNA-seq sequences based on the cell barcodes of each cell type was extracted and the matched placental and matched decidual whole-genome sequencing datasets were downloaded from E-MTAB-7304 in the ArrayExpress database30. The RNA-seq dataset of endothelial cell of umbilical vein was downloaded from the ENCODE catalog (https://www.encodeproject.org/files/ENCFF788ZKW/). The RNA-seq reads were aligned to the GRCh38 genome by STAR (version 2.7.11a). The mapped reads were then processed based on the GATK RNA-seq short variant discovery tutorial (https://gatk.broadinstitute.org/hc/en-us/articles/360035531192-RNAseq-short-variant-discovery-SNPs-Indels) and “ASEReadCounter” function of GATK to identify heterozygous SNVs and their allele frequency. Heterozygous SNVs with at least 10 reads of coverage were kept for further analyses. The significance of allele bias of a gene was then assessed via MBASED algorithm (version 1.36.0)93, which required that a gene exhibit an estimated major haplotype frequency (MAF) ≥ 0.70 and P value < 0.05 in order to be declared as exhibiting allele-specific expression.

Cell-type specificity index of gene co-expression

To further identify the cell-type-specific ligand-receptor interactions among the candidate imprinted genes by assessing the protein-protein interactions using the STRING database75, we defined a cell-type specificity index of gene co-expression. Firstly, the candidate imprinted genes in the present study and imprinted genes of mice and humans with clear and provisional evidence included in the Geneimprint database (http://www.geneimprint.com/) were mapped to STRING database separately to identify the interaction pairs. Next, for each identified interaction pair, we calculated the average expression of each gene for all cells within each cell type (cell type expression) using the Seurat wrapper function “AverageExpression”. A cell-type specificity value was computed for each gene through dividing each cell type expression value by the sum of all cell type expression values of the corresponding gene. Finally, the cell-type specificity index of gene co-expression for each interaction pair was calculated by multiplying the cell-type specificity values of respective genes. Note that the closer the value gets to 1 the more exclusive an interaction pair is expressed in a single cell type. To better understand the biological functions of the predicted imprinted gene-gene interaction pairs in the porcine placenta, we performed the gene ontology enrichment analysis using the “enrichGO” function of clusterProfiler R package (version 4.8.3) based on the dataset “org.Ss.eg.db”94.

Validation of candidate imprinted genes

To validate the candidate imprinted genes, the RT-PCR plus Sanger sequencing was performed9. The F1 placental tissues were dissected from reciprocal crosses between genetically distinct Duroc and Lulai pig breeds and used for RNA extraction using Trizol kit (B511321, Sangon Biotech). cDNA was synthesized with Evo M-MLV One Step RT-PCR kit (AG11606, Accurate Biology). RT-PCR was performed with primers described in Supplementary Data 14. The PCR products were used for Sanger sequencing and the data files were analyzed using the Chromas software (version 1.62).

Validation of predicted protein-protein interactions

To validate the predicted DLK1-COPG2, LSG1-POLR1D, and DLK1-KCNQ1 interactions, we firstly performed co-immunoprecipitation assays95. Since useful polyclonal or monoclonal antibodies with specificity to pigs were absent, the DLK1-HA, COPG2-MYC, LSG1-HA, POLR1D-FLAG, and KCNQ1-FLAG recombinant plasmids were used to performed co-immunoprecipitation assays. The recombinant plasmids were co-transformed into porcine kidney (PK15) cells (CL-0187, Procell) and incubated 48 h. Total protein was extracted with IP Lysis Buffer (87787, ThermoFisher) and whole cell lysates were centrifuged at 21,380 x g for 10 min at 4 °C. Supernatants were taken as input and incubated with IgG control antibody and HA agarose beads overnight at 4 °C. Western blotting was performed using anti-MYC (AE070, Abclonal), anti-FLAG (M185-3L, MBL), and anti-HA (Ab9110, Abcam). We noted that the protein concentration of recombinant KCNQ1-FLAG was very low in the supernatants (input group). In order to avoid false positives, we further performed yeast two-hybrid assays96,97 to validate the interaction between DLK1 and KCNQ1. The DLK1 was cloned into the pBT3-SUC vector and the KCNQ1 was cloned into the pPR3-N vector to generate the bait construct and the prey construct, respectively. The bait and prey constructs were transformed in pairs into yeast strain NMY51 (Y004, oebiotech). At the same time, the pTSU2-APP plus pNubG-Fe65 plasmids were transferred into yeast strain NMY51 as a positive control, and the pTSU2-APP plus pPR3-N plasmids and the pBT3-SUC-DLK1 plus pPR3-N plasmids were transferred into yeast strain NMY51 as negative controls. The transformants were assayed for growth on SD medium lacking leucine and tryptophan and SD medium lacking leucine, tryptophan, histidine, and adenine supplemented with X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) and 7 mM 3-AT (3-Amino-1,2,4-triazole) at 30 °C for 2-3 days.

Statistics and reproducibility

The cor.test function implemented in R software (version 4.2.3) was used to perform the Pearson correlation analysis with two-sided t-test. The “FindMarkers” function in Seurat (version 4.3.0.1) was used to identify DEGs with two-sided Wilcoxon rank sum test and P value adjustment was performed by Bonferroni correction using all genes in the dataset. For comparing signature scores of DNA methylation and demethylation gene sets between two groups, the two-sided Wilcoxon rank sum test was performed and Cohen’s d effect size was computed using the effsize R package (version 0.8.1). P values < 0.05 and |d | > 0.2 are considered significant. Exact sample sizes are indicated in the figure legends.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.