Reproductive and cognitive phenotypes in carriers of recessive pathogenic variants

Abstract

The genetic landscape of human Mendelian diseases is shaped by mutation and selection. Although selection on heterozygotes is well-established in autosomal-dominant disorders, convincing evidence for selection in carriers of pathogenic variants associated with recessive conditions is limited. Here, we studied heterozygous pathogenic variants in 1,929 genes, which cause recessive diseases when bi-allelic, in n = 378,751 unrelated European individuals from the UK Biobank. We find evidence suggesting fitness effects in heterozygous carriers for recessive genes, especially for variants in constrained genes across a broad range of diseases. Our data suggest reproductive effects at the population level, and hence natural selection, for autosomal-recessive disease variants. Further, variants in genes that underlie intellectual disability are associated with lower educational attainment in carriers, and we observe an altered genetic landscape, characterized by a threefold reduction in the calculated frequency of bi-allelic intellectual disability in the population relative to other recessive disorders.

Fig. 1: Association of genetic burden for recessive disease with childlessness.
Fig. 2: Association of recessive genetic burden with cognitive and disease-related phenotypes.
Fig. 3: Association of childlessness, educational attainment, log-transformed number of ICD-10 diagnoses for different disorder groups.
Fig. 4: Comparison of the associations for rare synonymous variants and PLPs in recessive ID genes and all the other recessive genes.
Fig. 5: Sex-specific associations of childlessness, educational attainment, log-transformed number of ICD-10 diagnoses.
Fig. 6: Consanguinity effects and the genetic architecture for various disorder groups.

Data availability

The raw data used in this study are available as part of the UK Biobank dataset.

Code availability

The code used to generate the data for this project is available via GitHub at https://github.com/Genome-Bioinformatics-RadboudUMC/ukbb_recessive_public.

References

  1. Acuna-Hidalgo, R., Veltman, J. A. & Hoischen, A. New insights into the generation and role of de novo mutations in health and disease. Genome Biol. 17, 241 (2016).

  2. Goldmann, J. M., Veltman, J. A. & Gilissen, C. De novo mutations reflect development and aging of the human germline. Trends Genet. 35, 828–839 (2019).

  3. Shadrina, M. et al. Automated identification of germline de novo mutations in family trios: a consensus-based informatic approach. Preprint at bioRxiv https://www.biorxiv.org/content/10.1101/2024.03.08.584100v1 (2024).

  4. Kaplanis, J. et al. Genetic and chemotherapeutic influences on germline hypermutation. Nature 605, 503–508 (2022).

  5. Goldmann, J. M. et al. Differences in the number of de novo mutations between individuals are due to small family-specific effects and stochasticity. Genome Res. 31, 1513–1518 (2021).

  6. Gilissen, C. et al. Genome sequencing identifies major causes of severe intellectual disability. Nature 511, 344–347 (2014).

  7. Kaplanis, J. et al. Evidence for 28 genetic disorders discovered by combining healthcare and research data. Nature 586, 757–762 (2020).

  8. Girirajan, S. & Eichler, E. E. Phenotypic variability and genetic susceptibility to genomic disorders. Hum. Mol. Genet. 19, R176–R187 (2010).

  9. Taylor, S. M., Parobek, C. M. & Fairhurst, R. M. Haemoglobinopathies and the clinical epidemiology of malaria: a systematic review and meta-analysis. Lancet Infect. Dis. 12, 457–468 (2012).

  10. Weatherall, D. J. Genetic variation and susceptibility to infection: the red cell and malaria. Br. J. Haematol. 141, 276–286 (2008).

  11. Aidoo, M. et al. Protective effects of the sickle cell gene against malaria morbidity and mortality. Lancet 359, 1311–1312 (2002).

  12. Hogenauer, C. et al. Active intestinal chloride secretion in human carriers of cystic fibrosis mutations: an evaluation of the hypothesis that heterozygotes have subnormal active intestinal chloride secretion. Am. J. Hum. Genet. 67, 1422–1427 (2000).

  13. Oussalah, A. et al. Population and evolutionary genetics of the PAH locus to uncover overdominance and adaptive mechanisms in phenylketonuria: results from a multiethnic study. EBioMedicine 51, 102623 (2020).

  14. Barton, A. R., Hujoel, M. L. A., Mukamel, R. E., Sherman, M. A. & Loh, P. R. A spectrum of recessiveness among Mendelian disease variants in UK Biobank. Am. J. Hum. Genet. 109, 1298–1307 (2022).

  15. Heyne, H. O. et al. Mono- and biallelic variant effects on disease at biobank scale. Nature 613, 519–525 (2023).

  16. Agarwal, I., Fuller, Z. L., Myers, S. R. & Przeworski, M. Relating pathogenic loss-of-function mutations in humans to their evolutionary fitness costs. eLife 12, e83172 (2023).

  17. Amorim, C. E. G. et al. The population genetics of human disease: the case of recessive, lethal mutations. PLoS Genet. 13, e1006915 (2017).

  18. Fridman, H. et al. The landscape of autosomal-recessive pathogenic variants in European populations reveals phenotype-specific effects. Am. J. Hum. Genet. 108, 608–619 (2021).

  19. Liu, A. et al. Evidence from Finland and Sweden on the relationship between early-life diseases and lifetime childlessness in men and women. Nat. Hum. Behav. 8, 276–287 (2024).

  20. Mathieson, I. et al. Genome-wide analysis identifies genetic effects on reproductive success and ongoing natural selection at the FADS locus. Nat. Hum. Behav. 7, 790–801 (2023).

  21. Gardner, E. J. et al. Reduced reproductive success is associated with selective constraint on human genes. Nature 603, 858–863 (2022).

  22. Koko, M. et al. Exome sequencing of UK birth cohorts. Wellcome Open Res. 9, 390 (2024).

  23. Chundru, V. K. et al. Federated analysis of autosomal recessive coding variants in 29,745 developmental disorder patients from diverse populations. Nat. Genet. 56, 2046–2053 (2024).

  24. de Ligt, J. et al. Diagnostic exome sequencing in persons with severe intellectual disability. N. Engl. J. Med. 367, 1921–1929 (2012).

  25. Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).

  26. Seplyarskiy, V. et al. A mutation rate model at the basepair resolution identifies the mutagenic effect of polymerase III transcription. Nat. Genet. 55, 2235–2242 (2023).

  27. Weghorn, D. et al. Applicability of the mutation-selection balance model to population genetics of heterozygous protein-truncating variants in humans. Mol. Biol. Evol. 36, 1701–1710 (2019).

  28. Cassa, C. A. et al. Estimating the selective effects of heterozygous protein-truncating variants from human exome data. Nat. Genet. 49, 806–810 (2017).

  29. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).

  30. Demange, P. A. et al. Investigating the genetic architecture of noncognitive skills using GWAS-by-subtraction. Nat. Genet. 53, 35–44 (2021).

  31. Barban, N. et al. Genome-wide analysis identifies 12 loci influencing human reproductive behavior. Nat. Genet. 48, 1462–1472 (2016).

  32. Verweij, R. M. et al. Sexual dimorphism in the genetic influence on human childlessness. Eur. J. Hum. Genet. 25, 1067–1074 (2017).

  33. Nisen, J., Martikainen, P., Kaprio, J. & Silventoinen, K. Educational differences in completed fertility: a behavioral genetic study of Finnish male and female twins. Demography 50, 1399–1420 (2013).

  34. Zschocke, J., Byers, P. H. & Wilkie, A. O. M. Gregor Mendel and the concepts of dominance and recessiveness. Nat. Rev. Genet. 23, 387–388 (2022).

  35. Goker-Alpan, O. et al. Parkinsonism among Gaucher disease carriers. J. Med. Genet. 41, 937–940 (2004).

  36. Chen, C. Y. et al. The impact of rare protein coding genetic variation on adult cognitive function. Nat. Genet. 55, 927–938 (2023).

  37. Rolland, T. et al. Phenotypic effects of genetic variants associated with autism. Nat. Med. 29, 1671 (2023).

  38. Verweij, R. M. et al. Using polygenic scores in social science research: unraveling childlessness. Front. Sociol. 4, 74 (2019).

  39. Wright, C. F. et al. Assessing the pathogenicity, penetrance, and expressivity of putative disease-causing variants in a population setting. Am. J. Hum. Genet. 104, 275–286 (2019).

  40. Schuurs-Hoeijmakers, J. H. et al. Identification of pathogenic gene variants in small families with intellectually disabled siblings by exome sequencing. J. Med. Genet. 50, 802–811 (2013).

  41. Yuen, R. K. et al. Whole-genome sequencing of quartet families with autism spectrum disorder. Nat. Med. 21, 185–191 (2015).

  42. Kirk, E. P. et al. Gene selection for the Australian reproductive genetic carrier screening project (‘Mackenzie’s Mission’). Eur. J. Hum. Genet. 29, 79–87 (2021).

  43. McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).

  44. Li, Q. & Wang, K. InterVar: clinical interpretation of genetic variants by the 2015 ACMG-AMP guidelines. Am. J. Hum. Genet. 100, 267–280 (2017).

  45. van der Velde, K. J. et al. MOLGENIS research: advanced bioinformatics data software for non-bioinformaticians. Bioinformatics 35, 1076–1078 (2019).

  46. Carter, A. R. et al. Educational attainment as a modifier for the effect of polygenic scores for cardiovascular risk factors: cross-sectional and prospective analysis of UK Biobank. Int. J. Epidemiol. 51, 885–897 (2022).

Acknowledgements

We thank E. Gardner, J. Hampstead, H. Martin and M. Hurles for fruitful discussions and S. Carmi for advising about population genetics matters. This research has been conducted using the UK Biobank Resource under application number 66493. This project was financially supported by a VIDI grant from the Dutch Research Council (grant no. 917-17-353 to C.G.) and an AI for Health PhD grant from Radboudumc. This project was supported by a gift from the Koum Foundation (to E.L.-L.). E.L.-L. is Robin Chemers Neustein Director of Medical Genetics.

Author information

Contributions

H.G.B., C.G. and E.L.-L. supervised the study; H.F. and G.K. developed a data collection pipeline and statistical methods and analysed data. H.F., G.K., H.G.B., C.G. and E.L.-L. wrote the manuscript.

Corresponding authors

Correspondence to Christian Gilissen or Han G. Brunner.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Human Behaviour thanks Terry Vrijenhoek, Lily Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Association of genetic burden for recessive disease with childlessness for different constraint scores.

Associations of genetic burden with childlessness and hair color (as a control phenotype) for heterozygous PLPs in recessive genes (purple), PLPs in recessive genes excluding carriers of LoF in non-recessive genes (orange), and singleton LoFs in highly constrained (s-het > 0.15) non-recessive genes (green). Colored lines indicate odds ratios for the phenotypes with 99% confidence intervals. The dashed gray line indicates OR = 1, which serves as the reference point. Statistically significant deviations from OR = 1 are tested using a two-sided Wald test. The corresponding p-values are displayed in the figure and adjusted for multiple comparisons using the Bonferroni correction. Statistically significant associations are marked with an asterisk. The results are for three different s-het scores: Weghorn (a), Cassa (b) and pLI (c).
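The statistics named in this caption (odds ratio, 99% confidence interval, two-sided Wald test, Bonferroni adjustment) can be illustrated with a minimal sketch. It assumes a log-odds coefficient and its standard error are already available from a fitted logistic regression; the function name `wald_summary` and the example numbers are hypothetical and are not taken from the study's code.

```python
from math import exp
from statistics import NormalDist


def wald_summary(beta, se, n_tests=1, ci_level=0.99):
    """Summarise one logistic-regression coefficient.

    beta    : estimated log odds ratio (hypothetical input)
    se      : standard error of beta
    n_tests : number of comparisons used for the Bonferroni adjustment
    """
    norm = NormalDist()  # standard normal distribution
    z = beta / se  # Wald statistic
    p_raw = 2.0 * (1.0 - norm.cdf(abs(z)))  # two-sided p-value
    p_adj = min(1.0, p_raw * n_tests)  # Bonferroni correction, capped at 1
    z_crit = norm.inv_cdf(0.5 + ci_level / 2.0)  # about 2.576 for a 99% CI
    return {
        "odds_ratio": exp(beta),
        "ci_low": exp(beta - z_crit * se),
        "ci_high": exp(beta + z_crit * se),
        "p_raw": p_raw,
        "p_bonferroni": p_adj,
    }


if __name__ == "__main__":
    # Illustrative values only: beta = 0.3, se = 0.1, four comparisons.
    print(wald_summary(0.3, 0.1, n_tests=4))
```

An association would be marked significant here when `p_bonferroni` falls below the chosen threshold, which corresponds to the 99% interval excluding OR = 1 only when a single test is considered.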

Extended Data Fig. 2 Simulations of the effect of sample size on detection of the association with childlessness.

Simulated estimates of the odds ratios (a) and corresponding p-values (b, c), depicted with a spread (95% percentile interval for n = 20 simulations per cohort size), for the effect of cohort size on the association of childlessness with the genetic burden of PLPs. Shown are simulation results for heterozygous carriers of PLPs in recessive genes (purple) and for carriers of singleton LoFs in non-recessive highly constrained genes (green). The dotted gray line marks the significance level of P = 0.05.
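How cohort size affects detection of a small effect can be sketched with a toy simulation. This is not the authors' procedure: it assumes a binary carrier status, a fixed true odds ratio and a simple 2x2-table Wald test, and every parameter value (`carrier_freq`, `base_rate`, `true_or`) is a hypothetical placeholder. It reports a 95% percentile interval over repeated simulations per cohort size, as the caption describes.

```python
import math
import random
from statistics import NormalDist


def simulate_cohort(n, carrier_freq, base_rate, true_or, rng):
    """Simulate one cohort; return (estimated OR, two-sided Wald p-value)."""
    base_odds = base_rate / (1.0 - base_rate)
    carrier_odds = base_odds * true_or
    carrier_rate = carrier_odds / (1.0 + carrier_odds)
    # 2x2 table keyed by (carrier, childless). Cells start at 0.5
    # (Haldane-Anscombe correction) so the log OR is always defined.
    counts = {(True, True): 0.5, (True, False): 0.5,
              (False, True): 0.5, (False, False): 0.5}
    for _ in range(n):
        carrier = rng.random() < carrier_freq
        childless = rng.random() < (carrier_rate if carrier else base_rate)
        counts[(carrier, childless)] += 1
    log_or = math.log(counts[(True, True)] * counts[(False, False)]
                      / (counts[(True, False)] * counts[(False, True)]))
    se = math.sqrt(sum(1.0 / c for c in counts.values()))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(log_or / se)))
    return math.exp(log_or), p_value


def percentile_interval(values, level=0.95):
    """Return the central percentile interval of a list of values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    low = ordered[math.floor((1.0 - level) / 2.0 * last)]
    high = ordered[math.ceil((1.0 + level) / 2.0 * last)]
    return low, high


def sweep_cohort_sizes(sizes, n_sims=20, carrier_freq=0.1,
                       base_rate=0.2, true_or=1.1, seed=0):
    """Repeat the simulation n_sims times for each cohort size."""
    rng = random.Random(seed)
    summary = {}
    for n in sizes:
        runs = [simulate_cohort(n, carrier_freq, base_rate, true_or, rng)
                for _ in range(n_sims)]
        summary[n] = {
            "or_interval": percentile_interval([r[0] for r in runs]),
            "p_interval": percentile_interval([r[1] for r in runs]),
        }
    return summary


if __name__ == "__main__":
    for size, stats in sweep_cohort_sizes([1000, 5000, 20000]).items():
        print(size, stats)
```

With a true odds ratio this close to 1, the spread of estimated odds ratios should narrow as the cohort grows, which is the qualitative behaviour the figure is meant to convey.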

Extended Data Fig. 3 Association of genetic burden for recessive disease with educational attainment, fluid intelligence score, and log-transformed number of ICD-10 diagnoses for different constraint scores.

Associations of genetic burden with educational attainment (measured in years of education), fluid intelligence score and log-transformed number of ICD-10 diagnoses for PLPs in all recessive (purple) genes and singleton LoFs in non-recessive highly constrained (green) genes. Colored lines indicate effect sizes (ES; estimated regression coefficients) with 99% confidence intervals. The dashed gray line indicates ES = 0, which serves as the reference point. Statistically significant deviations from ES = 0 are tested using a two-sided Wald test. The corresponding p-values are displayed in the figure and adjusted for multiple comparisons using the Bonferroni correction. Statistically significant associations are marked with an asterisk. The results are for three different s-het scores: Weghorn (a), Cassa (b) and pLI (c).

Extended Data Fig. 4 Association of genetic burden with various phenotypes for PLPs in recessive ID genes and all other recessive genes, using different constraint scores.

Comparison between PLPs in recessive ID genes (red) and all other recessive genes (blue) for the effects of genetic burden on childlessness, educational attainment (measured in years of education), log-transformed number of ICD-10 diagnoses, and hair color (as a control phenotype). The results are for three different s-het scores: Weghorn (a), Cassa (b), and pLI (c). Colored lines indicate odds ratios (OR; for childlessness and hair color) or effect sizes (ES; estimated regression coefficients) with 99% confidence intervals. The dashed gray line indicates OR = 1 (for childlessness and hair color) or ES = 0, which serves as the reference point. Statistically significant deviations from these reference points are tested using a two-sided Wald test. The corresponding p-values are displayed in the figure and adjusted for multiple comparisons using the Bonferroni correction. Statistically significant associations are marked with an asterisk.

Extended Data Fig. 5 Consanguinity ratio scores (CR) for different disorder groups in three European populations.

Consanguinity ratio scores (CRs) for 13 disorder groups calculated for 3 European populations: UK Biobank (blue), Dutch (magenta) and Estonian (orange) cohorts. Scores for the Dutch and Estonian cohorts are taken from Fridman et al. (ref. 18). Scores marked with an asterisk are significantly higher than those seen in a random set of recessive genes with the same coding length (Methods). We note that CR scores for some of the other disorder groups (Skeletal, Neuromuscular, Hematologic) are similar to or higher than the CR score obtained for ID genes, yet these did not reach significance.
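The caption does not define the CR score, so the following is only a sketch under a stated assumption: that a consanguinity ratio compares the expected frequency of bi-allelic disease in offspring of first cousins (inbreeding coefficient F = 1/16) with the frequency expected under random mating, using the standard population-genetics expression q² + Fq(1 − q) per gene. The function names and allele frequencies are hypothetical.

```python
def affected_frequency(gene_allele_freqs, inbreeding_f=0.0):
    """Expected frequency of bi-allelic genotypes summed over genes.

    gene_allele_freqs : cumulative pathogenic allele frequency q per gene
    inbreeding_f      : inbreeding coefficient F (0 for random mating)
    """
    total = 0.0
    for q in gene_allele_freqs:
        # Random-mating term q^2 plus the excess homozygosity from inbreeding.
        total += q * q + inbreeding_f * q * (1.0 - q)
    return total


def consanguinity_ratio(gene_allele_freqs, inbreeding_f=1.0 / 16.0):
    """Ratio of expected disease frequency with and without consanguinity."""
    related = affected_frequency(gene_allele_freqs, inbreeding_f)
    unrelated = affected_frequency(gene_allele_freqs, 0.0)
    return related / unrelated


if __name__ == "__main__":
    # Hypothetical per-gene cumulative allele frequencies.
    common = [0.01, 0.008]
    rare = [0.001, 0.0005]
    print(consanguinity_ratio(common))
    print(consanguinity_ratio(rare))
```

Under this assumed definition, groups of genes with rarer pathogenic alleles yield a higher ratio, because the Fq term dominates q² when q is small.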

Extended Data Fig. 6 Correlation of allele frequencies in disease categories between a Dutch cohort and the UK Biobank.

Correlation of average PLP allele frequencies (AF) per disorder group between the UK Biobank and Dutch cohorts with 95% confidence interval for the regression estimate (derived from n = 1,000 bootstrap samples). The X-axis shows the AF in the UK Biobank cohort, Y-axis shows the AF in the Dutch cohort. Pearson correlation ρ = 0.791, 95%CI = [0.425, 0.935], two-sided non-adjusted P = 0.001. AF for the Dutch cohort are taken from Fridman et al.
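A correlation with a bootstrap interval for the regression estimate, as reported in this caption, can be sketched as follows. The sketch assumes paired per-group allele frequencies are already available and uses a simple percentile bootstrap over resampled pairs; whether the authors resampled in this way is not stated, and the input values are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ols_slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def bootstrap_slope_ci(xs, ys, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the regression slope."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample pairs
        bx = [xs[i] for i in idx]
        if max(bx) == min(bx):
            continue  # degenerate resample: slope is undefined
        by = [ys[i] for i in idx]
        slopes.append(ols_slope(bx, by))
    slopes.sort()
    low = slopes[math.floor((1.0 - level) / 2.0 * (n_boot - 1))]
    high = slopes[math.ceil((1.0 + level) / 2.0 * (n_boot - 1))]
    return low, high


if __name__ == "__main__":
    # Hypothetical average allele frequencies for a few disorder groups.
    cohort_a = [0.0010, 0.0014, 0.0021, 0.0008, 0.0030, 0.0017]
    cohort_b = [0.0012, 0.0013, 0.0025, 0.0006, 0.0027, 0.0020]
    print(pearson_r(cohort_a, cohort_b))
    print(bootstrap_slope_ci(cohort_a, cohort_b))
```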

Supplementary information

Supplementary Information

Supplementary Figs. 1–9 and Tables 1–16.

Reporting Summary

Peer Review File

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fridman, H., Khazeeva, G., Levy-Lahad, E. et al. Reproductive and cognitive phenotypes in carriers of recessive pathogenic variants. Nat Hum Behav (2025). https://doi.org/10.1038/s41562-025-02204-7

  • DOI: https://doi.org/10.1038/s41562-025-02204-7
