Abstract
Understanding gene–environment interaction (GxE) is crucial for deciphering the genetic architecture of human complex traits. However, current statistical methods for GxE inference face challenges in both scalability and interpretability. Here we introduce PIGEON—a unified statistical framework for quantifying polygenic GxE using a variance component analytical approach. Based on this framework, we outline the main objectives in GxE studies and introduce an estimation procedure that requires only summary statistics data as input. We demonstrate the effectiveness of PIGEON through theoretical and empirical analyses, including a quasi-experimental gene-by-education study of health outcomes and gene-by-sex interaction for 530 traits using UK Biobank. We also identify genetic interactors that explain the treatment effect heterogeneity in a clinical trial on smoking cessation. PIGEON suggests a path towards polygenic, summary statistics-based inference in future GxE studies.
Data availability
This study made use of publicly available datasets. This research has been conducted using the UK Biobank Resource under application number 42148. Data from the UK Biobank are available by application to all bona fide researchers in the public interest at https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access. Data from the Lung Health Study are available by application at https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000291.v2.p1. The SNPxE summary statistics used in the Article are available at http://qlu-lab.org/data.html.
Code availability
The PIGEON software package is publicly available via GitHub at https://github.com/qlu-lab/PIGEON.
Acknowledgements
We acknowledge research support from the National Institutes of Health (NIH) grant U01 HG012039, the National Institute on Aging (NIA) (R00 AG056599) and the University of Wisconsin-Madison Office of the Chancellor and the Vice Chancellor for Research and Graduate Education with funding from the Wisconsin Alumni Research Foundation (WARF). We also acknowledge use of the facilities of the Center for Demography of Health and Aging at the University of Wisconsin-Madison, funded by NIA Center Grant P30 AG017266. We thank members of the Social Genomics Working Group at the University of Wisconsin for helpful comments. This research has been conducted using the UK Biobank Resource under application 42148.
Author information
Contributions
J.M. and Q.L. conceived and designed the study. J.M. developed the statistical framework, performed the simulations and data analysis, and implemented the software. G.S. assisted in GxSex analysis in UK Biobank. Yixuan Wu assisted in implementing the software. J.H. assisted in developing the statistical framework. Yuchang Wu assisted in UKB data preparation. S.B. assisted in Lung Health Study data preparation. J.S.A., K.S., L.L.S. and J.M.F. advised on result interpretation. Q.L. advised on statistical and genetic issues. J.M. and Q.L. wrote the paper. All authors contributed to paper editing and approved the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Human Behaviour thanks Daniel Benjamin, Christopher Rayner and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Note and Figs. 1–10.
Supplementary Tables
Supplementary Tables 1–5.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Miao, J., Song, G., Wu, Y. et al. PIGEON: a statistical framework for estimating gene–environment interaction for polygenic traits. Nat Hum Behav (2025). https://doi.org/10.1038/s41562-025-02202-9
DOI: https://doi.org/10.1038/s41562-025-02202-9