  • Technical Report

An enhanced framework for local genetic correlation analysis

Abstract

Genetic correlation is a key parameter in the joint genetic model of complex traits, but it is usually estimated on a global genomic scale. Understanding local genetic correlations provides more detailed insight into the shared genetic architecture of complex traits. However, a state-of-the-art tool for local genetic correlation analysis, LAVA, is prone to false inference. Here we extend the high-definition likelihood (HDL) method to a local version, HDL-L, which performs genetic correlation analysis in small, approximately independent linkage disequilibrium blocks. HDL-L allows a more granular estimation of genetic variances and covariances. Simulations show that HDL-L offers more consistent heritability estimates and more efficient genetic correlation estimates compared with LAVA. HDL-L demonstrated robust performance across a wide range of simulations conducted under varying parameter settings. In the analysis of 30 phenotypes from the UK Biobank, HDL-L identified 109 significant local genetic correlations and showed a notable computational advantage. HDL-L proves to be a powerful tool for uncovering the detailed genetic landscape that underlies complex human traits, offering both accuracy and computational efficiency.

Fig. 1: Evaluation of parameter estimation based on simulations.
Fig. 2: Evaluation of statistical inference based on simulations.
Fig. 3: Comparison between HDL-L and LAVA across 30 phenotypes in the UKBB.

Data availability

The individual-level genotype and phenotype data are available by application via the UKBB at https://www.ukbiobank.ac.uk. The summary statistics of UKBB sex-specific GWAS from the Neale laboratory can be obtained from https://www.nealelab.is/uk-biobank. The data required to reproduce the figures in the HDL-L manuscript are available via Zenodo at https://doi.org/10.5281/zenodo.14825987 (ref. 18). Source data are provided with this paper.

Code availability

The HDL-L software is integrated into the HDL project repository (https://github.com/zhenin/HDL) and is available via Zenodo at https://doi.org/10.5281/zenodo.14825987 (ref. 18). The Zenodo archive also contains all simulation scripts and figure-generation code for the HDL-L manuscript, facilitating complete replication of the analyses and results. LAVA software is available via GitHub at https://github.com/josefin-werme/LAVA. COLOC and SuSiE software are available at https://chr1swallace.github.io/coloc/. LDSC software is available at https://github.com/bulik/ldsc. PLINK 2.0 (https://www.cog-genomics.org/plink/2.0) was used to extract individual-level data of imputed SNPs from the UKBB. PLINK 1.9 (https://www.cog-genomics.org/plink) and LDAK (http://dougspeed.com/ldak) were used in the calculation and simulation of LD.

References

  1. Visscher, P. M. et al. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).

  2. Uffelmann, E. & Posthuma, D. Emerging methods and resources for biological interrogation of neuropsychiatric polygenic signal. Biol. Psychiatry 89, 41–53 (2021).

  3. Yao, C. et al. Genome-wide mapping of plasma protein QTLs identifies putatively causal genes and pathways for cardiovascular disease. Nat. Commun. 9, 3268 (2018).

  4. Hirschhorn, J. N. & Daly, M. J. Genome-wide association studies for common diseases and complex traits. Nat. Rev. Genet. 6, 95–108 (2005).

  5. Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

  6. Ning, Z., Pawitan, Y. & Shen, X. High-definition likelihood inference of genetic correlations across human complex traits. Nat. Genet. 52, 859–864 (2020).

  7. Werme, J., van der Sluis, S., Posthuma, D. & de Leeuw, C. A. An integrated framework for local genetic correlation analysis. Nat. Genet. 54, 274–282 (2022).

  8. Liu, Q. et al. Genomic correlation, shared loci, and causal relationship between obesity and polycystic ovary syndrome: a large-scale genome-wide cross-trait analysis. BMC Med. 20, 66 (2022).

  9. Yu, Y. et al. Investigating the shared genetic architecture between schizophrenia and body mass index. Mol. Psychiatry 28, 2312–2319 (2023).

  10. Shi, H., Mancuso, N., Spendlove, S. & Pasaniuc, B. Local genetic correlation gives insights into the shared genetic architecture of complex traits. Am. J. Hum. Genet. 101, 737–751 (2017).

  11. van Rheenen, W., Peyrot, W. J., Schork, A. J., Lee, S. H. & Wray, N. R. Genetic correlations of polygenic disease traits: from theory to practice. Nat. Rev. Genet. 20, 567–581 (2019).

  12. Zheng, J. et al. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics 33, 272–279 (2017).

  13. Pawitan, Y. In All Likelihood: Statistical Modelling and Inference Using Likelihood (Oxford Univ. Press, 2001).

  14. Canela-Xandri, O., Rawlik, K. & Tenesa, A. An atlas of genetic associations in UK Biobank. Nat. Genet. 50, 1593–1599 (2018).

  15. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).

  16. Fisher, R. A. Statistical Methods and Scientific Inference (Oliver & Boyd, 1959).

  17. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  18. Li, Y. LD reference and R package repo for HDL-L. Zenodo https://doi.org/10.5281/zenodo.14825987 (2025).

Acknowledgements

X.S. received a National Key Research and Development Program grant (2022YFF1202105), a National Natural Science Foundation of China (NSFC) grant (12171495) and a Swedish Research Council (Vetenskapsrådet) grant (2022-01309). Y.L. is grateful for the financial support from the China Scholarship Council (CSC).

Author information

Contributions

X.S. and Y.P. initiated and supervised the study. Y.L. performed the analysis. All the authors contributed to method development and paper writing.

Corresponding author

Correspondence to Xia Shen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks David Balding and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Evaluation of parameter estimation under different heritability settings.

We compared HDL-L and LAVA for estimating heritability, genetic covariance, and genetic correlation across 50 simulation replicates. Six pairs of true heritability values were considered for the two phenotypes: \({h}_{1}^{2}=0.2,\,{h}_{2}^{2}=0.4\); \({h}_{1}^{2}=0.2,\,{h}_{2}^{2}=0.8\); \({h}_{1}^{2}=0.6,\,{h}_{2}^{2}=0.8\); \({h}_{1}^{2}=0.3,\,{h}_{2}^{2}=0.3\); \({h}_{1}^{2}=0.5,\,{h}_{2}^{2}=0.5\); \({h}_{1}^{2}=0.7,\,{h}_{2}^{2}=0.7\). a-d, Chromosome 22 results. The boxplots depict estimation bias (estimated minus true values), and the scatterplots with LOESS-fitted curves in the right panel show HDL-L’s improved accuracy. For each box, the horizontal line represents the median, the central box indicates the interquartile range (IQR), and whiskers extend up to 1.5 times the IQR.
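The boxplot conventions described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' plotting code: the function name `box_summary`, the quartile method and the example estimates are assumptions made for this sketch only.

```python
from statistics import median, quantiles

def box_summary(estimates, true_value):
    """Summarise estimation bias using standard boxplot conventions."""
    # Bias is defined as in the caption: estimated minus true value.
    bias = [est - true_value for est in estimates]
    q1, _, q3 = quantiles(bias, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers reach the most extreme points lying within 1.5 * IQR of the box.
    lower = min(b for b in bias if b >= q1 - 1.5 * iqr)
    upper = max(b for b in bias if b <= q3 + 1.5 * iqr)
    return {"median": median(bias), "q1": q1, "q3": q3,
            "whisker_low": lower, "whisker_high": upper}

# Hypothetical heritability estimates from six replicates; true value 0.2.
summary = box_summary([0.18, 0.19, 0.20, 0.21, 0.22, 0.35], true_value=0.2)
print(summary)
```

In this made-up example the estimate 0.35 falls beyond the upper whisker and would be drawn as an outlier point.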

Extended Data Fig. 2 Genome-wide evaluation of parameter estimation based on simulations.

HDL-L and LAVA were evaluated for estimating heritability, genetic covariance, and genetic correlation over 50 simulation replicates. Line plots show median mean squared error across 2,468 regions. The heatmap indicates the heritability setting for each phenotype, where \({h}_{1}^{2}\) and \({h}_{2}^{2}\) denote the true genome-wide heritability for trait 1 and trait 2, respectively. 10% of SNPs were randomly selected as causal SNPs.

Extended Data Fig. 3 Comparison of P-value distribution between HDL-L and LAVA under the null hypothesis.

We considered six different heritability combinations of the two phenotypes: \({h}_{1}^{2}=0.2,\,{h}_{2}^{2}=0.4\); \({h}_{1}^{2}=0.2,\,{h}_{2}^{2}=0.8\); \({h}_{1}^{2}=0.6,\,{h}_{2}^{2}=0.8\); \({h}_{1}^{2}=0.3,\,{h}_{2}^{2}=0.3\); \({h}_{1}^{2}=0.5,\,{h}_{2}^{2}=0.5\); \({h}_{1}^{2}=0.7,\,{h}_{2}^{2}=0.7\). The true genetic correlation (\({r}_{G}\)) was set at 0 (null hypothesis). 10% of SNPs were randomly selected as causal SNPs.

Extended Data Fig. 4 Comparison of ROC curve between HDL-L and LAVA.

We considered six different heritability combinations of the two phenotypes: \({h}_{1}^{2}=0.2,\,{h}_{2}^{2}=0.4\); \({h}_{1}^{2}=0.2,\,{h}_{2}^{2}=0.8\); \({h}_{1}^{2}=0.6,\,{h}_{2}^{2}=0.8\); \({h}_{1}^{2}=0.3,\,{h}_{2}^{2}=0.3\); \({h}_{1}^{2}=0.5,\,{h}_{2}^{2}=0.5\); \({h}_{1}^{2}=0.7,\,{h}_{2}^{2}=0.7\), with nine settings of true genetic correlation (\({r}_{G}\) from −1 to 1). 10% of SNPs were randomly selected as causal SNPs.
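For readers unfamiliar with how an ROC curve is traced from test results, the following sketch shows the generic construction: loci are called significant at successive P-value thresholds, and the false positive and true positive rates are recorded at each one. The labelling of a locus as a true signal (here, simulated with a non-zero local genetic correlation), the function names and the example values are assumptions of this sketch, not details taken from the paper.

```python
def roc_points(p_values, is_true_signal):
    """Return (FPR, TPR) pairs obtained by thresholding P values."""
    n_pos = sum(is_true_signal)
    n_neg = len(is_true_signal) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(p_values)):
        # A locus is called significant when its P value is at or below the threshold.
        calls = [p <= threshold for p in p_values]
        tp = sum(c and t for c, t in zip(calls, is_true_signal))
        fp = sum(c and not t for c, t in zip(calls, is_true_signal))
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical P values for six loci and whether each was simulated with r_G != 0.
p_vals = [0.001, 0.02, 0.04, 0.30, 0.60, 0.90]
truth = [True, True, False, True, False, False]
points = roc_points(p_vals, truth)
print(points, auc(points))
```

A curve that rises faster towards the top-left corner, and hence has a larger area under it, indicates better separation of true signals from null loci.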

Extended Data Fig. 5 Influence of numbers of leading eigenvalues on bias of genetic covariance.

We randomly selected 16 loci. After eigen-decomposition of the LD matrix, the leading eigenvalues explaining different proportions of the variance of the LD matrix, together with their corresponding eigenvectors, were used to approximate the LD matrix. Each boxplot shows the distribution of genetic covariance bias across simulations, with the box spanning the first quartile to the third quartile (Q1–Q3) and the horizontal line indicating the median.
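One common way to choose how many leading eigenvalues to retain is to take the smallest number whose cumulative sum reaches a target proportion of the total. The sketch below illustrates that rule on made-up eigenvalues; whether HDL-L applies exactly this cut-off rule is an assumption here, and the function name `n_leading` is hypothetical.

```python
def n_leading(eigenvalues, proportion):
    """Smallest k such that the top-k eigenvalues explain >= `proportion` of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        # Small tolerance guards against floating-point rounding at the boundary.
        if running >= proportion * total - 1e-12:
            return k
    return len(ordered)

# Hypothetical eigenvalues of a small LD matrix (they sum to 8).
eigs = [4.0, 2.0, 1.0, 0.5, 0.25, 0.25]
for target in (0.5, 0.75, 0.9, 0.99):
    print(target, n_leading(eigs, target))
```

Retaining fewer eigenvalues gives a lower-rank approximation of the LD matrix; the figure examines how that choice affects the bias of the genetic covariance estimate.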

Extended Data Fig. 6 Performance evaluation of HDL-L and LAVA in simulated genetic analysis with fully overlapping samples.

The figure compares HDL-L and LAVA in estimating heritability and genetic covariance. The genome-wide true genetic correlation was set to −0.5, 0 or 0.5, with the genome-wide heritability of trait 1 (\({h}_{1}^{2}\)) set at 0.2, the genome-wide heritability of trait 2 (\({h}_{2}^{2}\)) set at 0.4, and 10% of SNPs randomly selected as causal variants. The analysis was conducted on 2,468 genomic loci and 50 simulation replicates. For the boxplots, the central line in each box marks the median value, the lower and upper edges represent the first and third quartiles (Q1 and Q3), and the whiskers extend up to 1.5 \(\times\) IQR (interquartile range) beyond Q1 or Q3. a, Boxplots of false positive rate. b, P-value distribution under the null hypothesis. The P-value was derived by comparing the two-sided likelihood ratio test (LRT) statistic to a chi-squared distribution with 1 degree of freedom. c, Boxplots of true positive rate. d, Boxplots of mean squared error for heritabilities and genetic covariance.
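The conversion from an LRT statistic to a P value under a chi-squared distribution with 1 degree of freedom can be written with the Python standard library alone, because a chi-squared variable with 1 degree of freedom is the square of a standard normal. The function name and the log-likelihood values below are illustrative assumptions, not output from HDL-L.

```python
import math

def lrt_p_value(loglik_alt, loglik_null):
    """P value of a likelihood ratio test against chi-squared with 1 d.f."""
    # LRT statistic: twice the gain in log-likelihood from freeing one parameter.
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    # For 1 d.f., P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical log-likelihoods: unconstrained fit vs. fit with r_G fixed at 0.
p = lrt_p_value(loglik_alt=-100.0, loglik_null=-101.920729410347)
print(round(p, 4))
```

With these example values the statistic is about 3.84, the familiar 5% critical value for 1 degree of freedom.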

Supplementary information

Supplementary Information

Supplementary Note and Fig. 1.

Reporting Summary

Peer Review File

Supplementary Tables

Supplementary Tables 1 and 2.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Y., Pawitan, Y. & Shen, X. An enhanced framework for local genetic correlation analysis. Nat Genet 57, 1053–1058 (2025). https://doi.org/10.1038/s41588-025-02123-3

  • DOI: https://doi.org/10.1038/s41588-025-02123-3
