Extended Data Fig. 3: Comparison of LDGM precision matrices with Wen-Stephens shrinkage estimator.
From: Extremely sparse models of linkage disequilibrium in ancestrally diverse association studies

The comparison was performed in EUR, on chromosome 22 only (n = 20 LD blocks). To vary the amount of shrinkage, we changed the sample size parameter in the Wen-Stephens estimator (actual sample size: 1,006). a, Mean-squared error (MSE) between the Wen-Stephens estimator and the inverse of the LDGM precision matrix. The dotted line denotes the median MSE between the LD sample correlation matrix and the inverse of the LDGM precision matrix. b, MSE between the Wen-Stephens estimator and the sample correlation matrix. Values are larger than the corresponding values in panel a for sample size parameters up to 40, and smaller for sample size parameters of 201 or higher. c, Number of nonzero entries per SNP in the Wen-Stephens estimator. Correlations with absolute value less than 1 × 10⁻⁸ are set to zero (consistent with the original paper), resulting in slightly increased sparsity for small values of the sample size parameter. At larger parameter values, no SNP pairs fall below the threshold within LD blocks, but this approach can still be used to produce a sparse, banded diagonal matrix when discrete blocks are not desired. Somewhat more sparsity can be achieved by relaxing the 1 × 10⁻⁸ threshold, but not without increasing the error. In all plots, the lower whisker, lower hinge, center, upper hinge and upper whisker correspond to (lower hinge − 1.5 × interquartile range (IQR)), the 25th percentile, the median, the 75th percentile and (upper hinge + 1.5 × IQR), respectively.
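
The quantities summarized in the panels can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the MSE is assumed to be averaged over all matrix entries (the caption does not say whether the diagonal is included), and the Wen-Stephens estimator itself is not implemented, only the thresholding and summary steps applied to an already computed correlation matrix.

```python
import numpy as np

def threshold_small(R, tol=1e-8):
    """Set entries with absolute value below tol to zero (assumed reading of the caption)."""
    R = np.asarray(R, dtype=float)
    return np.where(np.abs(R) < tol, 0.0, R)

def mean_nonzeros_per_snp(R):
    """Average number of nonzero entries per row (one row per SNP)."""
    return float(np.count_nonzero(R, axis=1).mean())

def mse(A, B):
    """Mean-squared error over all entries of two equally shaped matrices (assumption)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return float(np.mean((A - B) ** 2))

def box_stats(values):
    """Five box-plot statistics as described in the caption:
    hinges at the 25th and 75th percentiles, whiskers at hinge -/+ 1.5 x IQR."""
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q1, med, q3, q3 + 1.5 * iqr

# Toy 3-SNP correlation matrix with one pair below the threshold.
R = np.array([[1.0, 0.5, 1e-9],
              [0.5, 1.0, 0.2],
              [1e-9, 0.2, 1.0]])
R_sparse = threshold_small(R)
nnz_per_snp = mean_nonzeros_per_snp(R_sparse)
```

In practice, `mse` would be evaluated once per LD block and the per-block values passed to `box_stats` (or directly to a plotting routine) for each value of the sample size parameter.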