Extended Data Fig. 9: Statistical power for ACAT-V, fastGWA-BB, and REGENIE-Burden under different simulation scenarios.
From: A generalized linear mixed model association tool for biobank-scale data

Three gene-based test methods are compared in this analysis: ACAT-V (implemented in GCTA), fastGWA-BB, and REGENIE-Burden. The y-axis represents the power, defined as the proportion of the 100 simulated causal genes on chromosome 1 with P values less than the significance threshold after Bonferroni correction (that is, \(0.05/1224 = 4.1 \times 10^{-5}\), where 1,224 is the number of genes used in the simulation). “Prev” on the x-axis represents the simulated prevalence of the binary trait, defined as \(n_{case}/(n_{case} + n_{control})\). We varied the proportion of causal variants in a gene (5%, 20%, or 50%) and the directions of variant effects (random or consistent), as labelled in the title of each panel. Each boxplot represents the distribution of power across 100 simulation replicates. The line inside each box indicates the median value, notches indicate the 95% confidence interval, the central box indicates the interquartile range (IQR), whiskers indicate data up to 1.5 times the IQR, and outliers are shown as separate dots. In all the analyses, we used a one-sided \(\chi _{\mathrm{d.f.} = 1}^2\) statistic to test against the null hypothesis of no association.
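The quantities defined in this legend (Bonferroni threshold, power, prevalence, and the P value from a 1-d.f. chi-squared statistic) can be illustrated with a minimal Python sketch. The function names and the example P values below are hypothetical and chosen only for illustration; this is not the code used to produce the figure.

```python
import math

N_GENES = 1224   # number of genes used in the simulation (from the legend)
ALPHA = 0.05     # family-wise significance level before correction


def bonferroni_threshold(alpha=ALPHA, n_tests=N_GENES):
    """Per-gene significance threshold after Bonferroni correction."""
    return alpha / n_tests


def chi2_1df_pvalue(stat):
    """Upper-tail P value of a chi-squared statistic with 1 d.f.

    For 1 d.f., P(X > x) equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


def power(causal_pvalues, threshold):
    """Proportion of causal genes whose P value falls below the threshold."""
    hits = sum(p < threshold for p in causal_pvalues)
    return hits / len(causal_pvalues)


def prevalence(n_case, n_control):
    """Prevalence of the binary trait: n_case / (n_case + n_control)."""
    return n_case / (n_case + n_control)


# Illustrative usage with made-up P values for four causal genes.
threshold = bonferroni_threshold()          # about 4.1e-05
example_pvalues = [1e-6, 1e-3, 1e-7, 0.5]   # hypothetical values
example_power = power(example_pvalues, threshold)
```

In the figure, one such power value would be computed per simulation replicate (over the 100 causal genes), and the boxplots summarize the 100 resulting values.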