Extended Data Fig. 1: Association of genetic features with drug indications using Firth logistic regression in the Open Targets dataset, for all drugs and for drugs with one gene target.
From: Development of a human genetics-guided priority score for 19,365 genes and 399 drug indications

The Open Targets dataset was split into 80% training and 20% test sets for five-fold cross-validation. Firth logistic regression was run on the cross-validation training sets (n = 735,847 independent drug-gene-phenotype combinations), with drug indication as the outcome variable. The predictor variables were the eight human genetic features, 14 phecode categories, genetic constraint and the number of gene targets per drug, binarized into drugs with a single gene target and drugs with multiple gene targets. Shown is a forest plot of beta coefficients with 95% confidence intervals (CIs) for the eight human genetic features included in the models across the five cross-validation folds. Each cross-validated sample is color-coded; filled circles indicate beta coefficients with a significant P value, and error bars show the 95% CIs.
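
For readers unfamiliar with the method, the following is a minimal sketch of Firth logistic regression, which adds a Jeffreys-prior penalty to the likelihood so that coefficient estimates stay finite even when the outcome is rare or perfectly separated. It is an illustration only, not the authors' code: the legend does not state the software used, the function name `firth_logistic` and the toy data are hypothetical, and the 95% CIs here are simple Wald intervals (beta ± 1.96 × s.e.), which may differ from the interval method used for the figure.

```python
import numpy as np

def firth_logistic(X, y, max_iter=100, tol=1e-8):
    """Fit a Firth-penalized logistic regression (illustrative sketch).

    X must already contain an intercept column. Returns the coefficient
    vector and Wald standard errors from the inverse Fisher information.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])

    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted probabilities
        w = p * (1.0 - p)                        # logistic variance weights
        info = X.T @ (X * w[:, None])            # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Leverages: diagonal of W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Firth-modified score: the usual score plus the penalty term
        score = X.T @ (y - p + h * (0.5 - p))
        step = info_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break

    # Standard errors evaluated at the final estimate
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    se = np.sqrt(np.diag(np.linalg.inv(X.T @ (X * w[:, None]))))
    return beta, se

# Toy example (hypothetical data): one binary genetic feature that
# perfectly separates the outcome, where ordinary maximum likelihood
# would diverge but the Firth estimate remains finite.
X_toy = np.array([[1, 0], [1, 0], [1, 1], [1, 1]], dtype=float)
y_toy = np.array([0, 0, 1, 1], dtype=float)

beta, se = firth_logistic(X_toy, y_toy)
ci_lower = beta - 1.96 * se   # Wald 95% CI, an assumption of this sketch
ci_upper = beta + 1.96 * se
print("beta:", beta)
print("95% CI:", np.column_stack([ci_lower, ci_upper]))
```

In a cross-validation setting like the one described, such a fit would be repeated on each training split, giving one beta coefficient and CI per genetic feature per fold, which is what a forest plot of this kind displays.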