Extended Data Fig. 4: Evaluation and comparison of different ESM models. | Nature Genetics

From: Genome-wide prediction of disease variant effects with a deep protein language model

Tested ESM models: ESM1b, ESM1, the five ESM1v models, and an ensemble of the five ESM1v models that averages the LLR scores obtained by the five models (ESM1v-avg). (a) Performance of the different ESM models on the clinical benchmarks (ClinVar and HGMD/gnomAD). Each model was evaluated as a binary classifier of pathogenic vs. benign missense variants over the two benchmarks using the global ROC-AUC metric. Only proteins shorter than 1,022 aa were considered in this evaluation (thereby avoiding the sliding-window approach). (b) Performance of the ESM models on the DMS benchmark.
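The evaluation in panel (a) can be illustrated with a minimal Python sketch. It is not the authors' code: the variant records, the field names, and the helpers `ensemble_llr` and `roc_auc` are hypothetical, and the LLR values are toy numbers. The sketch assumes that a lower (more negative) LLR indicates a more damaging variant, so the negated LLR is used as the pathogenicity score, and it reads "global" as pooling all variants from all proteins into a single ROC-AUC computation.

```python
from statistics import mean

MAX_LEN = 1022  # length cutoff taken from the caption; strict "<" is an assumption


def ensemble_llr(model_llrs):
    """Average the LLR scores that several models assign to one variant."""
    return mean(model_llrs)


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up variants: label 1 = pathogenic, 0 = benign;
# "llrs" holds one hypothetical score per ESM1v model.
variants = [
    {"protein_len": 350, "label": 1, "llrs": [-9.0, -8.0, -10.0, -9.0, -9.0]},
    {"protein_len": 800, "label": 0, "llrs": [-1.0, -2.0, -1.0, -2.0, -1.5]},
    {"protein_len": 500, "label": 1, "llrs": [-4.0, -5.0, -6.0, -5.0, -5.0]},
    {"protein_len": 1000, "label": 0, "llrs": [-6.0, -7.0, -6.0, -7.0, -6.5]},
    {"protein_len": 2500, "label": 1, "llrs": [-8.0, -8.0, -8.0, -8.0, -8.0]},
]

# Keep only variants in proteins below the length cutoff.
kept = [v for v in variants if v["protein_len"] < MAX_LEN]

labels = [v["label"] for v in kept]
# Negate the averaged LLR so that higher scores mean "more likely pathogenic".
scores = [-ensemble_llr(v["llrs"]) for v in kept]

auc = roc_auc(labels, scores)
print(f"variants kept: {len(kept)}, global ROC-AUC: {auc:.2f}")
```

The pairwise AUC is quadratic in the number of variants, which is fine for a toy example; a benchmark-scale evaluation would more likely use a rank-based or library implementation.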