Fig. 3: ESM-DBP outperforms SOTA methods on four prediction tasks. | Nature Communications

From: Improving prediction performance of general protein language model by domain-adaptive pretraining on DNA-binding protein

a Comparison of MCC and AUC values for ESM-DBP and SOTA methods on four downstream prediction tasks. Except for DeepTFactor and iDRBP_ECHF, whose training sets are inaccessible, the control methods use the same training sets as ESM-DBP (separated by a dotted line). b Head-to-head comparisons between ESM-DBP and four SOTA methods on independent test sets. Each point represents a sample, and the numbers give the counts of positive or negative samples on the diagonal and in the upper and lower triangles. The axes of the DBS panel show the MCC value for each protein; the axes of the other three panels show the probability that a sample is predicted as positive. The DBZF predictions contain only 781 positive samples and 612 negative samples, because those are the only samples in the DeepZF results file. c DBS prediction results of ESM-DBP and SOTA methods on chain C of a trimer (PDB ID: 4ZCF), with an MSA depth of 253 against the Uniclust30 database (ref. 37) using the HHblits program (ref. 34). The protein structures are drawn using PyMOL (ref. 81). Source data are provided as a Source Data file.
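
The two quantities the caption relies on, the MCC and the per-triangle sample counts in the head-to-head panels, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `mcc` and `head_to_head`, the tolerance `tol`, the zero-denominator convention, and the choice of which method is plotted on which axis are all assumptions made for illustration, and the scores in the usage example are invented.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0.0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def head_to_head(scores_x, scores_y, tol=1e-9):
    """Count samples on the diagonal, above it (y > x), and below it (y < x).

    scores_x and scores_y hold the per-sample values of the two methods
    (hypothetical axis assignment; the caption does not state which is which).
    """
    counts = {"diagonal": 0, "upper": 0, "lower": 0}
    for x, y in zip(scores_x, scores_y):
        if abs(x - y) <= tol:
            counts["diagonal"] += 1
        elif y > x:
            counts["upper"] += 1
        else:
            counts["lower"] += 1
    return counts


if __name__ == "__main__":
    # Invented scores for three samples, for illustration only.
    print(head_to_head([0.2, 0.5, 0.9], [0.8, 0.5, 0.1]))
    print(mcc(tp=50, tn=50, fp=0, fn=0))
```

For positive samples, a point in the triangle on the side of a method's axis means that method assigned the higher probability; for negative samples the interpretation is reversed, which is why counts for positives and negatives need to be read separately.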
