Improving docking and virtual screening performance using AlphaFold2 multi-state modeling for kinases

Song, Jinung; Ha, Junsu; Lee, Juyong; Ko, Junsu; Shin, Woong-Hee

doi:10.1038/s41598-024-75400-6

Published: 24 October 2024

Jinung Song^1,na1, Junsu Ha^2,na1, Juyong Lee^1,2,3, Junsu Ko^2 & Woong-Hee Shin^2,4

Scientific Reports volume 14, Article number: 25167 (2024)

Abstract

Structure-based virtual screening (SBVS) is a crucial computational approach in drug discovery, but its performance is sensitive to structural variations. Kinases, which are major drug targets, exemplify this challenge due to active site conformational changes caused by different inhibitor types. Most experimentally determined kinase structures have the DFGin state, potentially biasing SBVS towards type I inhibitors and limiting the discovery of diverse scaffolds. We introduce a multi-state modeling (MSM) protocol for AlphaFold2 (AF2) kinase structures using state-specific templates to address these challenges. Our comprehensive benchmarks evaluate predicted model qualities, binding pose prediction accuracy, and hit compound identification through ensemble SBVS. Results demonstrate that MSM models exhibit comparable or improved structural accuracy compared to standard AF2 models, enhancing pose prediction accuracy and effectively capturing kinase-ligand interactions. In virtual screening experiments, our MSM approach consistently outperforms standard AF2 and AF3 modeling, particularly in identifying diverse hit compounds. This study highlights the potential of MSM in broadening kinase inhibitor discovery by facilitating the identification of chemically diverse inhibitors, offering a promising solution to the structural bias problem in kinase-targeted drug discovery.

Introduction

Structure-based virtual screening (SBVS) is one of the most widely used computational drug discovery approaches to identify hit compounds against a target from a virtual library. By predicting the interaction between the ligand and the target protein, the technique ranks ligands based on scores that mimic the binding affinity between molecules. SBVS is a cost- and time-efficient way to explore vast chemical space by significantly narrowing down the number of drug candidates that need to be tested experimentally. Calculating the interaction between the receptor and ligand can be done in various ways: molecular docking, fingerprinting, pharmacophore matching, and so forth. As its name implies, the method requires the target protein structure, and its performance depends on the protein conformation. For targets whose experimental structures are not available, researchers must predict the three-dimensional structures.

One of the major obstacles in the SBVS method, especially molecular docking, is caused by its static representation of a receptor structure. Proteins are flexible molecules that can change shape depending on their binding partners. The structural change of a receptor protein might lead to a failure in molecular docking^1,2,3. One technique for addressing receptor flexibility is ensemble screening. This method uses pre-generated diverse receptor structures gathered from experimental structure databases or simulation trajectories such as molecular dynamics. Ligands in the screening library are docked to individual structures and ranked by their representative scores. Various methods exist to obtain the representative score, including arithmetic mean and harmonic mean. Since the ensemble method uses multiple target structures, it is important to reflect structural diversity when selecting the receptor ensemble. However, the crystal structures of the target may be biased towards thermodynamically stable states or the major type of inhibitor-bound form. This bias might lead to a failure in SBVS even when structural ensembles are used.

Kinases are one of the typical examples showing structural diversity. They play a key role in biological processes through phosphorylation, transferring a phosphate group from ATP to proteins. Since this process modulates key cellular operations like cell cycle regulation, metabolism, and apoptosis, kinases have become attractive targets in drug discovery. According to Santos et al.⁴, kinases belong to one of the four main target families: G protein-coupled receptors (GPCRs), ion channels, nuclear receptors, and kinases. In the ChEMBL database⁵, kinases are targeted by 14.2% of compounds.

The kinase domain, a structural domain with catalytic function, has a highly conserved structure. It is composed of an N-lobe and a C-lobe, linked by a hinge region⁶. The N-lobe is structured with five β-strands and a single alpha helix, known as the C-helix, while the C-lobe comprises multiple alpha helices. The catalytic activity of kinase domains is regulated by two loops: the activation loop and the catalytic loop. The His-Arg-Asp (HRD) motif of the catalytic loop directly interacts with the hydroxyl group of the target protein residue (serine, threonine, or tyrosine) set for phosphorylation. The Asp-Phe-Gly (DFG) motif found at the N-terminus of the activation loop plays a crucial role in anchoring ATP to the active site. The conformational states of the active site of kinase domains are classified as DFGin, DFGinter, or DFGout, based on the orientation of the aspartic acid of the DFG motif relative to the site. Figure 1 illustrates two distinct structural states, DFGin (Fig. 1A) and DFGout (Fig. 1B), of BRAF. The DFGin state locates the Asp of the DFG motif into the ATP binding pocket, allowing the kinase to hold ATP; thus, it is called the active state. On the other hand, the DFGout conformation directs the Asp out of the ATP binding pocket, so it is classified as an inactive state.

Kinase inhibitors can be categorized based on the binding site conformation of kinases and where the compounds bind. Type I inhibitors bind to the ATP-binding site and thus compete with ATP, so they bind to the DFGin state. We observed that the majority of experimentally determined human kinase structures form the DFGin state (87%, as of May 2023). On the other hand, type II inhibitors often associate with the kinase in the DFGout state. Type II compounds tend to partially occupy the ATP-binding site and a hydrophobic pocket close to it, which opens when the activation loop forms the DFGout conformation. In general, type II inhibitors have more selectivity than type I inhibitors. According to Hari et al.⁷, the selectivity of type II inhibitors is influenced by the inherent variations in a kinase’s capacity to adopt the DFGout conformation. Lastly, type III inhibitors, sometimes referred to as allosteric inhibitors, bind to a site not directly connected to the ATP-binding site. Therefore, it is crucial to consider as many different structural states of kinases as possible to obtain diverse hit molecules for SBVS.

Recent advances in protein structure prediction using deep learning techniques, such as AlphaFold2 (AF2)⁸ and RoseTTAFold⁹, allow for accurate modeling of protein structures. However, these methods rely on models trained on structures in the PDB database, so the generated models might be affected by the conformational state distribution in the PDB if the protein forms diverse structures. Therefore, these methods could preferentially produce kinase structures in the DFGin state, and SBVS targeting kinases with the predicted models might yield outcomes favoring type I inhibitors.

To address this issue, Heo and Feig¹⁰ proposed a method called multi-state modeling (MSM) to predict GPCRs with high accuracy in the desired structural state. The authors found that models predicted by AF2 tend to have either active or inactive states depending on GPCR classes due to the small number of experimental structures of GPCRs. Instead of providing a multiple sequence alignment (MSA) as the AF2 input, the authors provided an alignment of a query sequence and a structural template sequence of interest. The MSM technique showed improved performance in modeling GPCRs. Cognate docking using MSM models also showed enhanced accuracy for predicting binding poses, with a smaller root-mean-square deviation (RMSD) from the crystal structure.

Inspired by this work, we established an MSM protocol for modeling kinase structures to overcome the structural bias of kinases by providing state-specific templates to AF2. All human kinase experimental structures were classified by active site conformation using KinCoRe¹¹ to construct a state-specific template database. Our protocol was able to predict kinase conformations in the desired structural state with high accuracy. We then benchmarked the MSM protocol for cognate docking and virtual screening problems against the standard AF2. We also compared our results with crystal structures and the most recent AlphaFold program, AlphaFold3 (AF3)¹². The benchmarking results showed that the MSM models produced accurate binding modes with lower RMSD than standard AF2 and AF3 models. Ensemble SBVS with models generated by MSM outperformed SBVS with models generated by AF2 and AF3. In particular, when the active molecules are diverse, the MSM ensemble protocol enabled us to find more varied hit compounds than the crystal structures.

Results

Structural state distributions of experimental kinase structures and AlphaFold2 predicted models

The conformational change of a protein structure could be an obstacle to SBVS. The active site of kinases forms diverse conformations depending on the type of binding compound. For example, the DFGin state binds to type I inhibitors, while the DFGout state binds to type II inhibitors. Dunbrack and his colleagues introduced a classification rule for kinase structures called KinCoRe¹¹. This rule categorizes kinase structural states into 12 types based on the spatial state of the activation loop and the dihedral angle of the DFG motif. Details of the criteria can be found in the Methods section and in Modi et al.¹¹.

The blue bar in Fig. 2 shows the distribution of kinase conformational states annotated by the KinCoRe scheme¹¹ in the PDB database. More than half (53.6%) of experimentally determined kinase structures are classified as having the DFGin-BLAminus conformation. Details of the structure distribution are shown in Supplementary Table S1. Other than this major state, the remaining states each occupy less than 10% of PDB structures.

The thermodynamic stabilities of the conformational states might influence the preference for the DFGin state in experimentally determined structures. Meng et al.¹³ studied the transition between the structural states of c-Abl and c-Src kinases using umbrella sampling and the potential of mean force. With a difference of 1.4 kcal/mol, the DFGin state of c-Abl has a lower Gibbs free energy than the DFGout state. Their calculations show that the active conformation (DFGin) is the dominant population, while the DFGout state occupied only 9% in their simulation. Similarly, for c-Src, the DFGin state is more favored than the DFGout state. The free energy difference between the states is calculated as 5.4 kcal/mol. The authors also studied the thermodynamic barrier of the transition between the states¹⁴. It is estimated to be on the order of 2–4 kcal/mol, making the transition from the highly populated DFGin state to the DFGout state difficult. Levy and his colleagues¹⁵ collected 2,896 kinase structures and multiple sequence alignments and applied a Potts model to predict structural propensities from sequences. With this statistical potential, most kinases are predicted to have preferences for the DFGin state. The penalty for forming the DFGout state reaches 2–3 kcal/mol in extreme cases.
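The free energy differences quoted above can be related to state populations with a simple Boltzmann estimate. The sketch below is illustrative only: it assumes a two-state equilibrium at 300 K (the temperature is an assumption, not a value taken from the cited work), and the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def minor_state_population(delta_g_kcal, temperature_k=300.0):
    """Fraction of the higher-energy state in a two-state equilibrium.

    delta_g_kcal is the free energy of the minor state minus that of the
    major state (positive when the major state is more stable).
    """
    rt = R_KCAL * temperature_k
    return 1.0 / (1.0 + math.exp(delta_g_kcal / rt))

# Free energy gaps quoted in the text (DFGout relative to DFGin).
p_abl = minor_state_population(1.4)
p_src = minor_state_population(5.4)
print(f"c-Abl DFGout fraction: {p_abl:.3f}")
print(f"c-Src DFGout fraction: {p_src:.5f}")
```

Under these assumptions, a 1.4 kcal/mol gap gives a DFGout fraction of roughly 9%, in line with the population quoted for c-Abl, while a 5.4 kcal/mol gap leaves the DFGout state almost unpopulated.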

Not only thermodynamic preference, but also the distribution of kinase inhibitor types might affect the skewness of kinase structures. We counted the number of compounds of each kinase inhibitor type in PKIDB (accessed August 2023)¹⁶. PKIDB is a curated database of kinase inhibitors in clinical trials. Out of 369 compounds in the database, only 84 molecules have their inhibition type annotated, because annotating the ligand type requires the complex structure. Type I inhibitors (DFGin bound) are the dominant form, occupying 66% (55 compounds) of the annotated inhibitors. In contrast, 17 molecules are labeled as type II (DFGout bound). Thus, the DFGin conformation (active state) might have a higher chance of being crystallized than the DFGout state. The biased conformational states of kinases would make it harder to discover chemically diverse kinase inhibitors.

We also examined the conformational state distribution of 25 kinase DUD-E targets¹⁷, benchmarked throughout this study, in the PDB database (Fig. 2, orange bar). DUD-E is a widely used benchmark dataset for evaluating virtual screening methods, composed of pharmaceutically important targets such as GPCRs and kinases. The experimental structures are less biased than all PDB structures; about 30% of DUD-E proteins have the DFGin-BLAminus conformation. The other states tend to have a higher proportion than in the all-human kinase structure distribution. The difference in state distribution between all kinases and DUD-E targets potentially indicates that the highly biased nature of kinase structures might not be suitable for discovering diverse types of kinase inhibitors.

From an MSA of a given sequence, AF2⁸ extracts coevolution information using the Evoformer and predicts the three-dimensional structure based on this information and deep-learning models trained on existing protein structures. We modeled 25 kinase catalytic domains provided in the DUD-E kinase subset with the default parameters of AF2 (standard AF2), resulting in 125 structures (five models per target), then assigned the conformational states of the models using KinCoRe. The average plDDT and MolProbity¹⁸ scores of the predicted structures are 89.38 and 1.04, respectively, implying that they are properly modeled. Out of 125 predicted models, 91 structures (72.8%) are annotated as having the DFGin-BLAminus conformation (Fig. 2, green bar). The distribution of predicted models is more skewed than that of all human kinase structures and the DUD-E set proteins. Correspondingly, the other states have a lower proportion than in the experimental structures. The standard AF2 did not produce DFGin-ABAminus, DFGin-BLBtrans, DFGinter-BABtrans, and DFGout-Unassigned conformations, which occupy a small portion of the human kinase PDB structures (7.3%, 3.0%, 0.3%, and 3.2%, respectively, Supplementary Table S1). Since the experimentally determined kinase structures are biased towards DFGin-BLAminus, AF2 might tend to produce biased kinase conformations, which was also observed in previous research¹⁰. In addition to the biased trained models, the template selection process in AF2 modeling does not take the structural state of the kinase into account.

AF3 also shows a similar trend to AF2. The red bar in Fig. 2 shows the distribution of conformational states of AF3 structures. About 78% of AF3 structures were classified as DFGin-BLAminus, which is the major state in both crystal structures and AF2 modeled structures. This suggests that AF3 could not fully explore the diverse conformational landscape of kinases, despite the modeling part being replaced with a diffusion model.

Predicting state-specific kinase structures using multi-state modeling protocol

The MSM AF2 protocol provides conformational state-specific structures as templates for structure prediction. All human kinase structures from the KLIFS database¹⁹ were classified by their structural state following KinCoRe rules¹¹. For each state, the five structures with the highest sequence similarity to a query sequence were selected as templates for modeling. AF2 was then executed with this template information to predict five models for each template. Models with conformational states different from the given template structures were removed, and the model with the highest plDDT was selected for our benchmark. The overall workflow is illustrated in Fig. 3. Details of the kinase MSM protocol are elaborated in the Methods section.
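The selection steps described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the dictionary fields, the function names, the use of a precomputed percent identity as the similarity measure, and the `exclude_identical` flag (one possible way to exclude sequence-identical templates) are all assumptions made for the example.

```python
from collections import defaultdict

def select_templates(candidates, top_n=5, exclude_identical=False):
    """Group candidate templates by state and keep the top_n most similar.

    candidates: list of dicts with hypothetical keys 'pdb_id', 'state'
    (a KinCoRe label) and 'identity' (percent identity to the query).
    """
    by_state = defaultdict(list)
    for cand in candidates:
        if exclude_identical and cand["identity"] >= 100.0:
            continue  # drop sequence-identical templates
        by_state[cand["state"]].append(cand)
    return {
        state: sorted(group, key=lambda c: c["identity"], reverse=True)[:top_n]
        for state, group in by_state.items()
    }

def pick_model(models, target_state):
    """Discard models whose assigned state differs from the template state,
    then return the remaining model with the highest plDDT (or None)."""
    kept = [m for m in models if m["state"] == target_state]
    return max(kept, key=lambda m: m["plddt"]) if kept else None

# Toy data with made-up identifiers and scores.
candidates = [
    {"pdb_id": "tmplA", "state": "DFGin-BLAminus", "identity": 100.0},
    {"pdb_id": "tmplB", "state": "DFGin-BLAminus", "identity": 62.0},
    {"pdb_id": "tmplC", "state": "DFGout-BBAminus", "identity": 48.0},
]
with_identical = select_templates(candidates)
without_identical = select_templates(candidates, exclude_identical=True)

models = [
    {"name": "m1", "state": "DFGout-BBAminus", "plddt": 88.1},
    {"name": "m2", "state": "DFGin-BLAminus", "plddt": 92.0},
    {"name": "m3", "state": "DFGout-BBAminus", "plddt": 90.4},
]
best = pick_model(models, "DFGout-BBAminus")
print(best["name"])
```

In the toy data, `m2` has the highest plDDT overall but is discarded because its state does not match the requested DFGout-BBAminus template state.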

We prepared two template sets: a trivial template (TT) set containing 100% sequence-identical structural templates, and a nontrivial template (NT) set not having identical proteins. With our MSM protocol, AF2 could model structures in specific states, producing on average 8.5 and 8.2 models with the TT set and NT set, respectively. The number of predicted models for each target ranges from three (WEE1) to 11 (ABL1 and LCK, Supplementary Table S2). The average plDDT scores of structures are 89.63 and 88.08 for models from the TT set and NT set, respectively, showing comparable values to the standard AF2 predictions (89.38). As expected, TT set models have higher accuracy on average than NT set models, but their difference is marginal. The distribution of the MolProbity score is given in Supplementary Fig. S1.

The quality of models for each structural state was also examined. The average plDDT values range from 86.62 (DFGin-BLAplus) to 94.48 (DFGin-BLAminus), indicating that the quality of the models does not depend heavily on the structural state of kinases. The highest plDDT score for DFGin-BLAminus might be due to the highest frequency of this state in the PDB database (Supplementary Table S1). The Pearson’s correlation coefficient between the average plDDT and percentage in the PDB structure of each structural state is 0.739. Even though the MSM protocol provides a structural template for AF2, the model quality might be influenced by AF2’s pre-trained models, so the average plDDT of each model follows the distribution of the crystal structures.

The TT set models and the NT set models have average MolProbity scores of 1.21 and 1.24, respectively. These figures represent a modest decline in quality compared to the standard AF2 models (1.04). We also examined each structural state's average MolProbity score. The lowest value is 1.06 (DFGin-ABAminus), and the highest is 1.33 (DFGinter-Unassigned). For DFGin-BLAminus, the most populated state for both PDB and standard AF2, MSM models show an average MolProbity score of 1.08. The average MolProbity score and the distribution in the PDB structure have a Pearson's correlation coefficient of -0.59, indicating a moderate negative correlation. The high MolProbity scores might be associated with the states that are less populated in the PDB.

To investigate the accuracy of the models, we measured the TM-Score²⁰ of predicted models against crystal structures given in the DUD-E set, referred to as ‘reference structures’ throughout this paper. TM-Score assesses structural similarity between two given proteins, ranging from zero to one (identical). For MSM models, we used the predicted structures in the same structural state as the reference. The average TM-Scores are 0.92 and 0.90 for the models predicted using TT and NT sets, respectively. Compared with standard AF2 models (average TM-Score: 0.87), the MSM technique provided more accurate models than the standard AF2 protocol, which is expected since MSM provides structural templates. Models with TT sets generally have more accurate structures than those with NT sets, as also expected.
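For readers unfamiliar with the metric, the sketch below evaluates the standard TM-score formula for one fixed superposition. It is illustrative only: the real TM-score program also searches over superpositions to maximize the score, which is not done here, and the function names and the lower bound on the distance scale for very short chains are assumptions of this example.

```python
def tm_d0(target_length):
    """Length-dependent distance scale of the TM-score, in angstroms."""
    if target_length <= 21:
        return 0.5  # assumed lower bound for very short chains
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)

def tm_score_from_distances(distances, target_length):
    """TM-score for a given superposition.

    distances: C-alpha distances (angstroms) of aligned residue pairs.
    target_length: number of residues in the reference structure.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# A perfect match over every residue scores 1.0; covering only half of the
# reference, even perfectly, scores 0.5.
full = tm_score_from_distances([0.0] * 265, 265)
half = tm_score_from_distances([0.0] * 100, 200)
print(full, half)
```

Because the score is normalized by the reference length, unaligned residues lower the value even when the aligned part fits exactly.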

Consequently, by utilizing structures that represent a variety of states for the target kinase, the MSM could generate diverse structures as desired with high accuracy. Thus, it might provide an appropriate structure set for kinase ensemble SBVS.

Cognate docking accuracy of a compound to the multi-state modelled structures

To examine whether modeled structures are suitable for molecular docking and thus structure-based virtual screening, we first conducted a cognate docking experiment on crystal structures, standard AF2, AF3, and MSM models. Ligands from the 25 reference structures were used for this benchmark. For the standard AF2 model, the highest plDDT model for each protein, which is also used for performing virtual screening in the next section, was selected for evaluation. The same selection rule was applied to AF3. For evaluating the MSM model, we used the structure with the same KinCoRe annotation as the reference structure. Since the crystal structure of IGF1R was not assigned any of the structural states by KinCoRe, it was removed, reducing the number of target proteins to 24. AutoDock-GPU²¹ was employed to predict 50 binding poses. The RMSDs of all predicted poses to the crystal binding pose were calculated. For analysis, we considered three values: the RMSD of the best AutoDock score pose, that of the closest pose to the crystal binding mode, and the average RMSD of all 50 poses.
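The three RMSD summaries can be computed as sketched below. This is a minimal illustration under stated assumptions: poses and the crystal ligand share the same atom ordering and coordinate frame (a predicted receptor would first need to be superposed onto the crystal structure), no symmetry correction is applied, and the function names and data layout are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Plain RMSD between two equally ordered lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    sq = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(sq / len(coords_a))

def summarize_poses(poses, reference, cutoff=2.0):
    """poses: list of (docking_score, coords); a lower score is better."""
    rmsds = [rmsd(coords, reference) for _, coords in poses]
    best_idx = min(range(len(poses)), key=lambda i: poses[i][0])
    return {
        "best_score_rmsd": rmsds[best_idx],      # RMSD of the top-scored pose
        "lowest_rmsd": min(rmsds),               # pose closest to the crystal
        "mean_rmsd": sum(rmsds) / len(rmsds),    # average over all poses
        "success": rmsds[best_idx] < cutoff,     # 2.0 angstrom criterion
    }

# Toy example: two poses of a two-atom ligand.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
poses = [
    (-7.0, [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]),  # shifted by 1 angstrom
    (-9.0, [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]),  # shifted by 3 angstroms
]
summary = summarize_poses(poses, reference)
print(summary)
```

In the toy example the best-scored pose is not the one closest to the reference, which mirrors the gap between the best-score RMSD and the lowest RMSD reported in the benchmark.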

Table 1 summarizes the docking accuracy evaluation results. To define docking success, an RMSD cutoff of 2.0 Å, a standard criterion for judging docking success^22,23,24, was applied. Among the benchmarked structures, the crystal structures provided in the DUD-E set showed the best performance for both RMSD values and number of successful cases, as expected.

Table 1 Docking accuracy benchmark result. The values are the average values of 24 proteins and the numbers in the parentheses are the number of success cases with RMSD < 2 Å.

Considering the best AutoDock score models, the standard AF2, AF3, MSM structures modeled with TT, and NT have average RMSDs of 2.74 Å, 3.62 Å, 2.15 Å, and 3.49 Å, respectively. In addition, the number of successful cases is 11 (standard AF2), 7 (AF3), 16 (with TT), and 9 (with NT) out of 24 receptors. Individual RMSD values are given in Supplementary Table S3.

Comparing the MSM models with AF2 and AF3 models, the multi-state models with TT sets have the most accurate docking poses. As observed in the previous section, template information influenced the quality of the docking poses; i.e., the predicted docking poses of the models with TT sets have smaller RMSDs and more successful cases than those with NT sets. One successful example is AKT2 (Fig. 4). Although the TM-scores of MSM with TT set (0.98) and standard AF2 (0.96) are similar, the best AutoDock score model docked to the MSM using TT sets has a more accurate binding pose than standard AF2, with RMSDs of 0.86 Å (Fig. 4B) and 3.24 Å (Fig. 4C), respectively.

Even when docking is not successful (RMSD > 2 Å), MSM structures were able to retrieve the interactions between protein and ligand found in the crystal structure for some cases. Out of eight cases, the interacting residues in the reference structures were successfully captured more than 50% in five proteins analyzed by PLIP²⁵ (Supplementary Table S4). For example, the predicted binding pose of the best AutoDock score conformation of PRKCB cognate docking had an RMSD of 2.79 Å when the MSM with TT structure was used as the receptor structure. However, out of eleven interacting residues identified in the reference structure, ten residues were retrieved in the MSM-ligand complex model. By analyzing the interaction pattern, the predicted docking pose to the multi-state modeled structure shows hydrophobic interactions with L348, F353, V356, A369, K371, A483, and D484 and hydrogen bonding with T404, E421, and V423. These interactions are also observed in the crystal structure (Supplementary Fig. S2). The cognate docking benchmark results suggest that MSM models could be used to predict binding poses of kinase-ligand complexes, thus making them suitable for virtual ensemble screening.

Similar trends are observed for the lowest RMSD conformations and average RMSD of 50 conformations. Among the predicted structures, MSM with TT shows the most accurate models (average of the lowest RMSD pose: 1.40 Å, success cases: 19) and the standard AF2 performs slightly lower (average of the lowest RMSD pose: 1.50 Å, success cases: 20). AF3 also showed worse performance, with an average RMSD of 2.01 Å and 14 success cases. The docking accuracy of MSM with NT sets is the worst among the receptor structure sets (average of the lowest RMSD pose: 2.14 Å, success cases: 14). In most cases, the best AutoDock score conformations do not match the lowest RMSD conformations, which means that the AutoDock score could not find the optimal docking poses. The average RMSD of 50 conformations follows the same order as the other two metrics: MSM with TT is the smallest, and MSM with NT is the highest.

Virtual screening performance with multi-state models

To investigate the advantage of using the MSM technique for SBVS, the compound library for each kinase protein from DUD-E was docked to structures with diverse states generated using our method and compared to crystal structures, standard AF2, and AF3 structures, using AutoDock-GPU. For ensemble docking using models generated by MSM and AF3, since a molecule was docked to multiple receptor structures and thus had multiple docking scores, we needed to decide on representative scores of the compounds to rank the molecules. We employed two representative scores: the AutoDock best (ADB) and the Boltzmann-weighted (BW) scores. The ADB score picks the lowest AutoDock score among the docked results, while the BW score is a weighted average of all scores. The details of the BW score are given in the Methods section.
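The two representative scores can be sketched as follows. The ADB score follows directly from the description above. The exact BW formula is given in the Methods section, which is not reproduced here, so the Boltzmann weighting below (weights proportional to exp(-score/kT), with kT assumed to be about 0.593 kcal/mol) is an assumed form for illustration, and the function names are hypothetical.

```python
import math

KT_KCAL = 0.593  # assumed thermal energy in kcal/mol (about 298 K)

def adb_score(scores):
    """AutoDock best: the lowest (most favorable) score over the ensemble."""
    return min(scores)

def bw_score(scores, kt=KT_KCAL):
    """Boltzmann-weighted average of the scores (assumed functional form).

    Scores are shifted by their minimum before exponentiation for
    numerical stability; the shift cancels in the normalized weights.
    """
    best = min(scores)
    weights = [math.exp(-(s - best) / kt) for s in scores]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total

# Docking scores (kcal/mol) of one compound against three receptor models.
scores = [-9.0, -7.0, -6.0]
adb = adb_score(scores)
bw = bw_score(scores)
print(adb, round(bw, 3))
```

With this form, the BW score always lies between the ADB score and the arithmetic mean, and it approaches the ADB score when one receptor model scores much better than the others.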

Table 2 shows the performance of SBVS using the various receptor structure sets. To evaluate the performance, five metrics are used: enrichment factors at 1%, 5%, and 10% (EFX%), area under ROC curve (AUC), and Boltzmann-enhanced discrimination of receiver operating characteristic (BEDROC). EF at X% indicates the capability of finding active molecules within the top X%, and AUC represents the discrimination power of a screening method between active and decoy molecules. BEDROC puts exponential weights on the early rank of molecules, thus it can solve the ‘early recognition problem’ caused by AUC²⁶. Details of the metrics are illustrated in the Methods section. As observed in the cognate docking benchmark, crystal structures showed the best performance in most metrics. Also, MSM structures are better than or equal to standard AF2 and AF3 results, regardless of the scoring method or the template set for modeling. Details of individual results are provided in Supplementary Table S5. Comparing EF_1% values target-by-target, the MSM performed better than or equal to standard AF2 in 18 proteins out of 25 targets (72%) using the ADB score screened to the structures modeled with the NT set. Surprisingly, the combination, NT models plus ADB, also performed better than the crystal structures. Even with the lowest EF_1% combination, TT models with BW scoring, the MSM performed better than or equal to standard AF2 in more than half of the proteins (13 proteins).
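Two of the metrics can be illustrated with a short sketch. The definitions below are the commonly used ones (enrichment factor as the active rate in the top fraction divided by the active rate in the whole library, and AUC as the probability that a random active outranks a random decoy); the Methods section may differ in details such as rounding of the top fraction, and the function names are hypothetical.

```python
def enrichment_factor(scores, labels, fraction):
    """EF at the given fraction; labels are 1 (active) or 0 (decoy),
    and a lower score ranks earlier."""
    n = len(scores)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, int(round(n * fraction)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0])
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (total_actives / n)

def roc_auc(scores, labels):
    """AUC as the fraction of active/decoy pairs ranked correctly
    (ties count one half). Quadratic in library size; fine for a sketch."""
    actives = [s for s, l in zip(scores, labels) if l == 1]
    decoys = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))

# Toy library: 100 compounds, the 10 best-scored ones are the actives.
scores = [float(i) for i in range(100)]
labels = [1 if i < 10 else 0 for i in range(100)]
ef1 = enrichment_factor(scores, labels, 0.01)
ef10 = enrichment_factor(scores, labels, 0.10)
auc = roc_auc(scores, labels)
print(ef1, ef10, auc)
```

For this perfectly ranked toy library the enrichment factor is capped by the active ratio: with 10% actives, the maximum possible EF is 10 at any fraction up to 10%.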

Table 2 Performance of structure-based virtual screening on various receptor models.

To assess the statistical significance of our findings, we performed a Wilcoxon signed-rank test, a non-parametric method, across all receptor structure pairs and metrics (Supplementary Table S6). While the results did not reach the conventional threshold for statistical significance (p < 0.05), the comparison of EF_1% between NT models with the ADB score and standard AF2 showed a relatively low p-value. The relatively small sample size (n = 25) may have limited our ability to detect subtle differences between methods. Further investigation with larger datasets could help to more fully understand the comparative performance of these methods.

One of the interesting findings is that although the TT set models with the same KinCoRe classification as the reference have more accurate structures and docking poses than the NT set models, the virtual screening performance is slightly worse in both scoring schemes. To elucidate the variance in performance between MSMs using TT and NT models, we identified kinases that exhibited notable differences in EF_1% between the two sets. Of the kinases studied, ABL1 demonstrated superior performance using the MSM with the NT set compared to the TT set, across both ensemble scoring methods (Supplementary Table S5). We measured TM-Scores between the templates from TT and NT sets used to model the same structural state, as well as the EF_1% of the models (Supplementary Table S7). Among the kinase states analyzed, the DFGout-BBAminus state produced a significant difference in both templates and models. When the kinase forms an active state, the activation loop forms an extended conformation to facilitate the catalytic function of the protein; thus, the DFGin conformation has a conserved structure. On the other hand, in the DFGout conformation, the activation loop collapses onto the protein surface with high flexibility¹¹. Therefore, although the protein structures have the same KinCoRe notation with DFGout, the activation loop can have different structures. The TM-Score between the templates used to model the DFGout-BBAminus state is 0.88 for ABL1. The structural difference of templates influenced the predicted models, resulting in a TM-Score of 0.89 between predicted models from TT and NT sets. Supplementary Fig. S3 shows the structural difference of ABL1 predicted models with DFGout-BBAminus conformation. The difference in activation loop location also might affect the virtual screening performance, leading the TT set (3.28) to have a lower EF_1% value than the NT set (15.83). Notably, this difference in performance in the DFGout-BBAminus state impacted the overall ensemble docking results.

One benefit of using the MSM models is that the predicted models are diverse, so they are potentially useful for discovering various scaffolds of hit chemicals. To examine whether this hypothesis is true, we divided the benchmark set into two groups based on the diversity of active compounds in the screening library. The pairwise Tanimoto coefficients (T_c) between the active compounds were calculated using RDKitFP fingerprint²⁷. Then the pairwise distances between the compounds were calculated as (1 – T_c). The diversity of active compounds is defined as the average distances of the active compounds (Supplementary Table S5). With a threshold of 0.7, the kinases were classified into two groups: 14 proteins with values higher than or equal to the cutoff and the remaining 11 targets.
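The diversity measure can be sketched with fingerprints represented as plain sets of on-bit indices. This is an illustration only: the study used RDKitFP fingerprints, whereas the toy bit sets, the function names, and the convention for two empty fingerprints below are assumptions of this example.

```python
from itertools import combinations

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / union

def average_dissimilarity(fingerprints):
    """Mean of (1 - Tc) over all unordered pairs of compounds."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)

# Toy fingerprints for three active compounds.
actives = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
diversity = average_dissimilarity(actives)
is_diverse = diversity >= 0.7  # threshold used in the text
print(round(diversity, 3), is_diverse)
```

Here the pairwise distances are 0.5, 1.0, and 1.0, so the average is about 0.83 and the toy target would fall into the more diverse group.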

For the kinases with more diverse active compounds, the ensemble screening with MSM models performed better than standard AF2 and AF3 models in all metrics, and even higher EF_1% than the crystal structures. For EF_1%, the MSM performed much better than standard AF2 (Table 2). Out of 14 proteins, the MSM with NT set and BW scoring has higher or equal EF_1% values than standard AF2 models in 11 targets. On the other hand, when the active compounds become less diverse, MSM still performed better than the standard AF2, but the gap between them declined for EF_1%. This implies that our approach would be powerful for discovering diverse molecular scaffolds.

To investigate further, we examined the diversity of active compounds identified as top hits by calculating Shannon's entropy (Supplementary Table S8). The active molecules in the DUD-E set were clustered using the Butina algorithm²⁸ with a distance cutoff of 0.3. A higher Shannon's entropy means that more diverse active compounds were found. The NT models with BW score show the highest Shannon's entropy within the top 1% (1.36). Other AF-based models have Shannon's entropy around 1.0 (1.08, 1.01, and 1.15 for standard AF2, AF3 with ADB, and AF3 with BW). Regarding the kinases with average dissimilarity ≥ 0.7, the NT models with ADB show the highest Shannon's entropy within the top 1% (1.60), and the gap between the MSM protocol and the other receptor structures increases. This result supports the view that multiple-conformation ensembles generated by our protocol are suitable for discovering diverse hits.
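The entropy calculation can be sketched as below. The text does not state the logarithm base or the exact probabilities used, so this example assumes the natural logarithm and cluster proportions among the active compounds retrieved in the top fraction; the function name and the cluster labels are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(cluster_ids):
    """Shannon entropy (natural log) of the cluster membership of hits.

    cluster_ids: one cluster label per active compound found in the
    top-ranked fraction.
    """
    n = len(cluster_ids)
    if n == 0:
        return 0.0
    counts = Counter(cluster_ids)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Hits concentrated in one cluster versus spread over four clusters.
concentrated = shannon_entropy(["c1", "c1", "c1", "c1"])
spread = shannon_entropy(["c1", "c2", "c3", "c4"])
print(concentrated, round(spread, 3))
```

Hits drawn from a single cluster give zero entropy, while hits spread evenly over k clusters give ln(k), so a larger value indicates that the top-ranked actives cover more chemotypes.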

ABL1 is one example of a target with diverse active compounds. The average dissimilarity of its active compounds is 0.73. Regardless of the template set and the scoring scheme, MSM ensemble screening achieved a higher EF_1% (TT models with ADB: 9.3, TT models with BW: 10.4, NT models with ADB: 15.3, and NT models with BW: 17.5) than the standard AF2 model (6.0), AF3 (7.1 for ADB and 7.6 for BW), and the crystal structure (7.6). We also examined the diversity of active compounds ranked within the top 1% of molecules. Our ensemble protocol tends to find diverse hits: 0.51, 0.53, 0.63, and 0.62 for TT models with ADB, TT models with BW, NT models with ADB, and NT models with BW, respectively. In contrast, the diversity of active compounds found using the standard AF2 model is 0.27. Supplementary Fig. S4 illustrates the distribution of active compounds within the top 1%. The ABL1 crystal structure retrieved 16 active compounds from 6 clusters, while MSM with NT plus BW retrieved 32 active compounds from 16 clusters.

CSF1R is another example showing that MSM ensemble screening can find diverse hit compounds. The average dissimilarity of its active molecules is 0.73. The gap in EF_1% between MSM models (5.4) and the standard AF2 model (3.6) is smaller than in the case of ABL1. The diversities of discovered hit compounds within the top 1% are 0.70, 0.65, 0.73, and 0.73 for TT models with ADB, TT models with BW, NT models with ADB, and NT models with BW, respectively, similar to the average dissimilarity of all active compounds. However, the diversity of hit compounds within the top 1% identified using the standard AF2 model is 0.23. Figure 5 shows the docking pose of one of the active compounds, ChEMBL245377. The MSM ranked this compound within the top 1% (10th out of 12,316 compounds using TT models and BW scoring), whereas the standard AF2 structure did not rank it highly (167th). The compound is not successfully docked using the standard AF2 model.

Although the ensemble screening with MSM structures generally performed better than standard AF2, selecting the representative score of a compound remains an issue. For instance, for FGFR1, the MSM model with TT set and BW score showed an EF_1% of 2.1. When we examined the EF_1% values of individual models, the DFGin-BLBplus state outperformed all other structures, including the standard AF2 and crystal structures (Table 3). Thus, a proper method for selecting or calculating the representative score of a compound should be designed to achieve high performance in MSM ensemble screening.

Table 3 EF_1% for FGFR1 structures of specific states. The MSM structures are built from TT set.

Discussion

The receptor conformation affects SBVS performance. Like other proteins, kinases adjust the conformation of their binding sites in response to the bound ligand. Therefore, it is crucial to have adequate kinase structures to obtain inhibitors with the required mode of action or diversity. Experimentally determined human kinase structures, however, are clearly biased toward the active state. AF2 structure prediction could be influenced by this bias in the PDB, and we observed that structures predicted with the standard AF2 protocol are indeed biased toward the active state. SBVS results obtained with such predicted structures would be compromised by this bias in the receptor structure. To overcome the bias, we applied the MSM technique, which provides structural templates to AF2 to generate structures in diverse states, and used the resulting models for ensemble docking. Compared with standard AF2 models, the MSM protocol produced more accurate or comparable models even though it did not use an MSA as input. Also, in the cognate docking study, MSM models provided ligand docking poses close to those in the crystal structures. With the diverse predicted kinase structures, we performed ensemble screening. The ensemble method showed enhanced or comparable SBVS performance relative to the standard AF2 modeled structures. We also observed that our method is more suitable when the ligands are diverse, leading to the identification of a diverse range of kinase inhibitors. Even for the targets where the ensemble docking method could not find active compounds, we found that some of the individual models outperformed the standard AF2 model. Thus, the selection of a representative structure should be improved, and this remains future work.

Ensemble screening with MSM models opens the possibility of uncovering novel kinase inhibitors with diverse chemical scaffolds. It helps address a current challenge in kinase inhibitor development: finding chemically diverse compounds. The chemical diversity of kinase inhibitors could aid in overcoming drug resistance, which is generally caused by mutations and is a significant obstacle in kinase-targeted cancer therapies. Additionally, it could increase the chance of finding hit compounds that are dissimilar to those covered by existing patents. By exploring a diverse array of kinase inhibitors with accurately predicted structures, we would be able to find inhibitors with different modes of action. This could lead to the development of novel therapeutic strategies that are more robust in the face of drug resistance. Hence, our approach could potentially contribute to more effective and precise kinase-targeted therapies. In addition, the MSM method could be applied to other important therapeutic targets such as GPCRs.

Methods

Benchmark dataset

To investigate whether our approach can properly generate a structural ensemble and improve SBVS performance, we selected the kinase subset of DUD-E¹⁷. For each target, the DUD-E set provides a reference PDB structure for screening and a compound library composed of active and decoy molecules. The active compounds of the DUD-E set are molecules with an affinity of 1 µM or better extracted from ChEMBL09⁵. The reference structures were selected by considering the resolution and the enrichment of active molecules by DOCK3.5. The decoys of the DUD-E set are constructed by gathering compounds from the ZINC database²⁹ with characteristics similar to those of the active molecules, such as logP and the number of rotatable bonds. The kinase subset used in this work consists of 26 kinases with 205.6 actives and 12,830 decoys on average. Among the 26 targets, SRC kinase was removed from the benchmark set because its reference structure does not originate from human, leaving 25 targets. The reference crystal structures provided in the set were used for the docking and screening benchmarks.

Kinase structural state annotation

In the active site of protein kinases, the activation loop, 20–30 residues long, is the most important secondary structural element³⁰ for determining the structural state. The loop starts with the conserved three-residue sequence known as the DFG motif. In this work, the standalone version of KinCoRe¹¹ (https://github.com/vivekmodi/Kincore-standalone, Accessed 4/14/2022) was employed to annotate the conformational state of all experimental and modeled kinase structures. The program categorizes the conformational state into 12 classes by the location of the activation loop and the dihedral angles of the DFG motif. The spatial state of the activation loop is defined by two distances: (1) the distance between the Phe-ring of the DFG motif and the Cα atom of the fourth residue from the conserved Glu in the C-helix of the N-lobe and (2) the distance from the Phe-ring to the Cζ atom of the conserved Lys in the β3 strand of the N-lobe (Fig. 1). Based on these distances, the activation loop location is classified into three classes: DFGin, in which the Phe-ring is located under the C-helix; DFGout, in which the Phe-ring has moved into the ATP binding pocket; and DFGinter, an intermediate state between DFGin and DFGout. The program further classifies the activation loop structural state by calculating dihedral angles: the φ and ψ backbone dihedral angles of X-DFG (the residue before the DFG motif), Asp, and Phe of the DFG motif, and the χ1 angle of DFG-Phe. As a result, DFGin, the dominant class, has seven subclasses (BLAminus, BLAplus, ABAminus, BLBminus, BLBplus, BLBtrans, and Unassigned), while DFGinter (BABtrans and Unassigned) and DFGout (BBAminus and Unassigned) have only two subclasses each. The three letters after the activation loop state denote the regions of the Ramachandran map occupied by the X, D, and F residues: A, B, and L for alpha, beta, and left-handed, respectively. The χ1 angle of Phe is indicated as plus (+60°), minus (−60°), or trans (180°). The last class is Unassigned-Unassigned, for which the activation loop and DFG conformations cannot be determined.
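The naming rule for the state labels can be illustrated with a small decoder. The following Python sketch only unpacks the labeling convention described above; it is not part of KinCoRe, and the function name `decode_state` and the returned dictionary layout are hypothetical.

```python
RAMACHANDRAN = {"A": "alpha", "B": "beta", "L": "left-handed"}
CHI1_DEGREES = {"plus": 60, "minus": -60, "trans": 180}
RESIDUES = ("X-DFG", "DFG-Asp", "DFG-Phe")


def decode_state(label: str) -> dict:
    """Split a label such as 'DFGin-BLAminus' into its documented parts."""
    spatial, _, dihedral = label.partition("-")
    decoded = {"spatial": spatial, "regions": None, "chi1_deg": None}
    if dihedral in ("", "Unassigned"):
        return decoded  # dihedral subclass could not be determined
    letters, rotamer = dihedral[:3], dihedral[3:]
    valid_letters = len(letters) == 3 and all(c in RAMACHANDRAN for c in letters)
    if not valid_letters or rotamer not in CHI1_DEGREES:
        raise ValueError(f"unrecognized dihedral label: {dihedral!r}")
    # One Ramachandran region per residue, in the order X, D, F.
    decoded["regions"] = {res: RAMACHANDRAN[c] for res, c in zip(RESIDUES, letters)}
    # Nominal chi1 angle of the DFG-Phe side chain in degrees.
    decoded["chi1_deg"] = CHI1_DEGREES[rotamer]
    return decoded


print(decode_state("DFGin-BLAminus"))
```

For example, the label DFGin-BLAminus decodes to a beta region for X-DFG, a left-handed region for Asp, an alpha region for Phe, and a nominal χ1 of −60°.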

Construction of template database for each structural state

To construct the structural template database for MSM, KLIFS¹⁹, a database of experimentally determined kinase structures, was used. The database contains catalytic domain structures of kinases and their inhibitors extracted from the PDB, and provides interaction information between the protein and the compound. As of May 2023, the database is composed of 6,344 structures (13,382 monomers). Among the kinase structures in KLIFS (Accessed 1/18/2023), we filtered out the non-human proteins and the proteins that produced errors during KinCoRe annotation, resulting in 11,106 monomer structures. To construct the template structure database for each state, the crystal structures with the same KinCoRe annotation were gathered.

Standard AlphaFold2 modeling

The kinase structure was modeled with the standard protocol of AF2 (v.2.3.1) for comparison with the MSM models. Only the kinase domain was modeled from the full sequence of a protein. The MSA for the kinase sequence was generated from BFD (v.3.2019), MGnify (v.5.2022), and UniRef90 (v.2.2022) via HHblits (from HH-suite v3.3.0) and Jackhmmer (from HMMER v3.3.2). The four proteins with the highest sequence identity and available 3D atomic coordinates were selected as template structures. The number of recycles was set to three, and the model relaxation step was integrated into the procedure. As AF2 has five different trained models and runs all of them independently in a single run, five structures were generated from a single run. Out of the five predicted structures, the model with the highest pLDDT score was selected for the comparison, since pLDDT is a confidence measure of AF2-predicted models.

Kinase structure modeling using AlphaFold3

AF3¹² is the most recent version of the AlphaFold series. Relative to AF2, the modeling part was replaced with a diffusion model. To model the kinase structures, the AlphaFold3 webserver (alphafoldserver.com) was employed. Three different seeds were used to predict diverse structures of a kinase. The outputs were annotated using KinCoRe. For each kinase, the model with the best ranking score was selected as the receptor structure for the cognate docking benchmark. All three models were used as the ensemble of the kinase for screening the compound library.

Multistate modeling of kinase using structural template

The workflow of MSM is given in Fig. 3. For a given target kinase sequence, templates were searched with MMseqs2 (release 11) easy-search (e-value cutoff: 1e-3)³¹ against all sequences in each structural state. For each structural state, the top five templates, ranked by e-value, were used for the modeling. To mimic the real drug discovery process and check the influence of the template on virtual screening, we generated two template sets for modeling: one set includes the query protein itself, i.e., a template with 100% sequence identity to the query sequence (TT set), and the other excludes the query protein (NT set). Modeling with each template was conducted independently.

Since AF2 produced five models per single run, 25 structures were generated in total for each specific state. Among them, we selected one structure for benchmarking using two criteria: the conformational state and the quality of the predicted model. First, we filtered out the models whose KinCoRe annotation differed from the classification of the template structure. For example, to predict a model of the DFGin-ABAminus conformation of ABL1, five template structures from the TYR family (LYN: 5XY1, EPHA2: 7KJB, 5NK3, 4TRL, and IGF1R: 3F5P) were selected. However, the resulting models were annotated as DFGin-BLAminus rather than DFGin-ABAminus (Supplementary Fig. S5). Thus, all predicted models were discarded. After filtering by the KinCoRe annotation, the models with pLDDT less than 70 were also removed. Among the remaining structures, the model with the highest pLDDT score was finally selected. Other modeling details are the same as in standard AF2 modeling.
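The two-step selection described above (state filter, then confidence filter, then best remaining model) can be sketched as follows. This is a minimal illustration, assuming that each candidate model already carries its KinCoRe annotation and mean pLDDT; the `Model` class, the `select_model` function, and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    name: str     # identifier of the predicted structure
    state: str    # KinCoRe annotation of the predicted structure
    plddt: float  # AF2 confidence score of the model


def select_model(models: List[Model], target_state: str,
                 min_plddt: float = 70.0) -> Optional[Model]:
    """Keep models in the intended state with pLDDT >= min_plddt, return the best."""
    kept = [m for m in models if m.state == target_state and m.plddt >= min_plddt]
    if not kept:
        return None  # every model was discarded for this state
    return max(kept, key=lambda m: m.plddt)


# Hypothetical candidates for one target state.
candidates = [
    Model("m1", "DFGin-BLAminus", 91.0),  # wrong state, filtered out
    Model("m2", "DFGin-ABAminus", 65.0),  # right state, pLDDT below 70
    Model("m3", "DFGin-ABAminus", 82.5),
    Model("m4", "DFGin-ABAminus", 78.0),
]
print(select_model(candidates, "DFGin-ABAminus"))
```

Returning `None` corresponds to the ABL1 case mentioned in the text, where no model of the requested state survived the annotation filter.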

Assessment of model quality

The TM-score²⁰ evaluates the structural similarity of protein structures. It is scaled according to the size of the protein and is more sensitive to the overall structural alignment than the RMSD. The models with the same KinCoRe annotation as the crystal structures extracted from the DUD-E set were chosen for comparison with the crystal structures using the TM-score. TM-score values range from 0 to 1, where 1 signifies a perfect alignment.

MolProbity¹⁸ is a comprehensive validation tool for the structural integrity of proteins and nucleic acids. MolProbity validates protein structures through hydrogen replacement, comprehensive all-atom contact analysis, and evaluation of torsional angles. The MolProbity score integrates various factors into a single metric to indicate the model’s reliability, where a lower score denotes a better model. The models from the standard AF2 and MSM were validated using MolProbity implemented in Phenix software³².

Docking and virtual screening using AutoDock-GPU

AutoDock-GPU 1.5.3²¹, which is open-source and GPU-accelerated, was employed to benchmark the virtual screening performance of kinase structures. We used AutoDockTools³³ to convert the receptor PDB files to PDBQT and Meeko³⁴ based on RDKit²⁷ to convert the ligand files into PDBQT format.

To define the docking pocket location, the model structure was superimposed on the reference crystal structure provided in the DUD-E set using the PyMOL alignment module³⁵. Then, a cubic box centered at the geometric center of the cognate ligand in the reference protein structure was defined. Each edge of the box is 22.5 Å long, a default option of the program. The parameter nrun, the number of pose generations and searches in AutoDock-GPU, was set to 50 to find the optimal AutoDock score between protein and ligand.
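The box definition can be sketched as a short geometric calculation. The example below assumes that the cognate ligand coordinates are already expressed in the frame of the superimposed model and that the geometric center is the unweighted mean of the atom positions; the function name and the coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def ligand_box(coords: List[Point], edge: float = 22.5):
    """Return (center, lower corner, upper corner) of a cubic docking box in Å."""
    if not coords:
        raise ValueError("ligand has no atoms")
    n = len(coords)
    # Geometric center: unweighted mean of the atomic coordinates.
    center = tuple(sum(atom[i] for atom in coords) / n for i in range(3))
    half = edge / 2.0
    lower = tuple(c - half for c in center)
    upper = tuple(c + half for c in center)
    return center, lower, upper


# Two made-up ligand atoms, for illustration only.
center, lower, upper = ligand_box([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
print(center, lower, upper)
```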

Scoring schemes for ensemble docking

To obtain the representative score of a ligand docked to multiple receptor structures in ensemble screening, we employed two scoring schemes: the AutoDock best score (ADB) and the Boltzmann-weighted score (BW). After gathering all AutoDock scores of a compound docked to the MSM structures, the ADB scheme picks the lowest value as the representative score of the ligand. For example, if a ligand is docked to five kinase structures with docking scores of -11 kcal/mol, -12 kcal/mol, -8 kcal/mol, -7 kcal/mol, and -9 kcal/mol, then the compound has a score of -12 kcal/mol.

Instead of using the docking score from a single structure, the BW scheme calculates a weighted average of the docking scores. We modified BW score of Shin et al.¹, which was originally used to calculate a score of a protein with multiple ligand conformations, to apply a single ligand to multiple protein conformations (Eq. (1)).

$$\text{BW Score}\left(P,L\right)=\frac{\sum_{P_{state}}^{N_{state}}\mathrm{AutoDock}\left(P_{state},L\right)\times\exp\left[-\beta\times\mathrm{AutoDock}\left(P_{state},L\right)\right]}{\sum_{P_{state}}^{N_{state}}\exp\left[-\beta\times\mathrm{AutoDock}\left(P_{state},L\right)\right]}$$

(1)

where $\beta = 1$, and P and L are the protein and the ligand, respectively. P_state denotes the target protein structure in a specific state, and AutoDock(P_state, L) denotes the AutoDock score of the ligand for that structure.
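Both scoring schemes can be sketched in a few lines of Python. The sketch below uses the five example scores from the ADB description and β = 1 as in Eq. (1). Subtracting the minimum score inside the exponential is an implementation choice for numerical stability (the shift cancels between numerator and denominator) and is not taken from the text; the function names are hypothetical.

```python
import math
from typing import Sequence


def adb_score(scores: Sequence[float]) -> float:
    """AutoDock best score: the lowest (most favorable) score over all states."""
    return min(scores)


def bw_score(scores: Sequence[float], beta: float = 1.0) -> float:
    """Boltzmann-weighted average of the docking scores over all states."""
    s_min = min(scores)
    # exp(-beta * s) shifted by s_min; the common factor cancels in the ratio.
    weights = [math.exp(-beta * (s - s_min)) for s in scores]
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


# Docking scores (kcal/mol) of one ligand against five kinase structures.
scores = [-11.0, -12.0, -8.0, -7.0, -9.0]
print(adb_score(scores), bw_score(scores))
```

Because the weights favor low scores exponentially, the BW score always lies between the best score and the plain average, and it approaches the ADB score when one state dominates.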

Evaluation metrics for docking and screening

To evaluate the performance of cognate docking, the RMSDs of the docked conformations from the bound conformation in the crystal structure were calculated. We then picked two conformations: the one with the lowest AutoDock score and the one with the lowest RMSD. We also measured the docking success rate over the 24 target proteins with an RMSD cutoff of 2 Å, a widely used criterion in docking studies^22,23,24.
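The pose evaluation can be sketched as follows. This minimal example assumes that the docked pose and the crystal pose list the same atoms in the same order and share one coordinate frame, so no superposition or symmetry correction is applied. Counting a pose with an RMSD exactly equal to the cutoff as a success is also an assumption, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(pose: Sequence[Point], reference: Sequence[Point]) -> float:
    """Root-mean-square deviation between matched atoms, without refitting."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum((a - b) ** 2
                  for p, r in zip(pose, reference)
                  for a, b in zip(p, r))
    return math.sqrt(squared / len(pose))


def success_rate(rmsds: List[float], cutoff: float = 2.0) -> float:
    """Fraction of targets whose selected pose is within the RMSD cutoff."""
    return sum(1 for value in rmsds if value <= cutoff) / len(rmsds)


# A made-up two-atom ligand shifted by 1 Å along x.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
print(rmsd(docked, crystal), success_rate([0.5, 1.9, 3.0, 2.5]))
```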

In order to compare the virtual screening performance of MSM model ensemble screening with X-ray crystallography and standard AF2 structures, the EF_X%, AUC, and BEDROC were calculated.

The EF_X% is a widely used metric to evaluate virtual screening methods. The enrichment factor quantifies the extent to which active compounds are concentrated in the top X% of ranked compounds relative to the total compound set (Eq. (2)).

$$EF_{X\%}=\frac{\text{Number of actives in the top X\%}\,/\,\text{Number of compounds in the top X\%}}{\text{Total number of actives in the library}\,/\,\text{Total number of compounds in the library}}$$

(2)

We set X as 1, 5, and 10. A random selection of compounds makes the EF value 1.
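Eq. (2) can be illustrated with a short Python sketch. The input is a list of labels sorted from best to worst score, with 1 for an active and 0 for a decoy. How the number of compounds in the top X% is rounded is not stated in the text; rounding up with a minimum of one compound is an assumption here, and the function name and toy ranking are hypothetical.

```python
import math
from typing import Sequence


def enrichment_factor(ranked_labels: Sequence[int], top_percent: float) -> float:
    """EF_X% for labels sorted by score (1 = active, 0 = decoy)."""
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    # Size of the top X% selection (rounded up; an assumption of this sketch).
    n_top = max(1, math.ceil(n_total * top_percent / 100))
    hits = sum(ranked_labels[:n_top])
    return (hits / n_top) / (n_actives / n_total)


# Toy ranking: 100 compounds, 10 actives, 5 of them at the very top.
ranking = [1] * 5 + [0] * 45 + [1] * 5 + [0] * 45
for x in (1, 5, 10):
    print(x, enrichment_factor(ranking, x))
```

In this toy library the active fraction is 10%, so the largest attainable EF is 10, which is reached whenever the selected top fraction contains only actives.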

AUC is one of the most popular measures for discrimination problems. A receiver operating characteristic curve is created by plotting the true positive rate against the false positive rate. In virtual screening, the fraction of active compounds recovered represents the true positive rate, while the fraction of decoy molecules recovered represents the false positive rate. When a program ranks all active compounds ahead of every decoy, AUC reaches its maximum value of 1.0, and an AUC of 0.5 means that the performance is the same as random selection.
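For a strictly ranked list, the area under the ROC curve equals the fraction of (active, decoy) pairs in which the active is ranked ahead of the decoy. The sketch below uses this pair-counting form rather than explicit curve integration. It assumes no tied scores, and the function name is hypothetical.

```python
from typing import Sequence


def roc_auc(ranked_labels: Sequence[int]) -> float:
    """ROC AUC for labels sorted from best to worst score (1 = active, 0 = decoy)."""
    n_actives = sum(ranked_labels)
    n_decoys = len(ranked_labels) - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")
    actives_seen = 0
    pairs_won = 0
    for label in ranked_labels:
        if label:
            actives_seen += 1
        else:
            # Every active already seen is ranked ahead of this decoy.
            pairs_won += actives_seen
    return pairs_won / (n_actives * n_decoys)


print(roc_auc([1, 1, 0, 0]), roc_auc([1, 0, 1, 0]), roc_auc([0, 0, 1, 1]))
```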

Although AUC measures the overall power of SBVS to discriminate between actives and decoys, it does not address the so-called ‘early recognition’ problem²⁶. In virtual screening, only the highly ranked compounds, not all compounds, are passed on to experiments. Thus, it is important to place active molecules at high ranks. To address this issue, the Boltzmann-enhanced discrimination of receiver operating characteristic (BEDROC) puts an exponential weight on the highly ranked active compounds (Eq. (3)).

$$BEDROC=\frac{\sum_{i=1}^{n}e^{-\alpha r_{i}/N}}{R_{a}\left(\frac{1-e^{-\alpha}}{e^{\alpha/N}-1}\right)}\times\frac{R_{a}\sinh\left(\frac{\alpha}{2}\right)}{\cosh\left(\frac{\alpha}{2}\right)-\cosh\left(\frac{\alpha}{2}-\alpha R_{a}\right)}+\frac{1}{1-e^{\alpha\left(1-R_{a}\right)}}$$

(3)

N is the number of compounds, n is the number of active compounds, R_a = n/N is the ratio of active compounds in the library, i is the index of an active compound, and r_i is the rank of active compound i. In this work, the weight, α, is set to 20.
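A minimal Python sketch of the BEDROC calculation is given below. It follows the definition of Truchon and Bayly²⁶, in which the random-expectation term in the first denominator is written with 1 − e^(−α) and the sum runs over the active compounds only. Ranks are 1-based, the input is a list of labels sorted from best to worst score, and the function name and toy rankings are hypothetical.

```python
import math
from typing import Sequence


def bedroc(ranked_labels: Sequence[int], alpha: float = 20.0) -> float:
    """BEDROC for labels sorted by score (1 = active, 0 = decoy)."""
    n_total = len(ranked_labels)
    ranks = [i for i, label in enumerate(ranked_labels, start=1) if label]
    n_actives = len(ranks)
    if n_actives == 0 or n_actives == n_total:
        raise ValueError("need at least one active and one decoy")
    ra = n_actives / n_total
    # Exponentially weighted sum over the ranks of the active compounds.
    observed = sum(math.exp(-alpha * r / n_total) for r in ranks)
    # First denominator: R_a * (1 - e^-alpha) / (e^(alpha/N) - 1).
    expected = ra * (1.0 - math.exp(-alpha)) / (math.exp(alpha / n_total) - 1.0)
    # Scaling and offset terms that bound the value between 0 and 1.
    scale = ra * math.sinh(alpha / 2.0) / (
        math.cosh(alpha / 2.0) - math.cosh(alpha / 2.0 - alpha * ra))
    offset = 1.0 / (1.0 - math.exp(alpha * (1.0 - ra)))
    return observed / expected * scale + offset


best = [1] * 10 + [0] * 90    # all actives ranked first
worst = [0] * 90 + [1] * 10   # all actives ranked last
print(bedroc(best), bedroc(worst))
```

With α = 20, most of the weight falls on the earliest ranks, so a ranking that places all actives first scores close to 1 and one that places them last scores close to 0.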

Data availability

All predicted models by MSM and standard AF2 protocols are available at https://doi.org/10.5281/zenodo.8272608.

References

1. Shin, W. H., Christoffer, C. W., Wang, J. & Kihara, D. PL-PatchSurfer2: Improved local surface matching-based virtual screening method that is tolerant to target and ligand structure variation. J. Chem. Inf. Model. 56(9), 1676–1691 (2016).
2. Bordogna, A., Pandini, A. & Bonati, L. Predicting the accuracy of protein-ligand docking on homology models. J. Comput. Chem. 32(1), 81–98 (2011).
3. Fan, H. et al. Molecular docking screens using comparative models of proteins. J. Chem. Inf. Model. 49(11), 2512–2527 (2009).
4. Santos, R. et al. A comprehensive map of molecular drug targets. Nat. Rev. Drug Discov. 16(1), 19–34 (2017).
5. Gaulton, A. et al. ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40(Database issue), D1100–D1107 (2012).
6. McClendon, C. L., Kornev, A. P., Gilson, M. K. & Taylor, S. S. Dynamic architecture of a protein kinase. Proc. Natl. Acad. Sci. U S A. 111(43), E4623–4631 (2014).
7. Hari, S. B., Merritt, E. A. & Maly, D. J. Sequence determinants of a specific inactive protein kinase conformation. Chem. Biol. 20(6), 806–815 (2013).
8. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature. 596(7873), 583–589 (2021).
9. Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science. 373(6557), 871–876 (2021).
10. Heo, L. & Feig, M. Multi-state modeling of G-protein coupled receptors at experimental accuracy. Proteins. 90(11), 1873–1885 (2022).
11. Modi, V. & Dunbrack, R. L. Jr. Defining a new nomenclature for the structures of active and inactive kinases. Proc. Natl. Acad. Sci. U S A. 116(14), 6818–6827 (2019).
12. Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold3. Nature. 630(8016), 493–500 (2024).
13. Meng, Y., Lin, Y. L. & Roux, B. Computational study of the DFG-flip conformational transition in c-Abl and c-Src tyrosine kinases. J. Phys. Chem. B. 119(4), 1443–1456 (2015).
14. Meng, Y., Pond, M. P. & Roux, B. Tyrosine kinase activation and conformational flexibility: Lessons from src-family tyrosine kinases. Acc. Chem. Res. 50(5), 1193–1201 (2017).
15. Haldane, A., Flynn, W. F., He, P., Vijayan, R. S. & Levy, R. M. Structural propensities of kinase family proteins from a Potts model of residue co-variation. Protein Sci. 25(8), 1378–1384 (2016).
16. Carles, F., Bourg, S., Meyer, C. & Bonnet, P. PKIDB: A curated, annotated and updated database of protein kinase inhibitors in clinical trials. Molecules. 23(4), 908 (2018).
17. Mysinger, M. M., Carchia, M., Irwin, J. J. & Shoichet, B. K. Directory of useful decoys, enhanced (DUD-E): Better ligands and decoys for better benchmarking. J. Med. Chem. 55(14), 6582–6594 (2012).
18. Chen, V. B. et al. MolProbity: All-atom structure validation for macromolecular crystallography. Acta Crystallogr. Sect. D. 66(1), 12–21 (2010).
19. Kanev, G. K., de Graaf, C., Westerman, B. A., de Esch, I. J. P. & Kooistra, A. J. KLIFS: An overhaul after the first 5 years of supporting kinase research. Nucleic Acids Res. 49(D1), D562–D569 (2021).
20. Zhang, Y. & Skolnick, J. Scoring function for automated assessment of protein structure template quality. Proteins. 57(4), 702–710 (2004).
21. Santos-Martins, D. et al. Accelerating AutoDock4 with GPUs and gradient-based local search. J. Chem. Theory Comput. 17(2), 1060–1073 (2021).
22. Shin, W. H., Kim, J. K., Kim, D. S. & Seok, C. GalaxyDock2: Protein-ligand docking using beta-complex and global optimization. J. Comput. Chem. 34(30), 2647–2656 (2013).
23. Trott, O. & Olson, A. J. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31(2), 455–461 (2010).
24. Lee, J. & Seok, C. A statistical rescoring scheme for protein-ligand docking: Consideration of entropic effect. Proteins. 70(3), 1074–1083 (2008).
25. Adasme, M. F. et al. PLIP 2021: Expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucleic Acids Res. 49(W1), W530–534 (2021).
26. Truchon, J. F. & Bayly, C. I. Evaluating virtual screening methods: Good and bad metrics for the early recognition problem. J. Chem. Inf. Model. 47(2), 488–508 (2007).
27. Landrum, G. RDKit: Open-source cheminformatics (2006). Accessed 2022.
28. Butina, D. Unsupervised data base clustering based on daylight’s fingerprint and Tanimoto similarity: A fast and automated way to cluster small and large data set. J. Chem. Inf. Comput. Sci. 39(4), 747–750 (1999).
29. Irwin, J. J. & Shoichet, B. K. ZINC–a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 45(1), 177–182 (2005).
30. Steichen, J. M. et al. Structural basis for the regulation of protein kinase A by activation loop phosphorylation. J. Biol. Chem. 287(18), 14672–14680 (2012).
31. Mirdita, M., Steinegger, M. & Soding, J. MMseqs2 desktop and local web server app for fast, interactive sequence searches. Bioinformatics. 35(16), 2856–2858 (2019).
32. Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: Recent developments in Phenix. Acta Cryst. D. 75(10), 861–877 (2019).
33. Morris, G. M. et al. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J. Comput. Chem. 30(16), 2785–2791 (2009).
34. Forli, S. Meeko. https://github.com/forlilab/Meeko (2023).
35. Schrodinger, L. L. C. The PyMOL molecular graphics system, Version 1.8. (2015).

Acknowledgements

All authors acknowledge the support from the Bio & Medical Technology Development Program of the National Research Foundation (NRF) funded by the Korean government (No. 2022M3E5F3081268). WHS also acknowledges support from Ministry of Science and ICT (MSIT), Korea, under the ICAN (ICT Challenge and Advanced Network of HRD) support program (IITP-2024-RS-2022-00156439) supervised by the Institute for Information & Communications Technology Planning & Evaluation (IITP) and Korea University Grant (No. K2327351). JL also acknowledges the support from NRF Grants funded by MSIT (Nos. 2022R1C1C1005080 and 2020M3A9G7103933) and Korea Environment Industry & Technology Institute (KEITI) through “Advanced Technology Development Project for Predicting and Preventing Chemical Accidents” Program, funded by Korea Ministry of Environment (MOE) (RS-2023-00219144).

Author information

These authors contributed equally to this work.

Authors and Affiliations

College of Pharmacy, Seoul National University, Seoul, Republic of Korea
Jinung Song & Juyong Lee
Arontier Co., Seoul, Republic of Korea
Junsu Ha, Juyong Lee, Junsu Ko & Woong-Hee Shin
Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul, Republic of Korea
Juyong Lee
Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
Woong-Hee Shin

Contributions

WHS, JL, and JK conceived the study. JS prepared the benchmark set, performed standard AlphaFold2, AlphaFold3, and AlphaFold2 with multi-state modeling, and analyzed the results. JH ran AutoDock-GPU to do cognate docking and screen the DUD-E molecules. WHS, JS, and JH drafted the manuscript. WHS edited and finalized the paper. All authors have reviewed the manuscript.

Corresponding author

Correspondence to Woong-Hee Shin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Song, J., Ha, J., Lee, J. et al. Improving docking and virtual screening performance using AlphaFold2 multi-state modeling for kinases. Sci Rep 14, 25167 (2024). https://doi.org/10.1038/s41598-024-75400-6

Received: 13 August 2024
Accepted: 04 October 2024
Published: 24 October 2024
DOI: https://doi.org/10.1038/s41598-024-75400-6
