Abstract
Increasing the binding affinity of an antibody to its target antigen is a crucial task in antibody therapeutics development. This paper presents a pretrainable geometric graph neural network, GearBind, and explores its potential in in silico affinity maturation. Leveraging multi-relational graph construction, multi-level geometric message passing and contrastive pretraining on mass-scale, unlabeled protein structural data, GearBind outperforms previous state-of-the-art approaches on SKEMPI and an independent test set. A powerful ensemble model based on GearBind is then derived and used to successfully enhance the binding of two antibodies with distinct formats and target antigens. ELISA EC50 values of the designed antibody mutants are decreased by up to 17-fold, and KD values by up to 6.1-fold. These promising results underscore the utility of geometric deep learning and effective pretraining in macromolecule interaction modeling tasks.
Introduction
Antibodies play a crucial role in the human immune system and serve as powerful diagnostic and therapeutic tools, due to their ability to bind selectively and specifically to target antigens with high affinity. In vivo, antibodies go through affinity maturation, where the target-binding affinity gradually increases as a result of somatic hypermutation and clonal selection1. When a new antigen surfaces, therapeutic antibody leads repurposed from known antibodies or screened from a natural or de novo designed library often require in vitro affinity maturation to enhance their binding affinity to a desired, usually sub-nanomolar, level.
Wet lab experimental methods for in vitro antibody affinity maturation usually involve constructing mutant libraries and screening with display technology2,3,4,5. These methods, while significantly improved over the past few years, remain labor-intensive and costly in general, taking 2–3 months or more to complete. Consider the combinatorial search space of possible mutations. There are usually 50–60 residues in the complementarity-determining regions (CDRs) of an antibody, which are hypervariable in vivo and contribute the majority of the binding free energy ΔGbind6. Previous works show that multiple point mutations are often needed for successful affinity maturation7,8. Performing experiments on all of the over one thousand possible point mutations in antibody CDR regions (60 residues × 19 possible substitutions per residue), let alone their combinations, is difficult if not prohibitive. Therefore, a fast and accurate computational method for narrowing down the search space is much desired.
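The size of this single-point search space is easy to verify with a short sketch. The CDR sequence and numbering below are toy examples, not taken from any antibody in this study:

```python
# Sketch: enumerate the single-point mutation space of an antibody CDR.
# The CDR residues used here are a hypothetical, illustrative example.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def enumerate_point_mutations(cdr_residues):
    """Yield (position, wild_type, mutant) for every substitution.

    cdr_residues: list of (position, wild_type_aa) tuples.
    Each site admits 19 substitutions (any residue except the wild type).
    """
    for pos, wt in cdr_residues:
        for mt in AMINO_ACIDS:
            if mt != wt:
                yield (pos, wt, mt)

# A 60-residue CDR region yields 60 x 19 = 1140 candidate point mutations,
# far too many to test exhaustively in the wet lab.
cdr = [(i, "A") for i in range(1, 61)]  # toy 60-residue region, all Ala
mutations = list(enumerate_point_mutations(cdr))
print(len(mutations))  # 1140
```

Combinations of these point mutations grow multiplicatively, which is why a fast computational filter is needed before synthesis.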
Nevertheless, it is nontrivial for computational affinity maturation methods to balance speed and accuracy. Molecular dynamics methods based on empirical force fields9,10,11,12 rely on human knowledge and abstractions to evaluate binding free energy changes after mutations. However, accurate models are often too slow to be used for ranking thousands of mutations (let alone their combinations). In recent years, machine learning, and particularly deep learning, has been demonstrated as a powerful tool capable of tackling this dilemma. Many machine-learning methods13,14,15,16,17,18 formulate the affinity maturation problem as a structure-based binding free energy change (\(\Delta \Delta {G}_{{{{\rm{bind}}}}}:=\Delta {G}_{{{{\rm{bind}}}}}^{({{{\rm{mt}}}})}-\Delta {G}_{{{{\rm{bind}}}}}^{({{{\rm{wt}}}})}\), where wt is short for wild type and mt denotes mutant) prediction problem. However, despite the importance of protein side-chain conformation to protein–protein interaction, most existing methods model atom-level geometric information indirectly or incompletely, e.g., using hand-crafted features or residue-level features. These approaches inadequately address the intricate interplay between side-chain atoms. Another critical problem is the massive amount of paired binding affinity data required by machine-learning models for them to become accurate and reliable. To the best of our knowledge, the largest publicly available protein–protein binding free energy change dataset, Structural Kinetic and Energetic database of Mutant Protein Interactions (SKEMPI) v2.019, contains only 7085 ΔΔGbind measurements on 348 protein complexes, a tiny amount compared to the training set sizes of foundational protein models, such as AlphaFold220 and ESM221.
To tackle the aforementioned challenges, we introduce GearBind, a pretrainable deep neural network that leverages multi-level geometric message passing to model the nuanced protein–protein interactions. We utilize contrastive pretraining techniques on large-scale protein structural dataset to incorporate vital structural insights into the model (Fig. 1). In silico experiments on SKEMPI and an independent test set demonstrate the superior performance of GearBind and the benefit of pretraining. We combine the GearBind models with previous state-of-the-art methods to create an ensemble model that achieves state-of-the-art performance on all metrics. The ablation study confirms the importance of key design choices within GearBind and the key role it played in the ensemble. We then use the GearBind-based ensemble to perform in silico affinity maturation for two antibodies with distinct formats and target antigens. Binding of the antibody CR3022 against the spike (S) protein of the Omicron SARS-CoV-2 variant is increased by up to 17-fold as measured by Enzyme-linked immunosorbent assay (ELISA), and by 6.1-fold as measured by Bio-layer Interferometry (BLI), after synthesizing and testing only 20 antibody candidates. All designed antibodies have maintained or increased binding towards the receptor-binding domains (RBDs) of both the SARS-CoV-2 Delta variant and SARS-CoV. Binding of the fully human single-___domain antibody (UdAb) against the oncofetal antigen 5T4 is increased by up to 5.6-fold as measured by ELISA, and by up to 2.1-fold as measured by BLI, after testing 12 candidates. These results underscore the importance of geometric deep learning and effective pretraining on antibody affinity maturation and, more generally, macromolecule interaction modeling.
a Pipeline Overview: The pipeline features the geometric neural encoder, GearBind, which undergoes self-supervised pretraining on CATH and supervised learning on SKEMPI v2. The GearBind-based ensemble model is employed to perform in silico affinity maturation on a target antibody, given its bound structure to the native antigen. Guided by the model predictions, antibodies with improved binding affinity can be found after testing one or two dozen mutant candidates. NN neural network, Ab Antibody, CDR Complementarity-determining region. Designed using resources from Flaticon.com. b GearBind Model: GearBind employs a shared graph neural network to encode both the wild-type and mutant complex structures. For each structure, a relational interface graph is constructed. A geometric graph neural network, GearNet, then performs multi-relational and multi-level message passing on the graph to extract rich interface representations. The mutational effect ΔΔGbind is predicted by an antisymmetric predictor given the GearNet-extracted representations of the two complexes. c Self-supervised Pretraining: GearBind+P leverages mass-scale unlabeled protein structures via self-supervised, contrastive pretraining. The model is trained to contrast between native structures and randomly mutated structures with side-chain torsion angles sampled from a rotamer library. Pretraining helps GearBind+P explore the energy landscape of native protein structures and results in improved performance in ΔΔGbind prediction.
Results
GearBind: a pretrainable ΔΔGbind predictor
The GearBind framework is designed to extract geometric representations from wild-type and mutant structures via multi-level and multi-relational message passing to predict the binding free energy change ΔΔGbind. GearBind leverages information within a protein complex at three different levels with complementary insights, namely, atom-level information holding precise spatial and chemical characteristics, edge-level information capturing angular relationships, and residue-level information highlighting broader context within the protein complex. Merging these distinct yet interconnected tiers of information allows for a more holistic view of protein complexes, potentially enhancing model capabilities.
More formally, when a protein complex structure is input to GearBind, a multi-relational interface atom graph is first constructed to model the detailed interactions within the complex. The relations defined cover both sequential proximity (for atoms on the same chain) and spatial proximity (which includes k-nearest-neighbor and within-r-radius relations). Atom-level representations are obtained by applying a geometric relational graph neural network (GearNet22) on the interface graph. On top of that, a line graph is constructed by treating each edge in the atom graph as a line node, connecting adjacent line nodes, and encoding the angular information as line edge features. Edge-level interactions are then captured by performing message passing on the line graph, similar to a sparse version of AlphaFold’s triangle attention20. Finally, after aggregating atom and edge representations for each residue, a geometric graph attention layer is applied to pass messages between residues. This multi-level message-passing scheme injects multi-granularity structural information into the learned representations, making it highly useful for the task of ΔΔGbind prediction.
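The line-graph step above can be illustrated with a minimal numpy sketch. The function and variable names are illustrative, not GearBind's actual API; only the idea (each atom-graph edge becomes a line node, and shared endpoints define line edges carrying angular features) follows the text:

```python
import numpy as np

# Sketch: build a line graph from an atom graph, as in edge-level message
# passing. Names and data are illustrative, not GearBind's implementation.

def build_line_graph(edges, coords):
    """Each atom-graph edge becomes a line-graph node; two line nodes are
    connected when the edges share an endpoint, and the angle between the
    two edges is stored as the line-edge feature."""
    line_edges, angles = [], []
    for a, (i, j) in enumerate(edges):
        for b, (k, l) in enumerate(edges):
            if a == b:
                continue
            if j == k:  # edges i->j and j->l meet at atom j
                u = coords[i] - coords[j]
                v = coords[l] - coords[j]
                cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
                line_edges.append((a, b))
                angles.append(float(np.arccos(np.clip(cos, -1.0, 1.0))))
    return line_edges, angles

coords = np.array([[0.0, 0, 0], [1.0, 0, 0], [1.0, 1.0, 0]])
edges = [(0, 1), (1, 2)]
le, ang = build_line_graph(edges, coords)
print(le, [round(a, 3) for a in ang])  # one line edge with a 90-degree angle
```

Message passing on this line graph then propagates information between pairs of edges, a sparse analogue of AlphaFold's triangle updates.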
Although GearBind can be trained from scratch on labeled ΔΔGbind datasets, it could suffer from overfitting or poor generalization if the training data size is limited. To address this problem, we propose a self-supervised pretraining task to exploit large-scale unlabeled protein structures in CATH22,23. In the pretraining stage, the encoder is trained to model the distribution of the native protein structures via noise contrastive estimation24. Specifically, we maximize the probability (i.e., push down the energy) of native CATH proteins while minimizing the probability of mutant structures (Fig. 1c) generated by randomly mutating residues and sampling side-chain torsional angles from a rotamer library25. Distinguishing native, stable protein structures from sampled mutant structures pushes the model towards understanding side-chain interaction patterns, which are crucial to protein–protein binding. Through this process, meaningful knowledge from abundant single-chain protein structural data could be transferred to benefit protein–protein binding modeling.
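The contrastive objective can be sketched in an energy-based view: the encoder assigns a scalar energy to a structure, and the loss classifies the native structure against its sampled decoys. The code below is a minimal illustration with toy energies standing in for the neural network's output; it is not the paper's exact loss:

```python
import numpy as np

# Sketch of noise contrastive pretraining: minimize a cross-entropy that
# treats the native structure as the positive class among mutant decoys.
# `native_energy` / `decoy_energies` stand in for the encoder's outputs.

def nce_loss(native_energy, decoy_energies):
    """Softmax cross-entropy over {native} + decoys.

    Minimizing this loss pushes down the native energy (raising its
    probability) and pushes up the decoy energies.
    """
    logits = np.array([-native_energy] + [-e for e in decoy_energies])
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -float(np.log(probs[0]))

# A native structure with lower energy than its decoys yields a small loss.
loss = nce_loss(-2.0, [1.0, 0.5, 1.5])
```

In pretraining, the decoys are generated by mutating residues and sampling side-chain torsions from a rotamer library, so minimizing this type of loss teaches the encoder to recognize native side-chain packing.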
Cross-validation on SKEMPI
We validated GearBind performance via a split-by-complex, fivefold cross-validation on SKEMPI v2.0. Our splitting strategy dictates that each test set share no common PDB complex with its corresponding training set, making it more realistic than the split-by-mutation strategy, where the wild-type protein complexes and even the mutation sites in the test set could appear during training. We compared GearBind and GearBind+P (pretrained GearBind fine-tuned on SKEMPI) to the state-of-the-art physics-based tools FoldX9 and Flex-ddG10, and the deep-learning method Bind-ddG8. The results (Table 1) show that GearBind, with its multi-relational graph construction and multi-level message-passing schemes, outperforms the baselines in terms of mean absolute error (MAE), root mean squared error (RMSE), and PearsonR, while ranking second only to FoldX in terms of SpearmanR. Pretraining GearBind brings further performance improvements, yielding +5.4% SpearmanR, +2.6% PearsonR, −2.4% MAE and −1.7% RMSE. This highlights the effective knowledge transfer from mass-scale, unlabeled protein structural data.
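The four evaluation metrics can be computed with numpy alone, as in the sketch below (scipy's `pearsonr` and `spearmanr` offer equivalent routines); the example arrays are toy values, not SKEMPI data:

```python
import numpy as np

# Sketch of the four cross-validation metrics: MAE, RMSE, PearsonR, SpearmanR.
# Toy prediction/label arrays; no ties assumed in the rank computation.

def pearson(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def spearman(x, y):
    # Spearman correlation = Pearson correlation of the ranks
    rank = lambda a: np.argsort(np.argsort(np.asarray(a))).astype(float)
    return pearson(rank(x), rank(y))

def mae(pred, true):
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(true))))

def rmse(pred, true):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(true)) ** 2)))

true = [0.5, -1.2, 2.0, 0.1]   # toy experimental ddG values (kcal/mol)
pred = [0.7, -0.9, 1.5, 0.0]   # toy model predictions
scores = (mae(pred, true), rmse(pred, true),
          pearson(pred, true), spearman(pred, true))
```

Lower is better for MAE and RMSE; higher is better for PearsonR and SpearmanR.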
To understand the contributions of key architectural design choices in GearBind, we benchmarked the performance of five GearBind variants on SKEMPI. As shown in Fig. 2e, the tested GearBind variants perform worse than GearBind on all four metrics. The exclusion of edge- and residue-level message passing from GearBind brings 13% and 3% SpearmanR drops, respectively, highlighting the benefits of combining multi-level information during feature extraction. The exclusion of side-chain atoms from the interface graph hurts performance even more (15% SpearmanR drop), showing the importance of explicitly modeling the full-atom structure. Notably, replacing the multi-relational interface graph with a KNN graph results in a severe 23% SpearmanR decline, while training a simple RGCN model on the multi-relational graphs results in performance on par with Bind-ddG (−9% SpearmanR compared to GearBind, +2% compared to Bind-ddG). This suggests that the multi-relational graph construction strategy is a key ingredient in GearBind.
A GearBind-based ensemble for in silico affinity maturation
To understand the behavior of the benchmarked models on SKEMPI, we binned the SKEMPI dataset by target difficulty and plotted the PearsonR and SpearmanR of each model on targets with different difficulty levels. PDB codes in SKEMPI are categorized into "easy" (50+ similar data points in the training set), "medium" (1–50), and "hard" (0) targets based on the number of training data points having a high structural similarity (TM-score26 >0.8) to them. Deep-learning models, namely Bind-ddG, GearBind, and GearBind+P, enjoy superior performance compared to physics-based methods such as FoldX and Flex-ddG on easy targets, but the tables turn when we move to the hard targets (Fig. 2a, b), showing room for improvement in their generalization capabilities. We also studied the performance on mutations that cause low (<0.5 kcal/mol), medium (0.5–2 kcal/mol), and high (>2 kcal/mol) absolute changes in binding free energy. As Supplementary Fig. S9 shows, all models perform better when the binding level changes more drastically. GearBind achieves superior performance in this region, with a PearsonR value of 0.707, compared to FoldX's 0.411, showing its potential to identify mutations that could significantly enhance or disrupt binding. When ∣ΔΔGbind∣ is small, predictions from all methods have very low correlation with experimental ΔΔGbind values, hinting at either noise in the data or a deficiency of current tools in modeling weaker, more intricate interactions.
Comparative analysis of per-PDB Spearman (a) and Pearson (b) correlations between predictions of various models and experimental data across SKEMPI subsets with varying difficulty levels. PDB codes in SKEMPI are categorized into "easy" (50+ similar data points in the training set), "medium" (1–50), and "hard" (0) targets based on the number of training data points having a high structural similarity (TM-score >0.8) to them. The number of PDB codes for each difficulty is annotated in the figure legends. The box spans the inter-quartile range (25th to 75th percentile), with a solid line inside marking the median. Outliers are determined by 1× inter-quartile range. c Benchmark results on the HER2 binders test set (n = 419) show Pearson and Spearman correlations for various models. The deep-learning models are trained on SKEMPI. d Change of performance metrics in the HER2 binders test set when excluding various models from the FoldX + Bind-ddG + GearBind(+P) ensemble. e Change of performance when changing the GearBind architecture design, as quantified by cross-validation performance on SKEMPI (n = 5729). f Change of performance on SKEMPI (n = 5729) when excluding different models from the FoldX + Flex-ddG + Bind-ddG + GearBind(+P) ensemble.
To combine the advantages of both physics-based and deep-learning methods, we used the ensemble of all benchmarked methods to perform subsequent in silico affinity maturation. The prediction of the ensemble model is the simple average of prediction values from FoldX, Flex-ddG, GearBind, GearBind+P, and Bind-ddG. The proposed ensemble model outperforms each individual model in all four evaluation metrics (Table 1). We evaluated the contribution of individual models to the ensemble by excluding each of them and evaluating performance on SKEMPI. The results (Fig. 2f) show that excluding GearBind and GearBind+P hurts overall performance the most. Specifically, for the PearsonR metric, the exclusion of FoldX, Flex-ddG and Bind-ddG individually results in a marginal (less than 0.01) decrease, but the removal of GearBind causes a significant (more than 0.08) decline. We also note that, while FoldX is not the best-performing model when used in isolation, removing it from the ensemble results in the biggest SpearmanR drop. This shows that FoldX plays an important role in complementing the deep-learning models and forging a robust and accurate ensemble model. In fact, combining GearBind, GearBind+P and FoldX yields comparable performance to the 5-model ensemble (Supplementary Fig. S26).
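The ensemble and its leave-one-out ablation amount to simple averaging, as sketched below. The model names mirror the text, but the prediction arrays are toy values:

```python
import numpy as np

# Sketch: the ensemble ddG prediction is the simple average of the component
# models; leaving one model out quantifies its contribution. Toy values only.

preds = {
    "FoldX":      np.array([0.9, -0.3, 1.8]),
    "Flex-ddG":   np.array([1.1, -0.1, 1.5]),
    "Bind-ddG":   np.array([0.8, -0.4, 2.1]),
    "GearBind":   np.array([1.0, -0.2, 1.9]),
    "GearBind+P": np.array([1.2, -0.3, 2.0]),
}

def ensemble(predictions, exclude=None):
    """Average predictions, optionally dropping one model (for ablation)."""
    kept = [p for name, p in predictions.items() if name != exclude]
    return np.mean(kept, axis=0)

full = ensemble(preds)
# Leave-one-out ablation: compare each reduced ensemble against experiment
# to measure how much the excluded model contributed.
ablation = {name: ensemble(preds, exclude=name) for name in preds}
```

Because the component models are weighted equally, no ensemble weights need to be fit on held-out data, which reduces the risk of overfitting the small ΔΔGbind datasets.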
Evaluation on the HER2 binders test set
With the models built and trained (on SKEMPI), we tested their performance on the HER2 binders test set, which we collected from Shanehsazzadeh et al.27. This dataset contains high-quality binding affinity data, measured by surface plasmon resonance (SPR) on 419 HER2 binders with de novo designed CDR loops. The antibodies in the dataset are variants of Trastuzumab that have high edit distance (7.6 on average), making them potentially challenging for ΔΔGbind predictors trained on low-edit-distance data. Among the benchmarked methods (Flex-ddG is not benchmarked due to its high time cost), GearBind+P achieved the best PearsonR and SpearmanR (Fig. 2c). We then averaged the predictions of all benchmarked models to form an ensemble model, and excluded each model from the ensemble to measure the performance change. Similarly, excluding GearBind(+P) hurts PearsonR the most, and excluding FoldX hurts SpearmanR the most, with GearBind(+P) closely following (Fig. 2d).
Affinity maturation of CR3022 and anti-5T4 UdAb
To validate the efficacy of our methodology, two antibodies, CR3022 and anti-5T4 UdAb, were selected as subjects for affinity maturation. The CR3022 antibody, originally isolated from a convalescent SARS patient28, has been subsequently identified to bind to SARS-CoV-229,30. Meanwhile, a UdAb directed against the oncofetal antigen 5T4, is characterized by its exceptional stability31. Note that the two antibodies are in distinct formats and target distinct antigens. Both antigens have only one structurally similar protein chain (TM-score >0.8) with a different binding site in SKEMPI, making them challenging targets for our pipeline (Supplementary Tables S9 and S10 and Supplementary Figs. S14 and S15).
For CR3022, a total of 12 mutants were picked out in the first-round experimental validation according to the ensemble prediction of their binding affinity changes against the RBDs of the wild-type, BA.1.1, and BA.4 SARS-CoV-2 strains. We note that the wild-type and Delta RBDs share the same amino acids at the interface to CR3022. In an ELISA pre-experiment, we tested the binding of these mutants to the RBD of the SARS-CoV-2 Delta strain at an antigen concentration of 100 nM. Nine out of twelve candidates exhibited improved binding compared to the wild-type CR3022 (Supplementary Fig. S16a). In further validation with the RBD concentration reduced to 10 nM, the EC50 values for all nine candidates were lower than that of the wild-type CR3022 (Fig. 3a, b). Based on these results, we combined the best-performing CR3022 mutations and synthesized 8 candidates with double or triple mutations as our second-round designs. Seven of the eight designed multi-point mutants exhibited enhanced binding against the Delta RBD, with 1.8–3.4-fold lower ELISA EC50 compared to the wild-type. The triple mutant SH100D+SH103Y+SL33R has the lowest EC50 at 0.06 nM (Fig. 3c, d). Against the Omicron Spike protein, the above seven multi-point mutants again displayed a 7.6–17.0-fold binding increase with sub-nanomolar EC50 values, among which the SH100D+SH103Y+SL33R triple mutant still exhibited the best performance (Fig. 3e, f). We next tested the binding of the mutants designed in the second round against the RBD of SARS-CoV, to examine whether the binding optimization of CR3022 against SARS-CoV-2 RBDs caused a drastic change in binding to its original target. Seven out of eight mutants did not exhibit a significant change in ELISA EC50 against the SARS-CoV RBD (Fig. 3g, h). In summary, the above results demonstrate the success of our GearBind-based pipeline in CR3022 antibody affinity optimization.
On the left panels (a, c, e, g, i), the concentration-response curves alongside the EC50 values evaluated from ELISA assays are displayed, with the center denoting the mean absorbance and error bars indicating the standard deviation from three technical duplicates. On the right panels (b, d, f, h, j), the fitted EC50 values, their 95% confidence intervals and the fold changes in binding calculated as \({{{{\rm{EC}}}}}_{50}^{({{{\rm{wt}}}})}/{{{{\rm{EC}}}}}_{50}^{({{{\rm{mt}}}})}\) are displayed. Tested systems include the first-round CR3022 designs binding to the Delta RBD (a, b); the second-round CR3022 designs binding to the Delta RBD (c, d), Omicron S protein (e, f) and SARS-CoV RBD (g, h); and single-point mutants of anti-5T4 UdAb (i, j) binding to 5T4.
To demonstrate the generalizability of our method, we extended our experimental validation to the anti-5T4 UdAb. We developed 12 single-point mutants using the GearBind-based pipeline and verified their binding to 5T4 with ELISA. The highest binding UdAb mutant was S57W, with a 5.6-fold decrease in EC50. This highlights the potential of our approach in enhancing antibody affinity for antibodies of different formats and with different targets, making it a promising tool for therapeutic antibody development.
We further validated the affinity-matured CR3022 antibodies and anti-5T4 UdAbs using Bio-layer Interferometry (BLI) to assess their binding affinities more accurately (Fig. 4). The 7 tested CR3022 mutants show a 3.1–6.1-fold increase in binding affinity against the Omicron Spike protein, with the best-performing mutant being SH100D+SH103Y+IL34W. The SH100D+SH103Y+SL33R triple mutant, identified by ELISA as the best-performing mutant, exhibits a 4.1-fold increase in binding affinity. The two tested anti-5T4 UdAb mutants, S54Y and S57W, exhibited a 1.8-fold and 2.1-fold improvement in binding affinity, respectively. Overall, the BLI measurements are consistent with the ELISA binding assay results and demonstrate the increased binding affinity of CR3022 and UdAb variants designed by the GearBind-based pipeline.
a, d Illustration of the experiment protocol with (a) Ni-NTA and d SA biosensors. b, e The binding kinetics of different CR3022 (b) and anti-5T4 UdAbs (e) candidates. The starting concentration for each antibody was 300 nM. The data were determined by fitting curves to a global 1:1 binding model. The KD (equilibrium dissociation constant) values, annotated on each plot, were determined with R2 values of greater than 98% confidence level. c, f The kon (association rate constant), koff (dissociation rate constant), and KD values of mutant and wild-type CR3022 antibodies (c) and anti-5T4 UdAbs (f).
Structural characteristics of optimized antibodies
Understanding the sequence-structure-function relationship of mutations designed by deep learning not only helps improve our models but also aids in interpreting their biological significance. To explore the underlying principles governing the increased antibody–antigen binding, we carried out molecular dynamics simulations and structural analyses on both the wild-type and mutant antibodies with the lowest ELISA EC50, namely, the SH100D+SH103Y+SL33R triple mutant for CR3022, and the S57W mutant for the anti-5T4 UdAb. We conducted a 1 μs all-atom molecular dynamics simulation at room temperature for each system, including their respective wild-type counterparts (see “Methods” for details).
Based on the simulation results, among the four mutations studied, three demonstrated an increased number of hydrogen bonds with the target: SH100D and SL33R from CR3022, and S57W from UdAb (Fig. 5g, h). These four mutations in CR3022 and UdAb also stabilized the antibodies, as shown by the reduced fluctuations in Cα atoms in the corresponding antibody chains (Supplementary Fig. S20). Although the S103Y mutation in the heavy chain of CR3022 did not increase polar contacts, it potentially enhanced hydrophobic interactions between the antibody and antigen by excluding more solvent due to the larger size of tyrosine. In summary, the mutations designed by our pipeline likely achieved higher binding affinity through the formation of new interactions, while we also observed stabilized binding residues and altered structural properties in the mutated structures.
a Complex structure of antibody CR3022 and the SARS-CoV-2 RBD used for affinity maturation. The target antigen is colored in red, and the antibody in blue. Mutation sites S100D, S103Y in heavy chain and S33R in light chain are marked in orange. b Complex structure of the single-___domain antibody UdAb and its target oncofetal antigen 5T4. Mutation site S57W is marked in orange. c–e Three mutation sites in the CR3022 triple mutant, namely S100D (c), S103Y (d) in heavy chain, and S33R (e) in light chain. f The S57W mutation site in the UdAb single-point mutant. g Depicts the number of hydrogen bonds between the mutation sites and target antigen in the molecular dynamics simulation of CR3022 and RBD complex. h Depicts the number of hydrogen bonds between mutation sites and target antigen in the molecular dynamics simulation of UdAb and 5T4 complex. g, h Hydrogen bond distributions from wild types are colored in orange, and from mutants are colored in green.
Interestingly, the GearBind-predicted contribution (Supplementary Fig. S24) provided further insight into the formation of potential contacts for these designed mutations. Most contributions were found to be consistent with our deductions based on the molecular dynamics simulations. The possible hydrophobic interaction in S103Y was also reflected in the contributions, providing further validation to our findings and aligning well with our deductions based on protein structure (Supplementary Fig. S24c).
Discussion
This study reports an in silico antibody maturation pipeline based on a pretrainable geometric graph neural network, GearBind, and the successful application of the pipeline on two distinct antibodies, CR3022 and anti-5T4 UdAb. Substantial in silico experiments were done to evaluate model performance and understand their strengths and limitations. The technical strengths of the proposed GearBind model can be summarized as follows: (1) In the graph construction phase, a multi-relational graph is built upon all heavy atoms on the interface. The relations defined cover both sequential proximity and spatial proximity. Replacing the all-atom graph with a backbone-atom-only graph, or the multi-relational graph with a simple kNN graph, both cause severe performance declines. (2) In the feature extraction phase, a multi-level message-passing scheme is employed to obtain a comprehensive view of the intricate interactions at protein interfaces. (3) A pretraining algorithm based on contrastive learning is proposed. It harnesses the abundant, unlabeled single-chain protein structures in CATH, distilling knowledge about side-chain torsion angles into the model to further boost its performance.
We challenged our GearBind-based pipeline with two real-world antibody affinity maturation projects. ELISA binding assays showed that CR3022 mutations proposed by the pipeline have led to successful enhancements in binding against both the Delta RBD and the Omicron Spike protein. Notably, 7 out of 10 CR3022 single-point mutants and 9 out of 10 multi-point mutants showed a significant increase in binding, with up to 3.8-fold decrease in ELISA EC50 for the Delta RBD and 17.0-fold for the Omicron Spike protein. Among 12 single-point anti-5T4 UdAb mutants, our pipeline achieved a maximum decrease of 5.6 times in ELISA EC50 and 2.1 times in BLI-measured KD. In short, GearBind proves to be an efficient and powerful tool for the design of antibodies with enhanced binding affinities. Based on the molecular dynamics simulations of the top-performing mutants identified by the GearBind-based pipeline, we observed that our designs enhanced binding affinity by creating new interactions or strengthening existing contacts, particularly hydrogen bonds. This provides insight into how GearBind learns from data and designs mutations that increase binding affinity.
While we mainly focus on structure-based methods in this work, others have explored purely sequence-based models for affinity maturation32. Our evaluation of ESM-1b and ESM-1v models on SKEMPI (Supplementary Table S4 and Supplementary Fig. S12) results in negative SpearmanR values, hinting that zero-shot prediction of large-scale protein language models is not a generally reliable method for ranking the binding affinity of protein complexes33. This result is reasonable because the “fitness” of a peptide sequence, as modeled by protein language models, does not necessarily imply strong binding to all other biomolecules. For example, improved fitness of the Spike protein of SARS-CoV-2 would likely involve decreased binding affinity towards existing neutralizing antibodies. Another argument is that structural information plays a key role in building an accurate and reliable algorithm for protein–protein interactions34.
Looking ahead, the potential applications of GearBind reach beyond protein–protein binding optimization. The model can be readily adapted to tackle protein–peptide and protein–ligand docking challenges, thereby opening up possibilities for its use in minibinder and enzyme design.
Despite these positive prospects, we acknowledge certain limitations in our current methodology and discuss potential directions for future work. First, the prerequisite for structure-based ΔΔGbind prediction is an accurate complex structure, which is not easily available for most antibody–antigen pairs. To address this problem, homology modeling tools35 can be used to build a complex structure from a template structure. This is how we built the complex structure of CR3022 binding to the Omicron RBDs. A more aggressive approach is to directly predict the complex structure from the sequence. As multimer structure prediction methods become more and more accurate36, they might one day become reliable starting points for structure-based affinity maturation. Second, the reliance on external tools for mutant structure generation increases the time cost, and limits our action space to substitutions only. Future efforts can focus on training end-to-end models that directly predict the ΔΔGbind, and models that can account for amino acid insertion and deletion. We also call for better pretraining strategies and architecture designs to improve the generalization capabilities of deep-learning models, making them more robust on proteins they have not seen before. All in all, we believe our work takes a solid step towards building a reliable, robust, and efficient in silico affinity maturation pipeline that would bring tremendous opportunities to research and drug discovery applications.
Methods
Datasets
SKEMPI
We used the SKEMPI v2 dataset19 for training and validation. The dataset contains 7085 ΔΔGbind measurements on 348 complexes. We performed preprocessing following previous works13,18, discarding data with ambiguous ΔΔGbind values (e.g., the mutant is a nonbinder without exact KD measurements) or high ΔΔGbind variance across multiple measurements (>1 kcal/mol), and arrived at 5729 distinct mutations, with their ΔΔGbind measurements, on 340 complexes. See Supplementary Table S6 for a list of discarded, high-variance SKEMPI mutations. For each mutation, we sampled the mutant structure with FoldX 4 (ref. 9) based on the wild-type crystal structure. We used PDBFixer v1.8 (ref. 37) to fix the PDB structures beforehand if the raw structure could not be processed by FoldX. These FoldX-derived SKEMPI structures are used to train deep-learning models, including Bind-ddG, GearBind, and GearBind+P. The resulting dataset was split into five subsets of roughly the same size using a split-by-PDB strategy, in order to perform cross-validation.
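The filtering step can be sketched as follows; the record layout and mutation identifiers are illustrative, not SKEMPI's actual schema:

```python
import numpy as np

# Sketch of the SKEMPI preprocessing: discard mutations whose ddG is
# ambiguous (e.g. missing exact KD) or whose measurements vary by more
# than 1 kcal/mol across replicates. Record layout is a toy example.

def filter_skempi(records, max_range=1.0):
    """records: {mutation_id: list of ddG measurements (kcal/mol) or None}.
    Returns {mutation_id: mean ddG} for mutations passing both filters."""
    kept = {}
    for mut, values in records.items():
        if not values or any(v is None for v in values):
            continue  # ambiguous: nonbinder / no exact KD measurement
        if max(values) - min(values) > max_range:
            continue  # high variance across repeated measurements
        kept[mut] = float(np.mean(values))
    return kept

records = {
    "E90A": [0.4, 0.6],   # kept: consistent replicates
    "K31M": [2.5, 0.2],   # dropped: spread > 1 kcal/mol
    "D52N": [None],       # dropped: ambiguous measurement
}
filtered = filter_skempi(records)
```

Averaging the surviving replicates yields one ΔΔGbind label per mutation, matching the 5729 distinct mutations retained from the raw 7085 measurements.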
CATH
For pretraining, we use a non-redundant subset of CATH v4.3.0 domains, which contains 30,948 experimental protein structures with less than 40% sequence identity. We remove proteins that exceed 2000 residues in length, and during pretraining we randomly truncate long sequences to 150 residues for efficiency. It is important to note that our pretraining currently utilizes single-chain proteins exclusively. The information learned from single-chain pretraining can nevertheless be transferred to downstream tasks on protein complexes, and we have found that this approach alone is sufficient to yield improvement.
HER2 binders
The HER2 binders test set was collected from ref. 27. The raw data include SPR data for 758 binders and 1097 non-binders. As all benchmarked methods only support amino acid substitutions, we filter out the binders that have different lengths compared to the wild-type antibody (Trastuzumab), leaving 419 Trastuzumab mutants. ΔΔGbind values are calculated by \(\Delta \Delta {G}_{{{{\rm{bind}}}}}=-RT\ln \left({K}_{{{{\rm{D}}}}}^{({{{\rm{mt}}}})}/{K}_{{{{\rm{D}}}}}^{({{{\rm{wt}}}})}\right)\) based on the SPR-measured binding affinity. Note that we only use this dataset as a test set to evaluate physics-based models (FoldX, Flex-ddG) and deep-learning models (Bind-ddG, GearBind, and GearBind+P) trained on SKEMPI.
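In code, this conversion can be sketched as follows (a minimal example; the gas constant in kcal/(mol·K) and the assumed temperature of 298 K are inputs not stated in the text, and the sign follows the formula above):

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def ddg_bind(kd_mt, kd_wt, temperature=298.0):
    """DDG_bind = -RT * ln(KD_mt / KD_wt), following the formula above.
    The temperature of 298 K is an assumption, not stated in the text."""
    return -R * temperature * math.log(kd_mt / kd_wt)
```

A tenfold change in KD corresponds to a |ΔΔGbind| of roughly 1.36 kcal/mol at 298 K.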
GearBind implementation
Given a pair of wild-type and mutant structures, GearBind predicts the binding free energy change ΔΔGbind by building a geometric encoder on a multi-relational graph, which is further enhanced by self-supervised pretraining. Note that the key feature that makes the neural network geometric is that it considers the spatial relationship between entities, i.e., nodes in a graph. In the following sections, we will discuss the construction of multi-relational graphs, multi-level message passing, and pretraining methods.
Constructing relational graphs for protein complex structures
Given a protein–protein complex, we construct a multi-relational graph for its interface and discard all other atoms. Here a residue is considered on the interface if its Euclidean distance to the nearest residue from the binding partner is no more than 6 Å. Each atom on the interface is regarded as a node in the graph. We add three types of edges to represent different interactions between these atoms. For two atoms with a sequential distance lower than 3, we add a sequential edge between them, the type of which is determined by their relative position in the protein sequence. For two atoms with spatial distance lower than 5 Å, we add a radial edge between them. Besides, each atom is also linked to its 10-nearest neighbors to guarantee the connectivity of the graph. Spatial edges that connect two atoms adjacent in the protein sequence are not interesting and thus discarded. The relational graph is denoted as \(({{{\mathcal{V}}}},\, {{{\mathcal{E}}}},\, {{{\mathcal{R}}}})\) with \({{{\mathcal{V}}}},\, {{{\mathcal{E}}}},\, {{{\mathcal{R}}}}\) denoting the sets of nodes, edges, and relation types, respectively. We use the tuple (i, j, r) to denote the edge between atom i and j with type r. We use one-hot vectors of residue types and atom types as node features for each atom and further include sequential and spatial distances in edge features for each edge.
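A minimal sketch of this edge construction (illustrative NumPy only; the function and relation-type names are ours, not from the GearBind codebase):

```python
import numpy as np

def build_relational_edges(coords, seq_pos, r_cut=5.0, seq_cut=3, k=10):
    """Sketch of the multi-relational interface graph described above.
    coords: (N, 3) atom coordinates; seq_pos: (N,) residue index per atom.
    Returns a set of (i, j, relation) tuples."""
    n = len(coords)
    dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    edges = set()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            offset = int(seq_pos[j] - seq_pos[i])
            if abs(offset) < seq_cut:
                # sequential edge, typed by relative position in the sequence
                edges.add((i, j, f"seq:{offset}"))
            elif dist[i, j] < r_cut:
                # radial edge; sequence-adjacent pairs were already handled
                # above, so uninteresting spatial edges are skipped
                edges.add((i, j, "radial"))
        # k-nearest-neighbor edges guarantee connectivity of the graph
        for j in np.argsort(dist[i])[1:k + 1]:
            edges.add((i, int(j), "knn"))
    return edges
```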
Building geometric encoder by multi-level message passing
On top of the constructed interface graph, we now perform multi-level message passing to model interactions between connected atoms, edges and residues. We use \({{{{\bf{a}}}}}_{i}^{(l)}\) and \({{{{\bf{e}}}}}_{(j,i,r)}^{(l)}\) to denote the representations of node i and edge (j, i, r) at the l-th layer. Specifically, we use \({{{{\bf{a}}}}}_{i}^{(0)}\) to denote the node feature for atom i and \({{{{\bf{e}}}}}_{(j,i,r)}^{(0)}\) to denote the edge feature for edge (j, i, r). Then, the representations are updated by the following procedures:
First, we perform atom-level message passing (AtomMP) on the atom graph. Then, a line graph is constructed for the message passing between edges (EdgeMP) so as to learn effective representations between atom pairs. The edge representations are used to update atom representations via an aggregation function (AGGR). Finally, we take the representations \({{{{\bf{a}}}}}_{{{{\rm{C}}}}\alpha (i)}^{(l)}\) of the alpha carbon as residue representation and perform a residue-level attention mechanism (ResAttn), which can be seen as a special kind of message passing on a fully connected graph. In the following paragraphs, we will discuss these components in detail.
Atom-level message passing
Following GearNet22, we use a relational graph convolutional network (RGCN)38 to pass messages between atoms. In a message-passing step, each node aggregates messages from its neighbors to update its own representation. The message is computed as the output of a relation (edge type)-specific linear layer applied to the neighbor representation. Formally, the message-passing step is defined as:

$$\mathbf{a}_i^{(l)}=\mathbf{a}_i^{(l-1)}+\sigma\left(\mathrm{BN}\left(\sum_{r\in\mathcal{R}}\mathbf{W}_r^{(a)}\sum_{(j,i,r)\in\mathcal{E}}\mathbf{a}_j^{(l-1)}\right)\right),\qquad(3)$$

where BN( ⋅ ) denotes batch normalization and σ( ⋅ ) is the ReLU activation function.
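The relation-specific aggregation can be illustrated with a toy NumPy layer (batch normalization is omitted for brevity, and the dense-adjacency formulation and all names are our simplifications):

```python
import numpy as np

def rgcn_layer(x, adj_per_relation, w_per_relation):
    """One relational message-passing step (illustrative sketch).
    x: (N, d) node features; adj_per_relation: list of (N, N) 0/1 adjacency
    matrices, one per edge type; w_per_relation: list of (d, d) weights."""
    # neighbor sum per relation, then a relation-specific linear map
    msg = sum(adj @ x @ w for adj, w in zip(adj_per_relation, w_per_relation))
    return x + np.maximum(msg, 0.0)  # residual connection + ReLU
```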
Edge-level message passing and aggregation
Modeling sequential proximity or spatial distance alone is not enough for capturing the complex protein–protein interactions (PPI) contributing to the binding. Multiple works have demonstrated the benefits of incorporating angular information using edge-level message passing20,22,39. Here we construct a line graph40, i.e., a relational graph among all edges of the above atom-level graph. Two edges are connected if and only if they share a common end node. The relations, or edge types, are defined as the angle between the atom-level edge pair, discretized into eight bins. We use \(({{{{\mathcal{V}}}}}^{{\prime} },\, {{{{\mathcal{E}}}}}^{{\prime} },\, {{{{\mathcal{R}}}}}^{{\prime} })\) to denote the constructed line graph. Then, relational message passing is used on the line graph:
where x and y denote edge tuples in the original graph for abbreviation.
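A toy construction of this line graph with angle-binned relations (our illustrative sketch; for simplicity only head-to-tail edge pairs sharing node j = k are linked, whereas the text connects any edges sharing an end node):

```python
import numpy as np

def line_graph_edges(edges, coords, num_bins=8):
    """Connect pairs of atom-level edges that share an end node; the relation
    type is the angle between them, discretized into num_bins bins."""
    out = []
    for a, (i, j) in enumerate(edges):
        for b, (k, l) in enumerate(edges):
            if a == b or j != k:  # head-to-tail pairs through node j == k
                continue
            v1, v2 = coords[j] - coords[i], coords[l] - coords[k]
            cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
            angle = np.arccos(np.clip(cos, -1.0, 1.0))
            out.append((a, b, min(int(angle / np.pi * num_bins), num_bins - 1)))
    return out
```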
Once the edge representations are updated, we aggregate them into their end nodes. These representations are fed into a linear layer and multiplied with the edge type-specific kernel matrix \({{{{\bf{W}}}}}_{r}^{(a)}\) in AtomMP:
which will be used to update the representation for atom i as in Eq. (3).
Residue-level message passing
Constrained by the computational complexity, atom and edge-level message passing only consider sparse interactions while ignoring global interactions between all pairs of residues. By modeling a coarse-grained view of the interface at the residue level, we are able to perform message passing between all pairs of residues. To do this, we design a geometric graph attention mechanism, which takes the representations of the alpha carbon of residues as input and updates their representations with the output as in Eq. (4). Here we follow the typical definition of self-attention to calculate attention logits with query and key vectors and apply the probability on the value vectors to get residue representations ri:
where d is the hidden dimension of the representation \({{{{\bf{a}}}}}_{{{{\rm{C}}}}\alpha (i)}^{(l)}\) and the Softmax function is taken over all j.
Besides traditional self-attention, we also include geometric information in the attention mechanism, which should be invariant to roto-translational transformations of the global complex structure. Therefore, we construct a local frame for each residue from the coordinates of its backbone nitrogen, carbon and alpha carbon atoms:
where we use x to denote the coordinate of an atom and GramSchmidt( ⋅ ) refers to the Gram-Schmidt process for orthogonalization. Then, the geometric attention is designed to model the relative position of beta carbons of all residues j in the local frame of residue i:
where \({{{{\bf{p}}}}}_{i}^{(l)}\) is the spatial representations for the residue i. When the complex structure is rotated, the frame Ri and relative position xCα(i) − xCβ(j) are rotated accordingly and the effects will be counteracted, which guarantees the rotation invariance of our model.
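The frame construction and the rotation invariance of the relative position can be checked with a short NumPy sketch (which backbone vector seeds the Gram-Schmidt process is our assumption):

```python
import numpy as np

def residue_frame(x_n, x_c, x_ca):
    """Build an orthonormal local frame from backbone N, C and CA coordinates
    via Gram-Schmidt, as described above (sketch)."""
    u1 = x_c - x_ca
    e1 = u1 / np.linalg.norm(u1)
    u2 = x_n - x_ca
    u2 = u2 - (u2 @ e1) * e1           # remove the component along e1
    e2 = u2 / np.linalg.norm(u2)
    e3 = np.cross(e1, e2)              # completes a right-handed frame
    return np.stack([e1, e2, e3], axis=-1)  # columns are the basis vectors

def relative_position(frame, x_from, x_to):
    """Coordinates of (x_from - x_to) expressed in the local frame."""
    return frame.T @ (x_from - x_to)
```

Rotating all coordinates rotates the frame identically, so the relative position in frame coordinates is unchanged, which is the invariance argued above.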
The final output is the concatenation of residue representations \({{{{\boldsymbol{r}}}}}_{i}^{(l)}\) and spatial representations \({{{{\bf{p}}}}}_{i}^{(l)}\):
After obtaining representations for each atom, we apply a mean pooling layer over representations of all alpha carbons aCα(i) to get protein representations h. An antisymmetric prediction head is then applied to guarantee that back mutations would have the exact opposite predicted ΔΔGbind values:
where h(wt) and h(mt) denote the representations for wild-type and mutant complexes and \(\widetilde{\Delta \Delta {G}_{{{{\rm{bind}}}}}}\) is the predicted ΔΔGbind from our GearBind model.
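One construction with this antisymmetry property can be sketched as follows (the exact form of the head is not given above; scoring the concatenated pair in both orders and taking the difference is our illustrative assumption):

```python
import numpy as np

def antisymmetric_ddg(h_wt, h_mt, score):
    """Predict DDG_bind such that swapping wild type and mutant exactly
    flips the sign. `score` is any learned scoring function, e.g. an MLP
    on the concatenated pair of complex representations."""
    return (score(np.concatenate([h_wt, h_mt]))
            - score(np.concatenate([h_mt, h_wt])))

# toy scoring function standing in for a trained MLP
w = np.linspace(-1.0, 1.0, 8)
score = lambda h: float(np.tanh(h) @ w)
```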
Modeling energy landscape of proteins via noise contrastive estimation
As paired binding free energy change data is of relatively small size, it would be beneficial to pretrain GearBind with massive protein structural data. The high-level idea of our pretraining method is to model the distribution of native protein structures, which helps identify harmful mutations that yield unnatural structures. Denoting a protein structure as x, its distribution can be modeled with the Boltzmann distribution as:

$$p(\mathbf{x};\theta)=\frac{\exp\left(-E(\mathbf{x};\theta)\right)}{A(\theta)},$$
where θ denotes the learnable parameters in our encoder, E(x; θ) denotes the energy function for the protein x and A(θ) is the partition function that normalizes the distribution. The energy function is predicted by applying a linear layer to the GearBind representation h(x) of protein x:

$$E(\mathbf{x};\theta)=\mathbf{w}^{\top}\mathbf{h}(\mathbf{x})+b.$$
Given the observed dataset {x1, . . . , xT} from the PDB, our objective is to maximize the log-probability of these samples:

$$\max_{\theta}\ \sum_{t=1}^{T}\log p(\mathbf{x}_t;\theta).$$
However, direct optimization of this objective is intractable, since calculating the partition function requires integration over the whole protein structure space. To address this issue, we adopt a popular method for learning energy-based models called noise contrastive estimation24. For each observed structure xt, we sample a negative structure yt, and the problem is then transformed into a binary classification task, i.e., classifying whether a sample is observed in the dataset or not:

$$\min_{\theta}\ \sum_{t=1}^{T}\left[-\log \sigma\left(-E(\mathbf{x}_t;\theta)\right)-\log \sigma\left(E(\mathbf{y}_t;\theta)\right)\right],$$
where σ( ⋅ ) denotes the sigmoid function for calculating the probability of a sample xt belonging to the positive class. This training objective pushes down the energy of the positive examples (i.e., the observed structures) while pushing up the energy of the negative samples (i.e., the mutant structures).
For negative sampling, we perform random single-point mutations on the corresponding positive samples and then generate each mutant conformation by keeping the backbone unchanged and sampling side-chain torsional angles at the mutation site from a backbone-dependent rotamer library25. In addition, to further enhance the model's capability to distinguish structural noise, we randomly choose 30% of the residues and randomly rotate their torsional angles when generating negative samples.
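The resulting binary classification loss can be sketched as follows (using logit −E for the positive class, consistent with pushing down the energies of observed structures; the averaging and exact parameterization are our simplifications):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nce_loss(e_pos, e_neg):
    """Noise contrastive estimation as binary classification (sketch).
    Observed structures (positives) should receive low energy and perturbed
    mutant structures (negatives) high energy."""
    e_pos, e_neg = np.asarray(e_pos), np.asarray(e_neg)
    return float(np.mean(-np.log(sigmoid(-e_pos)) - np.log(sigmoid(e_neg))))
```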
After pretraining on the CATH database, we finetune the GearBind encoder on the downstream ΔΔGbind prediction task, which helps avoid overfitting.
Cross-validation on SKEMPI
During cross-validation, a model is trained and tested five times, each time using a different subset as the test set and the remaining four subsets as the training set. Results are calculated for each test set, and their mean and standard error of the mean are reported as the final cross-validation performance. During the process of cross-validation, each individual data point is incorporated into the test set precisely once. This ensures that a comprehensive “test result table” is compiled, which includes predictive values for each data point when it is part of the test set. Subsequent performance analyses are done by splitting this table by various criteria and evaluating performance on each subset.
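The split-by-PDB grouping can be sketched as follows (a greedy round-robin balancing heuristic of our own; the exact assignment procedure is not specified above):

```python
from collections import defaultdict

def split_by_pdb(mutations, n_folds=5):
    """Group mutations by PDB entry, then assign whole groups to folds so
    that no complex appears in both training and test data (sketch).
    mutations: list of (pdb_id, mutation) pairs."""
    by_pdb = defaultdict(list)
    for pdb_id, mut in mutations:
        by_pdb[pdb_id].append((pdb_id, mut))
    folds = [[] for _ in range(n_folds)]
    # place the largest groups first, always into the currently smallest fold
    for group in sorted(by_pdb.values(), key=len, reverse=True):
        min(folds, key=len).extend(group)
    return folds
```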
After cross-validation on SKEMPI, we obtain five sets of model parameters. During inference, we use the mean of the predicted values of these five checkpoints as the model prediction result.
Baseline implementation details
FoldX
In this work, we use FoldX 4 (ref. 9) for mutant structure generation. First, each PDB file is processed with the RepairPDB command for structural corrections. Then, the wild-type and mutant structure pair is built using the BuildModel command. We use the AnalyseComplex command to obtain the FoldX ΔΔGbind prediction based on the wild-type and mutant structures.
Flex-ddG
We run Flex-ddG with its official implementation at https://github.com/Kortemme-Lab/flex_ddG_tutorial. Each PDB file is processed with PDBFixer v1.8 (ref. 37). Using the default parameters, we sample 35 structure models for each mutation, with the number of backrub trials set to 35,000 and the energy function set to fa_talaris2014. The final ΔΔGbind values are predicted with a generalized additive model that reweights the score terms.
GearBind(+P)
We implement GearBind with the TorchDrug library41. For message passing, we employ a 4-layer GearBind model with a hidden dimension of 128. For edge message passing, the connections between edges are categorized into 8 bins according to the angles between them. To predict the ΔΔGbind value from graph representations, we use a 2-layer MLP.
The model is trained using the Adam optimizer with a learning rate of 1e-4 and a batch size of 8. Training is performed on one A100 GPU for 40 epochs. For pretraining, we use the same architecture: a 4-layer GearBind model with a hidden dimension of 128. Pretraining is conducted using the Adam optimizer with a learning rate of 5e-4 and a batch size of 8, employing 4 A100 GPUs for 10 epochs.
Bind-ddG
To ensure a fair comparison, we re-implement and re-train the Bind-ddG model on our SKEMPI splits. We follow the configuration of the original implementation, setting the dimensions of the hidden and pair representations to 128 and 64, respectively. Our validation performance indicates that the optimal configuration for our setup includes a two-layer geometric attention mechanism and a four-layer MLP predictor. We trained the model using the Adam optimizer with a learning rate of 1e-4 and a batch size of 8, on a single A100 GPU, for a total of 40 epochs.
In silico affinity maturation of CR3022 and anti-5T4 UdAb
PDB entry 6XC3 (ref. 42), in which chains H and L comprise antibody CR3022 and chain C is the SARS-CoV-2 RBD, was chosen as the starting complex for CR3022 affinity maturation. To better simulate the CR3022 interaction with the Omicron RBD, we constructed the complex structures for the BA.4 and BA.1.1 mutants with SWISS-MODEL35. We then performed saturation mutagenesis on the CDRs of CR3022 and generated mutant structures using FoldX9 and Flex-ddG10. Specifically, residues 26–35, 50–66, 99–108 from the heavy chain H and residues 24–40, 56–62, 95–103 from the light chain L are mutated. This totals 1400 single-point mutations (if we count the self-mutations). We use our ensemble model to rank the mutations and select the top-ranked mutants for synthesis and subsequent experimental validation. Mutations are ranked by the modified z-score (where values are centered on the median rather than the mean to be less sensitive to outliers) averaged across multiple ΔΔGbind prediction methods.
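The median-centered ensemble ranking can be sketched as follows (only the median centering is stated above; the MAD-based scaling from the common "modified z-score" definition is our assumption):

```python
import numpy as np

def modified_zscore(values):
    """Median-centered z-score. The 0.6745 * dev / MAD scaling is the
    common definition and an assumption here."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    mad = np.median(np.abs(values - med))
    return 0.6745 * (values - med) / mad

def ensemble_rank(scores_per_method):
    """Average per-method modified z-scores of candidate mutations and return
    candidate indices ranked from most to least favorable (lower predicted
    DDG_bind means better binding under the SKEMPI sign convention)."""
    z = np.mean([modified_zscore(s) for s in scores_per_method], axis=0)
    return list(np.argsort(z))
```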
An unpublished complex structure was used to optimize the anti-5T4 UdAb. As a single-___domain antibody binding two distinct epitopes on 5T4 (Fig. 5b), the anti-5T4 UdAb has a larger interface region than traditional antibodies. After analyzing its interface with 5T4, we decided to perform saturation mutagenesis on residues 1, 3, 25, 27–30, 31–33, 39–45, 52–57, 59, 91–93, 95, 99, 100–102, 103, 105, 110, 112, 115–117. This totals 780 single-point mutations (if we count the self-mutations), which go through the same ranking and selection strategies as described above.
Antigen preparation
The gene encoding the SARS-CoV RBD was synthesized by Genscript (Nanjing, China) and subcloned into the pSectag 2B vector with a C-terminal human IgG1 Fc fragment and AviTag. The recombinant vector was transfected into Expi 293 cells and cultured at 37 °C for 5 days, followed by centrifugation at 2200×g for 20 min. The supernatant was harvested and filtered through a 0.22-μm vacuum filter. Protein G resin (Genscript) was loaded into the column and washed with PBS, and the supernatant was flowed through the column to allow full binding to the resin. The target protein was then eluted with 0.1 M glycine (pH 3.0) and neutralized with 1 M Tris-HCl (pH 9.0), followed by buffer exchange into phosphate-buffered saline (PBS) and concentration with an Amicon Ultra centrifugal concentrator (Millipore) with a molecular weight cut-off of 3 kDa. Protein concentration was measured using a NanoDrop 2000 spectrophotometer (Thermo Fisher), and protein purity was examined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). The Delta RBD protein was purchased from Vazyme (Nanjing, China) and the Omicron S protein was purchased from ACROBiosystems (Beijing, China). The biotinylated human TPBG/5T4 and human TPBG/5T4-Fc antigens were purchased from ACROBiosystems (Beijing, China).
Preparation for mutant and wild-type CR3022 antibodies
The heavy chain and light chain genes of different CR3022 antibodies were synthesized and subcloned into expression vector pcDNA 3.4 in IgG1 format. These constructed vectors were transfected into CHO cells and purified by Protein A. All antibodies were produced by Biointron Biological Inc. (Shanghai, China).
Generation of mutant and wild-type anti-5T4 UdAbs
The pComb3x vector encoding the gene of the wild-type anti-5T4 UdAb was constructed in previous work and preserved in our laboratory. All anti-5T4 UdAb mutants with single-point mutations were constructed with the QuickMutation™ Site-directed Gene Mutagenesis Kit (Beyotime, Shanghai, China) following the manufacturer’s protocol. Expression of the mutant and wild-type anti-5T4 UdAbs was performed in E. coli HB2151 bacterial culture at 30 °C for 16 h, induced with 1 mM isopropyl β-d-1-thiogalactopyranoside (IPTG). The cells were harvested and lysed with polymyxin B at 30 °C for 0.5 h, followed by centrifugation at 8800×g for 10 min. The supernatant was collected, filtered through 0.8-μm polyethersulphone membranes using sterile syringes, and purified with Ni-NTA resin (Smart Lifesciences) following the manufacturer’s instructions. Briefly, the filtered supernatant was loaded onto the Ni-NTA column. The resin was washed with washing buffer (10 mM Na2HPO4, 10 mM NaH2PO4 [pH 7.4], 500 mM NaCl, and 20 mM imidazole), and proteins were eluted in elution buffer (10 mM Na2HPO4, 10 mM NaH2PO4 [pH 7.4], 500 mM NaCl, and 250 mM imidazole). The collected pure fractions were immediately buffer-exchanged into PBS and concentrated with Amicon Ultra centrifugal concentrators (Millipore). Protein concentration was measured using a NanoDrop 2000 spectrophotometer (Thermo Fisher), and protein purity was examined by SDS-PAGE.
Enzyme-linked immunosorbent assay (ELISA)
For comparison of the different CR3022 mutants, the RBD of the Delta (B.1.617.2) strain and the S protein of the Omicron (B.1.1.529) strain at 100 ng per well were coated onto 96-well half-area microplates (Corning #3690) overnight at 4 °C. The antigen-coated plate was washed three times with PBST (PBS with 0.05% Tween-20) and blocked with 3% MPBS (PBS with 3% skim milk) at 37 °C for 1 h. Following three washes with PBST, 50 μL of threefold serially diluted antibody in 1% MPBS was added and incubated at 37 °C for 1.5 h. HRP-conjugated anti-Fab and anti-Fc (Sigma-Aldrich) secondary antibodies were used for detection of the tested antibodies. After washing with PBST five times, enzyme activity was measured after addition of ABTS substrate (Invitrogen) for 15 min. The data were acquired by measuring the absorbance at 405 nm using a Microplate Spectrophotometer (Biotek), and the EC50 (concentration for 50% of maximal effect) was calculated with GraphPad Prism 8.0 software. To verify the different UdAb mutants, the same experimental protocol was adopted. Briefly, the human TPBG/5T4-Fc antigen was coated onto 96-well half-area microplates and blocked with 3% MPBS, and serially diluted antibodies were added. An HRP-conjugated anti-Flag (Sigma-Aldrich) secondary antibody was used, followed by addition of ABTS substrate and detection at 405 nm. The reported EC50 value is the mean of three replicates from a single experiment.
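For illustration, a minimal EC50 estimate can be obtained by interpolating the dose-response curve at half-maximal signal (GraphPad Prism instead fits a four-parameter logistic model; this stand-in assumes a monotonically increasing response):

```python
import numpy as np

def ec50_interpolated(conc, signal):
    """Estimate EC50 as the concentration where the response crosses half of
    its maximum, by linear interpolation on the log-concentration axis.
    A simplified stand-in for a four-parameter logistic fit."""
    conc, signal = np.asarray(conc, float), np.asarray(signal, float)
    order = np.argsort(conc)
    conc, signal = conc[order], signal[order]
    half = signal.max() / 2.0
    log_half = np.interp(half, signal, np.log10(conc))
    return 10.0 ** log_half
```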
Bio-layer interferometry (BLI) binding assay
The binding kinetics of the different antibodies to the SARS-CoV-2 Omicron S and 5T4 antigens were measured by BLI on an Octet-RED96 (ForteBio). Briefly, his-tagged Omicron S protein at 8 μg/ml and biotinylated 5T4 protein at 5 μg/ml were loaded onto Ni-NTA and streptavidin-coated (SA) biosensors, respectively. The antigen-immobilized sensors were incubated with threefold serially diluted CR3022 candidates or twofold serially diluted anti-5T4 UdAbs starting at 300 nM in 0.02% PBST for 300 s for association, and then immersed into 0.02% PBST for another 300 s at 37 °C for dissociation. All curves were fitted with a 1:1 binding model using the Data Analysis software 11.1. All KD values were determined from fits with R2 values greater than 0.98.
Protein structure and ΔΔGbind contribution analysis
Protein structure analysis was conducted with Python scripts. The antibody–antigen complex structure after mutation was obtained from Rosetta Flex-ddG relaxation10. The relaxed protein structure provides more accurate side-chain conformations, which are critical for accurate contact and conformational analysis. The improved accuracy of such analyses enables a deeper understanding of the underlying binding mechanisms and can facilitate the identification of key characteristics involved in protein–protein interactions. The contribution scores are derived by applying Integrated Gradients (IG)43, a model-agnostic attribution method, to GearBind to obtain residue-level interpretations following22. All protein structure figures are created with PyMOL v3.0.
Molecular dynamics simulation
For antibody mutation structural analysis, we conducted molecular dynamics simulations of the wild-type and mutant antibody–antigen complexes. Initial structures were taken from the Rosetta Flex-ddG relaxed structures used by GearBind. The LEaP module in the AMBER 22 suite was used to build starting structures and to add ions and solvent for the simulation44. The protonation states of the molecules were kept at the default settings assigned by LEaP during initial structure preparation. All systems were simulated with the ff19SB protein force field45 and solvated in boxes of water with the OPC3 solvent model (ref. 46), using a 10 Å solvent box. All bonds involving hydrogen atoms were constrained with the SHAKE algorithm47. The particle mesh Ewald (PME) algorithm was used to calculate long-range electrostatic interactions48. Initial structures were relaxed with a maximum of 10,000 minimization steps before convergence, then subjected to heating for 20 ps and equilibration for 10 ps in the NPT ensemble with PMEMD. The CUDA version of PMEMD was used to accelerate the simulations49. The simulation temperature was set to room temperature (298 K). All systems were simulated for 1 μs of production MD with one replica, and samples from the first 50 ns were not used for analysis. The CPPTRAJ module in AMBERTools was used to analyze the simulation results, including calculation of the root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), hydrogen bonding, and dihedral angles50.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The raw SKEMPI database can be accessed at https://life.bsc.es/pid/skempi2. The CATH database can be accessed via https://www.cathdb.info/. The raw HER2 binders data can be accessed via https://github.com/AbSciBio/unlocking-de-novo-antibody-design/blob/main/spr-controls.csv. Source data are provided with this paper.
Code availability
The GearBind inference code, the trained model checkpoints and the dataset preprocessing scripts are available via https://github.com/DeepGraphLearning/GearBind under the Apache 2.0 License. They can also be accessed via Zenodo51.
Change history
07 October 2024
A Correction to this paper has been published: https://doi.org/10.1038/s41467-024-52887-1
References
Victora, G. D. & Nussenzweig, M. C. Germinal centers. Annu. Rev. Immunol. 40, 413–442 (2022).
Gram, H. et al. In vitro selection and affinity maturation of antibodies from a naive combinatorial immunoglobulin library. Proc. Natl. Acad. Sci. USA 89, 3576–3580 (1992).
Jackson, J. R., Sathe, G., Rosenberg, M. & Sweet, R. In vitro antibody maturation. improvement of a high affinity, neutralizing antibody against il-1 beta. J. Immunol. 154, 3310–3319 (1995).
Li, B. et al. In vitro affinity maturation of a natural human antibody overcomes a barrier to in vivo affinity maturation. In MAbs Vol. 6, 437–445 (Taylor & Francis, 2014).
Wang, J. et al. In vitro affinity maturation of antibody against membrane-bound GPCR molecules. Appl. Microbiol. Biotechnol. 103, 7703–7717 (2019).
Murphy, K. & Weaver, C. Janeway’s immunobiology. Garland Sci. 146–151 (2016).
Sulea, T., Hussack, G., Ryan, S., Tanha, J. & Purisima, E. O. Application of assisted design of antibody and protein therapeutics (adapt) improves efficacy of a Clostridium difficile toxin a single-___domain antibody. Sci. Rep. 8, 2260 (2018).
Shan, S. et al. Deep learning guided optimization of human antibody against sars-cov-2 variants with broad neutralization. Proc. Natl. Acad. Sci. USA 119, 2122954119 (2022).
Delgado, J., Radusky, L. G., Cianferoni, D. & Serrano, L. Foldx 5.0: working with RNA, small molecules and a new graphical interface. Bioinformatics 35, 4168–4169 (2019).
Barlow, K. A. et al. Flex ddg: Rosetta ensemble-based estimation of changes in protein–protein binding affinity upon mutation. J. Phys. Chem. B 122, 5389–5399 (2018).
Park, H. et al. Simultaneous optimization of biomolecular energy functions on features from small molecules and macromolecules. J. Chem. Theory Comput. 12, 6201–6212 (2016).
Frenz, B. et al. Prediction of protein mutational free energy: benchmark and sampling improvements increase classification accuracy. Front. Bioeng. Biotechnol. 8, 558247 (2020).
Rodrigues, C. H., Myung, Y., Pires, D. E. & Ascher, D. B. mcsm-ppi2: predicting the effects of mutations on protein–protein interactions. Nucleic Acids Res. 47, 338–344 (2019).
Myung, Y., Rodrigues, C. H., Ascher, D. B. & Pires, D. E. mcsm-ab2: guiding rational antibody design using graph-based signatures. Bioinformatics 36, 1453–1459 (2020).
Petukh, M., Dai, L. & Alexov, E. Saambe: webserver to predict the charge of binding free energy caused by amino acids mutations. Int. J. Mol. Sci. 17, 547 (2016).
Huang, X., Zheng, W., Pearce, R. & Zhang, Y. Ssipe: accurately estimating protein–protein binding affinity change upon mutations using evolutionary profiles in combination with an optimized physical energy function. Bioinformatics 36, 2429–2437 (2020).
Jiang, Y. et al. Dgcddg: deep graph convolution for predicting protein-protein binding affinity changes upon mutations. IEEE/ACM Trans. Comput. Biol. Bioinform. 20, 2089–2100 (2023).
Liu, X., Luo, Y., Li, P., Song, S. & Peng, J. Deep geometric representations for modeling effects of mutations on protein-protein binding affinity. PLoS Comput. Biol. 17, 1009284 (2021).
Jankauskaitė, J., Jiménez-García, B., Dapkūnas, J., Fernández-Recio, J. & Moal, I. H. SKEMPI 2.0: an updated benchmark of changes in protein-protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics 35, 462–469 (2018).
Jumper, J. et al. Highly accurate protein structure prediction with alphafold. Nature 596, 583–589 (2021).
Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
Zhang, Z., Xu, M., Jamasb, A. R., Chenthamarakshan, V., Lozano, A., Das, P. & Tang, J. Protein Representation Learning by Geometric Structure Pretraining. In The Eleventh International Conference on Learning Representations, (2023).
Zhang, Z., Xu, M., Lozano, A. C., Chenthamarakshan, V., Das, P. & Tang, J. Pre-training protein encoder via siamese sequence-structure diffusion trajectory prediction. Adv. Neural. Inf. Process. Syst. 36 (2024).
Gutmann, M. & Hyvärinen, A. Noise-contrastive estimation: a new estimation principle for unnormalized statistical models. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 297–304 (JMLR Workshop and Conference Proceedings, 2010).
Shapovalov, M. V. & Dunbrack, R. L. A smoothed backbone-dependent rotamer library for proteins derived from adaptive kernel density estimates and regressions. Structure 19, 844–858 (2011).
Zhang, Y. & Skolnick, J. Scoring function for automated assessment of protein structure template quality. Proteins: Struct., Funct., Bioinforma. 57, 702–710 (2004).
Shanehsazzadeh, A. et al. Unlocking de novo antibody design with generative artificial intelligence. Preprint at https://www.biorxiv.org/content/10.1101/2023.01.08.523187v1 (2023).
Ter Meulen, J. et al. Human monoclonal antibody combination against sars coronavirus: synergy and coverage of escape mutants. PLoS Med. 3, 237 (2006).
Tian, X. et al. Potent binding of 2019 novel coronavirus spike protein by a SARS coronavirus-specific human monoclonal antibody. Emerg. Microbes Infect. 9, 382–385 (2020).
Yuan, M. et al. A highly conserved cryptic epitope in the receptor binding domains of SARS-CoV-2 and SARS-CoV. Science 368, 630–633 (2020).
Wu, Y. et al. A highly stable human single-___domain antibody-drug conjugate exhibits superior penetration and treatment of solid tumors. Mol. Ther. 30, 2785–2799 (2022).
Hie, B. L. et al. Efficient evolution of human antibodies from general protein language models. Nat. Biotechnol. 42, 275–283 (2024).
Shanker, V. R., Bruun, T. U., Hie, B. L. & Kim, P. S. Unsupervised evolution of protein and antibody complexes with a structure-informed language model. Science 385, 46–53 (2024).
Shanker, V. R., Bruun, T. U., Hie, B. L. & Kim, P. S. Inverse folding of protein complexes with a structure-informed language model enables unsupervised antibody evolution. Preprint at https://www.biorxiv.org/content/10.1101/2023.12.19.572475v2.full.pdf (2023).
Waterhouse, A. et al. Swiss-model: homology modelling of protein structures and complexes. Nucleic Acids Res. 46, 296–303 (2018).
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with alphafold 3. Nature 630, 493–500 (2024).
Eastman, P. PDBFixer. GitHub (2013).
Schlichtkrull, M. et al. Modeling relational data with graph convolutional networks. In European Semantic Web Conference, 593–607 (Springer, 2018).
Gasteiger, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. In International Conference on Learning Representations (ICLR) (ICLR, 2020).
Harary, F. & Norman, R. Z. Some properties of line digraphs. Rendiconti del. Circolo Matematico di Palermo 9, 161–168 (1960).
Zhu, Z. et al. Torchdrug: A powerful and flexible machine learning platform for drug discovery. Preprint at https://arxiv.org/abs/2202.08320 (2022).
Yuan, M. et al. Structural basis of a shared antibody response to sars-cov-2. Science 369, 1119–1123 (2020).
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In International Conference on Machine Learning 3319–3328 (PMLR, 2017).
Case, D. A. et al. The amber biomolecular simulation programs. J. Comput. Chem. 26, 1668–1688 (2005).
Tian, C. et al. ff19sb: amino-acid-specific protein backbone parameters trained against quantum mechanics energy surfaces in solution. J. Chem. Theory Comput. 16, 528–552 (2019).
Izadi, S., Anandakrishnan, R. & Onufriev, A. V. Building water models: a different approach. J. Phys. Chem. Lett. 5, 3863–3871 (2014).
Ryckaert, J.-P., Ciccotti, G. & Berendsen, H. J. Numerical integration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 23, 327–341 (1977).
Darden, T., York, D. & Pedersen, L. Particle mesh Ewald: an N·log(N) method for Ewald sums in large systems. J. Chem. Phys. 98, 10089–10092 (1993).
Salomon-Ferrer, R., Götz, A. W., Poole, D., Le Grand, S. & Walker, R. C. Routine microsecond molecular dynamics simulations with AMBER on GPUs. 2. Explicit solvent particle mesh Ewald. J. Chem. Theory Comput. 9, 3878–3888 (2013).
Roe, D. R. & Cheatham III, T. E. PTRAJ and CPPTRAJ: software for processing and analysis of molecular dynamics trajectory data. J. Chem. Theory Comput. 9, 3084–3095 (2013).
Zhang, Z. & Cai, H. Source code of GearBind: pretrainable geometric graph neural network for antibody affinity maturation. Zenodo https://doi.org/10.5281/zenodo.13085795 (2024).
Author information
Contributions
J.T. conceptualized the study and supervised the project. T.Y. and Y.W. co-supervised the project. H.C. investigated related work, co-led model development, processed the datasets and in silico results, and led manuscript writing. Z.Z. investigated related work and led model development. B.Z. led the analysis of the structures and MD trajectories of the proposed mutants. M.W. was in charge of the in vitro experiments and results analysis, with the help of Q.L. and Y.Z. All authors participated in manuscript writing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Ge Liu, Pietro Sormanni and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Cite this article
Cai, H., Zhang, Z., Wang, M. et al. Pretrainable geometric graph neural network for antibody affinity maturation. Nat Commun 15, 7785 (2024). https://doi.org/10.1038/s41467-024-51563-8
This article is cited by
- AI in drug development: advances in response, combination therapy, repositioning, and molecular design. Science China Information Sciences (2025).