The ClusPro AbEMap web server for the prediction of antibody epitopes

Antibodies form one of the key arms of the adaptive immune system in vertebrates. They target solvent-exposed proteins called antigens on the surfaces of pathogens. After recognition and contact, the antibodies mediate the humoral immune response to the attached pathogen1. The diversity and specificity of antibodies are the reason why harnessing their unique features is paramount in the pharmaceutical industry. Understanding and accurately predicting atomic-level details of the antibody–antigen interface are crucial for utilizing antibodies2. Finding the antigen residues in the interface, henceforth called ‘epitope mapping’, can be useful for the design of monoclonal antibodies3, for developing vaccines4 and for investigating immune responses5.

The development of methods for predicting antibody–antigen interactions and for antibody-based drug discovery was traditionally handicapped by the difficulty of obtaining high numbers of antibody sequences. However, because of advances in high-throughput single-cell and variable-diversity-joining sequencing of the B-cell receptor repertoire6,7, the availability of antibody sequences is no longer an issue in the race toward developing antibody-based drugs. Fast and accurate prediction of the epitopes for these antibody targets has become the new bottleneck8. Currently, epitope mapping efforts are dominated by experimental techniques such as X-ray crystallography, mutagenesis (e.g., alanine scanning) and phage display. X-ray crystallography is laborious and expensive, whereas mutagenesis and phage display generally do not provide atomic-level details9. Importantly, none of these experimental methods can be used in a high-throughput manner. In view of these limitations, substantial efforts have been devoted to the development of computational epitope-mapping methods10,11,12,13,14,15. However, epitope prediction for a given antigen and a given antibody is a difficult computational problem that requires further development to improve the accuracy and reliability of the predictions16. Part of the difficulty is due to the paucity of nonredundant structural data on antibody–antigen interactions, because, as reported by Jespersen et al. in 2017, <25% of the antibody–antigen complexes found in the Protein Data Bank (PDB) are unique when taking a 70% sequence identity threshold cutoff for the antigen17.

The challenge of epitope mapping can be partially addressed by finding residues on the antigen’s surface that are most likely to interact with a generic antibody (as opposed to a specific antibody)10,12,17,18. Some examples of such an approach are implemented in the servers Spatial Epitope Prediction for Protein Antigens (SEPPA)10,12 and BEpro (formerly known as PEPITO)18. SEPPA uses a logistic regression algorithm with features such as antigen residue surface accessibility and propensity for unit-triangle patches (three residue groups on the antigen’s surface) among other factors to score the surface residues10,11,12. BEpro adds an amino-acid propensity scale and side-chain orientations besides other features18. Despite the achievements of the antibody-agnostic approach, it is crucial to highlight that epitopes are, by definition, relational entities and that epitope mapping ought to be antibody specific. This is evidenced by several antigens with particular affinities to different antibodies at different interfaces. A well-studied example is hen egg lysozyme, which is crystallized with four different antibodies in the PDB structures 1BVK, 1DQJ, 2I25 and 1MLC, with little overlap19,20,21,22. Therefore, consideration of both the antibody and the antigen in epitope mapping not only is appropriate but generally also increases the accuracy because more information can be gleaned from the antibody side4.

This relational nature of epitopes makes it especially attractive to approach the epitope-mapping problem by using docking, which is a computational method that conventionally predicts the binding mode of two biological units23. One fairly successful example of such methods is ClusPro24,25,26. ClusPro is a web server that directly docks two interacting proteins when given their X-ray structures. The server is freely available to those in nonprofit organizations and is used by over 20,000 scientists worldwide. It runs on a rigid-body docking program called ‘PIPER’, which uses a fast Fourier transform (FFT) correlation approach27. The interaction energy, which is composed of van der Waals (vdW) energy terms (repulsive and attractive), electrostatic energy (Coulombic and Born approximations) and a structure-based pairwise statistical potential, is used for ranking the docked models. In 2012, a special antibody–antigen version of the pairwise statistical potential was introduced, vastly improving antibody–antigen docking accuracy28. In a recent comparative study, ClusPro was reported as the best server for antibody–antigen docking29. Hua et al. used the top 30 models predicted by ClusPro and combined them with site-directed mutagenesis to successfully localize epitopes in several case studies9. However, they also stated that docking alone or machine learning–based methods did not provide unique epitope positions and had to be followed by experiments9. Krawczyk and colleagues used a docking method in their epitope-mapping server EpiPred30. They used ZDOCK, an FFT-based rigid-body docking method, to generate models to score putative epitope patches determined by geometric fitting30,31. More recently, Sikora and colleagues performed rigid-body Monte Carlo docking of monoclonal antibodies against glycosylated SARS-CoV-2 spike protein configurations to check for the accessibility of potential epitope candidates32.

One challenge that both Krawczyk et al. and Hua et al. faced, when using the EpiPred and ClusPro servers, respectively, was the servers’ inability to work with sequences9,30. Working with sequences requires a separate modeling step (which is not offered by EpiPred and ClusPro) if the antibody’s X-ray structure is not available. However, as mentioned above, antibody sequencing has made major advances over the past few years, whereas the technology for determining the X-ray structure of antibodies has not substantially improved33,34. This limitation implies that epitope-mapping tools should ideally include the ability to model the antibodies from their sequences in addition to mapping the epitopes on the given antigen structure. As a response to this need, we present in this work an end-to-end epitope-mapping server based on ClusPro’s docking protocol, ClusPro Antibody-based Epitope Mapping (ClusPro-AbEMap), that offers template-based modeling of the antibody if an X-ray structure is not available. The ClusPro-AbEMap server (https://abemap.cluspro.org/) integrates our template-based modeling method35 and antigen–antibody contact prediction via docking36,37 to identify the epitope of a given antigen structure from an antibody sequence or X-ray structure.

If the X-ray structure of the antibody is known, thousands of low-energy antibody–antigen models predicted by PIPER—the rigid-body docking program on which ClusPro is based—are used to score the antigen surface residues. If the X-ray structure of the antibody is unknown, AbEMap builds multiple homology models of the antibody that are used for docking instead. The consensus of the docked complexes based on all the antibody models and antigen structure templates is used to score the antigen residues. For the model antibodies, special care should be taken not to penalize possible clashes, by reducing the weight of the vdW component of the interaction energy. Although epitope prediction remains a difficult computational problem, and clearly more method development is required, we present the protocol for AbEMap, which performs better than the popular peer servers SEPPA, BEpro and EpiPred38.

Finally, we explore the potential use of deep neural network–based method AlphaFold2 for antibody structure prediction as well as epitope mapping. It has been demonstrated in the CASP14 experiment and now is well established that AlphaFold2 substantially improves the accuracy of predicting the structure of most monomeric proteins39,40,41. We show that AlphaFold2-modeled antibodies perform nearly as well as our ensemble of template-based models. However, according to our results, using the linker-based approach to predicting protein–protein interactions, which was recently proposed by several groups42,43,44,45, does not improve the accuracy of AbEMap in finding antibody epitopes.

The AbEMap algorithm and server overview

The server has two modes of running: the first requires an antibody structure as input, which can be an X-ray structure or a precomputed homology model, and the second can perform epitope prediction starting from the amino acid sequence of the antibody, assuming that appropriate antibody homologous structures are available in the PDB. The antigen structure is assumed to be known in both modes. The protocol for performing the second mode includes both homology modeling and docking (Fig. 1a–f). It builds multiple (if applicable) antibody models that form an ensemble. Once structures are available for both antibody and antigen, their mutual conformational space is sampled by using PIPER26, the docking engine of the ClusPro server, and the 1,000 lowest-energy complex poses are identified. When ClusPro is used for protein–protein docking, the 1,000 structures are clustered, and the centers of the most-populated clusters are selected as models of the complex. However, for epitope prediction, the 1,000 structures are instead used to calculate the frequency of each antigen surface atom’s occurrence in the antibody–antigen interface. As will be shown, to map an epitope, AbEMap defines the atomic epitope likelihood score as the Boltzmann weighted atomic interface occurrence frequency averaged over the ensemble of antibody structures.

Fig. 1: Outline of the AbEMap protocol using an antigen structure and an antibody sequence as inputs, examples of complex structures generated by PIPER, four examples of results and comparisons to other servers.

a, The user inputs the solved crystal structure of the antigen (shown as PyMOL stick figures in purple) and the antibody sequence (shown as purple text) (if the structure is unavailable). b, The antibody sequence is used to find close homologs by using BLAST for each of its heavy (H) and light (L) chains. A sample multiple sequence alignment of close homologs is shown for the monoclonal murine antibody 1FGN. L1 and H1 (green), L2 and H2 (blue) and L3 and H3 (red) regions of the complementarity-determining regions are highlighted. The list of homologs is filtered by using sequence identity and sequence similarity of L3 and H3 regions to the query sequence. c, The structures for the selected sequences are modeled individually by using MODELLER. Aligned regions of the backbone are copied from the template, whereas non-aligned regions are modeled. d, The residues with the highest likelihood of being in the epitope are highlighted in red on the results page of the server. e, Billions of antibody–antigen complex conformations are generated by PIPER for the given antibody structure or for each antibody model. The antibody is shown as a translucent cartoon, and the antigen is shown as a cyan surface. f, The bar plot shows the number of poses in the top 100 models generated by PIPER that are within different root mean square deviation (RMSD) thresholds. For example, three models (in the top 100) have RMSD ≤2 Å, and 21 models have RMSD between 10 and 12 Å. g, As examples of visualizing the results, modeled murine anti-tissue factor (PDB ID 1FGN) and tissue factor (PDB ID 1TFH) are shown as surfaces with residues colored from blue to red on the basis of increasing predicted epitope likelihood score. 19 of the 26 epitope residues are in the 30 top-ranked residues. h, Modeled humanized Fab D3h44 (PDB ID 1JPT) and tissue factor (PDB ID 1TFH) shown as surfaces with residues colored from purple to gold on the basis of increasing predicted epitope likelihood score. 20 of the 24 epitope residues are in the 30 top-ranked residues. i, Modeled anti-CCL2 neutralizing antibody (PDB ID 4DN3) and monocyte chemoattractant protein (PDB ID 1DOL) are shown as surfaces with residues colored from orange to green on the basis of a decreasing predicted epitope likelihood score. All 14 of the 14 epitope residues are in the 30 top-ranked residues. j, Modeled anti-shh chimera Fab fragment (PDB ID 3MXV) and sonic hedgehog N-terminal domain (PDB ID 3M1N) shown as surfaces with residues colored from yellow to red on the basis of increasing predicted epitope likelihood score. 15 of the 24 epitope residues are in the 30 top-ranked residues. k, The distribution of the area under the receiver operating characteristic curve (ROC AUC) scores of 28 unbound antibody–antigen complexes for two of the top epitope-predicting servers (SEPPA and BEpro) are compared to that of AbEMap. AbEMap outperforms both in terms of the average (red dot), median (middle line) and 25th and 75th percentiles. l, The F1 and MCC scores of three different methods are compared for model antibodies when only homologs with <80% sequence identity are used as templates. ClusPro AbEMap takes the ensemble average residue scores from 5 to 10 of the best homologs, and EpiPred is used for epitope prediction with the best model antibody. AbEMap outperforms EpiPred before and after ensemble averaging of the likelihood scores.

When given only amino acid sequences of the heavy and light chains of the antibody in the more general second mode, ClusPro-AbEMap starts with a BLAST search for homologous structures in the PDB (Fig. 1b). It requires a sequence identity above 20% and an e-value below 1 × 10^−40; if no templates are found, the e-value threshold is relaxed to 1 × 10^−20. Once the search is complete, only templates with both heavy and light chains that meet the sequence constraints are retained. The resulting templates are then ranked on the basis of both sequence identity and sequence similarity of CDR3s in the heavy and light chains. For complementarity-determining region (CDR) detection, we use the same tools as in the ClusPro server46. We take the five highest-ranked structures based on CDR3 sequence identity and the five highest-ranked structures based on CDR3 sequence similarity and use the union of the two sets as antibody templates. If the CDRs of the antibody cannot be identified, or if there is no CDR (as in single-domain antigen receptors), then the five top candidates ranked by the global sequence identity are selected. The second step is constructing homology models of the antibody on the basis of the selected templates (Fig. 1c). MODELLER tools47 are used to realign the antibody sequences, taking into account the template structural information. The program models the backbone atoms of the non-aligned residues and all side chains, while the backbone atoms of the aligned residues are kept fixed at the template coordinates. The single best model proposed by MODELLER for each template is retained for the next step in the epitope-mapping process; thus, AbEMap generally retains multiple antibody models.
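The template-selection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the server's code: the `Template` record, its field names and the `select_templates` function are hypothetical, and the BLAST hits are assumed to have been parsed already, with one record per candidate antibody structure that has both heavy and light chains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical record for one candidate template (names are illustrative)."""
    pdb_id: str
    seq_identity: float     # global sequence identity to the query, in percent
    e_value: float          # BLAST e-value of the hit
    cdr3_identity: float    # CDR3 sequence identity to the query, in percent
    cdr3_similarity: float  # CDR3 sequence similarity to the query, in percent


def select_templates(hits, top_n=5, min_identity=20.0,
                     strict_e=1e-40, relaxed_e=1e-20, cdrs_found=True):
    """Filter hits, then take the union of the top CDR3-identity and
    top CDR3-similarity templates (or rank by global identity without CDRs)."""
    def passing(e_cutoff):
        return [h for h in hits
                if h.seq_identity > min_identity and h.e_value < e_cutoff]

    # Fall back to the relaxed e-value threshold only if nothing passes.
    candidates = passing(strict_e) or passing(relaxed_e)

    if not cdrs_found:
        ranked = sorted(candidates, key=lambda h: h.seq_identity, reverse=True)
        return ranked[:top_n]

    by_identity = sorted(candidates, key=lambda h: h.cdr3_identity,
                         reverse=True)[:top_n]
    by_similarity = sorted(candidates, key=lambda h: h.cdr3_similarity,
                           reverse=True)[:top_n]
    # dict.fromkeys keeps the first occurrence of each template, in order.
    return list(dict.fromkeys(by_identity + by_similarity))
```

Because the two top-five lists may overlap, the union contains between 5 and 10 templates when enough candidates pass the filters, which is consistent with the ensemble sizes quoted in the Fig. 1 caption.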

When given an antibody X-ray structure or antibody models that have been already constructed, the next step of AbEMap is global antibody–antigen docking by the PIPER program27 that directly docks two protein structures (Fig. 1e). PIPER uses the FFT correlation approach48, which represents the interaction energy of the complex as a weighted sum of correlations between the fixed receptor and rotationally and translationally mobile ligand grids. Together with the FFT method, this representation makes exhaustive conformational sampling of the six-dimensional energy landscape computationally feasible. The standard level of discretization used in PIPER is 70,000 rotations from the Sukharev quasi-uniform grid sequence49 (approximately 5 degrees by Euler angular step) and a translational grid step size of 1 Å. The energy function E includes terms representing repulsive and attractive components of the vdW energy (denoted as Erep and Eattr, respectively), a Coulombic term describing the electrostatic interaction energy (ECoul), a generalized Born-type polar solvation energy term (EBorn) and another solvation term, the structure-based statistical potential EDARS, which is derived with the Decoys As the Reference State (DARS) approach50. A special antibody–antigen asymmetric version of the DARS potential has been developed, significantly improving ClusPro’s antibody–antigen docking accuracy28. This antibody–antigen–specific potential takes advantage of the fact that aromatic residues dominate the paratope but not necessarily the epitope, whereas the epitope generally has a higher level of hydrophobicity than the paratope.
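To illustrate why the FFT makes the translational search cheap, the sketch below scores every translation of a ligand grid against a fixed receptor grid for a single rotation, as a weighted sum of correlations. It is a toy example and not PIPER's implementation: the function names are hypothetical, the grids are arbitrary arrays rather than real energy terms, and the correlation is periodic, so a real application would need zero-padded grids to avoid wrap-around.

```python
import numpy as np


def translational_scores(receptor_grids, ligand_grids, weights):
    """score[t] = sum_p w_p * sum_x R_p(x) * L_p(x - t), periodic in x.

    One FFT-based correlation per energy term replaces an explicit loop
    over all translations t.
    """
    total = np.zeros(receptor_grids[0].shape)
    for R, L, w in zip(receptor_grids, ligand_grids, weights):
        corr = np.fft.ifftn(np.fft.fftn(R) * np.conj(np.fft.fftn(L))).real
        total += w * corr
    return total


def brute_force_scores(receptor_grids, ligand_grids, weights):
    """Same quantity by explicit enumeration of translations (for checking)."""
    shape = receptor_grids[0].shape
    total = np.zeros(shape)
    for t in np.ndindex(*shape):
        for R, L, w in zip(receptor_grids, ligand_grids, weights):
            # np.roll(L, t) places L(x - t) at position x.
            total[t] += w * np.sum(R * np.roll(L, t, axis=(0, 1, 2)))
    return total


rng = np.random.default_rng(0)
R_terms = [rng.normal(size=(4, 4, 4)) for _ in range(2)]
L_terms = [rng.normal(size=(4, 4, 4)) for _ in range(2)]
w_terms = [0.5, 300.0]

fast = translational_scores(R_terms, L_terms, w_terms)
slow = brute_force_scores(R_terms, L_terms, w_terms)
best_translation = np.unravel_index(np.argmin(fast), fast.shape)
```

For a grid with N points, the explicit search costs on the order of N² operations per rotation, whereas the FFT route costs on the order of N log N. Repeating the calculation for each sampled rotation and keeping the best-scoring translation gives one docked pose per rotation.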

To sample antibody–antigen interaction, the known antigen structure is docked to either the known antibody X-ray structure or the ensemble of antibody homology models obtained in the previous step from the antibody sequence data. Following our recently published protocol46, we mask all antibody residues except for CDRs. The server shows results for the energy function currently used in ClusPro for antibody–antigen docking (E = 0.5 Erep − 0.2 Eattr + 300 ECoul + 30 EBorn + 0.2 EDARS). If the input is a computationally predicted or homology-modeled structure or just the antibody sequence, then results for two additional weight sets are provided for the user as default: the option ‘No vdW’, which means that the weights for both vdW contributions are zeroed out, and the option ‘Reduced attractive vdW’, which implies that the weight for the attractive vdW term Eattr is halved. These additional weight sets avoid penalizing possible steric clashes. However, it should be noted that the commonly used maximum repulsive and minimum attractive vdW thresholds are still in place for all coefficients. Reducing the vdW potential’s weights notably increased epitope prediction accuracy when using the AbEMap protocol when only the sequence of the antibody or homology-modeled structure was given (Supplementary Fig. 1). As in the ClusPro server, the best-scored pose per rotation is retained, resulting in a total of up to 70,000 docked poses for further analysis.
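The three weight sets described above can be written down explicitly. The following is a minimal sketch: the dictionary keys and function names are hypothetical, the coefficients are copied from the energy expression quoted in the text, and 'halved' is interpreted as multiplying the Eattr coefficient by 0.5.

```python
# Coefficients of E = 0.5 Erep - 0.2 Eattr + 300 ECoul + 30 EBorn + 0.2 EDARS
STANDARD = {"rep": 0.5, "attr": -0.2, "coul": 300.0, "born": 30.0, "dars": 0.2}


def weight_sets():
    """Return the standard weights plus the two sets offered for modeled antibodies."""
    no_vdw = {**STANDARD, "rep": 0.0, "attr": 0.0}              # 'No vdW'
    reduced_attr = {**STANDARD, "attr": STANDARD["attr"] / 2}   # 'Reduced attractive vdW'
    return {
        "standard": STANDARD,
        "no_vdw": no_vdw,
        "reduced_attractive_vdw": reduced_attr,
    }


def total_energy(terms, weights):
    """Weighted sum of the individual energy terms for one docked pose."""
    return sum(weights[name] * terms[name] for name in weights)
```

Calling `total_energy` with the same per-pose terms and each of the three dictionaries shows how the ranking of poses can change when the vdW contributions are down-weighted.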

Once the PIPER docking poses and energies are obtained for the antibody structure or for each antibody model i in the ensemble of homology models (which can include one or several models, depending on the number of suitable templates), the top 1,000 lowest-energy poses are selected, and for each such pose j, the number lij of antigen surface atoms51 that are in contact with the corresponding antibody is counted. More precisely, any heavy atom on the antigen surface found to be within the 5-Å threshold from any of the antibody surface heavy atoms is considered to be in contact with the antibody. For each antigen atom on the interface, we calculate a Boltzmann-weighted normalized contact ‘occurrence’ as follows:

$$\upsilon_{ij}^{k} = \frac{\exp\left(-\frac{\varepsilon_{ij} - \varepsilon_{io}}{T}\right)}{l_{ij}}$$

where εij and εio are the jth and the best PIPER energy scores of the ith antibody structure in the ensemble, and a value of 100 was used for T (‘temperature’) to scale the relative energy scores. After summing the atomic contributions shown above over j (the docked structures) and averaging over i (the different antibody models), an epitope likelihood score that indicates how often the atom participates in the antibody–antigen interface of low-energy models predicted by PIPER is obtained. The total number of considered docked structures and the ‘temperature’ factor were optimally selected by using the receiver operating characteristic (ROC) AUC score obtained from the likelihood scores of each residue (area under the ROC curve; this score is described in more detail below in Performance measures). The top 1,000 lowest-energy poses and a T value of 100 give the best results (Supplementary Fig. 2). Because PIPER generally increases the number of docked structures around the native interface, it is expected and observed that atoms predicted to be in the epitope more frequently by the docked structures are more likely to be in the true epitope. This likelihood score is shown as the B-factor value in the final PDB file given to the user (Fig. 1d), and it helps to visually highlight plausible epitope regions. For evaluating epitope prediction accuracy, we convert the atom likelihoods to residue likelihoods by summing up the atomic contributions for each residue. Although adding atomic likelihood values implies that bigger residues with more surface-accessible atoms are scored better, the residue likelihood values are not corrected for size, and hence this bias may have to be accounted for by the user.
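The scoring procedure can be sketched as follows. This is an illustration under assumptions rather than the server's implementation: each pose j of antibody model i is assumed to contribute a weight exp(−(εij − εio)/T) divided by lij to every antigen atom it contacts, the contact information is assumed to be available as a Boolean pose-by-atom matrix, and all function and argument names are hypothetical.

```python
import numpy as np


def atom_scores_one_model(energies, contacts, n_top=1000, T=100.0):
    """Sum Boltzmann-weighted, contact-normalized occurrences over the top poses.

    energies: (n_poses,) docking energy scores for one antibody structure.
    contacts: (n_poses, n_atoms) Boolean, True if the antigen surface atom is
              within the contact threshold of the antibody in that pose.
    """
    energies = np.asarray(energies, dtype=float)
    contacts = np.asarray(contacts, dtype=bool)
    order = np.argsort(energies)[:n_top]        # lowest-energy poses first
    e = energies[order]
    c = contacts[order].astype(float)
    n_contacts = c.sum(axis=1)                   # l_ij for each pose
    boltzmann = np.exp(-(e - e[0]) / T)          # e[0] is the best score
    per_pose = np.divide(boltzmann, n_contacts,
                         out=np.zeros_like(boltzmann), where=n_contacts > 0)
    return per_pose @ c                          # sum over poses j


def epitope_likelihood(models, n_top=1000, T=100.0):
    """Average the per-atom scores over the ensemble of antibody models."""
    return np.mean([atom_scores_one_model(e, c, n_top, T) for e, c in models],
                   axis=0)


def residue_scores(atom_scores, residue_index):
    """Sum atomic scores per residue (no correction for residue size)."""
    return np.bincount(residue_index, weights=atom_scores)
```

A two-pose toy case shows the arithmetic: with energies 0 and 100, T = 100 and two contacting atoms per pose, the first pose adds 0.5 to each of its atoms and the second adds 0.5·e⁻¹ to each of its atoms.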

Protein datasets used for testing AbEMap

A set of 40 antibody–antigen complexes found in the widely accepted protein–protein docking benchmark version 5.0 (BM5)52 from the Weng laboratory was used to test our protocol (Figs. 1–3). To ensure non-redundancy, the authors selected an antibody–antigen complex only if the antigen was not in the same Structural Classification of Proteins53 family, and it did not share more than 80% of the interface residues with another52. The BM5 set contains 12 antibody–antigen complexes with the antibody crystallized only in complex with the respective antigen but not on its own (termed ‘unbound-bound targets’). For the other 28 complexes, the X-ray structure of the antibody has been determined both on its own and in complex with the antigen (termed ‘unbound-unbound targets’). In both cases, the antigen was independently crystallized in addition to its form in a complex with the respective antibodies. We compared the performance of AbEMap to that of SEPPA, BEpro and EpiPred by using this set of antibody–antigen complexes. However, EpiPred did not work for one of the unbound targets (PDB ID 2I25), most likely because it contains a single-chain shark antigen receptor without any CDR rather than a traditional antibody. Therefore, the target 2I25 was excluded from figures that compare AbEMap with EpiPred. In addition, we wanted to test the protocol on the 23 antibody–antigen complexes that were recently added to the docking benchmark set (denoted ‘BM5.5’)29. However, AbEMap was unable to provide antibody models for two of the complexes, and hence our discussion is restricted to the remaining 21 targets.

Fig. 2: Epitope-mapping performance of four different servers tested on 28 unbound-unbound antibody–antigen complexes in the benchmark set BM5.

F1 and MCC scores for ClusPro-AbEMap, SEPPA, EpiPred and BEpro at different cutoff thresholds when the antigen residues are ranked by the obtained scores. The measures are averaged over the 28 complexes. The AbEMap results are slightly better than the ones obtained by SEPPA and substantially better than the ones obtained by BEpro and EpiPred.

Fig. 3: Examples of AbEMap’s applications.

a, The complex of birch pollen Bet V1 (blue surface) with the bound monoclonal antibody (magenta), PDB ID 1FSK. The true epitope residues are highlighted in red, and three of the homology models of the antibody are shown in green. The CDR3 of the heavy chain on the native antibody is highlighted in cyan. b, The integrin alpha-L 1 domain (blue surface) with true epitope residues (red) is shown with the different poses of the Efalizumab FAB fragment predicted by PIPER. The cluster centers of the top antibody clusters are shown as gray pseudo-atoms. The top cluster’s representative is shown as a cartoon (green). c, VEGF protein (blue surface) with the true epitope residues (red) is shown with the different poses of the FAB fragment of a neutralizing antibody predicted by PIPER. Similar to b, the top cluster centers are shown as gray pseudo-atoms, and the top-ranked cluster representative is shown as a cartoon. d, ROC plots of AbEMap’s performance with X-ray and model structures of the antibodies as inputs for the 28 unbound antibody–antigen complexes in the BM5 set. As shown, the use of homology models provides essentially the same accuracy as using the separately solved X-ray structures of antibodies. Hmlg, homology modeling.

Performance measures

The prediction performance for each antibody–antigen pair was evaluated on the basis of the ground truth obtained from the bound structures of the complexes. The true epitopes are simply the residues of the antigen that are within 5 Å from the nearest antibody heavy atom in the native complex54,55. It appears that the most widely used performance measure among epitope-mapping servers10,11,12,18 is the ROC AUC, and hence it was also used to compare the performance of ClusPro-AbEMap with that of other probabilistic servers such as SEPPA and BEpro (we recall that an ROC curve plots the true positive (TP) rate versus the false positive (FP) rate, and ROC AUC is the area under the ROC curve). We also show the F1 score (the harmonic mean of precision and recall) and the Matthews correlation coefficient (MCC) score at different residue rank cutoffs, as used in other studies.
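The measures named above can be computed from per-residue scores and true epitope labels as in the sketch below. It is not the evaluation code used in the study: the function names are hypothetical, the F1 and MCC scores treat the `n_top` highest-scoring residues as predicted epitope residues, and the ROC AUC is computed through its standard equivalence with the probability that a randomly chosen epitope residue outscores a randomly chosen non-epitope residue (ties counted as one half).

```python
import math

import numpy as np


def f1_mcc_at_cutoff(scores, is_epitope, n_top):
    """F1 and MCC when the n_top highest-scoring residues are called epitope."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(is_epitope, dtype=bool)
    predicted = np.zeros_like(truth)
    predicted[np.argsort(-scores)[:n_top]] = True

    tp = int(np.sum(predicted & truth))
    fp = int(np.sum(predicted & ~truth))
    fn = int(np.sum(~predicted & truth))
    tn = int(np.sum(~predicted & ~truth))

    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return f1, mcc


def roc_auc(scores, is_epitope):
    """Area under the ROC curve via pairwise comparison of scores."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(is_epitope, dtype=bool)
    pos, neg = scores[truth], scores[~truth]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)
```

Sweeping `n_top` over a range of rank cutoffs and averaging the resulting F1 and MCC values over the benchmark complexes would give curves of the kind summarized in Fig. 2.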
