MM/GBSA prediction of relative binding affinities of carbonic anhydrase inhibitors: effect of atomic charges and comparison with Autodock4Zn

Docking of ligands with Autodock Vina

To address the first question posed in the introduction, we have employed the AutoDock4Zn force field and Autodock Vina program to predict the binding affinities of the test set of 32 ligands docked in their deprotonated form. In a recent assessment study, this docking program was found to perform very well for zinc metalloenzymes [16]. In this study, we see similarly good performance for the test set of 32 ligands where the correlation between the predicted and experimental binding affinity is shown in Fig. 3.

Fig. 3figure 3

Predicted binding affinity of the full dataset of ligands by docking with the AutoDock4Zn force field in AutoDock Vina. Trendline shown with all data points (dashed) and with outliers removed (solid). Outliers denoted with a hollow symbol

When the entire dataset is considered, Autodock Vina yielded a relatively weak correlation with experimental binding affinities (R2 = 0.41). However, if two notable outliers, ligands 22 and 29 were removed, the correlation is significantly improved (R2 = 0.64). Ligand 22 has an experimental pKa of 6.3 [37] which is about 2–4 pKa units lower compared to most of the other ligands. Including the effect of deprotonation is expected to increase the binding energy of 22 by approximately 2 to 5 kcal mol−1 (more negative) relative to the other ligands which would bring it closer to the trend line. Indeed, Figure S1 in the Supporting Information shows that for ligands where experimental pKa values are available, the inclusion of the pKa correction in the docking scores improves the correlation with experimental values. Interestingly, 28 is structurally very similar to 26, 27 and 29, however it is the only one of the four analogues that is an outlier.

Despite the reasonably good correlation between the docking scores and experimental binding energies, there was a consistent difference between the predicted bound pose of the ligands compared to crystal structures. For the 24 ligands with crystal structures, the average RMSD between the crystal structure pose and lowest energy docked pose is > 6 Å. No ligands had a lowest energy binding mode with an RMSD of less than 2.5 Å from the crystal structure—in one case, 25 is predicted to bind outside of the active site. Two example ligands, arylsulfonamides 4 and 11 are shown in Fig. 4. In the crystal structures of their complexes with hCA II, the sulfonamide oxygen is also in close proximity to the Zn(II) ion (ca. 3 Å) which resembles a bidentate binding mode. When 4 is docked into the protein, the predicted binding pose is unable to correctly model the bidentate zinc—ligand interaction where the Zn–O distance is about 1 Å larger than in the crystal structure. As a result, the orientation of the phenyl ring in the docked pose and crystal structure are significantly different. On the other hand, 11 is predicted to have the correct bidentate zinc coordination, however the position of the long tail is incorrect. When the docked poses with the smallest deviation from the crystal structure is considered regardless of the corresponding binding energy, the mean RMSD reduces to 4.30 Å. Ligand 3 has the lowest RMSD of 1.9 Å, for a binding mode 0.23 kcal mol−1 higher in energy than the lowest energy pose. However, for 4 ligands—11, 12, 23 and 24, the AutoDock Vina program is unable to predict a docked pose within 6 Å of the crystal structure. When the energy of the binding pose with the smallest deviation from the crystal structure is used, no correlation with experiment is observed (R2 = 0.09).

Fig. 4figure 4

Comparison of crystal and docked binding poses of arylsulfonamide ligands 4 (left) and 11 (right). The crystal structure pose is shown in translucent stick and ball, and the docked pose is shown as full colour liquorice. Protein residues shown coloured by their residue type (non-polar residues in brown, polar residues in green, acidic residues in blue). Zn-sulfonamide oxygen (Zn-O) distances in the docked poses and crystal structures are also displayed

Validation of bonded parameters to maintain the zinc coordination geometry

In MD simulations of ligand binding to hCAs, it is important to preserve the tetrahedral coordination environment around Zn(II). Previously, both “bonded” [1, 40, 43] and “non-bonded” [67] approaches have been used to retain the tetrahedral zinc coordination. In the former, force field parameters are developed to model the Zn-ligand and Zn-Histidine bonds while in the latter, no bonded parameters are defined for Zn.

Lin and Wang systematically derived bonded parameters for zinc containing systems in the AMBER force field [40]. Similarly, Lu and Voth also developed bonded parameters defining the Zn(His)3 complex for carbonic anhydrase [43]. These bonded parameters were used in a partially bonded model in work within the Meuwly group [65]. In this approach, bonds are defined linking each of the three histidine nitrogen atoms to the zinc atom, whereas the bond between the zinc and sulfonamide nitrogen is retained using collective variables and there was a moderately strong correlation between their MM/GBSA and experimental binding energies [21]. Inspired by these efforts, a similar partially bonded model was applied here. As the choice of zinc Lennard–Jones parameters can affect the coordination in the active site [48], two different vdW parameters were tested: the default CHARMM Zn2+ Lennard-Jones parameters, and one published Lu and Voth. In the latter, the εmin (well depth at the minimum) is smaller than the parameter in the CHARMM forcefield, and so the attractive force due to the interaction is smaller. The use of the CHARMM Lennard–Jones parameters led to ligands adopting an incorrect coordination geometry, where the sulfonamide nitrogen is displaced by a sulfonamide oxygen in the coordination shell, with the trajectory-averaged distance between the zinc and sulfonamide nitrogen of 4.17 Å, which is 2.2 Å greater than found in the crystal structure. (Fig. 5). The use of the Zn(II) Lennard–Jones parameters optimised by Lu and Voth prevents this change in binding geometry and hence was used for all subsequent simulations [43]. It is clear that regardless of the choice of zinc vdW parameter, the ligand loses its bidentate binding mode, and instead retains a monodentate interaction with the zinc. The trajectory-averaged zinc–sulfonamide oxygen distance is 1.1 Å larger than the crystal structure for simulations with the Voth Zn (II) parameter. The zinc–histidine bond distance was restrained to a distance of 2.0 Å to better reproduce crystal structure geometries. A complete list of parameters can be found in Table 1 and Fig. 6.

Fig. 5figure 5

Representative binding pose of Ligand 4 using different Lennard–Jones parameters. Crystal structure (PDB ID 6RH4, left), Snapshots from a simulation of the solvated protein–ligand system with the zinc(II) vdW radii optimised by Lu and Voth (centre), and snapshot from a simulation of the solvated protein–ligand system with the CHARMM zinc(II) vdW radii (right). Only ligand and Zn(His)3 binding site are depicted. Average interatomic distances over a 4 ns production run are shown

Table 1 Bonded and non-bonded parameters used to maintain the zinc tetrahedral geometry in conjunction with the CHARMM force field.Fig. 6figure 6

Atom typing for Zn(II) and coordinating histidine residues

The distribution of Zn-N(sulfonamide) distance along a MD trajectory for CAII and ligand 4 is plotted in Fig. 7 which coincides reasonably well with the crystal structure distance.

Fig. 7figure 7

Normalized histogram of the distance between the zinc and sulfonamide nitrogen over an 8 ns trajectory for Ligand 4. Red line represents the crystal structure distance of 1.97 Å (PDB ID 6RH4)

Effect of QM ligand charges on MM/GBSA binding energies

Having established a suitable set of force field parameters, this section focuses on conducting MD simulations using different atomic charges to represent the ligand and the effect this has on the resulting MM/GBSA binding energies. Atomic charges were calculated using the Mulliken, ESP, and NPA schemes at the HF/6-31G(d,p), B3LYP-D3(BJ)/6-31G(d,p), and M06-2X/6-31G(d,p) levels of theory. These charges were compared for their ability to predict trends in binding affinities for a set of 15 ligands previously studied in the Meuwly group (ligands 115 from Fig. 2) [65]. In that work, a correlation coefficient of 0.55 was achieved for these 15 ligands which is somewhat lower compared to the values obtained in this work (vide infra). The effect of atomic charges on the binding affinity is compared in Figs. 8, 9 and 10.

Fig. 8figure 8

Predicted binding affinities of ligands 115 from MM/GBSA simulations with ESP, Mulliken, and NPA charges from a B3LYP-D3(BJ)/6-31G(d,p) calculation on the ligands. MM/GBSA energies averaged from 5 independent 4 ns production runs

Fig. 9figure 9

Predicted binding affinities of ligands 115 from MM/GBSA simulations with ESP, Mulliken, and NPA charges from an M06-2X/6-31G(d,p) calculation on the ligands. MM/GBSA energies averaged from 5 independent 4 ns production runs

Fig. 10figure 10

Predicted binding affinities of ligands 115 from MM/GBSA simulations with ESP, Mulliken, and NPA charges from a HF/6-31G(d,p) calculation on the ligands. MM/GBSA energies averaged from 5 independent 4 ns production runs

For this set of ligands, the use of ESP atomic charges consistently gave the best correlation between the experimental and MM/GBSA binding affinities. Notably, ESP charges calculated at the B3LYP-D3(BJ)/6-31G(d,p) level of theory yielded very good correlation with the experimentally derived binding energies, with an R2 value of 0.84.

Mulliken charges also yielded good correlation with experimental data when calculated with the two DFT methods. In particular, the protocol used here in conjunction with charges obtained at the B3LYP-D3(BJ) level of theory showed an improved performance compared to that of Meuwly and co-workers (R2 = 0.55 c.f. 0.66 in this work). Compared to ESP charges, Mulliken charges are not suitable for the prediction of trends in binding affinities when computed at the HF level of theory. This is unsurprising, as Mulliken charges are known to be sensitive to the level of theory at which they are obtained [39].

Despite previous work demonstrating NPA charges can accurately model solute–solvent interactions [14], here the use of NPA charges gave no correlation with the experimental data irrespective of the choice of QM method. Scheme 1 compares the atomic charges obtained using different schemes for a representative ligand 4. The automatically assigned CGENFF atomic charges are also shown for comparison but were not considered in the MD simulations due to the high penalty scores associated with these assigned charges. As shown, the use of NPA charges significantly polarises the charges around the sulfonamide head. In particular, the charge on the sulfur atom is around 2.5 times greater when assigned using NPA charges compared to ESP charges. Figure 11 shows the correlation between Mulliken and ESP compared to the correlation between Mulliken and NPA charges. It is clear that the NPA charge scheme yields charges that are disproportionately larger in magnitude compared to Mulliken charges, most notably for the sulfonamide S atom, which sit in the top right of the graph, and the sulfonamide nitrogen at the bottom left.

Scheme 1scheme 1

Distribution of partial charges of ligand 4 derived from CGENFF and B3LYP-D3(BJ)/6-31G(d,p) geometry optimisations and calculations. Partial charges obtained from Mulliken, ESP or NPA fittings. Partial charges rounded to three decimal places

Fig. 11figure 11

Mulliken and NPA charges plotted against ESP charges for all atoms in all ligands. NPA charges are significantly more polarised than Mulliken charges

To test the broader performance of the best performing charge schemes, we expanded on the test set to include a total of 32 chemically diverse ligands (Fig. 2). Binding affinities were predicted from simulations using Mulliken or ESP charges from either B3LYP-D3(BJ) or M06-2X, and compared to experimental data. The binding affinities and uncertainty are presented in Table 2, and plotted in Fig. 12. The RMSD in predicted absolute binding energies and pairwise RMSD in relative binding energies are also presented in Table 2, where the latter is about 2–3 times smaller presumably due to systematic cancellation of errors.

Table 2 Predicted MM/GBSA Binding Affinities of ligands 1 to 32 to hCA II with different charge schemes. Binding data is the average of 5 independent 4 ns trajectories. Uncertainty presented as 2 times the standard error in the meanFig. 12figure 12

MM/GBSA binding affinities of ligands 132 calculated with ESP or Mulliken atomic charges from DFT methods. Hollow symbols denote outliers, and dashed lines the correlation including the outliers. MM/GBSA energies averaged from 5 independent 4 ns production runs

Notably, for this larger and more diverse dataset the correlation between experimental and predicted binding affinities reduced significantly compared to the original set containing ligands 115. This effect was independent of the charge scheme used. ESP charges showed better correlation with the experimental data than Mulliken charges regardless of the QM level of theory. Both the B3LYP-D3(BJ) and M06-2X charges gave very similar correlation coefficients (R2 = 0.49 and 0.48 respectively). Nevertheless, when four outliers (22, 27, 28, and 30) were removed, this restored the good correlation with experimental binding affinities where the R2 increased to 0.77 in the best case for B3LYP ESP charges.

Further inspection of the four outliers reveals several interesting observations. As with the docking protocol, 22 (trifluomethylsulfonamide) has a significantly lower pKa (6.3) than the other ligands in the dataset, and hence the energetic cost of deprotonation is lower [37]. The pKa values of the other ligands range from 7.4 to 10.8, with the majority above 8.5. At a pH of 7, the difference in deprotonation energy between a ligand with a pKa of 6.3 and 8.3 translates to an approximately 2 kcal mol−1 change in binding free energy. Consequently, whilst this effect likely cancels for the other ligands with similar pKa values, the lower energetic requirement for deprotonation resulted in an underestimation of the binding affinity of 22 relative to other ligands and should bring it closer to the trend lines when the effect is included (Fig. 12).

Ligand 28, which contains a carbohydrate group has consistently underestimated binding affinities. This appears to be consistent with literature where the CHARMM carbohydrate forcefield is known to significantly and consistently underestimate protein—carbohydrate binding affinities [59].

Whilst ligands 2629 all had overestimated binding affinities, ureidobenzenesulfonamides 27 and 28 consistently showed the largest deviation from the trendline. These four ligands include SLC-0111 (26) and its derivatives which are more conformationally flexible than the majority of the ligands in the test set. Specifically, these compounds can adopt 3 distinct conformations, namely syn-syn, syn-anti, and anti-anti, corresponding to different orientations of the -NH groups pointing either towards (syn) or away from (anti) the chalcogen. For the ureidobenzensulfonamides 26, 28 and 29, different binding poses are observed in independent trajectories, corresponding to the anti-anti and syn-anti conformers. Whilst 26 and 29 have similar binding affinities for each pose, the ortho-substituted isopropyl phenyl group in 28 is significantly affected by the orientation of the urea as shown in Fig. 13. Binding in the anti-anti conformer results in a 7 kcal mol−1 increase in binding affinity relative to the syn-anti. In the syn-anti pose the isopropyl group points towards the hydrophobic pocket, whereas in the anti-anti conformer the phenyl ring is more easily able to form a strong hydrophobic interaction with a phenylalanine residue. This large variance results in the large uncertainties as shown in Table 2, with an uncertainty of 3.06 kcal mol−1 in MM/GBSA binding energies obtained from B3LYP-D3(BJ) ESP charges. To examine if this was due to insufficient sampling, 100 simulations were run with B3LYP-D3(BJ) ESP charges (50 with an anti-anti starting pose and 50 with a syn-syn starting pose). The binding affinity was predicted to be -25.6 kcal mol−1 (σ = 2.44) compared with -27.43 kcal mol−1 (σ = 3.46) when only five trajectories were run. This 2 kcal mol−1 shift brings the data point closer to the trend line indicating that insufficient sampling is likely an issue here. Notably, regardless of starting pose, trajectories containing both the syn-syn and syn-anti conformers were observed. The crystal structure shows that ligand 28 binds in the anti-anti conformation [53]. For the thiourea 27, the ligand adopts the syn-syn conformer in all trajectories, whilst the crystal structure shows the ligand binds in a syn-anti conformer [2]. However, fixing the ligand in the crystal structure conformer may not necessarily improve the correlation with experiment, as evidenced by the crystal structure anti-anti conformer of ligand 28 having a greater divergence from the predicted trendline of binding affinities.

Fig. 13figure 13

Representative binding modes of 28 in two different simulations. The ureido group can adopt different conformations, which affects the position of the isopropyl group. The anti-anti (left) and syn-anti (right) conformations are observed. Protein residues shown coloured by their residue type. Non-polar residues shown in brown, polar residues shown in green. Hydrophobic carbon–carbon interaction distances indicated by red dashed lines

When these ligands are excluded, the use of B3LYP-D3(BJ) optimised ESP charges has a strong correlation with the experimental binding affinities (R2 = 0.77), which only slightly lower than the correlation observed with only the original 15 ligands. Similarly, ESP charges obtained with M06-2X also show reasonable correlation, with an R2 of 0.66. Mulliken charges show limited correlation between the predicted and experimental binding data even with these difficult to model inhibitors excluded.

留言 (0)

沒有登入
gif