Efficient determination of the accessible conformation space of multi-domain complexes based on EPR PELDOR data

Determination of the conformation space of weakly interacting but covalently bound systems

So far we have used EPR PELDOR spectroscopy to study two distinct macromolecular systems that are characterized by weak interactions. In a previous study, we investigated a di-ubiquitin chain in which the two moieties are linked via a covalent bond between the side chain of K48 of the acceptor molecule and the C-terminus of the donor molecule (Kniss et al. 2018). In the currently investigated system, the C-terminus of ubiquitin was covalently linked to the side chain of K89 of the E2 enzyme Ubc7, which is itself linked to the U7BR peptide from yeast (von Delbruck et al. 2016). The U7BR peptide is a domain of the Cue1 protein; it binds to the backside of the Ubc7 E2 ligase and stabilizes it (Metzger et al. 2013). In cells, this interaction is also required to recruit the E2 enzyme to the membrane of the endoplasmic reticulum (Biederer et al. 1997). Throughout our investigations we used a single-chain construct in which the U7BR peptide is linked via a glycine-serine linker to the C-terminus of Ubc7. K89 replaces the wild-type C89 residue of Ubc7 and was chosen to produce a more stable bond between Ubc7-U7BR and ubiquitin (Plechanovova et al. 2012); because the lysine side chain is longer than the cysteine side chain, this stabilized bond might, however, lead to an increased flexibility. Both systems, ubiquitin chains as well as E2-ubiquitin conjugates, play important roles in cellular signal transduction, and characterizing these complexes is important for understanding their biological functions at the molecular level. The binding constants of ubiquitin with its numerous interaction partners (e.g. other ubiquitin molecules, E2 and E3 enzymes as well as a myriad of different ubiquitin binding domains) have been determined and are very often in the range of ~10 µM (Dikic et al. 2009). The combination of such weak interactions with a covalent attachment results in a conformation space populated by many different states with different probabilities.
In case of the interaction of E2 enzymes with conjugated ubiquitin, a combination of NMR chemical shift differences, PRE measurements and SAXS data had been used to show that the UbcH5c–ubiquitin system shows a wide distribution of relative orientations of both proteins with respect to each other, while this distribution for the Ubc13–ubiquitin conjugate was far more restricted with ubiquitin preferentially interacting via its Ile44 centered hydrophobic patch with helix 2 of the E2 enzyme (Pruneda et al. 2011).

Previous protocol for calculating the conformation space of di-ubiquitin

In our previous investigation of the conformation space of di-ubiquitin we had used an approach based on attaching an MTSL spin label via a cysteine side chain to different positions within both ubiquitin molecules of a di-ubiquitin chain (Kniss et al. 2018). Subsequently, EPR PELDOR measurements were performed and analyzed using the Gaussian model-based approach implemented in the DeerAnalysis2016 (Jeschke et al. 2006) or DD (Stein et al. 2015) software packages, which yields the distance distribution between the two spin labels. The results of the analysis were cross-checked against the distributions obtained using the Tikhonov-regularized model-free approach implemented in DeerAnalysis2016, which always gave almost identical results. For the subsequent determination of the conformation space, the model-based approach was preferred because its distance probability distributions are free of additional components that can arise, for instance, from the specific choice of the regularization parameter or from the intermolecular component by which the signal has to be divided prior to the analysis. Besides, the Tikhonov regularization approach may be unsuitable for distance probability distributions characterized by the coexistence of narrow and broad components, as would result, for example, from the equilibrium between an open and a closed conformation.

To determine and visualize the populated conformation space based on the measurement of several of these distance distributions, we developed protocols for the software package CYANA (combined assignment and dynamics algorithm for NMR applications) (Guntert et al. 1997; Guntert and Buchner 2015) that is used to determine structures of biological macromolecules based on NMR-derived restraints. For these calculations, a CYANA library entry for the side chain of MTSL-labelled cysteine was introduced (Kniss et al. 2018). In our original implementation of PELDOR-based ensemble distribution calculations that we used for the determination of the conformation space of di-ubiquitin, we created in the first step a conformation ensemble with a broad distance distribution between both ubiquitin molecules using CYANA version 3.9. A virtual atom was placed in the center of the terminal N–O bond of each spin-labelled amino acid side chain as reference point for distance restraints. The dihedral angles of the spin-labelled side chain were restricted based on a rotamer library generated for each labelled residue by MMM2015.1 in 298 K mode (Polyhach et al. 2011). These rotamer library restraints were included in our calculations using an extension of CYANA (Guntert et al. 1997; Guntert and Buchner 2015) that takes into account all rotamers from the given rotamer library; this was realized by a new type of restraint that has a value of 0 (no contribution to the target function) if the spin-labelled side chain adopts a conformation included in the library and increases with increasing deviation from the allowed conformations in this library. Subsequently, from the set of experimentally available EPR distance restraints (five in this specific case; Kniss et al. 2018), a conformation ensemble was built by varying each distance between spin-labelled positions systematically and independently within the full range of the corresponding distance probability distribution.
For this purpose, the aforementioned ranges, identified by the condition that the probability distribution has to be above a given threshold (in the specific case 0.1% of the maximum value), were divided into 0.5-nm bins and an additional term was added to the CYANA target function to enforce the fulfillment of the specific combination of inter-spin distances; similarly to how NOE-derived restraints are taken into account, this term gives no contribution if each distance is within the given bin and adds a quadratic penalty with increasing deviation outside of the bins (Guntert et al. 1997).

The conformation ensemble generated according to this procedure contained all possible combinations of the five measured distances within each of these distance ranges. Overall, 51,000 structure calculations were performed. For each distance combination, a bundle of the 20 conformers with the lowest target function out of 100 calculated structures was generated. All structures showing van der Waals collisions between the two ubiquitin moieties or not fulfilling the above-described distance restraints were discarded to generate a collision-free conformation ensemble of approximately 3.7 × 10⁵ models. This ensemble represents the accessible conformation space that is consistent with the upper and lower boundaries of the PELDOR measurements.
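The enumeration described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CYANA protocol itself: the function names, the toy distributions and the grid are hypothetical, and only the 0.1% threshold, the 0.5-nm bin width and the flat-bottom quadratic penalty are taken from the text.

```python
from itertools import product

def populated_range(r, p, rel_threshold=0.001):
    """Distance range in which p exceeds rel_threshold times its maximum."""
    cutoff = rel_threshold * max(p)
    kept = [ri for ri, pi in zip(r, p) if pi > cutoff]
    return min(kept), max(kept)

def split_into_bins(r_min, r_max, width=0.5):
    """Divide [r_min, r_max] into consecutive bins of the given width (nm)."""
    bins = []
    lo = r_min
    while lo < r_max - 1e-9:
        hi = min(lo + width, r_max)
        bins.append((lo, hi))
        lo = hi
    return bins

def flat_bottom_penalty(d, lo, hi):
    """Zero inside the bin [lo, hi], quadratic in the deviation outside it."""
    if d < lo:
        return (lo - d) ** 2
    if d > hi:
        return (d - hi) ** 2
    return 0.0

# Two hypothetical distance distributions on a 0.5-nm grid (distances in nm).
r = [2.0, 2.5, 3.0, 3.5, 4.0]
distributions = [
    (r, [0.0, 0.2, 0.6, 0.2, 0.0]),
    (r, [0.1, 0.4, 0.4, 0.1, 0.0]),
]

bins_per_pair = [split_into_bins(*populated_range(rr, pp)) for rr, pp in distributions]

# Every combination of one bin per spin-label pair needs its own calculation,
# so the number of calculations is the product of the bin counts.
combinations = list(product(*bins_per_pair))
print(len(combinations), "bin combinations")
```

Because the number of combinations is the product of the per-pair bin counts, it grows exponentially with the number of spin-label pairs, which is the scaling problem discussed in the following paragraphs.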

To interpret and visualize the conformation ensemble of di-ubiquitin, probabilities were assigned to each of the conformers as described below. The results of that study enabled a better understanding of shifts in the conformation ensemble of multimers due to different experimental conditions and the influence of modulating ligands. However, this method was a brute-force approach requiring a complete structure calculation for each distance combination. Thus, while the proposed approach was successful in determining the conformation space of di-ubiquitin based on five PELDOR restraints (Kniss et al. 2018), the exponential increase in computational time with the number of restraints hinders its application to systems where more restraints are measured to achieve higher accuracy. Furthermore, the introduction of tight distance restraints between the spin labels led to a number of disrupted conformers that had to be filtered out after the conformation sampling.

The more efficient calculation protocol and its application to the Ubc7-U7BR–ubiquitin conjugate

With an increasing number of restraints, as we had planned for the investigation of the Ubc7-U7BR–ubiquitin system, this particular implementation would have resulted in unrealistically long computation times. To allow for a more efficient conformation sampling comparable to applications of CYANA in NMR, it is necessary to introduce the PELDOR-derived distance restraints directly into the CYANA target function. As previously mentioned, NMR distance restraints can be taken into account during the structure calculation with CYANA by introducing them as upper and lower distance limits between two atoms or atom groups; a violation of a distance restraint is represented by a quadratic penalty added to the target function, whereas a restraint does not contribute to the target function if the distance is within the specified upper and lower limits.

Compared to NMR distance restraints, however, PELDOR-based distance distributions between distant spin labels are rather broad, especially if the labels are located on different proteins. Thus, introducing this additional kind of restraint into the CYANA target function as a simple distance restraint faces two major obstacles. (1) As the CYANA molecular dynamics calculation is performed in torsion angle space, each term in the target function must be a differentiable function of the torsion angles with efficiently computable first partial derivatives. (2) The simulated annealing approach implemented in CYANA results predominantly in solutions near or at the minimum of the target function. As a result, conformations that are near the most highly populated distance in the PELDOR distance distribution might be over-represented in the resulting structure bundle.

To overcome these issues, we devised the following approach. At the first step of the structure optimization, the experimental distance distributions are included into the CYANA target function as additive terms that contribute 0 to the target function if an inter-spin distance is in agreement with the corresponding experimental distance distribution and a value > 0 in relation to the deviation from the distance distribution. To this end, each experimental distance distribution is represented by $N$ discrete pairs $\left\{(r_1,~p_1),~\ldots,~(r_N,~p_N)\right\}$, where $r_i$ is a distance value and $p_i$ the corresponding probability (Fig. 1A), which is converted into a continuous distance distribution $P\left(r\right)$ (Fig. 1C) using Bézier curves such that $P\left(r_i\right)\approx p_i$. This last step ensures that analytical derivatives can be formulated, thus addressing the first issue. The resulting target function term $T$ is

$$T\left(r\right)=A\,\max\left(0,~1-\frac{P\left(r\right)}{p_{\max}\,c}\right)$$

where $A$ is a weighting factor, $p_{\max} = \max\left\{p_1,\ldots,p_N\right\}$, and $c=0.75$ a cutoff threshold (Fig. 1E). This target function term adds a penalty only where the normalized probability $P\left(r\right)/p_{\max}$ is below the cutoff threshold $c$ (Fig. 1F).
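As a numerical illustration, the sketch below evaluates a term of the form $T(r) = A\,\max(0,\, 1 - P(r)/(p_{\max} c))$ for a toy distribution. It makes two assumptions that are not part of the original protocol: the continuous $P(r)$ is obtained by piecewise-linear interpolation instead of Bézier curves, and the distribution values are invented. All function names are hypothetical.

```python
from bisect import bisect_right

def make_interpolator(r, p):
    """Piecewise-linear stand-in for the smoothed P(r); zero outside the grid."""
    def P(x):
        if x < r[0] or x > r[-1]:
            return 0.0
        i = min(bisect_right(r, x), len(r) - 1)  # index of the right-hand node
        w = (x - r[i - 1]) / (r[i] - r[i - 1])
        return (1.0 - w) * p[i - 1] + w * p[i]
    return P

def target_term(x, P, p_max, A=10.0, c=0.75):
    """Penalty that vanishes wherever P(x) / p_max is at least the cutoff c."""
    return A * max(0.0, 1.0 - P(x) / (p_max * c))

# Toy distribution; distances in Angstrom, probabilities invented.
r = [20.0, 25.0, 30.0, 35.0, 40.0]
p = [0.0, 0.2, 0.4, 0.2, 0.0]
P = make_interpolator(r, p)
p_max = max(p)

for x in (20.0, 25.0, 28.75, 30.0, 50.0):
    print(x, round(target_term(x, P, p_max), 3))
```

In this toy case the penalty is zero on a plateau around the maximum of the distribution, rises towards its flanks, and reaches the full weight $A$ wherever the probability is zero.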

Fig. 1

Transforming a discrete experimental distance distribution to a CYANA target function term. In each panel the horizontal axis is the distance expressed in Å between the respective spin labels. Panel A shows the discrete probability density function, which is normalized in (B), and made continuous using splines in (C). To obtain a penalty term for the CYANA target function the distribution is inverted (D) such that regions of zero probability density yield the highest penalty value of 1. In addition, low values (below 0.25 – dotted line in E) are truncated, creating a plateau in the rescaled target function (F) to lower the strain in the system

As mentioned before, calculations deploying a target function with this kind of term result in structure ensembles near the maximal probability in the distance distribution. To allow for a full sampling of the given distance distribution (addressing obstacle 2 above), we chose an iterative approach in which the distance distribution of the already calculated ensemble is subtracted from the given experimental distribution. Thus, regions of high probability, which are sampled preferentially in the first iteration, are lowered in probability with each successive iteration. To this end, a Gaussian function with a given standard deviation $\sigma$ (vide infra for the numerical value) and curve area is subtracted from the distance distribution for each distance observed in a calculated conformer. If the number of conformations generated over all iterations is $n$, the area below each of these Gaussians is set to $1/n$. Summing these Gaussians over all conformations therefore results in a calculated distance distribution with the same area as the experimental distance distribution, which is normalized to unit area. Subtracting all the Gaussians leads to a nearly flat distribution towards the last iterations, whereby values below zero are set to 0 to avoid invalid negative values in the distribution.
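The subtraction step can be sketched as follows. This is a minimal illustration with assumed inputs (a Gaussian-shaped toy distribution, three sampled distances, and a total of only ten conformers), not the code used inside CYANA; the function names are hypothetical.

```python
import math

def gaussian(r, mu, sigma, area):
    """Gaussian centred at mu with standard deviation sigma and the given area."""
    norm = area / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((r - mu) / sigma) ** 2)

def remaining_distribution(r_grid, p_exp, sampled, sigma, n_total):
    """Subtract one Gaussian of area 1/n_total per sampled distance from the
    experimental distribution and set negative values to zero."""
    area = 1.0 / n_total
    remaining = []
    for r, p in zip(r_grid, p_exp):
        used = sum(gaussian(r, d, sigma, area) for d in sampled)
        remaining.append(max(0.0, p - used))
    return remaining

# Toy experimental distribution with unit area on a 0.5-Angstrom grid.
r_grid = [10.0 + 0.5 * i for i in range(81)]
p_exp = [gaussian(r, 30.0, 4.0, 1.0) for r in r_grid]

# Distances (Angstrom) found in the conformers selected so far (invented).
sampled = [29.5, 30.0, 30.5]
p_next = remaining_distribution(r_grid, p_exp, sampled, sigma=2.5, n_total=10)

print(round(p_exp[40], 4), "->", round(p_next[40], 4))  # value at r = 30 Angstrom
```

Note that the accumulated Gaussians are always subtracted from the original experimental distribution, so the clipping at zero is applied once per iteration and does not compound.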

An example of the resulting distribution after the calculation of three, six and eleven structures is shown in Fig. 2.

Fig. 2

Illustration of the difference between an example starting distance probability distribution derived from EPR measurements (A) and the distribution after the generation of structures in the ensemble (magenta). The change in the distance distribution is shown after the generation of 3 (B), 6 (C), and 11 (D) structures. The original experimental distance distribution is shown as a dashed line and the accumulated Gaussians for the distances of each structure as black curves

This approach was applied in the investigation of the conformation space of ubiquitin covalently attached to Ubc7-U7BR. We determined PELDOR-based distance distributions for a total of 17 spin label pairs (in each case one in the ubiquitin and one in the E2 enzyme; Supplementary Table S1). A cutoff value of c = 0.75 and a weighting factor of A = 10.0 Å² were set; owing to its large weight, the additional term outweighs other contributions to the CYANA target function, therefore driving the calculation to fulfill the experimental PELDOR restraints.

Despite the fact that the distance distributions were derived by several experiments with two spin labels per molecule, in the calculation all spin labels were modelled in one system assuming equal dynamical behavior of the molecule in each individual experiment. 250 iterations were performed, whereby in each iteration 100 structures were calculated and 10 structures with the lowest values of the target function were selected (the selection of the top 10% of the calculated conformations is a well-established method in NMR structure calculation to ensure convergence). For the first iteration the area-normalized distance probability distributions obtained from the analysis of the PELDOR data were used, whereas at each successive iteration a correction was applied as described above by subtracting for each double mutant the inter-spin distance distribution derived from the ith structure; this was represented by a Gaussian distribution with a standard deviation $\sigma$ of 2.5 Å and an area of 1/2500, thus yielding at the end of the calculation a curve with unit area. In other words, at each point of the calculation and for each distance the sum of the probability distributions of all the already calculated structures is subtracted from the corresponding PELDOR-derived probability distribution.
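The correction can also be written compactly. The following is a restatement of the procedure described above, not an equation taken from the original implementation, and the indices $k$ (spin label pair) and $m$ (iteration) are notation introduced here. With $P_k^{\text{exp}}(r)$ the area-normalized experimental distribution of pair $k$ and $r_{k,i}$ the distance of that pair in the $i$th selected structure, the distribution used in iteration $m$ is

$$P_k^{(m)}\left(r\right)=\max\left(0,~P_k^{\text{exp}}\left(r\right)-\frac{1}{n}\sum_{i=1}^{10\left(m-1\right)}\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{\left(r-r_{k,i}\right)^{2}}{2\sigma^{2}}\right)\right)$$

with $n = 2500$ and $\sigma = 2.5$ Å. For $m = 1$ the sum is empty and the experimental distribution is used unchanged; after the last of the 250 iterations the subtracted Gaussians sum to unit area.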

The result is an ensemble of 2500 structures that resembles the experimental distance probability distributions without violating any of the restraints of the structure calculation. Crystal structures of Ubc7-U7BR (PDB ID: 4JQU) and ubiquitin (PDB ID: 1UBQ) were used as templates for the calculations. The residues missing in the crystal structure 4JQU (97–102 of Ubc7 as well as the di-Gly linker connecting Ubc7 to U7BR of Cue1) and the isopeptide bond between G76 of ubiquitin and K89 of Ubc7 were all introduced into the structure from the default CYANA residue templates. The rotatable torsion angles of these residues were set to random values at the beginning of each calculation. In addition to the experimentally derived distance ensemble restraints, the backbone of the following parts of the molecules was kept rigid by fixing the corresponding torsion angles to their values in the crystal structures, which were regularized to the CYANA standard geometry (Gottstein et al. 2012). The rigid parts, ubiquitin (Met1–Leu71), Ubc7 (Met1–His94; Arg109–Phe165) and U7BR (Asn171–Thr224), were held together as a rigid body by employing distance restraints from the crystal structure. The remaining parts of the backbone of ubiquitin (Arg72–Gly76), Ubc7 (Ser95–Glu108), and the linker between Ubc7 and U7BR (Gly166–Glu170) as well as all the side chain torsion angles including χ1 could rotate freely in all calculations.

To assess how well the presented conformational sampling method creates ensembles representing the provided experimental restraints, additional calculations were performed. The distance distributions of the conformation ensemble described above show the expected behavior in comparison to a free calculation without the long-range restraints. First, the distance distributions sampled without the long-range EPR-derived restraints (Figure S1, blue curve) are much broader and do not show the distinct distance probabilities that were measured experimentally. Second, the distance distributions obtained from the ensembles with this new approach (Figure S1, magenta curve) are in good agreement with the experimentally measured distributions (Figure S1, green curve).

Additionally, we calculated ensembles after removing one of the experimental long-range distances from the calculation and compared the distribution of this distance in the resulting ensemble to the experimentally measured distribution (Figure S2). The conformational ensemble generated with one distance restraint left out reproduces the experimental distance distribution of the omitted distance, and in most cases even its shape, showing that the ensemble can model an experimental distribution that was not included in the calculation.

Visualization of the conformation space

To calculate the population distribution within the obtained conformation space, each conformation of the ensemble was weighted by a probability derived from the PELDOR distance distributions. Taking each PELDOR distance distribution as a probability density (with an integral of unity), we calculated a joint probability for any given structure. In detail, for each restraint $i$ with distance $r_i$ the area below the distance probability distribution in the region $r_i \pm 2.5$ Å was used as the probability for the corresponding distance in the calculated structure, and for any given structure the probabilities of the PELDOR distances were multiplied to obtain an overall probability. The resulting values were afterwards divided by the maximum probability obtained in these calculations such that the structure with the highest value is assigned a relative probability of one. This yields relative probabilities that allow us to compare the relative weights of specific conformations. To create a representation of the structural ensemble that reflects the experimental distance distribution as a probability density function, we developed the following method. To visualize the sampled distribution of dimeric proteins, the rigid parts of one of the monomers (from here on called the stationary monomer) were superimposed and aligned such that the other monomer (called the moving monomer) is positioned around the stationary monomer according to the specific conformation in the ensemble. The assigned probabilities of the moving monomer were mapped to a three-dimensional grid; for this purpose, all moving monomers in the ensemble were represented by 3D Gaussians. In the case of a nearly spherical protein (such as ubiquitin) it is sufficient to represent the whole protein by a single 3D Gaussian at the geometric center, whereby the functions can be truncated beyond a certain radial distance to ignore small contributions and thus reduce calculation time; in the case of more complex structures, the protein could also be represented by a number of Gaussians, up to one per atom. The values of the 3D Gaussians were mapped onto the respective grid points of an evenly spaced (1 Å in each direction) 3D grid, and the Gaussians for each moving monomer of the ensemble were finally merged by taking their maximum value at each grid point where they were computed.
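The weighting step described above can be sketched as follows. The example is a simplified illustration under stated assumptions: the window area is approximated by a Riemann sum on a uniform grid, the two box-shaped distributions and the three structures are invented, and the function names are hypothetical.

```python
def window_probability(r_grid, p, d, half_width=2.5):
    """Approximate area under p(r) within d +/- half_width (uniform grid)."""
    dr = r_grid[1] - r_grid[0]
    return sum(pi * dr for ri, pi in zip(r_grid, p) if abs(ri - d) <= half_width)

def relative_weights(structures, distributions, half_width=2.5):
    """Multiply the window probabilities of all restraints for each structure
    and divide by the largest product, so the best structure has weight 1.
    Assumes that at least one structure has a non-zero joint probability."""
    joint = []
    for distances in structures:
        w = 1.0
        for d, (r_grid, p) in zip(distances, distributions):
            w *= window_probability(r_grid, p, d, half_width)
        joint.append(w)
    top = max(joint)
    return [w / top for w in joint]

# Two toy distributions on a 1-Angstrom grid, each with unit area.
r_grid = [float(i) for i in range(10, 51)]
box_a = [1.0 / 11.0 if 20.0 <= r <= 30.0 else 0.0 for r in r_grid]
box_b = [1.0 / 11.0 if 30.0 <= r <= 40.0 else 0.0 for r in r_grid]
distributions = [(r_grid, box_a), (r_grid, box_b)]

# One distance per restraint for each of three hypothetical structures.
structures = [[25.0, 35.0], [20.0, 35.0], [40.0, 35.0]]
weights = relative_weights(structures, distributions)
print([round(w, 3) for w in weights])
```

A structure with a single distance outside the populated range receives a weight of zero, because the joint probability is a product over all restraints.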

The ensemble distribution can be illustrated with PyMOL by showing contour surfaces covering regions above a threshold value for the probability grid (Fig. 3). This approach can easily be extended to a multimeric conformation ensemble, whereby all but one of the monomers are regarded as moving conformers and treated as mentioned above.
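The grid mapping can be illustrated with a short sketch. Several choices here are assumptions rather than details given in the text: each Gaussian is scaled by the relative probability of its conformer, the width and truncation radius are arbitrary, and the grid is stored sparsely as a dictionary. The function name and the example coordinates are hypothetical.

```python
import math

def ensemble_density(centers, weights, sigma=2.0, cutoff=6.0, spacing=1.0):
    """Map one truncated 3D Gaussian per conformer onto a regular grid and
    merge overlapping contributions by keeping the maximum at each point."""
    grid = {}
    n = int(math.ceil(cutoff / spacing))
    for (cx, cy, cz), w in zip(centers, weights):
        ix0, iy0, iz0 = (round(c / spacing) for c in (cx, cy, cz))
        for ix in range(ix0 - n, ix0 + n + 1):
            for iy in range(iy0 - n, iy0 + n + 1):
                for iz in range(iz0 - n, iz0 + n + 1):
                    dx = ix * spacing - cx
                    dy = iy * spacing - cy
                    dz = iz * spacing - cz
                    d2 = dx * dx + dy * dy + dz * dz
                    if d2 > cutoff * cutoff:
                        continue  # truncate small contributions
                    value = w * math.exp(-d2 / (2.0 * sigma * sigma))
                    key = (ix, iy, iz)
                    if value > grid.get(key, 0.0):
                        grid[key] = value
    return grid

# Geometric centers (Angstrom) of the moving monomer in two conformers,
# with their relative probabilities (all values invented).
centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
weights = [1.0, 0.5]
grid = ensemble_density(centers, weights)

# Grid points at or above a contour threshold, e.g. for export to a viewer.
contour = [key for key, value in grid.items() if value >= 0.5]
print(len(grid), "grid points,", len(contour), "inside the 0.5 contour")
```

Taking the maximum rather than the sum keeps the grid values on the same relative scale as the conformer probabilities, so a densely sampled region is not ranked above a sparsely sampled but more probable one.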

Fig. 3

Conformation space of the Ubc7-U7BR–ubiquitin conjugate determined by PELDOR EPR spectroscopy in combination with the calculation method described here. Ubc7, shown in green and the covalently attached U7BR peptide, shown in orange, are depicted as structural models. The entire space allowed for ubiquitin according to a total of 17 PELDOR restraints measured between different sites on Ubc7-U7BR and ubiquitin is indicated as a blue-coloured volume representing Gaussians around the centre of mass of ubiquitin in the different conformations. One explicit structural model of ubiquitin in one randomly chosen orientation is shown as well in yellow. The conformation space resembles more the conformation space of the UbcH5c–ubiquitin conjugate than that of the Ubc13–ubiquitin one (Pruneda et al. 2011)


JOURNAL OF BIOMOLECULAR NMR
