Widespread amyloidogenicity potential of multiple myeloma patient-derived immunoglobulin light chains

Ethics, consent and permissions

The study described in this manuscript is an extension of previous work [13] and has been reviewed and approved by the ethics committee of the University Hospital Düsseldorf. All patients whose samples were used in the study gave signed informed consent (study number 5926R and registration ID 20170664320).

Patient-derived samples

Protein was isolated from 24-h urine samples of 10 patients with MM (5 females, 5 males; median age 64.5 years, range 45 to 72 years; 2 patients with a lambda isotype light chain and 8 patients with a kappa isotype light chain) and of one patient with amyloidosis, as detailed in Additional file 1: Table S1.

A histopathological examination of the patients' kidneys was not available, since the corresponding invasive diagnostic procedure was not necessary for the therapy decision-making process, as the patients were diagnosed according to IMWG criteria. These samples represent a subset of the samples of a previous study [13] (the sample nomenclature is the same). They were selected because we had previously been able to determine their amino acid sequences [37] and because they contained the light chain protein in sufficiently high quantity to perform the detailed biochemical and biophysical experiments and to determine the protein sequence. The protein of the patient with amyloidosis was only sequenced and not subjected to ThAgg-Fip analysis, owing to its lack of purity.

Protein sample preparation

The IgLCs were purified as described previously [13]. Briefly summarised, the protein content of a 24-h urine collection was precipitated by ammonium sulfate (70% saturation) and the LCs were purified after dialysis by size exclusion chromatography on an ÄKTA pure chromatography system (GE Healthcare) using a Superdex 75 10/300 GL column and 30 mM Tris-HCl, pH 7.4, as an elution buffer. The displayed SEC chromatograms (Additional file 1: Fig. S4) are an example of one run per sample. To purify a sufficient amount of protein, a number of SEC experiments were conducted successively and the relevant fractions were combined afterwards. The chromatograms were very reproducible and the same indicated fractions were used from each individual run.

IgLC concentration was determined by measuring UV absorption at 280 nm (extinction coefficients of 33,265 (P001), 27,640 (P004), 26,150 (P005, P006, P007, P016, P017), 33,140 (P013) and 31,650 M−1 cm−1 (P020)). Pepstatin A and E-64 were each dissolved in DMSO to prepare 1 mM stock solutions.
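The conversion from absorbance to concentration follows the Beer-Lambert law, c = A / (ε·l). The following is a minimal sketch using the extinction coefficients listed above; the 1 cm path length, the optional dilution factor and the example absorbance reading are assumptions made for illustration and are not taken from the study.

```python
# Molar extinction coefficients at 280 nm (M^-1 cm^-1), as listed in the text.
EXTINCTION_280 = {
    "P001": 33265, "P004": 27640, "P005": 26150, "P006": 26150,
    "P007": 26150, "P013": 33140, "P016": 26150, "P017": 26150,
    "P020": 31650,
}

def concentration_uM(sample, a280, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law, c = A / (epsilon * l), returned in micromolar.

    path_cm and dilution are assumed parameters (not stated in the text).
    """
    epsilon = EXTINCTION_280[sample]
    return a280 / (epsilon * path_cm) * dilution * 1e6

# Hypothetical reading: A280 = 0.915 for P005 in a 1 cm cuvette.
example_uM = concentration_uM("P005", 0.915)
print(f"P005: {example_uM:.1f} uM")
```

With these assumed inputs the result is close to the 35 μM working concentration used in several of the experiments.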

LC sequence characterisation and protease identification

All LC sequences were determined using a dedicated mass spectrometry workflow described in [37]. This workflow is based on the combination of bottom-up and top-down proteomics experiments with appropriate data analysis. It is important to note that the sample numbers are not identical between the present manuscript and the manuscript with the sequencing results [37], but the latter contains a correspondence table.

Briefly, light chain samples were solubilised in 8 M urea, then reduced (5 mM TCEP, 30 min) and alkylated (10 mM iodoacetamide, 30 min in the dark). After urea dilution, digestion was carried out for 3 h at 37°C (1:20) and was stopped by adding 5% FA. Peptides were desalted on a Sep-Pak C18 SPE cartridge. Digests were analysed by LC-MS/MS on a Q-Exactive Plus. Peptide elution was performed with a linear gradient on a 50-cm C18 column. Mass spectra were acquired in a Top10 data-dependent acquisition mode using classical values, except for the number of MS/MS μscans, which was set to 4. All raw files were searched with MaxQuant against the UniProt Homo sapiens reference proteome (74,830 entries) concatenated with the corresponding light chain sequence, using trypsin as the specific enzyme with a maximum of 4 missed cleavages. Possible modifications included carbamidomethylation (Cys, fixed), oxidation (Met, variable) and N-terminal acetylation (variable). One peptide unique to the protein group was required for protein identification. A false discovery rate cut-off of 1% was applied at the peptide and protein levels.

Determination of the dimeric fraction

The ratios of dimers and monomers of the various IgLCs were determined by both top-down mass spectrometry (analysis of intact proteins) and analytical ultracentrifugation (AUC).

The light chains were purified by size exclusion chromatography (SEC), aliquoted, frozen and thawed before the AUC measurement. Prior to the MS measurements, the samples were dialysed against ammonium bicarbonate after the SEC and freeze-dried. The chromatograms of the different samples with the fractions used for the experiments highlighted can be found in Additional file 1: Fig. S4. All experiments of the current study were conducted with specific fractions in order to standardise the preparation since the peaks are not always symmetrical. Therefore, a bias of the measured dimer fraction due to differences in sampling cannot be fully excluded.

Sedimentation velocity (SV) experiments were performed in an XL-A ProteomeLab ultracentrifuge equipped with absorbance optics. Experiments were conducted in an An-60 Ti rotor at 20°C using a rotor speed of 60,000 rpm.

Solutions of 35 μM of IgLC samples were investigated and the radial scans were acquired at a wavelength which ensures an optimal resolution, at an absorbance of ca. 1 OD. Scans have a radial step size of 0.002 cm, i.e. radial resolution of 500 data points/cm. The sedimentation boundaries were analysed with SEDFIT (version 16p35) [73].

The c(s) model for sedimentation coefficient distributions as solutions to the Lamm equation implemented in SEDFIT was applied to the data. This model provides an approximation of a weight-average frictional ratio f/f0 for all particles of a distribution and based on which sedimentation coefficients can be translated into diffusion coefficients and/or molecular mass. An extension to the two-dimensional sedimentation coefficient distribution can be achieved with the c(s, f/f0) model, providing weight-average frictional ratios for each sedimentation coefficient of the distribution with colour-coded signal intensities [74].

During aggregation, we observed a loss of signal due to sedimentation of large assemblies during acceleration of the centrifuge to final speed for SV analysis. Absorbance was first measured at 3000 rpm to allow for quantification of the loss of material to 60,000 rpm. The wavelength for detection was varied to optimise the signal for SV analyses. The final c(s) curves were adjusted to have signals representative for the concentration left at the first time point of data acquisition.

The AUC experiments were conducted as single measurements, if the run did not indicate any disturbances. In preliminary experiments that are not shown in this manuscript, we had investigated whether the protein concentration, different buffer systems and incubation time have an influence on the c(s)-distribution. These preliminary experiments showed a high reproducibility and therefore we did not include technical replicates. The data was fitted on average seven times by using different algorithms (Marquardt-Levenberg or Simplex).

The fraction of dimer measured in relation to the overall amount of native light chains (monomer and dimer) investigated by the AUC measurements was compared to the results of mass spectrometry (MS) and previously reported results from non-reducing SDS-PAGE [13] (Fig. 2).

Combined differential scanning fluorimetry (DSF) and dynamic light scattering (DLS) experiments

The thermal unfolding experiments of the IgLC samples as a function of protein and denaturant concentration were performed with a Prometheus Panta instrument (Nanotemper, Munich, Germany). This is a microcapillary-based instrument (10 μl sample per capillary) that allows up to 48 samples to be measured in parallel. Intrinsic fluorescence is excited at 280 nm and emission is monitored at 330 and 350 nm. The DLS experiments are performed on the same sample with a laser at 405 nm. Furthermore, the instrument can measure the sample turbidity with back-reflection optics.

The experiments where the initial increase and subsequent decrease in turbidity of the samples was followed (‘Bence-Jones test’, BJT) were performed by scanning the temperature from 25 to 90°C at a scan rate of 2.5°C/min. We performed these types of experiments at different concentrations for each sample and noticed a slight concentration dependence, i.e. the point of maximal turbidity shifted in most cases to lower temperatures as the protein concentration was increased. Here we report only the values at 27 μM for all samples. At the lowest concentrations measured (ca. 10 μM), the turbidity increase became in some cases too weak to be reliably quantified. However, the samples that did not show a significant turbidity increase at 27 μM also did not show any turbidity at higher concentration.

The other thermal ramping experiments were performed by scanning from 20°C (urea dependence) or 25°C (concentration dependence) to 70°C at a scan rate of 1°C/min. For the melting scans, we prepared stock solutions of the IgLCs (97 μM (P001), 150 μM (P004), 91 μM (P005), 139 μM (P006), 103 μM (P007), 97 μM (P013), 99 μM (P016), 158 μM (P017) and 81 μM (P020)). These stock solutions were filtered through a 220-nm pore size syringe filter before concentration determination and the DSF-DLS experiments. The BJT was performed at a uniform sample concentration of 30 μM and at pH 5.0, which was established by adding 100 mM Na-acetate buffer to the solutions in 30 mM Tris buffer pH 7.4. A pH value of 5 was used in the BJT as this corresponds to the original protocol used to test patient urine for BJ proteins [41].

For the concentration-dependent measurements, the stock solutions were diluted 3 times by a factor of two with 30 mM Tris buffer pH 7.4, to yield 4 different concentrations per protein, so that the concentration dependence of the unfolding and aggregation temperatures could be defined. For the urea-dependent experiments, the stock solutions were diluted 5-fold into solutions of appropriate urea concentration. The final urea concentrations were 0, 0.67, 1.34, 2.01, 2.68, 3.35, 4.02, 4.69 and 5.36 M and the buffer concentration was in all cases 30 mM Tris buffer. In our initial tests, we found that these experiments with the Panta instrument were very reproducible, with temperatures of unfolding and onset of turbidity usually agreeing within 0.1–0.2°C between technical replicates. We therefore did not routinely perform technical replicates of these experiments. The data reported in Table 1 stem from individual experiments.
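The dilution schemes described above can be tabulated with a few lines of Python. This is only an illustrative sketch: the stock concentrations are those given in the text, while the function names and the assumption that the urea series is evenly spaced in steps of 0.67 M (which reproduces the listed values) are ours.

```python
# Stock concentrations in micromolar, as listed in the text.
STOCKS_UM = {
    "P001": 97, "P004": 150, "P005": 91, "P006": 139, "P007": 103,
    "P013": 97, "P016": 99, "P017": 158, "P020": 81,
}

def twofold_series(stock_uM, n_dilutions=3):
    """Stock plus n successive 1:1 dilutions (4 concentrations by default)."""
    return [stock_uM / 2 ** i for i in range(n_dilutions + 1)]

def urea_series(n_points=9, step_M=0.67):
    """Final urea concentrations in M, assumed evenly spaced."""
    return [round(i * step_M, 2) for i in range(n_points)]

def after_fivefold_dilution(stock_uM):
    """Protein concentration in the urea samples after the 5-fold dilution."""
    return stock_uM / 5

for name, stock in STOCKS_UM.items():
    series = ", ".join(f"{c:.1f}" for c in twofold_series(stock))
    print(f"{name}: {series} uM; in urea: {after_fivefold_dilution(stock):.1f} uM")
print("urea (M):", urea_series())
```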

The data was visualised as the ratio of the intrinsic fluorescence emission intensity at 350 nm over the intensity at 330 nm. For the thermal unfolding at the different protein concentrations, the melting temperature (Tm) and the temperature of aggregation onset (Tagg), as well as the cumulant radius, were automatically determined by the instrument software. The melting temperature corresponds to the maximum of the first derivative of the change in fluorescence intensity emission ratio. The onset of aggregation is defined by fitting the cumulant radius as a function of temperature by both a linear function (i.e. an extrapolation of the baseline) and by a sigmoidal model. The onset of aggregation is defined as the temperature at which the two fits first differ by more than 0.5%.
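The two definitions above can be illustrated with a short NumPy sketch. It does not reproduce the instrument software, whose internals are not described here; the function names, the synthetic curves and the choice to express the 0.5% difference relative to the linear baseline are assumptions.

```python
import numpy as np

def melting_temperature(temps, f350, f330):
    """Tm as the temperature at the maximum of d(F350/F330)/dT."""
    temps = np.asarray(temps, dtype=float)
    ratio = np.asarray(f350, dtype=float) / np.asarray(f330, dtype=float)
    d_ratio = np.gradient(ratio, temps)
    return float(temps[np.argmax(d_ratio)])

def aggregation_onset(temps, baseline_fit, sigmoid_fit, threshold=0.005):
    """First temperature where the two fitted curves differ by > 0.5%.

    The difference is taken relative to the baseline fit (an assumption).
    Returns None if the threshold is never exceeded.
    """
    temps = np.asarray(temps, dtype=float)
    baseline_fit = np.asarray(baseline_fit, dtype=float)
    sigmoid_fit = np.asarray(sigmoid_fit, dtype=float)
    rel_diff = np.abs(sigmoid_fit - baseline_fit) / np.abs(baseline_fit)
    above = np.nonzero(rel_diff > threshold)[0]
    return float(temps[above[0]]) if above.size else None

# Synthetic example (hypothetical values): unfolding centred at 55 degC and
# a cumulant radius that starts at 3 nm and grows around 60 degC.
T = np.linspace(25.0, 70.0, 451)
f330 = np.ones_like(T)
f350 = 0.7 + 0.3 / (1.0 + np.exp(-(T - 55.0) / 1.5))
radius_baseline = np.full_like(T, 3.0)
radius_sigmoid = 3.0 + 50.0 / (1.0 + np.exp(-(T - 60.0)))

tm = melting_temperature(T, f350, f330)
tagg = aggregation_onset(T, radius_baseline, radius_sigmoid)
print(f"Tm = {tm:.1f} degC, Tagg = {tagg:.1f} degC")
```

In this synthetic case the onset lies several degrees below the midpoint of the radius increase, which is what the threshold-based definition is designed to capture.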

For the experiments in the absence of denaturant, the buffer viscosity and its temperature dependence was set to that of water. In the samples with urea, we did not correct the viscosity, which was not necessary as we did not analyse the sizes precisely. In these experiments, DLS was merely used to determine which samples had formed aggregates and should therefore be excluded from the thermodynamic analysis.

In Table 1, we report the values for Tm (pH 5 and 7.4) and Tagg (pH 7.4), as well as the concentration dependencies, d Tm/d log(c) and d Tagg/d log(c) (pH 7.4). We use the logarithmic derivatives because the dependency is approximately linear on a logarithmic concentration scale, so that the slope of a linear fit can serve as a single key parameter across the entire concentration range explored. It has been shown that the determination of the unfolding temperature from fluorescence intensity ratios can lead to systematic deviations, depending on the relative intensities of the fluorescence spectra of the folded and unfolded states [75]. The use of the derivative introduced above eliminates such systematic errors, because a constant offset of all melting temperatures does not affect their dependence on protein concentration.
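The slope parameter and its insensitivity to a constant offset can be sketched as follows. The base of the logarithm is not stated in the text, so base 10 is assumed here, and the melting temperatures are hypothetical numbers chosen only for illustration.

```python
import numpy as np

def log_slope(conc_uM, temps_C):
    """Slope of a linear fit of temperature against log10(concentration)."""
    slope, _intercept = np.polyfit(np.log10(conc_uM), temps_C, 1)
    return float(slope)

# Two-fold dilution series of a 97 uM stock with hypothetical Tm values.
conc = np.array([12.125, 24.25, 48.5, 97.0])
tm = np.array([56.0, 55.4, 54.8, 54.2])

slope = log_slope(conc, tm)
slope_offset = log_slope(conc, tm + 1.5)  # constant systematic offset
print(f"dTm/dlog10(c) = {slope:.2f} degC per decade")
print(f"with +1.5 degC offset: {slope_offset:.2f} degC per decade")
```

Both calls return the same slope, because the offset is absorbed entirely by the intercept of the fit.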

We also visualise the evolution of the size distribution of the sample as a function of temperature with contour plots on a logarithmic size scale. The full sets of raw data of these experiments can be found in Additional file 2: Fig. S1 and S2. In these plots, each time/temperature point corresponds to a full particle size distribution determined from a multi-species fit to the intensity autocorrelation function. These plots are automatically generated by the control software of the Prometheus Panta instrument.

For the combined chemical and thermal unfolding experiments, the data set of each protein was globally fitted to the thermodynamic two-state model recently presented [40]. The global fits are shown in Additional file 2: Fig. S3. In order to reduce the influence of aggregation on the fits, only samples containing urea were included in the fit, as the simultaneous DLS measurements had shown aggregation mostly in the absence of urea. For P004, all samples below 2 M urea were excluded on this rationale. From the global fit over all temperatures, we then determine the stability of the IgLC, ΔG, at 37°C. We find a significant correlation between ΔG and the m-value, i.e. m=d ΔG/d [urea] [76]. We therefore fix the m-value to a common value for all different light chains and focus on the resulting differences between ΔG. Error estimates of the obtained values were obtained by 100-fold bootstrapping by resampling the different capillaries with replacement.
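The bootstrap procedure (resampling whole capillaries with replacement, 100 times) can be sketched generically. The thermodynamic model of [40] is not reproduced here: `intercept_at_zero_urea` is a deliberately simple stand-in estimator, and the data points are synthetic. Only the resampling logic is meant to mirror the description above.

```python
import random
import statistics

def intercept_at_zero_urea(points):
    """Stand-in estimator (NOT the global two-state fit): least-squares line
    through (urea, value) pairs, evaluated at 0 M urea."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: every capillary is the same one
        return my
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - slope * mx

def bootstrap_error(capillaries, estimator, n_boot=100, seed=0):
    """Resample capillaries with replacement; return mean and SD of estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(capillaries) for _ in capillaries]
        estimates.append(estimator(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)

# Synthetic data: one (urea in M, value) pair per capillary.
capillaries = [
    (0.67 * i, 20.0 - 4.0 * 0.67 * i + (0.3 if i % 2 == 0 else -0.3))
    for i in range(1, 9)
]
mean_est, sd_est = bootstrap_error(capillaries, intercept_at_zero_urea)
print(f"estimate = {mean_est:.2f} +/- {sd_est:.2f}")
```

Resampling at the level of capillaries, rather than individual data points, keeps each melting curve intact, so the spread of the estimates reflects capillary-to-capillary variation.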

Measurement of aggregation kinetics

Different solution conditions (acidic pH values) were tested for their potential to induce aggregation of patient-derived, purified IgLCs. In order to prepare the samples at different acidic pH values (pH 2, pH 3, pH 4), protein solutions of different concentrations were diluted from 30 mM Tris-HCl pH 7.4 1:1 into 300 mM citric acid buffer at the desired pH value. Two or three replicates of each solution were then pipetted into a high-binding surface plate (Corning #3601, Corning, NY, USA). The aggregation kinetics were monitored in the presence and absence of small glass beads (SiLibeads Typ M, 3.0 mm). The plates were sealed using SealPlate film (Sigma-Aldrich #Z369667). The kinetics of amyloid fibril formation were monitored at 37°C either under continuous shaking (600 rpm) or under quiescent conditions by measuring ThT fluorescence intensity through the bottom of the plate using a FLUOstar (BMG LABTECH, Germany) microplate reader (readings were taken every 150 or 300 s). In order to compare the factor of the increase of the ThT fluorescence emission intensity between the samples, the ThT fluorescence emission intensity at the end of the experiment was compared with the lowest emission value. The halftimes of the aggregation reaction are defined as the point where the ThT intensity is halfway between the initial baseline and the final plateau. The halftimes were obtained by individually fitting the curves using the following generic sigmoidal equation [77]

$$Y=y_i+m_i t+\frac{y_f+m_f t}{1+e^{-\left(t-t_{50}\right)/k}}$$

where Y is the ThT fluorescence emission intensity, t is the time and t50 is the time when 50% of maximum ThT fluorescence intensity is reached. The initial baseline is described by y_i + m_i t and the final baseline is described by y_f + m_f t. While this equation does not describe the underlying molecular processes of aggregation, it does allow determination of the main phenomenological parameter, the half time, of each experiment.
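A minimal sketch of such an individual curve fit is given below, assuming the sigmoidal equation has the form Y = y_i + m_i t + (y_f + m_f t)/(1 + exp(−(t − t50)/k)), with k a time constant that sets the steepness of the transition. It uses `scipy.optimize.curve_fit`; the synthetic trace, the noise level and the initial guesses are assumptions for illustration and not the settings used in the study.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(t, y_i, m_i, y_f, m_f, t50, k):
    """Sloping initial baseline plus a sigmoidal transition with midpoint t50."""
    return y_i + m_i * t + (y_f + m_f * t) / (1.0 + np.exp(-(t - t50) / k))

def fit_halftime(t, y):
    """Fit one ThT trace and return the fitted t50 (same units as t)."""
    half_level = 0.5 * (y.min() + y.max())
    t50_guess = t[np.argmin(np.abs(y - half_level))]
    # Initial guesses: flat baselines, midpoint from the half-maximal level,
    # and a time constant of 1 time unit (assumed; adjust to the data).
    p0 = (y[0], 0.0, y[-1] - y[0], 0.0, t50_guess, 1.0)
    popt, _ = curve_fit(sigmoid, t, y, p0=p0, maxfev=20000)
    return float(popt[4])

# Synthetic trace (hypothetical values, time in hours) with Gaussian noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 40.0, 400)
y = sigmoid(t, 100.0, 0.5, 900.0, -1.0, 12.0, 1.2) + rng.normal(0.0, 5.0, t.size)

t50_fit = fit_halftime(t, y)
print(f"fitted t50 = {t50_fit:.2f} h")
```

Traces that never reach a plateau within the measurement window cannot be fitted reliably this way and would need to be handled separately.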

The aggregation kinetics experiments were conducted by investigating 2–3 technical replicates per condition. To test the reproducibility, the experiments were repeated for selected samples. The sequence regions incorporated into the fibrils were determined from one aggregated sample in each case. At the end of the experiments, the amount of aggregated protein was determined by combining the replicates and pelleting the aggregation product for 1 h at 16,100g followed by measuring the soluble content by UV absorbance at 280 nm of the supernatant, and correcting for the absorbance of the ThT. In order to investigate whether the aggregation can be seeded at acidic pH, fibrils were produced by incubating 35 μM protein solution at pH 3 and pH 4 in a high-binding plate and in a 2-ml Eppendorf tube in presence of glass beads under shaking conditions at 37°C. The presence of fibrillar aggregates was confirmed by AFM. For seed generation, the fibril solutions obtained from the plate and from an Eppendorf tube were homogenised using an ultra-sonication bath Sonorex RK 100 H (Bandelin, Germany) for 300 s. The seeded aggregation experiments were performed in high-binding surface plates under quiescent conditions with 35 μM P016 monomer and the pre-formed seeds were added to a final concentration of 5% of the monomer solution at the desired pH value. The seeds were added either at the beginning or after 6 h pre-incubation of the monomer solution at 37°C.

We furthermore investigated the seeding potential at neutral pH, where the protein remains largely unaffected by proteolytic activity. We used pre-formed fibrils, which were produced at pH 3 and pH 4 in an Eppendorf tube as described above at 100 μM LC concentration. The seeds were additionally washed to remove the soluble fragments by centrifuging the sample at 137,000g at 20°C for 45 min and re-suspending the pellet in 150 mM citric acid. This washing procedure was carried out three times. The seeded aggregation experiments were performed in high-binding surface plates under agitation conditions with 50 μM monomer solution and seeds added to a final concentration of 5%.

Prevention of amyloid formation at acidic pH values

In order to investigate whether the cleavage of the IgLCs at acidic pH values was responsible for the amyloid fibril formation, we tested whether inhibiting some of the identified proteases will prevent the formation of ThT-positive aggregates.

Therefore, the IgLCs (35 μM monomer concentration) were incubated as described above in a high-binding surface plate under quiescent conditions in the presence of 10 μM pepstatin A and E-64, respectively. E-64 is an irreversible and highly selective cysteine protease inhibitor, and pepstatin A is a reversible inhibitor of acidic proteases (aspartic proteases) and can be used in a mixture with other enzyme inhibitors. Further experiments were conducted with P005 and P016 in the presence of 1 μM pepstatin A or E-64 or 1 μM or 10 μM pepstatin A and E-64 under agitation conditions. The morphology of aggregates was investigated by AFM.

Fibril fragment determination

The aggregation products formed in Eppendorf tubes under agitation conditions at pH 3 and pH 4 were centrifuged using an Optima MAX-XP ultracentrifuge (Beckman Coulter) in a TLA-55 rotor at 40,000 rpm at 20°C for 45 min. The pellet was re-suspended in 150 mM citric acid (pH 3 or pH 4) and centrifuged again for 45 min. This washing procedure to remove the soluble fragments was conducted three times. The washed aggregates were dissolved in 6 M urea and subsequently analysed by mass spectrometry.

A Dionex UltiMate 3000 RSLC Nano System coupled to an Orbitrap Fusion Lumos mass spectrometer fitted with a nano-electrospray ionisation source (Thermo Scientific) was used for all experiments. Five microliters of reduced/alkylated protein samples in solvent A were loaded at a flow rate of 5 μL min−1 onto an in-house packed C4 (5 μm, Reprosil) trap column (0.150 mm i.d. × 35 mm) and separated at a flow rate of 0.5 μL min−1 using a C4 (5 μm, Reprosil) column (0.075 mm i.d. × 28 cm). The following gradient was used: 2.0% B from 0–10 min; 20% B at 11 min; 60% B at 22 min; 99% B from 25–30 min; and 2.0% B from 30.1 to 35 min. Solvent A consisted of 98% H2O, 2% ACN and 0.1% FA, and solvent B of 20% H2O, 80% ACN and 0.1% FA. MS scans were acquired at 120,000 resolving power (at m/z 400) with a scan range set to 550–2,000 m/z, four microscans (μscans) per MS scan, an automatic gain control (AGC) target value of 5×10^5 and a maximum injection time of 50 ms. MS/MS scans were acquired using the Data-Dependent Acquisition mode (Top 4) at 120,000 resolving power (at m/z 400) with an isolation width of 1.2 m/z, five μscans, an AGC target value of 5×10^5 and a maximum injection time of 250 ms. For fragmentation, electron transfer dissociation with 10 ms of reaction injection time and a supplemental higher-energy collisional dissociation with normalised collision energy (NCE) of 10% (EThcD) was used.

All data were processed with ProSightPC v4.1 (Thermo Scientific) and Proteome Discoverer v2.4 (Thermo Scientific) using the ProSightPD 3.0 node. Spectral data were first deconvoluted and deisotoped using the cRAWler algorithm. Spectra were then searched using a two-tier search tree with searches against the corresponding LC sequences. The search 1 consists of a ProSight Absolute Mass search with MS1 tolerance of 10 ppm and MS2 tolerance of 5 ppm. The search 2 is a ProSight Biomarker search with MS1 tolerance of 10 ppm and MS2 tolerance of 5 ppm. Identifications with E-values better than 1e−10 (−log (E-value) = 10) were considered as confident hits.

Atomic force microscopy (AFM)

Atomic force microscopy height images were acquired after the aggregation kinetic measurements. Ten microliters of each sample (after diluting 1:4 with dH2O) were deposited onto freshly cleaved mica. After drying, the samples were washed 5 times with 100 μL of dH2O and dried under a gentle flow of nitrogen. AFM images were obtained using a NanoScope V (Bruker) atomic force microscope equipped with a silicon cantilever ScanAsyst-Air with a tip radius of 2–12 nm. The images were analysed with the software Gwyddion 2.56 to measure height profiles and investigate a possible twisting of the fibrillar aggregates.

Microfluidic diffusional sizing and concentration measurements

To investigate the influence of acidic pH on the samples, the samples were analysed using SDS-PAGE and the Fluidity One. The Fluidity One is a microfluidic diffusional sizing (MDS [42]) device, which measures the rate of diffusion of protein species under steady-state laminar flow and determines the average particle size from the overall diffusion coefficient. The protein concentration is determined by fluorescence intensity: after the diffusion step, the protein is mixed with ortho-phthalaldehyde (OPA), which reacts with primary amines to produce a fluorescent compound [78]. To measure the influence of pH on the average size of the molecules in the solution, the protein was pre-incubated at acidic pH values with 150 mM citric acid (pH 3 or pH 4). The IgLC solution was incubated in an Eppendorf tube at 37°C under quiescent conditions. After different incubation times, 6 μL of the solutions were pipetted onto a disposable microfluidic chip and measured with the Fluidity One (F1, Fluidic Analytics, Cambridge, UK). The samples were also analysed by SDS-PAGE, according to a previously published protocol [13].

Modelling and data and sequence analysis

The sequences of the IgLC samples of this study were parsed with IMGT, the international ImMunoGeneTics information system and were aligned in order to investigate the amino acid changes between the germline sequences and the sequences under study. The sequences were also analysed using different online bioinformatic tools (ZipperDB, Tango, Pasta, CamSol), which have been developed to predict the aggregation propensity/solubility of proteins based on their amino acid sequences. All the light chain sequences were modelled using the Lyra software [79], and insertions and deletions were refined using the modeller [80] software using default parameters. The structure renderings were created using PyMol [81].

All the substrates for Cathepsin B and D were retrieved from the Merops database [82]. Since limited available experimental evidence was present in the Merops database on the other proteases, they were removed from further analyses.

The 4 residues preceding and following the proteolytic sites were used to build two position-specific scoring matrices (PSSMs) for Cathepsin B and D with the Biopython [83] motifs package, using a pseudocount of 5 and the background distribution of the human proteome. All the residues in the light chain sequences were scored using the PSSMs, and at each position a cleavage log-odds score was defined as the larger of the log-odds scores obtained for the two PSSMs. We then tested whether the cleavage sites potentially produced by the proteases support the peptides experimentally identified by mass spectrometry. To do so, we identified the regions in the sequences proximal to the N- and C-terminal residues of each experimental peptide and tested whether the cleavage log-odds scores within these regions were significantly larger than those outside. The proximal region was defined as the union of all residues that are at most two residues before a peptide's N-terminus and at most one residue after it, plus all residues that are at most one residue before a peptide's C-terminus and at most two residues after it, in the corresponding light chain sequence. By including the residues near the observed peptide termini, we account for the potential effect of exopeptidases and for imprecisions in the data used for training the PSSMs. The cleavage scores in the proximal regions were then compared to the scores outside of them by performing a one-tailed Mann–Whitney U test, and in all cases the p-values were below the significance threshold of 0.05. The plots visualising these results were generated with the Python Matplotlib library.
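The proximal-region test can be sketched as follows, assuming the per-residue log-odds scores from the two PSSMs have already been computed (the PSSM construction itself is not reproduced here). Peptides are given as 0-based inclusive (start, end) indices, the terminal residue itself is counted as part of the region, and the score values are synthetic; these conventions and all function names are illustrative assumptions.

```python
from scipy.stats import mannwhitneyu

def combined_scores(scores_cat_b, scores_cat_d):
    """Per-residue cleavage score: the larger of the two PSSM log-odds."""
    return [max(b, d) for b, d in zip(scores_cat_b, scores_cat_d)]

def proximal_positions(peptides, seq_len):
    """Union of residues near the observed peptide termini."""
    region = set()
    for start, end in peptides:
        region.update(range(start - 2, start + 2))  # N-terminus: -2 .. +1
        region.update(range(end - 1, end + 3))      # C-terminus: -1 .. +2
    return {i for i in region if 0 <= i < seq_len}

def cleavage_enrichment_p(scores, peptides):
    """One-tailed Mann-Whitney U test: proximal scores > remaining scores."""
    near = proximal_positions(peptides, len(scores))
    inside = [s for i, s in enumerate(scores) if i in near]
    outside = [s for i, s in enumerate(scores) if i not in near]
    _stat, p_value = mannwhitneyu(inside, outside, alternative="greater")
    return float(p_value)

# Synthetic 30-residue example with one observed peptide (residues 10-20).
peptides = [(10, 20)]
near = proximal_positions(peptides, 30)
cat_b = [5.0 + 0.1 * i if i in near else -0.1 * i for i in range(30)]
cat_d = [-3.0 - 0.1 * i for i in range(30)]
scores = combined_scores(cat_b, cat_d)
p_value = cleavage_enrichment_p(scores, peptides)
print(f"one-tailed p = {p_value:.2e}")
```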

In order to investigate the correlations between the experimental quantities measured for the IgLCs, numerical values were needed for all measurements. This necessitated an arbitrary conversion of the fraction refolded by DSF to a numerical scale of 0.33, 0.50 or 0.66, for small, medium and large, respectively. Similarly, refolding at 2 M urea was converted to an ordered categorical scale [0,1,2] for 0, <50 and >50, respectively. The ΔG-value was not included for P016, and m-values were not used because they were globally fitted and therefore shared between all data sets. All other non-numerical cells in Table 1 were interpreted as missing values. Before training an Elastic Net model on the data, missing values were set to the mean of the given observable and all values were standardised to zero mean and unit variance. The following limited set of parameters was chosen to eliminate the most internally correlated measurements (e.g. dimerization by three different methods): Tm DSF (pH 5), T (maximum turbidity at pH 5), Tm DSC, d Tm/d log(c) (pH 7.4), d Tagg/d log(c) (pH 7.4), cumulant radius (pH 7.4, 70°C), dimerization by AUC, fraction refold by DSC, ΔG (37°C) and digestibility by trypsin. Model training was done using a grid search with 4-fold cross validation using negative mean squared error as the scoring function. A separate validation set was not created because of the low number of data points. This analysis was performed in Python 3.7 using the packages NumPy, Pandas, SKLearn, Matplotlib and Seaborn.
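The preprocessing and model selection steps can be sketched with scikit-learn as shown below. The data are random placeholders (9 samples, 10 observables), the target variable and the hyperparameter grid are hypothetical, and imputation and scaling are placed inside a pipeline so that they are re-fitted within each cross-validation fold, which differs slightly from preprocessing the full table once before training.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data: 9 light chains x 10 observables, with a hypothetical
# target that depends on two of the observables.
rng = np.random.default_rng(0)
X = rng.normal(size=(9, 10))
y = 1.5 * X[:, 0] - 1.0 * X[:, 4] + rng.normal(scale=0.1, size=9)
X[2, 3] = np.nan  # mimic missing cells in the table
X[7, 8] = np.nan

pipeline = make_pipeline(
    SimpleImputer(strategy="mean"),  # missing values -> mean of the observable
    StandardScaler(),                # zero mean, unit variance
    ElasticNet(max_iter=10000),
)

# Hypothetical grid; the values searched in the study are not stated.
param_grid = {
    "elasticnet__alpha": [0.01, 0.1, 1.0],
    "elasticnet__l1_ratio": [0.1, 0.5, 0.9],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    cv=KFold(n_splits=4, shuffle=True, random_state=0),
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

coefficients = search.best_estimator_.named_steps["elasticnet"].coef_
print("best parameters:", search.best_params_)
print("non-zero coefficients:", int(np.count_nonzero(coefficients)))
```

With so few samples, the cross-validated scores are noisy, so the selected hyperparameters and coefficients should be read as indicative rather than definitive.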
