Multi-omics analyses of MEN1 missense mutations identify disruption of menin–MLL and menin–JunD interactions as critical requirements for molecular pathogenicity

Identification of disease-associated MEN1 mutations

A list of MEN1 gene missense mutations was compiled using data from the Human Gene Mutation Database (HGMD [42]) for MEN1 mutations and the Catalogue of Somatic Mutations in Cancer (COSMIC [43]) for somatic MEN1 mutations, complemented with data from the literature [8, 9, 20] (Additional file 5: Table S1).

In silico analysis of MEN1 missense mutation-derived menin proteins

The PDB file of menin (3U84, reduced to a monomer containing 550 of the 610 amino acids of the menin protein) was analyzed using Swiss-PdbViewer 4.1.0. Menin protein products resulting from MEN1 mutations were assessed on the following criteria: surface accessibility, predicted changes in charge and polarity due to the mutation, and the presence of hydrogen bonds and salt bridges at the mutation site (using VMD 1.9.2 [44]). For every mutation, the number of rotamers that showed clashes within the tertiary protein structure was counted (Additional file 5: Table S1). Using default software settings, a mutation with no clashing rotamers was defined as stable, one with fewer than 100% clashing rotamers as potentially stable, and one with 100% clashing rotamers as predicted to result in an unstable menin protein.
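The three-way stability call described above can be written as a small decision rule. The following is a minimal sketch, assuming the clashing and total rotamer counts have already been read off manually from the structure viewer; the function name and its arguments are hypothetical and are not part of Swiss-PdbViewer.

```python
def classify_stability(clashing_rotamers: int, total_rotamers: int) -> str:
    """Classify a missense variant from its rotamer clash counts.

    Both counts are assumed to come from manual inspection of the
    substituted residue in a structure viewer (hypothetical inputs).
    """
    if total_rotamers <= 0:
        raise ValueError("total_rotamers must be positive")
    if not 0 <= clashing_rotamers <= total_rotamers:
        raise ValueError("clashing_rotamers must lie between 0 and total_rotamers")

    if clashing_rotamers == 0:
        return "stable"              # no rotamer clashes
    if clashing_rotamers < total_rotamers:
        return "potentially stable"  # some, but not all, rotamers clash
    return "unstable"                # every rotamer clashes
```

For instance, a substitution for which 3 of 12 rotamers clash would be labeled "potentially stable" under this rule.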

The PDB files of the menin–MLL1 complex (3U85), menin–MLL1–LEDGF complex (3U88) and menin–JunD complex (3U86) were acquired from RCSB [45] to analyze the effects of MEN1 mutations on the reported interactions of menin with MLL1, PSIP1/LEDGF or JunD. Mutations in the reported MLL1, PSIP1/LEDGF and JunD interaction surfaces of menin were identified by selecting amino acids within 5 Å of a known interaction site and checking whether any known mutations were present within this 5-Å radius. Only mutations located on the surface and affecting an amino acid capable of forming H-bonds were considered relevant to the predicted interaction surface.
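The 5-Å selection step can be illustrated with a plain distance filter. This is a minimal sketch, assuming atom coordinates have already been extracted from the PDB files into simple tuples; the residue numbers and coordinates below are invented toy values, not taken from 3U85, 3U86 or 3U88, and the original analysis was done interactively in a structure viewer rather than with this code.

```python
import math


def residues_within(menin_atoms, partner_atoms, cutoff=5.0):
    """Return menin residue numbers with any atom within `cutoff` Å of the partner.

    menin_atoms:   iterable of (residue_number, (x, y, z)) tuples
    partner_atoms: iterable of (x, y, z) tuples for the bound peptide
    """
    partner_atoms = list(partner_atoms)
    hits = set()
    for resnum, xyz in menin_atoms:
        if resnum in hits:
            continue  # residue already selected via another atom
        if any(math.dist(xyz, other) <= cutoff for other in partner_atoms):
            hits.add(resnum)
    return sorted(hits)


# Toy coordinates (hypothetical): residue 10 lies 4 Å from the partner atom.
menin = [(10, (0.0, 0.0, 0.0)), (20, (10.0, 0.0, 0.0)), (30, (3.0, 4.0, 0.0))]
partner = [(0.0, 0.0, 4.0)]
interface = residues_within(menin, partner)

# Hypothetical mutated positions, intersected with the interface selection.
mutated_positions = {10, 30}
interface_mutations = sorted(mutated_positions.intersection(interface))
```

The surface-accessibility and H-bond criteria mentioned in the text are not modeled here and would need to be applied as additional filters.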

Projection of menin mutations

The structures of the menin–JunD (ID: 3U86) and menin–MLL1–LEDGF (ID: 3U88) complexes were superimposed by sequence alignment with MatchMaker, and amino acids were highlighted using Chimera (version 1.13.1) [46].

Plasmids and cell lines

The MEN1 cDNA was cloned into the Gateway system using the pENTR directional TOPO cloning kit to enable C-terminal tagging (Life Technologies, Thermo Fisher Scientific, MA, USA). MEN1 gene mutations were introduced by site-directed mutagenesis, essentially following the QuikChange strategy (Agilent, CA, USA), into the MEN1 cDNA in a pCDNA3.1 vector for transient expression, as well as into a pENTR-menin plasmid for C-terminal tagging with GFP via Gateway recombination with the pCDNA5_FRT_TO_C-GFP plasmid (van Nuland et al., 2013). The menin mutations and the complete ORF of all plasmids were validated by DNA sequence analysis using the M13fwd and M13rev primers for the pENTR plasmids and the CMVfw and N-GFPrev primers for the pCDNA5_FRT_TO_C-GFP expression plasmids:

M13fwd: 5ʹ-CCCAGTCACGACGTTGTAAAACG-3ʹ and M13rev: 5ʹ-GAAACAGCTATGACCATG-3ʹ;

CMVfw: 5ʹ-CGCAAATGGGCGGTAGGCGTG-3ʹ and N-GFPrev: 5ʹ-ACAGCTCCTCGCCCTTGC-3ʹ.

Human embryonic kidney (HEK) 293T and human cervical carcinoma HeLa cells were cultured in standard growth medium (DMEM, 10% FBS, 100 U/ml penicillin, 150 µg/ml streptomycin and 2 mM glutamine).

Expression of menin (mutant) proteins

HEK 293T cells were transiently transfected with 600 ng DNA in a 12-well format using the PEI transfection reagent. After 48 h, cells were lysed in Laemmli sample buffer. Total cell lysates were analyzed by immunoblotting using anti-menin (A300-105A, Bethyl) and anti-alpha tubulin (CP06, Calbiochem) antibodies. In addition, total RNA was isolated and quantitative RT-PCR was performed to determine total MEN1 and ACTB (beta-actin) expression in pCDNA3.1 MEN1 transfected cells as described [47]. Menin protein levels were determined using ImageJ [48] from three independent transfections after subtraction of the background/menin signal from empty vector transfected cells. Ratios of the relative menin protein and MEN1 mRNA levels, each determined in triplicate, were calculated.

For doxycycline-inducible expression in stable cell lines, MEN1-GFP expressing cDNAs were chromosomally integrated in HeLa cells using the Flp-In system, essentially as described [10]. After 24 h of induction with 1 µg/ml doxycycline, GFP expression was verified by immunoblotting with GFP (JL-8, Clontech) and α-tubulin (CP06, Calbiochem) antibodies. Nuclear and cytoplasmic extracts were prepared for GFP-affinity purification coupled to mass spectrometry analyses as described before [10].

Quantitative mass spectrometry of the menin interactome

GFP-affinity purification, sample preparation and data analysis were performed as reported [10, 34, 49]. Briefly, HeLa cells harboring menin-GFP cDNAs were grown in fifteen 15-cm dishes until subconfluency (approximately 300 million cells in total) and induced with 1 µg/ml doxycycline for 24 h. Cells were harvested by dislodging with trypsin and cell pellets were washed with cold PBS (Gibco, #10010-015). The cell pellet was resuspended in 5 packed-cell volumes (PCVs) of ice-cold Buffer A (10 mM Hepes–KOH pH 7.9, 1.5 mM MgCl2, 10 mM KCl), incubated for 10 min on ice and then centrifuged at 400 g and 4 °C for 5 min. The supernatant was aspirated and cells were lysed in 2 PCVs of Buffer A containing 1 × CPI (Roche, #11836145001), 0.5 mM DTT and 0.15% NP40. The suspension was homogenized in a Dounce homogenizer, followed by centrifugation at 3200 g and 4 °C for 15 min. The supernatant and pellet contain the cytoplasmic and nuclear fractions, respectively. The nuclear pellet was washed gently with 10 volumes of Buffer A containing 1 × CPI (Roche, #11836145001), 0.5 mM DTT and 0.15% NP40 and centrifuged at 3200 g and 4 °C for 5 min. Nuclear proteins were extracted with 2 PCVs of high-salt Buffer B (420 mM NaCl, 20 mM Hepes–KOH pH 7.9, 20% v/v glycerol, 2 mM MgCl2, 0.2 mM EDTA, 0.1% NP40, 1 × CPI, 0.5 mM DTT) under gentle agitation at 4 °C for 1.5 h. Both the nuclear and cytoplasmic extracts were centrifuged at 3200 g and 4 °C for 60 min. Supernatants were collected and the protein concentration was measured by Bradford assay.

For GFP-affinity purification, 1 mg of nuclear or 3 mg of cytoplasmic extract was used as described (Spruijt et al., 2013). In short, protein lysates were incubated in binding buffer (20 mM Hepes–KOH pH 7.9, 300 mM NaCl, 20% glycerol, 2 mM MgCl2, 0.2 mM EDTA, 0.1% NP-40, 0.5 mM DTT and 1 × Roche protease inhibitor cocktail) on a rotating wheel at 4 °C for 1 h in triplicate with GFP-Trap agarose beads (#gta-200, Chromotek) or control agarose beads (#bab-20, Chromotek). The beads were washed two times with binding buffer containing 0.5% NP-40, two times with PBS containing 0.5% NP-40, and two times with PBS. On-bead digestion of bound proteins was performed overnight at RT in elution buffer (100 mM Tris–HCl pH 7.5, 2 M urea, 10 mM DTT) with 0.1 µg/ml of trypsin, and the eluted tryptic peptides were bound to C18 stage tips (ThermoFisher, USA) prior to mass spectrometry analysis.

Tryptic peptides were eluted from the C18 stage tips in H2O:acetonitrile (35:65) and dried. Samples were analyzed by nanoflow-LC–MS/MS with a Q-ExactivePlus mass spectrometer (Thermo Fisher Scientific) coupled to an Easy nano-LC 1000 HPLC (Thermo Fisher Scientific) in the tandem mass spectrometry mode with a 90 min total analysis time. The flow rate was 300 nl/min, buffer A was 0.1% (v/v) formic acid and buffer B was 0.1% formic acid in 80% acetonitrile. A gradient of increasing organic proportion was used in combination with a reversed-phase C18 separating column (2 µm particle size, 100 Å pore size, 15 cm length, 50 µm i.d., Thermo Fisher Scientific). Each MS scan was followed by a maximum of 10 MS/MS scans in the data-dependent mode. Blank samples were run between each set of three samples to minimize carryover.

The raw data files were analyzed with MaxQuant software (version 1.5.3.30) using the UniProt human FASTA database [49, 50]. The label-free quantification (LFQ) and match-between-runs options were selected. The intensity-based absolute quantification (iBAQ) algorithm was also activated for subsequent estimation of relative protein abundance [51]. The resulting protein files were analyzed with Perseus software (MQ package, version 1.6.12), in which contaminants and reverse hits were filtered out [50]. Protein identifications based on non-unique peptides, as well as proteins identified by only one peptide in the different triplicates, were excluded to increase protein prediction accuracy.

To identify the bait interactors, LFQ intensities were log2-transformed to obtain an approximately Gaussian distribution of the data. This allows missing values to be imputed based on the normal distribution of the overall data (in Perseus, width = 0.3; shift = 1.8). The normalized LFQ intensities were compared between grouped GFP triplicates and non-GFP triplicates in a two-tailed t-test with a permutation-based false discovery rate (FDR) of 1%. The threshold for significance (S0), based on the FDR and the ratio between GFP and non-GFP samples, was kept at the constant value of 2. Relative abundance plots were obtained by comparing the iBAQ values of the GFP interactors. The non-GFP iBAQ values were subtracted from those of the corresponding proteins in the GFP pull-down, and the results were then normalized to the menin-GFP bait protein for scaling and data representation purposes. All mass spectrometry data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository under the dataset identifier PXD031928.
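The transformation, imputation and bait normalization steps can be sketched outside Perseus as follows. This is an illustrative approximation using only the Python standard library, not the Perseus implementation: it assumes that missing values are drawn from a normal distribution centered `shift` standard deviations below the observed mean with a standard deviation of `width` times the observed one, and that negative background-corrected iBAQ values are clamped to zero (the clamping is an assumption not stated in the text). Function names and the protein identifiers in the usage note are hypothetical.

```python
import math
import random
import statistics


def log2_and_impute(lfq, width=0.3, shift=1.8, seed=0):
    """log2-transform LFQ intensities and impute missing values.

    Missing values (None or 0) are replaced by draws from a narrowed,
    down-shifted normal distribution (assumed behaviour, see lead-in).
    Requires at least two observed values to estimate the spread.
    """
    logged = [math.log2(v) if v else None for v in lfq]
    observed = [v for v in logged if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    rng = random.Random(seed)  # fixed seed for reproducible imputation
    return [
        v if v is not None else rng.gauss(mean - shift * sd, width * sd)
        for v in logged
    ]


def relative_abundance(gfp_ibaq, control_ibaq, bait):
    """Subtract control iBAQ values and scale every protein to the bait."""
    corrected = {
        protein: max(value - control_ibaq.get(protein, 0.0), 0.0)  # clamp: assumption
        for protein, value in gfp_ibaq.items()
    }
    if corrected.get(bait, 0.0) <= 0.0:
        raise ValueError("bait has no signal above background")
    return {protein: value / corrected[bait] for protein, value in corrected.items()}
```

Called as `relative_abundance({"bait": 110.0, "partner": 30.0}, {"bait": 10.0, "partner": 5.0}, "bait")`, the second function returns a value of 1.0 for the bait and 0.25 for the partner.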

Genome localization experiments by (green)CUT&RUN

Genome localization analysis of GFP-tagged menin wild type, R52G, E255K, E359K or E408Q and GFP-tagged JunD was performed by greenCUT&RUN with the combination of enhancer-MNase and LaG16-MNase as described [33]. To localize MLL1 or RNA polymerase II, standard CUT&RUN [32] was employed using MLL1/KMT2A antibodies (Epicypher, SKU 13-2004) or 8WG16 antibodies directed against the CTD of RPB1 at 1 µg/ml. For standard CUT&RUN and greenCUT&RUN, mononucleosomal Drosophila DNA was used as spike-in DNA for normalization purposes and sequencing libraries were prepared as described (Nizamuddin et al., 2021). In brief, purified DNA fragments were subjected to library preparation with NEBNext Ultra II and NEBNext Multiplex Oligo Set I/II according to the manufacturer's protocol (New England Biolabs) without size selection. For each library, the DNA concentration was determined using a Qubit instrument (Invitrogen, USA) and the size distribution was analyzed with Agilent Bioanalyzer chips (DNA high sensitivity assay). Paired-end sequencing reads of 75 nucleotides were generated (Illumina HiSeq 3000) with 6–32 M reads per sample (Additional file 6: Table S2). These NGS data have been deposited in the Sequence Read Archive [52] under the accession number PRJNA772915.

Bioinformatic analyses of genomics data

The HeLa cell datasets for H3K4me3 (ID: ENCFF063XTI), H3K4me1 (ENCFF617YCQ) and H3K27ac (ENCFF113QJM) were downloaded from ENCODE. The ATAC-seq dataset for HeLa cells was downloaded from SRA-NCBI (accession ID: SRR8171284). The datasets generated in-house were first passed through quality control using Trim Galore (version 0.6.3). Reads were then aligned to the human (hg38) and Drosophila (BDGP5) reference genomes using bowtie2 (version 2.3.4.1) with the options: --dovetail --local --very-sensitive-local --no-unal --no-mixed --no-discordant -I 10 -X 700 [34, 53]. An equal number of reads was randomly selected for WT menin and the menin mutants using sambamba (version 0.6.9) and used in the further analysis. The correlation plot was generated using deepTools (version 3.3.2) [54, 55].
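For readers scripting the alignment step, the bowtie2 options listed above can be assembled into an argument list as in the sketch below. The index name and file paths are placeholders, not the authors' actual files, and the authors' own command lines are available separately [56]; the function only builds and prints the command and does not run the aligner.

```python
import shlex

# Alignment options as listed in the text (paired-end, local mode).
BOWTIE2_OPTIONS = [
    "--dovetail", "--local", "--very-sensitive-local",
    "--no-unal", "--no-mixed", "--no-discordant",
    "-I", "10",   # minimum fragment length
    "-X", "700",  # maximum fragment length
]


def bowtie2_command(index, reads_1, reads_2, out_sam):
    """Build a bowtie2 paired-end command; all paths are placeholders."""
    return [
        "bowtie2", *BOWTIE2_OPTIONS,
        "-x", index,
        "-1", reads_1,
        "-2", reads_2,
        "-S", out_sam,
    ]


# Hypothetical file names, for illustration only.
cmd = bowtie2_command("hg38_index", "sample_R1.fq.gz", "sample_R2.fq.gz", "sample.sam")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it on a system where bowtie2 and the index are installed.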

Peaks were called using HOMER with default parameters, except that filtering based on clonal signals was disabled using the option -C 0. Both narrow and broad peaks were called with HOMER and merged using bedtools (version 2.27.1) (Quinlan & Hall, 2010), except for JunD, for which only narrow peaks were called. Before peak calling, reads of the control samples were normalized to the spike-in and used to generate TagDirectories with HOMER using the option -totalReads. To calculate the normalized total reads, the ratio of spike-in to human reads was calculated for the control and the experiment and multiplied by the total number of control reads. For H3K4me3 (ENCFF862LUQ), the peak coordinates were downloaded from ENCODE. The wild-type menin peaks overlapping with JunD, MLL1 and H3K4me3 were identified using bedtools and classified into eight categories using an in-house script, which can be downloaded [56].
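The eight categories follow from the three yes/no overlaps (JunD, MLL1, H3K4me3), giving 2 × 2 × 2 combinations per menin peak. The sketch below illustrates that logic only; it is not the authors' in-house script [56], it uses a simple quadratic overlap test instead of bedtools, and it assumes BED-style half-open intervals. All peak coordinates in the example are invented.

```python
from collections import Counter


def overlaps(peak, others):
    """True if `peak` overlaps any interval in `others` (half-open, BED-style)."""
    chrom, start, end = peak
    return any(c == chrom and s < end and start < e for c, s, e in others)


def classify_menin_peaks(menin_peaks, jund, mll1, h3k4me3):
    """Count menin peaks per (JunD, MLL1, H3K4me3) overlap combination."""
    counts = Counter()
    for peak in menin_peaks:
        key = (overlaps(peak, jund), overlaps(peak, mll1), overlaps(peak, h3k4me3))
        counts[key] += 1
    return counts


# Invented toy peaks as (chromosome, start, end).
menin = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
jund = [("chr1", 150, 250)]
mll1 = [("chr1", 190, 210), ("chr1", 550, 650)]
h3k4me3 = [("chr2", 50, 100)]  # ends where the chr2 menin peak starts: no overlap

categories = classify_menin_peaks(menin, jund, mll1, h3k4me3)
```

For genome-scale peak sets, an interval-tree or sorted-sweep approach (as bedtools uses) would be preferable to this pairwise comparison.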

HOMER was used with default parameters to find AP1/ATF motifs within the peaks. In the differential peak analysis, menin peaks with a coverage fold change ≤ 4 or a p value ≥ 0.001 against the mutants were considered unaffected by the menin mutations.
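The differential-peak criterion amounts to a two-condition test per peak, shown here as a minimal sketch. It assumes the fold change and p value have already been computed by the differential peak tool; the function and parameter names are hypothetical.

```python
def is_unaffected(fold_change: float, p_value: float,
                  max_fold: float = 4.0, p_cutoff: float = 0.001) -> bool:
    """A menin peak counts as unaffected by a mutation if either condition holds:
    coverage fold change <= max_fold, or p value >= p_cutoff."""
    return fold_change <= max_fold or p_value >= p_cutoff
```

Equivalently, a peak is treated as affected only when the fold change exceeds 4 and the p value is below 0.001.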

Heatmaps were generated using deepTools (version 3.3.2) with default parameters. All next-generation sequencing datasets have been deposited in the Sequence Read Archive (SRA) portal of the NCBI under the accession ID PRJNA772915. The command lines used for the data analysis can be downloaded [56].
