Atlas of mRNA translation and decay for bacteria

Bacterial growth

Overnight bacterial cultures (not exceeding an optical density at 600 nm (OD600) of 1) were used to inoculate the main culture at a starting OD600 of 0.03–0.05. Cultures were harvested by centrifugation on reaching logarithmic phase (OD600 = 0.4–0.8) unless indicated otherwise. Unless stated otherwise, bacterial cultures were grown at 37 °C with rotation in the recommended growth media. Specifically, L. plantarum (ATCC 8014) and L. reuteri (DSM 17938) were grown in MRS broth (Sigma-Aldrich).

L. plantarum stress treatments were carried out as follows. Stationary-phase cultures were grown for 27 h post inoculation and harvested at OD600 = ~4.5. To generate untreated control, heat shock and low-nutrient samples, biological replicate cultures (40 ml) were grown to mid-log phase (OD600 = 0.3–0.6) and split (10 ml for untreated control, 15 ml for heat shock and 15 ml for the low-nutrient sample), and cells were harvested by centrifugation. Untreated control pellets were flash-frozen immediately for RNA analysis. For heat shock, cells were resuspended in prewarmed MRS broth and incubated in a thermomixer for 15 min at 60 °C. Low-nutrient cell pellets were washed thoroughly with 50 ml of 0.5× LB medium (Sigma) and centrifuged, and the supernatant was completely removed. Cells were then resuspended in prewarmed 0.5× LB medium (Sigma) and harvested after a total incubation time of 15 min at 37 °C.

B. subtilis (168trpC2) was cultured in 2× YT (1.6% (wt/vol) tryptone (Bacto), 1% (wt/vol) yeast extract (Bacto) and 0.5% (wt/vol) NaCl). For extended growth, we collected samples at mid-log phase and at 24 h, 48 h and 8 days post inoculation. Salt stress was applied by mixing equal volumes of 2 M NaCl and mid-log-phase B. subtilis culture (OD600 = 0.5–0.6), followed by 10 min incubation at room temperature and harvesting by centrifugation.

Biological replicate cultures of L. plantarum, L. reuteri, E. coli (Invitrogen, no. 18265-017) and B. amyloliquefaciens were grown in LB medium to log phase, then cultures were split to generate samples for the untreated control and the random fragmented control. CAM was added to mid-log-phase L. plantarum and L. reuteri cultures at a final concentration of 100 µg ml–1; cultures were incubated for 5 min at 37 °C and subsequently harvested on ice containing an additional 100 µg ml–1 CAM5. MUP treatment (final concentration 65 µg ml–1) of mid-log L. plantarum and L. reuteri was carried out for 10 min at 37 °C, followed by centrifugation and flash-freezing of the pellet.

C. crescentus (strain NA1000) was grown in PYE medium containing 0.2% (wt/vol) peptone (Bacto) and 0.1% yeast extract (Bacto) at 30 °C to mid-log phase, followed by centrifugation and flash-freezing of the cell pellet. Synechocystis strain PCC6083 was cultured in BG11 growth medium at 30 °C at a light intensity of 30 µE and 1% atmospheric CO2 and harvested at mid-log phase51.

Heat shock treatments for E. coli, B. subtilis wild-type and RNase knockout strains were carried out as follows. E. coli strain MG1655 was grown in LB medium at 37 °C to late log phase (OD600 = 0.6–0.8); B. subtilis wild-type (168trpC2) and rnjA, rnjB and rny deletion strains41,52 were cultured at 37 °C in LB (supplemented with 100 µg ml–1 spectinomycin for rnjA and 5 µg ml–1 kanamycin for rnjB and rny deletion strains) to log growth phase (OD600 = 0.2–0.3). Each culture was split in two, the cell pellets were collected by centrifugation and the untreated control sample was flash-frozen immediately. Heat shock was carried out by resuspending cells in LB medium prewarmed to 65 °C, followed by 10 min incubation at 65 °C. Samples were harvested by centrifugation and shock-frozen in an ethanol/dry ice bath for RNA analysis.

For antibiotic treatment of E. coli, strain MG1655 was grown in LB at 37 °C to exponential phase and cultures divided in three for sampling of untreated control, MUP (65 µg ml–1) and CAM (100 µg ml–1). Cultures were incubated with antibiotics for 10 min at 37 °C, harvested by centrifugation and shock-frozen in an ethanol/dry ice bath for RNA analysis.

For expression of RNase J in E. coli, TOP10 cells were transformed with a plasmid containing the RNase JA coding sequence from S. pyogenes (plasmid pEC622:pEC85 with rnjA promoter + rnjA, expressing RNase J1 for complementation in S. pyogenes; pEC85 replicates in E. coli; a gift from E. Charpentier).

Strains A. finegoldii (DSM 17242), P. copri (DSM 18205), P. merdae (DSM 19495) and P. timonensis (DSM 22865) were anaerobically cultured in GAM broth (HyServe) for 24 h. Where indicated, species were treated with either MUP (65 µg ml–1) or CAM (100 µg ml–1) for 10 min. After incubation, 1 ml aliquots were removed, and cells were catalytically inactivated by the addition of RNAprotect Bacteria Reagent (Qiagen) and harvested by centrifugation.

Gut microbiome samples

To monitor the effects of drugs on the dynamics of the metadegradome, we screened two anaerobically cultivated human gut microbiomes against four drugs over time. Gut microbiomes from healthy donors were collected and resuspended in 40% glycerol, and 500 µl aliquots were stored at –80 °C. One aliquot was used to inoculate 10 ml of starter culture, which was maintained in MCDA broth for 24 h53 in the absence of oxygen and in the presence of 2.5–3.0% hydrogen at 37 °C. To ensure that the microbial community was in a metabolically active growth phase, we diluted the starter cultures 1:10 with fresh MCDA medium 24 h before starting the experiment, in a total volume of 30 ml, under the anaerobic conditions given above but without shaking. For antibiotic treatment, 6 ml of faecal culture was transferred into a 15 ml Falcon tube either without antibiotics or supplemented with CAM (80 µg ml–1), MUP (650 µg ml–1), DOX (5 µg ml–1) or ERY (5 µg ml–1). Before treatment (t0) and at 5 min, 30 min and 48 h post treatment, 1 ml aliquots were removed, and growing cells were catalytically inactivated by the addition of RNAprotect Bacteria Reagent (Qiagen) and harvested by centrifugation. The number of replicates for each sample is listed in Supplementary Table 2.

RNA extraction

Extraction of RNA (if not stated otherwise) was performed as described in ref. 54, with minor modifications. In brief, cell pellets were resuspended in equal volumes of LET (25 mM Tris pH 8.0, 100 mM LiCl, 20 mM EDTA) and water-saturated phenol pH 6.6 (Thermo Fisher). Cells were lysed with acid-washed glass beads (Sigma-Aldrich) by vortexing for 3 min in a MultiMixer. Following the addition of equal volumes of phenol/chloroform/isoamyl alcohol pH 4.5 (25:24:1) and nuclease-free water, lysis was extended by an additional 2 min of vortexing followed by centrifugation. The resulting aqueous phase was purified in two steps using phenol/chloroform/isoamyl alcohol (25:24:1) followed by chloroform. Following centrifugation, the clean aqueous phase was precipitated with sodium acetate-ethanol.

For Lactobacillus mixtures (Fig. 3a), RNA extracted from L. plantarum (untreated control and MUP treated) was mixed at different ratios with RNA extracted from L. reuteri (untreated) before the RNA ligation step of the 5P-seq library protocol. Technical replicates of the Microbial Community Standard (Zymobiomics, no. D6300, lot no. ZRC 190633), consisting of eight deactivated bacterial strains, were generated by extraction of RNA from 75–125 µl of thawed cell suspension. Vaginal swab samples were mechanically lysed using beads in 1,000 µl of DNA/RNA shield (ZymoResearch) and lysate was stored at –80 °C for 2 months before use. Lysate was thawed and 250 µl was used to extract microbial RNA.

Faeces from a healthy donor were collected and transported in 40% glycerol. Technical replicates of RNA were extracted on the same day as described above, with the following minor modifications. In brief, 500 µl of faeces/glycerol suspension was mixed with an equal volume of LET buffer containing SDS (25 mM Tris pH 8.0, 100 mM LiCl, 20 mM EDTA, 10% SDS) and water-saturated phenol pH 6.6 (Thermo Fisher). Lysis was performed by vortexing with carbide beads; the duration was extended to 10 min after the addition of equal volumes of phenol/chloroform/isoamyl alcohol pH 4.5 (25:24:1) and nuclease-free water. All subsequent steps were performed as described above. RNA from the cultivated gut microbiome time-course experiment was extracted as stated above, using the LET buffer containing SDS (25 mM Tris pH 8.0, 100 mM LiCl, 20 mM EDTA, 10% SDS). Compost RNA was extracted from 2 g of starting material (from Sundbyberg, Sweden) using the RNeasy PowerSoil Total RNA Kit (Qiagen) according to the manufacturer’s guidelines.

RNA quality was assessed by loading either 1 µg of total RNA on a 1.2% agarose gel or 12 ng of total RNA on a BioAnalyzer using an RNA Nano 6000 chip (Agilent Technologies). Before 5P-seq library preparation, RNA was quantified using the Qubit RNA BR (Broad-Range) kit (Thermo Fisher Scientific) according to the manufacturer’s guidelines.

DNA extraction

Purification of DNA was carried out by lysing cells from the gut microbial community with glass beads (Sigma), combined with phenol/chloroform/isoamyl alcohol extraction and ethanol precipitation. The protocol was identical to the RNA extraction procedure described above, except that LET (25 mM Tris pH 8.0, 100 mM LiCl, 20 mM EDTA, 10% SDS) was replaced with 10 mM Tris-HCl pH 8.0 and phenol/chloroform/isoamyl alcohol pH 4.5 was replaced with UltraPure phenol/chloroform/isoamyl alcohol pH 8.0 (Invitrogen).

Polyribosome fractionation

Polyribosome fractionation was performed as previously described55, with minor modifications. In brief, B. subtilis (168trpC2) was cultured in LB medium to mid-log phase at 37 °C and harvested on ice containing 100 µg ml–1 CAM, with 5 min centrifugation. The resulting pellet was lysed in 1× TN (50 mM Tris/HCl pH 7.4, 150 mM NaCl, 1 mM DTT, 100 μg ml–1 CAM and a complete EDTA-free protease inhibitor tablet) using glass beads with vortexing for 2 min, followed by 5 min incubation on ice. Lysis and incubation were repeated twice. Cell debris was cleared by centrifugation at 1,500g for 5 min at 4 °C and the supernatant was loaded onto a 15–50% sucrose gradient with an 80% cushion. After ultracentrifugation at 36,000 rpm for 90 min, the gradient was fractionated while absorbance at 254 nm (Abs254) was monitored. Subsequently, RNA was extracted from sucrose fractions by addition of equal volumes of phenol/chloroform/isoamyl alcohol pH 4.5 (25:24:1) and nuclease-free water, followed by 2 min of vortexing and centrifugation. The aqueous phase was further cleaned by the addition of chloroform, vortexing and centrifugation, and the resulting aqueous phase was precipitated with sodium acetate/ethanol.

Preparation of 5P-seq libraries

Our 5P-seq libraries were prepared as previously described5,41, with minor modifications, using 150–9,000 ng of total RNA as input. To prepare random fragmented samples (negative controls), ribosomal RNA was depleted from DNA-free RNA, which was then fragmented by incubation for 5 min at 80 °C in fragmentation buffer (40 mM Tris acetate pH 8.1, 100 mM KOAc, 30 mM MgOAc). The reaction was purified using two volumes of RNACleanXP beads (Beckman Coulter) as recommended by the manufacturer. Free 5' OH sites were rephosphorylated using 5 U of T4 polynucleotide kinase (NEB) with incubation at 37 °C for 60 min, as recommended by the manufacturer. Rephosphorylated fragmented RNA was purified using phenol/chloroform/isoamyl alcohol (25:24:1) followed by sodium acetate/ethanol precipitation. From this step forward, the procedures for random fragmented and standard 5P-seq library preparation were the same41. RNA was ligated to either the rP5_RND or rP5_RNA oligo (Supplementary Table 1) containing unique molecular identifiers. Ribosomal RNA was depleted using the Ribozero rRNA removal kit (Illumina), which is suitable for bacteria, yeasts and human samples. The rRNA-depleted sample was purified using 1.8 volumes of Ampure beads (Abcam) and fragmented by heat (80 °C) for 5 min in 5× fragmentation buffer (200 mM Tris acetate pH 8.1, 500 mM KOAc, 150 mM MgOAc). Samples were subsequently reverse transcribed using random hexamers for priming. The resulting complementary DNA was bound to streptavidin beads (M-280) and subjected to enzymatic DNA end repair and fill-in of adenine at the 5' protruding ends of DNA fragments using Klenow Fragment (NEB). The common adaptor (P7-MPX) was ligated and 5P-seq libraries were amplified by PCR (15–17 cycles), purified using 1.8 volumes of Ampure beads (Abcam) and quantified with Qubit (Thermo Fisher). Library size was estimated from bioanalyser traces. 5P-seq libraries were pooled by mixing equal amounts of each sample, followed by enrichment of 300–500-nt fragments.

During the work described in this manuscript, our laboratory developed a simplified high-throughput 5P-seq strategy that was applied to a subset of samples, as detailed in Supplementary Table 2. Libraries compiled by this strategy were generated as recently described15,56. In brief, DNA-free RNA was ligated with RNA oligos containing unique molecular identifiers. Ligated RNA was reverse transcribed by priming with oligos containing a random hexamer and an Illumina-compatible region. RNA was eliminated by the addition of NaOH. Ribosomal RNA was depleted by the addition of in-house rRNA DNA oligo depletion mixes (Supplementary Table 1) to cDNA and performing duplex-specific nuclease (DSN, Evrogen) treatment. rRNA-depleted cDNA was PCR amplified (15–17 cycles). Depletion of rRNA with Ribozero Illumina (for bacteria and yeasts) was done after the single-stranded RNA ligation step. Ribosomal-depleted RNA was purified and reverse transcribed using the oligos mentioned above and then amplified by PCR. Libraries were quantified by fluorescence (Qubit, Thermo Fisher), their size estimated using an Agilent Bioanalyser and sequenced using either a NextSeq500 or NextSeq2000 Illumina sequencer.

Metagenomic library preparation

DNA libraries were prepared from time points of untreated (t0) and 48 h (t48h)-treated faecal cultures according to the manufacturer’s guidelines (ThruPlex DNA-Seq Kit, Takara Bio). In brief, DNA was sheared using an ME220 focused ultrasonicator (Covaris) to an average size of 350 nt (microtube AFA Fiber crimp-Cap, PN 520053). Sheared DNA (30 ng) was used to implement a DNA end repair reaction, followed by library synthesis and PCR amplification (ten cycles) with primers NEBi5 and PE2. DNA libraries were purified using Ampure XP, quantified by fluorescence (Qubit, Thermo Fisher) and sequenced using a NextSeq2000 Illumina sequencer.

Sequence data preprocessing and mapping

Demultiplexing and fastq generation of sequencing bcl image files was performed using bcl2fastq (v.2) with default options. Adaptor and quality trimming was performed with the bbduk programme of the BBTools suite (https://sourceforge.net/projects/bbmap/), with options (qtrim=r, ktrim=r, hdist=3, hdist2=2, k=20, mink=14, trimq=16, minlen=30, maq=16), using the BBTools default adaptor set and polyG or polyA sequences for short reads. To reduce computational time, reads with both identical unique molecular identifier (UMI) and insert sequence were deduplicated before mapping using the deduping programme of the BBTools suite with default parameters. UMI sequences found in the first eight bases of each read were extracted using UMI-tools (v.1) with default options (using --bc-pattern NNNNNNNN).
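
The UMI handling described above can be illustrated with a minimal Python sketch. It is not the BBTools or UMI-tools implementation: the function names and toy reads are hypothetical, and the only details taken from the text are the 8-nt UMI at the start of each read and the collapsing of reads that share both UMI and insert.

```python
from typing import Iterable, List, Tuple

UMI_LENGTH = 8  # corresponds to the --bc-pattern NNNNNNNN given in the text


def extract_umi(read: str, umi_length: int = UMI_LENGTH) -> Tuple[str, str]:
    """Split a read into (UMI, insert); the UMI is the first umi_length bases."""
    if len(read) <= umi_length:
        raise ValueError("read is too short to contain a UMI and an insert")
    return read[:umi_length], read[umi_length:]


def collapse_exact_duplicates(reads: Iterable[str]) -> List[Tuple[str, str]]:
    """Keep one copy of each (UMI, insert) pair, preserving first-seen order."""
    seen = set()
    unique = []
    for read in reads:
        pair = extract_umi(read)
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


# Hypothetical toy reads: the first two are exact duplicates.
toy_reads = [
    "ACGTACGTTTGGCCAATT",
    "ACGTACGTTTGGCCAATT",
    "TTTTAAAATTGGCCAATT",
]
unique_reads = collapse_exact_duplicates(toy_reads)
```

The third read shares its insert with the first two but carries a different UMI, so it is kept as an independent molecule.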

Bacterial genomes were downloaded on 21 March 2019 from the National Center for Biotechnology Information Assembly database (https://www.ncbi.nlm.nih.gov/assembly/) with the following search terms: “bacteria”[Filter] AND (latest[filter] AND (“representative genome”[filter] OR “reference genome”[filter]) AND (all[filter] NOT “derived from surveillance project”[filter] AND all[filter] NOT anomalous[filter])). The list was further filtered to include only one strain per species, giving priority to genomes marked as “reference”. The resulting 5,804 genomes (Supplementary Table 4) were used to build the reference index. The index was built with the bbmap programme of the BBTools suite, with default options (and k=10). Besides the reference index containing the 5,804 bacterial genomes, separate indices were built for individually cultured species and genus-level groups. Genomes for the latter groups were chosen from the initial set of 5,804 genomes. Alignment was performed with the bbmap programme of the BBTools suite, with the parameters (32bit=t -da -eoom k=11 strictmaxindel=10 intronlen=0 t=16 trd=t minid=0.94 nzo=t). Alignment files were sorted and indexed with SAMtools57. Deduplication based on UMIs was then performed with UMI-tools (v.1)58, with the options (--soft-clip-threshold 1 --edit-distance 2 --method unique). The BAM files were then processed to enumerate the number of reads per species. We used a prestored dictionary of chromosome and species names and a custom script to perform counts in each species. The distribution of counts between genes coding for rRNA, tRNA, mRNA and other RNA types was computed with bedtools (https://bedtools.readthedocs.io/en/latest/). Counts at mRNA coding genes were used to select the top species in complex samples, as described below.
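
The per-species counting step can be sketched as follows. This is not the authors' custom script: the function name, the lookup table and the reference names are hypothetical, and the sketch assumes the reference sequence name of each aligned read has already been read from the BAM file.

```python
from collections import Counter
from typing import Dict, Iterable


def count_reads_per_species(
    read_chromosomes: Iterable[str], chrom_to_species: Dict[str, str]
) -> Counter:
    """Tally aligned reads per species via a chromosome-to-species lookup."""
    counts: Counter = Counter()
    for chrom in read_chromosomes:
        species = chrom_to_species.get(chrom)
        if species is not None:  # skip sequences missing from the dictionary
            counts[species] += 1
    return counts


# Hypothetical lookup and reference names of four aligned reads.
chrom_to_species = {
    "chrA": "Species one",
    "plasmidA": "Species one",
    "chrB": "Species two",
}
species_counts = count_reads_per_species(
    ["chrA", "plasmidA", "chrB", "unknown"], chrom_to_species
)
```

Mapping chromosomes and plasmids to a shared species name lets reads from all replicons of one genome contribute to the same count.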

Individually cultured species were directly mapped to their reference indices. All Zymobiomics mixtures and vaginal, faecal and compost microbiome samples were aligned to the bacterial reference index that included the 5,804 species. We chose species with at least 1,000 reads in the coding regions in all the samples except for compost, where we relaxed the selection to 300 reads because there were fewer species with high counts. In total, 83 bacterial species with specified coverage belonging to 46 genera were identified from all the samples. Reference indices were built for these 46 genera (species were chosen from the preselected list of 5,804), and all complex faecal and compost samples were separately aligned to those references.
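
The coverage-based selection can be sketched as a simple threshold filter. The thresholds (1,000 reads, relaxed to 300 for compost) come from the text; the function name, the data layout and the reading that the threshold is applied to each sample separately are assumptions made for illustration.

```python
from typing import Dict, List

DEFAULT_MIN_READS = 1000                      # threshold for most sample types
MIN_READS_BY_SAMPLE_TYPE = {"compost": 300}   # relaxed threshold for compost


def select_species(coding_counts: Dict[str, int], sample_type: str) -> List[str]:
    """Return species whose coding-region read count meets the threshold."""
    threshold = MIN_READS_BY_SAMPLE_TYPE.get(sample_type, DEFAULT_MIN_READS)
    return sorted(s for s, n in coding_counts.items() if n >= threshold)


# Hypothetical coding-region counts for one sample.
toy_counts = {"Species one": 1500, "Species two": 450, "Species three": 120}
faecal_selection = select_species(toy_counts, "faecal")
compost_selection = select_species(toy_counts, "compost")
```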

5PSeq and ribosome dynamics analysis

Deduplicated alignment files, along with genome sequence and annotation files, were provided as input to our recently developed fivepseq package14 for analysis and visualization of the 5' endpoint distribution of reads, with default options applied. Fivepseq provides information regarding the presence of 3 nt periodicity (assessed by fast Fourier transform, FFT), the distribution of 5' counts relative to CDS start/stop or to nucleotides within each codon (translational frames), and codon- and amino acid-specific protection patterns. Because fivepseq analyses only one genome per run, alignment files for complex samples were used as input for fivepseq for each genome separately. For genus-level analysis, sequence and annotation files for individual species were concatenated into one. Finally, we compiled an online resource with interactive browsing of reports produced from the untreated bacteria with high coverage, at http://metadegradome.pelechanolab.com.
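
To make the notion of 3 nt periodicity concrete, the sketch below evaluates the Fourier component with a period of 3 nt for a vector of 5' endpoint counts. It is an illustration only: fivepseq's own FFT computation and normalization may differ, and the function name and toy profiles are hypothetical.

```python
import cmath
from typing import Sequence


def periodicity_amplitude(counts: Sequence[float], period: int = 3) -> float:
    """Amplitude of the Fourier component with the given period,
    divided by the total signal so that the result lies between 0 and 1."""
    total = sum(counts)
    if total == 0:
        return 0.0
    component = sum(
        c * cmath.exp(-2j * cmath.pi * i / period) for i, c in enumerate(counts)
    )
    return abs(component) / total


# Hypothetical profiles over ten codons: one frame enriched versus flat.
periodic_profile = [9, 1, 1] * 10
flat_profile = [1, 1, 1] * 10
periodic_score = periodicity_amplitude(periodic_profile)
flat_score = periodicity_amplitude(flat_profile)
```

A profile in which one nucleotide of every codon carries most of the counts gives a high amplitude, whereas a flat profile gives an amplitude close to zero.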

To generate a ribosome protection phenotype we took the sum of counts positioned 30 to 1 nt upstream of each amino acid and concatenated per-amino acid-scaled counts to obtain a vector describing ribosome protection in each sample. These vectors were used as input for PCA performed with the prcomp function of the R package stats (v.3.6.1). PCA plots were generated with the autoplotly package (v.0.1.2) in R.
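
The construction of the protection vector can be sketched as below. The analysis itself was run in R with prcomp; this Python sketch covers only the vector-building step and rests on assumptions: each amino acid is represented by a 30-position profile of summed counts (positions 30 to 1 nt upstream), each profile is scaled to sum to 1, and profiles are concatenated in alphabetical order. The names are hypothetical.

```python
from typing import Dict, List, Sequence

WINDOW = 30  # positions 30 to 1 nt upstream of each amino acid


def protection_vector(profiles: Dict[str, Sequence[float]]) -> List[float]:
    """Scale each amino acid's profile to sum to 1 and concatenate the
    scaled profiles in alphabetical order of amino acid name."""
    vector: List[float] = []
    for amino_acid in sorted(profiles):
        profile = profiles[amino_acid]
        if len(profile) != WINDOW:
            raise ValueError(f"{amino_acid}: expected {WINDOW} positions")
        total = sum(profile)
        vector.extend((c / total if total else 0.0) for c in profile)
    return vector


# Hypothetical profiles for two amino acids.
toy_profiles = {"ALA": [1] * 30, "GLY": [0] * 29 + [6]}
toy_vector = protection_vector(toy_profiles)
```

One such vector per sample, stacked into a matrix, would then serve as the input for PCA.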

Analysis of degradome fragments with respect to genomic features

Bacillus subtilis annotations for transcriptional start sites (TSS) and UTRs were obtained from BSGatlas59. E. coli annotations for TSS and UTRs were obtained from RegulonDB60. 5′P reads were assigned to a given TSS if they fell within a ±5 base pair window of that TSS. For read distributions (Extended Data Fig. 6a) and heat maps (Extended Data Fig. 6e,f), replicates were pooled and either raw or library size-normalized read counts were averaged over replicates. The strand-specific coverage of 5′P reads was computed and represented as heat maps using deepTools (v.3.3.2)61. For differential analysis of 5′P reads across genic annotation features, reads in each feature were counted using featureCounts from the R package Rsubread (v.2.6.4)62,63. Differential analyses were performed using edgeR (v.3.34.1)63.
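
The ±5 base pair assignment rule can be sketched as follows. Only the window size comes from the text; the function name, the coordinates and the choice of the closest TSS when more than one lies within the window are assumptions.

```python
from bisect import bisect_left
from typing import Optional, Sequence

TSS_WINDOW = 5  # ±5 bp, as stated in the text


def assign_to_tss(
    read_5p: int, sorted_tss: Sequence[int], window: int = TSS_WINDOW
) -> Optional[int]:
    """Return the closest TSS within ±window bp of a 5'P position, else None.

    sorted_tss must hold ascending TSS coordinates for one strand of one
    chromosome; picking the closest TSS is an assumption of this sketch.
    """
    idx = bisect_left(sorted_tss, read_5p)
    best: Optional[int] = None
    # Only the TSS just before and just after the read can be the closest.
    for tss in sorted_tss[max(idx - 1, 0): idx + 1]:
        distance = abs(tss - read_5p)
        if distance <= window and (best is None or distance < abs(best - read_5p)):
            best = tss
    return best


tss_sites = [100, 250, 400]  # hypothetical TSS coordinates
assignments = {pos: assign_to_tss(pos, tss_sites) for pos in (95, 103, 110, 245, 406)}
```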

Cleavage motifs

The 5′P cleavage motifs were computed from the sequence composition of transcripts in the region spanning 4 nucleotides upstream and downstream of the 5' mapping positions of all reads. If multiple reads mapped to the same position, the base composition of that region was counted once per read. Using the resulting base frequencies, we produced sequence logos with the R package ggseqlogo64, using Shannon entropy (bits) to compute the contribution of each nucleotide at each position.
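
The motif computation can be illustrated with a short sketch. It is not the ggseqlogo implementation: it assumes forward-strand coordinates, takes the window as 4 nt before the 5' position plus 4 nt starting at it, and computes per-position information as log2(4) minus the Shannon entropy without any small-sample correction. All names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence

FLANK = 4  # nucleotides taken on each side of the 5' mapping position


def cleavage_windows(
    sequence: str, positions: Sequence[int], flank: int = FLANK
) -> List[str]:
    """Extract the window around each 5' position (0-based). Repeated
    positions are kept, so each read contributes once to the composition."""
    windows = []
    for pos in positions:
        start, end = pos - flank, pos + flank
        if start >= 0 and end <= len(sequence):  # skip truncated windows
            windows.append(sequence[start:end])
    return windows


def information_bits(windows: Sequence[str]) -> List[float]:
    """Per-column information content: 2 bits minus the Shannon entropy."""
    bits = []
    for column in zip(*windows):
        counts = Counter(column)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        bits.append(2.0 - entropy)
    return bits


toy_windows = cleavage_windows("AAAACCCCGGGG", [4, 4, 1])
toy_bits = information_bits(["AC", "AG"])
```

A fully conserved position carries 2 bits, while a position with all four bases at equal frequency carries 0 bits.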

Taxonomic analysis

Taxonomic trees were generated with the graphlan tool (v.0.9.7) (https://huttenhower.sph.harvard.edu/graphlan). Taxonomic lineage information for all 84 bacterial species identified in our samples was downloaded from the NCBI Taxonomy database with the efetch programme from NCBI e-utilities. Trees were annotated with information on library size, 3 nt periodicity, preferred ribosome protection frame and the presence of enzyme annotations for each genus (Supplementary Data 1). Library size was defined as the maximum number of mRNA reads per species per sample, scaled to the range 0–1 (with 1 corresponding to ≥1 million reads). The 3 nt periodicity was computed taking into account the absolute value of the FFT signal for the 3 nt periodicity wave and the preference for ribosome protection frame, as computed by the fivepseq package. For FFT, the maximum of the signals for transcripts aligned at either the start or the end was taken. The preference for ribosome protection frame was assessed based on the value of the frame protection index (FPI), computed by the fivepseq package as $\mathrm{FPI}_i = \log_2\left(2F_i / \left(\sum_j F_j - F_i\right)\right)$ for each frame $F_i$. The frame with the maximum absolute FPI value was regarded as (mis)preferred, and the significance of this preference was assessed based on t-test P values comparing counts in the given frame with those in the other two combined (FPI and P values are found in the frame_stats.txt file of the fivepseq output). A positive FPI value means that one of the nucleotides in each codon has, on average, higher counts (is preferred), whereas a negative value means that one of the nucleotides has, on average, lower counts (is mispreferred) and the other two nucleotides receive higher counts. For example, if F1 is preferred, it will have a positive FPI value and will be highlighted in the tree as a single preferred frame of protection, whereas if, say, F2 is mispreferred (has a negative FPI value), the tree will highlight F0 and F1 as the frames of preference. Both FFT and FPI values were in the range 0–1, and the maximum of the two values was taken to describe the strength of 3 nt periodicity. Enzyme annotations were obtained from the EggNOG database (v.5.0)65. The presence of each enzyme in each genus was counted as a number between 0 and 1, depending on the fraction of species within the genus annotated with the enzyme. The tree highlights these values with corresponding opacity.
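
The frame preference logic can be illustrated with a small sketch. It assumes that the index takes the form log2(2 × Fi/(ΣF − Fi)), a form that allows the negative values discussed above; the exact formula should be treated as an assumption and the frame_stats.txt output of fivepseq as the authoritative source. The function names and counts are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def frame_protection_index(frame_counts: Sequence[float]) -> List[float]:
    """FPI for each of the three frames, assuming the log2 ratio of a frame's
    counts to the mean of the other two frames."""
    if len(frame_counts) != 3:
        raise ValueError("expected counts for frames F0, F1 and F2")
    total = sum(frame_counts)
    fpis = []
    for f in frame_counts:
        others = total - f
        if f <= 0 or others <= 0:
            raise ValueError("FPI is undefined for zero counts in this sketch")
        fpis.append(math.log2(2 * f / others))
    return fpis


def strongest_frame(frame_counts: Sequence[float]) -> Tuple[int, float]:
    """Return (frame index, FPI) for the frame with the largest |FPI|."""
    fpis = frame_protection_index(frame_counts)
    index = max(range(3), key=lambda i: abs(fpis[i]))
    return index, fpis[index]


# Hypothetical frame counts: F0 preferred, then F0 mispreferred.
preferred = strongest_frame([400, 100, 100])
mispreferred = strongest_frame([10, 100, 100])
```

In the second toy case the strongest signal is a negative FPI for F0, which corresponds to the situation in which the other two frames would be highlighted as the frames of preference.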

Functional analysis

Functional analysis was performed on lists of genes with either differential abundance or differential frame protection between stress and control conditions. The FPI for each transcript was computed as described above. The fold change of FPIs was computed as the difference in mean FPI between the stress/mutant condition and the untreated/wild-type control for each transcript. Differential abundance of 5' endpoint counts for each transcript was computed as log2(fold change) with the DESeq2 R package66, using adaptive Student’s t prior shrinkage estimation (apeglm).

Gene set enrichment analysis was performed with the R package WebGestaltR67 based on either fold change differential expression or FPI values. GMT formatted files for each species were obtained by modification of Uniprot protein annotations and used as enrichment databases. Annotation GFF files for each species were taken as the basis for generation of the reference gene lists. The significance of enrichment was computed with a hypergeometric test, and FDR was used for multiple test correction.
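
The statistics named above can be sketched in a few lines. This is not the WebGestaltR code: the function names and numbers are hypothetical, and the use of the Benjamini-Hochberg procedure for FDR is an assumption, because the text states only that FDR was used.

```python
from math import comb
from typing import List, Sequence


def hypergeom_upper_tail(k: int, set_size: int, n_drawn: int, n_total: int) -> float:
    """P(X >= k) when n_drawn genes are sampled without replacement from
    n_total reference genes, of which set_size belong to the gene set."""
    upper = min(set_size, n_drawn)
    favourable = sum(
        comb(set_size, i) * comb(n_total - set_size, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_total, n_drawn)


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: all 5 selected genes fall in a 5-gene set out of 10.
toy_p = hypergeom_upper_tail(5, 5, 5, 10)
toy_fdr = benjamini_hochberg([0.01, 0.04, 0.03])
```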

Ethical oversight

Collection and processing of sequenced vaginal samples was approved by the Regional Ethical Review Board in Stockholm (no. 2017/725-31). Ethical approval for faecal samples and cultures was waived by the review board because only deidentified samples from healthy donors were used and no samples were stored in a biobank. Informed consent was obtained from all donors before sample collection.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
