Open chromatin analysis in Trypanosoma cruzi life forms highlights critical differences in genomic compartments and developmental regulation at tDNA loci

Core and disruptive genomic compartments of T. cruzi have a different open chromatin profile

To investigate whether the lack of transcriptional regulation of protein-coding genes is reflected in the open chromatin profile of T. cruzi, we performed genome-wide FAIRE-seq, recovering open chromatin regions from the genome of exponentially proliferating epimastigotes. Biological duplicates were processed together with input control samples (cross-links reversed before DNA extraction) to account for any bias in genome sequencing and assembly. FAIRE-seq reads presented high duplication levels (66.2–72.1% in epimastigotes; Additional file 1: Figure S1A) and high Phred quality scores (> 30; Additional file 1: Figure S1B). After mapping (Additional file 1: Figure S1C), only reads with mapping quality scores (MAPQs) above 10 were retained, resulting in approximately 116 million mapped reads (Additional file 1: Figure S1D). Spearman correlation analysis (Additional file 1: Figure S1E) and PCA (Additional file 1: Figure S1F) of genome-wide read coverage in 50-bp windows indicated a high correlation between biological replicates and therefore good reproducibility. To avoid bias at repetitive regions, we removed multimapped reads, which reduced read coverage at repetitive regions of the genome, in some cases to zero (Additional file 1: Figure S2). Most of the repetitive regions occurred in intergenic sections (92.75%) of the genome (Additional file 2: Table S1). Of note, strand switch regions (SSRs) were not considered intergenic regions; nevertheless, no repeat annotations were found there. Only 309 CDSs (5.65% of total repetitive regions, in bp) were located in annotated repetitive regions.
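The MAPQ filtering step described above can be illustrated with a minimal sketch that works on plain SAM text lines. The function name `filter_by_mapq`, the contig name `contig1`, and the toy records are hypothetical and only illustrate the threshold; the tool actually used for filtering is not specified in this section.

```python
def filter_by_mapq(sam_lines, min_mapq=10):
    """Yield SAM header lines and mapped alignments with MAPQ strictly above min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2 of a SAM record: bitwise FLAG
        mapq = int(fields[4])   # column 5 of a SAM record: mapping quality
        if flag & 0x4:
            # Bit 0x4 marks an unmapped read.
            continue
        if mapq > min_mapq:
            yield line


# Toy records (hypothetical), not data from the study.
toy_sam = [
    "@SQ\tSN:contig1\tLN:1000",
    "read1\t0\tcontig1\t100\t42\t50M\t*\t0\t0\t*\t*",
    "read2\t0\tcontig1\t200\t3\t50M\t*\t0\t0\t*\t*",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
kept = list(filter_by_mapq(toy_sam))
```

In this toy input only the header and `read1` are kept: `read2` falls below the threshold and `read3` is unmapped.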

To obtain an overview of the open chromatin distribution in large genomic regions, we compared the epimastigote FAIRE-seq data (E) with their corresponding control samples (C). Figure 1 shows the distribution of reads obtained from both samples on two contigs from Dm28c (PRFA01000011, 594 kb, and PRFA01000027, 454 kb). Datasets were normalized to reads per genomic content (RPGC) to account for different sequencing depths. Visual inspection of RPGC levels indicates that the FAIRE sample has higher overall levels along the contigs, with some clear enrichments, when compared to the control sample. The log2 (E/C) ratios indicated that most regions were enriched in FAIRE samples (Fig. 1A). Repetitive regions are depleted in both FAIRE and control samples, likely due to the applied mapping and filtering steps (Additional file 1: Figure S2).
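As a rough illustration of RPGC normalization and the log2 (E/C) ratio, the sketch below scales per-bin read counts so that the genome-wide mean coverage equals 1x and then compares FAIRE with control. The function names, the pseudocount of 1, and the toy counts are assumptions made for illustration; they are not the exact computation or parameters of the software used in the study.

```python
import math


def rpgc(bin_counts, read_length, bin_size, total_mapped_reads, effective_genome_size):
    """Scale per-bin read counts to 1x genome-wide mean coverage."""
    # Mean sequencing depth over the whole (effective) genome.
    mean_depth = total_mapped_reads * read_length / effective_genome_size
    # Approximate coverage in each bin, divided by the mean depth.
    return [(count * read_length / bin_size) / mean_depth for count in bin_counts]


def log2_ratio(sample, control, pseudocount=1.0):
    """Per-bin log2(sample/control); the pseudocount (an assumption) avoids division by zero."""
    return [math.log2((s + pseudocount) / (c + pseudocount))
            for s, c in zip(sample, control)]


# Toy genome of 200 bp split into four 50-bp bins (hypothetical counts).
faire_counts = [6, 1, 0, 1]
control_counts = [1, 1, 1, 1]
faire_rpgc = rpgc(faire_counts, 50, 50, sum(faire_counts), 200)
control_rpgc = rpgc(control_counts, 50, 50, sum(control_counts), 200)
ratio = log2_ratio(faire_rpgc, control_rpgc)
```

Because both tracks are scaled to the same mean coverage, a positive ratio in a bin reflects local enrichment in the FAIRE sample rather than a difference in sequencing depth.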

Fig. 1

Genome-wide analysis of active chromatin in T. cruzi. A Overlay of the number of reads per genomic content (RPGC) in FAIRE-seq data for epimastigotes (red line) and its control (blue) on contigs PRFA01000011 and PRFA01000027. The FAIRE/control ratio track is shown on a log2 scale. Polycistrons are shown in blue tracks; arrows indicate the transcription direction; genes from the conserved compartment, disruptive compartment and both genome compartments are shown in green, red, and orange tracks, respectively. B RPGC log2 ratio of normalized epimastigote FAIRE reads in different T. cruzi compartments. C Active chromatin landscape in core genes (nonmultigenic family) and virulence factors 1 kb upstream and downstream of their respective start (ATG) and end (STOP codon) coding regions. (*** Wilcoxon–Mann–Whitney test with p-value = 0.001)

Regions covered by the core compartment (composed of conserved and hypothetical conserved genes, green tracks) had higher RPGC-normalized counts than the disruptive compartment regions (red tracks) (Fig. 1A and B, Additional file 2: Tables S2, S3). Differences in RPGC counts between these two compartments were also observed in control samples; however, the difference was more pronounced in FAIRE samples, as reflected by the higher log2 (E/C) ratio in the core than in the disruptive compartment (Fig. 1B). Remarkably, removing multimapped reads (see Methods; Additional file 1: Figure S2B) only slightly affected this difference, since the same criteria were used for experimental and control samples. Instead, the difference can be explained by less open chromatin at multigenic family genes, in contrast to a more open chromatin profile at nonmultigenic family CDSs.

Because the distribution of open chromatin differed between genome compartments, we asked whether their landscapes at the gene level were also different. Strikingly, the open chromatin landscape of virulence factors (present in the disruptive compartment) differs greatly from that of the remaining protein-coding genes, and it also differs among the virulence factors themselves. Figure 1C shows that for most core genes, the regions coding for the 5’ and 3’ UTRs are enriched in open chromatin, corroborating the nucleosome depletion seen previously by MNase-seq (Additional file 1: Figure S3B). In contrast, no clear enrichment of open chromatin was found around the genomic regions coding for the 5’ UTR of virulence factors, while the regions coding for the 3’ UTR differed greatly among them: TS and GP63 exhibited an enrichment similar to that of other CDSs, whereas MASPs had a clear depletion followed by an enrichment a few base pairs downstream (Fig. 1C). Nucleosome occupancy partially explains the active chromatin distribution around the virulence factor genes, as some regions (such as the upstream regions of mucins, GP63, DGF-1 and RHS) exhibit an enrichment or a decrease in both FAIRE and MNase data (Additional file 1: Figure S3A).

Open chromatin is enriched at divergent SSRs (dSSRs) and uniformly distributed along PTUs

In trypanosomatids, transcription of protein-coding genes initiates mainly at dSSRs and terminates at cSSRs; however, transcription might also start at some non-SSRs, mainly close to tRNA genes [6, 27, 28]. Because the latter has not yet been described in T. cruzi, we inspected dSSRs as a proxy for transcription start sites and found that they are enriched in open chromatin compared to PTUs, as evidenced in Fig. 2A and 2B. Combining a previous nucleosome mapping and occupancy profile [25] with the FAIRE-seq data along PTUs revealed complementary, opposite landscape profiles: open chromatin regions mainly reflect nucleosome-depleted regions (Fig. 2A).

Fig. 2

Open chromatin at a PTU context in epimastigotes. A Superposition of FAIRE (red—right values on the y-axis) and MNase-seq (blue—left values on the y-axis) datasets. B Log2 ratio of normalized epimastigote FAIRE reads in dSSR (blue) and cSSR (green). C Hierarchical cluster analysis of FAIRE data depicted in A (Cluster 1: n = 227; Cluster 2: n = 412; Cluster 3: n = 258). In A, B and C, the first base of the feature is Start, while the last base is End. D Percentage of bases from multifamily genes for each PTU according to hierarchical cluster analysis shown in C (*** Wilcoxon–Mann–Whitney test with p-value = 0.001). E/C, ratio for epimastigotes and their respective controls

PTU regions were hierarchically clustered into three groups based on their log2 (E/C) RPGC level (Fig. 2C). Clusters 2 and 3 are very similar, exhibiting a near-flat pattern with low overall RPGC levels. In contrast, Cluster 1 showed higher overall levels of open chromatin, with a decrease at the edges, mainly at the PTU end. Clusters 2 and 3 contained a significantly higher proportion of bases from multigenic family genes than Cluster 1, which was enriched mainly in genes of the core compartment (Fig. 2D).
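The grouping of PTU profiles into three clusters can be sketched with a small agglomerative procedure. The distance metric (Euclidean) and linkage (average) used here are assumptions, since this section does not state which were applied, and the binned log2 (E/C) profiles are invented toy values.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(profiles, n_clusters=3):
    """Average-linkage agglomerative clustering; returns lists of profile indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the members of two clusters.
                total = sum(euclidean(profiles[a], profiles[b])
                            for a in clusters[i] for b in clusters[j])
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        # Merge the closest pair and drop the absorbed cluster.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy log2(E/C) profiles along six hypothetical PTUs, four bins each.
toy_profiles = [
    [1.0, 1.2, 1.1, 0.6],      # higher signal, dropping at the PTU end
    [1.1, 1.3, 1.0, 0.5],
    [0.1, 0.1, 0.1, 0.1],      # near-flat, low signal
    [0.0, 0.1, 0.0, 0.1],
    [-0.5, -0.5, -0.4, -0.5],  # near-flat, lower signal
    [-0.6, -0.5, -0.5, -0.4],
]
ptu_clusters = agglomerative(toy_profiles, n_clusters=3)
```

This naive version recomputes all pairwise distances at every merge, which is adequate for a toy example but not for hundreds of PTUs; a dedicated clustering library would be preferable for real data.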

Levels and relative nuclear position of eu- and heterochromatin change during metacyclogenesis

Many morphological changes are observed during the differentiation of epimastigotes to metacyclics, including nuclear elongation and kinetoplast repositioning to the posterior end of the parasite. It has previously been reported that the heterochromatin near the nuclear envelope in epimastigote forms spreads progressively along the nucleus in the intermediate forms, reaching a higher level of compaction in metacyclics [15]. However, only in this work were a systematic evaluation and a 3D reconstruction of the two traditional chromatin classes, eu- and heterochromatin, performed to assess chromatin remodeling during parasite differentiation. The results indicate that in T. cruzi epimastigotes, euchromatin resides in the central area, whereas heterochromatin is mainly distributed close to the nuclear envelope and around the nucleolus (Fig. 3A, epimastigote). The euchromatin volume is higher in epimastigote and intermediate I forms and decreases during differentiation to metacyclics. Heterochromatin, in the epimastigote form, spreads throughout the nucleus, and as metacyclogenesis advances, its percentage increases and its location becomes increasingly peripheral. These results indicate that progressive chromatin remodeling occurs during parasite differentiation. It is worth noting that during metacyclogenesis the nucleolus shrinks as it undergoes disassembly and dispersion throughout the nuclear matrix, which may be related to the decrease in ribosomal biogenesis (Fig. 3, Additional file 3: Videos S1–S4).

Fig. 3

Tridimensional reconstruction of nuclear chromatin regions during T. cruzi metacyclogenesis. 3D reconstruction of different developmental stages, from epimastigote to metacyclic, where three different FIB-SEM slices were used to show chromatin state and distribution. High and low electron-dense chromatin regions, which correspond to heterochromatin and euchromatin, respectively, were quantified from TEM slices. Note the chromatin remodeling as the differentiation process advances. The euchromatin region (yellow) decreases, whereas the heterochromatin area (purple) increases and occupies mainly the nuclear periphery. The 3D reconstruction is a representative image obtained for one replicate, but similar observations were obtained for the other replicates. Ratio values of eu-/heterochromatin for each cell type, in triplicate, were used for statistical analysis (one-way ANOVA test). *, Epimastigote vs. Intermediate I; **, Epimastigote vs. Intermediate III; ***, Epimastigote vs. Metacyclic

Genome-wide analysis of open chromatin regions in epimastigotes and metacyclics

Tridimensional reconstruction of chromatin areas along epimastigote-to-metacyclic differentiation indicated a significant reduction in euchromatin regions (Fig. 3), which are considered open chromatin areas. To gain more insight into these changes, we compared FAIRE-seq data from epimastigote and metacyclic forms. Visual inspection of normalized RPGC levels along the T. cruzi genome confirms a greater abundance of open chromatin, with a few clear enrichments, in epimastigotes than in metacyclics (Fig. 4A).

Fig. 4

Open chromatin profile changes in life forms. A IGV snapshot of contig PRFA01000005 showing FAIRE-seq profile distribution in epimastigote (red), metacyclic (blue), and control samples (gray). PTUs are shown in blue tracks with the transcription direction indicated by arrows. Genes from disruptive (red), core (green), and those from both (orange) compartments are depicted. tDNAs are shown in dark blue. B Comparison of core and disruptive compartments using RPGC counts. *** Wilcoxon–Mann–Whitney test with p-value = 0.001. C Scatter plots of E/MT RPGC levels on each indicated feature. Median values are given in parentheses. One-way ANOVA with Dunnett’s correction. D Hierarchical cluster analysis of the distribution of the RPGC log2 ratio (E/MT) in PTUs including (top) or excluding (bottom) 1 kb upstream and downstream. The number of members in each cluster is depicted next to the graph. The first base of the feature is represented by Start, while the last base is End. MT, metacyclic

As mentioned above, the disruptive compartment is enriched in virulence factors that are mainly expressed in infective forms [23]. To address whether the disruptive compartment is enriched in open chromatin in metacyclics relative to epimastigotes, RPGC-normalized FAIRE-seq data were compared between life forms as log2 ratios. Median RPGC values for both the disruptive and core compartments were lower in metacyclics (Fig. 4B, Additional file 2: Tables S2, S3). Differences between compartments within each life form are very similar: the core compartment has 3.9 times more open chromatin than the disruptive compartment in both epimastigotes and metacyclics. In the MNase data, core and disruptive compartments also presented different RPGC levels, with the former being significantly higher (Additional file 1: Figure S3B). In general, virulence factors (TS, GP63, MASP, and mucins) and the core compartment have 2.6 and 2.8 times (fold change, median values) less open chromatin, respectively, in metacyclics than in epimastigotes (Additional file 1: Figure S3A), which agrees with global changes in other genomic features (see Fig. 4C, Additional file 2: Table S4). The open chromatin landscape of virulence factor genes is similar in both life forms, with a slight difference at the upstream coding region of mucins (Additional file 1: Figure S3D). Taken together, these results indicate that virulence factors indeed have a different pattern of open chromatin compared to other CDSs (as discussed below), and that this pattern appears to be maintained between life forms, which reinforces the major role of posttranscriptional control of these genes.
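The fold changes quoted above are ratios of median values. A minimal sketch of that arithmetic follows; the helper name `median_fold_change` and all RPGC values are hypothetical and chosen only so that the ratios are easy to follow by hand.

```python
from statistics import median


def median_fold_change(numerator_values, denominator_values):
    """Ratio of the two medians, i.e. a fold change based on median values."""
    return median(numerator_values) / median(denominator_values)


# Invented RPGC values per (compartment, life form); not data from the study.
toy_rpgc = {
    ("core", "epimastigote"): [3.0, 4.0, 5.0],
    ("core", "metacyclic"): [1.0, 2.0, 3.0],
    ("disruptive", "epimastigote"): [0.5, 1.0, 1.5],
    ("disruptive", "metacyclic"): [0.25, 0.5, 0.75],
}

# Between compartments, within one life form.
core_vs_disruptive_epi = median_fold_change(
    toy_rpgc[("core", "epimastigote")], toy_rpgc[("disruptive", "epimastigote")])
core_vs_disruptive_meta = median_fold_change(
    toy_rpgc[("core", "metacyclic")], toy_rpgc[("disruptive", "metacyclic")])

# Between life forms, within one compartment.
epi_vs_meta_core = median_fold_change(
    toy_rpgc[("core", "epimastigote")], toy_rpgc[("core", "metacyclic")])
```

In the toy data the core/disruptive ratio is the same in both life forms even though both compartments lose signal in metacyclics, which mirrors the pattern described in the text.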

Previously, we found that dSSRs from trypomastigotes were enriched in nucleosomes compared to those from epimastigotes [25]. We therefore asked whether dSSRs from epimastigotes would be more abundant in open chromatin. Indeed, dSSRs are approximately 3.6 times more enriched in open chromatin in epimastigotes (Fig. 4C, Additional file 1: Figure S4A), which corroborates a lower nucleosome occupancy in replicative forms compared to nonreplicative forms.

Upon hierarchical cluster analysis, the enrichment of open chromatin at dSSRs can be further visualized in a PTU context (Fig. 4D, top; Additional file 2: Table S5). Most notably, Cluster 1 showed enrichment at dSSRs followed by a decreasing signal toward the cSSRs. A detailed inspection of Cluster 1 elements revealed that approximately 10% of its PTUs are located downstream of tDNA loci. Cluster 2 represents PTUs whose dSSRs have levels of openness similar to those of their adjacent PTUs. Interestingly, Cluster 3 encompasses PTUs whose levels of open chromatin are equal between epimastigotes and metacyclics (log2 E/metacyclic = 0). Taken together, the obtained data indicate that different PTUs have distinct levels of openness at their transcription initiation regions. Clusters 2 and 3 were enriched in genes from the disruptive compartment (data not shown).

Considering only the PTU region, epimastigotes showed a significantly increased signal (2.9 times) compared to metacyclics (Fig. 4C). The hierarchical cluster analysis revealed PTUs with different levels of open chromatin between life forms and, importantly, near-flat enrichment along the PTU region (Fig. 4D, bottom; Additional file 2: Table S6). In accordance with the hierarchical clustering of PTUs with SSRs (Fig. 4D, top), Clusters 2 and 3 were also enriched in multigenic family genes (data not shown).

FAIRE-seq data correlate with steady-state gene expression levels in both life forms

Given the different open chromatin profiles observed among PTUs (Fig. 4D), we investigated whether a similar pattern occurs in distinct CDSs from different life stages. Thus, CDSs were hierarchically clustered into three groups based on their log2 RPGC level (Additional file 1: Figure S4D). An increase in open chromatin was found mainly at genomic regions encompassing 5' and 3' UTRs at Clusters 1 and 2. Similar to the results shown above, Cluster 1, which has the highest log2 ratio, was enriched in genes from the core compartment, whereas Clusters 2 and 3 were enriched in genes from the disruptive compartment and repetitive genes (data not shown).

Then, we explored whether regions enriched in open chromatin would have transcripts expressed at higher levels. CDSs were first classified as highly, moderately or lowly expressed in each life form based on their TPM counts obtained from a previously published transcriptomic study [19] (Additional file 1: Figure S5). RPGC-normalized counts were retrieved for each CDS according to its expression class in each life form (Fig. 5). Surprisingly, a positive correlation between open chromatin and steady-state transcription levels was found. For epimastigotes, significant differences in FAIRE enrichment were observed in all comparisons between expression classes. In contrast, for metacyclics, significant differences were observed mainly in comparisons with the lowly expressed genes (high versus low and medium versus low).
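The comparison between expression classes can be sketched as follows. The class boundaries are not given in this section, so splitting genes into tertiles by TPM rank is an assumption, as are the gene names and values. The test shown is a plain Mann–Whitney U statistic with a two-sided p-value from the normal approximation (no tie or continuity correction), which is crude for very small samples.

```python
import math


def expression_classes(tpm_by_gene):
    """Assign low/medium/high classes by TPM rank (tertiles; an assumption)."""
    ranked = sorted(tpm_by_gene, key=tpm_by_gene.get)
    n = len(ranked)
    classes = {}
    for rank, gene in enumerate(ranked):
        if rank < n / 3:
            classes[gene] = "low"
        elif rank < 2 * n / 3:
            classes[gene] = "medium"
        else:
            classes[gene] = "high"
    return classes


def mann_whitney_u(x, y):
    """Return U for x and a two-sided p-value from the normal approximation."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Hypothetical genes with invented TPM and FAIRE RPGC values.
toy_tpm = {"g1": 1, "g2": 2, "g3": 10, "g4": 20, "g5": 100, "g6": 200}
toy_faire = {"g1": 0.5, "g2": 0.7, "g3": 1.0, "g4": 1.2, "g5": 2.0, "g6": 2.5}

classes = expression_classes(toy_tpm)
high = [toy_faire[g] for g, c in classes.items() if c == "high"]
low = [toy_faire[g] for g, c in classes.items() if c == "low"]
u_stat, p_value = mann_whitney_u(high, low)
```

For real data, an established implementation of the Wilcoxon–Mann–Whitney test with exact or tie-corrected p-values would be the safer choice.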

Fig. 5

FAIRE-seq data correlate with steady-state transcription levels in both life forms. RPGC-normalized tag counts for each gene are mapped to expression classes for epimastigotes (red) and metacyclics (blue). Statistical significance was assessed with the Wilcoxon–Mann–Whitney test (for p values: *** = 0.001; ** = 0.01; N.S., not significant)

Open chromatin is developmentally regulated at tDNA loci

FAIRE-seq analysis highlighted a global decrease in the levels of open chromatin in metacyclics compared to epimastigotes. However, we wondered whether the open chromatin profile would give us more clues about the differences in gene expression and phenotype found between life forms, especially those related to their differentiation program. Comparing all genomic features, we detected striking differences in open chromatin enrichment at the regions coding for the small nuclear RNAs (snRNAs) and tRNA genes. Of note, this enrichment was significantly higher (fold changes of 13.4 and 9.3 for tDNAs and snDNA, respectively) in epimastigotes than in metacyclics (Figs. 4C and 6A, B).

Fig. 6

FAIRE enrichment at tDNA loci. A IGV snapshot of a representative tDNA cluster (black box) showing FAIRE enrichment in epimastigote (red) tracks over metacyclic trypomastigotes (blue). B Boxplot of RPGC-normalized tag counts in tDNA features (*** Wilcoxon–Mann–Whitney test with p-value = 0.001). C Hierarchical cluster analysis of FAIRE-seq data at tDNA loci (reciprocal ratio). The number of tDNAs in each cluster is depicted below. D Distribution of tDNAs from each cluster of C according to their location relative to the adjacent PTU transcription direction. E RNA-FISH analysis of Asp-GUC 3’ tRNAs in epimastigotes and metacyclics. F Northern blot assays using 20 pmol of biotin-labeled probes for Asp, Glu tRNAs, and 5S RNA. Total RNA was fractionated in 15% polyacrylamide gels and transferred to nylon membranes

Hierarchical cluster analysis of the E/metacyclic ratio revealed that the majority of tDNAs had at least six times more open chromatin in epimastigotes than in metacyclics (Fig. 6C, Additional file 1: Figure S6, Additional file 2: Table S7). Clusters were not related to the tRNA isoacceptor type or class (Additional file 1: Figure S6B); however, their distribution reflects their location regarding the transcription direction of the adjacent PTU (Additional file 1: Figure S6C). For example, Clusters 1 and 2 were enriched in tDNAs that were mainly located between codirectional (60% and 30%, respectively) and divergent (10% and 40%, respectively) PTUs (Fig. 6D). This distribution suggests that chromatin alterations at tDNAs may affect transcription initiation at adjacent PTUs.
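The location of a tDNA relative to the transcription direction of its flanking PTUs can be expressed as a simple rule on strands. The sketch below assumes each locus is described by the strand of the PTU on its left and the strand of the PTU on its right in reference coordinates; the function name and the toy loci are hypothetical.

```python
from collections import Counter


def classify_context(left_strand, right_strand):
    """Classify a locus by the strands of its left and right flanking PTUs."""
    for strand in (left_strand, right_strand):
        if strand not in ("+", "-"):
            raise ValueError(f"unexpected strand: {strand!r}")
    if left_strand == right_strand:
        return "codirectional"
    if left_strand == "-":
        # Left PTU runs leftward and right PTU runs rightward: away from the locus.
        return "divergent"
    # Left PTU runs rightward and right PTU runs leftward: toward the locus.
    return "convergent"


# Invented flanking-strand pairs for five hypothetical tDNA loci.
toy_loci = [("+", "+"), ("-", "-"), ("-", "+"), ("+", "-"), ("+", "+")]
counts = Counter(classify_context(left, right) for left, right in toy_loci)
percentages = {context: 100 * n / len(toy_loci) for context, n in counts.items()}
```

Applying the same tally separately to the tDNAs of each cluster would give per-cluster percentages of the kind reported above.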

Finally, we addressed whether the increase in open chromatin within tDNAs would be associated with tRNA expression levels. Previously, Garcia-Silva [29] showed by RNA-FISH assays that epimastigotes have higher amounts of tRNAs than metacyclics. To confirm this finding, epimastigotes and metacyclics were probed simultaneously by RNA-FISH using one of the four probes against 5’ and 3’ Asp and Glu tRNAs. Additional file 1: Figure S6D indicates that tRNA levels are much lower in metacyclics (about 50% less abundant). Evaluating tRNA expression by PCR is not straightforward because this molecule is highly modified. We therefore also evaluated tRNA expression in both life forms by Northern blot. Asp and Glu tRNAs are more abundant (per cell) in epimastigotes than in metacyclics (Fig. 6E–F, Additional file 1: Figure S6F), corroborating the RNA-FISH analysis; however, tRNAs are equally abundant when similar amounts of total RNA (in ng) from the two life forms were compared (Fig. 6F). It is important to note that both analyses evaluate steady-state tRNA transcripts rather than nascent transcripts. Nascent transcript analysis is hampered by the very low level of transcription in metacyclics [17].
