Aberrant DNA hydroxymethylation reshapes transcription factor binding in myeloid neoplasms

5hmC landscapes in patients with myeloid neoplasms

We performed 5hmC analysis in a total of 82 bone marrow aspirates from 75 patients with myeloid neoplasms and 7 healthy controls (Additional file 1: Table S1, Fig. 1A). The mutation status of these patients was listed in Additional file 1: Fig. S1. In order to profile genome-wide 5hmC with clinical samples, we greatly improved our previously published CMS-IP-seq method [7, 10, 36, 37] and made it fully compatible with low-input genomic DNA (as low as 10 ng; Additional file 1: Fig. S2A-D, Fig. 1). We collected 1.05 billion uniquely mapped reads (1,021,637,350 reads) that covered 7.2% of the human genome with an average length of 5hmC peaks at ~ 200 bp. By comparing the 5hmC peak numbers among analyzed individuals, we observed more 5hmC peaks in healthy donors (174,399 peaks on average) compared with patients diagnosed with chronic myelomonocytic leukemia (CMML; 68,625 peaks) and adult acute myeloid leukemia (AML; 77,371 peaks) (Additional file 1: Fig. S2E). MDS patients displayed the highest variations in 5hmC peak numbers, which ranged from 10,728 to 217,132 peaks (average 108,315 peaks) regardless of MDS subtypes (Fig. 1B). Among the different MDS subtypes, patients with refractory cytopenia with multilineage dysplasia and ringed sideroblasts (RCMD-RS) displayed similar numbers of 5hmC peaks as healthy controls; however, the genomic location of 5hmC peaks was distinct between RCMD-RS and healthy donors, suggesting a unique molecular feature of RCMD-RS (Fig. 1C). Besides the variable numbers of 5hmC peaks identified from MDS patients, we also observed higher heterogeneity of the genomic 5hmC distributions in MDS patients compared with healthy donors, AML and CMML patients, further echoing the heterogenous features of MDS patients observed in the clinic [38] (Fig. 1C, Additional file 1: Fig. S2E-F). Genomic Regions Enrichment of Annotations Tool (GREAT) analysis further revealed that the genomic regions displaying high 5hmC heterogeneity in MDS patients (n = 12,540) were enriched at genes essential for hematopoietic and immune cell function [39] (Fig. 1D).

Fig. 1figure 1

Heterogenous 5hmC distributions in patients with myeloid neoplasms. A The experimental design for genome-wide sCMS-IP-seq analysis in the cohort. B Boxplot representation of the identified 5hmC peak numbers in the analyzed cohort. Bounds of the box span from 25 to 75% percentile, and the center line within each box represents the median. Whiskers represent median ± 1.5 times interquartile range. C Heatmap representation of the enrichment of selected 5hmC peaks among all the analyzed cohorts. The selected 5hmC peaks (n = 12,540) exhibited small variations (coefficient of variation, CV < 0.15) in healthy donor groups, whereas MDS and cancer patients showed higher variation. Each row represents a selected 5hmC peak, each column represents an individual person. RAEB: Refractory anemia with excess blasts; RARS: Refractory anemia with ring sideroblasts; RCMD: Refractory cytopenia with multilineage dysplasia; RS: Ring sideroblasts. D Top 10 selected categories of the GREAT analysis results for the 5hmC peaks shown in Fig. 1C. E Heatmap representation of the top 20,000 variable disease-specific differentially hydroxymethylated regions (DHMRs). F Box-plot showing the enrichment of 5hmC within published DNaseI hypersensitive sites, H3K4me1-, or H3K27ac-enriched genomic regions in healthy controls, as well as patients diagnosed with myeloid neoplasms. Bounds of the box span from 25 to 75% percentile, the center line within the box represents the median. Whiskers represent median ± 1.5 times interquartile range. G The t-SNE plot of the DNA methylation level within disease-specific DHMRs in the published AML cohort. Hypo-DHMRs: genomic regions showed decreased 5hmC levels in patients compared with healthy controls; Hyper-DHMRs: genomic regions showed increased 5hmC levels in patients compared with healthy controls. The DNA methylation levels at each CpG site were obtained from published datasets collected from CD34 + cells in healthy control (phs000159) or bone marrow aspirates in AML patients (GSE98350)

Despite of the heterogeneity of 5hmC distributions among analyzed patients, we were still able to identify 129,652 conserved disease-specific 5hmC peaks (differentially hydroxymethylated regions, DHMRs) (Fig. 1E). GREAT analysis [39] indicated that these DHMRs were associated with genes that are relevant for myeloid and lymphoid cell function (Additional file 1: Fig. S3A). To further examine the distribution of 5hmC with other epigenetic marks in patients with myeloid neoplasms, we analyzed the 5hmC enrichment profile within the published DNase I hypersensitive site and histone modification (H3K4me1 and H3K27ac) regions collected from human CD34 + cells (Fig. 1F, Additional file 1: Table S2). We observed significantly reduced 5hmC levels in DNase I hypersensitive sites, H3K4me1, and H3K27ac enriched regions, suggesting impaired DNA hydroxymethylation at enhancer regions in patients with MDS and myeloid malignancies (Fig. 1F). In parallel, we analyzed DNA methylation levels within the DHMRs using publicly available whole-genome bisulfite sequencing (WGBS) in healthy CD34 + hematopoietic stem progenitor cells (HSPCs) and reduced representation bisulfite sequencing (RRBS) data collected from AML patients [40] (Fig. 1G and Additional file 1: Fig. S3B, Table S2). We observed a high similarity of DNA methylation level in CpG sites located within DHMRs in healthy donors. By contrast, the same CpG sites in AML patients had high DNA methylation variations (Fig. 1G and Additional file 1: Fig. S3B). Collectively, these data suggest epigenetic abnormalities, including altered DNA methylation, histone modifications and chromatin accessibility at DHMRs, represent a hallmark of myeloid neoplasms.

Distinct 5hmC features in patients with myeloid neoplasms

To further evaluate the clinic implication of landscape changes in 5hmC, we performed t-Distributed Stochastic Neighbor Embedding (t-SNE) analysis using the 5hmC profiles within the tested cohort (Fig. 2A). Interestingly, the healthy controls clustered far from with patients bearing myeloid neoplasms (Cluster I vs Clusters II & III; Fig. 2A). In addition, we used all identified DHMRs to correlate 5hmC alterations with the survival of analyzed patients. 622 DHMRs showed a strong association (P < 0.05) with survival (Additional file 1: Table S3), which could be used as potential biomarkers to predict patient outcomes in future clinical management, as example regions shown in Fig. 2B and Additional file 1: Fig. S4A. GREAT analysis [39] revealed that these survival-associated DHMRs were correlated with genes that regulate lymphocyte activation and differentiation (Additional file 1: Fig. S4B).

Fig. 2figure 2

The distinct clustering patterns of disease-specific DHMRs. A The t-SNE plot for the merged DHMRs identified from the analyzed cohort. Three distinct clusters were observed, reflecting distinct disease status. B Kaplan–Meier survival curves for patients with high and low 5hmC levels at selected genomic regions. The high and low 5hmC were separated by the median value of 5hmC. Boxplot: Bounds of the box span from 25 to 75% percentile, the center line within each box represents the median. Whiskers represent median ± 1.5 times interquartile range. C The clustering analysis of DHMRs based on TET2 mutation status in healthy controls and patients with myeloid neoplasms (with WT or mutant TET2). D The immunophenotypic features, including platelet (PLT), blasts, and monocytes (mono) counting in patients with WT TET2 and mutant TET2. E Venn diagram of DHMRs identified between the comparison of healthy donors vs patients with WT TET2 or healthy donors vs patients with mutant TET2. F GREAT analysis of DHMRs identified from the comparison shown in (E)

DNA hydroxymethylation is catalyzed by the TET protein family member, with TET2 being one of the most frequently mutated genes in patients with myeloid neoplasms. Therefore, we further investigated whether we could combine 5hmC and TET2 mutation information to facilitate the molecular classification of the disease. Among all the analyzed patients, 30 patients had annotated TET2 mutation status (12 with WT TET2 and 18 with mutant TET2). We then compared the 5hmC distributions within healthy controls and patients with either WT or mutant TET2. As shown in Fig. 2C, we were able to clearly separate these three groups of patients based on the differential 5hmC landscapes and TET2 mutation status. Patients with TET2 mutations exhibited more notable differences than those with WT TET2 when both groups were compared to the healthy controls (Fig. 2C). Next, we further examined the clinical features of patients with and without TET2 mutations (Fig. 2D). We observed that individuals with TET2 mutations had higher platelet and monocyte counts but lower blast counts compared with those bearing normal TET2 (Fig. 2D). No significant differences in neutrophil numbers, white blood counts and hemoglobin levels were noted between these two groups (Additional file 1: Fig. S4C). These results agreed with previous reports from us and others showing that TET2 mutations tend to cause myeloid bias in both human and mouse studies [18,19,20].

To further understand the impact of TET2 on 5hmC landscapes and downstream transcriptional regulation, we compared 5hmC profiles among normal healthy donors, patients with and without TET2 mutations based on the mutation information obtained from Additional file 1: Fig. S1 and Table S1. Compared to the healthy donor group, we identified a total of 11,910 and 71,764 differentially hydroxymethylated region (DHMRs) in patients with WT TET2 or with TET2 mutations, respectively. Among these DHMRs, 9,523 were found to be independent on TET2 mutation status (Fig. 2E). In parallel, we identified 2,378 and 62,241 DHMRs that were specifically enriched in patients with WT or mutant TET2, respectively (Fig. 2E). The GREAT analysis showed that all the DHMRs identified from Fig. 2E were enriched at genomic regions that are related to immune cell activation, suggesting an abnormal inflammation feature in myeloid cancer patients regardless of TET2 mutation status (Fig. 2F). We also observed that the DHMRs specifically identified in TET2-mutant patients were enriched at genomic regions associated with genes important for T cell function and post-transcriptional silencing (Fig. 2F), suggesting that TET2 mutations have strong impact on reshaping the 5hmC landscapes and altering transcriptional activities in patients with myeloid neoplasms.

Abnormal 5hmC enrichment at key transcription factor (TF) binding sites in myeloid neoplasms

Our current study showed significant 5hmC alterations within DNase I hypersensitive sites marked by enhancer marks, H3K4me1 and H3K27ac, in subjects with myeloid neoplasm compared to healthy individuals (Fig. 1F). Previous studies have shown that 5hmC is enriched at enhancers occupied by transcription factors (TFs) [7, 9], and that DNA hydroxymethylation could alter TF binding [34, 41]. We therefore examined 5hmC enrichment within 380 TF motifs in the analyzed cohort using a previously developed analysis pipeline [42]. Indeed, we observed substantial 5hmC changes within these TF motifs (Fig. 3A). Overall, 5hmC levels within these TF motifs were high in healthy individuals (mostly in Cluster I) but remained low in patients (mostly in Cluster II and Cluster III). Interestingly, among all the analyzed motifs, we observed a set of TF motifs, belonging to the CCAAT-enhancer-binding protein (C/EBP) family, that displayed a distinct 5hmC distribution pattern (Fig. 3A-C), with low 5hmC levels in healthy individuals (Cluster I) but high 5hmC in patients (Clusters II and III). In addition, patients with TET2 mutations exhibited similar enrichment of 5hmC within C/EBP binding sites compared with patients with WT TET2 or healthy controls (Fig. 3D, Additional file 1: Fig. S5A). Furthermore, this set of C/EBP binding motifs displayed the most significant changes in 5hmC between healthy individuals and patients (comparison between Cluster I and Cluster II & III) (Fig. 3B-C). C/EBP family members have been reported to play an essential role in regulating hematopoiesis and myeloid differentiation [43, 44]. Genetic defects in C/EBP have been reported in AML [45]. Furthermore, our previous study demonstrated that 5hmC modification could prevent C/EBPβ DNA binding in vitro [34], suggesting that increased 5hmC within C/EBP binding motifs might reshape genomic location of C/EBP and its downstream transcriptional activity during leukemogenesis. Indeed, consistent with changes in DNA hydroxymethylation, the expression levels of C/EBPβ target genes [46] were significantly altered in patients with myeloid neoplasms (Fig. 3E-F).

Fig. 3figure 3

Abnormal 5hmC enrichment at key TF binding sites in patients with myeloid neoplasms. A Heatmap representation of the 5hmC deviation score at the annotated TF-binding motifs (n = 380). Higher deviation scores represent more enrichment of 5hmC in corresponding TFs motifs; negative deviation scores mean depletion of 5hmC in TF-binding motifs. Each row represents an individual TF motif. Each column stands for an individual case. Red box, 5hmC enrichment status of the indicated C/EBP family members. B The rank of 5hmC changes within the analyzed TF-binding motifs. The C/EBP family members are generally ranked among the top 10 mostly-enriched motifs. C Genome-browser views of the overlaid 5hmC enrichment at the C/EBPβ binding sites. The 5hmC signals within each individual at each cluster were overlaid. C/EBPβ binding sites were obtained from the public C/EBPβ ChIP-seq datasets (GSM2345026 and GSM2345027). D Heatmap representation of 5hmC deviation scores in healthy donors and patients with known TET2 mutation status (WT vs mutation) at the binding motifs of the C/EBP families. E Heatmap representation of the expression of C/EBPβ target genes [46] (n = 527) in the analyzed cohort. The C/EBPβ target genes were defined as genes containing C/EBPβ binding sites within 1-kb of their transcription start site. The C/EBPβ binding sites were identified from public ChIP-seq data (GSM2345026 and GSM2345027). F The t-SNE analysis on the expression of C/EBPβ target genes [46] in healthy donors and patients with AML, CMML or MDS

To seek other potential TFs with 5hmC changes within their binding motifs in samples from myeloid neoplasms, we extended our analyses to other TFs and correlated 5hmC levels within all 380 TF motifs with the overall survival of patients. Differential 5hmC levels within 24 additional TFs were detected and were correlated with patient outcomes (Additional file 1: Fig. S5B). For example, distinct 5hmC levels within TP63 and MYBL2 binding regions were significantly associated with the survival of patients with myeloid neoplasms (Additional file 1: Fig. S5C). Overall, these data establish that patients with myeloid neoplasms display dramatic 5hmC changes within key TF binding motifs and that 5hmC changes within selected TF binding motifs are strongly associated with patient outcomes.

Aberrant 5hmC enrichment reshapes CEBP-α binding in human leukemia cells

To further evaluate the impact of 5hmC on C/EBP binding in the disease context of leukemia, we performed genome-wide 5hmC and C/EBP analysis in MOLM13, a human AML cell line. Because C/EBP-α, but not C/EBP-β, is highly expressed in MOLM13 cell (Additional file 1: Fig. S6A) and genetic lesions in C/EBP-α contribute to AML development [45], we used C/EBP-α as a proof-of-concept example to investigate the causal relationship between 5hmC and C/EBP-α enrichment in MOLM13 cell. To achieve this, we performed C/EBP-α ChIP-seq and sCMS-IP-seq in MOLM13 cell. We observed almost no enrichment of 5hmC within the C/EBP-α binding sites (Fig. 4A-B, Additional file 1: Fig. S6B). To ensure the data quality of 5hmC profiling, we also monitored the 5hmC enrichment within BRD4 binding sites [47] in MOLM13 cells because 5hmC is known to be enriched at enhancers and BRD4 is an enhancer binding protein. We observed strong 5hmC enrichment within the BRD4 binding sties (Fig. 4B-C), which ruled out the possibility that the absence of 5hmC enrichment at C/EBP-α binding sites is due to low sCMS-IP-seq data quality. To further investigate whether 5hmC indeed repels genomic binding of C/EBP-α, we treated MOLM13 cells with vitamin C (ViC), which is known to boost the TET enzymatic activity to increase global 5hmC levels without altering the expression levels of TET family members (Fig. 4D-E, Additional file 1: Fig. S6C-D). ChIP-seq analysis confirmed that vitamin C induced 5hmC enhancement significantly and suppressed genomic association of C/EBP-α without altering its protein expression level and its chromatin association capability (Fig. 4E-G, Additional file 1: Fig. S6E-F). Motif analysis also confirmed that C/EBP binding motif was enriched most prominently in genomic regions that  exhibited  ViC-induced gain of DNA hydroxymethylation (Fig. 4H). These results strongly indicate that ViC-mediated increase in 5hmC reshapes the genomic location of C/EBP-α. In addition, GREAT analysis implied that genomic regions with enhanced 5hmC are associated with genes essential for cell survival and stress response (Additional file 1: Fig. S6G). ViC-treated MOLM13 cells have been previously shown to upregulate the expression of genes involved in apoptotic and cell differentiation signaling [48]. Likewise, in our functional analysis, we also observed upregulation of CD11b expression in MOLM13 cells following ViC treatment (Fig. 4I), suggesting that 5hmC might induce re-localization of C/EBP-α to promote differentiation of AML cells. Overall, our data strongly suggest a direct effect of 5hmC in regulating genomic distribution of C/EBP-α in leukemia cells. TET2 loss of function could alter 5hmC distribution and reshape C/EBP-α binding to impact downstream transcriptional activities during leukemogenesis.

Fig. 4figure 4

5hmC enrichment reshapes CEBP-α binding in human leukemia cells. A Histogram and Heatmap representation of 5hmC enrichment within the C/EBP-α binding sites in MOLM13 leukemia cells. B Genome-browser views of 5hmC (red), C/EBP-α (blue) and BRD4 (green) peaks in MOLM13 cells at the indicated regions. BRD4, but not C/EBP-α, was enriched at 5hmC-enriched regions. C Histogram and Heatmap representation of 5hmC enrichment within the BRD4 binding sites in MOLM13 cells. BRD4 ChIP-seq data were obtained from GSM1557123. A total of 54,411 BRD4 peaks were identified. D Dotblot analysis of global 5hmC (top) and 5mC (bottom) levels in MOLM13 cells treated with or without 250 µM Vitamin C (ViC) at the indicated time points. Methylene blue (Methyl Blue) staining was used on the same blot as the loading control. ViC treatment led to 5hmC increase, but had minor effects on 5mC levels in MOLM13 cells. E Immunofluorescent staining of 5hmC (red) and C/EBPα (green) in MOLM13 cells before and after Vitamin C (ViC) treatment for 72 h. DAPI was used for nuclear staining. ViC treatment resulted in a significant increase of 5hmC, but had minor effects on the fluorescent signal of C/EBPα. Scale bar: 5 µm. F Histogram and Heatmap representation of C/EBP-α enrichment within the newly emerged 5hmC peaks following ViC treatment in MOLM13 cells compared with the untreated group. C/EBP-α enrichment was plotted within the new 5hmC peaks gained in ViC-treated MOLM13 cells. C/EBP-α enrichment was significantly reduced in regions showing ViC-induced 5hmC increase. G Immunoblot (left) and statistical quantification (right) of C/EBP-α protein expression in MOLM13 cells treated without and with 250 µM ViC at the indicated time points. Anti-tubulin was used as the loading control. n = 3 biological replicates. H Motif analysis of DHMRs identified in MOLM13 cells before and after ViC (250 µM) treatment for 72 h. I Representative histogram and statistical analysis of flow cytometry analysis on CD11b expression in MOLM13 cells treated with or without ViC (250 µM) for 72 h. MFI: mean fluorescent intensity. Data were shown as mean ± S.D; n = 3. P = 0.0012, by two-tailed Student’s t-test

留言 (0)

沒有登入
gif