We first assessed the relationship between sequence conservation across species and chromatin accessibility within the PU.1 locus. By aligning bulk ATAC-seq data from four primary human blood cell lineages [50] with the Cons 100 Verts track, which provides multiple alignments of 100 vertebrate species and measurements of evolutionary conservation (http://genome.ucsc.edu), we identified 14 DNA regions that exhibit both open chromatin and sequence conservation in at least one blood cell lineage (Fig. 1A and B). These include the well-characterized URE, which contains two conserved regions [19,20,21], the core PU.1 promoter (PrPr), and the antisense promoter (AsPr) for a PU.1 antisense lncRNA [51, 52]. We also identified human homologs of two known murine PCREs, ZL12 [22, 23] and CE5 [22], henceforth referred to as hZL12 and hCE5. Additionally, we discovered eight putative PCREs, TL1-8 (Fig. 1A, lower panel). Among the four blood cell lineages, myeloid cells displayed the highest levels of chromatin accessibility within most of the established and putative PCREs (TL5, hZL12, TL7, hCE5, TL8) (Fig. 1C). This finding indicates that chromatin accessibility at these elements is largely specific to myeloid cells. A comparable lineage-specific pattern is seen at marker genes of the four hematopoietic lineages, as chromatin is accessible specifically at the promoter regions of ITGAM (encoding the myeloid marker CD11b), GYPA (encoding the erythroid marker GPA), CD19 (encoding the B-cell marker CD19), and CD3G and CD3D (encoding components of the T-cell marker CD3) in the corresponding cell lineages (Fig. 1D). The differences in chromatin accessibility within these conserved PCREs suggest their potential involvement in lineage-specific PU.1 expression.
Fig. 1 Conserved DNA elements with chromatin opening at the PU.1 locus. (A) Genomic track view of the genomic region covering the PU.1 locus and its upstream region. Shown are averaged ATAC-seq tracks of four human blood cell lineages (data were extracted from a published dataset (GEO: GSE74912), see also Table S1 for sample details) (upper panels), 100 vertebrates Basewise Conservation by PhyloP (middle panel), and a diagram of DNA elements. (B) Venn diagram showing DNA elements exhibiting sequence conservation and chromatin opening. (C) Bar graph of ATAC-seq enrichment scores at the DNA elements from four human blood cell lineages (shown in Fig. 1A). Error bars indicate SD (n ≥ 4). (D) Gene track view of lineage-marker gene loci and their upstream regions. Averaged ATAC-seq tracks of the four human blood cell lineages as used in Fig. 1A are shown.
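The selection of regions that are both accessible and conserved can be pictured as an interval intersection. The following is a minimal sketch only, not the procedure used in this study: the function name `overlapping_regions`, the `min_overlap` parameter, and the coordinates in the usage note are hypothetical, and each input list is assumed to contain non-overlapping half-open intervals on a single chromosome.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def overlapping_regions(
    peaks: List[Interval],
    conserved: List[Interval],
    min_overlap: int = 1,
) -> List[Interval]:
    """Return the intersections of accessible peaks and conserved blocks.

    Assumes the intervals within each list do not overlap one another
    (hypothetical input format, not taken from the paper).
    """
    peaks = sorted(peaks)
    conserved = sorted(conserved)
    hits: List[Interval] = []
    i = j = 0
    while i < len(peaks) and j < len(conserved):
        start = max(peaks[i][0], conserved[j][0])
        end = min(peaks[i][1], conserved[j][1])
        if end - start >= min_overlap:
            hits.append((start, end))
        # Advance whichever interval ends first.
        if peaks[i][1] < conserved[j][1]:
            i += 1
        else:
            j += 1
    return hits
```

For example, with placeholder coordinates, `overlapping_regions([(100, 200), (300, 400)], [(150, 180)])` keeps only the segment shared by the first peak and the conserved block.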
Chromatin accessibility at certain PCREs is correlated with the unique expression pattern of PU.1 in blood cell lineages
We next investigated the relationship between chromatin accessibility and the expression pattern of PU.1. We verified PU.1 expression patterns in blood cell lineages by examining single-cell RNA-seq datasets. Using the RNA portion of published cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) data of healthy human bone marrow mononuclear cell (BMMC) and peripheral blood mononuclear cell (PBMC) populations [53], we noted that PU.1 expression was highest in primary myeloid cells, lower in B cells, minimal in erythroid cells, and essentially silenced in mature T cells (Fig. 2A-B). As expected, lineage-marker genes, including ITGAM (encoding the myeloid marker CD11b), GYPA (encoding the erythroid marker GPA), CD19 (encoding the B-cell marker CD19), and CD3G and CD3D (encoding components of the T-cell marker CD3), were expressed predominantly in their respective cell populations, further confirming the validity of the CITE-seq data (Figures S1A-D). We next interrogated how putative PCREs and their epigenetic features relate to lineage-specific PU.1 expression. For this purpose, we analyzed a single-cell multiomic dataset of human PBMCs (10x Genomics), in which chromatin accessibility and gene expression were co-determined at single-cell resolution. This analysis revealed that chromatin accessibility at most elements within a PCRE cluster (PCREC), including URE, TL5, hZL12, TL7, hCE5, and TL8, was specific for myeloid cells (Fig. 2C, top panel). In contrast, chromatin opening was pronounced but not lineage-specific in elements such as TL1, or was greater in lymphocytes, as seen in TL3 and TL4 (Fig. 2C). Chromatin accessibility at the URE occurred in proportion to the PU.1 expression levels, while the PrPr displayed a bimodal pattern: opening to a similar degree in myeloid and B cells and closing in T cells.
As predicted, the correlation between chromatin accessibility and lineage-specific expression was observed with the lineage-marker genes ITGAM, CD19, and CD3E in myeloid, B, and T cells, respectively (Figures S1E-G).
We further examined the chromatin structure of progenitor cells along the developmental pathways of the myeloid and lymphoid cell lineages, which show the most upregulated and the most downregulated PU.1 expression, respectively. PU.1 is initially expressed at low levels in hematopoietic stem cells (HSCs), increases as cells progress to granulocyte-monocyte progenitors (GMPs), and is most prominently expressed in mature myeloid cells (Fig. 2D). Chromatin is accessible at the PrPr and the URE at all cellular stages. However, chromatin at the remaining elements of the PCREC only becomes accessible at the GMP stage and remains open in mature myeloid cells (Fig. 2E and S1H). On the other hand, PU.1 is expressed at low levels in hematopoietic precursor cells (HPCs) and continues to decrease in expression as they progress toward mature T cells (Figure S2I). Correspondingly, chromatin remains inaccessible at the PCREC at all stages of T-cell development, while accessibility at the PrPr is sequentially reduced as the T-cell lineage develops (Figures S2J-K). Taken together, our findings indicate that chromatin accessibility at the PCREC is associated with high levels of PU.1 expression in myeloid cells and that the degree of chromatin opening at the PrPr is associated with low expression of PU.1 in progenitor cells. Thus, the lineage-dependent relationship observed between chromatin accessibility at the putative PCREs and PU.1 transcript levels supports a direct regulatory role and prompted further epigenetic analyses.
Fig. 2 Associations between PU.1 expression and chromatin accessibility at certain regulatory elements. A-B) Expression analysis of PU.1 in primary human blood cells at single-cell resolution. (A) UMAP plot of myeloid, B, T, and erythroid cells. (B) Transcript profiles of PU.1 in blood cell lineages. CITE-seq data of combined healthy human BMMC, CD34+ cells and PBMC cells. Data were retrieved from GEO (GSE139369) [53]. (C) Gene track view of the PU.1 locus showing an integrated single-cell ATAC-seq (scATAC-seq) and single-cell RNA-seq (scRNA-seq) analysis of healthy human PBMCs. The data were from the 10k Multiome dataset (10x Genomics). (D) Transcript profile of PU.1 during myeloid cell differentiation. The transcript counts of HSCs, multipotent progenitors (MPPs), common myeloid progenitors (CMPs), GMPs, and mature myeloid cells were analyzed from the published RNA-seq dataset GSE74246 [50]. Error bars indicate SD (n ≥ 3). (E) Bar graph of ATAC-seq enrichment scores of PCREs and the PrPr from HSCs and their progenies during myeloid differentiation. The published ATAC-seq dataset was GSE74912 [50]. Error bars indicate SD (n ≥ 6). The arrows in figures D and E indicate the direction of the myeloid differentiation pathway with sequential cell stages. See also Table S1 for sample details.
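The association between accessibility and expression is reported qualitatively in this section. One hypothetical way to put a number on such a relationship is a Pearson correlation across lineages or differentiation stages. The sketch below is an illustration under that assumption; the function `pearson_r` and any inputs passed to it are not taken from the study.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long series, e.g.
    per-lineage accessibility scores and per-lineage transcript levels
    (hypothetical inputs)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)
```

With only four lineages, any such coefficient would be descriptive rather than a basis for inference, and a rank-based measure could be substituted if proportionality is not expected.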
DNA methylation inversely correlates with chromatin accessibility at PCREs in myeloid cells
We sought to examine the methylation status of PCREs in blood cell lineages. Whole-genome bisulfite sequencing (WGBS) datasets of myeloid, B, T, and erythroid cells from a previously published atlas of human tissue methylation [47] were employed. Remarkably, all PCREs were essentially unmethylated in myeloid cells, as shown by the low proportion of WGBS-methylated probes (Fig. 3A-B). A subset of these, TL1, TL3, and TL4, lacked methylation in all cell lineages. However, lineage-associated differential methylation occurred in several other elements. For example, TL2, TL7, and TL8 were methylated in B, T, and erythroid cells but not in myeloid cells. Additionally, URE, TL5, hZL12, PrPr, and the CpG island were methylated only in T cells, where PU.1 expression was lowest (Figs. 2 and 3A-B). Methylation of CpG islands, which are often located in promoter regions, is known to be correlated with gene silencing [54, 55]. As expected, lineage-marker genes are specifically demethylated at their promoters in their corresponding cell lineages (Figures S2A-B). Our findings, therefore, indicate that a lack of methylation within PCREs is correlated with chromatin accessibility and high expression of PU.1 in myeloid cells. Conversely, methylation within select elements correlates with diminished PU.1 expression in other blood lineages.
Fig. 3 Differential DNA methylation and enrichment of the H3K27ac enhancer mark at the PCREs (see also Figure S2). A-C) WGBS analysis of DNA methylation at the regions of interest within the PU.1 locus in myeloid cells (monocytes and granulocytes), B, T, and erythroid cells. (A) Corresponding genomic PU.1 locus schema. (B) Heatmap of methylated probes for regions of interest, including the PCREC. (C) Graph of normalized methylation enrichment scores at the PU.1 locus across blood cell lineages. WGBS data were downloaded from GEO (GSE186458). Error bars indicate SD (n ≥ 3). D-E) H3K27ac enrichment at the PU.1 locus in four human blood cell lineages. (D) Average H3K27ac ChIP-seq coverage tracks of blood cell lineages. (E) Bar graph of coverage analysis showing enrichment scores for H3K27ac peaks at the elements. The ChIP-seq data of four human blood cell lineages were retrieved from ENCODE and GEO (GSE70660). Error bars indicate SD (n ≥ 2). See also Table S1 for sample details.
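To make the notion of a low proportion of methylated signal within an element concrete, a per-region methylation fraction can be computed from per-CpG read counts. This is a minimal sketch assuming a hypothetical input of `(position, methylated_reads, total_reads)` tuples; the function name and this data layout are illustrative and do not describe the scoring used for Fig. 3.

```python
from typing import Iterable, Optional, Tuple

CpG = Tuple[int, int, int]  # (position, methylated_reads, total_reads)


def region_methylation(
    cpgs: Iterable[CpG], region: Tuple[int, int]
) -> Optional[float]:
    """Pooled methylation fraction of CpGs inside a half-open region.

    Returns None when no reads cover the region, so that an uncovered
    element is not mistaken for an unmethylated one.
    """
    start, end = region
    methylated = 0
    total = 0
    for position, meth_reads, total_reads in cpgs:
        if start <= position < end:
            methylated += meth_reads
            total += total_reads
    return methylated / total if total else None
```

Pooling reads across CpGs weights well-covered sites more heavily; averaging per-CpG fractions instead would be an equally defensible choice.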
The 8-kb PCREC exhibits myeloid-specific enhancer signatures
We further examined the histone modification status of the PCREs. Because the URE is known to function as an enhancer in myeloid cells [16, 19, 20, 22], we utilized it as a positive internal benchmark for all other PCREs. We observed strong enrichment of the H3K27ac active enhancer mark at an 8-kb cluster of elements spanning from the URE to TL8 (Fig. 3D-E), in parallel with chromatin opening (Figs. 1 and 2) and a lack of DNA methylation (Fig. 3). This chromatin modification was lineage-specific and correlated with the high PU.1 expression exclusive to myeloid cells (Fig. 2). In B and erythroid cells, H3K27ac enrichment occurred only at the URE and was relatively weak. The putative PCREs exhibited H3K27ac enrichment in myeloid cells to a lesser extent than the URE (Fig. 4A-B), suggesting that they contribute as myeloid-specific PU.1 enhancers. Additionally, H3K27ac was highly enriched at lineage-marker gene loci, which correlated with their high expression in the corresponding cell types (Figures S2C-F). We further examined two additional histone marks associated with enhancers and gene transcription, H3K4me1 and H3K4me2 [31, 32]. H3K4me1 was diffusely enriched throughout the PU.1 locus, including the upstream regulatory regions. Across cell lineages, its enrichment declined in a stepwise manner from myeloid to B to erythroid to T cells, largely corresponding with RNA expression patterns (Figure S3G). H3K4me2 displayed an ~5-kb region of enrichment spanning from the promoter to the first exon, and the signal decreased in a similar manner (Figure S3H). Notably, three narrow upstream regions enriched with H3K4me2 overlapped with these elements but lacked lineage specificity (Figure S3H). Thus, H3K4me1 and H3K4me2 may help locate enhancer-containing regions but do not specifically reflect enhancer activities at PCREs.
Taken together, our data reveal that the 8-kb PCREC upstream of the PU.1 locus is demarcated by a differential H3K27ac histone mark that reflects the lineage-specific PU.1 expression pattern.
Fig. 4 Enrichment of histone signatures for transcription and PU.1 occupancy at the PCREC. A-B) H3K9ac ChIP-seq coverage across four human blood cell lineages. (A) Average H3K9ac ChIP-seq coverage tracks of the genomic region encompassing the PU.1 locus and its upstream region. (B) Bar graph of average enrichment scores. C-D) H3K4me3 ChIP-seq coverage across four human blood cell lineages. (C) Average H3K4me3 ChIP-seq coverage tracks of the genomic region encompassing the PU.1 locus and its upstream region. (D) Bar graph of average enrichment scores. E-F) Average PU.1 ChIP-seq coverage tracks of myeloid and B cells (E) and bar graph of average enrichment scores for ChIP-seq peaks at the PCREC (F). Error bars indicate SD (n ≥ 2). Data were retrieved from ENCODE and GEO. See Table S1 for sample details.
Absence of histone marks for silencers and gene repression within PCREs in blood cell lineages
The URE is thought to be bifunctional, acting as an enhancer in myeloid cells and a silencer in T cells [8, 20, 22]. Thus, we examined the PU.1 locus for histone marks that have been utilized to demarcate silencers across blood cell lineages. To our surprise, the repressive histone mark H3K27me3 was not significantly enriched at the PU.1 locus or its upstream genomic region in any of the lineages (Figure S3A). However, other control gene loci possessed these chromatin marks (Figures S3B-D), providing technical validation for the datasets. Thus, the utility of H3K27me3 could be limited to inferring the activity of broad genomic regions containing DNA elements in a context-dependent manner rather than identifying the silencing activity of specific PCREs, at least in the blood cell lineages we examined. Nevertheless, these results indicate that PCREs act in a manner independent of the H3K27me3 repressive histone modification.
The PCREC is transcriptionally active and associated with PU.1 autoregulation in myeloid cells
Active enhancers, with certain histone modifications and the binding of trans-acting factors, often exhibit transcriptional activities. Indeed, we found enrichment of histone 3 lysine 9 acetylation (H3K9ac) and histone 3 lysine 4 trimethylation (H3K4me3) [32], markers for transcription sites and active promoters, throughout the PCREC in myeloid cells (Fig. 4A-D and S5A-B). In contrast, these markers were only observed at the URE and the PrPr in B cells (Fig. 4A-B and S5A-B). T and erythroid cells displayed weak enrichment of these marks restricted to the URE (Fig. 4A-B and S5A-B). These results suggest that transcriptional activity at the PCREC is tightly coupled with high PU.1 expression in myeloid cells. Because PU.1 autoregulates its own expression [19, 56, 57], we also examined PU.1 recruitment to its own locus. Accordingly, we detected PU.1 occupancy at the PCREC and the PrPr (Fig. 4C and S5C). Intriguingly, even though PU.1 expression is modest in B cells compared to myeloid cells (Fig. 2), PU.1 occupancy at the URE was at comparable levels between the two cell types (Fig. 4C and S5C). In contrast, PU.1 occupancy at other PCREs in the cluster was higher in myeloid cells than in B cells. Additionally, weak PU.1 enrichment at the PCREC was observed in HSCs, CMPs, and GMPs, and lower still in MEPs, which corresponds to its expression pattern in these cells (Figure S4A-B). We further examined the occupancies of other hematopoietic transcription factors and noted generalized occupancy at the PU.1 promoter. However, their binding at the PCREC varied. GATA2 does not bind to the PCREC in all lineages, whereas occupancy by RUNX1, EGR, LMO2, and TAL1 could be observed (Figure S4). Thus, in addition to regulating the URE, PU.1 occupancy at other constituents of the PCREC is linked to superior PU.1 autoregulation in myeloid cells.
Noncoding RNA transcripts are initiated at the PCREC in the upstream region of PU.1 in myeloid cells
Transcriptional activity at active enhancers gives rise to enhancer RNAs (eRNAs), which include 1d-eRNAs (long, polyadenylated, and unidirectionally transcribed) and 2d-eRNAs (short, nonpolyadenylated, and bidirectionally transcribed) [58, 59]. As previously reported, the 1d-eRNA LOUP originates from the URE and induces PU.1 expression [15]. Therefore, we sought to examine molecular indicators of transcriptional activity at the PCREC. We first analyzed Global Nuclear Run-On sequencing (GRO-seq) data [60, 61]. In addition to those found at the PU.1 gene body, nascent RNAs were present within an upstream region encompassing the PCREC in myeloid but not erythroid cells (Fig. 5A-B). This suggests the presence of noncoding transcripts, as no coding gene is known to be present within this region. As expected, nascent mRNAs of the myeloid-marker gene ITGAM and the erythroid-marker gene GYPA were present specifically in the corresponding cell types (Fig. 5C-D). We further examined RNA polymerase II (Pol II) chromatin binding and noted its occupancy at all the constituents of the PCREC, with the most prominent binding at the URE (Fig. 5E). Remarkably, antisense RNAs were also detectable within the PU.1 locus (Fig. 5A-B). We further located transcription initiation sites within PCREs by inspecting the Cap Analysis Gene Expression sequencing (CAGE-seq) tracks downloaded from the FANTOM5 project [62]. Intriguingly, in addition to a CAGE-seq peak corresponding to 1d-eRNA LOUP initiation at the distal homology region (or H1) of the URE [15], there are bidirectional CAGE-seq peaks within the proximal homology region (or H2) of the URE (Fig. 5F). This finding suggests the presence of a 2d-eRNA. Moreover, we identified CAGE-seq peaks at TL5, hZL12, TL7, and TL8, further indicating the presence of eRNAs characteristic of active enhancers at these elements (Fig. 5F).
Thus, the PCREC exhibits the molecular features of myeloid-specific enhancers.
Fig. 5 Noncoding transcriptional activities at the PCREC. A-D) Gene track view of the genomic regions encompassing the PU.1 locus and its upstream and downstream neighboring genes (A), the zoomed-in genomic region comprising the PU.1 locus and its upstream region (B), the ITGAM locus (C), and the GYPA locus (D). Shown are average GRO-seq coverage tracks of myeloid and erythroid cells (n = 2). E) Average Pol II ChIP-seq coverage tracks across four human blood cell lineages (n ≥ 1). F) CAGE-seq track of human and mouse primary cells, cell lines and tissues from the FANTOM5 project. Data in (A-D) and (F) were retrieved from ENCODE and GEO. See Table S1 for sample details.
A lineage-specific chromatin loop juxtaposing the PCREC with the PU.1 promoter resides within the 35-kb CTCF-insulated neighborhood
Although the central role of the URE-PrPr chromatin loop in PU.1 induction in myeloid cells has been well described [15,16,17, 19, 20], a comprehensive analysis of chromatin 3D architecture at the PU.1 locus in human blood cell lineages has not yet been performed. To address this, we examined Micro-C (or intact micrococcal nuclease (MNase) chromosomal conformation capture sequencing (Hi-C)) datasets, which are generated using MNase digestion rather than traditional four- or six-cutter restriction enzymes, thus allowing chromatin interactions to be viewed at sub-kilobase resolution [63, 64]. To optimize our analyses of the interaction map at such fine resolution, we used HiCCUPS to call unbiased loops at 0.5, 1, 2, and 5 kb resolutions simultaneously [65].
In all blood cell lineages and in a control non-hematopoietic cell lineage, a common loop was observed between the TL1/2 region, a site of open chromatin in all lineages (Figs. 1C and 2C), and a region 4 kilobases downstream from the PU.1 promoter (P4K) (Fig. 6A-E). To further characterize this loop, we inspected the occupancy of the DNA-binding protein CTCF, an important mediator of chromatin looping that forms the boundaries of insulated chromatin loops. Indeed, CTCF specifically occupied the TL1/2 and P4K regions where chromatin is accessible (Fig. 6 and S5), suggesting that the interaction of these elements forms a 35-kb insulated chromatin neighborhood within the PU.1 locus in a lineage-independent manner. Interestingly, we noticed the presence of various lineage-specific chromatin interactions within this large chromatin loop in blood cells (Fig. 6A-D). Notably, an interaction involving the PCREC was specific to myeloid and B cells (Fig. 6A-B) and correlated with PU.1 expression in these cells (Fig. 2). Of note, in addition to the known functional interaction between the PCREC and the PrPr reported in myeloid cells, there are also interactions between the PrPr and other elements, such as TL3/4 and TL1/2, in B cells. These interactions may interfere with functional PCRE-PrPr interactions, contributing to net-decreased PU.1 levels in B cells. In T cells, the PCREC shifted its docking from the PrPr to the P4K (Fig. 6C), which correlates with PU.1 silencing in these cells (Fig. 2). In line with the predominant chromatin accessibility at TL3 and TL4 in B and T cells (Fig. 2C), the TL3/4-P4K interaction was also noted in these cells (Fig. 6B-C). In erythroid cells, where PU.1 is minimally expressed (Fig. 2), no PCREC-PrPr interaction was detected; an interaction between TL1/2 and the PrPr was observed instead (Fig. 6D). In a non-hematopoietic tissue (heart left ventricle), no chromatin interactions aside from TL1/2-P4K could be found (Fig. 6E).
Taken together, these results highlight chromatin interactions between PCREs and the PU.1 promoter in 3D space and show a compelling correlation with patterns of lineage-specific PU.1 expression.
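Naming a called loop by the elements at its two anchors (for example, TL1/2-P4K or PCREC-PrPr) amounts to an overlap lookup between loop anchors and annotated elements. The sketch below is illustrative only: `annotate_loop`, the dictionary layout, and the coordinates in the usage note are hypothetical, and nothing here reproduces HiCCUPS output formats or the coordinates used in Fig. 6.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def annotate_loop(
    loop: Tuple[Interval, Interval],
    elements: Dict[str, Interval],
) -> Tuple[List[str], List[str]]:
    """Return the names of the elements overlapping each loop anchor."""

    def names(anchor: Interval) -> List[str]:
        start, end = anchor
        return sorted(
            name
            for name, (el_start, el_end) in elements.items()
            if el_start < end and start < el_end
        )

    upstream_anchor, downstream_anchor = loop
    return names(upstream_anchor), names(downstream_anchor)
```

For instance, with placeholder coordinates `{"TL1/2": (1000, 3000), "P4K": (38000, 39000)}`, a loop with anchors `(2000, 2500)` and `(38000, 38500)` would be labeled as connecting TL1/2 and P4K.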
Fig. 6 Chromatin architecture dynamics at the PU.1 locus in cell lineages. Hi-C contact map (left) and schematics for chromatin interactions (right) of (A) myeloid cells (monocytes), (B) B cells, (C) T cells (CD4+ T cells), (D) erythroid cells (erythroid precursors, GEO: GSM6616197) [66], and (E) heart left ventricle. The corresponding genomic locations with PCREs and CTCF ChIP-seq tracks are shown underneath each contact map. In the schematics, the PCREC and its interaction points are shown. Within the PCREC, elements containing interaction points, including the URE as well as the TL5-6 sub-cluster that comprises TL5, hZL12, and TL6, are displayed. The arrows represent the chromatin direction starting from the upstream region toward the PU.1 gene body. Two-direction arrows depict chromatin interactions identified by HiCCUPS. Data were retrieved from ENCODE and GEO (Table