Epigenetics and tissue immunity: Translating environmental cues into functional adaptations

1 INTRODUCTION

Tissue immune responses require the coordinated activation of specialized cell types, both professional immune cells and organ structural cells, each with differing, but tightly regulated, transcriptional programmes that are switched on and off in response to pathogen or tissue damage signals. The precise immune functionality of these tissue populations is determined by the selection of genes they express, encoding proteins that differentially contribute to immune responses. Tissue immune cells encounter an array of organ-specific conditions that determine tissue-specific transcriptional programmes. Gene expression is controlled, in part, by the state of chromatin, with closed structures sterically hindering the binding of transcriptional modulators. Post-translational modifications of histone proteins, such as methylation or acetylation, alter chromatin compaction and accessibility, while DNA methylation of CpG dinucleotides may also influence gene expression by preventing transcription factor binding. The extent and nature of these epigenetic modifications are shaped by the stimuli encountered by cells, for example, engagement of toll-like receptor (TLR) 4 on macrophages by lipopolysaccharide (LPS) can alter subsequent gene expression upon re-challenge, due to changes in histone acetylation. In addition, several non-coding RNAs provide an additional layer of post-transcriptional control of gene expression. Regulation of immune cell transcriptional activity is paramount to ensure appropriate immune responses are generated and terminated in a timely manner, and a failure to do so may have negative consequences for organ homeostasis. Therefore, understanding the precise epigenetic mechanisms that control tissue immune responses will inform treatment strategies for a variety of diseases. 
Here, with a focus on the mammalian kidney, we will review the basic concepts of tissue immunity, discuss the technologies available to profile epigenetic modifications in tissue immune cells, and consider how these mechanisms influence the development, phenotype and function of different tissue immune cell subsets, as well as the immunological function of structural cells in health and disease.

2 TISSUE IMMUNITY

2.1 Tissue immunity: a coordinated effort by structural cells and tissue-resident immune cells

The study of mammalian immunity has historically focused on interrogating the responses of immune cells in blood or secondary lymphoid organs (lymph node and spleen). However, it is increasingly appreciated that several subsets of innate and adaptive immune cells reside in non-lymphoid organs.1-3 These tissue-resident populations may constitute a large proportion of the total immune cell pool, and do not enter the circulation, permanently occupying a specific niche within tissues.4 The archetypal tissue-resident cell type is the macrophage, exemplifying the canonical features of tissue-resident cells, being long-lived, self-renewing, and showing tissue-specific transcriptional and functional specialization.5 Macrophages take up residency in tissue niches early in embryogenesis, seeding from the yolk sac (YS), and then fetal liver precursors.6 Post-natally, macrophage tissue pools are variably replenished by monocyte-derived cells,7, 8 with tissue-specific cues, for example from the microbiome8 (in the case of the intestine) or high interstitial sodium9 (in the case of the kidney), influencing this process. Other prevalent tissue-resident immune cell subsets include T cells; antigen-specific CD8+ T and CD4+ T cells enter tissues during viral challenge, and persist long after the resolution of infection.10-12 A common tissue-residency transcriptional signature has been described in lymphocytes,13, 14 and other tissue-resident subsets including innate lymphocytes and natural killer (NK) cells have also been characterized.15, 16 Tissue-resident immune cells play a variety of important functional roles in addition to immune defense, frequently contributing to organ homeostasis.17-19 For example, human yolk sac-derived macrophages in the heart were physically connected to cardiomyocytes via gap-junctions containing connexin 43, which allows macrophages to participate in and regulate electrical conduction20; in the colon, muscularis macrophages regulate peristalsis.21

Effective tissue immune responses require the coordinated interaction of these resident populations with each other and with their circulating counterparts, via cytokine and chemokine production, as well as cross-talk with the epithelial compartment.22-25 Indeed, immune functionality within organs is not limited to immune cells; non-immune tissue cells can also play a part. For example, in human and mouse kidney, we previously showed that pelvic epithelial cells express antimicrobial peptides (AMP), directly contributing to anti-bacterial immunity, and produce neutrophil-recruiting chemokines, orchestrating the specific anatomical localization of the key circulating phagocytes to protect the kidney from bacteria ascending from the bladder.26 Krausgruber et al27 described the expression of immune mediators, as well as cytokines and chemokines, in epithelium, fibroblasts, and endothelium in a variety of mouse organs, generating so-called ‘structural immunity’. Thus, tissue immune responses involve the combined efforts and interactions of epithelial, endothelial, stromal, and resident immune cells, and are tailored to the tissue-specific challenges encountered, requiring tissue-specific cues to control cellular transcriptional programmes.

2.2 Experimental identification of tissue immune cells

Since all solid organs contain a vascular network, required to supply oxygen and remove metabolic waste, a major challenge in the field of tissue immunity has been how to accurately distinguish bona fide tissue-dwelling cells from those in the circulation. In murine studies, three approaches have been applied to establish the tissue-resident status of an immune cell. The first is parabiosis,28 in which the circulatory systems of two mice expressing different congenic surface markers (most often CD45.1 versus CD45.2) are surgically joined. After a number of weeks, tissue-resident populations remain host-derived, whereas roughly equal numbers of host-derived (eg, CD45.1) and partner-derived (eg, CD45.2) cells would be expected for any recirculating leukocyte population.12, 15, 29, 30 Secondly, an intravascular anti-CD45 antibody can be administered pre-mortem to label circulating immune cells prior to organ harvest. In this case, tissue-resident cells outside of the vasculature remain unlabeled.31 Thirdly, imaging tissue sections enables the position of immune cells relative to the vasculature to be directly defined.
Together, these approaches have been used to identify tissue-specific markers expressed on organ-resident populations, for example, for tissue-resident memory T cells (Trm) these include CD69, integrin αE (CD103), and the α1 subunit of the α1β1 integrin, CD49a.31-34 In humans, assessing tissue residency is more challenging, but T cells isolated from non-lymphoid organs express some of these markers35, 36; for example, CD69 is detectable on skin-resident T cells in humans.37, 38 However, there are differences in the phenotypes of Trm in murine versus human organs38 and different organs imprint distinct tissue-specific transcriptional programmes, phenotypes, and functions on resident T cells.35, 39 To summarize, identifying and studying bona fide tissue-resident subsets requires careful application of the experimental systems discussed above, and is particularly challenging in humans, but is necessary to definitively delineate the organ-specific transcriptomic and epigenetic profiles of resident immune cells.

2.3 Tissue cues, structural and immune cells in the kidney

Every organ presents a unique environment for the immune cells residing there, with specific tissue cues, shaped by the homeostatic function of the organ. In many cases, there are discrete microenvironments within an organ, due to spatial separation of different organ functions. A good example of this is found in the mammalian kidney, an organ specialized for the removal of metabolic waste and excess fluid. Anatomically, each kidney consists of an outer cortex containing glomeruli where filtrate is generated, and an inner medulla where urine is concentrated (Figure 1A). The functional subunit of the kidney is the nephron, made up of a glomerulus, proximal tubule (PT) (where filtered electrolytes are reabsorbed), loop of Henle (LOH) (that generates the intrarenal sodium gradient required for urine concentration), and collecting ducts (CD) that coalesce in the kidney pelvis.40 Different mononuclear phagocyte (MNP) populations are differentially located in cortex and medulla.39 Furthermore, immune cells in the cortex are exposed to a very different environment compared to medullary immune cells that experience hypersalinity and hypoxia.41 Notably, these environmental cues can affect immune cell recruitment and function; we found that high extracellular sodium augmented the anti-bacterial function of macrophages, and increased the production of monocyte-recruiting chemokines by epithelial cells,40 effects mediated at a molecular level by the transcription factors Nuclear Factor Of Activated T Cells 5 (NFAT5) and Hypoxia Inducible Factor 1 Subunit Alpha (HIF1α).42 We recently applied single-cell RNA sequencing (scRNAseq) to more comprehensively profile kidney immune cells,26 utilizing technological advances that have enabled high-throughput scRNAseq and the generation of organ atlases.43 In normal human kidney, we identified more than 15 subsets of tissue immune cells based on their transcriptional profiles, with the immune landscape dominated by MNPs, including macrophages and dendritic cells (DCs), T cells, and NK cells, and different populations were enriched in different regions of the kidney (Figure 1B-C). The localization pattern of the kidney immune cells was orchestrated by immune-epithelial cross-talk for regulating anti-bacterial immune responses (Figure 1D).

Figure 1. Tissue immunity in the human kidney. A, The human kidney in section; the kidney is macroscopically divided into cortex and medulla. Hundreds of thousands of nephron units are arranged over cortico-medullary depth (highlighted). Filtrate is generated in the glomerulus (lower panel) comprising podocytes, mesangial cells, and glomerular endothelial cells (GEC), before being modified by solute and metabolite resorption, excretion, and concentration along tubular nephron segments. Gradients of oxygen tension and salinity exist between the cortex and medulla as indicated. B, UMAP showing populations identified by integrated analysis of scRNAseq data (Stewart et al and Kuppe et al) (91 265 cells) (PTEC, proximal tubular epithelial cell; DT, distal tubule; LOH, loop of Henle; PC, principal cell; IC-A, intercalated cell type A; IC-B, intercalated cell type B; PE & Uro, pelvic epithelium and urothelium; LE, lymphatic endothelium; GEC, glomerular endothelial cell; VRE, vasa recta endothelial cell; PCE, peritubular capillary endothelial cell; T&NK, T cells and NK cells; MNP, mononuclear phagocytes). C, The immune compartment of the human kidney as identified by integrated scRNAseq analysis (17 680 cells). Distinct populations of dendritic cells are evident (cDC1, conventional DC1; cDC2, conventional DC2; pDC, plasmacytoid DC; aDC, activated DC). Among innate lymphocytes, two subsets of NK cells (NK cell 1, NK cell 2) are evident, in addition to innate lymphoid cells (ILC). D, Left panel: Ascending bacterial infection from the lower urinary tract first reaches the pelvic region of the kidney. Here, antimicrobial peptide production by the pelvic epithelium acts as a first line of defense, and chemokine expression orchestrates the recruitment of neutrophils and mononuclear phagocytes. Right panel: antimicrobial peptide (red) and chemokine (blue) gene expression patterns in nephron cell types ordered along the proximal-distal axis of the nephron (analysis of scRNAseq data, Stewart et al).

3 EPIGENETIC MECHANISMS

Epigenetic mechanisms play a crucial role in cell fate specification by regulating gene expression and silencing in a context-dependent manner. Epigenetic control of gene transcription and translation does not involve changes to the DNA sequence, but rather, reversible chemical modifications of DNA or histones, or the activities of non-coding RNAs, that together enable cell- and tissue-specific, gene expression patterns that are essential for controlling normal developmental processes and maintaining tissue homeostasis. Disruption of epigenetic mechanisms can lead to organ dysfunction and disease states, including autoimmunity and cancer.

3.1 DNA methylation

DNA methylation is one of the best-studied epigenetic modifications to date. In general, DNA methylation is the addition of a methyl group to the fifth carbon atom of a cytosine (5mC) that is followed by a guanine base (Figure 2A). The resulting methylated CpG dinucleotides, which are frequently found in gene regulatory regions, can block transcription factor binding to gene promoters and repress target gene expression. Currently, three active DNA methyltransferases (DNMTs) are known to catalyze DNA methylation in mammals, namely DNMT1, DNMT3A, and DNMT3B.44 Demethylation in the mammalian genome is mediated by the TET (Ten-Eleven Translocation) family of dioxygenases that oxidize 5mC to 5-hydroxymethylcytosine (5hmC), and then to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC).45 The “intermediate” 5hmC marks active demethylation, plays distinct epigenetic roles, and is a useful indicator of gene expression.46, 47

Figure 2. Methods to study epigenetics at bulk and single-cell resolution. A, Epigenetic modifications occur through DNA methylation (left panel) or histone modifications (right panel). Highlighted are reactions methylating cytosine pyrimidine bases to 5-methylcytosine (5mC) and oxidation of this to 5-hydroxymethylcytosine (5hmC). The nucleosome is composed of pairs of histone components, and commonly studied histone modifications (monomethylation, dimethylation, trimethylation, and acetylation) and their effect on chromatin structure and transcription are highlighted (+, activation; -, repression). B, Outline of methods to study DNA methylation, histone modifications, chromatin accessibility, and nuclear organization for both bulk sample and single-cell applications. C, Illustration of multi-omic profiling methods, highlighting the molecular layer targeted by each assay.

DNMT1 is essential for maintaining DNA methylation during the synthesis (S) phase of the cell cycle,48 while DNMT3A and DNMT3B are required for de novo DNA methylation. During embryogenesis, DNA is first demethylated by TET1, TET2, and TET3, resulting in a “clean slate” for de novo DNA methylation by DNMT3A and DNMT3B.49 The activities of these two enzymes are regulated by their expression patterns and by the structure of distinct isoforms; DNMT3A has two isoforms, DNMT3A1 and DNMT3A2.50 Full-length DNMT3A1 expression is maintained in differentiated cells and its intact N-terminal region can interact with DNA to repress gene expression. In contrast, DNMT3A2, which lacks 223 amino acids in its N-terminal region, is predominantly expressed in embryonic stem cells (ESC). DNMT3B has more than 30 isoforms with distinct catalytic and regulatory activities. Although loss-of-function studies of DNMT3A and DNMT3B confirmed their importance in de novo methylation rather than imprinted methylation patterns,51 ESCs from double knockout mice show a gradual loss of DNA methylation over time, suggesting that these enzymes work in concert with DNMT1 to maintain DNA methylation.52, 53 DNMT2, which is now known as tRNA aspartic acid methyltransferase 1 (TRDMT1), was found to be a tRNA methyltransferase that does not methylate DNA, despite having a catalytic domain similar to that of DNMT1.54

DNA methylation is critical for the differentiation of hematopoietic stem cells (HSCs) that give rise to lymphoid (eg, B/T lymphocytes and NK cells) and myeloid (eg, macrophages and dendritic cells) lineages. Whole DNA methylome mapping efforts have revealed reciprocal increased or decreased levels of DNA methylation during lymphoid or myeloid lineage specification, respectively.55 The generation of functionally specialized, differentiated immune cell subsets is regulated by DNA methylation, for example, regulatory T cell (Treg) differentiation is coordinated by DNMT1,56 T helper (Th)1/2/17 polarization by TET2 and DNMT3A,57, 58 and memory T cell activation by demethylation of interferon gamma (IFNG).59 These loss-of-function studies have not only revealed the role of these enzymes in methylation or demethylation of specific CpG sites, but also show their influence in controlling the activity of transcription factors that may bind to enhancers in these cell types.

3.2 Histone modifications

DNA is highly compacted into nucleosomes, obstructing transcription, but post-translational modification of nucleosome core histones at their N-termini may alter the accessibility of DNA to the transcriptional machinery, resulting in the activation or repression of gene expression60 (Figure 2A). Nucleosomes comprise 145-147 base pairs (bp) of DNA wrapped almost twice around eight core histone proteins (H), two copies each of H2A, H2B, H3, and H4, and adjacent nucleosomes are joined by 10-70 bp of linker DNA to form nuclear chromatin. The N-terminal regions of histones contain modifiable amino acids at their surface, including lysine, arginine, serine, threonine, and tyrosine. Chromatin compaction and accessibility are altered by chemical modification of these amino acids, specifically acetylation, methylation, and phosphorylation, changing the transcription of certain genes.

Histone acetylation usually results in higher gene expression and is achieved by the regulated activity of two groups of enzymes with opposite effects, namely histone acetyltransferases (HATs) and histone deacetylases (HDACs). HATs were first identified by Allfrey et al61 who found that these enzymes can neutralize lysine's positive charge in the tail regions of histones by transferring an acetyl group from acetyl-CoA to the target lysine residues, weakening the interaction between histones and DNA. As the modified chromatin becomes less compact, the transcriptional machinery can gain access to target genes, promoting transcription. HATs fall into two classes: type-A and type-B. Type-B HATs, such as HAT1, acetylate newly synthesized histones H3 and H4 at their tail regions in the cytoplasm,62 whereas type-A HATs, such as the MYST family (including Sas2 (Something about silencing 2), Sas3 (previously yeast Ybf2), and TIP60 (Tat-interacting protein 60 kDa)) and the cyclic adenosine monophosphate (cAMP) response element-binding protein (CREB)-binding protein (CBP)/p300 family, acetylate nucleosomal histones in the context of chromatin.63 HDACs oppose the actions of HATs, restoring lysine's positive charge, which is essential for the stability of the chromatin architecture,63, 64 leading to closed chromatin and suppressing gene expression.

Like histone acetylation, phosphorylation of histones at serines, threonines, and tyrosines is dynamic, and is regulated by counteracting kinases and phosphatases. Kinases transfer a phosphate group from ATP to the target residues, adding negative charge to the N-terminal regions of histones (for example, mitogen-activated protein kinase 1 (MAPK1)65) or to the core region (for example, Janus kinase 2 (JAK2)65, 66), creating sites for DNA exit from the nucleosome. Histone phosphorylation works in partnership with other modifications; for example, phosphorylation of serine 10 in histone H3 (H3S10ph) promotes acetylation of H3K9 (H3K9ac),67 H3K14 (H3K14ac),68 and H4K16 (H4K16ac).69

Unlike acetylation and phosphorylation, which alter gene expression patterns by changing the charge of histones, the addition of one or more methyl groups at lysines and arginines of histones enables DNA to uncoil from nucleosomes. Histone methylation can lead to transcriptional repression or activation, depending on the amino acid targeted and the number of methyl groups added. These factors are determined by critical residues in the catalytic domains of histone lysine methyltransferases (HKMTs), including DIM5 (defective in methylation 5) and SET (Suppressor of position-effect variegation (Su(var))3-9, Enhancer-of-zeste, Trithorax)7/9. X-ray crystallography70, 71 and site-directed mutagenesis72 have shown that Phenylalanine281 in DIM5 or Tyrosine305 in SET7/9 is responsible for transferring three or one methyl groups to target histone lysines, respectively. Generally, methylation of Lysine9 (H3K9me2, H3K9me3) and Lysine27 (H3K27me2) in H3 leads to repression of transcription, and these modifications are enriched in developmentally silenced loci or heterochromatin domains.73 In contrast, methylation of Lysine4 (H3K4me1, H3K4me2, H3K4me3), Lysine36 (H3K36me3), and Lysine79 (H3K79me1) in H3 is associated with active transcription, and these marks are present at the 5' untranslated region of target genes (H3K4), or in gene bodies (H3K36, H3K79).73, 74
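
The mark nomenclature used above (histone, residue, position, modification) lends itself to a small lookup. The following Python sketch is illustrative only: the function names and the regular expression are hypothetical, and the table records only the associations stated in the paragraph above, not a complete catalogue of histone marks.

```python
import re

# Pattern for names such as "H3K4me3": histone, one-letter residue,
# residue position, then the modification (me1/me2/me3, ac, or ph).
MARK_PATTERN = re.compile(r"^(H2A|H2B|H3|H4)([KRSTY])(\d+)(me[123]|ac|ph)$")

# Associations taken from the text above only (assumption: anything that
# the text does not mention is reported as "not listed").
REPORTED_EFFECT = {
    "H3K9me2": "repression",
    "H3K9me3": "repression",
    "H3K27me2": "repression",
    "H3K4me1": "activation",
    "H3K4me2": "activation",
    "H3K4me3": "activation",
    "H3K36me3": "activation",
    "H3K79me1": "activation",
}


def parse_mark(name):
    """Split a histone mark name into its four components."""
    match = MARK_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized histone mark: {name}")
    histone, residue, position, modification = match.groups()
    return {
        "histone": histone,
        "residue": residue,
        "position": int(position),
        "modification": modification,
    }


def reported_effect(name):
    """Return the transcriptional association listed in the text, if any."""
    parse_mark(name)  # raises ValueError for malformed names
    return REPORTED_EFFECT.get(name, "not listed")


if __name__ == "__main__":
    for mark in ["H3K4me3", "H3K27me2", "H3S10ph"]:
        print(mark, parse_mark(mark), reported_effect(mark))
```

Keeping the parser separate from the lookup means that marks such as H3S10ph are still recognized as well-formed even though the table has no entry for them.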

3.3 Non-coding RNAs

Non-coding RNAs comprise the majority of RNAs, do not encode functional proteins, and play a pivotal role in regulating gene expression at the post-transcriptional level. These non-coding RNAs include micro-RNAs (miRNAs), Piwi-interacting RNAs (piRNAs), small interfering RNAs (siRNAs), and long non-coding RNAs (lncRNAs). miRNAs and siRNAs contain 19-24 nucleotides, can both be generated through cleavage by the RNase III enzyme Dicer, and induce RNA silencing by forming an RNA-induced silencing complex (RISC) that engages their target mRNAs. miRNAs are single-stranded RNAs that contain incomplete hairpin structures and achieve gene silencing by targeting the untranslated regions of transcripts encoding enzymes critical for chromatin remodeling, such as HDAC4 by miR-140,75 POLR3D by miR-320,76 and DNMT3A and DNMT3B by the miR-29 family.76, 77 In addition to their silencing effects, recent studies have shown that miRNAs such as miR-373 may induce RNA activation through binding to the promoter regions of target genes, for example, E-cadherin and CSDC2.78 piRNAs are transcripts of 26-31 nucleotides that form RNA-protein complexes by interacting with Piwi domain-containing Argonaute proteins to achieve target gene silencing,79 particularly of transposons.80 The biogenesis of piRNAs is not yet clear; however, it is thought that they could be derived from long, single-stranded precursor molecules,81 in a process catalyzed by two Piwi proteins, Aubergine (Aub) and Argonaute-3 (Ago3). This process, also known as the piRNA Ping-Pong pathway, appears to trigger the degradation of transposons.82 lncRNA transcripts exceed 200 nucleotides in length; an example is X-inactive specific transcript (XIST), a 17 kb lncRNA best known for its role in X chromosome inactivation. XIST physically binds to its target X chromosome in cis, recruiting the Polycomb repressive complex 2 (PRC2).82 As a result, H3K27 trimethylation is induced to repress gene expression.

4 TECHNOLOGIES: PROFILING EPIGENETIC MECHANISMS, MOVING FROM TISSUES TO SINGLE CELLS

4.1 DNA methylation profiling

Genome-wide assays of DNA methylation (5mC) can be performed using methods which capture the entire genome at single-base resolution, or methods that target specific modifications and regions of the genome to build lower resolution maps of methylation (Figure 2B). The current gold standard is whole-genome bisulfite sequencing (WGBS). A high concentration of sodium bisulfite at pH 5.0 results in deamination of cytosine to uracil, while 5mC is protected from the deamination reaction.83 Consequently in sequencing data, 5mC are read as cytosine bases; however, the deaminated bases are sequenced as thymines. In the original methodology, Sanger sequencing was used to assess CpG methylation.83, 84 The current standard approach is to prepare libraries for next-generation sequencing (NGS) to generate genome-scale maps of DNA methylation.85
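
The logic of bisulfite read-out can be illustrated with a toy simulation. This is a minimal Python sketch, assuming a single top-strand read with no sequencing errors, no incomplete conversion, and no alignment step; the function names and the example sequence are hypothetical and do not correspond to any published pipeline.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate how a top-strand sequence reads after bisulfite treatment and PCR.

    Unmethylated cytosines are deaminated to uracil and read as T;
    cytosines at methylated positions are protected and still read as C.
    """
    converted = []
    for index, base in enumerate(seq):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_cpg_methylation(reference, read):
    """Compare a converted read with the reference at every CpG cytosine.

    Returns {position: True} where the read retains C (methylated) and
    {position: False} where the read shows T (unmethylated).
    """
    calls = {}
    for index in range(len(reference) - 1):
        if reference[index:index + 2] == "CG":
            if read[index] == "C":
                calls[index] = True
            elif read[index] == "T":
                calls[index] = False
    return calls


if __name__ == "__main__":
    reference = "ACGTTCGACCGA"      # CpG cytosines at positions 1, 5, and 9
    read = bisulfite_convert(reference, methylated_positions={1, 9})
    print(read)
    print(call_cpg_methylation(reference, read))
```

In real data the same comparison is made across many reads per site, so that each CpG receives a methylation fraction rather than a binary call.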

WGBS cannot differentiate 5mC and 5hmC—both are read as cytosines in sequencing. A modification to bisulfite sequencing, oxidative bisulfite sequencing (OxBS-seq), converts 5hmC to 5fC, and subsequent bisulfite treatment converts 5fC to uracil, leaving 5mC unconverted. Comparisons between oxBS-seq and conventional bisulfite sequencing allow for identification of 5hmC modified regions.86, 87 An alternative method for 5hmC profiling utilizes differential TET enzyme-mediated oxidation; in this assay, 5hmC is converted to β-glucosyl-5-hydroxymethylcytosine (g5hmC) by β-glucosyltransferase, and is protected from oxidation by TET. Consequently, 5hmC is sequenced as cytosine after bisulfite conversion, whereas 5mC which becomes oxidized to 5caC is sequenced as thymine.88, 89
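
The read-outs described above can be summarized in a lookup, together with the subtraction used to infer 5hmC from paired conventional and oxidative bisulfite data. This is a simplified Python sketch: the dictionary keys and function name are hypothetical labels, and clamping negative differences to zero is an assumption made here for illustration rather than a recommendation from the source.

```python
# Base read after sequencing for each cytosine state under each chemistry,
# as described in the text ("C" = read as cytosine, "T" = read as thymine).
READ_AS = {
    "bisulfite": {"C": "T", "5mC": "C", "5hmC": "C"},
    "oxidative_bisulfite": {"C": "T", "5mC": "C", "5hmC": "T"},
    "tet_assisted_bisulfite": {"C": "T", "5mC": "T", "5hmC": "C"},
}


def estimate_5mc_5hmc(bs_c_fraction, oxbs_c_fraction):
    """Estimate 5mC and 5hmC levels at one site from paired experiments.

    bs_c_fraction:   fraction of reads showing C after conventional bisulfite
                     treatment (5mC + 5hmC).
    oxbs_c_fraction: fraction of reads showing C after oxidative bisulfite
                     treatment (5mC only).
    Sampling noise can make the difference negative, so it is clamped at zero.
    """
    five_mc = oxbs_c_fraction
    five_hmc = max(0.0, bs_c_fraction - oxbs_c_fraction)
    return five_mc, five_hmc


if __name__ == "__main__":
    print(READ_AS["oxidative_bisulfite"]["5hmC"])
    print(estimate_5mc_5hmc(0.75, 0.5))
```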

Protocols for single-cell bisulfite sequencing (scBS-seq) have been developed, consisting of single-cell isolation into plate wells, prior to bisulfite conversion and library construction.90 As a proof of principle, these methods have been used to interrogate DNA methylation patterns in ESC.91, 92

WGBS remains prohibitively costly for high-throughput experiments, principally because very large CpG-depleted genomic regions consume sequencing reads. A complementary approach, reduced representation bisulfite sequencing (RRBS), employs DNA digestion with methylation-insensitive restriction enzymes to select CpG-rich regions of the genome, prior to bisulfite conversion, PCR amplification, and sequencing.93 This method covers only around 10%-15% of the genome and provides a biased view of the genome as a result of cleavage at restriction enzyme-sensitive sites. While RRBS offers very limited coverage of non-CpG island regions, it does offer single-base resolution data on areas dense in CpG sites. RRBS has also been adapted to study DNA methylation patterns in single cells: single-cell reduced representation bisulfite sequencing (scRRBS) has been used to decipher heterogeneity among ESC.94-96
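
The enrichment step in RRBS can be sketched as an in-silico digest. The Python example below assumes the enzyme is MspI, which cuts C^CGG regardless of CpG methylation and is commonly used for RRBS; the source does not name the enzyme, and the size window and toy sequence are arbitrary values chosen for illustration.

```python
def digest(seq, site="CCGG", cut_offset=1):
    """Cut a sequence at every occurrence of a restriction site.

    For MspI (C^CGG) the cut falls one base into the site, so every
    internal fragment begins with CGG and ends with C.
    """
    cut_points = []
    start = seq.find(site)
    while start != -1:
        cut_points.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cut_points + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, min_len, max_len):
    """Keep fragments within a length window, mimicking gel or bead selection."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    toy_sequence = "AAACCGGTTTTCCGGAA"
    fragments = digest(toy_sequence)
    print(fragments)
    print(size_select(fragments, min_len=6, max_len=10))
```

Because each retained fragment is bounded by restriction sites that themselves contain a CpG, short fragments are disproportionately drawn from CpG-dense regions, which is the source of both the enrichment and the bias noted above.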

More recently, methods to simultaneously assay the single-cell methylome and transcriptome have been developed, relying on physical separation of RNA and genomic DNA. These include Switching Mechanism At the 5' end of the RNA Template (Smart)-RRBS and scMT-seq, which combine the Smart-seq2 whole transcriptome method97 and RRBS,98, 99 and scM&T-seq, which adapts the existing G&T-seq approach100 with a bisulfite conversion step for genomic DNA.100, 101 Single-cell methylation profiling approaches suffer from limited throughput and a high cost per cell. In an attempt to increase throughput, Mulqueen et al102 leveraged combinatorial indexing in a method termed sci-MET. Using this approach, they applied WGBS to cell lines and murine primary cortical nuclei, generating 3282 scBS libraries overall. These data illustrated the potential of sc-WGBS to distinguish cell types in primary tissue in a manner amenable to scaling.
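
Combinatorial indexing labels each cell by the combination of barcodes it receives over successive rounds, rather than by isolating every cell physically. The Python sketch below gives a rough sense of why this scales; it assumes uniform random assignment of cells to barcode combinations, and the round sizes and cell numbers are hypothetical values, not parameters reported for sci-MET.

```python
def barcode_space(barcodes_per_round):
    """Number of distinct barcode combinations across all indexing rounds."""
    total = 1
    for count in barcodes_per_round:
        total *= count
    return total


def expected_collision_fraction(n_cells, n_combinations):
    """Probability that a given cell shares its combination with another cell.

    Assumes each cell receives one of n_combinations labels uniformly at
    random and independently of the other cells (a simplification).
    """
    if n_cells < 1:
        return 0.0
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)


if __name__ == "__main__":
    combos = barcode_space([96, 96])  # two rounds of 96 barcodes (assumed)
    print(combos)
    print(expected_collision_fraction(1000, combos))
```

Under these assumptions, adding a round multiplies the label space, so the collision fraction for a fixed number of cells falls sharply without any increase in the number of physical compartments per round.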

Methylation-sensitive restriction enzyme (MSRE)-based approaches take advantage of restriction endonucleases which are sensitive to base methylation status. Here, libraries can be generated after cleavage of DNA by restriction enzymes unable to cleave methylated-cytosine bases. Methylated regions will remain intact after DNA cleavage. These fragments are amplified by PCR and sequenced. In contrast to bisulfite sequencing, this method does not offer single-base resolution.103

A hybrid approach, methylation-sensitive restriction enzyme bisulfite sequencing (MREBS), builds on the strengths of MSRE and RRBS, and extends the coverage of RRBS to a larger fraction of the genome.104

Using an affinity purification approach, 5mC specific antibodies105 or methyl-CpG binding protein106 can be used to enrich regions of the genome which are highly methylated. After immunoprecipitation, fragments enriched for 5mC can be assayed by array hybridization or next-generation sequencing. This method is highly economical but biases toward hypermethylated CpG rich regions of the genome.107

A further method providing a low-cost and high-throughput view of the methylome uses bisulfite conversion of genomic DNA followed by PCR amplification and hybridization to a microarray. This method, implemented in the Illumina Infinium MethylationEPIC (EPIC) BeadChip, generates data on 850 000 methylation sites across the genome, and builds on its 450 000-site predecessor, the HumanMethylation450 (HM450) BeadChip.108 Although this does not provide single-base resolution, it does offer coverage of 95% of CpG islands and is well suited to high-throughput approaches assaying methylation variation in population studies.
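
Array data of this kind are commonly summarized per site as a beta value, the ratio of methylated probe intensity to total intensity. The source does not describe this step, so the Python sketch below is an assumption-laden illustration: the function name is hypothetical, and the offset of 100 is a widely used default for stabilizing low-intensity probes rather than a value taken from the text.

```python
def beta_value(methylated_intensity, unmethylated_intensity, offset=100.0):
    """Return the methylation level at one array site, between 0 and 1.

    beta = M / (M + U + offset), where M and U are the methylated and
    unmethylated probe intensities. The offset keeps the ratio stable
    when both intensities are close to zero.
    """
    if methylated_intensity < 0 or unmethylated_intensity < 0:
        raise ValueError("intensities must be non-negative")
    total = methylated_intensity + unmethylated_intensity + offset
    return methylated_intensity / total


if __name__ == "__main__":
    print(beta_value(900.0, 0.0))     # mostly methylated site
    print(beta_value(50.0, 4850.0))   # mostly unmethylated site
```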

Long-read methods, including nanopore sequencing and single-molecule real-time (SMRT) sequencing, have also been used to assay methylation status genome-wide without requiring bisulfite conversion, via picoampere signal intensities corresponding to modified bases (nanopore)109 or variation in polymerase kinetic activity (SMRT sequencing).110 Although these methods generate data with a higher error rate and modest throughput, they provide much longer reads, allowing more efficient interrogation of methylome haplotypes111 and delineation of methylation status at repetitive elements and structural variants.112

4.2 Histone modifications

Genome-wide profiles of chromatin modifications can be routinely assayed using chromatin immunoprecipitation followed by sequencing (ChIP-seq) (Figure 2B). This method utilizes protein affinity purification using an antibody specific for a chromatin post-translational modification or another DNA-binding protein such as a transcription factor. Following crosslinking of DNA-protein complexes, fragmentation, and exonuclease treatment, DNA-protein complexes are immunoprecipitated, and enriched DNA fragments are sequenced using NGS.113-116 This method has been extremely widely used to profile genomic regions associated with transcription factors, and a broad range of histone acetylation and methylation states.

Attempts to generate ChIP-seq data at single-cell resolution have used droplet-encapsulation microfluidics to perform massively parallel DNA barcoding followed by NGS. Rotem et al117 profiled H3K4me3 and H3K4me2 marks in mouse ESC, embryonic fibroblasts (EF), and hematopoietic progenitors. The method ligates DNA barcodes to chromatin-associated DNA fragments generated after MNase treatment, providing an index link to the cell of origin. The method then proceeds to immunoprecipitation on a pooled sample, and sequencing of enriched DNA fragments. While these data are extremely sparse, the histone modification profiles generated readily distinguished ESC and EF, and revealed heterogeneity among the gene regulatory programmes of ESC. Building on this droplet microfluidics-based approach, Grosselin et al118 profiled H3K4me3 and H3K27me3 marks firstly in human B- and T lymphocytes, before turning to murine stromal cells. In a patient-derived xenograft model of triple-negative breast cancer, they were able to detect rare untreated cells bearing the same H3K27me3 repressive pattern as drug-resistant cancer cells. They speculated that this may represent epigenetic priming of cancer cells toward treatment resistance, a feature not readily identifiable by gene expression profiling.
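
Because immunoprecipitation is performed on pooled chromatin, the barcode is the only link between a sequenced fragment and its cell of origin. The Python sketch below illustrates that demultiplexing step under stated assumptions: fragments are taken to be already aligned and supplied as hypothetical (barcode, chromosome, position) tuples, and the bin size is arbitrary.

```python
from collections import Counter, defaultdict


def bin_counts_per_cell(fragments, bin_size=5000):
    """Group pooled fragments by cell barcode and count them per genomic bin.

    fragments: iterable of (barcode, chromosome, position) tuples.
    Returns {barcode: Counter({(chromosome, bin_index): count})}.
    """
    counts = defaultdict(Counter)
    for barcode, chromosome, position in fragments:
        counts[barcode][(chromosome, position // bin_size)] += 1
    return counts


if __name__ == "__main__":
    pooled = [
        ("AAGT", "chr1", 1200),
        ("AAGT", "chr1", 4800),
        ("CCTA", "chr1", 1300),
        ("AAGT", "chr2", 9000),
    ]
    for barcode, bins in bin_counts_per_cell(pooled).items():
        print(barcode, dict(bins))
```

Each cell contributes only a handful of fragments, so the resulting per-cell vectors are mostly zeros; this is the sparsity referred to above, and it is why cells are usually clustered or aggregated before profiles are interpreted.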

Single-cell or low-input ChIP-seq is technically very challenging, largely as a result of non-specific immunoprecipitation. An alternative approach termed CoBATCH dispenses with chromatin immunoprecipitation, and instead utilizes a protein A-Tn5 transposase fusion protein to perform in situ targeted tagmentation at sites bound by specific antibodies, before a round of combinatorial indexing. Using this method, Wang et al119 profiled 2758 endothelial cells sorted from mouse embryonic organs for the H3K27ac mark to define regions with active enhancers, identifying numerous organ-specific H3K27ac profiles at transcription factor enhancers, for example, Hoxa11 in the kidney.

In 2017, Skene and Henikoff developed Cleavage Under Targets & Release Using Nuclease (CUT&RUN) for mapping interactions between DNA and proteins. This method offers technical advantages over ChIP-seq, such as avoiding false-positive sites that arise from cross-linking. In common with ChIP-seq, this method uses transcription factor-specific antibodies. Nuclei are immobilized using concanavalin-A coated magnetic beads. After the addition of a protein A-MNase fusion protein and free Ca2+, DNA cleavage occurs at antibody-associated sites, releasing transcription factor-associated fragments that diffuse out of the nucleus into the supernatant, from which they are collected for library preparation and NGS. The method offers lower background signal and reduced sequencing requirements, and hence represents a cost-effective alternative to ChIP-seq.120 The same group followed the CUT&RUN method with Cleavage Under Targets and Tagmentation (CUT&Tag). Similarly to CoBATCH, this method uses a protein A-Tn5 transposase fusion protein and generates fragment libraries at antibody-targeted sites, allowing the profiling of histone modifications and transcription factor-associated DNA in low-input material and single cells.121 By virtue of the use of a Tn5 transposase, this method generates a low-level Assay for Transposase-Accessible Chromatin using sequencing (ATACseq) signal in addition to a strong protein-enriched signal, therefore offering the potential for joint delineation of chromatin accessibility and binding protein-associated profiles. Future developments are likely to extend this method to probe multiple histone modifications and transcription factors via adapter barcoding.
The utility of a protein A-Tn5 transposase fusion protein in generating single-cell histone modification maps has been replicated by Carter et al.122 Although a single-cell method that targets MNase to specific antibody-bound sites successfully profiled H3K4me3 in 242 cells, this approach lacks scalability.123 The CUT&Tag approach has since been adapted to commercial, high-throughput droplet microfluidics single-cell ATACseq (scATACseq) protocols, generating datasets on histone modification patterns in the murine central nervous system.124

4.3 Chromatin accessibility

Mapping regions of open chromatin across the genome was initially achieved by identifying sites sensitive to deoxyribonuclease I (DNase I)125, 126 (Figure 2B), a nuclease that preferentially cleaves DNA at phosphodiester linkages adjacent to pyrimidine nucleotides. With the advent of high-throughput sequencing, investigators were able to map DNase I cleavage regions throughout the genome, an approach termed DNase-seq.127, 128 DNase sensitivity maps have been instrumental in efforts to produce comprehensive inventories of DNA elements, exemplified by the ENCODE project.129-131

Genome-wide profiling of chromatin accessibility has been enabled through the use of Tn5 transposase loaded with sequencing adapters in the ATACseq method. The enzyme inserts adapters at regions of open chromatin and generates DNA fragments for downstream NGS profiling.132 Recent improvements to the ATACseq protocol, namely Omni-ATAC and Fast-ATAC, have been shown to substantially reduce background noise in diverse cell lines and tissue types.133, 134 Omni-ATAC also worked on frozen sample blocks, which were historically difficult to assay.133 Fast-ATAC is an optimized ATACseq protocol for blood cells, which produced high-quality data with reduced noise.134 ATACseq has been adapted to a high-throughput single-cell method, initially as a nano-well based method,135 before its evolution to a massively parallel droplet-microfluidics implementation capable of profiling hundreds of thousands of cells simultaneously.
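
One common way to put a number on background noise in accessibility data is the fraction of fragments that fall within called peaks, often abbreviated FRiP. The sketch below is illustrative only: the text above does not state which metric the cited protocols were evaluated with, and the interval representation (half-open chromosome, start, end tuples) and function names are assumptions made for this example.

```python
def overlaps(fragment, peak):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return (
        fragment[0] == peak[0]
        and fragment[1] < peak[2]
        and peak[1] < fragment[2]
    )


def fraction_in_peaks(fragments, peaks):
    """Fraction of fragments overlapping any peak; higher values suggest
    less background signal outside accessible regions."""
    if not fragments:
        return 0.0
    in_peaks = sum(1 for f in fragments if any(overlaps(f, p) for p in peaks))
    return in_peaks / len(fragments)


# Toy data: one peak on chr1, four fragments, two of which overlap the peak.
toy_peaks = [("chr1", 150, 250)]
toy_fragments = [
    ("chr1", 100, 200),  # overlaps the peak
    ("chr1", 500, 600),  # same chromosome, outside the peak
    ("chr2", 100, 200),  # different chromosome
    ("chr1", 190, 260),  # overlaps the peak
]
toy_frip = fraction_in_peaks(toy_fragments, toy_peaks)
```

The nested loop is adequate for a toy example; real datasets with millions of fragments would typically use an interval tree or a sorted sweep instead.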
