Super-silencer perturbation by EZH2 and REST inhibition leads to large loss of chromatin interactions and reduction in cancer growth

Cell culture

Human chronic myelogenous leukemia cell line K562 and human leukemia monocytic cell line THP1 were cultured in RPMI-1640 supplemented with 10% FBS and 1% penicillin–streptomycin. HAP1, a near-haploid human leukemic cancer cell line, was cultured in Iscove’s modified Dulbecco’s medium supplemented with 10% FBS and 1% penicillin–streptomycin. The SEM CTCFAID cell line was a kind gift from the lab of L. Chunliang37. SEM cells were cultured in RPMI-1640, 2 mM l-glutamine, 10% FBS and 1% penicillin–streptomycin. All cultures were maintained at 37 °C, 5% CO2 in a humidified incubator.

CRISPR excision

CRISPR excision was performed using the all-in-one CRISPR–Cas9 vector system, as described previously46. Briefly, two guide RNAs (gRNAs) were designed for each region using the website (https://portals.broadinstitute.org/gppx/crispick/public)47. Single gRNA was cloned into either a pX330A or a pX330S vector (gift from L. Shang, pX330A modified to include green fluorescent protein (GFP) reporter marker) followed by Sanger sequencing. Golden Gate assembly of two gRNAs was performed to insert them into the pX330A backbone. Two positive gRNA insertion plasmids were confirmed by Sanger sequencing using CRISPR-step2-F and CRISPR-step2-R primers (Supplementary Table 7). Next, the plasmid was electroporated into the K562 cell line using the Neon transfection system (Thermo Fisher). After 48 h, transfected cells underwent fluorescence-activated cell sorting into 96-well plates as single-cell colonies based on GFP signal.

Cells were harvested from each clone and genotyping was performed using an internal and flanking primer pair. Successful clones were confirmed through agarose gel electrophoresis and Sanger sequencing (First Base). All primers used for genotyping are listed in Supplementary Table 7.

RNA extraction, reverse transcription (RT)–qPCR and RNA-seq

Total RNA was isolated from cells using the RNeasy Mini Kit (Qiagen) and on-column DNase digestion (Qiagen) was performed. RNA concentrations were determined by Nanodrop ND1000 (Thermo Scientific) and 1 µg of total RNA was then reverse-transcribed to complementary DNA using the SuperScript III first-strand synthesis system (Invitrogen). qPCR was then performed using the Applied Biosystems QuantStudio 3/5 real-time PCR system using SYBR green PCR master mix and appropriate primers (Supplementary Table 7). Gene transcript levels were analyzed using the 2−ΔΔCt method48. For RNA-seq, quality of the extracted RNA was analyzed using an Agilent RNA 6000 Nano Kit (Agilent). Library construction was performed using TruSeq Stranded Total RNA LT (with Ribo-Zero Gold) Set A (Illumina), as per the protocol. Libraries were then sequenced using the Illumina HiSeq4000 platform. Two biological replicates were performed for each condition.

Adhesion assay

The cell adhesion assay was performed using the CytoSelect 48-well adhesion assay (Cell Biolabs), according to the manufacturer’s protocol. Briefly, a cell suspension (5 × 105 cells in 200 ml of FBS-free medium) was added to fibronectin-coated wells and BSA-coated wells (negative control). After 3 h of incubation, cells were washed with PBS, stained with crystal violet and eluted with extraction solution. Adhesion levels were quantified by optical absorbance at 560 nm using the Tecan plate reader. Three biological replicates were performed for each condition.

Growth curve assay

A total of 1,000 cells per well were seeded in 96-well plates. Cell growth was measured on day 0, day 1, day 2, day 4 and day 5 using a CellTiter-Glo assay kit (Promega, G7571). Luminescence was read on a Tecan plate reader. Three biological replicates were performed for each condition.

Xenograft experiments

We complied with all relevant ethical regulations and all animal studies were carried out in accordance with animal care and use guidelines approved by Biological Resource Center.

The present study was conducted with 6–8-week-old female CB17 SCID mice. The mice (n = 5) were injected subcutaneously. All mice were monitored for tumor growth at the inoculation site and tumor volume was measured using a Vernier caliper twice weekly for 40 days or until the tumor volume reached 1,000 mm3, whichever came first. Tumor volume was calculated using the formula V = a × b2 × 0.52, where a represents the largest diameter and b represents the smallest diameter of the tumor.

The xenograft model presented in Fig. 1f was generated using the following method: one of the DKO CRISPR clones (‘DKO-C1’) and its EV were injected into the same mice (left side and right side), while another DKO CRISPR clone (‘DKO-C2’) and its EV were injected into the same mice (left side and right side) to reduce the tumor growth differences in different mice.

The xenograft model presented in Fig. 6e was generated using the following method: NSG mice (NOD scid gamma mice), aged 10–13 weeks, were used in the study. Five mice (three male and two female) were assigned to each group treated with vehicle and GR. In contrast, four mice (two male and two female) were assigned to each group treated with GSK343 and X5050. All mice were inoculated subcutaneously in the left flank with 5 × 106 K562 cells and randomized into four groups: vehicle, GSK343, X5050 and GR. In groups treated with a single agent or combination treatment, the same doses of individual drugs (0.2 mg kg−1 GSK343 and 0.5 mg kg−1 X5050) were administered by intraperitoneal injection twice weekly. Tumor sizes were serially measured using a digital caliper and tumor volumes were estimated according to the equation V = 1/2(length × width2). At the endpoint, individual tumors were excised, weighed and recorded by photograph.

The serially transplantable PDX model presented in Fig. 6f was generated using the following method: 4 × 106 AML29 PDX human leukemia cells carrying the FLT3 internal tandem duplication mutation were transplanted into 2-Gy-irradiated 10–12-week-old NSGS mice by intravenous injection and randomized into four groups (vehicle, GSK343, X5050 and GR; n = 4 mice each group, one male and three female). In groups treated with a single agent or combination treatment, the same doses of individual drugs (0.1 mg kg−1 GSK343 and 0.25 mg kg−1 X5050) were administered by intraperitoneal injection twice weekly, starting from 2 days after transplantation and continuing for 7 weeks. At the endpoint, the spleen was excised and weighed. Cells in the bone marrow, spleen and peripheral blood were isolated and analyzed by flow cytometry.

siRNA knockdown experiment

siRNAs (Thermo Fisher) were electroporated into DKO cells using the Neo transfection system (Thermo Fisher) according to the manufacturer’s instructions. FGF18 siRNA-2 (5′-GAGACGGAAUUCUACCUGUTT-3′) and FGF18 siRNA-3 (5′-AGACACCUUCGGUAGUCAATT-3′) were used for siRNAs targeting FGF18 genes. After 48 h of incubation, cells were harvested for RT–qPCR and seeding for growth curve measurement. CTCF siRNA knockdown was performed using ON-TARGETplus SMARTpool human CTCF (10664) siRNA (Dharmacon L-020165-00-0005). After 48 h of incubation, cells were harvested for western blot and Hi-C. Primers used for RT–qPCR are listed in Supplementary Table 7.

Short hairpin RNA (shRNA) knockdown experiment

Two shRNAs against the REST gene, shREST-3 (5′-CCGGGCATCCTACTTGTCCTAATACTCGAGTATTAGGACAAGTAGGATGCTTTTTG-3′) and shREST-4 (5′-CCGGGCCTCTAATCAACATGAAGTACTCGAGTACTTCATGTTGATTAGAGGCTTTTTG-3′), were cloned into the pLKO.1-TRC vector (Addgene, 10878). Plasmids were transfected into K562 cells by lentiviral transduction. Stable REST knockdown K562 cells were selected with puromycin for 3 days.

4C-seq

4C-seq was performed as previously described but with slight modifications49. Briefly, 40 million cells were crosslinked with 1% formaldehyde. Nuclear pellets were isolated by cold lysis buffer (10 mM Tris-HCl, 10 mM NaCl, 5 mM EDTA and 0.5% NP-40) supplemented with protease inhibitors (Roche). The first digestion was performed overnight at 37 °C with HindIII enzyme (New England Biolabs). Digestion efficiency was measured by gel electrophoresis. DNA was ligated overnight at 16 °C by T4 DNA ligase (Thermo Scientific), decrosslinked by proteinase K and extracted by phenol–chloroform, referred to as the ‘3C library’. The 3C library was then processed overnight at 37 °C for a second digestion with DpnII enzyme (New England Biolabs). After ligation, ‘4C template DNA’ was obtained. The concentration of 4C DNA was determined by Qubit assays (Thermo Scientific). The 4C template DNA was then amplified using specific primers with Illumina Nextera adaptors and sent for sequencing on the MiSeq platform. Primers used for 4C amplification are listed in Supplementary Table 7.

3C-PCR

The 3C library was generated as described above. Briefly, 600–800 ng of the 3C library was measured by Qubit assays (Thermo Scientific) and subjected to the following PCR protocol with specific 3C-PCR primers: 98 °C for 3 min, then 33 cycles of 94 °C for 1 min, 60 °C for 1 min and 72 °C for 20 s and, finally, 72 °C for 10 min. PCR products were run on 1.5% agarose gels. Primers were designed for 3C-PCR following the unidirectional strategy50. After gel electrophoresis, bands corresponding to expected products were gel-excised and purified (Qiagen) for Sanger sequencing. Band intensities were measured by Image Lab. Primers used are listed in Supplementary Table 7. At least two biological replicates were performed for 3C analyses.

Hi-C experiment and library preparation

One million cells were fixed and used for the Hi-C experiment using the Arima-Hi-C kit (Arima Genomics) according to the manufacturer’s protocol. Libraries were prepared using the KAPA Hyper-Prep kit (KAPA), according to the Arima-Hi-C kit protocol. Final Hi-C libraries were sequenced as 2 × 150-bp paired-end reads on the Illumina HiSeq 4000 platform.

The Hi-C experiment was performed for different KO cells, single treatment and GR treatment, siCTCF-treated cells and etoposide-treated cells. Two biological replicates were performed for each condition.

Chromatin immunoprecipitation with sequencing (ChIP-seq) and ChIP-qPCR

ChIP-seq was performed according to Robertson et al. with slight modifications51. Briefly, cells were fixed with 1% formaldehyde (Thermo Scientific) and lysed in 1% SDS lysis buffer supplemented with protease inhibitor cocktail tablet (Roche) and sonicated using Bioruptor (Diagenode).

Cell lysate was precleared by centrifuging in dilution buffer and incubating with protein G Dynabeads (Invitrogen) overnight at 4 °C. Precleared lysate was added into the antibody-conjugated beads and incubated overnight at 4 °C with rotation. The incubated beads were then washed and added into the elution buffer with RNase A (Qiagen), followed by decrosslinking with proteinase K (Ambion) overnight at 37 °C. ChIP DNA was extracted with the QIAquick PCR purification kit (Qiagen) and quantitated using the Qubit high-sensitivity double-stranded DNA assay (Invitrogen).

The ChIP-seq library was prepared using the ThruPLEX DNA-seq 48D kit (Rubicon) in accordance with the instructions and the prepared library was then sequenced using the Illumina HiSeq 4000 platform. Antibodies used included anti-H3K27me3 (9733S, Cell Signaling Technology), anti-H3K27ac (ab4729, abcam) and mouse IgG (sc-2025, Santa Cruz Technology). In total, 3.5 µg of antibodies were used for each ChIP. Two biological replicates were performed for each condition.

ChIP-qPCR reactions were performed using the QuantStudio 5 quantitative PCR machine (Life Technologies) with GoTaq qPCR master mix (Promega) for at least three ChIP DNA replicates. Primers used are listed in Supplementary Table 7.

Protein extraction and western blot

Cells were lysed using radioimmunoprecipitation assay buffer (Sigma-Aldrich) with protease inhibitor cocktail (Life Technologies) at 4 °C for 30 min. The cell lysate was centrifuged to remove cell debris and protein concentration was determined using a Pierce BCA protein assay kit (Thermo Scientific). In total, 20 µg of protein was separated in 4–20% Mini-PROTEAN TGX precast gels (Bio-Rad) and transferred to a PVDF membrane. After blocking with TBST containing 5% nonfat dried milk at room temperature for 1 h, the membrane was washed and incubated overnight with the following primary antibodies (1:1,000 in 5 ml of 5% nonfat milk in TBST) at 4 °C: anti-β-actin (ab6276, abcam; sc-47778, Santa Cruz Biotechnology), anti-REST (17-641, Merck Millipore; 22242-1-AP, Proteintech), anti-cleaved PARP (9541S, Cell Signaling Technology), anti-γH2AX (JBW301, Merck Millipore) and anti-CTCF (sc-271474, Santa Cruz Biotechnology). The membrane was washed and then incubated at room temperature for 1 h with the following horseradish peroxidase-conjugated secondary antibodies: mouse antirabbit IgG (sc-2357, Santa Cruz Biotechnology) or goat antimouse IgG (sc-2031, Santa Cruz Biotechnology) (1:10,000 in 5 ml of 5% nonfat milk in TBST). After extensive washing, bands were detected by chemiluminescence reagent (Bio-Rad) and imaged using the ChemiDoc imaging system (Bio-Rad). Protein levels were calculated by Image Lab. Three biological replicates were performed for each condition.

Drug combination assay

K562 and SEM cells were seeded on 96-well culture plates at a density of 4,000 cells per well and treated with GSK343 (0, 2, 5 and 7 µM) and X5050 (0, 1, 5, 20, 50 and 100 µM) for 72 h as a single agent or in combination. To rapidly deplete CTCF in SEM cells, cells were seeded as described above37 and pretreated with IAA (50 µM) and doxycycline (1 µg ml−1) for 48 h before the addition of GSK343 and/or X5050. Cell viability was measured using the CellTiter-Glo assay kit (Promega, G7571). Luminescence was read on a Tecan plate reader. The dose–response matrix of GSK343 and X5050 in K562 and SEM cells was shown as the percentage inhibition. The Bliss synergy score was calculated using SynergyFinder 3.0 (https://synergyfinder.fimm.fi/)52. Three biological replicates were performed for each condition.

Cell-cycle analysis

Cell-cycle analysis was performed 24 h after drug treatments with GSK343 and/or X5050 using the BrdU flow kit (BD Pharmigen) according to manufacturer’s instructions. Briefly, K562 cells were labeled with BrdU for 2 h before fixation and incubated for 30 min at room temperature with 200 ng ml−1 RNase A. Next, DNA was stained with 25 μg ml−1 PI-conjugated and FITC-conjugated anti-BrdU antibody and subjected to flow cytometry LSRII (BD Biosciences). Three biological replicates were performed for each condition.

The gating strategy for cell-cycle analysis was based on the measurement of DNA content by staining with PI and BrdU staining. First, the cells were gated on the basis of cell size and granularity by plotting the side scatter area (SSC-A) against the forward scatter area (FSC-A). Next, the side scatter width (SSC-W) versus side scatter height (SSC-H) plots were used to determine single-cell populations. Finally, four gates were defined according to the BrdU and PI staining. All subpopulations were analyzed on the BrdU versus PI scatter for S, subG1, G0/G1 and G2/M phases of the cell cycle.

Annexin V and PI apoptosis assay

The annexin V and PI apoptosis assay was performed on K562 cells treated with DMSO, GSK343, X5050 or GR according to manufacturer’s protocol (FITC annexin V apoptosis detection kit with PI, BioLegend)53. Briefly, the K562 cells were washed twice with cold BioLegend’s cell staining buffer and then resuspended in 100 µl of Annexin V binding buffer at a concentration of 0.5 × 106 cells per ml. Cells were then mixed and 100 µl of cell suspension was passed through a cell strainer, followed by the addition of 5 µl of FITC annexin V and 10 µl of PI staining solution. After incubating for 15 min, 400 µl of annexin V binding buffer was added and the samples were analyzed using a BD LSRII flow cytometer (BD Biosciences). Each sample was tested in triplicate.

The gating strategy for annexin V and PI apoptosis analysis was based on a positive control consisting of 50% live cells and 50% cells incubated at 65 °C for 20 min (late-apoptotic cells). Next, the cells were gated on SSC-A versus FSC-A to select the cell population. The cells were gated on SSC-W versus SSC-H in the perspective of cell granularity. Then, the cells were gated on forward scatter width (FSC-W) versus forward scatter height (FSC-H) in the perspective of cell size. Finally, all the subpopulations were analyzed on the annexin V–FITC versus PI scatter for live, early-apoptotic, late-apoptotic and necrotic cells. The boundaries of staining between live cells and late-apoptotic cells were defined at 1,000 for both FITC–annexin V and PI fluorophores.

Colony formation assay

K562 cells were plated in triplicate at a density of 200 cells per dish in 35-mm dishes containing methylcellulose semisolid medium Methocult H4034 (StemCell Technologies). The cells were treated with DMSO, GSK343, X5050 or GR. After 14 days of incubation, colonies were counted using an SZX12 Olympus microscopy (Olympus). Three biological replicates were performed for each condition.

ATAC-seq

ATAC-seq was performed using the Diagenode ATAC-seq kit (Diagenode) following the manufacturer’s protocol54. Briefly, 200,000 EV, S1KO, S2KO and DKO K562 cells per sample were harvested and nuclei were isolated. Immediately following the nuclear preparation, the pellet was resuspended in 50 µl of tagmentation mix (25 µl of 2× tagmentation buffer, 2.5 µl of tagmentase, 0.01% digitonin, 0.1% Tween 20, 16.5 µl of PBS and 5.25 µl of nuclease-free water). The transposition reaction was carried out for 30 min at 37 °C. The tagmented DNA was then purified and eluted in 12 µl of DNA elution buffer. Following purification, we amplified library fragments using 10 µl of tagmented DNA and 1 µl of primer pair provided by the 24 UDI for Tagmented libraries set I (Diagenode) using the following PCR conditions: 72 °C for 5 min, 98 °C for 30 s and thermocycling at 98 °C for 10 s, 63 °C for 30 s and 72 °C for 1 min. We first amplified the libraries for five cycles, after which we took 5 µl of amplified DNA and added 10 µl of qPCR mix with SYBR green. We ran this reaction for 20 cycles to determine the additional number of cycles needed for the remaining 45-µl reaction. The libraries were amplified for an additional 6–11 cycles and cleaned up using AMPure XP beads (Beckman Coulter). The indexed libraries were sequenced with 2 × 50-bp paired-end reads on the HiSeq X Ten platform.

The same ATAC-seq protocol was used for K562 cells treated with 0.1% DMSO or GR for 72 h with some modifications. Briefly, 50,000 nuclei were prepared and the final libraries were sequenced with 2 × 50-bp paired-end reads on the NovaSeq 6000 platform. Two biological replicates were performed for each condition.

CUT&RUN assay

The CUT&RUN assay was performed for CTCF and H3K27me3 using the CUT&RUN kit according to the manufacturer’s instructions (EpiCypher)25,55. All experiments were performed using 500,000 cells, 0.5 µg of antibody and 10 µl of activated concanavalin A beads per sample. For CTCF, EV, S1KO, S2KO and DKO, K562 cells were harvested and bound to activated concanavalin A beads. Cells were subsequently permeabilized with 0.01% digitonin and incubated with rabbit IgG negative control (13-0042K, EpiCypher) or CTCF antibodies (ab188408, abcam) at 4 °C overnight.

For H3K27me3, K562 and THP1 cells were treated with 0.1% DMSO or GR for 24 h. The number of viable cells was counted and 500,000 cells per sample were harvested. Cells were incubated with H3K27me3 antibodies (9733S, Cell Signaling Technology) at 4 °C overnight. The K-MetStat panel was spiked into each sample as an internal control for anti-H3K27me3 antibody specificity. All libraries were amplified using 14 cycles on the thermocycler and validated using the Agilent high-sensitivity DNA kit. The final libraries were size-selected using AMPure XP beads (Beckman Coulter) twice to remove DNA fragments < 150 bp. The indexed libraries for CTCF and H3K27me3 were sequenced with 2 × 150-bp paired-end reads on the Illumina NovaSeq 6000 platform. Two biological replicates were performed for each condition.

Recombinant human FGF18 assay

K562 cells were counted and seeded (4,000 cells per well) in 96-well plates. Cells were serum-starved for 24 h before the addition of rhFGF18. Cells were treated either with sterile water or increasing concentrations of rhFGF18 (0, 10, 50, 100 and 200 ng ml−1). After 72 h of treatment, viable cells were measured with the CellTiter-Glo assay kit (Promega, G7571). Luminescence was read on a Tecan plate reader. For the cycloheximide treatment, the cells were treated with sterile water or 200 ng ml−1 of rhFGF18 in the presence or absence of cycloheximide (25 µg ml−1). After 72 h, cells were harvested for RT–qPCR as described above. Three biological replicates were performed for each condition.

RNA-seq, ChIP-seq and 4C data analysis

Two replicates of the RNA-seq libraries for EV, S1KO, S2KO and DKO were quantified using kallisto (0.45.1)56. The DEG analysis was performed using sleuth (0.30.0)57.

RNA-seq data for the single treatment, GR treatment and GR time course treatment were analyzed by DESeq2 (1.38.3). Two biological replicates were performed for each condition. Briefly, adaptors were trimmed by trimmomatic (0.39)58 and reads were mapped to hg19 (http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/) by STAR (2.7.3a)59. Read normalization and differential expression analysis were performed using DESeq2 (1.38.3)60. DEGs were identified using a log2(fold change) between −1 and 1 and a P value less than 0.05.

Gene set enrichment analysis (GSEA) was performed using the GSEA pipeline from the Broad Institute44. ChIP-seq analysis was performed for H3K27me3 and H3K27ac in different KO cells. Briefly, adaptors were trimmed by trimmomatic (0.39)58 and reads were then mapped by Bowtie 2 (2.2.5)61 using default parameters in paired-end mode; we filtered out alignment with a mapq score smaller than 30. Two replicates of ChIP-seq were combined and peaks and bigWig files were generated by MACS2 (2.1.0.20150731)62 using the options ‘-q 0.01’ for H3K27ac and ‘--broad --broad-cutoff 0.1 -q 0.05’ for H3K27me3.

4C-seq analysis was performed for different KO cells. Briefly, 4C reads were trimmed at the HindIII digestion site using tagdust (2.33)63 and reads were mapped using Bowtie 2 (2.2.5)61 in the single-end mode with the option ‘--end-to-end’. Significant interactions were called against the HindIII-digested genome background using R3Cseq (1.24.0)45 with a cutoff P value of 0.05. Two replicates were performed for each 4C analysis and significant interactions from the two replicates were pooled. 4C interactions were drawn in arc style using Sushi (1.16.0)64 from Bioconductor.

Hi-C data analysis

Paired-end reads were mapped to the hg19 reference genome using BWA mem (0.7.17-r1188)65,66. The mapped reads were then passed to the Juicer pipeline (1.5.7) to perform chimeric read analysis, merging, Hi-C matrix generation, loop calling using HiCCUPS and TAD calling using Arrowhead43,67. Insulation scores were calculated by hicFindTADs from HiCExplorer (3.4.2/3.6)68 using a 10-kb-resolution matrix. The eigenvector values at 1-Mb and 50-kb resolution and APA at 5-kb resolution were performed using the eigenvector and APA functions of Juicer tools43,67. All calculations and visualizations of insulation scores were performed using computeMatrix reference-point and plotProfile from deepTools (3.5.1)69

We divided the TADs and loops into three categories (gained, lost and unchanged) according to the following principles (using GSK343-treated Hi-C as an example): (1) gained TADs were those TADs only found in GSK343-treated Hi-C libraries; (2) lost TADs were those TADs only found in DMSO-treated Hi-C libraries; and (3) unchanged TADs were those TADs found in both GSK343-treated and DMSO-treated Hi-C libraries.

Heat map and box plot of RPM signal and ChIP-seq signal at different 4C regions

We compared three different conditions (DKO versus EV, S1KO versus EV and DKO versus S1KO). The classification criteria were as follows: gained, 4C interactions only present in the experiment condition but not in the control condition; lost, 4C interactions only present in the control condition but not in experiment condition; unchanged, 4C interactions present in both control and experiment conditions. Only 4C interactions with a P value < 0.05 were considered in the comparison.

For the reads per million reads mapped (RPM) signal heat map, different types of 4C regions were first classified on the basis of their H3K27ac or H3K27me3 ChIP-seq signal levels in the control condition. Tertiles were used to assign these 4C regions into different categories (high, medium and low). The 4C intensity in RPM of each 4C region was shown in a color-scaled manner.

For the ChIP-seq signal heat map, deepTools computeMatrix (3.3.1)70 was used to calculate the ChIP-seq signal for different types of 4C regions and then the total signal areas were calculated using the following formula: total signal area = sum(Sig × BS), where BS represents the bin size used when summarizing the reads per kilobase per million mapped reads (RPKM) and Sig represents the ChIP-seq signal in RPKM for an individual bin. For the ChIP-seq signal box plot, the same 4C region in different conditions was connected using gray lines. A two-sided Wilcoxon signed-rank paired test was used and P values were indicated accordingly.

K562 H3K27me3 HiChIP and data analysis

HiChIP was performed using anti-H3K27me3 antibody in normal K562 cells. Five million cells were used for the HiChIP experiment using the Dovetail HiChIP MNase kit (Dovetail Genomics) according to the manufacturer’s protocol. Briefly, 1 µl of MNase enzyme mix and 1,250 ng of H3K27me3 antibody (9733S, Cell Signaling Technology) were used for each HiChIP. Libraries were prepared using the Dovetail primer set for Illumina and Dovetail library module for Illumina kits, according to the manufacturer’s protocol. The final HiChIP libraries were sequenced with 2 × 150-bp paired-end reads on the Illumina HiSeq 4000 platform. One biological replicate was used for K562 H3K27me3 HiChIP.

HiChIP analysis was performed by following the Dovetail website (https://hichip.readthedocs.io/en/latest/index.html) step by step. Paired-end reads were aligned to hg19 using BWA mem (0.7.17-r1188)65,66 with the parameter ‘-5SP -T0’. Next, the mapped reads were processed by pairtools (0.3.0) (https://github.com/mirnylab/pairtools) to obtain pairs and BAM files. The BAM files were then sorted to produce index files using SAMtools (1.10)71 for downstream analysis. The quality control (QC) report was generated using the get_qc.py and enrichment_stats.sh scripts in HiChIP (https://github.com/dovetail-genomics/HiChiP/blob/main/docs/source/before_you_begin.rst).

After passing QC, the pairs file was converted into a contact matrix in .hic format by Juicer tools67 and visualized in Juicebox (2.1.10)72. HiChIP loops were called by FitHiChIP (9.0)73 in a 25-kb bin size and the coverage of the whole genome was calculated by bamCoverage from deepTools (3.5.1)69. Then, results at certain regions were visualized using Sushi (1.16.0)64 in R.

ATAC-seq analysis

Adaptors were trimmed by TrimGalore (0.6.6) (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) and then aligned in paired-end mode using Bowtie 2 (2.3.5.1)61 with default parameters. The resulting SAM files were converted to BAM format and filtered to exclude reads with a mapq score below 30 using SAMtools (1.10)71. The BAM files were sorted and duplicate reads were removed using Picard (https://github.com/broadinstitute/picard, 2.27.5-4). After removing reads mapped to chrM, chrUn and ‘random’ chromosomes, BAM files from replicates were merged and then converted to bigWig files with RPKM normalization using deepTools bamCoverage (3.4.3)69 for data visualization on the University of California, Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu). Peaks were called using MACS2 (2.2.7.1)62 callpeak with the flag parameters ‘--shift -100 --extsize 200 --nomodel -B --SPMR -g hs’ and peaks contained in the ENCODE (v2) hg19 blacklist were removed.

CUT&RUN analysis

The paired-end reads were aligned with Bowtie 2 (2.3.5.1)61 using the flag parameters ‘--local --very-sensitive-local --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700’. The resulting SAM files were then converted to BAM files and sorted with SAMtools (1.10)71 and Picard (https://github.com/broadinstitute/picard, 2.27.5-4) software. After removing reads mapped to chrM, chrUn and random chromosomes, BAM files of replicates were sorted and combined using the SAMtools (1.10)71 sort and merge function. Merged BAM files were converted to bigWig files with RPKM normalization using deepTools bamCoverage (3.4.3)69 for data visualization on the UCSC Genome Browser (http://genome.ucsc.edu). Peaks were called using MACS2 (2.2.7.1)62 with the parameters ‘--nomodel --extsize mean fragment size of the sequencing library’ and only peaks not contained in the ENCODE (v2) hg19 blacklist were retained.

Disease-associated single-nucleotide polymorphism (SNP) analysis

Three different categories of SNP lists, namely disease-associated, cancer-associated and acute myeloid leukemia (AML)-associated or chronic myeloid leukemia (CML)-associated SNPs, were downloaded from a genome-wide association study (GWAS)74. ChIP-seq data of H3K27me3 in K562 (ENCSR000AKQ) downloaded from ENCODE were used to identify the SSs. Then, the overlapped ratio was calculated in both MRRs and SNPs. To confirm the enrichment of SNPs in MRRs, random data with the same number and size of regions as MRR peaks from the genome were created 1,000 times. Mean overlapped ratios between random data and SNPs were calculated and a one-sided one-sample t-test was used to compare the difference between random and MRR ratios.

ChIP-seq peaks of REST at different categories of TADs and loops

We calculated the number of REST ChIP-seq peaks (https://www.encodeproject.org/experiments/ENCSR137ZMQ/) at gained, lost and unchanged TAD regions in GSK343 versus DMSO. ChIP-seq peaks with at least one overlapped base pair with the TADs were counted and used to generate the violin plot by R (Fig. 3e).

Gene desert analysis

Intergenic regions were identified on the basis of known gene locations. If the size of intergenic regions was ≥500 kb, these regions were defined as gene deserts. TADs in the DMSO 72-h condition with more than 50% of their length overlapping with gene deserts were considered as TADs in gene deserts and compared to other TADs (TADs not in gene deserts). Specifically, we set the start and end positions of TADs extended 5 kb upstream and downstream as the TAD boundaries. Then, we calculated the mean insulation score differences of the TAD boundaries for two categories (TADs not in gene deserts and TADs in gene deserts) in the comparison of GR versus DMSO at 72 h. A box plot was generated to visualize the change in insulation scores. A two-tailed Wilcoxon test was used to evaluate the difference between all TADs and TADs in gene deserts.

The RNA-seq reads in DMSO and GR at 72 h were calculated in each 20-kb bin across the whole genome and gene deserts. These reads were then normalized by the total number of mapped reads in the whole genome as RNA signals. Box plots were generated to compare the normalized RNA-seq reads between the whole genome and gene deserts. A two-tailed Wilcoxon test was used to compare the difference between them.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

留言 (0)

沒有登入
gif