Transcriptomic dysregulation and autistic-like behaviors in Kmt2c haploinsufficient mice rescued by an LSD1 inhibitor

Establishment of Kmt2c +/fs mice

To obtain single guide RNA (sgRNA) for the recognition of the Kmt2c locus, oligomers were first generated by annealing of the following primers: forward primer, 5′-CACCGGAATGGACCACCACCTCGAG-3′, and the reverse primer, 5′-AAACCTCGAGGTGGTGGTCCATTCC-3′. The oligomers were integrated into the BbsI site downstream of the U6 promoter of the pSpCas9(BB)-2A-GFP vector (PX458) (Addgene, MA, USA). This integration step followed our previous reports [17, 18]. After the integration, the sgRNA for the microinjection into the fertilized eggs was transcribed by MEGAshortscript Kit (Life Technologies, Carlsbad, CA, USA) using the generated sgRNA-SpCas9-GFP all-in-one vector as a template. A mixture of sgRNA (50 ng/μl) and Cas9 mRNA (100 ng/μl, Merck Millipore, MA, USA) was microinjected into the cytoplasm of fertilized eggs obtained from C57BL6/N mice (CLEA Japan, Inc., Tokyo, Japan). The injected eggs were transferred into the oviducts of pseudopregnant ICR female mice. Whole exome sequencing was performed to screen for possible off-target mutations using genomic DNA purified from tail samples of F1 mutant mice with the GenElute Mammalian Miniprep kit (Merck Millipore). We validated the introduction of novel insertion/deletion of nucleotides after the filtering as previously reported [18].

Mouse housing and genotyping

The experimental protocols were approved by the Wako Animal Experiment Committee of RIKEN. Kmt2c heterozygous mutant mice were maintained by breeding Kmt2c+/fs mice with wildtype C57BL6/N mice (CLEA Japan, Inc.) in a 12 h:12 h light-dark cycle. The mice for behavioral tests were obtained by in vitro fertilization using sperms of the heterozygous mutant mice and eggs derived from C57BL6/N mice (CLEA Japan, Inc.). Genotyping of all mutant mice was performed using genomic DNA derived from tail tips as previously reported [18]. Briefly, tail biopsies were conducted on postnatal days 14–21, and each tail tip was incubated in 100 μL of lysis buffer (25 mM NaOH, 0.2 mM EDTA) at 95 °C for 20–30 min. An equal volume of 40 mM TRIZMA hydrochloride was added to the lysate after incubation and the mixture was vortexed briefly. Afterward, 1 μL of the mixture was diluted with water 2.5-fold and was PCR-amplified under the conditions shown in Supplementary Table 1. The PCR products derived from wildtype and mutant allelles were discriminated by agarose gel electrophoresis or Sanger sequencing by an ABI 3730xl sequencer (Life Technologies, CA, USA) using BigDye Terminator V3.1 (Thermo Fisher Scientific, MA, USA) (Supplementary Table 1).

qPCR

Total RNA was extracted from the hippocampal samples of 16 weeks old male mice with TRIzol RNA Isolation Reagents (Thermo Fisher Scientific). After DNase I treatment, reverse transcription was performed by SuperScriptTM III First-Strand Synthesis System (Invitrogen, MA, USA) using random hexamers. RT-qPCR was conducted using TB Green Premix Ex TaqTM II (Takara Bio Inc., Shiga, Japan). The sequences of the primers are shown in Supplementary Table 1. The primers for Kmt2c were designed to recognize the full-length isoform of Kmt2c.

Western blot of KMT2C

Nucleic and cytoplasmic protein samples were extracted from whole brains dissected from the mouse embryos at E15.5 using NE-PER Nuclear and Cytoplasmic Extraction Reagents (Thermo Fisher Scientific). The amount of protein in each sample was quantified by the Micro BCATM Protein Assay kit (Thermo Fisher Scientific), and 13 μg of proteins were separated using SDS-polyacrylamide gel electrophoresis (SDS-PAGE) with 3–8% NuPAGE Tris-Acetate Gels and NuPAGE Tris-Acetate SDS running buffer (Thermo Fisher Scientific). The separated proteins were transferred to the PVDF membrane under the conditions of 30 V, 4 °C for 18 h. The transferred membranes were blocked with 5% (for KMT2C) or 3% (for LaminA/C) skim milk in Tris-buffered saline (TBS) with 0.05% Tween-20 (TBST) for 30 min at 20–25 °C, then incubated overnight at 4 °C with rabbit anti-KMT2C (1:5000, ABE1851; Merck Millipore, Cambridge, UK) or rabbit anti-LaminA/C (1:1000, #2032; Cell Signaling Technology, Inc., MA, USA) primary antibody diluted by the corresponding blocking buffers. After washing five times with TBST for 5 min each, the membranes were incubated for 1 h with horseradish peroxidase (HRP)-conjugated anti-rabbit-IgG for anti-KMT2C or anti-LaminA/C (1:10000, sc-2357; Santa Cruz Biotechnology, TX, USA) at 20–25 °C. The immunoreactive bands were visualized using Amersham ECL Prime (GE Healthcare, Buckinghamshire, UK) and scanned using a LAS-3000 image analyzer (Fujifilm, Tokyo, Japan). The intensity of signals identified by each antibody was quantified by Fiji [19] (v.2.1.051).

Behavioral tests

Details of all behavioral tests are described in the Supplementary Method. The major procedures are outlined below. We analyzed 15 male mice per condition in each behavioral test except for the IntelliCage analysis. The behavioral tests started after 13 weeks old. In the open field test, Y-maze test, Crawley’s three-chamber social interaction test, and prepulse inhibition (PPI) test, we combined the data from two batches of the experiments, of which one batch is from an analysis of untreated mice in the pharmacological experiments, in order to increase the statistical power.

Crawley’s three-chamber social interaction test

The social behavior of animals was measured in Crawley’s social interaction test chamber (O’HARA & CO., LTD., Tokyo, Japan). The chamber consisted of three areas separated by transparent walls with holes, and they could freely move three areas. On the first day, all stranger mice were habituated to the small cages in the arena for 10 min. On the second day, subject animals were first placed in the center area with a stranger caged mouse on one side, for 10 min (mouse cage vs. empty cage). The duration of the area staying was recorded.

Y-maze test

The Y-maze test was performed in an apparatus with three arms arranged at 120° intervals (O’HARA & CO., LTD) for 5 min. The alternation rate (the number to enter all three arms within three entries/[the total number of entries into arms] −2) was recorded.

Barnes maze test

In the Barnes maze test, twelve holes (40 mm in diameter) were evenly spaced around the circumference of a white circular arena (O’HARA & CO., LTD) and one escape box was set under the specific hole. The training session consisted of 20 trials (one or two trials/day, 5 min). After 1 day of the training sessions, probe tests were conducted.

Serial reversal test in IntelliCage analysis

The IntelliCages apparatus (39 × 58 × 21 cm; TSE systems, Inc. MO, USA) contains one chamber each at the four corners that is accessible through an open doorway. A radiofrequency identification transponder (Standard Microchip T-VA, DataMars, Lamone, Switzerland; and Trovan, Melton, UK) was implanted into the mouse dorso-cervical region under isoflurane inhalation anesthesia to track each mouse in the corner chambers. The light period was from 8:00 to 20:00 local time, and the dark period was from 20:00 to 8:00 (Light: Dark = 12 h:12 h). Behavioral tests using the IntelliCage system were conducted following the methods in the previous paper, and detailed procedures except for the serial reversal test are described in the Supplementary Method. The IntelliCage test was initiated when the mice are 16 weeks old. Following the completion of all other tests shown in Supplementary Fig. 1, the serial reversal test was started (N = 6, 40–41 weeks old). The serial reversal test within this system was performed as follows. Before the serial reversal test, one of the four corners (correct corner) is accessible to drinking water for seven days (place learning test). After the place learning test, the correct corner was changed to the diagonally opposite corner to investigate the reversal learning and train mice to learn the change of the correct corner (place learning reversal test). The serial reversal test was started after the place learning reversal test for five days. The correct corner was switched to the other diagonal line on the first day of the serial reversal test. Afterward, the correct corner was changed to the corner diagonally opposite to the corner of the previous day, every day, and this diagonal moving was continued for four days. These processes were repeated three times, and the rate of the visits to the correct corner on the previous day in the first 15 min was calculated as an indicator of inflexibility. During these tests, the drinking water was accessible from 21:00 to 23:59 local time.

Sample preparations for RNA-seq of bulk tissues

Total RNAs were extracted from forebrain samples derived from 11 weeks old mice and vafidemstat- or DMSO-treated 22 weeks old mice using Trizol reagent (Thermo Fisher Scientific), followed by the treatment with DNase I (Takara Bio Inc.) to remove DNA. The quality of RNA samples was analyzed by Agilent Bioanalyzer (Agilent Technologies, Inc., CA, USA), and samples with RNA integrity number (RIN) > 7.9 were subjected to RNA-seq. The preparation of the library and sequencing were conducted by Novogene Co., Ltd, Beijing, China. The mRNA was enriched from total RNA using poly-T oligo-attached magnetic beads. After fragmentation, the first-strand cDNA was synthesized using random hexamer primers followed by the second-strand cDNA synthesis using NEBNext Ultra RNA Library Prep Kit for Illumina (New England BioLabs, MA, USA). Sequencing was performed on the Novaseq-6000 platform (Illumina) following the manufacturer’s instructions. The amount of data per sample was ~6 Gb (~20 M of 150 bp × 2 paired-end reads).

Chromatin immunoprecipitation

Chromatin immunoprecipitation was performed as previously described [20]. Two 16 weeks old wildtype C57BL6/N mice were used to obtain data of biological replicates. The hippocampal tissue of each mouse was dissociated using a 100 μm cell strainer (SPL Life Sciences Co., Ltd., Gyeonggi-do, Korea) with a syringe plunger in 4 mL of ice-cold PBS, and further homogenized using a Dounce tissue grinder (Sigma-Aldrich) with a small clearance pestle (pestle B) for 10 strokes and centrifuged at 1500 × g, 5 min, 4 °C. Fixation was performed by re-suspending the cell pellet in 1 mL of fixation buffer (PBS supplemented with 1% formaldehyde and 18.7 μM disuccinimidyl glutarate) and incubating at 25 °C for 10 min. The fixation was stopped by adding 100 μL of 2.5 M Glycine. The fixed cells were collected by centrifugation at 1500 × g, 4 °C for 5 min, and washed twice in PBS by re-suspending the pellet in 1 ml of PBS and centrifuging at 1500 × g, 4 °C for 5 min at each wash cycle. The fixed cell pellet was snap-frozen in liquid nitrogen and stored at −80 °C until use. The chromatin lysate was prepared by sonication with Covaris E220 (Covaris, MA, USA). Before the sonication, the stored pellet was washed once with 1 ml of lysis buffer 1 (50 mM HEPES-KOH, 140 mM NaCl, 1 mM EDTA, 10% (w/v) Glycerol, 0.5% (w/v) NP-40, 0.25% (w/v) Triton X-100, 0.1 × Protease Inhibitor Cocktail (Sigma)), once with 1 ml of lysis buffer 2 (10 mM Tris-HCl, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1 × Protease Inhibitor Cocktail) and twice with 1 ml of RIPA buffer (# 89900; Thermo Fisher Scientific) supplemented with 0.1 × Protease Inhibitor Cocktail. Each washing step was performed by re-suspending the pellet in the buffer and centrifuging at 2000 × g, 4 °C for 5 min. After the washing step, the cell pellet was re-suspended in 1 ml of RIPA buffer supplemented with 1 × Protease Inhibitor Cocktail and transferred to a milliTUBE (Covaris). The sonication condition of the Covaris E220 was PIP: 175 W, Cycles per Burst: 200, Duty Factor: 15%, for 30 min at 7 °C. After the sonication, the chromatin lysate was centrifuged at 20,000 × g, 4 °C for 5 min, and the supernatant was collected.

Immunoprecipitation was performed in RIPA buffer containing 0.5 mg/ml BSA (15561-020, Invitrogen), the chromatin lysate equivalent to 1.6–1.8 × 106 cells, and 50 µl of Protein A beads coupled with 15 µg of anti-KMT2C antibody (ABE1851; Merck Millipore) for the duration of 4 h at 4 °C. After the reaction, the beads were washed five times in 1 ml of low salt buffer (20 mM Tris-HCl (pH 8.0), 0.1% SDS, 1% (w/v) Triton X-100, 2 mM EDTA, 150 mM NaCl), and three more times with 1 ml of high salt buffer (20 mM Tris-HCl (pH 8.0), 0.1% SDS, 1% (w/v) TritonX-100, 2 mM EDTA, 500 mM NaCl). Washing of the beads was performed by placing the tube on the DynaMag 2 magnet (Thermo Fisher Scientific), removing the supernatant, and re-suspending the beads in the next buffer. The chromatin complex was eluted from the magnetic beads by agitation in 200 µl of ChIP elution buffer (10 mM Tris-HCl (pH 8.0), 300 mM NaCl, 5 mM EDTA, 1% SDS) for 15 min at room temperature. On the magnet, the supernatant containing the chromatin complex was collected, transferred to a new tube, and incubated at 65 °C, overnight, for reverse-crosslinking. The chromatin complex was further incubated with 50 µg/ml RNase A for 20 min at 37 and with 0.5 mg/ml Proteinase K at 55 °C for 40 min. ChIP DNA was purified by phenol-chloroform extraction followed by ethanol precipitation. Input DNA was extracted from 20 µl of the ChIP lysate in the same way as the ChIP DNA. Libraries were prepared using 10 ng of input DNA and 0.75–1.0 ng of ChIP DNA with the KAPA LTP Library Preparation Kit (# KK8232; KAPA Biosystems, MA, USA) and UDI adapters (NOVA-514180, BIOO Scientific, TX, USA). Sequencing was performed on the HiSeq 1500 platform (Illumina) to obtain ~40 M single-end 80 bp reads.

Sample preparations for microarray analysis

Total RNAs were extracted from hippocampal samples of 16 weeks old mice using the same method as for bulk RNA-seq and then treated with DNase I. A microarray analysis was performed by using the ClariomTM D Assay, mouse (902513, Thermo Fisher Scientific) platform according to the manufacturer’s protocol. The library preparation and assay were performed at RIKEN Center for Brain Science, Research Resource Division (RIKEN CBS RRD).

Quartz-seq2

We performed a scRNA-seq analysis using Quartz-seq2 according to the protocol described by developer [21], as detailed below.

Single-cell separation

Neonatal forebrain (prosencephalon) samples were dissected from P4 mouse pups under a stereomicroscope and temporarily stored in ice-cold Hank’s balanced salt solution. The separation to single cells was performed using Pierce Primary Neuron Isolation Kit (Thermo Fisher Scientific) according to the manufacturer’s protocol. Tail samples were used for genotyping of the Kmt2c frameshift mutation and verification of sex by amplification of Sry and Kdm5d fragments on the Y chromosome. The PCR conditions are shown in Supplementary Table 1. The separated single cells were stained by 7-AAD to exclude the dead cells. These cells were analyzed using flow cytometry FACSAria II Special Order System (BD Bioscience, CA, USA) with 100 μm nozzles. Single cells were isolated into 384-well PCR plates with 1 μL of lysis buffer (0.1111 μM respective RT primers (v32_384p01, unique molecular identifier (UMI) sequence -cell barcode -oligo dT), 0.12 mM dNTP mix, 0.3% NP-40, 1 unit/μL RNasin plus) in each well (Supplementary Fig. 2a, b). During single-cell sorting, 384-well PCR plates were kept on a 384 aluminum stand at 4 °C using CoolRack XT PCR384 (Corning, NY, USA). The 384-well PCR plates were sealed by the PX1 PCR Plate Sealer (BIO-RAD, CA, USA) and centrifuged at 10,000 × g and 4 °C for 1 min immediately after the cell collection. We agitated the plates at 2600 rpm and 4 °C for 1 min using ThermoMixer C (Eppendorf, Hamburg, Germany) followed by another centrifugation. The prepared 384-well plates were immediately cryopreserved at −80 °C and maintained until subsequent RT for cell barcoding.

Whole transcript amplification

Whole-transcript amplification was performed by using the 384-well PCR plates prepared above. The cryopreserved 384-well plates with single-cell lysates were centrifuged at 10,000 × g and 4 °C for 1 min, followed by denaturation of total RNA in each 384-plate at 70 °C for 90 s and hybridization of RT primers to poly-adenylated RNA at 35 °C for 15 s using the Dual 384-Well GeneAmpTM PCR System 9700. We removed the seal and added 1 μL of RT premix (2× Thermopol buffer, 1.25 units/μL SuperScript III, 0.1375 units/μL RNasin plus) to 1 μL of lysis buffer in each well on the 384 aluminum plate. We agitated the re-sealed plates at 2600 rpm and 4 °C for 1 min. The plates were then centrifuged at 10,000 × g and 4 °C for 1 min, and the reverse transcription (RT) was reacted at 35 °C for 5 min and 50 °C for 50 min. The RT was stopped at 70 °C for 15 min. We turned the plates upside down on the assembled collector and centrifuged the plates with an assembled collector at 3010 × g and 4 °C for 3 min using swing-bucket rotors. Subsequently, we collected the cDNA solution into a disposable reservoir. We purified and concentrated the cDNA solution using the DNA Clean & ConcentratorTM-5 kit (Zymo Research, CA, USA) and extracted it with 20 μL of nuclease-free water. We added 25 μL of TdT solution (1× Thermopol buffer, 2.4 mM dATP, 0.0384 units/μL RNase H (Invitrogen), 26.88 units/μL terminal transferase (Roche, Basel, Switzerland)) into 20 μL of extracted cDNA. A poly(A) tailing was reacted at 37 °C for 75 s followed by the inactivation at 65 °C for 10 min. After the dispersion of ~11 μL of solution into four wells from 45 μL of TdT solution, 46.16 μL of PCR I premix (1.08492× MightyAmp Buffer version 2, 0.06932 μM Tagging primer, 0.05415 units/μL MightyAmp DNA polymerase (Takara Bio Inc.)) was added to 11 μL of TdT solution for the respective wells of the PCR tube. The solution was mixed at 2000 rpm and 4 °C for 2 min. We denatured the solution at 98 °C for 130 s and hybridized tagging primer to poly(A)-tailed cDNA at 40 °C for 1 min. After that, the Increment step by heating to 68 °C at 0.2 °C every second and second-strand synthesis at 68 °C for 5 min were conducted. 50.232 μL of PCR II premix (0.99697× MightyAmp Buffer version. 2, 1.8952 μM gM primer) was mixed with 56.16 μL of PCR I solution. Subsequently, the cDNA was amplified by 12 cycles with the following conditions: 98 °C for 10 s, 65 °C for 15 s, and 68 °C for 5 min. Finally, all of the PCR solutions derived from one 384-well PCR plate were transferred to a 15 mL polypropylene centrifuge tube. 32.1 μL of 3 M sodium acetate (pH 5.2) and 6.42 mL of PB-Buffer (QIAGEN, Düsseldorf, Germany) were added to the PCR solution. The mixture was purified by a MinElute Spin Column (QIAGEN). Purified cDNA was extracted into 40 μL of nuclease-free water. The extracted cDNA was additionally purified with 26 μL of Ampure XP beads.

Preparation of sequence library and sequencing

5–10 ng of the amplified cDNA in 130 µL of nuclease-free water was fragmented using an M220 Focused-ultrasonicator (Covaris) under the following conditions: duty factor 10.0, peak power 50.0, cycles per burst 200, and treatment time 65 s. The fragmented cDNA was purified using the DNA Clean & ConcentratorTM-5 kit. Purified cDNA was extracted into 10 μL of nuclease-free water. We then added 2 μL of End-Repair premix (1.4 μL of End repair & A-tailing buffer and 0.6 μL of End repair & A-tailing Enzyme (KAPA Biosystems)) to 10 μL of fragmented cDNA solution followed by the incubation at 37 °C for 60 min and 65 °C for 30 min. 2 μL of adapter buffer (1.5 μM truncated adapter, 10 mM Tris-HCl pH 7.8, 0.1 mM EDTA pH 8.0, 50 mM NaCl) and 8 μL of ligation premix (6 μL of ligation buffer, 2 μL of DNA ligase (KAPA Biosystems)) was added at 4 °C. After that, adapters were ligated at 20 °C for 15 min. After the ligation, 18 μL of Ampure XP beads was added to 22 μL of adapter ligation solution. Adapter-ligated cDNA was extracted into 20 μL of nuclease-free water. The cDNA mixtures were mixed with 32 μL of PCR premix (25 μL of 2 × KAPA HiFi ready mix, 1.75 μL of 10 μM TPC2 primer (HPLC-purified), 10 μM P5-gMac hybrid primer (HPLC-purified)) to 18 μL of adapter-ligated cDNA. The solution was denatured at 98 °C for 45 s. Subsequently, cDNA was amplified by eight cycles under the following conditions: 98 °C for 15 s, 60 °C for 30 s, and 72 °C for 1 min. 40 μL of Ampure XP beads was added to 50 μL of PCR solution. Purified sequence-library DNA was eluted into 20–30 μL of nuclease-free water. The DNA concentration and DNA size of sequence library DNA were checked using KAPA Library Quantification Kits (NIPPON Genetics, Tokyo, Japan). The sequencing was performed by RIKEN GENESIS CO., LTD. (Japan) using NextSeq (Illumina, CA, USA). The lengths of reads were 23 bp (Read1) and 63 bp (Read2).

Vafidemstat preparation and administration

In the pharmacological experiment with an LSD1 inhibitor, we utilized vafidemstat (also known as ORY-2001, MedChemExpress, NA, USA). Vafidemstat was dissolved in dimethyl-sulfoxide (DMSO) to prepare the stock solution (50 mg/mL). The stock solution was stored at −80 °C until use. Vafidemstat was administered via drinking water according to a previous report [22]. Before the experiment, we measured that 5 mL per day of water was taken by a mouse on average. We calculated the final concentration in the drinking water assuming the weight of a mouse to be 35 g. To administer 0.96 mg/kg day of vafidemstat to a mouse via drinking water, the stock solution was diluted by water to 6.72 μg/mL. The drinking water was changed every week. Vafidemstat was administered to 15 male mice per condition for 4 weeks before the behavioral tests. The data on the structure was downloaded from ChemSpider (http://www.chemspider.com/), and the structural formula was drawn using the ChemSketch software (ACD/Labs, Ontario, Canada).

Data analysisRNA-seq of bulk tissues

Quality control of the fastq files obtained by the above-described experimental procedures was performed using FASTQC (v.0.11.9) (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Read mapping on the mm10 mouse genome was performed using STAR [23] (v.2.7.9a) with “--outFilterMultimapNmax 1” after removal of the adapter sequence by trim_galore [24] (v.0.6.6) with default settings. The output SAM files were converted to BAM files by samtools [25] (v.1.3.1). Ribosomal RNAs were removed after the mapping using “intersectBed” functions of bedtools [26] (v.2.30.0). The reference BED file of ribosomal RNAs based on GRCm38/mm10 was obtained from UCSC Genome Browser (https://genome.ucsc.edu/). Read counts of each gene in the filtered BAM files were calculated using featureCounts [27] (v.2.0.1). The gene annotation was based on the gtf file of GRCm38 (Ensembl release 102) derived from https://ftp.ensembl.org/pub/release-102/gtf/mus_musculus/. After extracting protein-coding genes using Biomart (GRCm38.p6) as a reference, the read counts of genes were normalized, and the differentially expressed genes (DEGs, uncorrected P value < 0.05) were identified using DESeq2 [28] (v.1.30.1) in R (v.4.0.4). Mitochondrial genes and genes with baseMean < = 20 were excluded from the downstream analysis. We analyzed four mice per genotype in the analysis of adult forebrain and three mice per conditions in the pharmacological experiment. We also verified the genotypes of mice by confirming the presence or absence of reads harboring the Kmt2c frameshift valiant using the Integrative Genomics Viewer (IGV) [29]. We note that the data from one Kmt2c+/fs mouse treated with DMSO failed our quality control criteria and was excluded from the downstream analyses.

Overlap between DEGs in Kmt2c+/fs mice and ASD exome genes or ASD postmortem brain DEGs

To analyze the overlap of DEGs in Kmt2c+/fs mice with ASD-associated genes, the gene symbols in mice were converted to the orthologous genes in humans by the following procedures. To obtain a correspondence table of orthologous genes between mice and humans, we first utilized the “getLDS” function of the biomaRt package [30, 31] (v.2.46.3) in R (v.4.0.4). We retrieved the orthologous relationships between murine gene symbols, human gene symbols, and human Ensembl gene IDs. In the obtained list, there were cases where a particular murine gene was converted to multiple human genes or multiple murine genes corresponded to a single human gene. In such cases, we selected the human gene that exactly matched the capitalized mouse gene symbol. We then assigned human Ensembl gene IDs to each mouse gene based on the generated correspondence table (“Humanized ENSGID” in Supplementary Tables 2 and 3). Subsequently, we examined the overlap between DEGs in Kmt2c+/fs mice and the 185 ASD exome genes (see Results for details) or DEGs in human ASD postmortem brains, which were obtained from ref. [10] and ref. [32], respectively, based on the human Ensembl gene IDs. In this analysis, we focused on protein-coding genes in the nuclear genome. The overlaps were statistically evaluated by a hypergeometric test using the phyper function in R (v.4.0.4). The background gene lists in each test were defined as those that were detected in each bulk RNA-seq or scRNA-seq analysis and successfully converted to human Ensembl gene IDs.

Linkage disequilibrium score regression (LDSC)

We analyzed the enrichment of single nucleotide polymorphisms (SNP)-based heritability in DEGs in Kmt2c+/fs mice using LDSC (v.1.0.1). After the conversion of the identified DEGs into human Ensemble gene IDs as described above, we performed an LDSC analysis using significant DEGs (uncorrected P < 0.05) and the top 1000 and top 2000 DEGs (sorted by uncorrected P values) identified by RNA-seq of bulk adult mouse forebrains, and the top 1000 DEGs in each cell cluster identified by scRNA-seq of neonatal forebrains. The summary statistics in ASD genome-wide association studies (GWAS) meta-analysis was downloaded from https://bitbucket.org/steinlabunc/spark_asd_sumstats/src/master/ASD_SPARK_iPSYCH_PGC.tsv.gz. This GWAS summary statistics file was formatted for LDSC using the “munge_sumstat” function in LDSC. Analyses of partitioned heritability of the genes of interest (i.e., DEGs) were performed using the baseline model (v2.2) of 97 genomic annotations and the data of variant frequencies (1000G_Phase3_frq), genotypes (1000G_EUR_Phase3_plink), and weights (1000G_Phase3_weights_hm3_no_MHC) from the 1000 Genomes Project Phase 3.

Chromatin immunoprecipitation followed by sequencing (ChIP-seq)

Removal and trimming of the illumina adapter sequences from the fastq file were performed by trim_galore (v.0.6.4) with the default settings. Mapping of the sequencing reads to the mm10 mouse genome was performed by Bowtie2 [33] (v.2.3.5.1). Peak-calling was performed by MACS2 [34] (v.2.1.1.20160309). The KMT2C peaks were defined as the common peaks among the identified peak sets derived from two biological replicates (rep1 and rep2) and merged read data of the duplicates (rep1 + rep2). The common peaks were extracted using the BED files of each data by “intersectBed” functions of bedtools. The peaks of KMT2C were annotated by HOMER (v.4.11). The fastq files for H3K4me1 and H3K4me3 ChIP-seq from a previous study were obtained from GEO (accession number: GSE85873). Read mapping was performed using the same method as for KMT2C ChIP-seq. The BAM files derived from H3K4me1 and H3K4me3 ChIP-seq (8 weeks old, male mice with B6NCrl background) of cortical plates were obtained from the ENCODE project (H3K4me1 ENCFF438JYD and ENCFF151JWT; H3K4me3, ENCFF474SND, and ENCFF892POZ). Two BAM files for each histone mark were merged by the “merge” option of samtools [25] (v.1.3.1). The heatmaps were plotted by the functions of “computeMatrix” and “plotHeatmap” of deepTools [35] (v.3.3.0) using the BED files of KMT2C peaks as reference. For the analyses of genes bound by KMT2C in their promotor-TSS regions (KMT2C target genes), only the protein-coding genes annotated by Biomart based on the mm10 mouse genome were included. The BED file of SETD1A peaks of cortical samples was obtained from a previous study [13]. The SETD1A peaks were annotated by HOMER (v.4.11) to define the SETD1A target genes as we did for KMT2C. The overlap between KMT2C target genes and DEGs or SETD1A target genes was statistically assessed by a hypergeometric test.

Microarray

The CEL files were analyzed using the GeneSpring software (Agilent Biotechnologies). The SST-RMA method was utilized for the normalization of the expression signals. Genes with uncorrected P < 0.05 calculated by Mann-Whitney’s test and fold change > 1.1 were defined as DEGs. The overlap between the protein-coding DEGs and the KMT2C target genes was analyzed as described above.

scRNA-seq

To generate a gene-cell UMI matrix, the BCL files from an Illumina sequencer were analyzed by the Q2-pipeline (https://github.com/rikenbit/Q2-pipeline) provided by Nikaido lab that developed Quartz-seq2. Subsequent analyses of the UMI matrix were performed by using Seurat [36] (v.4.0.2) in R (v.4.0.4). Cells with the following conditions were filtered out as outliers; 1) the number of detected genes in the cell, that is nFeature_RNA (i.e., gene count per cell), is less than 2500 or more than 10,000, 2) the proportion of mitochondrial UMI is greater than or equal to 12%. After the above quality check (Supplementary Fig. 2c-i), normalization of the expression abundance and extraction of highly variable features for the clustering and scaling were performed with the default settings. For the dimensionality reduction, principal components (PCs) were extracted by the “JackStraw” and “ScoreJackStraw” functions, and the top 15 PCs were utilized for non-linear dimensionality reduction based on the uniform manifold approximation and projection (UMAP) method. Marker genes of each cell cluster were identified by the “FindAllMarkers” function with the default settings. Cell clusters were manually annotated based on the expression patterns of general cell type markers. DEGs in each cluster were identified using the “FindMarkers” function specifying test.use = MAST. The parameter for the logfc.threshold was set as 0.0001. Genes with uncorrected P value < 0.05 were considered DEGs. In the analysis of the cellular composition in the Kmt2c+/fs and Kmt2c+/+ mice, we performed a 2 × 18 (genotypes × cell clusters) Fisher’s exact test using the “fisher.test” function in R (v.4.0.4) with the default setting except for the method parameter set to “simulate.p.value = TRUE”.

Analysis of cell clusters with ASD-associated transcriptomic alterations

To characterize the DEGs in each cell cluster mainly in the context of ASD-associated gene enrichment, we performed the following four analyses.

(1)

After identifying DEGs (uncorrected P < 0.05) in each cell cluster, we evaluated the difference between the observed DEG count and the expected number of DEGs estimated from the number of cells in the cluster by a residual analysis. Because the number of cells in each cluster correlates with the number of DEGs, we employed this method rather than simply counting the number of DEGs in each cluster. The expected number of DEGs in each cluster was estimated by drawing the regression line of the number of cells and DEGs in the 18 clusters. The standardized residuals were obtained by the “chisq.test” function in R (v.4.0.4) using the cross-tabulation table of the observed and expected number of DEGs. The standardized residuals were then converted into the one-tailed P value using the “pnorm” function in R (v.4.0.4).

(2)

We analyzed the overlap between the DEGs in each cell cluster and the 185 ASD exome genes in ref. [10] by a hypergeometric test using the method described in the “Overlap between DEGs in Kmt2c+/fs mice and ASD exome genes or ASD postmortem brain DEGs” section.

(3)

We analyzed if the top 1000 DEGs in each cell cluster are enriched for the SNP-based ASD heritability [37] by LDSC [38, 39], as described in the “Linkage disequilibrium score regression (LDSC)” section. This threshold for DEG selection was determined according to the results of the RNA-seq analysis of bulk forebrain tissues.

(4)

We analyzed the overlap between the DEGs in each cell cluster and the DEGs in a study of ASD postmortem brains (Benjamini-Hochberg corrected P < 0.05) [32] by a hypergeometric test, as in (2).

We then ranked the cell clusters according to the P values obtained by each analysis, calculated the sum of the ranks in the four analyses for each cluster, and considered clusters with a smaller rank-sum to have a higher ASD-associated transcriptomic alteration score. When two or more clusters showed the same total scores, we ranked them using the combined P value of the four P values calculated by Fisher’s method using the “metap” package (v.1.8) in R (v.4.0.4). These analyses were performed for all, upregulated, and downregulated DEGs. To assess the correlation of ranks between the analyses of upregulated and downregulated DEGs, Spearman’s rank correlation coefficient (ρ) was calculated using the “cor.test” function in R (v.4.0.4). The method parameter was set to “spearman”.

Pseudotime trajectory analysis

Pseudotime trajectory analysis was performed by Monocle3 (v.1.2.7) in R (v.4.1.3) using the UMI matrix including neuronal, astrocyte, and oligodendrocyte cells (cluster 0, 1, 2, 3, 5, 7, 8, 9, 11, 12, 14, 16, and 17) as the input. The top 10 PCs were utilized for the normalization of the data. The dimensionality reduction was performed by the default setting of the UMAP method. Pseudotime trajectories were generated by the “learn_graph” and “order_cells” functions with the default settings. The root node of the trajectory graph was manually selected.

Gene ontology enrichment analysis

We used Metascape [40] (https://metascape.org/) for gene ontology enrichment analyses of genesets of interest. The parameters in “Pathway & Process Enrichment” were set with the default settings (Min Overlap, 3; P Value Cutoff, 0.01; Min Enrichment, 1.5). “GO Molecular Functions”, “GO Biological Processes” and “GO Cellular Components” were selected as the input of the terms of the pathways. We performed the analyses based on Ensemble gene IDs. The reference protein-coding gene lists were obtained from Biomart. Mitochondrial genes were excluded from the analysis. We used the mm10 mouse and hg38 human genomes as references. The output files generated by Metascape were subjected to network visualization of the GO terms with Benjamini-Hochberg-corrected P value < 0.05 by Cytoscape [41] (v.3.9.1). GO networks were manually annotated.

Downstream analysis of RNA-seq data obtained after administrating vafidemstat or vehicle

To evaluate the direction of the effect of vafidemstat, a binomial test was performed using the “binom.test” function in R (v.4.0.4) with the default settings and the theoretical success rate = 0.5. Pearson’s correlation coefficients (r) of the fold changes shown in the pharmacological experiments were calculated using the “cor.test” function in R (v.4.0.4) with the default settings. The method parameter was set to “pearson”. For the comparison of two correlation coefficients, we employed the method proposed by Steiger [42]. This comparison was performed using the “cocor” package [43] (v.1.1–4) in R utilizing the option for a comparison of two overlapping correlations based on dependent groups. We note that genes with baseMean < = 20 were excluded from these analyses.

Statistical analyses

All statistical details are indicated in the Materials and methods and/or in the figure legends. The statistical analyses, if not otherwise specified, were performed using R (v.4.0.4).

留言 (0)

沒有登入
gif