Blimp-1 and c-Maf regulate immune gene networks to protect against distinct pathways of pathobiont-induced colitis

Mice

Mice were bred and maintained under specific-pathogen-free conditions in accordance with the Home Office UK Animals (Scientific Procedures) Act 1986. Age-matched male or female mice were used for experiments. Maffl/fl mice were provided by M. Sieweke and C. Birchmeier (Max Delbrück Centre for Molecular Medicine, Germany)52 and backcrossed to C57BL/6J for ten generations and then crossed to Cd4Cre mice to generate Maffl/flCd4Cre mice as previously described13. Prdm1fl/fl mice were purchased from the Jackson Laboratory, backcrossed to C57BL/6J for four generations and then crossed to Cd4Cre mice to generate Prdm1fl/flCd4Cre mice. Prdm1fl/flMaffl/flCd4Cre and Prdm1fl/flMaffl/fl control mice were generated in-house by crossing Maffl/flCd4Cre with Prdm1fl/flCd4Cre mice. All animal experiments were carried out in compliance with UK Home Office regulations and were approved by The Francis Crick Institute Ethical Review Panel.

H. hepaticus colitis model and antibody treatment

H. hepaticus (NCI-Frederick isolate 1A, strain 51449) was grown under anaerobic gas conditions (10% CO2, 10% H2/N2; BOC) for 3 days on blood agar plates containing 7% laked horse blood (Thermo Scientific) and the Campylobacter selective supplement ‘Skirrow’ containing the antibiotics trimethoprim, vancomycin and polymyxin B (all from Oxoid). Bacteria were collected and then transferred and expanded, again under the anaerobic gas conditions above, for 3–4 days to an optical density of 0.6 in tryptone soya broth (Oxoid) supplemented with 10% FCS (Gibco) and the antibiotics mentioned above. For infection, mice received 1 × 10^8 colony-forming units of H. hepaticus by oral gavage using a 22-gauge curved blunted needle on day 0 and day 1. Uninfected mice were housed in the same animal facility and only received antibody treatment. For experiments with anti-IL-10R, 1 mg of either anti-mouse IL-10R (CD210) (Clone 1B1.3A, Rat IgG1, kappa) blocking antibody or Rat IgG1 (Clone GL113, Rat IgG1) isotype control was administered on day 0 and day 7.

Histopathology assessment

To assess the severity of colitis in H. hepaticus-infected mice, in addition to uninfected and steady-state aged mice (age 24–30 weeks), formalin-fixed paraffin-embedded cross-sections of proximal, middle and distal colon were stained with hematoxylin and eosin (H&E) and scored by two board-certified veterinary pathologists on a scale of 0–3 across four parameters to give a maximum score of 12. The four parameters included epithelial hyperplasia and/or goblet cell depletion, leucocyte infiltration into the lamina propria, area affected and markers of severe inflammation, which included crypt abscess formation, submucosal leucocyte infiltration, crypt branching, ulceration and fibrosis. Representative images of H&E-stained colon sections were then taken by the pathologists using a light microscope and a digital camera (Olympus BX43 and SC50).

Histopathology scoring throughout the manuscript is collated in the sections below.

For Extended Data Fig. 1, the baseline histopathology score for Maffl/flCD4Cre was 0 for all 22 mice; for Prdm1fl/flCD4Cre, the score was 0 for 16 mice, two mice had a score of 1 or 2 (reflecting very low-level histological changes) and one exceptional mouse had an unexplained score of 5; for Prdm1fl/flMaffl/flCD4Cre, 28 mice had a score of 0, nine mice had low-level histological changes (seven with a score of 2 and two with a score of 3) and one mouse had a score of 4. The 36 uninfected aged fl/fl mice had a score of 0.

The range and median of histopathology scores for the mice infected with H. hepaticus were as follows: Prdm1fl/flMaffl/flCD4Cre, 16 mice scored 6–11 with a median of 8; Maffl/flCD4Cre, 14 mice scored 2–9 with a median of 6; Prdm1fl/flCD4Cre, 12 mice scored 2–6 with a median of 3.5; fl/fl controls, 25 mice scored 0–3 with a median of 0. These scores show a consistent trend of increasing colitis from the fl/fl control (no colitis) to Prdm1fl/flCD4Cre to Maffl/flCD4Cre to Prdm1fl/flMaffl/flCD4Cre (severe colitis).

For Extended Data Fig. 6a,d, a large number of uninfected Prdm1fl/flMaffl/flCD4Cre mice (and fl/fl control mice) were needed to pool the numbers needed for performing RNA-seq and ATAC-seq on flow-sorted CD4+ T cells from the colon; in this case, 12 Prdm1fl/flMaffl/flCD4Cre mice were pooled in batches of four, for three biological replicates. The histopathology was examined again by two independent pathologists who reported mild changes in three out of the 12 mice; however, the changes were extremely mild, with none of the three individual mice exhibiting histopathology scores greater than two out of a maximal of 12, and the other nine exhibiting histopathology scores of 0 out of 12. The values in Extended Data Fig. 6a,d for uninfected Prdm1fl/flMaffl/flCD4Cre mice were averages of the pooled mice per replicate.

Immunostaining of colon

Proximal, middle and distal colon formalin-fixed paraffin-embedded cross-sections were de-waxed and re-hydrated before being subjected to automated staining on the Leica BOND Rx Automated Research Stainer. Samples were treated with BOND Epitope Retrieval Solution 1 (Leica AR9961) for CD4 and CD68 antibodies, and BOND Epitope Retrieval Solution 2 (Leica AR9640) for the MPO antibody. To block endogenous peroxidase, samples were incubated in 3% hydrogen peroxide solution (Fisher chemical code H/1750/15) and 1% BSA blocking buffer (BSA Sigma-Aldrich A2153-100G, 1003353538 source SLBX0288). A multiplex panel included antibodies against CD4 (rabbit, Abcam ab183685, clone EPR19514; 1:750 dilution), CD68 (rabbit, Abcam ab283654, clone EPR23917-164; 1:2500 dilution) and MPO (goat, R&D Bio-Techne AF3667; 1:200 dilution). Leica Novolink Polymer (anti-rabbit, RE7161) was used as a secondary detection for primary antibodies raised in rabbit (CD4 and CD68) and horse anti-goat IgG polymer reagent (Impress HRP, Vector Laboratories 30036) for primary antibody MPO raised in goat. Samples were then incubated with Opal 690 (Akoya OP-001006) for CD4, Opal 520 (Akoya OP-001001) for CD68 and Opal 570 (Akoya OP-001003) for MPO, followed by DAPI nuclear counterstain (Thermo Scientific 62248; 1:2500 dilution). Slides were scanned using Akoya’s PhenoImager HT at ×20 and viewed in Akoya inForm Automated Image Analysis Software. Scanned slides were imported into QuPath (version 0.4.3) for image analysis. Cell segmentation was performed using the Stardist extension on the DAPI channel in QuPath. Machine learning was used to train object classifiers on representative regions of each experimental group for CD4, CD68 and MPO markers. Exported data was used to determine the number of positive cells per area (µm2) in the gut sections of the mice.

Isolation of colon LPLs

LPLs were isolated from 1.0–1.5 cm pieces of the proximal, middle and distal colon from individual mice, which were cleaned to remove feces, opened lengthwise and transferred into Dulbecco’s PBS with no Ca2+ or Mg2+ ions (Gibco) containing 0.1% (v/v) bovine serum albumin Fraction V (Roche) (PBS + BSA). To remove the epithelium and intraepithelial lymphocytes, colonic tissue was incubated for 40 min at 37 °C with shaking at 220 rpm in 10 ml of RPMI (Lonza, BE12-702F) supplemented with 5% (v/v) heat-inactivated FCS and 5 mM EDTA (RPMI + EDTA). A second RPMI + EDTA wash was performed as above for 10 min, after which the tissue was left standing at room temperature in 10 ml RPMI (Lonza, BE12-702F) supplemented with 5% (v/v) heat-inactivated FCS and 15 mM HEPES (RPMI + HEPES) to neutralize the EDTA. Tissue was then digested at 37 °C with shaking at 220 rpm for 45 min in 10 ml of RPMI + HEPES with 120 µl of Collagenase VIII added at 50 mg ml−1 in PBS (Sigma). The 10 ml of digested tissue was then filtered through a 70 µm filter into a tube containing 10 ml of ice-cold RPMI + EDTA to neutralize the Collagenase VIII and the cells were centrifuged (1,300 rpm, 7 min, 4 °C). The resulting pellet was then resuspended in 4 ml of 37.5% Percoll (GE Healthcare), diluted in PBS + BSA from osmotically normalized stock and centrifuged (1,800 rpm, 5 min, 4 °C). After centrifugation, the pellet was recovered, resuspended in conditioned RPMI and used for subsequent analysis by flow cytometry and RNA and DNA extractions.

Flow cytometry of colon LPLs

For the analysis of intracellular cytokine expression, isolated colon LPLs from individual mice were transferred to 48-well plates and restimulated with conditioned RPMI media containing 500 ng ml−1 Ionomycin (Calbiochem) and 50 ng ml−1 Phorbol 12-myristate 13-acetate (Sigma-Aldrich) for 2 h, after which 10 µg ml−1 Brefeldin A (Sigma-Aldrich) was added to each well and the cells were incubated for another 2 h. All incubations were conducted at 37 °C in a humidified incubator with 5% carbon dioxide. Following re-stimulation, LPLs were transferred into cold Dulbecco’s PBS with no Ca2+ or Mg2+ ions (Gibco). LPLs were first Fc-blocked for 15 min at 4 °C (24G2, Harlan) and then stained with extracellular antibodies: CD90.2 (53-2.1, PE, Invitrogen), CD4 (RM4-5, BV785, BioLegend), TCR-β (H57-597, APC-e780, Invitrogen), CD8 (53-6.7, BV605, BioLegend) and the UV LIVE/DEAD Fixable Blue dead cell stain (Invitrogen). LPLs were then fixed for 15 min at room temperature with 2% (v/v) formaldehyde (Sigma-Aldrich), permeabilized for 30 min at 4 °C using permeabilization buffer (eBioscience) and stained with the following cytokine antibodies for 30 min at 4 °C: IL-17A (eBio17B7, FITC, Invitrogen), IFNγ (XMG1.2, PE-Cy7, BD), IL-10 (JES5-16E3, APC, Invitrogen) and GM-CSF (MP1-22E9, BV421, BD). For transcription factor expression analysis, isolated LPLs remained unstimulated, were Fc-blocked and stained with the same extracellular antibodies, plus Ly6G (1A8, PE-Dazzle, BioLegend), CD11b (M1/70, eFluor450, Invitrogen), CD19 (1D3, BV711, BD Biosciences) and UV dead cell stain as with the restimulated LPLs, and then fixed for 30 min at 4 °C using the FoxP3/transcription factor staining kit (eBioscience). Following permeabilization for 30 min at 4 °C using permeabilization buffer (eBioscience), LPLs were then stained with the following transcription factor antibodies for 30 min at 4 °C: RORγt (Q31-378, AF647, BD) and Foxp3 (FJK-16s, FITC, Invitrogen).
After staining, cells were resuspended in sort buffer (2% FBS in PBS + 2 mM EDTA) and analyzed on the Fortessa X20 (BD) flow cytometer. Acquired data was analyzed using FlowJo (version 10), with compensation performed using single-color controls from the cells and AbC total compensation beads (Invitrogen). Flow cytometry plots were concatenated for visualization purposes as follows. Each individual acquisition file was down-sampled to the lowest number of events per genotype, thus resulting in a final concatenated file with an even representation of each individual mouse per group. For intracellular cytokine staining, plots in Extended Data Fig. 4d,e,f, are composed of n = 5 for Prdm1fl/flMaffl/fl and n = 4 for Prdm1fl/flCd4Cre, Maffl/flCd4Cre and Prdm1fl/flMaffl/flCd4Cre. The transcription factor staining plots in Extended Data Fig. 4m are composed of n = 5 for Prdm1fl/flMaffl/fl, n = 2 for Prdm1fl/flCd4Cre, n = 4 for Maffl/flCd4Cre and n = 5 for Prdm1fl/flMaffl/flCd4Cre. The extracellular marker staining plots in Fig. 5f are composed of n = 5 for Prdm1fl/flMaffl/fl, n = 5 for Prdm1fl/flCd4Cre, n = 5 for Maffl/flCd4Cre and n = 4–5 for Prdm1fl/flMaffl/flCd4Cre in each uninfected and infected group.
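The down-sampling and concatenation step described above can be sketched as follows. This is a minimal illustration in plain Python, not the FlowJo procedure itself: the function name, the representation of events as list items and the fixed random seed are all assumptions made for the example.

```python
import random


def downsample_and_concatenate(events_per_mouse, seed=0):
    """Down-sample every mouse to the smallest event count, then concatenate.

    events_per_mouse: dict mapping a (hypothetical) mouse ID to its list of events.
    Returns a list of (mouse_id, event) tuples with equal numbers per mouse.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # The lowest number of events among the acquisition files of this group.
    target = min(len(events) for events in events_per_mouse.values())
    combined = []
    for mouse_id in sorted(events_per_mouse):
        # Sample without replacement so no event is duplicated.
        sampled = rng.sample(events_per_mouse[mouse_id], target)
        combined.extend((mouse_id, event) for event in sampled)
    return combined


# Illustrative group of three mice with unequal event counts (made-up numbers).
group = {
    "mouse_1": list(range(1000)),
    "mouse_2": list(range(400)),
    "mouse_3": list(range(650)),
}
combined = downsample_and_concatenate(group)
```

Because every mouse contributes the same number of events, no single animal dominates the concatenated plot.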

Sorting by flow cytometry of CD4+ T cells from colon lamina propria

Colon LPLs were isolated as described earlier from individual mice and the cells transferred into cold Dulbecco’s PBS (no Ca2+ or Mg2+ ions) (Gibco). Before being sorted, colon LPLs from individual mice within some experimental groups were equally pooled as follows to allow for the sorting of n = 3–6 biological replicates per experiment. In the uninfected groups for Prdm1fl/flMaffl/fl and Prdm1fl/flMaffl/flCd4Cre, n = 12 mice were pooled to give three biological replicates. For the infected Prdm1fl/flMaffl/fl group, n = 16 mice were pooled to give four biological replicates. For the following infected groups, individual mice were not pooled: Prdm1fl/flMaffl/flCd4Cre (n = 6), Prdm1fl/flCd4Cre (n = 4) and Maffl/flCd4Cre (n = 6). For FACS staining, LPLs were first Fc-blocked for 15 min at 4 °C (24G2, Harlan) and then stained with the extracellular antibodies CD90.2 (53-2.1, PE, Invitrogen), CD4 (RM4-5, BV785, BioLegend), TCR-β (H57-597, APC-eFluor 780, Invitrogen), CD8 (53-6.7, BV605, BioLegend) and the UV LIVE/DEAD Fixable Blue dead cell stain (Invitrogen). Live CD4+ T cells (CD4+TCR-β+CD90.2+CD8−) were then sorted to over 95% purity on the FACS Aria III or FACS Aria Fusion cell sorters (both BD). Sorted cells were then used for subsequent RNA and DNA extractions.

RNA-seq of colon LPLs

RNA was extracted from colon LPLs of individual mice using the QIAShredder and RNeasy Mini Kit with on-column DNase digestion, according to the manufacturer’s instructions (Qiagen). RNA-seq libraries were then made with total RNA using the KAPA RNA HyperPrep with RiboErase and unique multiplexing indexes, according to the manufacturer’s instructions (Roche). All libraries were sequenced using the HiSeq 4000 system (Illumina) with paired-end read lengths of 100 bp and at least 25 million reads per sample.

scRNA-seq of colon LPLs

Isolated colon LPLs from individual mice (as detailed above) were filtered using a 70 µm filter, and cells were suspended in PBS 0.04% BSA (UltraPure BSA, Invitrogen). For each sample, an aliquot of cells was stained with AO/PI Cell Viability Kit (Logos Biosystems) and counted with the LunaFx automatic cell counter. For all samples, cell viability before loading was >80%. As per the manufacturer’s instructions, the Master Mix was prepared as detailed in the Chromium Next GEM Single Cell 3′ Reagent Kit v.3.1 (Dual Index) manual, and 10,000 cells per sample were loaded into the 10x Chromium chips (10x Genomics). The 10x Chromium libraries were prepared and sequenced (paired-end reads) using the NovaSeq 6000 (Illumina).

RNA-seq of sorted CD4+ T cells from colon lamina propria

RNA was extracted from flow-sorted CD4+ T cells isolated from the colon lamina propria of individual mice using the QIAShredder and RNeasy Mini Kit with on-column DNase digestion, according to the manufacturer’s instructions (Qiagen). RNA-seq libraries were then made with total RNA using the NEBNext Single Cell/Low Input RNA Library Prep Kit for Illumina and unique multiplexing indexes, according to the manufacturer’s instructions (New England Biolabs). All libraries were sequenced using the HiSeq 4000 system (Illumina) with paired-end read lengths of 100 bp and at least 25 million reads per sample.

ATAC-seq of sorted CD4+ T cells from colon lamina propria

ATAC-seq samples from isolated LPLs were prepared as outlined in a previous publication53. For each sample, 50,000 cells were lysed in cold lysis buffer containing 10 mM Tris-HCl, pH 7.4, 10 mM NaCl and 3 mM MgCl2, 0.1% Nonidet P40 substitute (all Sigma-Aldrich), and the nuclei were incubated for 2 h at 37 °C with 50 μl of TDE1/TD transposase reaction mix (Illumina). Tagmented DNA was then purified using the MinElute kit (Qiagen) and amplified under standard ATAC PCR conditions: 72 °C for 5 min; 98 °C for 30 s and thermocycling at 98 °C for 10 s, 63 °C for 30 s and 72 °C for 1 min for 12 cycles. Each 50 μl PCR reaction consisted of 10 μl Tagmented DNA, 10 μl water, 25 μl NEBNext High-Fidelity 2× PCR Master Mix (NEB), 2.5 μl Nextera XT V2 i5 primer and 2.5 μl Nextera XT V2 i7 primer (Illumina). Nextera XT V2 primers (Illumina) were used to allow larger-scale multiplexing. These sequences were ordered directly from Sigma (0.2 scale, cartridge) and were diluted to 100 μM with 10 mM Tris-EDTA buffer, pH8 (Sigma) and then to 25 μM with DEPC-treated water (Ambion) for use in the reaction. Following amplification, ATAC-seq libraries were cleaned up using 90 μl of AMPure XP beads (Beckman Coulter) and two 80% ethanol washes while being placed on a magnetic plate stand before being eluted in 1 mM (0.1×) Tris-EDTA buffer, pH8 (Sigma-Aldrich) diluted with DEPC-treated water (Ambion). ATAC-seq libraries were then checked on the TapeStation/BioAnalyser (Agilent) before being sequenced on the HiSeq 4000 system (Illumina), with paired-end read lengths of 50 bp and at least 50–80 million uniquely mapped reads per sample.

Statistical analysis

All figure legends show the number of independent biological experiments performed for each analysis and all replicates. Flow cytometry percentages and associated cell numbers were analyzed as a one-way ANOVA with Tukey’s multiple comparisons test and 95% confidence intervals for statistical analysis. All statistical analyses, apart from sequencing, were carried out with Prism8 software (GraphPad), and the following P values were considered statistically significant: *P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001; ****P ≤ 0.0001. Analyses for RNA-seq and ATAC-seq data were performed with R version 3.6.1 and Bioconductor version 3.9. Analyses for scRNA-seq data were performed with R version 4.1 and Seurat version 4.1.1. Error bars and sample sizes used are described in the figure legends.
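As a small illustration of the significance thresholds listed above, the mapping from a P value to its asterisk label could be written as below. The function name and the "ns" label for non-significant values are assumptions for this sketch; the source defines only the four asterisk levels.

```python
def significance_label(p_value):
    """Map a P value to the asterisk notation used in the figures.

    Thresholds follow the text: * P <= 0.05, ** P <= 0.01,
    *** P <= 0.001, **** P <= 0.0001. "ns" is a hypothetical label.
    """
    thresholds = [(0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")]
    # Check the strictest threshold first so the strongest label wins.
    for cutoff, label in thresholds:
        if p_value <= cutoff:
            return label
    return "ns"
```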

RNA-seq data processing and analysis

For bulk tissue LPL RNA-seq, adaptors were trimmed using Skewer software version 0.2.2 (ref. 54) with the following parameters: '-m pe -q 26 -Q 28 -e -l 30 -L 100', specifying the relevant adaptor sequences. For sorted CD4+ T cell RNA-seq, adaptors were trimmed using FLEXBAR software55, as recommended by the manufacturer (NEB) when using the NEBNext Single Cell/Low Input RNA Library Prep Kit for Illumina. FLEXBAR was run following the provider’s suggested pipeline found at https://github.com/nebiolabs/nebnext-single-cell-rna-seq. For both bulk tissue LPL RNA-seq and sorted CD4+ T cell RNA-seq, reads were aligned to the mm10 genome and the GENCODE reference transcriptome version M22 using STAR software version 2.7.1 (ref. 56), excluding multi-mapping reads by setting the parameter 'outFilterMultimapNmax' to 1. To increase read mapping to novel junctions, the parameter 'twopassMode' was set to 'Basic'. Raw gene counts were retrieved using QoRTs software version 1.1.8 (ref. 57). Normalized read counts were retrieved using DESeq2 version 1.24.0 (ref. 58) and were rlog transformed to visualize gene quantifications.

Differential gene expression of bulk tissue LPLs by RNA-seq

DESeq2 (ref. 58) was used to obtain DEGs for each of the four H. hepaticus-infected groups (Prdm1fl/flMaffl/fl, Prdm1fl/flCd4Cre, Maffl/flCd4Cre and Prdm1fl/flMaffl/flCd4Cre) against the uninfected Prdm1fl/flMaffl/fl control. A gene was considered to be statistically differentially expressed if the fold change was ≥1.5 and the BH-adjusted P value was <0.05, resulting in:

Prdm1fl/flMaffl/fl infected vs uninfected Prdm1fl/flMaffl/fl: 5 DEGs

Prdm1fl/flCd4Cre vs uninfected Prdm1fl/flMaffl/fl: 1,207 DEGs

Maffl/flCd4Cre vs uninfected Prdm1fl/flMaffl/fl: 1,740 DEGs

Prdm1fl/flMaffl/flCd4Cre vs uninfected Prdm1fl/flMaffl/fl: 3,392 DEGs.
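The DEG criteria above (fold change ≥1.5 and BH-adjusted P < 0.05) can be illustrated with a short, self-contained sketch. It is not the DESeq2 implementation: the function names, gene names and numbers are hypothetical, the fold-change threshold is assumed to apply in both directions, and DESeq2's independent filtering is not reproduced.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_degs(genes, log2_fold_changes, p_values, min_fold=1.5, alpha=0.05):
    """Keep genes with |fold change| >= min_fold and adjusted P < alpha."""
    min_log2 = math.log2(min_fold)  # a 1.5-fold change is about 0.585 in log2
    padj = benjamini_hochberg(p_values)
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fold_changes, padj)
        if abs(lfc) >= min_log2 and q < alpha
    ]


# Made-up example values, for illustration only.
genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
lfc = [2.0, 0.3, -1.2, 0.9]
pvals = [0.001, 0.0005, 0.02, 0.2]
degs = call_degs(genes, lfc, pvals)
```

In this toy input, gene_b fails the fold-change criterion and gene_d fails the adjusted P value criterion, so only gene_a and gene_c are retained.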

Cell enrichment and biological pathway annotation

The identified DEGs (in any condition) were subjected to k-means clustering using k = 9; the expression values for the DEGs were standardized into z-scores and visualized in a heatmap (Fig. 1f). To provide a biological interpretation of these clusters, each cluster was subjected to 'cell-type enrichment' and 'biological pathways' annotation. The cell-type enrichment analysis used the cell-type signatures from a previous publication51, and a Fisher’s exact test was performed to identify cell-type signatures enriched in each of the clusters. Adjusted P values were obtained using the BH correction. Cell-type signatures that were statistically significantly enriched (adjusted P < 0.05) are shown in Fig. 1g. The R package topGO59 was used to obtain the enriched biological processes in each cluster (Extended Data Fig. 1e). Additionally, an Ingenuity Pathway Analysis (IPA) 'core analysis' (Qiagen, www.qiagen.com/ingenuity) was performed to identify the IPA pathways enriched in each cluster. The Prdm1fl/flMaffl/flCd4Cre versus uninfected Prdm1fl/flMaffl/fl expression values served as input for the calculation of the z-scores in the bar plots depicted in Extended Data Fig. 1e, as this was the comparison that resulted in the largest number of DEGs.
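To make the enrichment step concrete, a Fisher’s exact test for over-representation of a cell-type signature within a cluster can be sketched as a hypergeometric tail probability. The one-sided alternative, the choice of gene universe and the function name are assumptions for this sketch; the source does not specify them.

```python
from math import comb


def enrichment_p_value(cluster_size, signature_size, overlap, universe_size):
    """One-sided Fisher's exact test (over-representation).

    Returns P(X >= overlap), where X is the number of signature genes found
    in a cluster of cluster_size genes drawn from universe_size genes, of
    which signature_size belong to the signature.
    """
    max_overlap = min(cluster_size, signature_size)
    total = comb(universe_size, cluster_size)
    # Sum the hypergeometric probabilities for all overlaps at least as large.
    tail = sum(
        comb(signature_size, k)
        * comb(universe_size - signature_size, cluster_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy example: 3 of 3 cluster genes fall in a 3-gene signature, 10-gene universe.
p_toy = enrichment_p_value(cluster_size=3, signature_size=3, overlap=3, universe_size=10)
```

The resulting P values, one per cluster and signature, would then be BH-adjusted before applying the 0.05 cut-off.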

Differential gene expression in sorted colonic CD4+ T cells

DESeq2 (ref. 58) was used to obtain DEGs for the uninfected Prdm1fl/flMaffl/flCd4Cre and each of the four H. hepaticus-infected groups (Prdm1fl/flMaffl/fl, Prdm1fl/flCd4Cre, Maffl/flCd4Cre and Prdm1fl/flMaffl/flCd4Cre) against the uninfected Prdm1fl/flMaffl/fl control (Extended Data Fig. 6b). A gene was considered to be statistically differentially expressed if the fold change was ≥1.5 and the BH-adjusted P value was <0.05.

ATAC-seq data processing and analysis

Paired-end ATAC-seq reads from sorted CD4+ T cells were quality controlled and adaptors were trimmed using Skewer software version 0.2.2 (ref. 54) with the following parameters: '-m pe -q 26 -Q 30 -e -l 30 -L 50', specifying 'CTGTCTCTTATACAC' as the reference adaptor sequence to remove. The reads were then aligned to the mm10 genome using BWA-MEM60, duplicate reads were removed with Picard61, and SAMtools 1.3.1 (ref. 62) was used to discard discordant alignments and/or alignments with low mapping quality (mapQ < 30). To account for transposase insertion, reads were shifted +4 bp in the forward and −5 bp in the reverse strand; moreover, read-pairs that spanned >99 bp were excluded from further analyses as they would span nucleosomes53. MACS2 (version 2.1.1) was used to identify ATAC-seq peaks using the following parameters: '--keep-dup all --nomodel --shift -100 --extsize 200', with q-value < 0.01, to identify enrichment of Tn5 insertion sites63. DiffBind software version 2.0.2 (ref. 64) was used to generate raw counts underlying each ATAC-seq peak. Furthermore, batch correction was performed on raw counts using the RUVSeq R package65 to remove batch effects that resulted from independent experiments. BeCorrect software66 was used to generate batch-corrected bigwig files, using the outputs from RUVSeq. The resulting batch-corrected and normalized counts were used for visualization. The R package ggbio was used to plot the genome browser tracks67. To identify differentially accessible sites of interest for each genotype, differentially accessible sites were subjected to k-means clustering using k = 7; the normalized read counts for the differentially accessible sites were standardized into z-scores and visualized in a heatmap.
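The Tn5 offset correction and the fragment-span filter described above can be sketched as two small helpers. The function names are hypothetical, and the sketch assumes that the shift is applied to the 5′ end of each read and that the span is taken from the absolute template length; the coordinate convention used in the actual pipeline is not stated in the source.

```python
def tn5_insertion_site(five_prime_pos, strand):
    """Shift a read's 5' position to the estimated Tn5 insertion site.

    Forward-strand reads are shifted by +4 bp and reverse-strand reads
    by -5 bp, as stated in the text.
    """
    if strand == "+":
        return five_prime_pos + 4
    if strand == "-":
        return five_prime_pos - 5
    raise ValueError("strand must be '+' or '-'")


def keep_fragment(template_length, max_span=99):
    """Keep read-pairs spanning at most max_span bp (sub-nucleosomal).

    The template length may be negative for the reverse mate, so the
    absolute value is used.
    """
    return abs(template_length) <= max_span
```

For example, a forward read starting at position 1,000 would be assigned to position 1,004, and a read-pair spanning 150 bp would be excluded.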

ChIP-seq data processing and analysis

Publicly available c-Maf ChIP-seq raw fastq files were obtained from GSE40918 (ref. 16) and Blimp-1 ChIP-seq raw fastq files were obtained from GSE79339 (ref. 68). Trimmomatic (version 0.36) was used for quality control and to trim adaptor sequences using the following parameters: 'HEADCROP:2 TRAILING:25 MINLEN:26' (ref. 69). Trimmed reads were aligned to the mouse genome mm10 with Bowtie version 1.1.2 (ref. 70), with the parameters: '-y -m 2 --best --strata -S'. MACS2 (version 2.1.1) was used with default parameters to identify ChIP-seq peaks, and peaks with a q-value of <0.01 were defined as statistically significant binding sites. 'bamCoverage' from DeepTools (version 2.4.2) was used to normalize ChIP-seq data to RPKMs, and the R package ggbio was used to visualize the genome browser tracks67 together with the ATAC-seq data.
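For reference, per-bin RPKM normalization of coverage follows the standard definition: reads in the bin divided by the product of mapped reads (in millions) and bin length (in kb). The sketch below shows that arithmetic only; the function name, the 50 bp bin and the read totals are illustrative assumptions, not values from this study.

```python
def rpkm(reads_in_bin, total_mapped_reads, bin_length_bp):
    """Reads per kilobase per million mapped reads for one genomic bin."""
    millions_mapped = total_mapped_reads / 1e6
    bin_length_kb = bin_length_bp / 1e3
    return reads_in_bin / (millions_mapped * bin_length_kb)


# Illustrative numbers: 50 reads in a 50 bp bin, 20 million mapped reads.
example_rpkm = rpkm(reads_in_bin=50, total_mapped_reads=20_000_000, bin_length_bp=50)
```

Normalizing in this way makes tracks from libraries of different sequencing depth comparable on the same axis.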

IPA pathways

TH1 and TH17 pathways were constructed in and obtained from the IPA signaling pathways library. Log2(fold changes) from the differential expression analyses outlined above were overlaid on the TH1 and TH17 pathways, all with a fixed scale of −5 (blue) to 3.5 (red).

scRNA-seq data processing and analysis

Fastq files were aligned to the mm10 transcriptome, and count matrices were generated, filtering for GEM cell barcodes (excluding GEMs with free-floating mRNA from lysed or dead cells) using Cell Ranger (version 6.1.2). Count matrices were imported into R and processed using the Seurat library (version 4.0) following the standard pipeline71. Low-quality cells were removed, with cells kept for further analysis if they met the following criteria: the mitochondrial content was within three standard deviations from the median, more than 500 genes were detected and more than 1,000 RNA molecules were detected. DoubletFinder was used to identify doublets, assuming a theoretical doublet rate of 7.5% (ref. 72). All samples were integrated using the CCA method, implemented by Seurat’s functions FindIntegrationAnchors() and IntegrateData(), using the top 10,000 variable features and the first 50 principal components. A total of 23 clusters were identified from the integrated dataset and were annotated with the scMCA R library (version 0.2.0) using the single-cell Mouse Cell Atlas as a reference (ref. 73) and the clustifyr R library using the Immgen reference dataset (version 1.5.1). A final manually curated annotation was assigned to clusters based on scMCA and clustifyr results, and this resulted in the annotation of 17 distinct cell clusters. Marker genes for these cell clusters were identified using a Wilcoxon rank-sum test, comparing each cluster to all other clusters, and statistically significant genes (adjusted P < 0.05 and log2(fold change) > 0) were kept for further analysis.
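The cell-level quality-control criteria above can be sketched as a simple filter. This is an illustration in plain Python rather than the Seurat code used in the study: the field names are hypothetical, the sample standard deviation is assumed, and 'within three standard deviations from the median' is interpreted as an absolute deviation in either direction.

```python
from statistics import median, stdev


def filter_cells(cells, n_sd=3, min_genes=500, min_umis=1000):
    """Return the cells passing the three QC criteria described in the text.

    cells: list of dicts with hypothetical keys 'id', 'percent_mito',
    'n_genes' and 'n_umis'.
    """
    mito = [cell["percent_mito"] for cell in cells]
    centre = median(mito)
    spread = stdev(mito)  # sample standard deviation (an assumption)
    kept = []
    for cell in cells:
        mito_ok = abs(cell["percent_mito"] - centre) <= n_sd * spread
        genes_ok = cell["n_genes"] > min_genes
        umis_ok = cell["n_umis"] > min_umis
        if mito_ok and genes_ok and umis_ok:
            kept.append(cell)
    return kept


# Made-up data: 18 good cells, one with too few genes, one with high mito content.
cells = [
    {"id": f"cell_{i}", "percent_mito": 5.0, "n_genes": 2000, "n_umis": 6000}
    for i in range(18)
]
cells.append({"id": "low_genes", "percent_mito": 5.0, "n_genes": 300, "n_umis": 6000})
cells.append({"id": "high_mito", "percent_mito": 50.0, "n_genes": 2000, "n_umis": 6000})
kept = filter_cells(cells)
```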

CellChat analysis

Cell-to-cell crosstalk was inferred using the R library CellChat (version 1.1.3)74 and the CellChat mouse database. The CellChat analysis was performed as outlined in the CellChat software manual, with the 'population.size' parameter set to TRUE when computing the communication probability between clusters.

Human IBD RNA expression data analysis

Publicly available human IBD RNA expression datasets were obtained from GSE193677 (adult38) and GSE126124 (paediatric75) and were downloaded and RMA-normalized using the limma package (version 3.50.0). From both datasets, only colon biopsies from healthy controls and colon biopsies from untreated patients were used for further analysis. Normalized log2 expression values of the top 10,000 genes based on variances per dataset were used as input to WGCNA (version 1.72.1)76. Gene set modules were detected using a minimum module size of 30, and a deep.split of 2 for both datasets. This resulted in 12 modules for GSE193677 and 18 modules for GSE126124. Gene Ontology biological processes enriched in the modules were annotated using clusterProfiler (version 4.0.5). Using clusterProfiler results and manual curation, a final annotation for some key modules was assigned. Furthermore, modules were tested with Qusage (version 2.26.0) using normalized log2 expression values as input for reciprocal datasets; statistically significant modules (adjusted P < 0.05) were plotted using the ggcorrplot function in R. Genes within the modules derived from WGCNA were converted to mouse gene symbols using the biomaRt (version 2.56.1) R library, and genes not expressed in the mouse scRNA-seq dataset generated herein were filtered out. The remaining WGCNA module genes were scored in our mouse scRNA-seq dataset using the AddModuleScore() function implemented in the Seurat R library.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
