Single Cell Atlas: a single-cell multi-omics human cell encyclopedia

An overview of the multi-omics healthy human map

We conducted integrative assessments of eight omics types from 125 adult and fetal tissues from published resources and constructed a comprehensive single-cell multi-omics healthy human map termed SCA (Fig. 1). Each tissue consisted of at least two omics types, with the colon having the full spectrum of omics layers, which allowed us to investigate extensively the key mechanisms in each molecular layer of colonic tissue. Organs and tissues with at least five omics layers included colon, blood (whole blood and PBMCs), skin, bone marrow, lung, lymph node, muscle, spleen, and uterus (Additional file 2: Table S1). Overall, the scRNA-seq data set contained the highest number of matching tissues between adult and fetal groups, which allowed us to study the developmental differences between their cell types. For scRNA-seq data, the majority of the sample matrices retrieved from published studies had already undergone filtering to eliminate background noise, including low-quality cells that were most probably empty droplets. However, some downloaded samples retained their raw matrix form, which contained a significant amount of background noise. Consequently, before proceeding with any additional QC filtering, we standardized all scRNA-seq data inputs to the filtered matrix format, ensuring that all samples underwent the removal of background noise before further processing (Additional file 2: Table S2). This preprocessing step resulted in the removal of 61,774,307 cells out of the original 67,674,775 cells in the downloaded scRNA-seq dataset, leaving us with 5,900,468 cells for subsequent QC filtering. Strict QC was then carried out to filter debris, damaged cells, low-quality cells, and doublets for single-cell omics data [34], as well as low-quality samples for bulk omics data. After QC filtering, 3,881,472 high-quality cells were obtained for scRNA-seq; 773,190 cells for scATAC-seq; 209,708 cells for multimodal scImmune profiling with scRNA-seq data; 2,278,550 cells for CyTOF; and 192,925,633 cells for flow cytometry data. For scImmune profiling alone, clonotypes with missing CDR3 sequences and amino acid information were filtered, leaving 167,379 unique clonotypes across 21 tissues in the TCR repertoires and 16 tissues in the BCR repertoires. For RNA-seq and WGS, 163 samples with severe autolysis were removed, leaving 16,704 samples for RNA-seq and 837 for genotyping data.
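The scRNA-seq cell counts quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that only reuses the numbers stated in the text; the variable names and the helper `retained_fraction` are illustrative and not part of the SCA pipeline.

```python
# Cell counts quoted in the text for the scRNA-seq layer.
downloaded_cells = 67_674_775       # cells in the matrices as downloaded
removed_as_background = 61_774_307  # removed when standardizing to filtered matrices
after_qc = 3_881_472                # high-quality cells after QC filtering

# Cells that enter QC filtering after background removal.
after_background_removal = downloaded_cells - removed_as_background


def retained_fraction(kept: int, total: int) -> float:
    """Fraction of `total` that survives a filtering step (illustrative helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return kept / total


print(after_background_removal)  # 5900468
print(f"{retained_fraction(after_background_removal, downloaded_cells):.1%}")
print(f"{retained_fraction(after_qc, after_background_removal):.1%}")
```

The subtraction reproduces the 5,900,468 cells reported as input to QC, and the two fractions show how much of the data each filtering stage keeps.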

Fig. 1

A multi-omics healthy human single-cell atlas. Circos plot depicting the tissues present in the atlas. Tissues belonging to the same organ were placed under the same cluster and marked with the same color. Circles and stars represent adult and fetal tissues, respectively. The size of a circle or a star indicates the number of its omics data sets present in the atlas. The intensity of the heatmap in the middle of the Circos plot represents the cell count for single-cell omics or the sample count for bulk omics. The bar plots on the outer surface of the Circos represent the number of cell types in the scRNA-seq tissues (in blue) or the number of samples in bulk RNA-seq tissues (in red)

Single-cell RNA-sequencing analysis of adult and fetal tissues revealed cell-type-specific developmental differences

In total, out of the 125 adult and fetal tissues from all omics types, the scRNA-seq molecular layer in the SCA consisted of 92 adult and fetal tissues (Additional file 1: Fig. S1, Additional file 2: Table S1), spanning almost all organs and tissues of the human body. We profiled all cells from scRNA-seq data and annotated 417 cell types at fine granularity, which we categorized into 17 major cell type classes (Fig. 2A). Across tissues, most contained stromal cells, endothelial cells, monocytes, epithelial cells, and T cells (Fig. 2A). Across the cell type classes, epithelial cells constituted the highest cell count proportions, followed by stromal cells, neurons, and immune cells (Fig. 2A). For adult tissues, most of the cells were epithelial cells, immune cells, and endothelial cells, whereas in fetal tissues, stromal cells, epithelial cells, and hematocytes constituted the largest cell type class proportions. We carried out integrative assessments of these 92 scRNA-seq tissues (Figs. 2 and 3) to study cellular heterogeneities in different developmental stages of the tissues.

Fig. 2

scRNA-seq integrative analysis revealed similarity and heterogeneity between adult and fetal tissues. A Clustering of the 417 cell types from scRNA-seq data, consisting of 92 tissues based on their cell type proportion within each tissue group. Cell types were colored based on the cell type class indicated in the legend. The numbers in brackets represent the cell number within the tissue group. B UMAP of the cells present in the 92 adult and fetal tissues from scRNA-seq data, colored based on their cell type class. C Phylogenetic tree of the adult (left) and fetal (right) cell types. Clustering was performed based on their top regulated genes. The color represents the cell type class. Distinct clusters are outlined in black and labeled

Fig. 3

In-depth assessment of the integrated scRNA-seq further revealed inter- and intra-group similarities between adult and fetal tissues. A Chord diagrams of the highly correlated (AUROC > 0.9) adult and fetal cell types. Each connective line in the middle of the diagrams represents the correlation between two cell types. The color represents the cell type class. B Top receptor-ligand interactions between cell type classes in adult tissues (left) and fetal tissues (right). Color blocks on the outer circle represent the cell type class, and the color in the inner circle represents the receptor (blue) and ligand (red). Arrows indicate the direction of receptor-ligand interactions. C 3D tSNE of the integrative analysis between scRNA-seq and bulk RNA-seq tissues. The colors of the solid dots represent cell types in scRNA-seq data, and the colors of the spheres represent tissues of the bulk data. T indicates the T cell cluster, and B indicates the B cell cluster. D Heatmap showing the top DE genes in each cell type class of the adult and fetal tissues. Scaled expression values were used. Color blocks on the top of the heatmap represent cell type classes. Red arrows indicate the selected cell type classes for subsequent analyses. E Top significant GO BP and KEGG pathways for the cell type classes in adult and fetal tissues. The size of the dots represents the significance level. The color represents the cell type class

For each cell type, we performed differential expression (DE) analysis for each tissue to obtain the DE gene (DEG) signature for each cell type. We assessed the global gene expression patterns between cell types across the tissues based on their upregulated genes (Additional file 2: Table S3) for adult and fetal tissues (Fig. 2C, Additional file 1: Fig. S2). In adult tissues, immune cells (i.e., B, T, monocytes, and NK cells) with hematocytes, stromal cells, neurons, endothelial cells, and epithelial cells formed distinct cellular clusters (Fig. 2C, Additional file 1: Fig. S2A), demonstrating highly similar DEG signatures within each of these cell type classes, consistent with the clustering patterns in the previous scRNA-seq atlas [35]. In fetal tissues, segregation is comparatively less distinctive such that only a subgroup of epithelial cells formed a distinct cell type cluster, cells from the immune cell type classes as well as hematocytes coalesced to form another cluster, and stromal cells formed small clusters between other fetal cell types (Fig. 2C, Additional file 1: Fig. S2B), which could represent the similarity in gene expression with other cell types during lineage commitment of stromal cell differentiation [36].

We next investigated the underlying gene regulatory network (GRN) of the transcriptional activities of cell types across adult and fetal tissues [37]. We identified active transcription factors (TFs) detected for cell types within each tissue (AUROC > 0.1), and based on these TF signatures, we measured similarities between cell types for adult and fetal tissues (Additional file 1: Fig. S3). For adult tissues, clustering patterns similar to Additional file 1: Fig. S1A were observed (Fig. 2C, Additional file 1: Fig. S3A). In fetal tissues, two unique clusters, including immune cells with hematocytes and stromal cells, were observed (Additional file 1: Fig. S3B). Higher similarity in transcription regulatory patterns of stromal cells was observed compared to their gene expression patterns. The concordance between gene expression and transcription regulatory patterns within adult and fetal tissues demonstrated a direct and uniform interplay between the two molecular activities. In terms of the varying TF and DEG clustering patterns between adult and fetal tissues, the adult cell types demonstrated more similar transcriptional activities within the cell type classes than the less-differentiated fetal cell types, which shared more common transcriptional activities.

We dissected the correlation pattern of the clusters shown in Fig. 2C by drawing inferences from their highly correlated (AUROC > 0.9) cell-type pairs (Fig. 3A). Specifically, for the immune cluster in adult tissues, monocytes accounted for most of the high correlations within the immune cell cluster, followed by T cells (Fig. 3A). For fetal tissues, a high number of correlations was observed between the immune cells (i.e., mostly monocytes and T cells) and hematocytes (Fig. 3A), which explained the clustering pattern observed in fetal tissues (Fig. 2C). For fetal stromal cells, other than with their own cell types, large coexpression patterns were observed with the hematocytes and the epithelial cells, and a smaller proportion of correlations with other clusters (Fig. 3A), which accounted for the small clusters of stromal cells formed between other cell types (Fig. 2C, Additional file 1: Fig. S2B).
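The passage above does not spell out how the AUROC between two cell types is computed, so the following is only a generic sketch of the statistic, not the authors' procedure. It assumes that the upregulated-gene signature of one cell type is used as the positive set and that per-gene scores come from a second cell type; the gene names and values are toy data chosen for illustration.

```python
from typing import Dict, Set


def auroc(scores: Dict[str, float], positives: Set[str]) -> float:
    """Rank-based AUROC (Mann-Whitney form).

    Returns the probability that a randomly chosen positive gene has a
    higher score than a randomly chosen negative gene; ties count as 0.5.
    """
    pos = [s for gene, s in scores.items() if gene in positives]
    neg = [s for gene, s in scores.items() if gene not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): signature genes of one cell type,
# scored by their expression in a second cell type.
signature = {"CD14", "LYZ"}
expression_in_other_type = {
    "CD14": 5.0, "LYZ": 4.0, "CD3E": 0.5, "MS4A1": 0.1, "EPCAM": 0.0,
}
similarity = auroc(expression_in_other_type, signature)
print(similarity > 0.9)  # True: the pair would pass an AUROC > 0.9 cutoff
```

Under this reading, a pair of cell types would count as highly correlated when the statistic exceeds the 0.9 cutoff mentioned in the text.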

To describe possible cellular networking between the cell type class clusters in Fig. 2C, we inferred cell–cell interactions [38] based on their gene expression (Additional file 2: Table S4), and variations between adult and fetal tissues were observed (Fig. 3B). In adult tissues, many cell type classes displayed interactions with the neurons, which networked with epithelial cells through UNC5D/NTN1 interaction; with stromal cells through SORCS3/NGF; with T cells through LRRC4C/NTNG2; etc. (Fig. 3B). Among the top interactions of fetal tissues, monocytes actively networked with other cells, such as via CCR1/CCL7 with hematocytes, CSF1R/CSF1 with stromal cells, and FPR1/SSA1 with epithelial cells.

We performed a pseudobulk integrative analysis of the cell types of the scRNA-seq data from 19 tissues found in both adult and fetal tissues, with the 54 tissues from the bulk RNA-seq data (Fig. 3C) to compare single-cell tissues with the corresponding tissues in the bulk datasets. For cell types of scRNA-seq data, adult cell types formed distinct clusters of T cells, B cells, hematocytes, stromal cells, epithelial cells, endothelial cells, and neurons (Fig. 3C). Fetal cell types, by comparison, formed a unique cluster of cell types separating themselves from adult cell types. Internally, a gradient of cell types from brain tissues to cell types from the digestive system was observed in this fetal cluster. Fusing the bulk tissue-specific RNA-seq data sets with the pseudobulk scRNA-seq cell types gave close proximities of the bulk brain tissues with the pseudobulk brain-specific cell types, such as neurons and astrocytes (Fig. 3C). Bulk whole blood clustered with pseudobulk hematocytes, and bulk EBV-transformed lymphocytes clustered with pseudobulk B cells. Other distinctive clusters included bulk colon and small intestine clustered with pseudobulk colon- and small intestine-specific epithelial cells, and bulk heart clustered with pseudobulk cardiomyocytes and other muscle cells (Fig. 3C).
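As a rough illustration of what a pseudobulk profile is, the sketch below collapses per-cell counts into one profile per cell type so that it can be placed alongside bulk RNA-seq samples. Summing raw counts is an assumption (the text does not state whether sums or means were used), and the function name `pseudobulk` and the toy matrix are hypothetical.

```python
from typing import Dict, List, Sequence


def pseudobulk(
    counts: Sequence[Sequence[int]], labels: Sequence[str]
) -> Dict[str, List[int]]:
    """Sum per-cell gene counts within each cell type label.

    `counts` is a cells-by-genes matrix and `labels` gives the cell type
    of each row.
    """
    if len(counts) != len(labels):
        raise ValueError("one label is required per cell")
    profiles: Dict[str, List[int]] = {}
    for row, label in zip(counts, labels):
        if label not in profiles:
            profiles[label] = [0] * len(row)
        elif len(row) != len(profiles[label]):
            raise ValueError("all cells must have the same number of genes")
        # Add this cell's counts to the running total for its cell type.
        profiles[label] = [a + b for a, b in zip(profiles[label], row)]
    return profiles


# Toy matrix: three cells, three genes.
toy_counts = [[1, 0, 3], [2, 1, 0], [0, 4, 1]]
toy_labels = ["T cell", "T cell", "B cell"]
toy_profiles = pseudobulk(toy_counts, toy_labels)
print(toy_profiles)  # {'T cell': [3, 1, 3], 'B cell': [0, 4, 1]}
```

Each resulting vector has the same shape as a bulk sample, which is what makes a joint embedding of single-cell cell types and bulk tissues possible.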

Next, we conducted gene ontology (GO) of biological processes (BPs) and KEGG pathway analyses [39,40,41,42] of the top upregulated genes of each cell type class cluster (Fig. 3D) found in Fig. 2C. Multiple testing correction for each cell type class was performed using Benjamini & Hochberg (BH) false discovery rate (FDR) [43]. At 5% FDR and average log2-fold-change > 0.25 (ranked by decreasing fold-change), the top three most significant genes of the remaining cell type classes were each scanned through the phenotypic traits from 442 genome-wide association studies (GWAS) and the UK Biobank [44, 45] to seek significant genotypic associations of the top genes with diseases and traits. Notably, for GO pathways, the most significant BPs for B and T cells in both adult and fetal tissues were similar (Fig. 3E). In contrast, epithelial cells and neurons differ in their associated BPs between adult and fetal tissues. For KEGG pathways, adult and fetal tissues shared common top pathways in T cells and in epithelial cells (Fig. 3E). Among the top genotype–phenotype association results of the top genes (Additional file 1: Fig. S4), SNP rs2239805 in HLA-DRA of adult monocytes has a high-risk association with primary biliary cholangitis, which is consistent with previous studies showing associations of HLA-DRA or monocytes with the disease [46,47,48,49,50].
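The gene selection described above (BH correction, a 5% FDR cutoff, average log2-fold-change > 0.25, ranking by decreasing fold-change, top three genes) can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the authors' code: the function names are hypothetical, the inequalities are assumed to be strict, and the example genes and values are toy data.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def top_genes(
    genes: Sequence[str],
    pvalues: Sequence[float],
    log2fc: Sequence[float],
    fdr: float = 0.05,
    min_log2fc: float = 0.25,
    n: int = 3,
) -> List[str]:
    """Keep genes passing both cutoffs, ranked by decreasing fold-change."""
    qvalues = benjamini_hochberg(pvalues)
    kept = [
        (gene, fc)
        for gene, q, fc in zip(genes, qvalues, log2fc)
        if q < fdr and fc > min_log2fc
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [gene for gene, _ in kept[:n]]


# Toy input: g3 fails the fold-change cutoff, g4 fails the FDR cutoff.
selected = top_genes(
    ["g1", "g2", "g3", "g4"],
    [0.001, 0.002, 0.003, 0.5],
    [0.3, 1.2, 0.1, 2.0],
)
print(selected)  # ['g2', 'g1']
```

Applying the correction separately per cell type class, as the text states, would mean calling `top_genes` once for each class.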

Multimodal analysis of scImmune profiling with scRNA-sequencing in multiple tissues

To decipher the immune landscape at the cell type level in the scImmune profiling data, we carried out an integrative in-depth analysis of the immune repertoires with their corresponding scRNA-seq data. The overall landscape of the cell types mainly included clusters of naïve and memory B cells, naïve T/helper T/cytotoxic T cells, NK cells, monocytes, and dendritic cells (Fig. 4A) and mainly comprised immune repertoires from the blood, cervix, colon, esophagus, and lung (Additional file 1: Fig. S5). On a global scale, we examined clonal expansions [51, 52] in both T and B cells across all tissues. Here, we defined unique clonal types as unique combinations of VDJ genes of the T cell receptor (TCR) chains (i.e., alpha and beta chains) and immunoglobulin (Ig) chains on T cells and B cells, respectively. Integrating clonal type information from both the T and B cell repertoires with their scRNA-seq revealed sites of differential clonal expansion in various cell types (Fig. 4B and C, Additional file 1: Fig. S5). In T cell repertoires, high proportions of large or hyperexpanded clones were found in terminally differentiated effector memory cells reexpressing CD45RA (Temra) CD8 T cells [53, 54] and cytotoxic T cells, and a large proportion of them was found in the lung (Fig. 4C, Additional file 1: Fig. S5), which interplays with the highly immune-regulatory environment of the lungs to defend against pathogen or microbiota infections [55, 56]. MAIT cells [57, 58] also displayed large or high clonal expansion across tissues, especially in the blood, colon, and cervix (Additional file 1: Fig. S5A), where their main function is to protect the host from microbial infections and to maintain mucosal barrier integrity [58, 59]. In contrast, single clones were present mostly in naïve helper T cells and naïve cytotoxic T cells (Additional file 1: Fig. S5B) and were almost homogeneously distributed across tissues (Fig. 4C). This distribution ensures the availability of high TCR diversity to trigger a sufficient immune response to new pathogens [60]. For the B cell repertoire in blood, most of these immunocytes remained as single clones or small clones, with a small subset of naïve B cells and memory B cells exhibiting medium clonal expansion (Additional file 1: Fig. S5B).
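The clonotype definition above (a unique combination of V(D)J genes across receptor chains) and the expansion groups named in the text (single, small, medium, large, hyperexpanded) can be illustrated with a short sketch. The bin boundaries in `EXPANSION_BINS` are hypothetical placeholders, since the text does not give the cutoffs, and the second clonotype in the toy data is invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# A clonotype is represented as the tuple of gene segments across chains.
Clonotype = Tuple[str, ...]

# Hypothetical upper bounds (cells per clonotype) for each group; the
# actual cutoffs are not stated in the text.
EXPANSION_BINS = [(1, "single"), (5, "small"), (20, "medium"), (100, "large")]


def expansion_group(n_cells: int) -> str:
    """Map a clone size to an expansion group name."""
    if n_cells < 1:
        raise ValueError("a clonotype must contain at least one cell")
    for upper, name in EXPANSION_BINS:
        if n_cells <= upper:
            return name
    return "hyperexpanded"


def clonal_expansion(cells: Iterable[Clonotype]) -> Dict[Clonotype, str]:
    """Count cells per clonotype and assign each clonotype a group."""
    sizes = Counter(cells)
    return {clone: expansion_group(n) for clone, n in sizes.items()}


# Toy repertoire: one expanded clone and one singleton.
clone_a = ("TRAV17", "TRAJ49", "TRBV6-5", "TRBJ1-1")
clone_b = ("TRAV1-2", "TRAJ33", "TRBV20-1", "TRBJ2-1")  # invented example
groups = clonal_expansion([clone_a] * 25 + [clone_b])
print(groups[clone_a], groups[clone_b])  # large single
```

Joining these group labels back onto the cells' scRNA-seq annotations is what allows expansion to be summarized per cell type and per tissue.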

Fig. 4

Multi-modal analysis of scImmune profiling with scRNA-seq revealed a clonotype expansion landscape in six tissues. A tSNE of cell types from the multi-modal tissues of the scImmune-profiling data. Colors represent cell types. Cell clusters were outlined and labeled. B tSNE of cell types from the multi-modal tissues of the scImmune-profiling data. Colors indicate clonal-type expansion groups of the cells. Cells not present in the T or B repertoires are shown in gray (NA group). C Stacked bar plots revealing the clonal expansion landscapes of the T and B cell repertoires across 6 tissues. Colors represent clonal type groups. D Alluvial plot showing the top clonal types in T cell repertoires and their proportions shared across the cell types. Colors represent clonotypes. E Alluvial plot showing the top clonal types in B cell repertoires and their proportions shared across the cell types. Colors represent clonotypes

Among the top clones (Fig. 4D), TRAV17.TRAJ49.TRAC_TRBV6-5.TRBJ1-1.TRBD1.TRBC1 was present mostly in Temra CD8 T cells and shared the same clonal type sequence with cytotoxic T and helper T cells (Additional file 2: Table S5). This top clone was found to be highly represented in the lung, and comparatively, other large clones of CD8 T cells were found in the blood (Additional file 1: Fig. S5C). The top ten clones were found in Temra CD8 T cells of blood and lung tissues and cytotoxic T cells and helper T cells from blood, cervix, and lung tissues (Additional file 1: Fig. S5C). Some of them exhibited a high prevalence of cell proportions in Temra CD8 T cells (Fig. 4D). In the B cell repertoire of blood, the top clones were found only in naïve and memory B cells, with similar proportions for each of the top clones (Fig. 4E).

Multi-omics analysis of colon tissues across five omics data sets

To examine the phenotypic landscapes across omics layers and the interplays and transitions between them, we carried out an interrogative analysis of colon tissue across five omics data sets: scRNA-seq, scATAC-seq, spatial transcriptomics, RNA-seq, and WGS. In the overview of the transcriptome landscapes in adult and fetal colons (Fig. 5A and B), the adult colon consisted of a large proportion of immune cells (such as B cells, T cells, and macrophages) and epithelial cells (such as mucin-secreting goblet cells and enterocytes) (Fig. 5A). In contrast, the fetal colon contained a substantial proportion of mesenchymal stem cells (MSCs), fibroblasts, smooth muscle cells, neurons, and enterocytes and a very small proportion of immune cells (Fig. 5B).

Fig. 5

In-depth scRNA-seq analysis revealed distinct variations between adult and fetal colons. A tSNE of the adult colon; colors represent cell types. B tSNE of the fetal colon; colors represent cell types. C Heatmap showing the correlations of the cell types of the MSC lineage from adult and fetal colons based on their top upregulated genes. The intensity of the heatmap shows the AUROC level between cell types. Color blocks on the top of the heatmap represent classes (first row from the top), cell types (second row), and cell type classes (third row). D Heatmap showing the correlations of the cell types of the MSC lineage from adult and fetal colons based on the expression of the TFs. The intensity of the heatmap shows the AUROC level between cell types. Color blocks on the top of the heatmap represent classes (first row from the top), cell types (second row), and cell type classes (third row). E Pseudotime trajectory of the MSC lineage in the adult colon. The color represents the cell type, and the violin plots represent the density of cells across pseudotime. F Pseudotime trajectory of the MSC lineage in the fetal colon. The color represents the cell type, and the violin plots represent the density of cells across pseudotime. G Heatmap showing the pseudotemporal expression patterns of TFs in the lineage transition of MSCs to enterocytes in both adult and fetal colons. Intensity represents scaled expression data. The top 25 TFs for MSCs or their differentiated cells are labeled. H Pseudotemporal expression transitions of the top TFs in the MSC-to-enterocyte transitions for both adult and fetal colons. I Heatmap showing the pseudotemporal expression patterns of TFs in the lineage transition of MSCs to fibroblasts in both adult and fetal colons. Intensity represents scaled expression data. The top 25 TFs for MSCs or their differentiated cells are labeled. J Pseudotemporal expression transitions of the top TFs in the MSC-to-fibroblast transitions for both adult and fetal colons

As there were fewer immune cells observed in the fetal colon as compared to the adult colon, we compared the MSC lineage cell types between the two groups. Based on their differential gene expression signatures (Fig. 5C) and their TF expression (Fig. 5D), the highly specialized columnar epithelial cells, enterocytes, for both molecular layers correlated well between adult and fetal colons, unlike other cell types, which did not demonstrate high correlations between their adult and fetal cells. Other than the enterocytes, adult and fetal fibroblasts were highly similar to MSCs in both transcriptomic and regulatory patterns (Fig. 5C and D). We modeled pseudo-temporal transitions of MSC lineage cells, and similar phenomena were observed (Fig. 5E and F). Both adult and fetal fibroblasts were pseudotemporally closer to MSCs, and the transitions were much earlier than other cells. Analysis across regulatory, gene expression, and pseudotemporal patterns showed in both adult and fetal colons that fibroblasts were more similar to MSCs phenotypically, as shown in prior literature reports [61,62,63] and recently with therapeutic implications [64, 65]. In addition, transient phases of cells along the MSC lineage trajectory were observed for enterocytes and goblet cells (Fig. 5E and F), which demonstrated that these high plasticity cells were at different cell-state transitions before their full maturation, as evident in the literature [66, 67]. By contrast, the fetal intestine was more primitive than the adult intestine during fetal development, and as a key cell type in extracellular matrix (ECM) construction [68], fibroblasts displayed transitional cell stages of cells along the pseudotime trajectory (Fig. 5F).

Comparing regulatory elements of these transitions demonstrated similarities and differences (Fig. 5G–J, Additional file 1: Fig. S6). For MSC-to-enterocyte transitions (Fig. 5G, Additional file 2: Table S6), the leading TFs with significant pseudotemporal changes were labeled. The expression of E74-like ETS transcription factor 3 (ELF3), which belongs to the epithelium-specific ETS (ESE) subfamily [69], increased during the transition for both adult and fetal enterocytes (Fig. 5H, Additional file 2: Table S6) and, as previously demonstrated, is important in intestinal epithelial differentiation during embryonic development in mice [69, 70]. Conversely, high mobility group box 1 (HMGB1) [71] decreased pseudotemporally for both adult and fetal enterocytes (Fig. 5H, Additional file 2: Table S6) and has been shown to inhibit enterocyte migration [72]. The nuclear orphan receptor NR2F6, a non-redundant negative regulator of adaptive immunity [73, 74], displayed a comparative decline in expression halfway through the pseudotime transition for adult enterocytes but continued to increase for fetal enterocytes (Fig. 5H, Additional file 2: Table S6). Another TF from the ETS family, Spi-B transcription factor (SPIB), also showed differential expression during the transition between adult and fetal enterocytes (Fig. 5H): it was up-regulated in fetal enterocytes and down-regulated in adult enterocytes, suggesting a potential bi-functional role in enterocyte differentiation in the fetal-to-adult transition.

For MSC-to-fibroblast transitions (Fig. 5I, Additional file 2: Table S6), TFs such as ARID5B, FOS, FOSB, JUN, and JUNB displayed almost identical trajectory patterns between adult and fetal fibroblasts (Fig. 5J, Additional file 2: Table S6). Of these TFs, FOS, FOSB, JUN, and JUNB were shown to be absent in the healthy mucosa transcriptional networks [75], in line with their observations in Fig. 5J. By contr
