Celloscope: a probabilistic model for marker-gene-driven cell type deconvolution in spatial transcriptomics data

Celloscope model overview

We propose Celloscope, a novel Bayesian probabilistic graphical model of gene expression in ST data, which deconvolutes cell type composition in ST spots, and a method to infer model parameters based on an MCMC algorithm (Methods). Apart from gene expression measurements for each spot localized in the analyzed tissue, ST data also come with corresponding images of hematoxylin and eosin (H&E) staining of the same sample. The first step of Celloscope’s pipeline is an analysis of those H&E images (Fig. 1A). In this step, the total number of cells for each spot is estimated. Optionally, regions of interest in the tissue (e.g., an inflammation) are annotated by a pathologist, so that further analysis steps are restricted to these regions.

Fig. 1

Celloscope overview. A The total number of cells for each spot is estimated based on an H&E image. Optionally, regions of interest are annotated. H&E image source: 10x Genomics. B Prior knowledge on marker genes is given as the model’s input, in the form of a binary matrix, together with ST data on marker gene expression in spots (C). D The graphical representation of Celloscope. Gray nodes correspond to the observed variables, double-circled nodes to deterministic ones, while the remaining nodes correspond to hidden variables. Arrows represent probabilistic dependencies. The model’s variables are described in Table 1 and hyperparameters in Table 2. E Cell type decomposition in each spot using Celloscope is performed via MCMC inference

Prior knowledge of marker genes for considered cell types is encoded as a binary matrix (Fig. 1B). The types correspond to such cells that are expected to be found in the examined tissue or (optionally) in the selected area of interest. Additionally, we assume the presence of a dummy type accounting for novel or unknown cell types. The dummy type is characterized by zero marker genes.
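The binary encoding described above can be illustrated with a minimal sketch. The cell type names, gene names, and the helper build_marker_matrix are hypothetical and serve only as an illustration; they are not taken from Celloscope's code or data. The sketch assumes that rows correspond to genes and columns to cell types, with the dummy type appended as an all-zero column.

```python
# Hypothetical marker lists; names are placeholders, not real markers.
markers = {
    "type_1": ["gene_a", "gene_b"],
    "type_2": ["gene_c"],
    "type_3": ["gene_d", "gene_e"],
}

def build_marker_matrix(markers, dummy_name="DT"):
    """Return (genes, types, B), where B[g][t] == 1 iff gene g marks type t.

    The dummy type is appended last and has no marker genes,
    so its column contains only zeros.
    """
    types = list(markers) + [dummy_name]
    genes = sorted({g for gene_list in markers.values() for g in gene_list})
    matrix = [
        [1 if g in markers.get(t, []) else 0 for t in types]
        for g in genes
    ]
    return genes, types, matrix

genes, types, B = build_marker_matrix(markers)
for gene, row in zip(genes, B):
    print(gene, row)
```

In this toy example every gene marks exactly one type, which corresponds to fully exclusive marker sets; shared markers would simply produce rows with more than one entry equal to 1.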

The estimated cell counts and the binary matrix encoding prior knowledge of marker genes, together with the measured expression of the marker genes in each (selected) spot (Fig. 1C) constitute the input to the probabilistic model (Fig. 1D). The model assumes that the measured expression depends on the hidden cell type mixture in each spot. As the output, Celloscope returns the proportions of cell types for each spot of interest (Fig. 1E).

Celloscope’s results on simulated data prove exceptional performance of marker gene-based cell type deconvolution

To demonstrate the excellent performance of Celloscope in marker gene-based cell type deconvolution, we tested the model on different setups of simulated data, for which we knew the ground truth about the underlying cell type composition, and were therefore able to compare the true cell type proportions across spots with the model’s estimates. We consider twelve distinct simulation scenarios, which differ by two key simulation parameters. The first simulation parameter is the average number of cell types present within each spot, with the dense scenario implying an increased number and the sparse scenario denoting a decreased number of cell types per spot. The second simulation parameter controls what is given as input to Celloscope and how this information is accounted for in the model. By default, Celloscope takes cell counts in each spot and sets them as priors for the hidden variable \(N_s\) (the total number of cells in spot s). Moreover, by default, Celloscope considers that the expression levels of marker genes in each cell type are unknown and are modeled using hidden variables \(\Lambda\). In our simulations, we investigate the performance under these default settings, together with deviations from these settings. Specifically, we vary whether the number of cells in each spot is treated as known (i.e., given as observed variables to the model, fixing variables \(N_s\)), as noisy priors (i.e., introduced as informative priors for the hidden variables \(N_s\), with two levels of noise added: moderate or high), or as unknown (i.e., not given as input to the model at all; in this case uninformative priors for \(N_s\) are used), and finally, whether the expression values \(\Lambda\) are provided as known and given as observed variables to the model. For each scenario, 15 datasets were simulated, assuming 800 spots, 7 cell types (including the dummy type), and 149 marker genes.

We first computed the average absolute error across spots for each replicate. For a single spot, the absolute error value ranges between 0 (when the model perfectly predicts the simulated fractions of the spot occupied by each cell type) and 2/T, where T denotes the number of types. The upper bound for this error can easily be seen by considering a case when the simulated fractions would be such that the spot would be fully dominated by one cell type with fraction 1, while the model would predict full domination by a different cell type. Thus, for \(T=7\), we expect that the average absolute error across spots will be in the range [0, 0.29]. In this evaluation, Celloscope achieved excellent performance, with error levels between 0.01 and 0.03 (Fig. 2A) for all the aforementioned simulation settings, proving that knowledge of marker genes for cell types is sufficient for accurate cell type deconvolution in ST spots. The default Celloscope version achieved a median error of only around 0.025 for the dense simulation scenario and around 0.027 for the sparse scenario. The additional information on the number of cells in each spot increased the accuracy of the model’s prediction.
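The bounds of this error measure can be checked with a short sketch. Eq. 16 is not reproduced in this section, so the form used here, the absolute differences between true and estimated proportions averaged over the T cell types, is an assumption chosen to be consistent with the stated range from 0 to 2/T. The function names are hypothetical.

```python
def spot_abs_error(true_props, est_props):
    """Absolute error for one spot, averaged over the T cell types (assumed form)."""
    num_types = len(true_props)
    return sum(abs(t - e) for t, e in zip(true_props, est_props)) / num_types

def average_abs_error(true_matrix, est_matrix):
    """Average of the per-spot errors across all spots."""
    errors = [spot_abs_error(t, e) for t, e in zip(true_matrix, est_matrix)]
    return sum(errors) / len(errors)

T = 7
true_spot = [1.0] + [0.0] * (T - 1)        # spot fully dominated by type 0
wrong_spot = [0.0, 1.0] + [0.0] * (T - 2)  # model predicts full domination by type 1

best = spot_abs_error(true_spot, true_spot)    # perfect prediction
worst = spot_abs_error(true_spot, wrong_spot)  # worst case, equal to 2 / T
print(best, round(worst, 2))
```

Under this assumed form, the worst case contributes an absolute difference of 1 for each of the two confused types and 0 for the rest, which gives 2/T.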

Fig. 2

Excellent performance of Celloscope on simulated data. A Box plots represent distributions of the average absolute error (y-axis; computed using Eq. 16) for different methods (colors) and data simulation scenarios (x-axis). Celloscope outperforms competing methods, stereoscope [8], RCTD [7], and SpatialDWLS [13] which rely on additional input regarding gene expression levels in different cell types. Moreover, Celloscope is robust to noise in cell counting results and its performance remains satisfactory in case of no prior knowledge about the number of all cells in ST spots. B Distribution of the fraction of correctly identified dominant cell types across spots (y-axis), for different methods (colors) and simulation scenarios (x-axis). Celloscope again shows a large advantage over other methods. Here, we compare also to CellAssign [21], indicating the benefit from decomposing a mixture of different cell types in ST spots. C The impact of a lack of exclusivity in marker gene sets for cell types. Overlap in marker gene sets does not affect Celloscope’s performance in case of dense data and the performance remains satisfactory for sparse data. D BayesPrism [15] requires at least 50 genes per type to perform inference successfully

Celloscope is robust to noise in cell counts estimation results

Celloscope proved to be highly robust to noise in the input total number of cells in each ST spot (Fig. 2A). We considered two intensities of Gaussian noise, moderate N(2, 3) and high N(5, 5), that were randomly added to or subtracted from the true cell counts (which on average were equal to 15). In the case of the moderate noise, for both sparse and dense simulation scenarios, Celloscope achieved accuracy comparable to the default case of input cell counts without noise, with an average error of circa 0.025 for both scenarios. Celloscope’s performance remained satisfactory in the case of high noise, with an average error of 0.033 for the dense and 0.043 for the sparse scenario. Moreover, we tested Celloscope’s performance in the event of a lack of prior knowledge on cell counts; when the model is a priori completely unaware of the number of cells in each ST spot, Celloscope achieved satisfactory accuracy with an average error of 0.035 for the dense scenario and 0.044 for the sparse scenario. Due to technical difficulties with H&E image quality or overlapping cell nuclei, the counts estimated for real data may be noisy, and the demonstrated robustness of Celloscope to noise in the input cell counts per spot is an important advantage.
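The noise scheme described above can be sketched as follows. Several details are assumptions made for illustration only: the second parameter of N(2, 3) and N(5, 5) is read as a standard deviation, the noisy counts are rounded to integers and clipped at zero, and every spot starts from exactly 15 cells. The helper names are hypothetical.

```python
import random

def perturb_counts(counts, mean, sd, rng):
    """Add or subtract (with equal probability) a Gaussian magnitude from each count.

    Rounding and clipping at zero are assumptions of this sketch.
    """
    noisy = []
    for n in counts:
        magnitude = rng.gauss(mean, sd)
        sign = rng.choice((-1, 1))
        noisy.append(max(0, round(n + sign * magnitude)))
    return noisy

def mean_abs_deviation(noisy, true):
    """Average absolute difference between noisy and true counts."""
    return sum(abs(a - b) for a, b in zip(noisy, true)) / len(true)

rng = random.Random(0)                 # fixed seed for reproducibility
true_counts = [15] * 800               # 800 spots, 15 cells each (simplification)
moderate = perturb_counts(true_counts, 2, 3, rng)
high = perturb_counts(true_counts, 5, 5, rng)

print(mean_abs_deviation(moderate, true_counts))
print(mean_abs_deviation(high, true_counts))
```

The noisy counts would then play the role of informative priors for \(N_s\), while the true counts remain the ground truth used in the simulation.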

Knowledge of gene expression profiles of cell types does not have a large impact on model accuracy

We next assessed the hypothetical improvement in accuracy coming from including gene expression profiles of cell types obtained from an additional, external data source, such as a scRNA-seq reference dataset. To this end, instead of estimating hidden variables \(\Lambda\), as in the default setting, we ran Celloscope with \(\Lambda\) set to their true, simulated values (Fig. 2A). In the case of the dense scenario, the performance stayed at a level comparable to Celloscope’s default setting. However, we observed some error drop for the sparse scenario (from median 0.027 for the default to 0.022). Still, this improvement was smaller than the one obtained from knowing the cell counts per spot (i.e., when the corresponding hidden variables were fixed to their true values). Note that the evaluated setup did not account for the fact that the expression profiles obtained from external data sources may have technical bias and noise. In such a case, these observations could even deteriorate the model’s performance. These results indicated that by accounting for knowledge of marker genes for each cell type and estimating the unknown expression profiles for cell types as a part of the model inference, Celloscope deals well with the lack of external data sources such as a reference RNA-seq dataset.

Overlapping marker gene sets do not affect Celloscope’s performance on simulated data

We further assessed whether Celloscope’s performance is sensitive to the lack of exclusivity in marker gene sets for the different cell types (Fig. 2C). To this end, we fixed the total number of marker genes to 149, as in all other simulation settings, and ran Celloscope in three scenarios: (1) exclusivity, with no intersections between the sets of marker genes for cell types; (2) slight overlap, with 9 marker genes shared between cell types 2 and 3, as well as 10 shared between 3 and 4; (3) high overlap, in addition to the overlap from (2), with 17 genes shared between types 5 and 4, resulting in 4 out of 6 (not counting the dummy type, which has no markers) cell types vaguely defined. For the dense simulation scenarios, Celloscope is extremely robust to increasing overlap in the marker genes. For the sparse scenarios, Celloscope’s error increases with the amount of overlap, but only slightly, reaching a median error of around 0.041 for the high overlap simulation setting.

Celloscope performs favorably over preceding approaches

We compared Celloscope’s performance to the previously published Stereoscope [8], RCTD [7], and SpatialDWLS [13]. These methods, unlike Celloscope, require a reference scRNA-seq dataset to compute gene expression profiles for the analyzed cell types. Importantly, all considered methods were applied to exactly the same simulated datasets, for which the ground truth was known. Thus, we were in possession of the true marker gene expression levels across cell types, which were provided to RCTD, Stereoscope, and SpatialDWLS [13], and used by these methods for the estimation of cell type proportions in simulated spots (see Additional file 1: Sections S1, S2, and S3 for run settings used for RCTD, Stereoscope, and SpatialDWLS, respectively). These values were not provided to Celloscope, but inferred. Therefore, RCTD, Stereoscope, and SpatialDWLS were given a head start as compared to our model.

Despite being given such an advantage, RCTD, Stereoscope, and SpatialDWLS performed poorly compared to Celloscope (Fig. 2A). The much higher error observed for RCTD might be due to the fact that this model uses the Poisson distribution to model gene expression, in contrast to Celloscope and Stereoscope, which use the negative binomial distribution.

We performed a separate simulation study to compare Celloscope’s performance to BayesPrism [15], as this method requires at least 50 genes representative of each considered cell type, more than the 15–35 that we used for Celloscope and other methods in all other simulation scenarios. Importantly, for some cell types, acquiring such a high number of marker genes is unrealistic in the case of real data. In the case of setups with lower, more realistic numbers of marker genes, BayesPrism obtained a very high average error of around 0.1, much higher than Celloscope in its most difficult setup, where it was not given cell counts per spot as input (Fig. 2D). For simulated data with 50–80 and 150–180 marker genes per cell type, BayesPrism achieved better accuracy; however, it was still considerably lower than that of Celloscope. We also observed a lower average error achieved by Celloscope as the number of marker genes increased, despite no prior knowledge of the total number of cells per spot.

Accounting for cell-type mixtures in spots increases model’s accuracy

To evaluate the importance of the assumption of the presence of mixtures of cells of different types in ST spots and inferring proportions of cell types per spot, in contrast to indicating only the dominant type, we compared the results of Celloscope and other methods designed for ST data to CellAssign [21] (run settings provided in Additional file 1: Section S4). Similarly to Celloscope, CellAssign uses a binary cell type marker matrix as model input. However, CellAssign was originally developed to assign types to cells in scRNA-seq data, and as such it assumes that each observation refers to only a single cell (of a given type). Therefore, we expect that CellAssign, applied to simulated ST data, will treat each spot as homogeneous and indicate the dominant type. For each spot, we checked if the true dominant cell type (the cell type characterized by the highest proportion) was in agreement with the dominant type inferred by each method (Fig. 2B). Both Celloscope and other methods dedicated to ST and performing the cell type deconvolution performed the task of finding the dominant type significantly better than CellAssign (Fig. 2B). For CellAssign in the dense simulation scenario, the median fraction of correctly indicated dominant cell types was only around 0.25, while for the sparse scenario, it was around 0.55. For Celloscope in its default setting, these measures were much higher and ranged around 0.75 and 0.88, respectively. These results confirm the key importance of accounting for the mixture of cell types in each ST spot.
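The dominant-type comparison can be sketched as follows. The function names and the toy proportions are hypothetical, and ties between equally abundant types are resolved to the lowest type index, which is an assumption of this sketch.

```python
def dominant_type(props):
    """Index of the cell type with the highest proportion (first index on ties)."""
    return max(range(len(props)), key=lambda t: props[t])

def dominant_accuracy(true_matrix, est_matrix):
    """Fraction of spots whose inferred dominant type matches the true one."""
    hits = sum(
        dominant_type(t) == dominant_type(e)
        for t, e in zip(true_matrix, est_matrix)
    )
    return hits / len(true_matrix)

# Toy example: four spots, three cell types.
true_props = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
    [0.5, 0.4, 0.1],
]
est_props = [
    [0.6, 0.3, 0.1],   # dominant type 0, correct
    [0.2, 0.5, 0.3],   # dominant type 1, correct
    [0.1, 0.3, 0.6],   # dominant type 2, correct
    [0.3, 0.6, 0.1],   # dominant type 1, incorrect (true dominant is 0)
]

accuracy = dominant_accuracy(true_props, est_props)
print(accuracy)
```

For a method such as CellAssign, which returns a single label per spot, the estimated row can be taken as a one-hot vector for the assigned type.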

Celloscope localizes cell types in agreement with known mouse brain structures

Decomposition of cell types in a sagittal mouse brain section

Celloscope was applied to mouse brain data [22] and was able to successfully indicate brain structures (Fig. 3A). Specifically, an analysis of spatial transcriptomics data on sagittal mouse brain slices (an anterior section) was performed. In contrast to simulated data, for this dataset, there is no ground truth specifying the exact, underlying compositions of cell types in each spot. We can, however, expect that some of the known cell types will dominate in specific brain regions and that some other non-specific cell types will be prevalent across the entire brain tissue. Thus, to evaluate the quality of cell type deconvolution by Celloscope, we first compared the obtained spatial cell type distribution to regions as specified by the mouse brain atlas [23]. Second, we compared these findings to results of other studies localizing cell types in mouse brain regions using different technologies, namely immunofluorescence detection [24], Nissl-staining for cells and genetic marker stains [25] or labeling with anti-TH antibody [26].

Fig. 3

Results obtained for the anterior part of the mouse brain (sagittal section). CPC, choroid plexus epithelial cells; DOPA, dopaminergic neurons; GABA, GABAergic neurons; GLUT, glutamatergic neurons; OEG, olfactory ensheathing glia; OLG, oligodendrocytes; ASC, astrocytes; EC, endothelial cells; GABA-sub, GABAergic neurons subtype; MG, microglia; VLMC, vascular and leptomeningeal cells; DT, dummy type. A Heatmaps represent spatial composition for selected cell types. Dark violet indicates the absence of the cell type in question, yellow signals moderate occurrence, and magenta dominance of a given type. B Results of CellAssign on the same dataset. C Moran's I coefficient for cell types indicated both by CellAssign and Celloscope. D Moran's I coefficient computed for cell types indicated only by Celloscope. E The correlation matrix heatmap represents the values of the Pearson correlation coefficient for all studied cell types, the positive values in red, negative in blue. 0 indicates that there is no relationship between studied variables. “X” denotes an insignificant correlation (p-values of the test with the test statistics based on Pearson’s product moment correlation coefficient \(p > 0.05\))

Neurons and non-neuron cells called glia are the most commonly occurring brain cells [27]. Two major subclasses of neurons can be distinguished: GABAergic neurons establishing inhibitory synapses and glutamatergic neurons establishing excitatory synapses. We showed that Celloscope was able to spatially distinguish between the two of them: GABAergic neurons were found mainly in the olfactory bulb and olfactory cortex, while glutamatergic neurons were found mainly in the cerebral cortex [23]. These results were similar to those obtained in [25], albeit using a different technology.

Glia do not produce electrical impulses but rather provide support and protection for neurons. Celloscope identified that astrocytes, one of the main types of glial cells, which surround neurons and support their functioning, are localized throughout the entire examined sample, similarly to the findings in [24]. Microglia, the immune cells of the central nervous system, constantly test the environment for signals of malfunctioning and act in the event of trouble. Their function justifies their omnipresence in limited quantities throughout the examined sample, as correctly identified by Celloscope.

Dopaminergic neurons synthesize the neurotransmitter dopamine. As in [26], these cells were found by Celloscope in the olfactory bulb. Moreover, as expected, choroid plexus epithelial cells were localized in the choroid plexus and olfactory ensheathing glia cells were found in the olfactory bulb.

Comparison to CellAssign for the sagittal mouse brain section data

To show the benefits of accounting for the presence of cell type mixtures in spots as opposed to assuming they contain cells of single types, we compared Celloscope’s outcomes to results obtained with CellAssign [21]. Since CellAssign was originally developed to assign types to single cells based on scRNA-seq data, applying this approach to ST data is equivalent to considering each spot as homogeneous with respect to cell types, as if each spot was a single cell. Similarly to Celloscope, CellAssign correctly delineates mouse brain regions, assigning spots to dominating cell types for each region (Fig. 3B). In contrast to Celloscope, however, CellAssign by construction cannot identify cell types that are present in the examined tissue in lower prevalence, such as astrocytes, endothelial cells, and microglia. For instance, while Celloscope indicates that astrocytes tend to occur in low amounts, mostly in the cerebral cortex, CellAssign omitted this cell type almost entirely, indicating only 33 spots out of 2696 to be dominated by astrocytes. Lastly, microglia and endothelial cells were both identified in only six spots. On this account, we distinguished two groups of identified cell types: those indicated both by Celloscope and CellAssign (Fig. 3C) and those identified only by Celloscope (Fig. 3D). In summary, the results obtained by Celloscope and CellAssign were in agreement; however, Celloscope’s inherent feature of accounting for cell type mixtures enables it to provide a more comprehensive and more insightful description of the cell type composition of the tissue at hand.

Spatial autocorrelation of cell types for the sagittal mouse brain section

Given the naturally occurring tissue organization and structure, it is expected that neighboring spots will display spatial similarity and cells of the same type will co-localize. Note that Celloscope treats all spots as independent, regardless of their position and potential proximity. As a consequence, spatial correlation across spots is not enforced in the model and can be used to validate the model’s performance. To this end, we calculate the Moran's I coefficient [28, 29] to quantify the level of spatial autocorrelation of inferred cell type proportions (Fig. 3C, D). The Moran’s I coefficient takes values from \(-1\) to 1, where \(-1\) indicates perfect dispersion, 0 perfect randomness (no autocorrelation), and 1 perfect clustering of homogeneous values. Therefore, high values of the Moran's I coefficient indicate that the inferred cell types cluster in space. We observe very high spatial autocorrelation for the majority of cell types and moderate for microglia and endothelial cells; however, in all cases, the obtained spatial autocorrelation is non-negligible. Note that the level of spatial autocorrelation for Celloscope is similar to the autocorrelation level obtained by CellAssign, even though Celloscope solves the more demanding task of cell type deconvolution as opposed to assigning the dominant cell type to a spot. Importantly, those cell types that were found in the tissue only by Celloscope and not by CellAssign also show spatial autocorrelation.
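The interpretation of the Moran's I coefficient can be illustrated with a small sketch using its standard definition, \(I = \frac{N}{W} \frac{\sum_{i,j} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2}\), where \(W\) is the sum of all weights. The spatial weights used in the actual analysis are not specified in this section; the sketch assumes binary weights between directly adjacent spots on a square 4 x 4 grid, which is a simplification of the real spot layout. All function names are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for one variable given a symmetric spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n) for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den

def grid_rook_weights(rows, cols):
    """Binary weights: 1 for spots sharing an edge on a square grid (assumed layout)."""
    n = rows * cols
    w = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((1, 0), (0, 1)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    w[i][j] = w[j][i] = 1
    return w

W = grid_rook_weights(4, 4)
# Cell type confined to the left half of the grid (clustered pattern).
clustered = [1.0 if c < 2 else 0.0 for r in range(4) for c in range(4)]
# Cell type alternating between neighboring spots (dispersed pattern).
checkerboard = [float((r + c) % 2) for r in range(4) for c in range(4)]

i_clustered = morans_i(clustered, W)
i_dispersed = morans_i(checkerboard, W)
print(i_clustered, i_dispersed)
```

In this toy setting the clustered pattern yields a clearly positive coefficient, while the checkerboard pattern yields the minimum value of \(-1\), matching the interpretation given above.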

Spatial co-occurrence and mutual exclusivity between cell types for the sagittal mouse brain section

The cell type composition of spots resolved by Celloscope allows investigating spatial co-occurrence and exclusivity of cell types (Fig. 3E). We find that GABAergic neurons tend to spatially co-occur with dopaminergic neurons and the GABAergic neuron subtype, and glutamatergic neurons with astrocytes. On the other hand, Celloscope results suggest that oligodendrocytes avoid co-localizing with dopaminergic, glutamatergic, and GABAergic neurons.
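Co-occurrence and exclusivity of this kind can be quantified by correlating the inferred proportions of pairs of cell types across spots. The following minimal sketch computes the Pearson correlation coefficient for hypothetical proportions of three placeholder cell types in five spots; the significance test used to mark insignificant correlations is omitted here, and the names and values are assumptions made for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical proportions per spot; each spot's proportions sum to 1.
proportions = {
    "type_a": [0.6, 0.5, 0.1, 0.0, 0.3],
    "type_b": [0.3, 0.3, 0.1, 0.1, 0.2],
    "type_c": [0.1, 0.2, 0.8, 0.9, 0.5],
}

names = list(proportions)
corr = {
    (p, q): pearson(proportions[p], proportions[q])
    for p in names for q in names
}
for p in names:
    print(p, [round(corr[(p, q)], 2) for q in names])
```

In this toy example, type_a and type_b co-occur (positive correlation), while type_c tends to be present where the other two are absent (negative correlation).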

Comparison to STdeconvolve for the sagittal mouse brain section

Furthermore, we compared Celloscope’s performance to STdeconvolve [16], which also does not require a reference in the form of scRNA-seq data to estimate cell type expression profiles. In contrast to Celloscope, the inference in STdeconvolve is not guided by any prior knowledge. Instead, this model works in a fully unsupervised manner and discovers latent topics in gene expression, which are interpreted as cell types. These identified topics are not annotated and need to be further interpreted to assign a specific cell type to each topic. To this end, the transcriptional profiles of the topics inferred by STdeconvolve should be compared to known transcriptional profiles of specific cell types, or gene set enrichment analyses should be performed based on a list of reference gene sets for different cell types.

When applied to the sagittal mouse brain section data, STdeconvolve (see Additional file 1: Section S6 for run settings) identified 10 topics, which in the majority of cases could be matched, by visual inspection of their spatial localization, to a subset of the 12 cell types also found by Celloscope (Additional file 1: Fig. S1). However, while all but one topic identified by STdeconvolve form evident, specific spatial structures, Celloscope indicated three cell types that did not manifest such behavior (astrocytes, endothelial cells, and microglia). The omnipresence of these three cell types across the whole brain tissue is justified by their function and was reported previously [25]. For example, endothelial cells form the lining of the blood vessels that entwine the whole brain [30]. This suggests that topics found by STdeconvolve can potentially correspond not to a single but to several types of cells that happen to be present in the same region of the tissue.

Moreover, Celloscope distinguished between vascular and leptomeningeal cells and olfactory ensheathing glia, in contrast to STdeconvolve that seemingly merged those two cell types (annotation of the corresponding topics was performed here by matching the spatial localization of known cell types found by Celloscope to the unannotated topics found by STdeconvolve).

Celloscope’s results obtained for mouse brain coronal section data are in agreement with the results for the sagittal section

Further, we performed an analysis of spatial transcriptomics data for a coronal mouse brain slice [31] that is orthogonal to the sagittal section analyzed above (Fig. 4A). As we were analyzing the same tissue type, we sought to locate the same cell types, namely oligodendrocytes, astrocytes, GABAergic neurons, glutamatergic neurons, choroid plexus epithelial cells, endothelial cells, microglia, and vascular and leptomeningeal cells. However, since in our previous analyses (Fig. 3A) a portion of cells was assigned to the dummy type, we also considered three additional cell types, i.e., cholinergic neurons, peptidergic cells, and di-mesencephalon neurons. The markers for the considered cell types were found using the marker identification procedure based on lead genes (Methods), taking candidate marker genes from the dataset of Zeisel et al. [32].

We observed consistency in cell type composition inferred by Celloscope between the two sections, particularly for choroid plexus epithelial cells, oligodendrocytes, and glutamatergic neurons. Similarly to the sagittal section, we found that microglia, astrocytes, and endothelial cells were omnipresent throughout the entire investigated sample. Finally, we found that the spatial localization of di-mesencephalon neurons and peptidergic cells inferred by Celloscope agreed with their known position in the mouse brain.

Again, we computed the Moran's I coefficient to quantify the level of spatial autocorrelation of the inferred cell types’ proportions (Fig. 4B). Very high values of the Moran's I coefficient were acquired for the vast majority of cell types (choroid plexus epithelial cells, GABAergic neurons, glutamatergic neurons, oligodendrocytes, and peptidergic cells), indicating that the inferred cell types clustered in space. Further, we investigated spatial co-occurrence and exclusivity of cell types (Fig. 4C), finding that cholinergic neurons tend to co-occur with peptidergic cells, while glutamatergic neurons tend to avoid peptidergic cells and oligodendrocytes.

As an additional validation, we considered three subtypes of the glutamatergic neuron cell type, assuming candidate marker genes reported in an independent study [32], different from the study of Ximerakis et al. [33] that we used to find the markers for the glutamatergic neuron cell type. To this end, we constructed a new prior cell type-marker matrix, now including the three subtypes instead of the glutamatergic neuron cell type. As expected, the different subtypes occupy distinct sub-regions of the mouse brain (Additional file 1: Fig. S2), and their total abundance agrees with the localization of the glutamatergic neuron cell type. This analysis indicates that our method returns consistent results irrespective of the collection of marker genes and the original source of the candidate genes for markers.

Fig. 4

Results obtained for the anterior part of the mouse brain (coronal section). OLG, oligodendrocytes; OEG, olfactory ensheathing glia; ASC, astrocytes; GABA, GABAergic neurons; GLUT, glutamatergic neurons; DI-M, di-mesencephalon neurons; CPC, choroid plexus epithelial cells; EC, endothelial cells; MG, microglia; VLMC, vascular and leptomeningeal cells; CHOL, cholinergic neurons; PEPTI, peptidergic cells; DT, dummy type. A Heatmaps represent spatial composition for selected cell types. Dark violet indicates the absence of the cell type in question, yellow signals moderate occurrence, and magenta dominance of a given type. B Moran's I coefficient for cell types. C The correlation matrix heatmap represents the values of the Pearson correlation coefficient for all studied cell types, the positive values in red, negative in blue. 0 indicates that there is no relationship between studied variables. “X” denotes an insignificant correlation (p-values of the test with the test statistics based on Pearson’s product moment correlation coefficient \(p > 0.05\))

Comparison to STdeconvolve and SpatialDWLS for the coronal mouse brain section

We applied STdeconvolve [16] to the coronal mouse brain section data (for run settings see Additional file 1: Section S6). The obtained topics visibly cluster in space; however, as in the case of the sagittal section data, the results lack topics that would correspond to omnipresent cell types that are expected due to their function in the mouse brain, such as microglia and endothelial cells (Additional file 1: Fig. S3). This dataset was also previously analyzed with SpatialDWLS [13]. In contrast to Celloscope, SpatialDWLS benefits from using a reference scRNA-seq dataset. Despite this advantage, this method reported similar subtypes to those found by Celloscope, with small differences. For example, SpatialDWLS reported a slightly different spatial localization of granule neurons (Additional file 1: Fig. S2).

Sensitivity to the choice of marker genes

Finally, we also investigated Celloscope’s sensitivity to the choice of marker genes by running it with the same run settings on the sagittal mouse brain section but with four different sets of marker genes (Additional file 1: Figs. S4 and S5). These marker gene sets were acquired with four different sets of thresholds for the marker gene selection procedure (Methods). These thresholds are used to ensure correlation of the found marker genes with given lead marker genes for each cell type. As the values of the thresholds decrease and the procedure becomes less restrictive, the number of selected genes rises. Consequently, the found marker genes increasingly overlap between the cell types. Still, the acquired results for the different thresholds were generally in agreement, with only minor differences, indicating that Celloscope is insensitive to changes in marker gene sets for high quality data.

Celloscope elucidated the source of inflammation in a human prostate tissue

Next, we applied Celloscope to analyze human prostate data [20]. The analyzed dataset contained twelve sections from different regions of a resected prostate, which were profiled using ST. Several of these sections contained cancerous tissue. We selected two sections (3.1: Fig. 5B; 4.2: Fig. 6B), where infiltrations of immune cells were visible in their respective H&E images. We applied Celloscope to investigate whether those infiltrations could be associated with ongoing tumorigenesis in these areas, as was observed in [34, 35], or whether they were due to some other inflammatory process. Notably, this information could not be derived from the H&E image alone, as the fine subtypes of the detectable cells were not distinguishable visually. For example, mononuclear cells with abundant, foamy cytoplasm indicating macrophages or cells with a multilobed nucleus indicating neutrophils could be detected in H&E, but it was not possible to distinguish subtypes of lymphocytes (e.g., T cells, B cells, or NK cells). Similarly, raw gene expression data, measured using ST in this area, were not directly indicative of the type of the visible inflammation. Since detailed dissection of infiltrating immune cell identity is not feasible with classical histopathological inspection, nor directly from ST measurements, computational tools, such as Celloscope, are required to fill this gap.

To resolve the immune cell composition in those regions, we aimed at identifying the following immune cell types across spots: B cells, CD4+ T cells (helper T cells), CD4+ effector memory T cells, cytotoxic CD8+ T cells, dendritic cells, \(\upgamma \updelta\) T cells, M1 and M2 macrophages, neutrophils, monocytes, natural killer cells, and natural CD4+ regulatory T cells (Tregs). However, we also took into consideration non-immune cells that are expected to be present in the prostate tissue: endothelial cells, epithelial cells, and fibroblasts.

Celloscope identif
