Cancer functional states-based molecular subtypes of gastric cancer

Cancer functional states identify three GC subtypes

Based on the 14 cancer functional status scores calculated by the GSVA method, we divided the TCGA-STAD dataset into 3 subtypes using the consensus clustering method (Fig. 1A, Additional file 2: Table S1). Principal component analysis found that the cluster 1 (C1) and the cluster 2 (C2) could be significantly separated, while the cluster 3 (C3) was distributed between the two subtypes, and it seemed that C3 was a category of transition features (Fig. 1G). We found that C3 is more like a cluster of the same characteristics picked from C1 and C2 (Additional file 1: Fig S1). KM analysis shows that C1 tended to have a worse survival prognosis, C2 has the best survival. The C3 has no significant survival difference, but the trend of the curve shows that the overall survival time of C3 is between C1 and C2 (Fig. 1B). C1 had low expression of cell cycle, DNA damage and DNA repair, and the remaining 11 cancer functional states had low levels. C2 is just the opposite of C1. C3 has high level in all cancer functional states (Fig. 1A). Similar results were also observed in the GPL570 dataset and GSE84437 (Fig. 1C–F, H, I).

Fig. 1figure 1

Three subtypes of gastric cancer based on cancer functional states. A Consensus clustering was performed to divide the TCGA-STAD cohort into three gastric cancer subtypes based on the cancer functional status scores calculated by the GSVA method. B Kaplan–Meier curves of OS in TCGA-STAD cohort between three subtypes. C Consensus clustering was performed to divide the GSE84437 cohort into three gastric cancer subtypes. D Kaplan–Meier curves of OS in GSE84437 cohort between three subtypes. E Consensus clustering was performed to divide the GPL570 meta cohort into three gastric cancer subtypes. F Kaplan–Meier curves of OS in GPL570 meta cohort between three subtypes. G, H and I PCA confirmed that the three subtypes of three cohorts could be distinguished significantly

Signaling pathways and immune cells of three GC subtypes

To explore the functional status among the three subtypes, we calculated scores for all Kyoto Encyclopedia of Genes and Genomes (KEGG) signaling pathways using the GSVA method. Interestingly, almost all metabolic related KEGG signaling pathways were highly expressed in C3, suggesting that cells of C3 are in the status of high levels of metabolism (Additional file 1: Fig S2). The activities of the 4 energy metabolic pathways (glycolysis/gluconeogenesis, citrate cycle, pentose phosphate pathway and oxidative phosphorylation) are: C1 < C2 < C3 (Fig. 2A). Immune-related pathway analysis showed that C3 had the highest immune activity, followed by C2, and C1 was the lowest (Fig. 2B). In addition, we also analyzed the expression of 10 cancer-related signaling pathways, all of which are highly expressed in C3 (Fig. 2C).

Fig. 2figure 2

Immune and signaling pathway heterogeneity among the three subtypes. A Comparison of four energy metabolism pathways among three gastric cancer subtypes. B Comparison of KEGG immune pathways among three gastric cancer subtypes (The color of the squares indicates the high and low average scores of all samples within each subtype). C Comparison of 10 cancer-related pathways among three gastric cancer subtypes. D Comparison of Stromal score, Immune score and ESTIMATE score among three gastric cancer subtypes. E Comparison of IFN-γ and TNF signaling pathway among three gastric cancer subtypes. F Relative proportion of 22 infiltrating immune cells estimated by CIBERSORT among three gastric cancer subtypes. The above analyses are all performed in TCGA-STAD

The immune and stromal scores were calculated by the ESTIMATE algorithm. Similar to the results of the signaling pathway analysis, these data confirmed that C1 and C2 had the highest and lowest immune score, stromal score and ESTIMATE score, respectively (Fig. 2D). In addition, we also used the CIBERSORT algorithm to estimate the proportions of 22 immune cells in GC samples. C1 was characterized by high B cells naive, B cells memory, plasma cells, T cells CD4 memory resting, T cells regulatory (Tregs), dendritic cells resting and mast cells resting (Fig. 2F). C3 was remarkably rich in T cells CD4 memory activated, T cells follicular helper, NK cells resting, NK cells activated, macrophages M0, macrophages M1, macrophages M2 and neutrophils (Fig. 2F). C2 is characterized by low infiltration of all immune cells. Furthermore, to assess the immune cytolytic activity of GC samples, we analyzed the differences in IFN-γ and TNF signaling pathway among the three subgroups. Interestingly, C3 was the highest in both scores, indicating that C3 has the strongest immune cytolytic activity (Fig. 2E).

Genomic alterations of three GC subtypes

To explore the genomic differences between three subtypes, we analyzed the CNV profile of TCGA-STAD with a threshold of q < 0.05. After removing germline CNVs, we found that C3 had lower levels of arm- and focal-level CNVs compared to C1 and C2 (Fig. 3A–C). Furthermore, C1 and C2 had a higher proportion of chromosomal instability (CIN) patients relative to C1 (Fig. 3D). The above results demonstrate that C3 has better genomic stability. This was also demonstrated by differences in the number of genes affected by CNV among the three subtypes.

Fig. 3figure 3

Comparison of genomic alterations in three subtypes in the TCGA-STAD cohort. A, B and C Comparison of the somatic copy number variations among three gastric cancer subtypes. D Comparison of TCGA gastric cancer subtypes among three gastric cancer subtypes. E comparison of tumor mutation burden among three gastric cancer subtypes. F Top 20 mutated genes in all gastric cancer patients. G Top 20 differentially mutated genes between gastric cancer subtypes

Next, we analyzed the differences in gene mutations among the three subtypes. We analyzed the top 20 genes with different mutation frequency among the three subtypes, 13 of which (TTN, MUC16, FAT3, LRP1B, SYNE1, CSMD3, PCLO, FAT4, DNAH5, ZFHX4, RYR2, CSMD1 and KMT2D) were also in the top 20 mutated genes for all GC samples (Fig. 3F, G). We found that C3 has a higher mutation frequency than C1 and C2, and C2 has a higher mutation frequency than C1. This is consistent with the distribution of microsatellite instable (MSI) patients among the three subtypes (Fig. 3D). This was also supported by differences in tumor mutational burden (TMB) among the three groups (Fig. 3E). These results indicated that gene mutations may be associated with the phenotype of GC subtypes.

Cancer functional states score as a marker for prognosis

Significant OS differences between subtypes were observed, so we thought whether the cancer functional status score could be a prognostic marker. The Lasso-Cox algorithm was used to identify the most robust cancer functional states pathways for prognosis prediction after obtaining 14 gene set scores of all samples from the TCGA-STAD cohort (Fig. 4A). Finally, risk models associated with cancer functional status pathways were developed as fellow: -0.00135455854945771*Cell Cycle-0.416041346360079*DNA damage + 0.440674355732646*Hypoxia + 0.10833269488575*Invasion. As shown in Fig. 4B, KM analysis demonstrated that patients with higher risk score exhibited worse overall survival in TCGA-STAD (HR = 1.78, 95% CI = 1.27–2.49, P = 7.5e-4). In order to validate the stability of the model, we performed analysis in two additional cohorts. Similar results were also observed in GPL570 cohort and GSE84437(GPL570 dataset: HR = 1.66, 95% CI = 1.32–2.07, p = 8.7e-6; GSE84437: HR = 2.33, 95% CI = 1.59–3.41, p = 8.0e-6; Fig. 4C, D).

Fig. 4figure 4

Construction of caner functional states associated prognostic model. A Partial likelihood deviance for the lasso regression and Lasso regression analysis. B Patients were divided into high-risk and low-risk subgroup based optimal cutoff, Kaplan–Meier analysis demonstrated that patients with higher risk score exhibited worse overall survival in TCGA-STAD. C, D The prognostic difference was validated in GSE84437 and GPL570 cohort

Stromal cells dominate the cancer functional state

Since tumor tissue is not composed of single type of cells, including epithelial cells, immune cells, and mesenchymal cells, it is necessary to explore the differences between various cell types among GC subtypes. The original data contained many tumor-adjacent normal samples and some un-primary tumor samples. But we selected the primary tumor samples for analysis. Detailed clinical and pathological information is provided in Additional file 2: Table S2. After following our standard quality control, we obtained 93583 cells of 26 GC samples for subsequent analysis. We thought that by calculating the average gene expression of all cells in each single-cell sample, we can obtain approximate bulk-level sequencing results. We first randomly divide TCGA-STAD into two parts according to 7:3, the former is used as the training cohort, and the latter is used as the validation cohort. The model was then trained in the training dataset based on the 14 cancer functional status scores, which had an area under the curve (AUC) value of 1 in the training dataset. Shockingly, in the validation cohort, the AUC also reached 0.9909. Then we used the trained model to predict the GC subtypes of the single-cell cohort samples, and finally got 13 patients of C1, 11 of C2, and 2 of C3 (Additional file 2: Table S3). The distributional characteristics of cancer functional status were also consistent with those of the TCGA-STAD cohort (Fig. 5A), which proves the robustness of the classifier. Based on res of 0.8, a total of 27 cell clusters are obtained (Fig. 5B). We then identified five major cell types, including epithelial cells, T cells, B cells, stromal cells, and myeloid cells (detailed markers in Additional file 2: Table S4, Fig. 5C). The heatmaps of these cancer functional status scores showed predominantly high scores in stromal cells (Fig. 5D). So, we next analyzed the differences of stromal cells among the three subtypes.

Fig. 5figure 5

Identification of all cell subtypes in single-cell samples and comparison of stromal cells between three gastric cancer subtypes in GSE183904. A Heatmap displays of 14 cancer functional states for three subtypes of predicted single-cell samples. B, C The tSNE plot of single cells profiled in the presenting work colored by 27 cell clusters and major cell types. D 14 cancer functional states score between three gastric cancer subtypes. E, F The tSNE plot of stromal cells profiled colored by major stromal cell types and three subtypes. G KEGG pathway enriched in marker genes of five stromal cells. H Top 5 marker genes of 5 stromal cells. I Heatmap shows normalized activity of top 5 TF regulons in three subtypes predicted by pySCENIC. J Potential trajectory of iCAF inferred by SCORPIUS. K Kaplan–Meier analysis of high- and low myCAF infiltration in TCGA-STAD. L Comparison of EMT activity of myCAF between three gastric cancer subtypes. M Kaplan–Meier analysis of high- and low iCAF infiltration in TCGA-STAD. N Comparison of iCAF infiltration between three gastric cancer subtypes

The stromal cells can be re-clustered into 14 clusters with res of 0.3. We annotated these stromal cells as pericytes, myo-cancer associated fibroblasts (myCAFs), undefined cancer-associated fibroblasts (undefined-CAFs), inflammatory cancer-associated fibroblasts (iCAFs), and endothelial cells (detailed markers in Additional file 2: Table S5; Fig. 5E). Remarkably, myCAF is a specific cell type of the C1 subtype, implying that myCAF may be involved in the formation of C1 subtype-related characteristic (Fig. 5E, F). We performed KEGG pathway enrichment analysis on the marker genes of myCAF. The results suggested that myCAF was in a relatively active state, which may be related to its tumor-promoting effect and is consistent with C1 having the worst prognosis (Fig. 5G, H). KM analysis also suggested that GC patients with high myCAF infiltration have poor OS (p = 0.02; Fig. 5K). We investigated developmental trajectories in endothelial cells, iCAF and pericytes, respectively. But in endothelial cells and pericytes, we did not find obvious trajectory variety among the three clusters. Only the developmental pattern of C1-C2-C3 presented in iCAF (Fig. 5J). We also performed a scenic analysis in iCAF. The results indicate that MAFK and TWIST2 are highly activated in C3 (Fig. 5I). Previous studies have shown that these two transcription factors induce EMT and then promote tumor progression [30, 31], which is consistent with the high EMT score in C3 (Fig. 5L). At the same time, we found that high iCAF infiltration means worse OS, which may be related to the ability of iCAF to promote tumor progression (Fig. 5M). We also found that iCAF infiltration was also highest in C1 and lowest in C3 among the three subtypes in the TCGA-STAD cohort (Fig. 5N).

C3 has better response to immunotherapy

We annotated T cells as CD4 + T cells, CD8 + T cells, NK T cells, regulatory T cells (Tregs) and proliferative T cells (detailed markers in Additional file 2: Table S6, Fig. 6A). Since T cells play an important role in antitumor immunity, we next investigated the characteristics heterogeneity of T cell between three subtypes. There was no significant difference in the number of various types of T cells among the three subtypes (Fig. 6B). To investigate the differences in the cytotoxicity of the three subtypes of T cells, we focused on CD8 + T cells and NK T cells. We found that C3 had stronger tumor-killing activity, as evidenced by either the IFN-γ pathway and the expression of the IFNG gene or the cytotoxicity score (Fig. 6C, D). And NK T cells of C3 seem to have more significant tumor-killing activity than C1 and C2 (Fig. 6C, D).

Fig. 6figure 6

Comparison of T cells and immunotherapy response between three gastric cancer subtypes in GSE183904. A The tSNE plot of T cells profiled colored by 5 T cell types. B Comparison of 5 T cells of three subtypes. Comparison of IFN-γ signaling pathway, IFNG gene, and cytotoxicity score between three subtypes in CD8 + T cell (C) and NK T cell (D). E Comparison of exhaustion score and PDCD1, and LAG3 in NK T cell between three subtypes. F Heatmap displays of 14 cancer functional states for three subtypes in the IMvigor210 cohort. G Comparison of TIDE score, dysfunction score and exclusion score between three subtypes in TCGA-STAD. H Kaplan–Meier curves of OS in the IMvigor210 cohort between three subtypes. I, J Proportion of patients with response to ICI immunotherapy in three subtypes in the IMvigor210 cohort. CR, complete response; PR, partial response; SD, stable disease; PD, progressive disease. CR/PR was identified as responder, and SD/PD was identified as non-responder

Now that C3 has the strongest antitumor ability, we next analyzed whether the three GC subtypes had different responses to immunotherapy. Among CD8 + T cells, C3 had higher exhaustion scores and immune checkpoint gene expression (PDCD1, LAG3), which proved that C3 may have a better response to immunotherapy (Fig. 6E). We found that at the bulk level, C3 also had higher TIDE scores than C1 and C2, which is consistent with our findings at single-cell resolution (Fig. 6G). We also analyzed differences between the three subtypes in a cohort of urothelial carcinomas undergoing immunotherapy. First, we typed the patients in the IMvigor210 cohort as three subtypes based on the previously established subtype prediction model, and the three subtypes also had the same characteristic distribution as the previous gastric cancer cohort (Fig. 6F). The three subtypes also had the same survival differences as the GC cohort (Fig. 6H). The proportion of immunotherapy responders was significantly higher in C3 compared with C1 and C2, especially the proportion of complete responders (Fig. 6I, J).

Subtype-related treatment strategies

Chemotherapy is an important strategy in cancer treatment. The Cancer Genome Project (CGP) database was used to predict chemotherapeutic response. The results showed that 5-Fluorouracil, cisplatin, docetaxel, mitomycin C and paclitaxel were more suitable for patients of C1 (Fig. 7). We used the CMap database to predict drugs with specific therapeutic effects on the three subtypes. We predicted many drugs specific to the three subtypes (Additional file 2: Table S7).

Fig. 7figure 7

The estimation of chemotherapy response for three gastric cancer subtypes in TCGA-STAD. A–E The chemotherapy response of three subtypes for five common chemotherapy drugs

留言 (0)

沒有登入
gif