Construction and validation of a regulatory T cells-based classification of renal cell carcinoma: an integrated bioinformatic analysis and clinical cohort study

3.1 ScRNA-seq analysis revealed molecular characteristics of CD4+ T cells in RCC

The schematic diagram and detailed flowchart were respectively presented in Fig. 1A and Figure S1, illustrating the primary processes of our study. Generally, we firstly performed scRNA sequencing, annotation, trajectory analysis, and prognostic analysis (TCGA cohort) to determine the TSRPGs. Secondly, we established a novel molecular classification system with three RCC subtypes (RCC-TSC), which demonstrated significant prognostic and immunological implications for RCC patients. Finally, we validated RCC-TSC in Xinhua cohort (370 patients) with IHC staining and statistical analyses, and a diagnostic kit was designed, which could be applied for patient diagnosis in the near future.

For scRNA-seq analysis, CD4+ T cells were specifically extracted, and 24 different cell clusters were annotated among the 12 samples (Fig. 1B). We firstly demonstrated all DEGs in a scatter diagram (Figure S2A). The relative expression level and spatial distribution of the significant marker genes were demonstrated for Tregs (Fig. 1C) and other CD4+ T cells (Figure S2B). Following that, the cleveland dot plot depicted the expression levels of the four key marker genes (CD3D, FOXP3, IL17A, IL21) across all 24 cell clusters (Figure S3A). CD3D was highly expressed in all clusters, while FOXP3 had higher expression in four Treg clusters and two clusters with mixed cells. Additionally, the percentage of each sample for the cell types were shown in bar plots, and Figure S3B visualized the cell numbers and proportions of the 12 sample sources. Also, the top 5 key genes of each cell type were illustrated in a heatmap (Figure S3C). In cellular communication analysis, Figure S3D showed that Tregs had many interactions with other CD4+ T cells. The cell cycle score plots revealed that a high proportion of S1PR1 + Treg and NME1 + mix T cells were at G2M phase (Figure S3E).

3.2 Integrated analyses of DEGs, cellular communication, and functional analysis in Tregs

Tregs exert significant inhibitory effects on the TIME, and to better elucidate the biological functions of Tregs in RCC, we extracted the Treg clusters for further analysis (Fig. 1D). Next, Figure S4A shows the expression levels of four key marker genes (S1PR1, TNFRSF9, RTKN2, OAS1) that characterized the four clusters. In Figure S4B, heatmap of the top five key genes for each Treg cluster displayed great heterogeneity among them. Cellular communication analysis revealed strong and complex interactions among the four clusters (Figure S4C). Considering the biological functions, GSVA identified the activated cancer hallmarks (Figure S4D). Notably, Hedgehog signaling, Kras signaling dn, and epithelial mesenchymal transition signaling pathways were most activated in all the four Treg clusters.

3.3 Trajectory analysis unveiled the temporal dynamics of gene expression and determined different Treg states

To figure out the heterogeneity and temporal dynamics of the four Treg clusters, we performed the trajectory analysis. Figure 1E illustrates the distribution of Tregs across different pseudotime points and spatial locations. With two bifurcation points, five Treg trajectory states were determined. Trajectory states 1 and 5 were positioned at the beginning of pseudotime, whereas trajectory states 3 and 4 were situated at the terminal branch. The four Tregs clusters were also plotted at the trajectory (Fig. 1E, Figure S5A). Comparatively, RTKN2-Treg and OSA1-Treg were respectively located at the beginning and ending of the pseudotime. Furthermore, the heatmap depicted that differential gene expression resulted in the divergence of distinct cell states (Figure S5B), validating their pivotal role in determining Treg states. As a result, these genes were termed Treg states-related genes (TSRGs), which were the candidate biomarkers in determining the Treg states.

3.4 Determination of the Treg States-related prognostic genes (TSRPGs)

Though the TSRGs were significantly associated with Treg temporal dynamics and states, their prognostic implications required further exploration. Consequently, we performed prognostic analysis on the 537 patients in the TCGA-KIRC cohort, whose baseline information is demonstrated in Table S1. As a result, 44 significant genes in univariate Cox proportional hazards regression analysis (Unicox_sig, p < 0.001) and 82 significant genes in K-M survival analysis (KM_sig, p < 0.001) for OS were obtained. Finally, we intersected the TSRGs, Unicox_sig, and KM_sig to build a gene set called the TSRPGs, which contained 44 genes (Fig. 1F). The prognostic information of the TSRPGs were demonstrated in Figure S6A and Table S2. We also built the co-expression network to visualize the relationship between the TSRPGs (Figure S6B). Moreover, TSRPGs in state 1 and state 3 were all identified as risk factors, while KLF6 in state 4 was a favorable prognostic factor.

3.5 Prognostic and immunological implications for RCC-TSC

To improve the integration and operability of the TSRPGs for prognostic prediction, we used clustering analysis to construct a classification system based on the expression levels of TSRPGs from TCGA database. As a result, we named the classification system of three RCC molecular subtypes as RCC Treg states-related prognostic classification (RCC-TSC). The CDF curve, delta area of the CDF, and consensus matrix were demonstrated in Figure S7A-C respectively. In Fig. 2A integrates clinical information, disease conditions, expression levels of TSRPGs, trajectory states, and three clusters in the heatmap. More importantly, in Fig. 2B, the survival probability of cluster 2 was the lowest among the three clusters (p < 0.001), manifesting the significant prognostic relation of RCC-TSC. It should be noted that cluster 2 and cluster 3 respectively had the highest and the lowest expression of TSRPGs in state 1 and 3 (all were risk factors), which could partially explain their differential prognosis (Fig. 2C). The immune infiltration of most immune cells was highest in cluster 2, particularly activated CD8 + T cells, activated dendritic cells (DCs), MDSCs, macrophages, NK cells and various CD4+ T cells (Fig. 2D). Consequently, we renamed cluster 1, 2, 3 as moderate immune infiltration RCC (MIRC), high immune infiltration RCC (HIRC), low immune infiltration RCC (LIRC) respectively.

To further explore the prognostic relevance of RCC-TSC, and to reduce the potential bias introduced by categorical variables, we conducted PCA for the three RCC subtypes. The results were illustrated in Figure S7D, indicating the HIRC was greatly separated from LIRC. The PCA scores of patients in HIRC and LIRC were correspondingly the highest and the lowest (Fig. 2E). Then, Figure S7E suggests that high PCA scores of patients signified significantly shorter OS (p < 0.001). Finally, Figure S7F shows that PCA scores were positively correlated to the infiltration of most immune cells. In summary, using TSRPGs, we proposed an effective molecular classification with distinct OS and immune infiltration features for RCC patients.

Fig. 2figure 2

Construction of a novel molecular classification (RCC-TSC) with significant prognostic and immunological implications. A The heatmap integrated the clinical information, disease conditions, expression levels of TSRPGs, trajectory states, and three RCC subtypes. B In the K-M survival plot, the survival probability of LIRC and HIRC were the highest and the lowest respectively (p < 0.001), manifesting the significant prognostic relation of RCC-TSC. C The expression level of TSRPGs was compared among the three clusters, which was all significant. Notably, cluster 2 (HIRC) and cluster 3 (LIRC) respectively had the highest and the lowest expression of genes in state 1 and 3, which were all risk factors. D The immune infiltration of most immune cells was highest in cluster 2, especially activated CD8 + T cells, activated DCs, MDSC, macrophage, NK cell and various CD4+ T cells. E The PCA scores of patients were calculated, while PCA scores in HIRC and LIRC were correspondingly the highest and the lowest. F The TIDE score was highest in HIRC, followed by MIRC and LIRC. G The dysfunction score was highest in HIRC, indicating HIRC had the highest number of T cells (especially CD8 cells) at late stage of dysfunction. H More patients in HIRC had higher expression for both CTLA4 and PD1. I Patients in H-PCA group had higher expression for both CTLA4 and PD1. RCC-TSC, renal cell carcinoma-Treg states-related prognostic classification; TSRPGs, Treg states-related prognostic genes; LIRC, low immune infiltration RCC, HIRC, high immune infiltration RCC. DC, dendritic cell; MDSC, myeloid-derived suppressor cell; PCA, principal component analysis; TIDE, tumor immune dysfunction and exclusion; MIRC, moderate immune infiltration RCC; H-PCA, high-PCA group

3.6 Unveiling the characteristics of TIME for three RCC subtypes

Given the significant association between RCC-TSC and prognosis, as well as immune infiltration in RCC, we further investigated its relationship with immune dysfunction phenotypes. CD8+, IFNG, and Merck 18 (T-cell-inflamed signature) scores were highest in HIRC (Figure S8A-C). However, the TIDE score (Fig. 2F), dysfunction score (Fig. 2G), were also highest in HIRC, followed by MIRC and LIRC. It indicated HIRC had the highest number of pro-inflammatory cells (especially CD8 + cells), but with a high proportion at late stage of dysfunction [50, 51], which were associated with resistance of ICB reprogramming [51, 52]. No significant difference was found for exclusion score (Figure S8D), while MIRC and LIRC respectively had highest infiltration of CAF, MDSC and TAM M2 (Figure S8E-G). Besides, MSI was highest in LIRC (Figure S8H). PD-L1 did not differ significantly among three subtypes (Figure S8I), but had higher expression in H-PCA group (Figure S8J). Furthermore, patients in HIRC or H-PCA had higher expression of CTLA4 and PD1 (Fig. 2H-I, Figure S8K-L). These analyses revealed that the TIME of HIRC patients with H-PCA was the most dysfunctional, potentially correlating with a higher response rate to immune checkpoint blockade (ICB) therapies.

3.7 Multi-omics analysis of genomics and epigenomics

We delved into the genomic and epigenomic landscapes. Figure S9A and Figure S9B respectively illustrate the top 20 significantly mutated genes in the H-PCA and L-PCA groups. The mutation rate in KIRC patients was approximately 80%, suggesting a high level of genomic instability. VHL mutation dominated in both groups, followed by PBRM1, TTN, BAP1, and SETD2. Next, TMB and MSI showed no significant differences between H-PCA and L-PCA groups, and exhibited no linear relationship with PCA score (Figure S9C). However, patients with high TMB had significantly better OS (Figure S9D). Moreover, patients with H-PCA and H-TMB had the longest OS among all KIRC patients (p < 0.001) (Figure S9E). Almost all the TSRPGs exhibited CNV, with frequent changes observed in PTTG1, LMNB1 and DTYMK (Figure S9F), located in chromosomes 5, 5 and 2 respectively.

3.8 Performing differential analysis for genes, TFs, cancer hallmarks between primary and metastatic KIRC samples

Given that TIME is closely associated with tumorigenesis, progression, metastasis, while metastasis usually indicated a poor outcome, we conducted the differential analysis between primary and metastatic KIRC samples. Figure S10A-B presents the DEGs between primary and metastatic KIRC. All TSRPGs were validated as DEGs, and were specifically pointed out in the volcano plot. Subsequently, DETFs were identified and visualized in Figure S10C-D. Then, GSVA was employed to compare the differentially activated cancer hallmarks (Figure S10E). Afterwards, the GSVA scores were quantified, with activated and inhibited pathways presented at the top and bottom respectively (Figure S10F). Unfolded protein response, DNA repair, E2F targets were most activated, while KRAS signaling dn, pancreas beta cells, spermatogenesis were most downregulated.

3.9 Construction of the regulatory networks for three RCC subtypes

To figure out the metastasis-related molecular mechanisms, we built the regulatory networks by correlation analysis. These networks included the midstream TSRPGs and critical biological components (upstream DETFs and downstream factors including immune cells, immune functions, cancer hallmarks, as well as proteomics) (Fig. 3A). Components in the regulatory networks may play critical roles in the metastasis of three RCC subtypes correspondingly. The correlation coefficients between all the key biological components are shown in Fig. 3B. In MIRC, the dominant regulatory TSRPGs were LMNB1 and PTTG1, both of which exhibited significant co-expression patterns with DETFs (MYBL2, NCAPG, CENPA, LMNB1), cancer hallmarks (E2F targets), protein (CyclinB1), and immune components (T cells CD4+ memory activated and Tfh). In HIRC, the most aggressive subtype, the regulatory TSRPGs were LAG3 and HSPA8. LAG3 exhibited strong co-expression with upstream BATF (R = 0.77), and downstream T cells CD8+ (R = 0.69), CD8 + T cells (R = 0.65), T cell co-inhibition (R = 0.74), and Tfh (R = 0.69). In LIRC, KLF6 and CXCL13 were the key TSRPGs in the regulatory network. CXCL13 showed a high correlation with IRF4 (R = 0.69) and Tfh (R = 0.7). Additionally, the expression levels of DETFs and TSRPGs in the three regulatory networks are illustrated in the heatmaps (Figure S11A). Furthermore, the 10 most significant pathways for the three RCC subtypes are displayed in the KEGG bubble plots (Figure S11B). Finally, ATAC-seq was conducted, validating the chromatin accessibility of six key regulatory TSRPGs (Figure S11C).

3.10 Prediction of drugs for three RCC subtypes

To individualize therapeutic strategies for the three RCC subtypes, we selected the drugs that could confer best therapeutic efficacy for each RCC subtype. We then utilized “pRRophetic” R package to predict the potential candidates of drugs (Fig. 3C). Drugs including sunitinib (a well-developed TKI for RCC patients) [2], cyclopamine (targets the Hedgehog pathway) [53, 54], bosutinib, lenalidomide, all-trans retinoic acid (ATRA) were potential drugs for HIRC. Drugs for treatment of MIRC and LIRC are respectively shown in Figure S12A-B. We subsequently mapped the expression matrix of three RCC subtypes with the cell lines in the CCLE database and the results are displayed with TSRPGs in Fig. 3D, and significant results among the predicted drugs for MIRC, HIRC, LIRC were respectively depicted in Fig. 3E and Figure S12C-D. Among them, selumetinib, pemetrexed, refametinib emerged as potential effective drugs for HIRC.

Fig. 3figure 3

The regulatory networks and potential inhibitors for three RCC subtypes. A The regulatory networks were built by correlation analysis, which contained the midstream TSRPGs and critical biological components (upstream DETFs and downstream factors including immune components, cancer hallmarks, as well as proteomics). B The correlation coefficients between all the key biological components were shown. In MIRC, the dominant regulatory TSRPGs were LMNB1 and PTTG1, which both exhibited significant co-expression pattern with DETFs (MYBL2, NCAPG, CENPA, LMNB1), cancer hallmarks (E2F targets), proteomics (CyclinB1), and immune components (T cells CD4 memory activated and Tfh). In HIRC, the most aggressive subtype, regulatory TSRPGs were LAG3 and HSPA8. LAG3 shown strong co-expression with upstream BATF (R = 0.77), and downstream T cells CD8 (R = 0.69), CD8 + T cells (R = 0.65), T cell co-inhibition (R = 0.74), and Tfh (R = 0.69). In LIRC, KLF6 and CXCL13 dominated. And CXCL13 had high correlation with IRF4 (R = 0.69) and Tfh (R = 0.7). C Utilizing the “pRRophetic” R package, the potential candidates of drugs were predicted. Drugs including sunitinib, parthenolide, roscovitine, cyclopamine, bosutinib, lenalidomide, ATRA, salubrinal were potentially feasible drugs specifically for HIRC. D After mapping the resemblances of the expression matrix of three RCC subtypes with the cell lines in the CCLE database, the results with TSRPGs were displayed. E A comparative analysis of the therapeutic efficacy of drugs was conducted across the three clusters, and significant results of HIRC were illustrated in the boxplots. Selumetinib, pemetrexed, refametinib, could be potential drugs for HIRC. RCC, renal cell carcinoma; TSRPGs, Treg states-related prognostic genes; DETFs, differentially expressed transcription factors; MIRC, moderate immune infiltration RCC; HIRC, high immune infiltration RCC; LIRC, low immune infiltration RCC; ATRA, all-trans retinoic acid; CCLE, Cancer Cell Line Encyclopedia

Potential binding patterns between LAG3 and sunitinib, cyclopamine were analyzed with molecular docking analysis (Figure S12E). When the sunitinib bonded with VAL-231, SER-182, and THR-266, it showed lowest affinity of -7.2 kcal/mol. While for the cyclopamine, it showed lowest affinity of -7.9 kcal/mol in the conformation bonded with ARG-210.

3.11 Construction and validation of a prognostic prediction model with TSRPGs

Using the TSRPGs to construct a prognostic prediction model, and demonstrated the expression level of TSRPGs between primary and metastatic KIRC (Fig. 4A). ORA unveiled TSRPGs were enriched in immune functions, cell cycle, cell division, and cancer-associated pathways (Figure S13A). The hazard ratio (HR) of TSRPGs in univariate Cox proportional hazards regression analysis are shown in Figure S13B. To mitigate overfitting of TSPRGs in our model, LASSO regression analysis was employed (Fig. 4B). A total of 15 genes, along with their corresponding coefficients were determined for the model. Subsequently, KIRC patients were randomly separated into a training set and a test set. The expression levels of the 15 genes were visualized in heatmaps (Figure S13C) and compared between the low-risk and high-risk patients in three sets respectively (Fig. 4C), all of which demonstrated statistical significances. Next, the risk score of each patient was calculated, and combined the risk score together with OS in scatter plots (Figure S13D). A high risk score was associated with shorter OS, as validated in K-M survival plots (Figure S13E). ROC curves and AUC values demonstrated the robust performance of the model (Figure S13F). Then, univariate and multivariate Cox proportional hazards regression analyses were used, validating risk score as a significant risk factor (Fig. 4D, HR = 65.827, 95% CI = 24.764-174.978, p < 0.001) and an independent risk factor (Fig. 4E, HR = 28.612, 95% CI = 9.318–87.856, p < 0.001). Moreover, Fig. 4F shows patients in high-risk group were significantly associated a higher probability of death, higher grade, T stage and M stage (Fig. 4G), and worse cancer status. In Fig. 4H presents the five most significant hallmarks for high-risk patients, prominently featuring cell cycle-related pathways, EMT, and inflammatory response pathways. In Fig. 5I demonstrates that adipogenesis and androgen responses signaling hallmarks dominate in the low-risk patients. GO and KEGG pathways for the two groups are respectively demonstrated in Figure S13G-H.

Fig. 4figure 4

Construction and validation of a prognostic prediction model with TSRPGs. A The heatmap demonstrated the expression level of TSRPGs between primary and metastatic RCC. B To mitigate the overfitting impact of TSPRGs in our model, we employed LASSO regression analysis, and 15 genes together with their corresponding coefficients were determined for the model. C The expression level of the selected genes was compared between the low-risk and high-risk patients in all, train, test sets respectively, which all demonstrated statistical significances. D Through univariate Cox regression analysis, risk score was verified as a significant factor (HR = 65.827, 95% CI = 24.764-174.978, p < 0.001). E Through multivariate Cox regression analysis, risk score was validated as an independent factor (HR = 28.612, 95% CI = 9.318–87.856, p < 0.001). F The Chi-square test demonstrated patients in high-risk group were significantly associated higher probability of death, higher grade, T stage, M stage, and worse cancer status. G High-risk patients had significantly lower proportion of M0 stage (p = 0.001). H The five most significant hallmarks for high-risk patients were presented, prominently featuring cell cycle-related pathways, EMT and inflammatory response pathways. I Adipogenesis, androgen responses signaling hallmarks were highly activated in the low-risk patients. TSRPGs, Treg states-related prognostic genes; RCC, renal cell carcinoma; LASSO, least absolute shrinkage and selection operator; HR, hazard ratio; CI, confidential interval

3.12 Exploring the association between immune components and KIRC patients with different risks

In the immune subtype analysis, low-risk KIRC patients were more frequently associated with C3-subtype (inflammatory) (Figure S14A, p = 0.002), characterized by elevated expression of Th17 and Th1 genes. The C3-subtype exhibited low to moderate cellular proliferation, lower levels of chromosomal aneuploidy, higher somatic CNVs, and the most favorable prognosis [55]. Next, the proportion of 22 immune cells were extracted (Figure S14B), and the immune cells and scores of immune functions were then compared between the low-risk and high-risk groups (Figure S14C-D). Significant differences were observed between high-risk group and low-risk group, which was further significantly associated with distinct survival probabilities (Figure S14E-F). Additionally, TIDE, dysfunction, CD8+, Merck18, and exclusion scores were higher in high-risk KIRC patients (Figure S15A-E). Low-risk KIRC patients exhibited higher MSI, CD274, TAM M2 and MDSC infiltration, but lower CAF infiltration (Figure S15F-K). These findings validated the significant association between immune components and KIRC patients with different risks.

3.13 RCC-TSC was validated as an independent prognostic factor in Xinhua cohort

The flowchart of our cohort study is shown in Fig. 5A, and the analytic processes primarily composed of three parts, including IHC staining for the retrospective cohort, molecular classification, and statistical validation. The relative expression levels of the markers used for IHC staining and classification are shown (Fig. 5B). MKi67 and LAG3 expressed highest in HIRC, while DNAJA1 and DNAJB4 had highest expression in LIRC. Our retrospective cohort contained 370 patients, and the detailed information was presented in Table 1; Fig. 5C. 6.22% of the patients were deceased and 11.08% patients suffered PFS censor. Considering the age category, patients in 21–40, 41–60, 61–88 groups were 8.92%, 52.43%, 38.65% respectively. 69.19% patients were male and most patients (88.65%) were diagnosed with KIRC. Besides, 48.11% and 28.11% patients were in T stage 1a and 1b, while only 0.54% patients had node metastasis and no patient had distant metastasis, manifesting that most patients in this cohort were in early stages. The K-M survival plots with significant results regarding OS and PFS for the clinical variables of our cohort are shown in Figure S16A-B respectively. We illustrate part of the results for IHC staining in Fig. 5D, and we classified the 370 patients as HIRC (122, 32.97%), MIRC (139, 37.57%), and LIRC (108, 29.19%) respectively. The IHC scores of the four markers were compared among the three subtypes (Figure S16C), demonstrating significant differences.

Fig. 5figure 5

IHC staining and scoring for classifying patients in Xinhua cohort based on RCC-TSC. A The flowchart of our study. Our clinical study was mainly composed of three analytic processes, including IHC staining for the tumor tissue of the 370 patients with kidney neoplasm, molecular diagnosis based on the RCC-TSC, and statistical validation. The inclusion and exclusion criteria were also shown in detail. Firstly, 370 patients with kidney neoplasm and received curative surgery from 2016–2018 were included. Secondly, due to the incompleteness of the IHC staining information (DNAJA1, DNAJB4, LAG3, Mki67), one patient was excluded. Thirdly, when performing Chi-square tests and Kaplan-Meier survival analysis, 71 (lacking OS censor and PFS censor) or 75 patients (lacking OS and PFS) were correspondingly excluded. Finally, when conducting the multivariate proportional hazards Cox regression analysis, 10 patients whose histopathological classifications were not KICH, KIRC, or KIRP were excluded. B The boxplots showed the relative expression in transcriptional level for the four markers we used for IHC staining and classification. MKi67 and LAG3, DNAJA1 and DNAJB3 respectively expressed highest in HIRC and LIRC. C The heatmap illustrated the information of the 370 patients in our retrospective cohort. D IHC slides demonstrated the differential staining results of DNAJA1, DNAJB4, LAG3, Mki67 in three RCC-TSC subtypes. Specifically, DNAJA1 and DNAJB4 both had relatively high, medium, and low staining score in LIRC, MIRC, and HIRC. While LAG3, Mki67 both had relatively high, medium, and low staining score in HIRC, MIRC, and LIRC. IHC, immunohistochemical; RCC-TSC, renal cell carcinoma-Treg states-related prognostic classification; OS, overall survival; PFS, progression-free survival; KICH, kidney chromophobe; KIRC, kidney renal clear cell carcinoma; KIRP, kidney renal papillary cell carcinoma; LIRC, low immune infiltration RCC; MIRC, medium immune infiltration RCC; HIRC, high immune infiltration RCC

Table 1 Demographic information and clinical characteristics of the 370 patients

The K-M survival plots showed that patients in the HIRC group had the lowest OS and PFS in KIRC patients (Fig. 6A, OS, p = 0.003; PFS, p < 0.001). More importantly, the K-M survival plots in patients with kidney neoplasm (OS, PFS, p < 0.001), and patients excluding those with KIRC (OS, p = 0.050; PFS, p = 0.052) were demonstrated respectively (Figure S17). These results validated its effectiveness in KIRC, and its potential feasibility across various kidney neoplasms. Moreover, Fig. 6B shows that RCC-TSC was also significantly associated with gender, histopathological classification, OS censor, PFS classification, PFS censor and OS classification. The proportion of male was highest in LIRC (78%), while lowest in MIRC (63%) (Fig. 6C). Percentage of KIRC was highest in MIRC (93%), and all the classification labeled as others were in LIRC (Fig. 6D). Moreover, HIRC conferred higher probability of death (13%, Fig. 6E), rates of PFS censor (23%, Fig. 6F), as well as lower OS (Fig. 6G) and PFS classification (Fig. 6H). More importantly, Fig. 6I-J validated HIRC as an independent risk factor for both OS (HR = 16.68, 95% CI = 1.88–148.1, p = 0.0115) and PFS (HR = 4.43, 95% CI = 1.55–12.6, p = 0.0054). We also constructed nomograms for OS and PFS prediction (Figure S18A). Then, we used residual plots, schoenfeld tests (Figure S18B), calibration curves, ROC curves (Fig. 6K-L) to comprehensively validate our model’s accuracy, robustness and consistency. Finally, we also built a risk-related model based on the multivariate Cox proportional hazards regression analysis, proving its significant association with patients’ survival (Figure S18C) and tumor stages (Figure S18D).

Fig. 6figure 6

Statistical analyses validated the prognostic relevance of RCC-TSC in Xinhua cohort. A The K-M survival plots validated patients in the HIRC cluster had the lowest OS (p = 0.003) and PFS (p < 0.001) in KIRC patients. B The Chi-square test revealed that RCC-TSC were significantly associated with gender, histopathological classifications, PFS censor, PFS classification, OS censor and OS classification. C The pie chart displayed that the proportion of male was highest in LIRC (78%), while lowest in MIRC (63%). D The pie chart showed that percentage of KIRC was highest in MIRC (93%), while lowest in LIRC (82%). All the classification labeled as others were in LIRC. E The pie chart demonstrated that HIRC conferred higher probability of death (13%), while MIRC and LIRC were 8% and 1% respectively. F The pie chart suggested that HIRC conferred higher probability of PFS censor (23%), while MIRC and LIRC were 11% and 7% respectively. G The pie chart displayed that the patients of HIRC had higher proportion of low OS (74%), greatly surpassing MIRC (47%) and LIRC (9%). H The pie chart showed that patients of HIRC had higher proportion of low PFS (77%), higher than MIRC (51%) and LIRC (10%). I Multivariate proportional hazards Cox regression analysis validated RCC-TSC as an independent factor and HIRC was a risk factor for OS (HR = 16.68, 95% CI = 1.88–148.1, p = 0.0115). J Multivariate proportional hazards Cox regression analysis validated HIRC as an independent factor for PFS (HR = 4.43, 95% CI = 1.55–12.6, p = 0.0054). K The calibration curve demonstrated high concordance and high AUC values in 1 (0.962), 2 (0.891), 3 (0.886), 4 (0.979), 5 (0.968) years for OS. L The calibration curve also demonstrated high concordance and high AUC values in 1 (0.915), 2 (0.887), 3 (0.838), 4 (0.834), 5 (0.805) years for PFS. RCC-TSC, renal cell carcinoma-Treg states-related prognostic classification; K-M, Kaplan-Meier; HIRC, high immune infiltration RCC; OS, overall survival; PFS, progression-free survival; KIRC, kidney renal clear cell carcinoma; LIRC, low immune infiltration RCC; MIRC, moderate immune infiltration RCC; HR, hazard ratio; CI, confidential interval; AUC, area under curve

3.14 Using external databases to validate the results of our annotation in the scRNA-seq analysis and the risk-related prediction model

In Figure S19A, 7 bulk labels and 17 sub labels are defined in the scRNA-seq analysis. Feature plots of the marker genes, such as CD4, CD3E, CD3D, TGFβ1, and IL10 are displayed (Figure S19B). The cell number and cell proportion of different cell types are shown in the bar plots (Figure S19C). Subsequently, we compared the distribution of cell types between adjacent normal tissue and tumor tissue (Figure S20A). As illustrated in Figure S20B, the number and proportion of cells co-expressing double-positive markers were higher in the tumor tissue, supporting the potential promoting role of Tregs in the tumor microenvironment. Furthermore, in Figure S21A, we demonstrate strong co-expression of FOXP3, CD4, and the four previously mentioned markers (RTKN2, S1PR1, TNFRSF9, OAS1). Lastly, Figure S21B illustrates the four markers used for IHC staining.

In the GSE22541 dataset, 12 of the 15 genes in the risk-related prediction model were found to be significant for patients’ OS, and eight genes were significant for patients’ disease free survival (Figure S22A-B). Moreover, up to 32 TSRPGs had significantly different expression between samples of the primary KIRC and the pulmonary metastasis of KIRC (Figure S22C). These results collectively indicated the significance of TSRPGs and especially the genes in the risk-related prediction model.

留言 (0)

沒有登入
gif