Single cell analysis unveils B cell-dominated immune subtypes in HNSCC for enhanced prognostic and therapeutic stratification

Identification and characterization of B-cell immune classification in an independent scRNA-seq dataset

After quality control and data filtering, we obtained single-cell transcriptomes of 105943 immune cells from 26 HNSCC samples (GSE139324). To identify the major populations and subpopulation compositions of tumor infiltrating immune cells, we performed clustering using Seurat to identify the major immune cell types, including NK cells, CD4+ T cells, CD8+ T cells, cycling T cells, myeloid cells, and B cells (Fig. 1a). Each cell type was identified based on classical marker genes and literature evidence (Fig. 1b, c). The immune cell cluster composition in each patient sample was also shown to observe the state of tumor immune environment in all samples (Fig. 1d).

Fig. 1figure 1

Identification of immunological subtypes based on TIL-Bs by single cell analysis. a UMAP of tumor infiltrating immune cells (n = 105.943 cells) from 26 samples in HNSCC scRNA-seq, colored by cell clusters. b UMAP plots showing expression of classical marker genes from immune cell clusters. c Heatmap of signature genes for immune cell clusters. Each cell cluster is represented by five specifically expressed genes. d Bar plot showing the immune cell type proportion in each HNSCC sample. e UMAP plot showing 15 B cells clusters. f Two subgroups were identified using ConsensusClusterPlus R package based on the marker genes. g t-SNE plot of HNSCC scRNA-seq cohort (n = 26), colored by sample groups

To interrogate the functional subpopulation and potential role of TIL-Bs in HNSCC, cells defined as B cells in the first level of clustering were selected and re-clustered to identify 15 different B cell subpopulations (Fig. 1e). Then we used Findallmarkers function on these 15 B cell clusters to obtain the B cell-associated differentially expressed genes set (Methods). As a result, a total of 440 marker genes were found, which were used to perform a new unsupervised clustering of the HNSCC samples using ConsensusClusterPlus for feature selection and re-cluster the 25 HNSCC samples (Fig. 1f and Supplementary Fig. 1a). Principal component analysis also classifies the samples into two groups based on the geneset, called the B cells activation group and the B cells inhibition group (Fig. 1g). Samples in these two groups had the largest difference in the proportion and number of their CD8+ T cells (Supplementary Fig. 1bd), indicating a possible distinction in their anti-tumor immunity between these two groups.

Previous data revealed disparities in tumor immunity based on single-cell sample groupings of B-cell signature genes, therefore we dissected the discrepancies between the two groups at the genetic level. As indicated in the volcano map, using log2FC > 1 and P < 0.05 as cut-off thresholds, we identified 43 differentially expressed genes (21 up-regulated and 22 down-regulated genes) between the B cell activation and B cell inhibition groups (Fig. 2a). We defined the 21 upregulated genes as B cell activation gene signature (BCAGS) and used this signature for the next validation analysis. Most of the up-regulated genes were associated with activation of B cells, whereas the down-regulated genes were associated with inhibition of B cells (Fig. 2a, b). The up-regulated genes were highly enriched in the regulation of B cell activation, B cell receptor signaling pathway, and B cell activation pathway, according to GO function enrichment analysis (Fig. 2c). We then performed GO function enrichment for DEGs in other immune cell clusters in these two groups, and found that the upregulated genes in the B cells activation group facilitated multifunctional immunological regulation, such as positive regulation of leukocyte activation, antigen binding, cell killing and immune cell recruitment (Supplementary Fig. 2a–e).

Fig. 2figure 2

Characterization of immunological subtypes based on TIL-Bs. a Volcano plot showing differentially expressed genes between B cells activation group and the B cells inhibition group B cells. Down, downregulated DEGs. NoSignifi, not significant. Up, upregulated DEGs. b Violin plot showing expression of IGHG1, IGHA1, CD24 and IGHM in B cells activation group and the B cells inhibition group. c GO cluster plot showing a chord dendrogram of the clustering of the expression spectrum of significantly upregulated genes in B cells based on the B-cell signature genes classification. d The differential number of interactions (left) or interaction strength (right) in the cell-cell communication network between the two groups was shown by the circle plots

Emerging evidence had demonstrated immune cell function or communication was skewed in malignancies. However, a global profile of immune cell communication in HNSCC based on this grouping approach is largely undefined. To systematically investigate cell-cell communication in the B-cell activation and B-cell suppression groups, we used CellChat to conduct an unbiased analysis of the overall and differential number and strength of interactions, as well as “ligand-receptor” interactions. Intriguingly, the strength of cell-cell communication activation differed between the two groups. As shown, the overall strength of immune cell interactions increased in the B cells activation group (Fig. 2d), and T-cell direct interactions were more active (Supplementary Fig. 2f). And in the ligand-receptor interaction, the B cells activation group exhibited more CSF, CXCL, and MHC-I signaling pathway, while CD22, and CD23 was inhibited, indicating that the B cells activation group improved anti-tumor immunity through cell-cell intercommunication (Supplementary Fig. 2g). These cell-cell communication data provide evidence for the formation of immune cell-cell communication stratification based on the B-cell signature gene HNSCC classification.

Validation for B-cell classification suggests the consistency in different scRNA-seq datasets

To further demonstrate the merits of the obtained B-cell signature genes for classifying HNSCC patients, we collected another HNSCC single-cell data (GSE164690) as the validation cohort. We selected tumor-infiltrating immune cell populations and defined them according to classical markers or evidence from the literature, including NK cells, CD4+ T cells, CD8+ T cells, myeloid cells, and B cells (Supplementary Fig. 3ac). The percentage of immune population cells is also listed for all 18 patients (Supplementary Fig. 3d). We also performed unsupervised clustering of these 18 samples using BCAGS obtained previously, using ConsensusClusterPlus for feature selection and re-clustering (Fig. 3a). Consistently, the patients were clustered into two distinct groups in this cohort (Fig. 3b). Furthermore, in the grouping based on the B-cell signature gene set, we also observed differences in the number and proportion of CD4+ T cells, CD8+ T cells, and B cells between the two groups (Supplementary Fig. 3e, f), which supports the possibility that there is a gap in anti-tumor immunity with this classifying approach.

Fig. 3figure 3

Validation of classification method based on TIL-Bs in another dataset. a Two subgroups were identified using ConsensusClusterPlus R package based on the B-cell signature gene. b t-SNE plot of HNSCC scRNA-seq cohort (n = 15), colored by sample groups. c Volcano plot showing differentially expressed genes between B cells activation group and the B cells inhibition group B cells. Down, down regulated DEGs. NoSignifi, not significant. Up, up regulated DEGs. d GO cluster plot showing a chord dendrogram of the clustering of the expression spectrum of significantly upregulated genes in B cells based on the B-cell signature genes classification. e Violin plot showing expression of IGHG1, IGHA1, CD24 and IGHM in B cells activation group and the B cells inhibition group

We also verified the consistency of this classifying in the single-cell cohort at the gene level. Volcano plots show the DEGs in the transcriptome level between the B cells activation and B cells inhibition groups, where 29 up-regulated and 101 down-regulated genes were observed (Fig. 3c). GO function enrichment showed the upregulated DEGs were significantly enriched in the immunoglobulin complex, circulating, immunoglobulin receptor binding, and regulation of B cell activation pathway (Fig. 3d). Consistent with the previous cohort, the up-regulated genes were mostly B-cell activation-related genes (Fig. 3e). We also performed GO function enrichment on the DEGs of other immune cell populations in these two groups and showed that the upregulated genes in the B cells activation group mediated multifunctional immune regulation, such as positive T cell selection, positive regulation of leukocyte activation, lymphocyte-mediated immunity, and MHC class II protein complex (Supplementary Fig. 4ae). The above results suggest that the classifying method is consistent across different single-cell cohorts.

Furthermore, there was a high degree of consistency for cell-cell communication in both groups, with enhanced total number and strength of immune cell interactions in B cells activation group (Supplementary Fig. 4e) and more active direct T-cell population interactions (Supplementary Fig. 4f). And in ligand-receptor interaction, the B cells activation group showed more active CSF, CXCL, and MHC-I signaling pathway, while CD22 was inhibited, indicating that the B cells activation group improved anti-tumor immunity through cell-cell interactions (Supplementary Fig. 4g). These cell-cell communication data provide evidence for consistency in different single-cell cohorts based on the B-cell signature gene classifying of HNSCC.

B-cell classification in TCGA HNSCC and association with overall survival

To verify whether this B-cell signature gene classifying method applicable to TCGA HNSCC, we performed unsupervised clustering of RAN-seq expression profiles of 501 patient samples and obtained two clusters based on BCAGS (Fig. 4a), and tSNE clustering showed that the samples divided into two groups (Fig. 4b). As shown in the volcano plot, 388 upregulated genes and 42 downregulated genes were identified in the two groups from TCGA HNSCC patients (Fig. 4c). GO function enrichment analysis for the DEGs in the B cells activation group were found to be significantly enriched in lymphocyte activation and immune activation related pathways (Fig. 4d). We correlated the grouping with clinical information to explore the prognostic value of this tumor typing. Previous studies have shown that B cell is associated with improved overall survival in tumor patients.21,22 In our study, the Kaplan-Meier curve showed that patients in the B cells activation group had a significant better survival prognosis (Fig. 4e).

Fig. 4figure 4

Application of classification method based on TIL-Bs in TCGA cohort. a Consensus matrix of TCGA HNSCC cohort (n = 501). b t-SNE plot shows TCGA HNSCC samples were divided into two clusters. c Volcano plot depicts the differentially expressed genes between patients from cluster 1 and cluster 2. d GO cluster plot showing a chord dendrogram of the clustering of the expression spectrum of significantly upregulated genes in B cells activation group based on the B-cell signature genes classification. e The Kaplan-Meier overall survival curves of TCGA HNSCC patients between B cells activation group and the B cells inhibition group. TCGA samples were stratified by the B-cell signature genes. f The enrichment levels of 28 immune-related cells (by ssGSEA analysis) and types in the B cells activation group and B cells inhibition group

To explore the effect of the classifying method on the infiltration fraction of immune cells, ssGSEA was used to visualize the relative abundance of 28 infiltrating immune cell populations. B cells activation group regions showed a higher abundance of immune cell infiltration (Fig. 4f and Supplementary Fig. 5a). Pearson’s correlation study revealed that the abundances of these two groups were positively related within a local environment (Supplementary Fig. 5b). Samples from the B cells activation group in TCGA HNSCC had a higher proportion of tumor-infiltrating lymphocytes cells (Supplementary Fig. 5c), than the B cells inhibition group, as estimated by CIBERSORT. Taken together, our data demonstrated the consistency of such classifying method both in single-cell and RNA-seq cohort.

B-cell classification accurately predict immunotherapy response

We also assessed the infiltration fraction of stromal and immune cells in the tumor using the ESTIMATE function and inferred the purity of the tumor. Higher immune scores, stromal scores, and ESTIMATE score in the B cells activation group, lower tumor purity than in the B cells inhibition group were observed (Fig. 5a–d). Our previous data suggest the presence of a higher abundance of immune cells in the B cells activation group, indicating a potential greater susceptibility to immunotherapy. Meanwhile, we explore the expression profile analysis of typical immune inhibitory molecules (PD-1, CTLA4, LAG3, BTLA, CD274, HAVCR2, VSIR, and PDCD1LG2) on the two groups. Patients within B cells inhibition group showed significant down-regulated inhibitory receptors (Fig. 5e). To further explore possible correction between the classification and immunotherapy response, we calculate the tumor immune dysfunction and exclusion (TIDE) scores in both groups to assess the potential for tumor immune evasion and to predict response to immune response. The results showed B cells activation group had lower TIDE scores than B cells inhibition group (Fig. 5f). The lower TIDE score is usually correlated with better ICB therapy. Our results indicated that patients in the B cells activation group were possibly sensitive to the ICB therapy.

Fig. 5figure 5

B cells-based classification predicts immunotherapy response accurately. Box plots showing lower tumor purity (a), and higher stromal score (b), immune score (c), and ESTIMATE score (d) in the B-cell activation group. e Box plots shows the different expression levels of the typical immune inhibitory receptors (PDCD1, CTLA4, LAG3, BTLA, CD274, HAVCR2, VSIR, and PDCD1LG2) between two groups. f Patients in the B cells activation group showed lower TIDE score, higher Dysfunction, and higher Exclusion score

Four-gene prognostic model development and validation in HNSCC

Our investigation utilized LASSO regression analysis to optimize the BCAGS, unveiling a four-gene ensemble (JCHAIN, GZMB, IGHA1, and PDRX4), fostering the construction of a pivotal prognostic risk model. Among these, JCHAIN, GZMB, and IGHA1 showed significantly elevated expression in the low-risk group, while PDRX4 demonstrated notably increased expression in the high-risk group (Fig. 6a). Subsequently, employing univariate Cox regression analyses across the TCGA cohorts, we sought to discern whether the risk score, untethered from conventional clinical factors, could autonomously prognosticate patient outcomes. Encouragingly, our findings unveiled the risk score (HR: 1.9; 95% CI: 1.2–3) as an autonomous harbinger of OS (Fig. 6b). Notably, KM and log-rank analyses illuminated the stark divergence in OS between high and low-risk HNSCC cohorts (Fig. 6c). Moreover, the efficacy of risk scores in forecasting OS within the TCGA cohort was conspicuous (AUC for 1-, 3-, 5-, and 10-year OS: 0.622 0, 0.634 2, 0.570 2, and 0.626 3, respectively), showcasing their adeptness in long-term prognostication (Fig. 6d). To fortify the prognostic validity of our model, we enlisted a clinical cohort comprising 20 heterogeneous HNSCC patients, aiming to authenticate the expression patterns of the quartet genes (JCHAIN, GZMB, IGHA1, and PRDX4). Employing mRNA quantification of these genes via RT-qPCR, we delineated a precise demarcation between low and high-risk patients within our clinical cohort based on the genes’ expression levels. Strikingly, the JCHAIN, GZMB, and IGHA1 exhibited markedly escalated expression within the low-risk category, while PRDX4 showed a notable decrease (Fig. 6e). Subsequent Kaplan-Meier analysis underscored a significant correlation between the high risk group and a deteriorated prognosis in HNSCC patients (Fig. 6f). Complementary to RT-qPCR, our investigative stride delved deeper through immunohistochemistry (IHC) analysis, unraveling spatial and protein-level insights that corroborated the expression pattern of these genes within the high-risk stratum (Figs. 6g, h). This substantiation further bolsters the nexus between these genes and adverse clinical outcomes. Remarkably, our experimental results seamlessly aligned with our constructed risk model, affirming the potential of these four genes as robust prognostic biomarkers in predicting survival outcomes among HNSCC patients.

Fig. 6figure 6

Unveiling a four-gene risk model for prognostication in HNSCC patients. a Comparative boxplot delineating the modeling-related gene expression levels within the two groups. b Forest plot mapping out OS-related clinical factors via single-factor Cox regression analysis. c Kaplan–Meier survival plots manifesting prognosis-linked risk scores’ impact. d ROC curve spectra, a predictive gauge for 1, 3, 5, and 10-year OS dynamics. e Comparative analysis spotlighting differential gene expressions between high and low-risk groups. f Kaplan-Meier survival curves of HNSCC patients based on four genes Risk-scores. Representative IHC staining images (g) and IHC score (h) in the two groups. Scale bar, 100 μm

留言 (0)

沒有登入
gif