CDH1 overexpression predicts bladder cancer from early stage and inversely correlates with immune infiltration

Identification and integration of DEGs

Thirty-nine NMIBC (Ta and T1 stage) and twenty-nine normal BC samples were enrolled in this study (Table 1). The corresponding clinical and pathological information for each dataset is shown in Additional file 2: Table S1. After normalizing the expression of genes in the four microarray datasets (Additional file 1: Fig. S1), 218, 298, 2925, and 855 up-regulated DEGs and 315, 872, 2399, and 1035 down-regulated DEGs were identified in the GSE3176, GSE7476, GSE40355, and GSE65635 datasets, respectively (Fig. 1A–D). Through integrated analysis by the RRA method, a total of 213 DEGs, 62 up-regulated and 151 down-regulated genes, were finally identified (Additional file 2: Table S2). The top 20 up-regulated and down-regulated genes ranked by fold change are displayed in a heat map (Fig. 1E).

Table 1 The characteristics of each enrolled bladder cancer datasets downloaded from GEO databaseFig. 1figure 1

Identification and integration of differentially expressed genes (DEGs). Volcano plots of A GSE3167, B GSE7476, C GSE40355, and D GSE65635, and E heat map of differentially expressed genes. The abscissa represents the GEO IDs, the ordinate represents the gene name. Red represents logFC > 0; Green represents logFC < 0; and the value in the box represents the logFC

GO and KEGG pathway enrichment analysis for DEGs

A total of 32 and 173 significantly enriched GO terms (adj. P-value ≤ 0.05) for the up-regulated and down-regulated DEGs, respectively, were identified (Additional file 2: Table S3). The up-regulated genes were mainly enriched in cornification, epithelial cell development, keratinocyte and epidermal cell differentiation, and the ERBB2 signaling pathway (Fig. 2A–B). The down-regulated DEGs were mainly enriched in biological processes related to muscle system and mesenchyme development, extracellular matrix and structure organization, and regulation of cell migration and growth (Fig. 2C–D). KEGG analysis indicated that the integrated DEGs were significantly enriched in seven pathways, including focal adhesion (FA), proteoglycans in cancer, vascular smooth muscle contraction, fluid shear stress and atherosclerosis, leukocyte transendothelial migration, dilated cardiomyopathy (DCM), and tight junctions (Table 2 and Fig. 3A–B).

Fig. 2figure 2

GO terms analysis for DEGs. A Bubble plot of prior significantly enriched GO terms of up-regulated DEGs in each classification. B GO Chord plot of the relationship between the top eight GO-BP terms and their corresponding up-regulated genes. A gene is linked to a certain GO term by colored bands. C Bubble plot of the prior significantly enriched GO terms of down-regulated DEGs. D GO Chord plot of the relationship between the eight tumor-related enriched GO-BP terms and their corresponding down-regulated genes

Table 2 KEGG pathway enrichment results for integrated DEGsFig. 3figure 3

KEGG pathway enrichment analysis for DEGs. A Bubble plot of dysregulated pathways. B KEGG Chord plot of the relationship between the selected pathways and their corresponding genes

Multivariate cox analysis and ROC curve plotting for DEGs

According to the AIC, the optimal risk model containing nine genes was determined based on bladder cancer patient gene expression data provided by the TCGA consortium (Table 3). Of these genes, FASN, SPAG4, FCRLB, and UPK2 showed positive coefficients, indicating their role as risk factors for predicting a poor survival. However, UBE2C, FER1L4, EPN3, CTSE, and TOX3 showed negative coefficients, suggesting that they might be protective factors associated with a longer survival. The calculated median value for the risk score of each sample was 1.009. Each sample was grouped into “high-risk” and “low-risk” according to the median risk score. Survival analysis (Fig. 4A) revealed a five-year survival rate of 27.5% [95% confidence interval (CI), 19.3–39.0%] in the high-risk group (202 patients) and 58.1% (95% CI, 49.1–68.7%) in the low-risk group (203 patients). We plotted the ROC curve for the risk model, which yielded an AUC value of 0.726 (Fig. 4B).

Table 3 Coefficients for the nine genes in the optimal risk model for survival of bladder cancer patientsFig. 4figure 4

Survival and ROC curve analysis for DEGs. A Survival curve. Patients were divided into a high-risk group (> 1.009) and a low-risk group (< 1.009) according to the risk score. The difference in the survival rate between the high-risk and low-risk groups was significant (P < 0.001). B ROC curve. The calculated AUC was 0.726. The larger the AUC value, the more likely the current classification algorithm placed the positive sample in front of the negative sample

Integration of PPI network and module analysis

A PPI network of the integrated DEGs was constructed that comprised 171 nodes and 561 edges (Fig. 5A). Then, the top 30 DEGs ranked by degree value in the PPI network were identified (Fig. 5B). After the module analysis of the PPI network, the top three significant modules were determined (Fig. 5C–E) and enriched in several KEGG pathways. Thirteen genes in module 1 were significantly enriched in vascular smooth muscle and cardiac contraction and pathways associated with cardiomyocyte structural and functional abnormalities. Fourteen genes in module 2 were significantly enriched in proteoglycans in cancer, malaria, and ECM–receptor interaction, and seven genes in module 3 were significantly enriched for HTLV-I infection, cell cycle, and ubiquitin-mediated proteolysis (Additional file 2: Table S4).

Fig. 5figure 5

PPI network construction and module analysis for DEGs. A The PPI network of DEGs was constructed, and DEGs with ≤ 3 nodes and ≤ 3 edges were removed from the network. Up-regulated genes are marked in red, and down-regulated genes are marked in green. Larger sizes or darker colors of a node indicate that the corresponding gene had a greater logFC. B Bar plot of the nodes with top 30 degrees in the PPI network. Higher numbers and longer bar lengths represent greater interactions of the protein. C The most significant module in the PPI network had 13 nodes and 77 edges. D The second most significant module in the PPI network had 14 nodes and 36 edges. E The third most significant module in the PPI network had 7 nodes and 13 edges

Identification and survival analysis for hub genes

According to the degree value in the PPI network, up-regulated DEGs exhibiting the top five high degree genes (CDH1, IGFBP3, PPARG, SDC1, and EPCAM) and down-regulated DEGs exhibiting the top five high degree genes (ACTA2, COL3A1, TPM1, ACTC1, and ACTN1) were screened out as hub genes. The gene descriptions, fold changes, and corresponding degree values for the hub gene are presented in Table 4. GO analysis for the hub genes identified eleven significantly enriched GO terms, which were the most associated with the differentiation and development of muscle cells and the cytoskeleton (adj. P-value ≤ 0.05) (Additional file 2: Table S5). The hub genes were also confirmed to be expressed in many BC tissue samples (Fig. 6A), showing that CDH1, IGFBP3, and EPCAM were significantly up-regulated in BC samples from different stages, while ACTA2, TPM1, ACTC1, and ACTN1 were significantly down-regulated. In addition, an analysis of the correlation between the OS and DFS of patients with BC and hub genes showed that patients with BC showing altered expression of ACTA2, COL3A1, TPM1, ACTC1, and ACTN1 exhibited a worse OS, while those with altered expression of PPARG had a better OS, and only high-expressed ACTC1 could predict a worse DFS (Fig. 6B).

Table 4 The fold change and degree values for the hub genesFig. 6figure 6

Expression verification and survival analysis for hub genes. A mRNA expression levels of the ten hub genes using GEPIA online tool. *P < 0.01. B Overall survival (OS) and disease free survival (DFS) analyses and for the hub genes using the GEPIA online tool. Only the genes whose altered expression significantly affect OS and DFS are shown

Correlation between CDH1 expression and clinicopathological characteristics in BC

As the gene with the highest degree in the PPI network, CDH1 was significantly overexpressed in multiple tumor tissues, especially in tumors originating from urogenital tracts, breast, lung, cholangio, cervix, and endometrium, based on TCGA database. However, in colon adenocarcinoma, renal clear cell and papillary cell carcinoma, and thyroid carcinoma, CDH1 was significantly decreased (Fig. 7A). Moreover, CDH1 showed a significantly higher expression in patients with low and high histological grades than in normal BC tissues (Fig. 7B). However, patients with different histological grades, i.e., T, N, and M stages, shared similar CDH1 expression levels (Fig. 7C). Given the results of CDH1 expression in patients with Ta/T1 stage BC, we speculate that CDH1 is markedly up-regulated in the tumor initiation stage but not further altered during the tumor development phase.

Fig. 7figure 7

Association of CDH1 expression with clinicopathological characteristics and CDH1 methylation with prognosis in BC. A Comparison of CDH1 expression between tumor and para-carcinoma tissues in different types of cancers based on TCGA database. ns, P > 0.05; *P < 0.05; **P < 0.01; ***P < 0.001. B CDH1 expression in different histological grade BC tissues and normal bladder tissues. C Patients with different T, N, and M stages shared similar CDH1 expression levels in BC. D Correlation between CDH1 methylation and its expression level in BC. E CDH1 methylation was correlated with prognosis in BC from the MethSurv database. F Heat maps of the association between the methylation level of CDH1 and patient characteristics and genomic subregions

Correlation between methylation of CDH1 and prognosis in BC

Although altered expression of CDH1 did not significantly influence the OS rate in patients with BC, DNA methylation also affects clinical outcomes. The cBioPortal platform was used to evaluate CDH1 methylation in BC, showing that CDH1 expression was highly negatively correlated with methylation (R =  − 0.47, P < 0.001) in BC (Fig. 7D). Kaplan–Meier plots by MethSurv analysis were drawn to identify methylation sites in CDH1 associated with prognosis in BC. We found six methylated sites indicating good prognosis (5′-UTR;1stExon-Island-cg11255163 and cg23989635, Body-Open_Sea-cg09220040 and cg07762788, Body-S_Shore-cg09406989, and Body-S_Shelf-cg08616585) and one site indicating an aberrant prognosis (Body-Open_Sea-cg10313337) (Fig. 7E). The heat map plotted by ‘Gene Visualization’ further illustrates the relationship of CDH1 methylation levels with gene subregions and available characteristics of patients (Fig. 7F).

CDH1 expression and identification of correlated genes in BC cell lines

Based on the CCLE database, CDH1 was also significantly overexpressed in several different cancer cells (Fig. 8A), derived from the urinary tract, breast, lung, cholangio, prostate, esophagus, colorectal, and endometrium, which generally conformed to that in tumor tissues. Through co-expression analysis in 25 different urinary tract cancer cell lines extracted from the CCLE database, a total of 755 and 197 genes were identified that were positively and negatively co-expressed with CDH1, respectively (Additional file 2: Table S6). The top 20 positively and negatively co-expressed genes are depicted in a heat map (Fig. 8B). Notably, another hub gene, EPCAM, had a significant positive correlation of 0.805 with CDH1 (Fig. 8C). Both CDH1 and EPCAM were confirmed by qRT-PCR to be up-regulated in 5637 and RT4 BC cells compared to those in the normal urinary tract epithelial cell line SVHUC1 (Fig. 8D). High and low expression CDH1 phenotypes of significantly enriched pathways, including representative metabolic signaling, cell cycle, and RNA degradation pathways (Fig. 8E and Additional file 2: Table S7), were determined using GSEA.

Fig. 8figure 8

CDH1 expression in vitro and corresponding gene identification and function analysis. A CDH1 expression in various cancer cell lines based on the CCLE database. The ordinate represents the expression level of CDH1. B Heat map of the top 20 positively and negatively co-expressed genes with CDH1 in urinary tract cancer cell lines extracted from the CCLE database. C Correlation between the expression of CDH1 and EPCAM. Cor is the correlation coefficient. D mRNA expression levels of CDH1 and EPCAM in SVHUC1 cells, and 5637 and RT4 bladder tumor cells examined using RT-PCR. Three independent experiments were conducted for each RT-PCR assay, and P < 0.05 was considered to reflect a statistically significant difference from the SVHUC1 group. E GSEA function enrichment analysis of differentially expressed genes in the high CDH1 expression group and low CDH1 expression group based on the CCLE database

Correlation between CDH1 expression and immune cell infiltration in BC

By the ssGSEA method, we quantified the infiltration levels of 24 immune cell types for 413 BC samples of TCGA-BLCA and investigated the association between CDH1 expression and immune cell infiltration. Spearman correlation analyses revealed that high CDH1 expression was mainly associated with low infiltration of the most immune cell types (Fig. 9A), especially pDCs (R = − 0.369, P < 0.001), cytotoxic cells (R = − 0.309, P < 0.001) and Th1 cells (R = − 0.291, P < 0.001). Besides, CD8 + T cells (R = − 0.258, P < 0.001), Treg (R =− 0.233, P < 0.001), T cells (R = − 0.233, P < 0.001), macrophages (R = − 0.216, P < 0.001), neutrophils (R = -0.181, P < 0.001), DCs (R = − 0.175, P = 0.001), B cells (R = − 0.165, P < 0.001), and NK cells (R = − 0.155, P = 0.002) were all negative correlated with CDH1 expression. We observed weakly positive correlations of CDH1 expression level with infiltration of only two immune cell types, including T helper cells (R = 0.1, P = 0.042) and Tcm cells (R = 0.099, P = 0.045). The immune cells relevant to infiltration levels were further evaluated in distinct CDH1 groups (Fig. 9B), which conformed to the results in Fig. 9A.

Fig. 9figure 9

Correlation between CDH1 expression and immune cell infiltration in BC and the ceRNA network construction. A Relationships among infiltration levels of 24 immune cell types and CDH1 expression profiles in BC samples based on the TCGA database by Spearman correlation analysis. B The infiltration levels of the 24 immune cell types were evaluated in distinct CDH1 groups in BC samples. ns, P > 0.05; *P < 0.05; **P < 0.01; ***P < 0.001. DCs, dendritic cells; aDCs, activated DCs; iDCs, immature DCs; pDCs, plasmacytoid DCs; NK, natural killer cells; Th, T helper cells; Th1, type 1 Th cells; Th2, type 2 Th cells; Th17, type 17 Th cells; Tcm, T central memory cells; Tem, T effector memory cells; Tfh, T follicular helper cells; Tgd, T gamma delta cells; Treg, regulatory T cells. C ceRNA network containing hub genes, differently expressed miRNAs and lncRNAs in BC. D ceRNA network associated with CDH1. Larger circular nodes represent mRNA, smaller circular nodes represent miRNA, and hexagonal nodes represent lncRNA

Construction of ceRNA network

We identified 556,431 significantly up-regulated and 125 significantly down-regulated DElncRNAs and 289,181 significantly up-regulated and 108 significantly down-regulated DEmiRNAs between BC and adjacent non-cancer bladder tissues in TCGA database (Additional file 1: Fig S2). Furthermore, a total of 552 miRNAs related to hub genes were predicted by integrating the results obtained in the TargetScan and miRDB databases (Additional file 2: Table S8). Subsequently, a ceRNA network was constructed using DEmiRNAs that interacted with both hub genes and DElncRNAs (Additional file 2: Table S9), including 46 lncRNA nodes, 42 miRNA nodes, and 10 mRNA nodes (Fig. 9C). Of these, hsa-miR-383 was the only significantly down-regulated miRNA interacting with CDH1 in BC and associated with LINC00337, AL356608.1, AL357153.2, LINC00485, MALAT1, and HULC, which were all significantly up-regulated in BC (Fig. 9D).

留言 (0)

沒有登入
gif