Identification of Anoikis-Related Genes in Chronic Kidney Disease Based on Bioinformatics Analysis Combined with Experimental Validation

Introduction

Chronic kidney disease (CKD) is a progressive condition characterized by diverse etiological factors that result in structural changes and dysfunction of the kidneys. It is associated with a high incidence and morbidity and remains incurable.1 The etiology of CKD is complex and multifactorial, prompting ongoing efforts to identify novel biomarkers indicative of renal function and structure.2 Consequently, an intensive study of the mechanism of CKD has significant implications for identifying and screening potential diagnostic biomarkers for the occurrence and development of CKD.

Anoikis, a special mode of apoptosis resulting from cell loss or inappropriate cell adhesion,3 plays a critical role in normal development and tissue homeostasis.4 Fibroblast growth factor (FGF) and Bone morphogenetic protein (BMP) signaling pathways help maintain this homeostasis through the regulation of anoikis.5 Activation of anoikis could suppress the disseminated metastasis of Ewing Sarcoma (EwS) cells.6 In hepatocellular carcinoma (HCC), diminishing resistance to anoikis may prevent tumor invasion metastasis.7 In the context of breast cancer treatment, resistance to anoikis not only facilitates tumor metastasis but also induces alterations in innate immunity.8 Conversely, modulation of innate immunity has been suggested as a potential strategy to mitigate tumor dissemination.9 Interestingly, accumulating evidence has highlighted that the occurrence and progression of CKD could be driven by the activation of the innate immune response.10 Nevertheless, detailed insights into the mechanisms and biomarkers of Anoikis-related genes (ARGs) associated with CKD are still unclear. While previous studies have utilized IHC to validate diagnostic biomarkers in CKD patients, they have not employed GSVA.11 Research on biomarkers related to anoikis in CKD needs to incorporate a variety of analytical methods.

Currently, many biomarkers lack the reliability needed to predict CKD progression and treatment responses, and there are inconsistencies in studies based on differing etiological factors and inclusion criteria. Hence, elucidating the role of anoikis as well as identifying biomarkers and therapeutic targets for CKD is of great significance for clinical intervention. Therefore, in this study, we aim to employ data mining techniques to identify differential genes associated with Anoikis-mediated immunity. We plan to validate our findings through in vivo modeling, aiming to further elucidate the pathogenic mechanisms of CKD and identify potential biomarkers and therapeutic targets.

Materials and MethodsData Acquired and Processing

The CKD related datasets including GSE62792,12 GSE70528,13 GSE6649414 were obtained using the GEOquery15 package. The GSE62792 dataset comprised 18 samples from Homo sapiens, which included 12 CKD blood samples (comprising 3 technical replicates of 4 distinct samples) and 6 normal blood samples (including 3 technical replicates of 2 samples). The associated data platform is GPL10558. To reduce the impact of technical repetition on the results, we used the averaging method to combine duplicate samples and ultimately selected 6 averaged samples for analysis. The GSE70528 dataset, also derived from Homo sapiens and utilizing the GPL570 data platform, included 19 samples: 7 peripheral blood mononuclear cell (PBMC) samples from CKD patients, 4 from patients undergoing hemodialysis, and 8 from patients with hypertension. For a focused analysis on CKD characteristics, only the 7 PBMC samples from CKD patients were selected. The dataset GSE66494, obtained from Homo sapiens via the GPL6480 platform, contained a total of 61 samples, consisting of 53 CKD renal biopsy samples and 8 normal renal biopsy samples. All samples from this dataset were included in our analysis. Datasets GSE66494 and GSE70528 lack information on CKD disease staging. In contrast, the GSE62792 dataset includes samples from various stages of CKD, specifically stages 2 through 5. However, we did not select samples from specific CKD stages for analysis; instead, we included samples from all stages to preliminarily explore the associated biomarkers. This study is exempt from ethical review under items 1 and 2 of Article 32 of the Measures for Ethical Review of Life Science and Medical Research Involving Human Subjects, effective February 18, 2023, in China.

ARGs were gathered from the GeneCards database16 and Molecular Signature Database (MSigDB).17 From the GeneCards database, we retrieved ARGs associated with “Anoikis” categorized as “Protein Coding” resulting in a total of 739 ARGs. Additionally, we identified 947 ARGs from the MSigDB database through a keyword search that included the term “Anoikis”. After merging and removing duplicates, we obtained a total of 108 unique ARGs, with specific details provided in Table S1.

ARDEGs

We performed a thorough process to remove potential batch effects from merging multiple datasets before data analysis. To identify DEGs associated with various groups, we initially utilized the R package sva (Version 3.50.0)18 to remove batch effects from the datasets GSE62792, GSE70528, and GSE66494. This process resulted in a merged dataset comprising 64 CKD samples and 10 normal samples.

Subsequently, we conducted differential expression analysis on the expression profile data from the merged dataset using the limma package (Version 3.58.1). Applying the screening criteria19 of |logFC| > 0.5 and p.adjust < 0.05, we identified DEGs for further analysis. Specifically, genes with |logFC| > 0.5 and p.adjust < 0.05 were classified as upregulated, while those with |logFC| < 0.5 and p.adjust < 0.05 were considered downregulated.

To acquire the ARDEGs, we intersected the DEGs that met the criteria of |logFC| > 0.5 and p.adjust < 0.05 with the ARGs, as represented in a Venn diagram. Additionally, we created a volcano plot using the ggplot2 package (Version 3.4.4) in the R environment to visualize the results of the differential analysis, and heat maps were generated utilizing the heatmap package in R.

Protein-Protein Interaction (PPI)

PPI consisted of individual proteins interacting with each other. The STRING database20 serves to identify both known and predicted interactions between proteins. We visualized the ARDEGs as a PPI network using Cystoscope,21 querying the STRING database with the biological species set to human and establishing a minimum correlation coefficient threshold of 0.400 as the benchmark.

GO Functional and KEGG Pathway Enrichment Analysis of ARDEGs

GO analysis is a method commonly applied for large-scale functional enrichment studies, encompassing biological processes (BP), molecular functions (MF), and cellular components (CC).22 The KEGG database23 is extensively utilized for storing information concerning genomes, biological pathways, diseases, and drugs. We conducted GO functional enrichment analysis on the ARDEGs using the clusterProfiler (Version 4.10.0)24 R package. The criteria for item selection included a p.adjust < 0.05 and False discovery rate (FDR) value (q.value) <0.20, with the Benjamini-Hochberg (BH) method employed for P-value correction. Finally, KEGG pathway enrichment analysis was visualized using the R package pathview (Version 1.0.12).25

GSEA and GSVA

GSEA26 is employed to assess the distribution trend of genes within predefined gene sets, where the gene table is sorted based on phenotypic correlation to evaluate their contributions to the phenotype. In this study, we first sorted the differential genes from the merged dataset according to their logFC and subsequently conducted enrichment analysis on all differential genes using the clusterProfiler package. The parameters for GSEA were defined as follows: the seed value was set to 2020, the number of permutations was 1000, the minimum number of genes in each gene set was 10, the maximum was 500, and the BH method was utilized for P-value correction. The “c2.cp.v7.2.symbols” gene set was retrieved from the MSigDB database,17 with the criteria for significant enrichment set at p.adjust < 0.05 and FDR value (q.value) < 0.25. GSVA,27 known as gene set variation analysis, is a non-parametric as well unsupervised analysis method. It was mainly conducted to assess the gene set enrichment results of the chip nuclear transcriptome transforming the expression matrix of genes between different samples into which of gene sets between various samples, to assess whether disparate pathways are enriched during diverse samples sequentially. GSVA was performed on the merged dataset samples from the calculated functional differences of the enrichment pathway between the CKD group and the Normal group, using the GSVA package with gene sets from the “h.all.v7.4.symbols.gmt” gene set in the MSigDB database.17

Weighted Gene Co-Expression Network Analysis (WGCNA)

WGCNA evaluates the co-expression relationship among genes by calculating the normalized expression correlation coefficients for each gene. Genes exhibiting co-expression relationships are grouped into modules. Within a module, gene expression patterns are similar, while significant variation in expression occurs among genes from different modules. This approach facilitates the transformation of complex high-throughput data into simpler, interpretable modules, thereby achieving a degree of dimensionality reduction. Ultimately, it allows for the identification of correlations between these gene co-expression modules and clinical phenotypes, providing insight into the biological significance of the module.28 To obtain the co-expression modules from the merged dataset, we performed WGCNA using the R package (Version 1.73).29 The minimum number of genes per module was set to 25, the module merging cut height was established at 0.2, and the minimum distance threshold was also set to 0.2. Additionally, we screened for genes with the top 5000 median absolute deviations.

We utilized the GSVA R package (Version 1.50.0)27 to calculate the Anoikis-related score (ARs) of each sample in the merged dataset by the single-sample Gene Set Enrichment Analysis (ssGSEA) algorithm based upon the expression matrix of ARDEGS. We then calculated the correlation between the co-expression modules and ARs to identify module genes associated with anoikis. The genes within the modules that exhibited the highest correlation with ARs were intersected with ARDEGs, and these intersected genes were designated as hub genes. Finally, interactions among the hub genes were visualized using the STRING database.

Differences in Hub Genes Between the CKD Group and Normal Groups

To investigate the differences in hub genes between the CKD group and the Normal group, we first conducted univariate logistic regression analysis for all hub genes based on their expression levels. This analysis aimed to determine whether the observed differences in hub gene expression were statistically significant (P < 0.05) between the two groups. Subsequently, we performed an independent Student’s t-test to compare the expression levels of hub genes between the CKD and Normal groups. The results were presented in the form of group comparison charts to visually illustrate these differences. Additionally, we plotted ROC curves to evaluate the diagnostic performance of the hub genes in distinguishing between the two groups.

Competing Endogenous RNAs (ceRNA) Networks

The ENCORI30 database (https://starbase.sysu.edu.cn/) is the 3.0 version of the starBase database, which provides insights into various interactions, including miRNA-ncRNA, miRNA-mRNA, ncRNA-RNA, RNA-RNA, RBP-ncRNA, and RBP-mRNA. These interactions are based on data mining from CLIP-seq and degradome sequencing (for plant) studies, offering extensive visual interfaces for exploring additional microRNA targets. For miRNA target gene prediction and functional annotation, we utilized the miRDB database.31 Interaction predictions between miRNAs and hub genes were conducted using both the miRDB and miRNet databases. Specifically, we focused on data with a Target Score greater than 80 from the miRDB database, which were then intersected with the mRNA-miRNA interaction network.

After that, we employed the ENCORI database to predict lncRNA-miRNA interactions, applying a screening criterion of clipExpNum ≥ 5 to obtain the interaction network. We subsequently collated the mRNA-miRNA and miRNA-lncRNA interaction relationships to map the comprehensive mRNA-miRNA-lncRNA interaction network.

Immune Infiltrate Analysis

CIBERSORTx32 is based on the linear support vector regression criteria to deconvolve the transcriptome expression matrix, thereby estimating the formation and abundance of immune cells in mixed cells. We uploaded the gene expression matrix data of the combined dataset to the online website of CIBERSORTx, associated with the LM22 feature gene matrix, filtered out the samples based on pvalue < 0.05, and eventually derived the immune cell infiltration matrix. The data whose immune cell enrichment scores over zero were then output after filtering, then we received the specific results of the above immune cell infiltration matrix. Ultimately, histograms were constructed using the ggplot2 package in the R software to display the composition of 22 immune cell infiltrations for each sample. The heat maps and lollipop plots were plotted to characterize the correction between immune cells and hub genes via the psych package.

Construction of Immune Characteristic Subtypes

Consistency clustering33 allows for the categorization of samples into distinct subtypes based on various omics datasets, facilitating the identification of novel disease subtypes and enabling comparative analyses between different subtypes. In our study, we employed the R package (Version 1.67.0) Consensus ClusterPlus34 for clustering analysis to construct immune signature subtypes, categorizing CKD samples into different groups based on the level of immune cell infiltration determined by CIBERSORTx. We conducted 100 repetitions for each parameter setting (reps=100), utilizing a sample resampling proportion of 80% (pltem=0.8) with a resampling rate of 100% (pFeature=1). Partitioning Around Medoids (PAM) was employed as the clustering algorithm. Following the clustering, we conducted a detailed examination of the differences in immune cell composition and hub gene expression across the immune signature subgroups. Additionally, we performed ROC analysis to validate the significance of immune cells and hub genes that exhibited statistically significant (P<0.05) differences.

Experimental Animals

This study utilized 4 distinct forms of chronic kidney disease (CKD)-related mouse models, generated through the following methods:

The AKI model was induced by bilateral renal ischemia-reperfusion injury (IRI) through 45 minutes of bilateral renal artery clamping, followed by 48h of reperfusion. Adult male C57BL/6 mice (special pathogen-free (SPF) level) weighing 18–22 g (6–8 weeks old), were obtained from Guangdong Medical Laboratory Animal Center (Certificate Number: SCXK 2022-0002).

The unilateral ureteric obstruction (UUO) model was established by ligating the left ureter. Kidney samples were collected one week post-surgery. Adult male C57BL/6 mice (SPF level) weighing 18–22 g (6–8 weeks old), were obtained from HuHai Bestest Biotechnology Co., Ltd. (Certificate Number: SCXK 2020-0051).

The diabetic nephropathy (DN) model was induced via a solid high-fat diet, with control group mice receiving regular feed. After 4 weeks of a high-fat diet, mice in the model group were rendered diabetic through daily intraperitoneal injections of streptozotocin (STZ, 50 mg/kg) for 5 days, while the control group received an equal volume of citrate buffer. Successful establishment of the DN model was confirmed when urine protein levels in the model group were significantly elevated compared to the control group. Adult male C57BL/6 mice (SPF level) weighing 18–22 g (6–8 weeks old), were obtained from Guangdong Medical Laboratory Animal Center (Certificate Number: SCXK 2022-0002).

The CKD model was induced using a high-adenine (0.25%) solid diet. Following a one-day fasting period, the mice were fed a high-adenine diet for 2 weeks before transitioning to a normal diet for an additional 2 weeks. After a total of 4 weeks of feeding, the mice were sacrificed. Adult male C57BL/6 mice (SPF level) weighing 18–22 g (6–8 weeks old), were obtained from Guangdong Medical Laboratory Animal Center (Certificate Number: SCXK 2022-0002).

All animal experiments were conducted in strict accordance with the guidelines set forth by the Ethics Committee of Guangzhou University of Chinese Medicine (Ethics approval number: ZYD-2022-263, ZYD-2022-091, ZYD-2022-133, 2022-157). Mice were housed in a SPF environment (25±2°C, 65%humidity, 12h light/12h dark, free access to food and water) with adaptive feeding for at last 7 days. After adaptive feeding, animals were randomly grouped via a random number generator, and the blinded method was used in the experiment to reduce operator bias. All procedures involved the administration of anesthesia using sodium pentobarbital (50mg/kg) via intraperitoneal injection.

RT-qPCR Assay

To validate the hub genes identified through data mining, we opted the RT-qPCR assay. We selected 4 distinct CKD-related mouse models that had already been established by our research team. RNA from the kidney tissue of mice was extracted by AG RNAex Pro Reagent (AG21102, ACCURATE BIOTECHNOLOGY, HUNAN, Co., Ltd), and subsequently converted to complementary DNA (cDNA) by a reverse transcription kit (Code No.AG11728). For the quantitative polymerase chain reaction (PCR), we employed SYBR Green PCR Master Mix (Code No.AG11718). The primer sequences are provided in Table S2. The relative expression levels of the hub genes were calculated using β-actin as an endogenous control according to the 2-∆∆CT method.

Statistical Analysis

The data processing and statistical analyses related to data mining were conducted using R software (version 4.2.0). For RT-qPCR data, statistical analyses were performed via GraphPad Prism software. Differences between two continuous groups were analyzed using the independent Student’s t-tests for normally distributed variables, Mann–Whitney U-tests (Wilcoxon’s rank-sum test) for non-normally distributed variables, whereas the Fisher exact test or a chi-square test for categorical variables. The correlation coefficients between different variables were calculated using Spearman’s method. If not specified, all P values were two-sided, with a P value≤0.05 considered statistically significant.

ResultsThe Flowchart

The bioinformatics workflow of this analysis was illustrated in Figure 1.

Figure 1 The flowchart of this analysis.

Abbreviations: ARGs, Anoikis-related genes; ARDEGs, Anoikis-related differentially expressed genes; GSEA, Gene Set Enrichment Analysis; GSVA, Gene Set Variation Analysis; WGCNA, Weighted gene co-expression network analysis; Go, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; CeRNA, competing endogenous RNAs.

Analysis of ARDEGs

Firstly, the CKD datasets GSE62792, GSE70528, and GSE66494 were processed to eliminate batch effects using the R package sva, resulting in a merged GEO dataset.

Subsequently, we conducted differential analysis on the merged dataset using the R package limma to compare gene expression values between the CKD and normal groups. This analysis revealed 1738 DEGs based on the thresholds of |logFC|>0.5 and p.ajust<0.05, with 592 genes identified as upregulated and 1146 as downregulated. The expression profiles of these genes were visualized using volcano plots (Figure 2A). To obtain the ARDEGs, we intersected the identified DEGs with ARGs that met the criteria of |logFC|>0.5 and p.adjust<0.05, and we plotted a Venn diagram (Figure 2B). This analysis yielded a total of 13 ARDEGs: MST1, SOX9, TAGLN, TNFSF10, LAMC2, NRP1, BIN1, CDH3, NDRG1, CLDN1, LAMB3, CASP8, ICAM1. Following this, we analyzed the expression differences of ARDEGs among various sample subgroups within the combined dataset, and heatmaps (Figure 2C) were generated. We also constructed a correlation heatmap (Figure 2D) for the ARDEGs, which indicated that CDH3 was significantly (P<0.05) correlated with all other ARDEGs. Comparative grouping analysis (Figure 2E) demonstrated that all ARDEGs exhibited significant expression differences (P<0.05) between the CKD and Normal groups.

Figure 2 Analysis of ARDEGs. (A) Volcano plot of DEGs in the CKD and Normal groups of the merged dataset. (B) Venn diagram of DGEs and ARGs. (C) Heat map of MRDEGs in the CKD and Normal groups for the merged dataset. (D) Correlation heat map of ARDEGs for the merged dataset. (E) Group comparisons of ARDEGs in the CKD and Normal groups for the merged dataset. (F) PPI network diagram of ARDEGs. The symbol * denotes P<0.05, with statistical significance. The symbol ** denotes P<0.01, with highly statistically significant. The symbol *** denotes P< 0.001 and with extremely statistically significant.

Abbreviations: CKD, Chronic kidney disease; DEGs, Differentially expressed genes; ARGs, Anoikis-related genes; ARDEGs, Anoikis-related differentially expressed genes; PPI, Protein-protein interaction.

Ultimately, to elucidate the interactions (Figure 2F) among the ARDEGs, we utilized the String database. This analysis revealed that, with the exception of NRP1, BIN1, and SOX9, the remaining ARDEGs established interactive relationships with each other.

GO Functional and KEGG Pathway Enrichment Analysis of ARDEGs

To reveal the biological processes, molecular functions, cellular components, and biological pathways of the 13 ARDEGs (MST1, SOX9, TAGLN, TNFSF10, LAMC2, NRP1, BIN1, CDH3, NDRG1, LAMB3, CASP8, ICAM1), we initially performed GO functional and KEGG pathway enrichment analysis. Afterward, the results of GO functional enrichment analysis (Figure 3A and B) and KEGG pathway enrichment analysis (Figure 3C and D) were displayed by histogram and circular network plot.

Figure 3 GO functional and KEGG pathway enrichment analysis of ARDEGs. (A) Histogram of GO functional enrichment analysis results for ARDEGs. (B) Circular network plot of GO function enrichment analysis for ARDEGs. (C) Histogram of KEGG pathway enrichment analysis results for ARDEGs. (D) Circular network plot of KEGG pathway enrichment analysis results for ARDEGs. (E) The pathway plot of ECM-receptor interaction. (F) The pathway plot of Influenza A.

Abbreviations: GO, Gene Ontology; BP, biological process; CC, cellular component; MF, molecular function; KEGG, Kyoto Encyclopedia of Genes and Genomes; ARDEGs, Anoikis-related differentially expressed genes.

From Figure 3A–D, it is evident that ARDEGs are primarily enriched in BP, such as cell junction assembly, cell junction organization, negative regulation of extrinsic apoptotic signaling pathway, and regulation of extrinsic apoptotic signaling pathway. In terms of CC, the ARDEGs are associated with the collagen-containing extracellular matrix, cell-cell junction, laminin complex, and extracellular matrix components. Regarding MF, they exhibit activities such as tumor necrosis factor receptor binding, tumor necrosis factor receptor superfamily binding, virus receptor activity, and hijacked molecular function. For comprehensive details, please refer to Table S3. Besides, the principal pathways enriched among the ARDEGs included those involved in cell cycle progression Toxoplasmosis, Cell adhesion molecules, Influenza A, Pathogenic Escherichia coli infection, Human papillomavirus infection, Viral myocarditis, ECM-receptor interaction, Small cell lung cancer, Amoebiasis, TNF signaling pathway, and Leukocyte transendothelial migration. For additional details, see Table S4. The pathways ECM-receptor interaction (Figure 3E) and influenza A (Figure 3F) from the KEGG enrichment analysis results were depicted using pathway plots.

GSEA and GSVA

To investigate the influence of DEGs on the progression of CKD, we analyzed the associations among DEGs expression levels in the merged dataset and the biological processes, cellular components, and molecular functions involved using GSEA (For additional details, see Table S5). These results demonstrate that DEGs in the combined dataset were significantly enriched in pathways related to fatty acid metabolism (Figure 4A), oxidative phosphorylation (Figure 4B), glucose metabolism (Figure 4C), glycolysis gluconeogenesis (Figure 4D), among several others.

Figure 4 GSEA and GSVA of the merged dataset. (A–D) DEGs within the merged dataset were significantly enriched in fatty acid metabolism (A), oxidative phosphorylation (B), glucose metabolism (C), and glycolysis gluconeogenesis (D) pathway. (E) Complex numerical heat map of GSVA results between CKD and Normal groups in the merged genes.

Abbreviations: GSEA, Gene Set Enrichment Analysis; GSVA, Gene Set Variation Analysis.

To assess the differences in the HALLMAPK gene set between the CKD group and Normal group samples during the merged dataset, we then performed GSVA (For additional details, see Table S6). We examined the differential expression of 18 hALLMAPK pathways across different subgroups based on the GSVA results. We used the R package pheatmap (Version 1.0.12) to depict a heat map displaying the findings of this specific differential analysis (Figure 4E). The results demonstrated that a sum of 18 hALLMAPK gene sets, including HALLMAPK_HYPOXIA, exhibited significant (P<0.05) differences between the various subgroups of the pooled datasets.

WGCNA

We applied WGCNA to the merged dataset to screen the co-expression modules. During the WGCNA analysis, we first filtered outlier samples from the merged dataset using hierarchical clustering (Figure 5A) with a cut height of 80. The optimal soft threshold was determined to be 7, as indicated by the power diagram (Figure 5B). Figure 5C presents the gene clustering tree, showing that the genes in the merged dataset were grouped into 26 distinct modules (Figure 5D). Notably, the modules MElightyellow (r=0.88), MElightcyan (r=0.67), MEroyalblue (r=−0.7) exhibited moderate correlations with Ars. Soon afterward, we selected the intersection between the MElightyellow module and the ARDEGs, subsequently plotting a Venn diagram (Figure 5E) to illustrate this relationship. The genes that emerged from this intersection were designated as hub genes, which include LAMC2, NRP1, CDH3, NDRG1, CLDN1, and LAMB3. To further explore the interactions among these hub genes (Figure 5F), we utilized the STRING database, which revealed notable interactions between CLDN1 and CDH3, as well as between LAMB3 and LAMC2.

Figure 5 WGCNA identified co-expression modules in merged datasets. (A) Sample clustering tree displayed in the merged dataset. (B) Gene clustering tree of the merged dataset. (C) Power scatter plot of the merged dataset. (D) Heat map of correlations between modular genes and Ars in the merged dataset. (E) Venn diagram of Melightyellow’s modular genes and ARDEGs. (F) PPI network diagram of hub genes.

Abbreviations: WGCNA, Weighted gene co-expression network analysis; ARDEGs, Anoikis-related differentially expressed genes; PPI, Protein-protein interaction; Ars, Anoikis-related score.

Differences in Hub Genes Between the CKD and Normal Groups

To check into the differences in hub gene expression between the CKD and Normal groups, to begin with, we conducted the univariate logistic regression analysis of the hub genes (LAMC2, NRP1, CDH3, NDRG1, CLDN1, LAMB3) based on their expression levels. The results were presented as a forest plot (Figure 6A), which clearly indicates significant (P<0.05) diagnostic effects for all hub genes. Subsequently, we generated a group comparison diagram (Figure 6B) to illustrate the expression profiles of the hub genes between the CKD and normal groups within the merged dataset. Apart from this, we performed ROC analysis for all hub genes (Figure 6C–H) to further validate their diagnostic efficacy. The expression levels of these genes displayed a degree of diagnostic potential for both groups.

Figure 6 The differences of hub genes between the CKD and Normal group. (A) Forest plot for univariate logistic regression analysis of hub genes. (B) The group comparison diagram of hub genes between the CKD and the Normal group in the merged dataset. (C–H) ROC verification of CDH3 (C), CLDN1 (D), LAMB3 (E), LAMC2 (F), NDRG1 (G), NRP1 (H). The symbol ** denotes P<0.01, with highly statistically significant. The symbol *** denotes P<0.001 and with extremely statistically significant. The closer the AUC is to 1, the better the diagnostic effect. AUC in the range of 0.5–0.7 corresponds to an inferior degree of accuracy; AUC in the range of 0.7–0.9 corresponds to some degree of accuracy; AUC above 0 corresponds to a senior degree of accuracy.

Abbreviation: CKD, Chronic kidney disease.

The ceRNA Network of Hub Genes

We utilized the mRNA-miRNA data from the ENCORI and miRDB databases to predict the interactions between the miRNAs and 6 hub genes (LAMC2, NRP1, CDH3, NDRG1, CLDN1, LAMB3), thereafter, we identified the intersection of the predicted miRNA sets and used the NCORI database to predict associated long non-coding RNAs (lncRNAs). In the end, we visualized the mRNA-miRNA-lncRNA interaction network (Figure 7) using Cytoscape. In the resulting interaction network, we identified 4 mRNAs (LAMC2, NRP1, NDRG1, CLDN1), 27 miRNAs, which include hsa-miR-3163, hsa-miR-29c-3p, hsa-miR-29a-3p, hsa-miR-29b-3p, hsa-miR-577, hsa-miR-193b-3p, hsa-miR-193a-3p, hsa-miR-449c-5p, hsa-miR-34b-5p, hsa-miR-2683-5p, hsa-miR-2682-5p, hsa-miR-124-3p, hsa-miR-5094, hsa-miR-506-3p, hsa-miR-148a-3p, hsa-miR-152-3p, hsa-miR-148b-3p, hsa-miR-1-3p, hsa-miR-206, hsa-miR-320d, hsa-miR-613, hsa-miR-4429, hsa-miR-320c, hsa-miR-338-3p, hsa-miR-9-5p, hsa-miR-320b, and hsa-miR-24-3p, as well as hsa-miR-186-5p. Additionally, we identified 114 lncRNAs associated with these interactions. The detailed mRNA-miRNA-lncRNA interactions are elaborated in Table S7.

Figure 7 ceRNA network. In the figure, the green circular block represents mRNA, the blue circular block represents miRNA, and the yellow circular block represents lncRNA. The lines represent interactions; an increased number of lines indicates a more important gene within the network.

Abbreviation: ceRNA, competing endogenous RNAs.

Correlation Analysis of Hub Genes

To investigate the corrections among hub genes in the merged dataset, we calculated the Spearman correlation coefficients for the hub genes (LAMC2, NRP1, CDH3, NDRG1, CLDN1, LAMB3). We then visualized these correlations using a heatmap (Figure 8A) and a chord diagram (Figure 8B). As shown in these figures, the correlations among all hub genes were statistically significant (P< 0.05). Among them, a relatively high positive correlation existed between CLDN1 and NRP1, as well as between CLDN1 and LAMC2. Conversely, a high negative correlation was noted between NDRG1 and both LAMC2 and LAMB3. To further illustrate these correlations, we generated scatter plots (Figure 8C–F). Figure 8C and D illustrated that there were moderate correlations among the hub genes. Specifically, a positive correlation was observed between NDRG1 and LAMC2 (r = 0.714), as well as between NDRG1 and LAMB3 (r = 0.746). Conversely, a negative correlation existed between NDRG1 and CLDN1 (r = −0.754), and also between NDRG1 and LAMB3 (r = −0.586).

Figure 8 Correlation analysis of hub genes. (A) Correlation heat map of hub genes. (B) Correlation chord diagram of hub genes. (C–F) Correlation scatter plots of NRDG1 and LAMC2 (C), NDRG1 and LAMB3 (D), NRDG1 and LAMC2 (E), NDRG1 and LAMB3 (F). The symbol ** denotes P<0.01, with highly statistically significant. The correlation coefficient r is positive, indicating that there might be a positive correlation between the two variables; if negative, indicating that there might be a negative correction coefficient above 0.8, is considered a strong correlation, at the range from 0.5 to 0.8 is considered a moderate correlation, at the range from 0.3 to 0.5 is considered generally related, below 0.3 is considered weakly or irrelevantly correlated.

Immuno-Infiltration Analysis of Merged Data Sets

The expression profiles from the merged dataset were processed and uploaded to the CIBERSORTx online website, in which the algorithm was applied to calculate the correlation between 22 immune cells and the expression profile data of the CKD and Normal groups in the merged dataset. According to the results of the immune infiltration analysis, we constructed a histogram (Figure 9A) depicting immune cell infiltration in the merged dataset. Additionally, we performed a correlation analysis of expression levels between the immune cells and the hub genes, based on the abundance of immune cell infiltration, and plotted a heatmap (Figure 9B) to illustrate these correlations. The analysis revealed significant correlations (P<0.05) between specific immune cells, notably Dendritic cells resting and Mast cells resting, with the hub genes. Furthermore, we created lollipop plots to visualize the correlations of Dendritic cells resting (Figure 9C) and Mast cells resting (Figure 9D) with the hub genes. As shown in these figures, Dendritic cells resting exhibited the highest positive correlation with NDRG1 and the highest negative correlation with CLDN1. In contrast, Mast cells restings demonstrated the highest positive correlation with NDRG1 and the highest negative correlation with CDH3.

Figure 9 Immune-infiltration analysis of hub genes. (A) Histogram of the infiltration results of immune cells in the merged dataset. (B) Heat map of the correlation between immune cells and hub genes. (C) The lollipop plot of the correlation between Dendritic cells resting in immune cells and hub genes. (D) The lollipop plot of the correlation between Mast cells resting in immune cells and hub genes. The symbol * denotes P<0.05, with statistical significance. The symbol ** denotes P<0.01, with highly statistically significant.

Analysis of Subtypes of Immune Characteristics

We performed unsupervised consensus clustering on the CKD group samples from the merged dataset based on the infiltration abundance of 22 immune cell types. This analysis identified 2 distinct subtypes within the CKD group (cluster1: n = 40; cluster2: n=24, Figure 10A–C). Shortly afterward, we demonstrated the differences in immune cell infiltration among the immune signature subtypes (cluster1, cluster2) by grouped comparison plots (Figure 10D) and then performed the ROC validation for genes in which there were significant (P<0.05) differences (Figure 10E–J).

Figure 10 Analysis of immune characteristic subtypes. (A) Consistent clustering heat map of the merged dataset. (B) Function diagram of consistent cumulative distribution. (C) Delta Area of consistent clustering. (D) Comparative graph of grouping of immune cells in subgroups of different immune characteristic subtypes. (E–J) ROC validation of Eosinophils (E), Mast cells resting (F), Neutrophils (G), T cells CD4 memory resting (H), T cells CD8 (I), Tregs (J). (K) Comparative graph of the grouping of hub genes in different immune characteristic subtypes. (L and M) ROC validation of CDH3 (L), NRP1 (M). The symbol ns denotes P≥ 0.05, with not statistically significant. The symbol * denotes P<0.05, with statistical significance. The symbol ** denotes P<0.01, with highly statistically significant. The symbol *** denotes P<0.001 and with extremely statistically significant. The closer the AUC is to 1, the better the diagnostic effect. AUC in the range of 0.5–0.7 corresponds to an inferior degree of accuracy; AUC in the range of 0.7–0.9 corresponds to some degree of accuracy; AUC above 0 corresponds to a senior degree of accuracy.

The results demonstrated that 6 immune cells types - Eosinophils, Mast cells resting, Neutrophils, T cells CD4 memory resting, T cells CD8, and regulatory T cell (Tregs) - showed significant (P<0.05) differences between the subgroups characterized by varying immune profiles. Among them, Eosinophils (AUC=0.699, Figure 10E), Mast cells resting (AUC=0.681, Figure 10F), Neutrophils (AUC=0.697, Figure 10G), Tregs (AUC=0.699, Figure 10J) showed a low degree of diagnostic effect in immune characteristic subtypes (cluster1, cluster2). T cells CD4 memory resting (AUC=0.792, Figure 10H) and T cells CD8 (AUC=0.785, Figure 10I) had a general diagnostic effect in immune characteristic subtypes (cluster 1, cluster2). We as well presented the differential expression of hub genes in the 2 immune signature subtypes (cluster1, cluster2) by grouped comparison plots (Figure 10K). These results demonstrated that 2 hub genes (CDH3, NRP1) were significantly (P<0.05) different between the immune characteristic subgroups. Notably, the expression levels of CDH3 (AUC=0.722, Figure 10L) and NRP1 (AUC=0.728, Figure 10M) exhibited general diagnostic effects in immune characteristic subtypes (cluster1, cluster2) severally.

Validation of Hub Genes

To validate the hub genes derived from bioinformatics, we assessed the relative mRNA expression levels of these hub genes in kidney tissues from multiple CKD animal models, as illustrated in Figure 11Y1 and Y2, using RT-qPCR, shown in Figure 11A–X. The results showed that the Model groups exhibited significantly increased expression levels of LAMB3 (Figure 11A, G, M and S) and CDH3 (Figure 11C, I, O and U) when compared to Control/Sham tissues. These findings align with the results obtained from bioinformatic analyses.

Figure 11 Validation of hub genes. (A–F) The relative mRNA expression level of hub genes in the model of IRI-induced AKI model mice. (G–L) The relative mRNA expression level of hub genes in the model of UUO model mice. (M–R) The relative mRNA expression level of hub genes in the model of DN model mice. (S–X) The relative mRNA expression level of hub genes in the model of CKD model mice. (Y1 and Y2) The detailed schematic diagrams of mouse models were drawn by Figdraw (UIURT31101). The results of gene expression are presented as fold change, which was calculated relative to β-actin expression. Data were expressed as mean±SD. N=4/5/6. The symbol ns denotes P≥0.05, with not statistically significant. The symbol * denotes P<0.05, with statistical significance. The symbol ** denotes P< 0.01, with highly statistically significant. The symbol *** denotes P<0.001 and with extremely statistically significant.

Discussion

CKD is characterized by insidious onset, rapid progression, and poor prognosis, affecting approximately 10% of adults worldwide due to various underlying causes.1 Increased resistance to anoikis plays a crucial role in altering the tumor microenvironment in prostate cancer,35 with CD8+ T cells being particularly prominent in this process.36 Of note, the adaptive immune system acts a complex role in the pathogenesis of diverse diseases.37 Additionally, innate immune regulation is involved in the initiation and advancement of CKD.38 However, the exact mechanisms of anoikis related to CKD are currently unclear. At present, there are no specific targets established for the study of CKD-related anoikis. Identifying potential biomarkers associated with anoikis and CKD progression might be an effective avenue to prevent and treat CKD.

In this study, we utilized bioinformatics approaches to identify 1738 DEGs between the CKD and Normal groups, among which 13 were identified as ARDEGs: MST1, SOX9, TAGLN, TNFSF10, LAMC2, NRP1, BIN1, CDH3, NDRG1, CLDN1, LAMB3, CASP8, and ICAM1. Furthermore, a total of 26 co-expression modules were identified through WGCNA, leading to the selection of the MElightyellow module for further investigation with the ARDEGs. This analysis yielded 6 hub genes: LAMC2, NRP1, CDH3, NDRG1, CLDN1, and LAMB3. In addition, univariate logistic regression analysis demonstrated that all hub genes had significant diagnostic effects, and ROC analysis also revealed that the expression of all genes had a certain degree of diagnostic effect on both the CKD group and Normal groups.

The methylation modification of LAMC2 is related to its expression in aging kidneys and the progressive decline in renal function.39 Knocking out NRP1 has been shown to reduce the migration and invasion capabilities of renal cell carcinoma cells, thereby helping to maintain the undifferentiated phenotype of these cancer cells.40 In epithelial ovarian cancer (EOC), HOXA9 binding to the CDH3 gene promoter induces CDH3 expression while inhibiting anoikis.41 CDH3 plays a crucial role in the organization of cells during the epithelial-mesenchymal transition (MET) in kidney development, thus, controlling CDH3 expression or activity could improve kidney cell aggregation and the MET process, potentially slowing kidney disease progression.42 The stability of NDRG1 is regulated by the phosphorylation state of upstream protein CX43, which can alter the stability of gap junctions and impact responses to ischemic injury.43 Additionally, the gene expression of claudin-1 in glomerular podocytes is associated with severe albuminuria, which could stabilize the SD protein complex in podocytes.44 In pancreatic ductal adenocarcinoma (PDAC), the upregulation of LAMB3 has been identified as a regulator of cell apoptosis, invasion, and metastatic capacity, LAMB3 is also linked to poor prognosis in PDAC. Inhibiting LAMB3 or blocking its downstream signaling pathways could offer new treatment options for this disease.45 Furthermore, caspase-8 functions as a molecular switch among apoptosis, necrosis, and pyroptosis,46 and it is also implicated in the occurrence of cell anoikis.47 High levels of ICAM-1 expression in tumors and plasma of patients with non-small cell lung cancer have been associated with prolonged survival, and soluble ICAM-1 may enhance the efficacy of anti-PD-1 therapies by activating cytotoxic T cells.48 The studies on the anoikis-related differentially expressed genes (ARDEGs) in CKD lend support to our findings. Notably, our experimental results concerning the hub genes align with existing literature, suggesting that these genes may serve as potential therapeutic targets for drug development, as well as possible diagnostic biomarkers for CKD.

KEGG pathway analysis indicated that ARDEGs are associated with several pathways, including Toxoplasmosis, Cell Adhesion Molecules, Influenza A, Pathogenic Escherichia coli Infection, Human Papillomavirus Infection, Viral Myocarditis, ECM-Receptor Interaction, Small Cell Lung Cancer, Amoebiasis, TNF Signaling Pathway, and Leukocyte Transendothelial Migration Pathway. Studies have shown that ECM-receptor interaction is implicated in abnormal protein accumulation during glomerular podocyte injury,49 as well as in the process of anoikis.50 Furthermore, influenza vaccination may reduce the risk of lung cancer in patients with CKD.51 Additionally, the signal transduction pathways involving adhesion-related molecules influence the anoikis of cancer cells, with some branches also regulating TNF family signal transduction.52 These findings suggest that preventing anoikis could be beneficial in reducing mortality and morbidity following kidney injury. The pathways identified through enrichment analysis may indicate that ARDEGs could serve as potential biomarkers for CKD. Moreover, during the complex transition from AKI to CKD, metabolic reprogramming of renal tubular epithelial cells (TECs) contributes to inflammation, lipid accumulation, and fibrosis, thereby accelerating disease progression.53 Aside from this, vascular calcification associated with CKD is mediated by oxidative phosphorylation.54 Among our results, GSEA showed that DEGs in the combined data set were significantly enriched in pathways related to fatty acid metabolism, oxidative phosphorylation, glucose metabolism, glycolysis, and gluconeogenesis, aligning with findings from previous research. These studies further validate our results to some extent regarding CKD-related signaling pathways.

Immune cells play a pivotal role in renal injury, and their complex metabolic processes exhibit different pathological effects at various stages of CKD.55 Following renal injury, the expression of tissue-resident macrophages and Tregs was significantly increased.56 T cells, particularly CD8+ subsets, display pronounced signs of immunosenescence during early-stage CKD renal dysfunction.57 Additionally, podocyte DNA damage may lead to changes in CD8+ T cell infiltration, thereby exacerbating persistent renal damage in CKD.58 A study created a gene knockout mouse model of Ndrg1-CreER and found that the glomerular albumin sieving coefficient (GSC) increased 2.2-fold compared to non-diabetic mice.59 This data supports the understanding of renal albumin dynamics associated with hyperfiltration in DN. In our study, the immune infiltration analysis of hub genes showed that there was a significant (P<0.05) correlation between Dendritic cells resting and Mast cells resting with hub genes. Furthermore, our analysis of immune characteristic subtypes revealed the presence of six immune cell types and two hub genes across different immune characteristic subtypes. Although this study conducted immune cell infiltration analysis based on the CIBERSORTx algorithm, the lack of comparison with other immune infiltration assessment methods, such as EPIC and xCell, might limit our comprehensive understanding and validation of the immune cell infiltration results. Future research should integrate multiple methods and datasets to enhance the reliability and accuracy of the findings.

CKD encompasses a range of syndromes characterized by chronic renal dysfunction and involves complex mechanisms that include various critical signaling pathways, such as lipotoxicity, oxidative stress, inflammation, and activation of myofibroblasts.60 Given this complexity, it is challenging to assess the effectiveness of specific interventions for different stages of CKD. We have conducted wet experiments to verify the relative mRNA expression levels of these hub genes in the several CKD animal models. However, due to research progress and resource limitations, we cannot currently conduct comprehensive validation across diverse populations and multiple independent datasets.

In patients with CKD, traditional indicators like creatinine and proteinuria are commonly used to assess kidney damage and its progression.61 We recognize that biomarkers could have different diagnostic values at various stages of CKD. We did not select specific stage samples for analysis, which prevented us from deeply analyzing the impact of each stage on biomarkers. However, this has established a broad foundation for further research, allowing us to identify a set of biomarkers that may have potential clinical significance. In future studies, we will incorporate additional independent datasets for comprehensive validation. We also aim to investigate the expression of these genes in various populations and pathological conditions to ensure that our findings are broadly applicable and clinically significant. In future studies, we plan to add more independent datasets for validation and examine the expression of these genes across different populations and pathological conditions to enhance the applicability and clinical relevance of our findings.

Conclusion

In our study, we identified 6 hub genes, LAMC2, NRP1, CDH3, NDRG1, CLDN1, and LAMB3, from ARDEGs, which were also partly involved in Dendritic cells resting and Mast cells resting in immune infiltration. Accordingly, hub genes such as CDH3 might serve as potential key biomarkers of anoikis-related progression in CKD. Overall, CDH3 and its related hub molecules might be potential core biomarkers in the development of CKD, and the insights gained from this study might be of assistance to provide a new perspective on the diagnostic markers and therapeutic targets of CKD.

Funding

This work was supported by the National Natural Science Foundation of China (No. 82174061), the Government-funded Postdoctoral Fellowship Program (GZC20230629).

Disclosure

The authors declare that they have no known competing financial interests or personal relationships that could be construed as a potential conflict of interest.

References

1. Kalantar-Zadeh K, Jafar TH, Nitsch D, Neuen BL, Perkovic V. Chronic kidney disease. Lancet. 2021;398(10302):786–802. doi:10.1016/s0140-6736(21)00519-5

2. Levey A, Grams M, Inker L, Ingelfinger JR. Uses of GFR and albuminuria level in acute and chronic kidney disease. New Engl J Med. 2022;386(22):2120–2128. doi:10.1056/NEJMra2201153

3. Taddei ML, Giannoni E, Fiaschi T, Chiarugi P. Anoikis: an emerging hallmark in health and diseases. J Pathol. 2012;226(2):380–393. doi:10.1002/path.3000

4. Kakavandi E, Shahbahrami R, Goudarzi H, Eslami G, Faghihloo E. Anoikis resistance and oncoviruses. J Cell Biochem. 2018;119(3):2484–2491. doi:10.1002/jcb.26363

5. Macabenta F, Sun HT, Stathopoulos A. BMP-gated cell-cycle progression drives anoikis during mesenchymal collective migration. Dev Cell. 2022;57(14):1683–1693.e3. doi:10.1016/j.devcel.2022.05.017

6. Zhang HF, Hughes CS, Li W, et al. Proteomic screens for suppressors of anoikis identify IL1RAP as a promising surface target in Ewing sarcoma. Cancer Discov. 2021;11(11):2884–2903. doi:10.1158/2159-8290.Cd-20-1690

7. Song J, Liu Y, Liu F, et al. The 14-3-3σ protein promotes HCC anoikis resistance by inhibiting EGFR degradation and thereby activating the EGFR-dependent ERK1/2 signaling pathway. Theranostics. 2021;11(3):996–1015. doi:10.7150/thno.51646

8. Dai Y, Zhang X, Ou Y, et al. Anoikis resistance—protagonists of breast cancer cells survive and metastasize after ECM detachment. Cell Commun Signal. 2023;21(1):190. doi:10.1186/s12964-023-01183-4

9. Charrier M, Mezquita L, Lueza B, et al. Circulating innate immune markers and outcomes in treatment-naïve advanced non-small cell lung cancer patients. Eur J Cancer. 2019;108:88–96. doi:10.1016/j.ejca.2018.12.017

10. Speer T, Dimmeler S, Schunk SJ, Fliser D, Ridker PM. Targeting innate immunity-driven inflammation in CKD and cardiovascular disease. Nat Rev Nephrol. 2022;18(12):762–778. doi:10.1038/s41581-022-00621-9

11. Liu T, Zhuang XX, Qin XJ, Wei LB, Gao JR. Identifying effective diagnostic biomarkers and immune infiltration features in chronic kidney disease by bioinformatics and validation. Front Pharmacol. 2022;13:1069810. doi:10.3389/fphar.2022.1069810

12. Sayanthooran S, Gunerathne L, Abeysekera TDJ, Magana-Arachchi DN. Transcriptome analysis supports viral infection and fluoride toxicity as contributors to chronic kidney disease of unknown etiology (CKDu) in Sri Lanka. Int Urol Nephrol. 2018;50(9):1667–1677. doi:10.1007/s11255-018-1892-z

留言 (0)

沒有登入
gif