Identification of Hub Genes and Immune Infiltration in Coronary Artery Disease: A Risk Prediction Model

Introduction

Coronary artery disease (CAD), which results from atherosclerosis, is a common cardiovascular disease. Hypertension, diabetes, obesity, and smoking are the main factors that induce CAD.1,2 As a major cardiovascular disease affecting the global population, CAD has been demonstrated to be the leading cause of death in developed and developing countries.3 The prognosis of CAD remains unfavorable due to slow death from deteriorating cardiac function, sudden ventricular fibrillation, or atherosclerotic plaque rupture.4,5

In the case of CAD, coronary angiography (CAG) is considered the gold standard of diagnosis, but its application is limited due to its high cost and invasiveness. In contrast, blood biomarker detection exhibits marked advantages, such as lower cost and less invasiveness.6 The diagnosis of CAD is often based on a variety of blood markers, including myocardial injury markers (myocardial enzyme, troponin), inflammatory markers (C-reactive protein [CRP]), blood lipid indicators, and homocysteine Hcy.7 Many studies have confirmed that CRP is an independent factor for predicting primary and secondary CAD events.8 Dyslipidemia is considered a key risk factor for atherosclerosis. Small and dense low-density lipoprotein (sdLDL) is often observed in the serum of patients with atherosclerosis.9 Lipoprotein (a), another cardiovascular risk factor, has multiple effects on atherosclerotic cardiovascular disease (ASCVD).10 These markers provide essential clues for diagnosis. Still, it should be noted that they cannot be used as the basis for diagnosis alone and should be combined with clinical manifestations and other examination methods. In addition, microparticles (MP) and exosomes, as emerging biomarkers, have shown great potential in treating various diseases.11 Exosomes play a central role in atherosclerosis because they are deeply involved in lipid metabolism. However, although exosomes are closely related to atherosclerosis, the detailed mechanism of the specific goods (ie, contents) carried by exosomes in atherosclerosis is still largely unknown.

Recently, microarray and high-throughput sequencing are now widely applied to predict potential therapeutic targets for atherosclerosis.12 Furthermore, these methods can be used to identify biomarkers that can be applied to diagnose and treat CAD.5 Although polygenic risk scores have been developed to test for the prevalence of high coronary artery disease accurately,13 the applications of these current biomarkers are limited under many conditions.2 In addition, most studies have focused on a single disease feature, such as lncRNA, mRNA, or miRNA expression signatures. Previous studies have demonstrated that the expression signatures of numerous genes perform better in disease classification and diagnosis than the expression signatures of a single gene.14 Therefore, we updated multiple signatures of CAD diagnostic methods using a high-tech strategy of gene expression signatures combined with immune scoring. The infiltration of various immune cell types may play a significant role in CAD and offer viable targets for disease therapy.15,16 Moreover, genetic polymorphisms in tumor immune microenvironment (TME)-related mRNAs may be associated with clinicopathological features in coronary lesions.17

This study aimed to explore the biological function of hub genes in CAD. We utilize the CIBERSORT algorithm to analyze the characteristics of immune cell infiltration and establish an immune scoring system. Based on this, we jointly integrate gene characteristics to construct a CAD risk prediction model.

Materials and Methods Data Collection and Preprocessing

The GEO database was systematically searched. The inclusion criteria were: (1) specimens were derived from patients with CAD, and (2) data were based on gene chip or high-throughput sequencing. The exclusion criteria were: (1) sample size less than 6, and (2) data unable to be processed under our existing conditions. According to the above criteria, two datasets related to CAD, namely, GSE113079 and GSE59421, were included in the present study. GSE59421 is a miRNA dataset with 33 CAD samples and 37 control samples.18 GSE113079 is a gene expression profile dataset with 93 CAD and 48 control samples.19 After the datasets were obtained from the GEO database, we normalized the data with the help of the normalizeBetweenArray of the “limma” package. We performed log2 conversion on the required data for further analysis.

Identification of DEGs

To identify the DEGs between the CAD and control groups, we utilized the “limma” package in R to perform differential expression analysis using the following thresholds: |logFC| > 1 and P < 0.05. LogFC < −1 indicated that the gene expression in the CAD group was downregulated. Conversely, logFC > 1 indicated that the gene expression in the CAD group was upregulated.

Construction of the PPI Network and Identification of Hub Genes

The STRING database (http://string.embl.de/) was used to build the protein-protein interaction (PPI) network, visualized via Cytoscape software. Based on the degree method, we used the cytoHubba plug-in to screen the PPI network hub genes according to the degree criterion≥ 10. In addition, the MCODE plug-in in Cytoscape software was applied to identify and screen essential modules in the PPI network, and the module genes were used as hub genes.

The Metascape database (http://metascape.org/gp/index.html#/main/step1) has multiple functions, such as functional enrichment and gene annotation.20 We imported the DEGs into Metascape to construct a PPI network. The MCODE algorithm in the Metascape tool was used to identify essential modules. Similarly, module genes were also used as hub genes. Finally, we obtained the intersection of the hub genes screened by the above three methods, and the overlapping genes were ultimately determined as hub genes identified in the present study.

Functional Enrichment Analysis of DEGs and Hub Genes via DAVID and Metascape

As an online public database, the DAVID database can be applied for functional annotation of genes and is commonly used in bioinformatics analysis.21 In the present study, we used the DAVID database (https://david.ncifcrf.gov/) to perform functional enrichment analysis to understand the functions and potential mechanisms of DEGs and hub genes in CAD. Concisely, we performed 2 analyses, namely, GO and KEGG pathway analyses. The analysis results are presented as bubble charts. In the present study, Metascape was applied to perform pathway and function enrichment analysis of DEGs.

Expression Pattern and Distinguishing Capability of Hub Genes

We used GraphPad Prism 8.0 software to draw a receiver operating characteristic (ROC) curve to determine whether hub genes can differentiate the CAD group from the control group. Then, the area under the curve (AUC) was measured to estimate the distinguishing and diagnosing capability of the hub genes in CAD. Furthermore, after the expression profile of the hub genes was extracted, we used R software to draw a violin plot to determine their expression patterns in the CAD and control groups.

Identification of miRNAs Targeting Hub Genes

The miRWalk 3.0 online database (http://mirwalk.umm.uni-heidelberg.de/) was applied to identify miRNAs that targeted the hub genes.

Identification of DEmiRNAs and Construction of the DEmiRNA-DEG Regulatory Network

Since miRNA is the key gene regulation noncoding RNA, its expression level is often low, and the change range is small. Therefore, when analyzing the differential expression of miRNA, it is necessary to set a lower threshold to capture its subtle expression changes sensitively.22,23 We identified differentially expressed miRNAs (DEmiRNAs) between the CAD and control groups in the GSE59421 dataset founded on the following criteria: |logFC| > 0.1 and P < 0.05. Furthermore, the predicted miRNAs and DEmiRNAs were intersected, and the common miRNAs were considered the miRNAs targeting the hub genes. Subsequently, we constructed a DEmiRNA-DEG regulatory network, including two levels (DEmiRNAs and hub genes).

Construction and Verification of the Risk Prediction Model Based on the Gene Expression Signature

Lasso regression is an algorithm that is widely considered to be suitable for high-dimensional data. Lasso regression uses penalty and regularization methods for statistical modeling. The optimal value of the penalty coefficient (λ) with the smallest partial likelihood deviation is determined, and the λ value is determined using λmin.24

We extracted the expression profile of the hub genes and used Lasso regression to screen the optimal predictors of CAD to distinguish the CAD group from the control group using the “glmnet” package in R. The predictive factors with nonzero coefficients were selected to construct the CAD risk prediction model.25 Subsequently, the GSE113079 dataset was randomly allocated to the training set (70%) and the test set (30%). To construct the risk score formula, we applied the following formula to weight the expression values of the hub genes used in the model construction in the training set: ; where expgene is the expression value of genes; and β is the regression coefficient of the gene obtained by Lasso regression. Furthermore, to estimate the prediction performance of the risk prediction model, ROC curve analysis was performed on the training set and test set using the “ROCR” package. We used the “ggplot2” package for visualization of the data.

Immune Cell Infiltration and Coexpression Analysis

CIBERSORT is a deconvolution algorithm that can derive cellular components in complex microarray data.26 CIBERSORT was utilized to estimate the abundance of immune cells in CAD, and the “ggplot2” package was used to draw a violin plot to visualize the difference in the infiltration abundance of immune cells. The “corrplot” package was applied to draw a correlation heatmap to visualize the correlation of the infiltration abundance of immune cells.

Construction and Verification of the Immune Scoring Model Based on the Immune Infiltration Signature

In the present study, we extracted data on the abundance of infiltration of the immune cells. In the training set, we also used Lasso regression to screen out the immune cells with significant predictive values, and those immune cells with a P < 0.05 were included in the construction of the immune scoring model. The immune scoring formula used to predict the risk of CAD was as follows: , where abundance of the immune cell is the infiltration abundance of immune cells, and β is the regression coefficient of immune cells obtained by Lasso regression. Moreover, the ROC curve was applied to estimate the immune scoring model’s prediction performance for the CAD risk. We used the “ROC” package to analyze the ROC curve and the “ggplot2” package to visualize the data.

Construction and Verification of the Risk Prediction Model Based on Combined Immune Infiltration and the Gene Expression Signature

We integrated immune infiltration and gene expression signatures to develop the CAD risk prediction model. Briefly, Lasso regression analysis was performed on the predictive gene expression signature and immune score obtained by the above screening further to screen the meaningful signatures for the risk prediction model. Then, multivariate logistic regression analysis was combined with the selected features in the Lasso regression to build the CAD risk prediction model.

The present study generated a nomogram to visualize the CAD risk prediction model. To quantify the predictive performance of the CAD risk prediction nomogram, we measured the Harrell concordance index (C-index). The nomogram was also bootstrapped and verified to calculate the relatively corrected C index.27 In addition, decision curve analysis (DCA) was performed, and the clinical usefulness of the nomogram was determined by quantifying the net benefits under the different threshold probabilities.28–30 ROC curve analysis was applied to test the predictive performance of the CAD risk prediction model.

Quantitative Real-Time Polymerase Chain Reaction (RT‒qPCR)

AC16 cells from TongPai, Shanghai, China, were used to construct an oxygen-glucose deprivation (OGD) model to simulate the state of cardiomyocytes in CAD. This study, AC16 cells were cultured for 24 hours in Dulbecco’s modified Eagle’s medium (DMEM; Gibco, Grand Island, NY, USA) at 37°C and 20% oxygen. In vitro induction of CAD was carried out by washing the cells three times with phosphate-buffered saline (PBS) and then incubating them with glucose- and serum-free DMEM for six hours at 37 °C and 2% oxygen. Following the manufacturer’s instructions, TRIzol reagent (Invitrogen) was used to extract total RNA from AC16 cells. The mRNA expression of CXCL1, CXCL3, PIK3RI, and PTGER1 in different groups of AC16 cells was measured by RT‒qPCR. The primer sequences used in this study are shown in Table 1.

Table 1 The Primer Sequences Used for RT-qPCR

Statistical Analysis

The present study used Lasso and multivariate logistic regression for feature selection and model construction. The “ROCR”, “ggplot2”, “glmnet”, and “rms” packages in R software were used to perform ROC analysis, ROC visualization, 10-fold cross-validated Lasso regression analysis, logistic regression analysis, and nomogram construction. ROC curve analysis and visualization of a single gene were performed by GraphPad Prism 8.0 software. The students’ t-test was applied to compare the CAD and control groups. Statistical significance was defined as a P value < 0.05.

Results Identification and Functional Enrichment Analysis of DEGs

In the present study, |logFC| > 1 and P < 0.05 were used as the criteria to screen the DEGs. Using the workflow shown in Figure 1, we obtained 613 DEGs (346 upregulated and 267 downregulated genes), which were used for further analysis (Figure 2A). The 20 top upregulated genes and the top 20 downregulated genes in the CAD group are presented in the heatmap shown in Figure 2C. Principal component analysis (PCA) revealed that the expression profile data of DEGs distinguished the CAD group from the control group (Figure 2D).

Figure 1 The workflow of the present study.

Figure 2 Differential expression analysis and identification of hub genes. (A) Volcano plot of differentially expressed genes (DEGs), red represents up-regulated genes, blue represents down-regulated genes, and grey represents no significantly differentially expressed genes. (B) Protein-protein interaction network (PPI) network of 613 DEGs. (C) A heatmap of the 20 most up-regulated and 20 most down-regulated genes. (D) Two-dimensional PCA cluster plot of the GSE113079 dataset; blue represents the coronary artery disease (CAD) group, and red represents the control group. (E) PPI network of 25 hub genes identified by degree method.

We next performed GO and KEGG analyses on DEGs using the online DAVID database. GO analysis was comprised of three parts, namely, cellular component (CC), biological process (BP), and molecular function (MF). For BP, DEGs were most enriched in signal transduction, inflammatory response, G-protein coupled receptor signaling pathway, and immune response (Figure 3A). Regarding CC, DEGs were mainly enriched in extracellular space, extracellular region, and plasma membrane (Figure 3B). In terms of MF, DEGs were significantly enriched in cytokine activity, chemokine activity, and CXCR chemokine receptor binding (Figure 3C). Besides, KEGG analysis showed that these genes were associated with the cytokine‒cytokine receptor interaction, chemokine signaling pathway, and TNF signaling pathway (Figure 3D).

Figure 3 Functional enrichment analysis of 613 DEGs by DAVID. (A) Biological process (BP), (B) Cellular component (CC), (C) Molecular function (MF), and (D) KEGG analysis.

In addition, the Metascape tool showed that these genes were significantly enriched in the adenylate cyclase-modulating G protein-coupled receptor signaling pathway, cytokine‒cytokine receptor interaction, regulation of T cell activation, and chemotaxis (Figure 4A).

Figure 4 Functional enrichment analysis of DEGs and the PPI of 5 important modules by Metascape. (A) Heatmap of enriched terms across input DEGs list, colored by p.value, via Metascape. (B) Network of enriched terms colored by p.value. (C) Network of enriched terms colored by cluster identity. (D) Module 1 contains 10 genes, including CXCL1, CXCL2, CXCL3, CCR2, CXCL12, CCL20, S1PR5, CORT, ADRA2A, and S1PR1. (E) Module 2 contains 6 genes, including ADRA1A, FFAR3, NMB, PTGER1, AVPR1B, and OPN4. (F) Module 3 contains 3 genes, including PDGFRB, PIK3R1, KITLG. (G) Module 4 contains 3 genes, including MYL2, TNNI2, and MYBPC3. (H) Module 5 contains 4 genes, including EPHA8, TGFA, PDGFD, and INSRR.

Construction of the PPI Network and Identification of Hub Genes and Corresponding Functional Enrichment Analysis

Metascape analysis constructed the PPI network of DEGs (Figure 4B and C). In addition, five necessary modules were identified from the PPI network via the MCODE algorithm in Metascape (Figure 4D–H). The genes contained in each module are shown in Table 1. In addition, the construction of the PPI network was also performed by the STRING tool (Figure 2B). According to the standard of degree ≥10, 25 hub genes were obtained with the help of the cytoHubba plug-in (Figure 2E). Through the MCODE plug-in in Cytoscape, six modules were identified from the PPI network (Figure 5A–F and Table 2).

Table 2 The Hub Genes Were Obtained Through Two Different Methods

Figure 5 PPI of 6 important modules by MCODE plug-in in Cytoscape. (A) Module 1 contains 14 genes, including CCL4, S1PR1, S1PR5, CCL20, CXCL1, CXCL2, CCR2, GNB4, CXCL3, GPR31, ADRA2A, CXCL12, CORT, and CX3CL1. (B) Module 2 contains 8 genes, including FFAR3, PTGER1, ADRA1A, PIK3R1, NMB, OPN4, BRS3, and AVPR1B. (C) Module 3 contains 4 genes, including DNAJC8, HNRNPA0, CSTF2T, and DDX23. (D) Module 4 contains 4 genes, including ADAMTS2, SEMA5B, ADAMTSL1, and SPON2. (E) Module 5 contains 8 genes, including GRIK5, KIF3A, KIF17, GRIN3B, DLG3, SHANK1, TTC26, and IFT52. (F) Module 6 contains 8 genes, including TGFBR3, MYBPC3, TNNI2, MYBPH, MYL2, SMAD7, ACTG2, and SMURF1.

The Venn diagram indicated that there were 12 common DEGs among the Metascape MCODE, Cytoscape MCODE, and CytoHubba plug-in, namely, PTGER1, CCL20, CXCL1, PIK3R1, CXCL3, CXCL2, ADRA2A, CORT, CXCL12, S1PR1, S1PR5, and CCR2. (Figure 6A and B). All 12 genes were regarded as hub genes in the present study and used in further analysis.

Figure 6 Identification of hub genes shared by the three methods and their functional enrichment analysis. (A) The VENN diagram shows that there were 12 common hub genes among “Metascape MCODE”, “Cytoscape MCODE”, and “cytoHubba plug-in”. (B) Differential expression analysis results of 12 hub genes. (C) biological process (BP). (D) cellular component (CC). (E) molecular function (MF). (F) KEGG analysis.

Analysis of the 12 hub genes using the DAVID database indicated that they were enriched in the following terms: BP terms included inflammatory response, G-protein coupled receptor signaling pathway, and immune response (Figure 6C); CC terms included extracellular region and extracellular space (Figure 6D); and MF terms included cytokine activity, chemokine activity, CXCR chemokine receptor binding, and G-protein coupled receptor binding (Figure 6E). In addition, KEGG analysis indicated that the cytokine‒cytokine receptor interaction, chemokine signaling pathway, and TNF signaling pathway were significantly enriched pathways (Figure 6F).

Expression Pattern and Distinguishing Capacity of Hub Genes in CAD

We explored the expression patterns of the 12 hub genes in the GSE113079 dataset. Five genes (PTGER1, CXCL12, CCR2, ADRA2A, and CORT) were significantly upregulated in the CAD group, while seven genes (CXCL1, S1PR5, CXCL2, PIK3R1, S1PR1, CXCL3, and CCL20) were significantly downregulated in the CAD group (Figure 7).

Figure 7 Expression patterns of 12 hub genes in CAD group and control group.

Besides, the ROC curve showed that the 12 hub genes had an upper-middle ability to distinguish the CAD group from the control group. Among them, PTGER1, PIK3R1, ADRA2A, CORT, CXCL12, and S1PR5 had high distinguishing ability (all 6 genes showed AUC > 0.90, P < 0.001) (Figure 8). These results indicated that these DEGs may be the important part in the development of CAD and have the potential to be used as CAD diagnostic biomarkers.

Figure 8 Receiver Operating Characteristic curves of 12 hub genes expression for the differentiation of CAD group from control group based on GSE113079 dataset.

Identification of Overlapping miRNAs Based on Online Prediction and Differential Expression Analysis

In this study, a total of 57 DEmiRNAs were identified, which are presented in the volcano plot in Figure 9A. The expression of DEmiRNAs is shown in the heatmap in Figure 9B. Moreover, the miRWalk 3.0 tool was used to predict miRNAs targeting the 12 hub genes. Subsequently, we intersected the prediction results of each hub gene with the results of differential expression analysis for miRNAs. The detailed results are shown in Table 3. Overlapping miRNAs were used as the miRNAs targeting the 12 hub genes. Our study demonstrated that 15 DEmiRNAs, included in the prediction results, were involved in regulating the 12 hub genes (Table 4).

Table 3 The Overlapping miRNAs Through Online Prediction and Differential Expression Analysis

Table 4 The 15 DEmiRNAs Target 12 hub Genes

Figure 9 Identification of differentially expressed miRNAs (DEmiRNAs) and construction of DEmiRNAs-DEGs regulatory network. (A) Volcano plot of d DEmiRNAs, red represents up-regulated miRNAs, blue represents down-regulated miRNAs, and grey represents no significantly differentially expressed miRNAs. (B) Heatmap of 57 DEmiRNAs. (C) DEmiRNAs-DEGs regulatory network. The red polygons represent the selected 15 DEmiRNAs, and the blue ovals represent the 12 hub genes. (D) The Sankey diagram shows the regulatory relationship between 15 DEmRNAs and 12 hub genes. The size of the rectangle visualizes the degree of connectivity and presents the expression pattern of DEmRNAs and hub genes in CAD.

Construction of the DEmiRNA-DEG Regulatory Network

To visualize the regulatory relationship between the DEmiRNAs and DEGs identified as hub genes, we used Cytoscape software to construct a DEmiRNA-DEG regulatory network (Figure 9C). The results showed that 15 DEmiRNAs jointly participated in the regulation of these hub genes. In addition, we used the “ggplot2” package in R to draw a Sankey diagram, which is shown in Figure 9D.

Construction and Verification of the Risk Prediction Model Based on the Gene Expression Signature

To further explore the potential of the 12 hub genes as predictors for the risk of CAD, we investigated the corresponding gene expression profile data and performed Lasso regression analysis on the training set to screen the hub genes that could be used to construct a risk prediction model. The results showed that four genes, namely, PTGER1, CXCL1, PIK3R1, and CXCL3, had nonzero regression coefficients, and λmin=0.010 was included in the constructed model (Figure 10A). The constructed risk prediction model was as follows:

Figure 10 A risk prediction model for CAD based on gene expression signature. (A) Lasso regression. (B) ROC curve analysis of the GSE113079 dataset. (C) ROC curve analysis of training set (70% of the GSE113079 dataset). (D) ROC curve analysis of test set (30% of the GSE113079 dataset).

ROC curve analysis based on the GSE113079 dataset showed that the AUC value of the risk prediction model was 0.981 (Figure 10B). In addition, the AUC values in the training set and test set were 0.99 and 0.98, respectively. These results showed that the constructed risk prediction model had a high predictive ability (Figure 10C and D).

Analysis of the Infiltration Abundance and Co-Expression of Immune Cells

The violin plot of the difference in the infiltration abundance of immune cells showed that the infiltration abundance of gamma delta T (γδ T) cells, CD4 naïve T cells, regulatory T cells (Tregs), M2 macrophages, resting mast cells, and monocytes in the CAD group were significantly higher. In contrast, the infiltration abundance of CD8+ cells, activated NK cells, activated memory CD4+ T cells, activated mast cells, and resting NK cells was significantly lower in the CAD group (all P < 0.05) (Figure 11A). There were significant differences in the infiltration abundance of the above 11 immune cells, which indicated that they may play a key role in CAD. Therefore, these immune cell types were used for further analysis. The correlation heatmap of the infiltration abundance of immune cells (Figure 11B) showed that various immune cells had a significant positive or negative correlation. Notably, activated mast cells and activated NK cells had the most significant positive correlation (R = 0.52, P < 0.001), while CD8 T cells and monocytes had the most significant negative correlation (R = −0.78, P < 0.001).

Figure 11 A risk prediction model for CAD based on immune infiltration signature. (A) Using the CIBERSORT algorithm to evaluate the composition and proportion of infiltrating immune cells in CAD. Red represents the CAD group, and blue represents the control group. (B) The results of co-expression analysis between the immune cells with significant differences in infiltration abundance. *P < 0.05, **P < 0.01, ***P < 0.001. (C) Lasso regression. (D) ROC curve analysis of training set (70% of the GSE113079 dataset). (E) ROC curve analysis of test set (30% of the GSE113079 dataset).

Construction and Verification of the Immune Scoring Model Based on the Immune Infiltration Signature

We included the 11 immune cell types in the training set to screen immune cells for constructing the immune scoring model. Through Lasso regression analysis, we found that the regression coefficients of five immune cell types were nonzero (λmin=0.011) (Figure 11C). The constructed immune scoring model was as follows: Immune score=4.766*(abundance of CD4 naïve T cells)- 1.938*(abundance of activated memory CD4 T cells) + 3.912*(abundance of regulatory T cells (Tregs)) + 3.719*(abundance of gamma delta T cells)-0.765*(abundance of resting NK cells).

The ROC curve showed that the AUC values of the five immune cell type-based models in the training and testing sets were 0.92 and 0.92, respectively, suggesting that the immune scoring model based on the immune infiltration signature distinguished the CAD group from the control group well. In summary, the immune scoring model has a high predictive ability for the risk of CAD (Figure 11D-E).

Construction and Verification of the Risk Prediction Model Based on Combined Immune Infiltration and the Gene Expression Signature

We performed a Lasso regression analysis based on the gene expression profile data of four hub genes (PTGER1, CXCL1, PIK3R1, and CXCL3) and the immune score. Subsequently, multivariate logistic regression analysis was applied to combine the features selected in the Lasso regression analysis to build a CAD risk prediction model. Weighting the corresponding predictors with logistic regression coefficients, the established risk prediction model was as follows:

To evaluate the ability of the model to predict the risk of CAD, we calculated the AUC value of the ROC curve. As shown in Figure 12B, the AUC value of the risk prediction model was 0.98.

Figure 12 Construction and evaluation of nomogram for predicting CAD risk. (A) The nomogram for predicting the risk of CAD. (B) ROC curve analysis. (C) Decision curve analysis.

Construction and Evaluation of the Nomogram

To improve the usability and better characterize the risk prediction model, we constructed a nomogram containing the abovementioned predictive factors (Figure 12A). The C index of the risk prediction nomogram was 0.988 (95% CI: 0.975–1.000), and after verification by bootstrapping, the C index of the risk prediction nomogram was 0.985. The risk prediction model showed good predictive ability in the CAD risk prediction nomogram.

The decision curve analysis (DCA) of the CAD risk prediction nomogram is shown in Figure 12C. When the threshold probability was between 1% and 100%, the decision curve showed that using the CAD risk prediction nomogram to predict the risk of CAD had more benefits. Thus, these findings indicated that the CAD risk prediction nomogram has excellent clinical usability.

In vitro Expression Validation of Hub Genes

We successfully constructed a CAD risk prediction model based on the above method. PCR analysis showed that the mRNA expression levels of CXCL1, CXCL3, and PIK3R1 were significantly higher in the Con group than in the OGD group (Figure 13). In contrast, the mRNA expression levels of PTGER1 were significantly lower in the Con group than in the OGD group (Figure 13). Overall, PCR analysis verified the bioinformatics prediction of the downregulation and upregulation of the hub genes in the CAD risk model.

Figure 13 The mRNA expression of CXCL1, CXCL3, PIK3RI, and PTGER1 in different groups of AC16 cells. ***P < 0.001.

Discussion

In our study, 613 DEGs were identified in the GSE113079 dataset, and 12 hub genes were obtained through three different methods. The functional enrichment analysis of 613 DEGs and 12 hub genes demonstrated that these genes were mainly related to the inflammatory response, immune response, chemotaxis, and the interaction between cytokines and cytokine receptors. Gene mapping studies have revealed that novel pro- and anti-inflammatory pathways linking lipid and inflammatory biology in humans activate CAD-related variants,31 and interleukin 10, and transforming growth factor beta plays key roles in arterial rupture and thrombosis.32 Thus, these genes are fundamentally important in CAD.33 The present findings indicated that the 12 hub genes may play an essential role in the development and progression of CAD, and the majority of the genes may be diagnostic and prognostic biomarkers.

Some of the hub genes have been studied in cardiovascular disease. PTGER1 is a receptor for prostaglandin E (PGE), primarily expressed on various cells in the cardiovascular system, including vascular endothelial cells, smooth muscle cells, and cardiomyocytes. By binding to PGE, PTGER1 regulates multiple physiological functions of the cardiovascular system, such as vasodilation, inflammatory responses, and platelet aggregation.34 CXCL12 derived from endothelial cells is causally related to immunopathogenic pathways associated with serum cytokine imbalance, particularly TNF-α and IL-10, and there are higher allele frequencies of CXCL12 in patients with CAD.35 The serum levels of CXCL12 in patients with CAD continuously increase with the number of coronary artery lesions, showing a positive correlation with the number of coronary artery lesions in patients.36 This discovery provides new clues for assessing the severity and prognosis of coronary heart disease patients. ADRA2A belongs to the subfamily of G-protein-coupled receptors and consists of three highly homologous subtypes: alpha2A, alpha2B, and alpha2C.37 These receptors primarily regulate neurotransmitters released by sympathetic nerves and adrenergic neurons, playing a crucial role in the modulation of central nervous system functions. The ADRA2A gene is also associated with physiological traits such as aerobic endurance. Studies have indicated that polymorphisms in the ADRA2A gene may be related to the risk of CAD. However, further research is needed to identify which specific polymorphic loci are associated with CAD risk and how these polymorphisms influence the occurrence and progression of CAD.38 In atherosclerosis, CCR2 is bound to a functional receptor (CCR2) that modifies chemokine gradients that control leukocyte migration and inflammatory traffic, indicating that developing CCR2 therapeutic targets may alter the overall anti-inflammatory response from the local level.39 The activation of S1PR1 inhibits the autophagy and hypertrophy of cardiomyocytes, and the S1PR1 signaling pathway promotes the maintenance of cardiac ejection function in pressure-overloaded mice.40 S1PR1 also docks closely with two CAD therapeutics, DB12371 and DB12612, and Chen et al reported that S1PR1 is a diagnostic biomarker and potential drug target for CAD.41 PIK3R1 is mediated by miR-342-5p to participate extensively in MAPK family signaling cascades and interleukin signaling, ultimately promoting vascular smooth muscle cell (VSMC) proliferation and differentiation.42 Taken together, these genes may be applied to diagnostic biomarkers, prognostic biomarkers, and therapeutic targets for diseases, including CVD. These previous studies supported the reliability of the present results, especially in terms of modulating immune infiltration mechanisms. However, a considerable number of genes mentioned above have not been studied in the field of CAD, which may limit our comprehensive and in-depth understanding of the pathogenesis and influencing factors of the disease.

We also utilized the CIBERSORT algorithm to assess the composition and proportion of infiltrating immune cells in CAD. The findings revealed that the infiltration abundance of several immune cell types, including γδ T cells, CD4 naïve T cells, Tregs, M2 macrophages, resting mast cells, and monocytes, was significantly elevated in the CAD group. Conversely, the infiltration abundance of CD8+ T cells, activated NK cells, activated memory CD4+ T cells, activated mast cells, and resting NK cells was notably decreased in the CAD group. The research underscores the pivotal role of γδ T cells in the immune system, functioning as a bridge between innate and adaptive immunity. These cells possess a unique ability to recognize a broad spectrum of antigens, including those that conventional T cells cannot, thereby demonstrating dual efficacy in both anti-tumor and anti-infectious processes. The heightened presence of γδ T cells in CAD suggests a close association with coronary artery inflammation and injury.43 These cells may actively engage in the pathological process of CAD, striving to repair and restore normal vascular function by recognizing and attacking damaged vascular cells or related antigens. CD4 naive T cells, being undifferentiated, have the capacity to differentiate into multiple effector T cell types. They play a vital role in adaptive immunity by recognizing antigens, undergoing activation, and differentiating into effector T cells to counter infections. In patients with CAD, the immune system responds actively to coronary artery inflammation and injury by augmenting the reserve of undifferentiated T cells, thereby bolstering the immune response.44 Tregs are crucial in maintaining immune balance and preventing autoimmune diseases.45 They accomplish this by inhibiting the activation of other immune cells, thereby preventing excessive immune responses. The increase in Treg counts observed in patients with CAD may indicate that the immune system is attempting to mitigate coronary artery inflammation and injury by inhibiting excessive immune responses. However, this may concurrently weaken the immune system’s capacity to clear coronary artery lesions, thereby potentially influencing the progression of the disease. NK cells, as innate immune cells, possess the ability to recognize and eliminate infected or cancerous cells without prior antigen recognition. However, in patients with CAD, the number of activated NK cells is reduced, suggesting a suppression of the innate immune system’s function.46 This suppression may be linked to CAD-induced systemic immune suppression or impaired NK cell function, further compromising the patients’ ability to combat infections and repair damage.

In addition, the results showed that the infiltration of CD8+ T cells and monocytes was higher than that of other infiltrating immune cells in the CAD and control groups. Moreover, the infiltration abundance of monocytes in the CAD group was significantly higher compared to the control group, while the infiltration abundance of CD8+ T cells was significantly lower in the CAD group compared to the control group (both P < 0.001). Further analysis found that CD8+ T cells and monocytes had the most significant negative correlation (R = −0.78, P < 0.001). These results suggested that the infiltration of CD8+ T cells and monocytes may play a more important role compared to other infiltrating immune cells. It has been suggested that CD8+ T cells and monocytes, which contain stromal cell-derived factor-1 (SDF-1), are critical cells for vascular development and hematopoiesis, thereby regulating immune polymorphisms in CAD.47 Presently, the epigenetic function of genes regulating the infiltration of various immune cells is considered essential in cardiovascular disease.48 Activating Toll-like receptors leads to the production of inflammatory cytokines and other substances to induce inflammation. Monocytes infiltrating the arterial wall differentiate into macrophages under the stimulation of inflammation and become a key component of atherosclerotic plaques. After a series of complicated processes, these cells finally differentiate into typical atherosclerosis cells, namely, foam cells.49,50 In addition, miRNA-1 promotes cardiomyocyte pyroptosis and inflammatory factor release by downregulating the expression level of PIK3R1 through the FoxO3a pathway. Downregulation of PIK3R1 expression in H9c2 cells increases the expression of miRNA-1.51 CD8+ T cells producing inflammatory cytokines (TNF-a and IFN-γ), which induce inflammation and apoptosis to drive atherosclerosis development.52 However, inflammation may increase the instability of plaques, which may lead to plaque rupture, creating conditions for myocardial damage or infarct thrombosis.53 In addition, the cytotoxic activity of antigen-presenting cells and the presence of regulatory CD8+ T cell subsets weaken immunity and may limit atherosclerosis.52 The present study suggested that CD8+ T cells with significantly reduced infiltration may inhibit atherosclerosis. According to other studies, decreased demethylation of FOXP3 in Tregs is related to an increased risk of adverse CAD outcomes.54 Similarly, the increased infiltration of Tregs and anti-inflammatory macrophages (M2) may also play a role in inhibiting atherosclerosis.55 In addition, the present study revealed the correlation among various immune cells. Overall, the present findings indicated that the composition and proportion of immune cells in CAD involve complex mechanisms of epigenetic regulation of genes.

In the present study, we first established and validated risk prediction models for CAD based on immune infiltration and gene expression signatures. Both models showed high predictive performance in the training set and test set (AUC > 0.900 for both). Furthermore, a risk prediction model combining immune infiltration, and the gene expression signature was established and verified, and a nomogram was also constructed (C index=0.988; interval verification C index reached 0.985) to predict the risk of CAD. The model effectively distinguished the CAD group from the control group and had a high CAD risk prediction ability (AUC=0.988).

However, our study had some limitations. First, the data were obtained from a limited cohort of a single database, which may result in a lack of generalizability and credibility. In addition, the difference in experimental design between different data sets may significantly affect the research results. To more accurately verify our findings, future studies need to use the same batch of data sets with consistent experimental design for further verification. Second, although in vitro cellular experiments were performed to verify the findings to some extent, the clinical value of the model requires additional studies for verification, especially in the human specimen. Third, this study conducted a preliminary functional verification and characteristic analysis of genes with significant differences. However, to more accurately reveal these genes, miRNAs and their complex network relationship of immune cell infiltration in patients with disease and healthy controls, need to carry out more in-depth experimental verification work, including the analysis of the expression level, interaction of these molecules and their correlation with immune cell infiltration. These studies are crucial to clarify the specific pathways of these molecular markers in cell biological processes, disease pathogenesis, and disease progression. Fourth, the immune score and the identification of central genes are based on the RNA level reported in the above data set. Due to the influence of post-transcriptional regulation, it is necessary to evaluate the level of protein and metabolomics in future research. Fifth, the relationship between central genes and immune cells is based on expression and correlation analysis. It provides a perspective to understand the potential relationship, but it is not enough to fully explain the complex relationship between genes and immune cells, nor can it directly prove the actual impact of these relationships on the immune microenvironment. Specifically, whether these genes are involved in regulating the migration, transfer, activation, or inhibition of immune cells is still inconclusive according to the current data set. The in-depth exploration of these mechanisms needs to rely on more elaborate experimental verification. Sixth, the assessment of immune infiltration subpopulations requires up-to-date and authoritative algorithms and tools, such as ImmuCellAI, which was not used in the present study.56,57 Finally, there is a need for further studies to confirm the relationships and mechanisms of immune cell subpopulations regulating CAD.

Conclusion

Our study elucidated the potential mechanism of DEGs in the development of CAD, identified 12 hub genes, and explored their mechanism of action, expression patterns, and distinguishing ability in CAD. We also constructed an effective CAD risk prediction model based on the combined immune infiltration and gene expression signature. These findings will provide useful guidance for clinical practice and decision-making.

Data Sharing Statement

The datasets supporting the conclusions of this study are included in the GEO database.

Ethics Declarations

This study does not involve animal or human specimen experiments, and an ethical review is not suitable for our study. According to item 1 and 2 of Article 32 of “the Measures for Ethical Review of Life Science and Medical Research Involving Human Subjects”, this study is exempt from ethical review and approval.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis, and interpretation, or all these areas; took part in drafting, revising, or critically reviewing the article; gave final approval of the version to be published; and agree to be accountable for all aspects of the work.

Funding

This work was supported by the There is no funding to report.

Disclosure

The authors report no conflicts of interest in this work.

References

1. Zhong Z, Zhong W, Zhang Q, Zhang Q, Yu Z, Wu H. Circulating microRNA expression profiling and bioinformatics analysis of patients with coronary artery disease by RNA sequencing. J Clin Lab Anal. 2020;34(1):e23020. doi:10.1002/jcla.23020

2. Zhang YH, Pan X, Zeng T, Chen L, Huang T, Cai YD. Identifying the RNA signatures of coronary artery disease from combined lncRNA and mRNA expression profiles. Genomics. 2020;112(6):4945–4958. doi:10.1016/j.ygeno.2020.09.016

3. Malakar AK, Choudhury D, Halder B, Paul P, Uddin A, Chakraborty S. A review on coronary artery disease, its risk factors, and therapeutics. J Cell Physiol. 2019;234(10):16812–16823. doi:10.1002/jcp.28350

4. Kendall MJ. Clinical trial data on the cardioprotective effects of beta-blockade. Basic Res Cardiol.95. Bas Res Cardio. 2000;95 Suppl 1(Suppl 1):I25–30. doi:10.1007/s003950070005

5. Du L, Xu Z, Wang X, Liu F. Integrated bioinformatics analysis identifies microRNA-376a-3p as a new microRNA biomarker in patient with coronary artery disease. Am J Transl Res. 2020;12(2):633–648.

6. Zhang X, Xiang Y, He D, et al. Identification of potential biomarkers for CAD using integrated expression and methylation data. Front Genet. 2020;11:778. doi:10.3389/fgene.2020.00778

7. Huang L, Zhang Y, Su E, et al. Eight biomarkers on a novel strip for early diagnosis of acute myocardial infarction. Nanoscale Adv. 2020;2(3):1138–1143. doi:10.1039/c9na00644c

8. Amezcua-Castillo E, González-Pacheco H, Sáenz-San Martín A, et al. C-reactive protein: the quintessential marker of systemic inflammation in coronary artery disease-advancing toward precision medicine. Biomedicines. 2023;11(9). doi:10.3390/biomedicines11092444

9. Huang J, Gu J-X, Bao H-Z, et al. Elevated serum small dense low-density lipoprotein cholesterol may increase the risk and severity of coronary heart disease and predict cardiovascular events in patients with type 2 diabetes mellitus. Dis Markers. 2021;2021:5597028. doi:10.1155/2021/5597028

10. Singh S, Baars DP, Desai R, Singh D, Pinto-Sietsma S-J. Association between lipoprotein (a) and risk of atrial fibrillation: a systematic review and meta-analysis of mendelian randomization studies. Curr Probl Cardiol. 2024;49(1 Pt A):102024. doi:10.1016/j.cpcardiol.2023.102024

11. Harishkumar M, Radha M, Yuichi N, et al. Designer exosomes: smart nano-communication tools for translational medicine. Bioengineering. 2021;8(11):158. doi:10.3390/bioengineering8110158

12. Tan X, Zhang X, Pan L, Tian X, Dong P. Identification of key pathways and genes in advanced coronary atherosclerosis using bioinformatics analysis. Biomed Res Int. 2017;2017:4323496. doi:10.1155/2017/4323496

13. King A, Wu L, Deng HW, Shen H, Wu C. Polygenic risk score improves the accuracy of a clinical risk score for coronary artery disease. BMC Med. 2022;20(1):385. doi:10.1186/s12916-022-02583-y

14. Zeller T, Blankenberg S. Blood-based gene expression tests: promises and limitations. Circ Cardiovasc Genet. 2013;6(2):139–140. doi:10.1161/circgenetics.113.000149

15. Dounousi E, Duni A, Naka KK, Vartholomatos G, Zoccali C. The innate immune system and cardiovascular disease in ESKD: monocytes and natural killer cells. Curr Vasc Pharmacol. 2021;19(1):63–76. doi:10.2174/1570161118666200628024027

16. SahBandar IN, Ndhlovu LC, Saiki K, et al. Relationship between circulating inflammatory monocytes and cardiovascular disease measures of carotid intimal thickness. J Atheroscler Thromb. 2020;27(5):441–448. doi:10.5551/jat.49791

17. Wang XB, Cui NH, Liu X, Ming L. Identification of a blood-based 12-gene signature that predicts the severity of coronary artery stenosis: an integrative approach based on gene network construction, support vector machine algorithm, and multi-cohort validation. Atherosclerosis. 2019;291(291):34–43. doi:10.1016/j.atherosclerosis.2019.10.001

18. Kok MGM, Halliani A, Moerland PD, Meijers JCM, Creemers EE, Pinto-Sietsma S-J. Normalization panels for the reliable quantification of circulating microRNAs by RT-qPCR. FASEB J. 2015;29(9):3853–3862. doi:10.1096/fj.15-271312

19. Li L, Wang L, Li H, et al. Characterization of LncRNA expression profile and identification of novel LncRNA biomarkers to diagnose coronary artery disease. Atherosclerosis. 2018;275(275):359–367. doi:10.1016/j.atherosclerosis.2018.06.866

20. Zhou Y, Zhou B, Pache L, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10(1):1523. doi:10.1038/s41467-019-09234-6

21. Dennis G Jr, Sherman BT, Hosack DA, et al. DAVID: database for annotation, visualization, and integrated discovery. Genome Biol. 2003;4(5):3. doi:10.1186/gb-2003-4-5-p3

22. Hua S-L, Liang J-Q, Hu G-F, Yang X-R, Fang D-L, Lu J-L. Constructing a competing endogenous RNA network for osteoarthritis. Ann Transl Med. 2022;10(3):147. doi:10.21037/atm-21-6711

23. Geng T, Heyward CA, Chen X, Zheng M, Yang Y, Reseland JE. Comprehensive analysis identifies ameloblastin-related competitive endogenous RNA as a prognostic biomarker for testicular germ cell tumour. Cancers. 2022;14(8

View original article

JOURNAL OF INFLAMMATION RESEARCH

分享书签

0 0 0 0 0 0 0

More from this channel

Identification of Hub Genes and Immune Infiltration in Coronary Artery Disease: A Risk Prediction Model

留言 (0)