We integrated GSE58331 and GSE105149 and corrected for batch effects between the two datasets. PCA confirmed that patients separated clearly into risk-specific cohorts (Fig. 2a-b). We identified 314 DEGs, which clustered by group in the heatmap: PPP1R1A, CAB39L, MTURN, MAOA2, NGFRAP1, CDR1, etc. in the treat group, and ITGB2, CAPG, CHI3L1, SLAMF8, APOC1, TCIRG1, etc. in the control group (Fig. 2c). Some of these DEGs were significantly up-regulated (TCIRG1, IGHM, CXCL9, PROM1, PIGR, HLA-DQA1, etc.), whereas others were significantly down-regulated (HLF, ADH1B, MGST1, LARP6, PGM1, C2orf40, TGFBR3, etc.) (Fig. 2d; Table S1).
Fig. 2 Principal Component Analysis. a-b PCA. c Heatmap. d Volcano plot
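The split of DEGs into up- and down-regulated genes shown in the volcano plot can be illustrated with a minimal sketch. The cutoffs (|log2FC| >= 1, adjusted p < 0.05), the `GeneStat` type, and the toy statistics are assumptions made for illustration; the text does not state the thresholds that were actually used.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log_fc: float  # log2 fold change, treat vs control
    adj_p: float   # multiple-testing-adjusted p-value


def classify_deg(stat, lfc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene 'up', 'down', or 'ns' (not significant).

    Both cutoffs are assumed defaults, not values taken from the paper.
    """
    if stat.adj_p >= p_cutoff or abs(stat.log_fc) < lfc_cutoff:
        return "ns"
    return "up" if stat.log_fc > 0 else "down"


# Made-up statistics for illustration only; these are not values from Table S1.
toy_stats = [
    GeneStat("GENE_A", 2.3, 0.001),
    GeneStat("GENE_B", -1.8, 0.004),
    GeneStat("GENE_C", 0.4, 0.010),   # significant p-value but small effect
    GeneStat("GENE_D", 1.6, 0.200),   # large effect but not significant
]
labels = {s.gene: classify_deg(s) for s in toy_stats}
print(labels)
```

A gene is called differentially expressed only when it passes both filters, which is why `GENE_C` and `GENE_D` stay in the "ns" class.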
Construction of the model
LASSO and Cox regression analysis, with the optimum parameter value selected by cross-validation, were used to establish a gene signature (Fig. 3a-b). SVM-RFE was used to build a machine learning model to validate the signature's accuracy and reliability. The accuracy of this model was 0.894, and the error was 0.106 (Fig. 3c-d). Random forest analysis identified several important genes, including SRPX, ITM2A, PGM1, HLF, etc. (Fig. 3e-f). We attempted to combine the key genes from all three algorithms to construct the model. However, only the key genes from LASSO and SVM-RFE produced a stable model. Finally, we obtained 15 hub genes (Fig. 3g; Table S3).
Fig. 3 The development of the signature. a Regression of the NSOI-related genes using LASSO. b Cross-validation is used in the LASSO regression to fine-tune parameter selection. c-d Accuracy and error of this model. e-f Random forest analysis. g Venn diagram
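The Venn step, in which hub genes are taken as the genes selected by both LASSO and SVM-RFE, amounts to a set intersection. The sketch below uses placeholder gene names, not the actual selections of either algorithm; `shared_genes` is a hypothetical helper. It also checks that the reported error is the complement of the reported accuracy.

```python
def shared_genes(*selections):
    """Return the genes chosen by every method, sorted for stable output."""
    sets = [set(s) for s in selections]
    return sorted(set.intersection(*sets)) if sets else []


# Placeholder selections for illustration; not the real LASSO / SVM-RFE output.
lasso_selected = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
svm_rfe_selected = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]

hub = shared_genes(lasso_selected, svm_rfe_selected)
print(hub)

# Accuracy as reported in the text; the error rate is its complement.
accuracy = 0.894
error = round(1 - accuracy, 3)
print(error)
```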
DEG identification and visualization
We visualized the expression of the 15 hub genes in the NSOI group and the normal sample group separately (Fig. 4), and also plotted all of them together in a single graph for visual comparison (Fig. 5). ROC analysis of the 15 hub genes showed that each gene discriminated well between the groups (AUC range 0.765 to 0.945): HLF (AUC: 0.945), PGM1 (AUC: 0.911), GPR146 (AUC: 0.907), IRF8 (AUC: 0.840), TNS1 (AUC: 0.802), PLA2G16 (AUC: 0.801), PALMD (AUC: 0.824), CCL4 (AUC: 0.813), IGK (AUC: 0.765), CORO2B (AUC: 0.887), IGSF10 (AUC: 0.882), AKR1C1 (AUC: 0.836), ENPP6 (AUC: 0.830), MAP1B (AUC: 0.842), RHOBTB3 (AUC: 0.806) (Fig. 6).
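A per-gene AUC of the kind reported above can be read as the probability that a randomly chosen NSOI sample has higher expression than a randomly chosen normal sample. The following minimal sketch computes it by pairwise comparison; the expression values are invented for illustration, and the function name `auc` is a hypothetical helper rather than the software used in the study.

```python
def auc(case_values, control_values):
    """Fraction of (case, control) pairs where the case scores higher.

    Ties count as half a win. This equals the area under the ROC curve
    when the raw expression value is used as the classification score.
    """
    wins = 0.0
    for c in case_values:
        for n in control_values:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Invented expression values for one gene; not data from GSE58331 or GSE105149.
nsoi = [5.1, 6.0, 6.4, 7.2]
normal = [4.0, 4.8, 5.5, 6.1]
print(auc(nsoi, normal))  # 13 of 16 pairs favour the NSOI sample -> 0.8125
```

For a down-regulated gene the same calculation yields a value below 0.5, so the direction of the score would need to be flipped (or `1 - auc` reported) to obtain a comparable figure.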
Fig. 4 Expression of 15 hub genes in NSOI group and normal sample group respectively
Fig. 5 All hub genes are co-expressed in the same line plot
Fig. 6
Validation of hub genes
GSE58331 was used for validation to increase confidence in our model and in the predictive accuracy of these hub genes. Notably, these genes showed significant expression differences in the GSE58331 analysis (Fig. 7). ROC analysis of the 15 hub genes in GSE58331 again showed good discrimination (AUC range 0.798 to 0.971): HLF (AUC: 0.971), PGM1 (AUC: 0.938), GPR146 (AUC: 0.943), IRF8 (AUC: 0.851), TNS1 (AUC: 0.861), PLA2G16 (AUC: 0.839), PALMD (AUC: 0.867), CCL4 (AUC: 0.798), IGK (AUC: 0.857), CORO2B (AUC: 0.919), IGSF10 (AUC: 0.923), AKR1C1 (AUC: 0.810), ENPP6 (AUC: 0.882), MAP1B (AUC: 0.862), RHOBTB3 (AUC: 0.861). These results also confirmed the high reliability and accuracy of our model (Fig. 8).
Fig. 7 Expression of 15 hub genes in GSE58331 analysis
Fig. 8
DEG identification of IRF8
Single-gene differential analysis of IRF8 identified 507 DEGs. These genes clustered by group: IL21R, IRF8, FGD3, BCL2A1, LCK, CD48, RAC2, CD53, etc. in the high group, and IRX5, PON3, ARHGEF37, ANO1, RAB3D, PHGDH, S100A1, etc. in the low group (Fig. 9a-b). In addition, we constructed a correlation matrix plot for IRF8 (Fig. 9c; Table S4).
Fig. 9 DEG identification of IRF8. a Heatmap. b Volcano plot. c Correlation matrix diagram
Enrichment analysis of DEGs of IRF8
GO enrichment analysis revealed 996 core targets across BP, MF, and CC. The MF terms mainly involved actin binding (GO:0003779), receptor ligand activity (GO:0048018), and immune receptor activity (GO:0140375). The CC terms mainly involved external side of plasma membrane (GO:0009897), collagen-containing extracellular matrix (GO:0062023), and endocytic vesicle (GO:0030139). The BP terms mainly involved leukocyte mediated immunity (GO:0002443), leukocyte cell-cell adhesion (GO:0007159), and negative regulation of immune system process (GO:0002683). KEGG enrichment analysis revealed that the over-expressed genes were mainly involved in Cytokine-cytokine receptor interaction (hsa04060), Chemokine signaling pathway (hsa04062), and Cell adhesion molecules (hsa04514) (Fig. 10 and Table S5a-b).
Fig. 10 GO and KEGG analyses were performed for PMGs. a The GO circle illustrates the barplot, chord, circos, and cluster of the selected gene's logFC. b The KEGG barplot, chord, circos, and cluster illustrate the scatter map of the logFC of the indicated gene
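GO and KEGG over-representation is commonly scored with a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of annotated genes in a DEG list by chance. The sketch below shows that calculation with invented counts; it is an assumption that the enrichment tool used here follows this test, and `hypergeom_pvalue` is a hypothetical helper.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for a one-sided over-representation test.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the DEG list, k: DEGs annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Invented counts for illustration: 20 background genes, 5 annotated,
# 4 DEGs of which 2 are annotated.
p = hypergeom_pvalue(k=2, n=4, K=5, N=20)
print(round(p, 4))
```

In practice the p-values of all tested terms would then be adjusted for multiple testing before terms are reported as enriched.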
GSEA analysis
GSEA was used to identify functional alterations across the DEGs of IRF8. In the GO analysis, the high expression group was mainly enriched in BP lymphocyte mediated immunity, BP leukocyte mediated immunity, and BP adaptive immune response. The low expression group was mainly enriched in BP sensory perception of bitter taste, BP detection of chemical stimulus involved in sensory perc, and BP sensory perception of taste (Fig. 11a).
Fig. 11 GSEA of Analysis in PDE4B and PDE6D. a GO. b KEGG
In the KEGG analysis, the high expression group was mainly enriched in proximal tubule bicarbonate reclamation, drug metabolism cytochrome p450, and glycine serine and threonine metabolism. The low expression group was mainly enriched in allograft rejection, autoimmune thyroid disease, and systemic lupus erythematosus (Fig. 11b; Table S6).
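The idea behind the GSEA results above is a running-sum statistic over a ranked gene list: the score rises when a gene belongs to the gene set and falls otherwise, and the enrichment score is the largest deviation from zero. The sketch below implements the simple unweighted form of this statistic on invented data. It is not the full GSEA procedure, which also weights genes by their ranking metric and assesses significance by permutation.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (Kolmogorov-Smirnov style).

    ranked_genes: genes ordered from most up- to most down-regulated.
    gene_set: collection of genes belonging to the pathway or GO term.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the signed maximum deviation from zero
    return best


# Invented ranking and gene set; members cluster near the top of the list.
ranking = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"]
toy_set = {"G1", "G2", "G4"}
es = enrichment_score(ranking, toy_set)
print(round(es, 3))
```

A positive score indicates that the set is concentrated among the top-ranked (high expression group) genes, and a negative score that it is concentrated at the bottom.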
GSVA analysis
GSVA was used to identify functional alterations across the DEGs of IRF8. In the GO analysis, the functional enrichment mainly involved BP ureter development, MF aldehyde dehydrogenase nad p plus activity, BP ear morphogenesis, MF transforming growth factor beta receptor binding, and CC 90s preribosome (Fig. 12a). In the KEGG analysis, the functional enrichment mainly involved phenylalanine metabolism, histidine metabolism, drug metabolism cytochrome p450, and glycine serine and threonine metabolism (Fig. 12b; Table S7).
Fig. 12 GSVA of Analysis in IRF8. a GO. b KEGG
Immune landscape characterization
The immunological environment has a critical role in the initiation and progression of NSOI. The risk-associated profiles displayed stark differences in immune cell infiltration. Within the IRF8 cohort, aDCs, APC co-inhibition, APC co-stimulation, B cells, CCR, and CD8+ T cells differed significantly between the low- and high-risk groups, whereas Mast cells did not (P>0.05) (Fig. 13a). Among immune cells, B cells naive, T cells CD4 memory resting, and Dendritic cells resting were highly expressed in the treat group, whereas Monocytes, Macrophages M0, and Mast cells activated were highly expressed in the control group (Fig. 13b). In addition, we constructed an immune infiltration correlation rectangle plot and heatmap (Fig. 13c-d). PCA again separated patients successfully on the basis of immune features (Fig. 13e). A lollipop plot was created to display the correlation coefficients, including those for Mast cells resting, Macrophages M2, Monocytes, B cells memory, and NK cells activated (Fig. 13f). B cells naive, Macrophages M0, Macrophages M1, T cells CD4 memory activated, T cells CD4 memory resting, T cells CD4 naive, and T cells gamma delta were positively associated with IRF8, whereas Mast cells resting, Monocytes, NK cells activated, Plasma cells, T cells CD8, and T cells regulatory (Tregs) were negatively associated with IRF8 (Fig. 14; Table S7).
Fig. 13 Immune Landscape Characterization. a Expression of immune function. b Expression of immune cells. c Correlation rectangle plot. d Heatmap. e PCA analysis. f The expression patterns of Correlation Coefficient
Fig. 14 Immune infiltration analyses
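The positive and negative associations between IRF8 and individual immune cell types can be illustrated with a rank correlation between IRF8 expression and the estimated fraction of a cell type across samples. The sketch below uses Spearman correlation on invented values; the text does not state which correlation coefficient was used, so the choice of Spearman and all helper names are assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented per-sample values; not data from the study.
irf8_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
cell_fraction = [0.10, 0.15, 0.12, 0.30, 0.25]
rho = spearman(irf8_expression, cell_fraction)
print(round(rho, 3))
```

A coefficient above zero would place the cell type among the positively associated group, and one below zero among the negatively associated group.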
Identification of common RNAs and construction of miRNAs-LncRNAs shared genes network
A search of three databases yielded 30 miRNAs and 23 lncRNAs linked with NSOI (Table S7a-b). The miRNAs-lncRNAs-genes network was constructed by intersecting them with the shared genes (obtained by LASSO regression and SVM-RFE). The final network included 22 lncRNAs (CTA-414D7.1, LINC01070, RP11-99L13.2, MIR325HG, LINC01165, LINC00613, DYX1C1-CCPG1, RP11-343D2.11, RP11-154D6.1, RP11-22A3.2, SFTPD-AS1, RP1-288H2.2, AC124997.1, CTD-2410N18.4, AJ003147.8, CTD-3046C4.1, RP11-227H15.4, RP11-989E6.10, LINC00662, CTB-181F24.1, RP11-627J17.1, SNHG14) and 6 miRNAs (hsa-miR-545-3p, hsa-miR-618, hsa-miR-194-5p, hsa-miR-938, hsa-miR-186-5p, hsa-miR-302a-5p) (Fig. 15; Table S8).
Fig. 15 miRNAs-LncRNAs shared genes network. Note: Red circles are mRNAs, blue quadrangles are miRNAs, and green triangles are lncRNAs