The roles of IRF8 in nonspecific orbital inflammation: an integrated analysis by bioinformatics and machine learning

DEG identification and principal component analysis

We merged the GSE58331 and GSE105149 datasets and applied batch correction across them. PCA confirmed that the samples separated clearly into distinct cohorts (Fig. 2a-b). We identified 314 DEGs. In the heatmap, one set of genes clustered in the treat group (PPP1R1A, CAB39L, MTURN, MAOA2, NGFRAP1, CDR1, etc.) and another in the control group (ITGB2, CAPG, CHI3L1, SLAMF8, APOC1, TCIRG1, etc.) (Fig. 2c). Significantly up-regulated DEGs included TCIRG1, IGHM, CXCL9, PROM1, PIGR, and HLA-DQA1, whereas significantly down-regulated DEGs included HLF, ADH1B, MGST1, LARP6, PGM1, C2orf40, and TGFBR3 (Fig. 2d) (Table S1).
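As an illustration of how DEGs are typically split into up- and down-regulated sets for a volcano plot, the following minimal sketch labels genes by log fold change and adjusted P value. The cut-offs (`LOGFC_CUTOFF`, `ADJ_P_CUTOFF`) and the example statistics are assumptions for illustration; the thresholds actually used in this study are not stated in this section.

```python
# Hypothetical cut-offs; the section does not report the thresholds used.
LOGFC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def classify_deg(log_fc: float, adj_p: float) -> str:
    """Label a gene as 'up', 'down', or 'not_significant'."""
    if adj_p >= ADJ_P_CUTOFF or abs(log_fc) < LOGFC_CUTOFF:
        return "not_significant"
    return "up" if log_fc > 0 else "down"


# Made-up (logFC, adjusted P) pairs; these are not values from the study.
example_stats = {
    "GENE_A": (2.1, 0.001),
    "GENE_B": (-1.6, 0.003),
    "GENE_C": (0.4, 0.0001),   # significant P but small effect
    "GENE_D": (1.8, 0.20),     # large effect but not significant
}

labels = {gene: classify_deg(fc, p) for gene, (fc, p) in example_stats.items()}
for gene, label in labels.items():
    print(gene, label)
```

Requiring both an effect-size and a significance threshold is what gives a volcano plot its two coloured "wings".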

Fig. 2

Principal component analysis. a-b PCA. c Heatmap. d Volcano plot

Construction of the model

LASSO and Cox regression analysis, together with the optimal penalty value, were used to establish a gene signature (Fig. 3a-b). SVM-RFE was used to build a machine learning model and to assess its accuracy and reliability; the model's accuracy was 0.894 and its error was 0.106 (Fig. 3c-d). Random forest analysis identified important genes including SRPX, ITM2A, PGM1, and HLF (Fig. 3e-f). We attempted to combine the key genes from all three algorithms to construct the model, but only LASSO and SVM-RFE yielded stable key-gene models. From the genes shared by these two methods, we obtained 15 hub genes (Fig. 3g) (Table S3).
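The Venn step can be sketched as a plain set intersection. In the snippet below, the two input lists are partial and illustrative: HLF, PGM1, and IRF8 are hub genes named in this paper, while `GENE_X` and `GENE_Y` are hypothetical placeholders for genes selected by only one method. The last lines simply check that the reported error equals one minus the reported accuracy.

```python
def shared_genes(*gene_lists):
    """Return the genes present in every input list, sorted alphabetically."""
    sets = [set(genes) for genes in gene_lists]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Illustrative, partial selections (not the full lists from the study).
lasso_genes = ["HLF", "PGM1", "IRF8", "GENE_X"]
svm_rfe_genes = ["IRF8", "HLF", "GENE_Y", "PGM1"]

hub_candidates = shared_genes(lasso_genes, svm_rfe_genes)
print(hub_candidates)

# Reported SVM-RFE accuracy and the error implied by it.
accuracy = 0.894
error = round(1 - accuracy, 3)
print(error)
```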

Fig. 3

The development of the signature. a Regression of the NSOI-related genes using LASSO. b Cross-validation used in the LASSO regression to fine-tune parameter selection. c-d Accuracy and error of this model. e-f Random forest analysis. g Venn diagram

DEG identification and visualization

We visualized the expression of the 15 hub genes in the NSOI group and the normal sample group (Fig. 4), and plotted all of them together in a single graph for visual comparison (Fig. 5). ROC analysis of the 15 hub genes showed high diagnostic accuracy: HLF (AUC: 0.945), PGM1 (AUC: 0.911), GPR146 (AUC: 0.907), IRF8 (AUC: 0.840), TNS1 (AUC: 0.802), PLA2G16 (AUC: 0.801), PALMD (AUC: 0.824), CCL4 (AUC: 0.813), IGK (AUC: 0.765), CORO2B (AUC: 0.887), IGSF10 (AUC: 0.882), AKR1C1 (AUC: 0.836), ENPP6 (AUC: 0.830), MAP1B (AUC: 0.842), RHOBTB3 (AUC: 0.806) (Fig. 6).
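For readers unfamiliar with single-gene ROC analysis, the AUC can be computed directly as the probability that a randomly chosen case sample has higher expression than a randomly chosen control (ties count one half). The sketch below uses made-up expression values; the `gene_auc` helper, which reports the AUC in whichever direction is larger so that down-regulated genes also score above 0.5, is an assumption about how such values are usually reported, not a description of the authors' pipeline.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg) over all sample pairs."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def gene_auc(case_expr, control_expr):
    """Direction-agnostic AUC (assumed convention for down-regulated genes)."""
    auc = roc_auc(case_expr, control_expr)
    return max(auc, 1.0 - auc)


# Made-up expression values for one gene (not data from the study).
cases = [5.1, 6.3, 7.0, 4.2]
controls = [3.9, 4.5, 4.0, 5.0]

print(roc_auc(cases, controls))   # 14 of 16 pairs favour the cases
print(gene_auc(controls, cases))  # same gene viewed in the other direction
```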

Fig. 4

Expression of 15 hub genes in NSOI group and normal sample group respectively

Fig. 5

All hub genes are co-expressed in the same line plot

Fig. 6

Validation of hub genes

GSE58331 was used for validation to increase confidence in the model and in the predictive accuracy of these hub genes. The hub genes again showed significant differences in the GSE58331 analysis (Fig. 7). ROC analysis of the 15 hub genes in GSE58331 also showed high accuracy: HLF (AUC: 0.971), PGM1 (AUC: 0.938), GPR146 (AUC: 0.943), IRF8 (AUC: 0.851), TNS1 (AUC: 0.861), PLA2G16 (AUC: 0.839), PALMD (AUC: 0.867), CCL4 (AUC: 0.798), IGK (AUC: 0.857), CORO2B (AUC: 0.919), IGSF10 (AUC: 0.923), AKR1C1 (AUC: 0.810), ENPP6 (AUC: 0.882), MAP1B (AUC: 0.862), RHOBTB3 (AUC: 0.861). These results support the reliability and accuracy of our model (Fig. 8).
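The two sets of AUC values reported in this paper (merged dataset and GSE58331) can be compared gene by gene with a few lines of code. The sketch below uses only the numbers quoted in the text; the variable names are illustrative.

```python
# AUC values as reported in the text for the merged dataset.
merged_auc = {
    "HLF": 0.945, "PGM1": 0.911, "GPR146": 0.907, "IRF8": 0.840,
    "TNS1": 0.802, "PLA2G16": 0.801, "PALMD": 0.824, "CCL4": 0.813,
    "IGK": 0.765, "CORO2B": 0.887, "IGSF10": 0.882, "AKR1C1": 0.836,
    "ENPP6": 0.830, "MAP1B": 0.842, "RHOBTB3": 0.806,
}

# AUC values as reported in the text for GSE58331.
gse58331_auc = {
    "HLF": 0.971, "PGM1": 0.938, "GPR146": 0.943, "IRF8": 0.851,
    "TNS1": 0.861, "PLA2G16": 0.839, "PALMD": 0.867, "CCL4": 0.798,
    "IGK": 0.857, "CORO2B": 0.919, "IGSF10": 0.923, "AKR1C1": 0.810,
    "ENPP6": 0.882, "MAP1B": 0.862, "RHOBTB3": 0.861,
}

# Per-gene change in AUC, rounded to the precision of the reported values.
delta = {g: round(gse58331_auc[g] - merged_auc[g], 3) for g in merged_auc}
higher = sorted(g for g, d in delta.items() if d > 0)
lower = sorted(g for g, d in delta.items() if d < 0)

print(len(higher), "genes with a higher AUC in GSE58331")
print("lower in GSE58331:", lower)
```

With the reported values, 13 of the 15 hub genes have a higher AUC in GSE58331, and two (AKR1C1 and CCL4) have a slightly lower one.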

Fig. 7

Expression of 15 hub genes in GSE58331 analysis

Fig. 8

DEG identification of IRF8

Differential analysis based on the single gene target IRF8 identified 507 DEGs. One set of genes clustered in the high-expression group (IL21R, IRF8, FGD3, BCL2A1, LCK, CD48, RAC2, CD53, etc.) and another in the low-expression group (IRX5, PON3, ARHGEF37, ANO1, RAB3D, PHGDH, S100A1, etc.) (Fig. 9a-b). We also constructed a correlation matrix plot for IRF8 (Fig. 9c) (Table S4).
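A single-gene differential analysis of this kind can be sketched as follows: samples are split into high and low groups by the expression of the gene of interest, and every other gene is then compared between the two groups. The median cut-off, the assumption that expression is on a log2 scale, and all sample values below are illustrative assumptions; this section does not specify how the high and low groups were defined.

```python
from statistics import mean, median


def split_by_median(expr_by_sample):
    """Split sample IDs into (high, low) groups around the median expression."""
    cutoff = median(expr_by_sample.values())
    high = sorted(s for s, v in expr_by_sample.items() if v > cutoff)
    low = sorted(s for s, v in expr_by_sample.items() if v <= cutoff)
    return high, low


def log_fold_change(gene_expr, high, low):
    """Difference of group means; equals log2FC if expression is log2-scaled."""
    return mean(gene_expr[s] for s in high) - mean(gene_expr[s] for s in low)


# Made-up log2 expression values (not data from the study).
irf8_expr = {"S1": 8.2, "S2": 6.1, "S3": 7.9, "S4": 5.8}
other_gene_expr = {"S1": 9.0, "S2": 6.0, "S3": 8.0, "S4": 5.0}

high_group, low_group = split_by_median(irf8_expr)
print(high_group, low_group)
print(log_fold_change(other_gene_expr, high_group, low_group))
```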

Fig. 9

DEG identification of IRF8. a Heatmap. b Volcano plot. c Correlation matrix diagram

Enrichment analysis of DEGs of IRF8

GO enrichment analysis revealed 996 core targets across the BP, MF, and CC categories. The MF terms mainly involved actin binding (GO:0003779), receptor ligand activity (GO:0048018), and immune receptor activity (GO:0140375). The CC terms mainly involved external side of plasma membrane (GO:0009897), collagen-containing extracellular matrix (GO:0062023), and endocytic vesicle (GO:0030139). The BP terms mainly involved leukocyte mediated immunity (GO:0002443), leukocyte cell-cell adhesion (GO:0007159), and negative regulation of immune system process (GO:0002683). KEGG enrichment analysis revealed that the over-expressed genes were mainly involved in Cytokine-cytokine receptor interaction (hsa04060), Chemokine signaling pathway (hsa04062), and Cell adhesion molecules (hsa04514) (Fig. 10 and Table S5a-b).
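Over-representation analyses of GO terms and KEGG pathways are commonly based on a one-sided hypergeometric test. The sketch below shows that calculation in its simplest form; the function name and the toy numbers are illustrative, and the section does not state which software or test the authors used.

```python
from math import comb


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric P(X >= k).

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the DEG list, k: DEGs annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper index,
    # so impossible draws contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Toy example: 10 background genes, 4 in the term, 3 DEGs, 2 of them in the term.
p = enrichment_p_value(k=2, n=3, K=4, N=10)
print(p)
```

In practice the resulting P values are adjusted for multiple testing across all terms before being reported.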

Fig. 10

GO and KEGG analyses were performed for PMGs. a GO barplot, chord, circos, and cluster plots illustrating the logFC of the selected genes. b KEGG barplot, chord, circos, and cluster plots illustrating the logFC of the indicated genes

GSEA

GSEA was used to identify functional alterations across the DEGs of IRF8. In the GO analysis, the high-expression group was mainly enriched in BP lymphocyte mediated immunity, BP leukocyte mediated immunity, and BP adaptive immune response, whereas the low-expression group was mainly enriched in BP sensory perception of bitter taste, BP detection of chemical stimulus involved in sensory perc, and BP sensory perception of taste (Fig. 11a).

Fig. 11

GSEA of IRF8. a GO. b KEGG

In the KEGG analysis, the high-expression group was mainly enriched in proximal tubule bicarbonate reclamation, drug metabolism cytochrome p450, and glycine serine and threonine metabolism, whereas the low-expression group was mainly enriched in allograft rejection, autoimmune thyroid disease, and systemic lupus erythematosus (Fig. 11b) (Table S6).
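The core of GSEA is a running-sum enrichment score computed over a ranked gene list. The sketch below implements the simplest, unweighted (Kolmogorov-Smirnov-like) form of that score; standard GSEA additionally weights hits by the ranking metric and assesses significance by permutation, both of which are omitted here. Gene names are placeholders.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    Walk down the ranked list, stepping up for genes in the set and down
    for genes outside it, and return the largest deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Placeholder ranking, from most up-regulated to most down-regulated.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]

print(enrichment_score(ranked, {"G1", "G2"}))  # set concentrated at the top
print(enrichment_score(ranked, {"G5", "G6"}))  # set concentrated at the bottom
```

A positive score indicates a gene set concentrated among the genes ranked highest, and a negative score a set concentrated among those ranked lowest.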

GSVA

GSVA was used to identify functional alterations across the DEGs of IRF8. In the GO analysis, the functional enrichment mainly involved BP ureter development, MF aldehyde dehydrogenase nad p plus activity, BP ear morphogenesis, MF transforming growth factor beta receptor binding, and CC 90s preribosome (Fig. 12a). In the KEGG analysis, the functional enrichment mainly involved phenylalanine metabolism, histidine metabolism, drug metabolism cytochrome p450, and glycine serine and threonine metabolism (Fig. 12b) (Table S7).

Fig. 12

GSVA of IRF8. a GO. b KEGG

Immune landscape characterization

The immune environment plays a critical role in the initiation and progression of NSOI, and the risk-associated profiles differed markedly in immune cell infiltration. Within the IRF8 cohort, aDCs, APC co inhibition, APC co stimulation, B cells, CCR, and CD8+ T cells differed significantly between the low- and high-risk groups, whereas Mast cells did not (P>0.05) (Fig. 13a). Among immune cell types, B cells naive, T cells CD4 memory resting, and Dendritic cells resting were highly expressed in the treat group, whereas Monocytes, Macrophages M0, and Mast cells activated were highly expressed in the control group (Fig. 13b). We also constructed an immune infiltration correlation rectangle plot and heatmap (Fig. 13c-d). PCA again separated patients on the basis of their immune profiles (Fig. 13e). A lollipop plot was created to display the correlation coefficients of immune cell types, including Mast cells resting, Macrophages M2, Monocytes, B cells memory, and NK cells activated (Fig. 13f). B cells naive, Macrophages M0, Macrophages M1, T cells CD4 memory activated, T cells CD4 memory resting, T cells CD4 naive, and T cells gamma delta were positively associated with IRF8, whereas Mast cells resting, Monocytes, NK cells activated, Plasma cells, T cells CD8, and T cells regulatory (Tregs) were negatively associated with IRF8 (Fig. 14) (Table S7).
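Associations between a gene's expression and estimated immune cell fractions are usually summarised with a correlation coefficient per cell type. The sketch below computes a Spearman rank correlation from scratch; the choice of Spearman (rather than Pearson) and all values are assumptions for illustration, since this section does not state which coefficient was used.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values across five samples (not data from the study).
gene_expr = [5.8, 6.1, 7.9, 8.2, 7.0]
cell_type_a = [0.02, 0.05, 0.11, 0.15, 0.08]  # rises with expression
cell_type_b = [0.30, 0.22, 0.10, 0.05, 0.18]  # falls with expression

print(spearman(gene_expr, cell_type_a))
print(spearman(gene_expr, cell_type_b))
```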

Fig. 13

Immune landscape characterization. a Expression of immune function. b Expression of immune cells. c Correlation rectangle plot. d Heatmap. e PCA analysis. f The expression patterns of correlation coefficient

Fig. 14

Immune infiltration analyses

Identification of common RNAs and construction of miRNAs-LncRNAs shared genes network

Searching three databases yielded 30 miRNAs and 23 lncRNAs linked with NSOI (Table S7a-b). The miRNAs-lncRNAs-genes network was constructed by intersecting them with the shared genes (obtained by LASSO regression and SVM-RFE). The final network included 22 lncRNAs (CTA-414D7.1, LINC01070, RP11-99L13.2, MIR325HG, LINC01165, LINC00613, DYX1C1-CCPG1, RP11-343D2.11, RP11-154D6.1, RP11-22A3.2, SFTPD-AS1, RP1-288H2.2, AC124997.1, CTD-2410N18.4, AJ003147.8, CTD-3046C4.1, RP11-227H15.4, RP11-989E6.10, LINC00662, CTB-181F24.1, RP11-627J17.1, SNHG14) and 6 miRNAs (hsa-miR-545-3p, hsa-miR-618, hsa-miR-194-5p, hsa-miR-938, hsa-miR-186-5p, hsa-miR-302a-5p) (Fig. 15) (Table S8).
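The network construction described above can be sketched as a two-step filter: keep miRNA-gene pairs whose target is a shared gene, then keep lncRNA-miRNA pairs whose miRNA survived the first step. All identifiers and pairings below are hypothetical placeholders; the actual interactions come from the databases the authors searched and are not reproduced here.

```python
def build_network(mirna_gene_pairs, lncrna_mirna_pairs, shared_genes):
    """Return (edges, kept_mirnas) for a lncRNA-miRNA-gene network."""
    shared = set(shared_genes)
    edges = []
    kept_mirnas = set()
    # Step 1: miRNA -> gene edges restricted to the shared (hub) genes.
    for mirna, gene in mirna_gene_pairs:
        if gene in shared:
            edges.append((mirna, gene))
            kept_mirnas.add(mirna)
    # Step 2: lncRNA -> miRNA edges restricted to miRNAs kept in step 1.
    for lncrna, mirna in lncrna_mirna_pairs:
        if mirna in kept_mirnas:
            edges.append((lncrna, mirna))
    return edges, kept_mirnas


# Hypothetical placeholder interactions (not taken from any database).
mirna_gene_pairs = [("miRNA_1", "GENE_A"), ("miRNA_2", "GENE_Z"), ("miRNA_3", "GENE_B")]
lncrna_mirna_pairs = [("lncRNA_1", "miRNA_1"), ("lncRNA_2", "miRNA_2"), ("lncRNA_3", "miRNA_3")]
shared_genes = ["GENE_A", "GENE_B"]

edges, kept_mirnas = build_network(mirna_gene_pairs, lncrna_mirna_pairs, shared_genes)
for source, target in edges:
    print(source, "->", target)
```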

Fig. 15

miRNAs-lncRNAs shared genes network. Note: Red circles are mRNAs, blue quadrangles are miRNAs, and green triangles are lncRNAs
