Screening COPD-Related Biomarkers and Traditional Chinese Medicine Prediction Based on Bioinformatics and Machine Learning

Introduction

Chronic obstructive pulmonary disease (COPD) is recognized as a heterogeneous ailment,1 primarily characterized by airway alterations (bronchitis, bronchiolitis) and/or alveolar abnormalities (emphysema), leading to chronic respiratory symptoms (dyspnea, cough, expectoration) and a progressive, persistent limitation of airflow.2 COPD represents a significant global public health challenge, with research indicating its lengthy latency period.3 Once diagnosed, altering the course of COPD proves exceedingly challenging,4 inflicting not only personal suffering but also imposing a substantial economic burden on society.5 Globally, COPD is the causative factor for more than half of all chronic respiratory disease cases,6 gradually becoming the third leading cause of death worldwide.7 With the increase in population aging, both the prevalence and mortality rates of COPD are on an upward trajectory.8 A study targeting middle-aged individuals revealed that the prevalence of COPD among those over 30 years of age was approximately 11.7%,9 while research focused on China indicated a prevalence rate of 13.7% among individuals over 40 years of age, with nearly 100 million people affected.10 Smoking, secondhand smoke, occupational exposure, and a history of previous lung infections have been identified as high-risk factors for COPD.11

Historically, COPD was predominantly considered an ailment of elderly individuals, and little attention has been given to its clinical and pathological characteristics in younger individuals. This oversight led to cases remaining undiagnosed or only being identified at advanced stages, thereby missing critical opportunities for early intervention and the deceleration of disease progression.12 However, with the advancement of research and the subsequent publication of several large cohort studies,13–15 there is a growing recognition that the origins of COPD can be traced back to early life, possibly even before birth, due to prenatal passive smoke exposure, as well as active and passive smoke exposure during childhood, adolescence, or adulthood, all of which contribute to the development of COPD.16 The early progression of the disease is influenced by genetics,17 environmental factors,18 and smoking,19 yet the heterogeneity and susceptibility of COPD patients cannot be fully explained by environmental factors such as smoking alone.20 It is partly believed that COPD represents the cumulative outcome of gene‒environment interactions over the course of an individual’s life,21 with this dynamic process revealing chronic symptoms and structural and functional impairments, ultimately affecting lung development and aging. This understanding has led to the introduction of concepts such as Pre-COPD22–24 and PRISm,25,26 making early screening, diagnosis, and preemptive intervention or halting of progression a focal point of research. 

Airway inflammation is central to the pathogenesis of COPD,27 acting as a key factor in disease progression and exacerbation, with persistent inflammation present in both the lung parenchyma and peripheral airways.28,29 Ferroptosis, an iron-dependent form of regulated necrotic cell death, is linked to the pathogenesis of COPD and may serve as a potential therapeutic target.30 Intervening in the progression of ferroptosis could have a significant impact on airway epithelial cells in COPD.31,32 Mendelian randomization, which uses genetic variants as instrumental variables, infers the causal relationship between exposure and outcome while mitigating confounding and reverse causality.33

Thus, this investigation employs bioinformatics and machine learning to discern the inflammatory responses associated with COPD, and through the integration of ferroptosis, identifies distinctive immune cells and potential biomarkers. It further refines the selection of traditional Chinese medicines aimed at prevention and treatment. By incorporating Mendelian randomization analysis, it examines the causal relationships between immune cells, genes, and COPD. This approach offers innovative strategies for the early diagnosis, prevention, and treatment of COPD, and furnishes a novel perspective on the application of traditional Chinese medicine in treating COPD.

Materials and Methods

Identification and Correlation Analysis of Common Differentially Expressed Genes

Data were extracted from the GEO database using R; GSE5058 served as the training set and GSE106986 as the validation set. The data were normalized, and differentially expressed genes (DEGs) were screened using the thresholds |logFC| > 1 and adj.P.Val < 0.05. GSEA was conducted on all DEGs, and KEGG and GO enrichment analyses were performed separately on the upregulated and downregulated genes.
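The screening rule above can be illustrated with a minimal sketch. It assumes that the adjusted P value is a Benjamini-Hochberg correction, which the text does not state, and it uses hypothetical gene names and values rather than data from GSE5058. The function names are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log_fc, p_values, fc_cutoff=1.0, alpha=0.05):
    """Split genes into up/down lists using |logFC| > cutoff and adjusted p < alpha."""
    adj = benjamini_hochberg(p_values)
    up, down = [], []
    for gene, lfc, q in zip(genes, log_fc, adj):
        if q < alpha and abs(lfc) > fc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical toy values (assumption), not taken from GSE5058.
genes = ["geneA", "geneB", "geneC", "geneD"]
log_fc = [1.8, -1.4, 0.3, 2.1]
p_values = [0.001, 0.004, 0.0005, 0.2]
up, down = select_degs(genes, log_fc, p_values)
print("upregulated:", up, "downregulated:", down)
```

In this toy input, geneC is significant but fails the fold-change threshold, and geneD has a large fold change but fails the significance threshold, so both criteria are needed for a gene to be retained.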

Construction and Selection of the WGCNA Network

The R language was used to perform WGCNA on the GSE5058 dataset, identifying relevant gene modules and selecting gene sets potentially associated with COPD.

Determination of COPD Genes and Enrichment Analysis of Inflammatory Response Genes

Intersection of differentially expressed genes and genes selected from WGCNA identified COPD-related genes. The inflammatory response genes with a relevance score>10 from the GeneCards database (https://www.genecards.org) were intersected with COPD-related genes to identify genes associated with COPD-related inflammation. Further enrichment analysis of this gene intersection was conducted, and a PPI network diagram was drawn.

Selection of PPI Core Genes and Lasso Diagnostic Model with External Independent Validation

Using R, we constructed a PPI network for the intersecting genes and identified the core genes by betweenness centrality analysis, then explored their interrelations. We also built a LASSO diagnostic model for these core genes, assessed it with receiver operating characteristic (ROC) analysis, and validated it in the GSE106986 dataset.
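As an illustration of betweenness centrality, the following sketch applies Brandes' algorithm to a small unweighted, undirected graph. The study itself used R, and the exact tool and settings are not stated; the graph below is hypothetical and is not the PPI network of the study.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalized betweenness for an undirected graph {node: set(neighbors)}."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from source
        dist = {v: -1 for v in graph}   # -1 marks an unvisited node
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:                    # breadth-first search from source
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                    # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each undirected shortest path was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical star-shaped network (assumption): one hub linked to three leaves.
toy_network = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}
scores = betweenness_centrality(toy_network)
print(scores)
```

Nodes that lie on many shortest paths between other nodes, such as the hub here, receive high scores, which is the rationale for treating high-betweenness proteins as core genes.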

Connection Between Core Genes and Immune Infiltration and Construction of the TF-miRNA‒mRNA Network

The expression of core genes in the COPD and healthy control groups was explored and validated in the GSE106986 dataset. The Network Analyst database (https://www.networkanalyst.ca/NetworkAnalyst/) was used to construct the TF-miRNA‒mRNA network of the core genes, and immune analysis of the core genes was performed.

Selection and Correlation Analysis of Core Immune Cells

The CIBERSORT method was used to compare immune cell differences between the COPD and control groups and to identify differential immune cells. LASSO regression was used to select COPD-related immune cells, and the intersection of these methods was used to identify the core immune cells associated with COPD. The correlation between core genes and core immune cells was further explored.

Differential and Immune-Related Analysis of COPD Patients Combined with Ferroptosis

Ferroptosis-related genes were obtained from the FerrDb (http://www.zhounan.org/ferrdb/current/) and GeneCards databases. Differential analysis was conducted on the dataset combined with the ferroptosis-related gene set, and its correlation with immune infiltration was explored.

Machine Learning Selection of COPD Characteristic Genes and Prediction of Diagnostic Models

Three machine learning algorithms (SVM-RFE, LASSO, and random forest) were used to select characteristic genes for COPD. Coexpression heatmaps, nomogram-based diagnostic models, and ROC curves of the characteristic genes were constructed and validated in the GSE106986 dataset.

Single-Gene GSEA Analysis of Characteristic Genes and Their Connections with Specific Gene Sets

GSEA was conducted on each characteristic gene to explore its relevance to COPD and further analyze its connection with specific gene sets.

Prediction and Molecular Docking Validation of Traditional Chinese Medicine for COPD Core Genes

The TCMSP database (https://old.tcmsp.com/tcmsp.php) was used to collect information on compounds that act on the core genes and to identify candidate compounds and actual genes (core genes with at least one corresponding compound). A “target-compound” network diagram was constructed with Cytoscape software. TCMSP was then used to trace the candidate compounds back to potential traditional Chinese medicines, and a “target-compound-traditional Chinese medicine” network diagram was constructed. Molecular docking validation was performed on selected compounds and genes.

Study of Mendelian Randomization for Immune Cells and Genes

We used the GWAS database (https://gwas.mrcieu.ac.uk/) to obtain genetic data for COPD (finn-b-J10_COPD), comprising 193,638 individuals of European descent and 16,380,382 SNPs. Genetic data for 731 immune cell traits were taken from a 2020 study of the genetic characteristics of immune cells as a basis of autoimmunity.34 Genetic data for the core genes were also obtained from the GWAS database, and core genes lacking such data were excluded. The same COPD dataset served as the outcome in all analyses. SNPs associated with each exposure were selected at a significance level of P<1×10−5. SNPs in linkage disequilibrium were then removed using R2<0.001 and a window of 10,000 kb, and the F-statistic was calculated for each selected SNP; SNPs with F>10 were retained as non-weak instrumental variables. After instrumental variable selection, a total of 13,318 immune cell-related SNPs were analyzed. The causal relationship with COPD was assessed primarily with the inverse variance weighted (IVW) method, with MR-Egger, weighted median, simple mode, and weighted mode as supplementary methods; heterogeneity and pleiotropy tests and sensitivity analyses were also conducted.
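The instrument filtering and the primary estimator can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the per-SNP F-statistic is approximated as (beta/SE)^2, the IVW estimate is the fixed-effect form, and all summary statistics are hypothetical.

```python
import math


def f_statistic(beta_exposure, se_exposure):
    """Approximate per-SNP F-statistic as (beta / SE)^2 (assumption of this sketch)."""
    return (beta_exposure / se_exposure) ** 2


def ivw_estimate(instruments, f_cutoff=10.0):
    """Fixed-effect inverse variance weighted estimate.

    instruments: list of (beta_exposure, se_exposure, beta_outcome, se_outcome).
    SNPs with F <= f_cutoff are dropped as weak instruments.
    """
    strong = [s for s in instruments if f_statistic(s[0], s[1]) > f_cutoff]
    numerator = sum(bx * by / sy ** 2 for bx, _, by, sy in strong)
    denominator = sum(bx ** 2 / sy ** 2 for bx, _, _, sy in strong)
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error, len(strong)


# Hypothetical summary statistics (assumption), not from the study.
snps = [
    (0.10, 0.02, 0.05, 0.01),   # F = 25, kept
    (0.20, 0.04, 0.10, 0.02),   # F = 25, kept
    (0.05, 0.03, 0.20, 0.05),   # F < 10, removed as a weak instrument
]
estimate, standard_error, n_used = ivw_estimate(snps)
print(estimate, standard_error, n_used)
```

Each retained SNP contributes a Wald ratio (outcome effect divided by exposure effect), and IVW averages these ratios with weights that favor precisely estimated SNPs.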

Results

Selection and Correlation Analysis of Common Differentially Expressed Genes

Using the R programming language, relevant datasets from the GEO database, specifically the GSE5058 dataset containing 39 samples, were extracted. The expression matrix was normalized, incorporating 15 COPD patients and 24 healthy controls into the study. From the dataset, a total of 2443 DEGs related to COPD were identified, with 1149 upregulated and 1294 downregulated genes in comparison to those in the healthy control group. Concurrently, GSEA enrichment analysis was conducted on all differentially expressed genes, and both upregulated and downregulated genes were subjected to KEGG and GO enrichment analyses. The enrichment of DEGs was predominantly associated with pathways such as the Notch signaling pathway, ferroptosis, the Wnt signaling pathway, circadian rhythms, Th1 and Th2 cell differentiation, vascular smooth muscle contraction, the MAPK signaling pathway, cholesterol metabolism, T-cell activation, and positive regulation of the inflammatory response (Figure 1).

Figure 1 Standardized GSE5058 dataset (A); volcano plots and heatmaps of differentially expressed genes are shown (B and C); GSEA plots are shown (D and E); KEGG and GO enrichment analyses of upregulated and downregulated genes are shown (F); and a focus on the Notch signaling pathway is shown (G).

Construction and Selection of the WGCNA Network

A WGCNA was conducted on 15 COPD patients and 24 healthy controls, clustering the 39 samples and removing any apparent outliers (Figure 2A). A soft-thresholding power of β=8 was chosen with an R2 greater than 0.9 (Figure 2B). By employing a clustering height limit of 0.6 to merge highly correlated modules, six modules were identified for further analysis and are displayed beneath the dendrogram (Figure 2C and D). The study explored the correlation between module eigengene (ME) values and clinical traits and revealed that the turquoise module was positively correlated with the healthy control group (r=0.86, P=2e-12) and negatively correlated with the COPD group (r=−0.86, P=2e-12) (Figure 2E). The significance of COPD-related genes within the turquoise module is shown in Figure 2F. A heatmap of the interactions among genes within the module, where deeper colors indicate stronger interactions, is presented (Figure 2G). Finally, a heatmap illustrating the correlation between modules and traits, detailing the relationships of various modules, is depicted (Figure 2H).
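The role of the soft-thresholding power can be illustrated with a minimal sketch. It assumes an unsigned network in which the adjacency between two genes is the absolute Pearson correlation raised to the power β; the network type is not stated in the text, and the expression values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def soft_threshold_adjacency(expression, beta=8):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)|^beta for every ordered gene pair."""
    genes = list(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical expression profiles across four samples (assumption).
toy_expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with g1
    "g3": [1.0, 3.0, 2.0, 4.0],   # correlation 0.8 with g1
}
adjacency = soft_threshold_adjacency(toy_expression, beta=8)
print(adjacency[("g1", "g2")], adjacency[("g1", "g3")])
```

Raising correlations to a power of 8 keeps strong correlations close to 1 while shrinking moderate ones sharply (0.8 becomes about 0.17), which pushes the network toward the approximately scale-free topology that the R2 criterion checks.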

Figure 2 Displays the dendrogram of sample clustering (A); Selects an appropriate threshold (B); Visualizes the modules (C); Clusters the modules (D); Illustrates the relationship between different modules and groups (E); Highlights the significance of COPD-related genes within the turquoise module (F); Presents a heatmap of module interactions (G); Shows a heatmap of the correlation between modules and traits (H).

Figure 3 DEGs related to COPD (A); DEGs associated with the inflammatory response in COPD (B); DEG enrichment analysis (C); GO enrichment analysis (D); heatmap of DEGs related to the inflammatory response in COPD (E); KEGG enrichment analysis (F–H); and visualization of the protein‒protein interaction (PPI) network (I).

Identification of COPD Genes and Enrichment Analysis of Inflammatory Response Genes

According to the WGCNA, 8435 genes within the turquoise module intersected with 2443 DEGs, yielding 2320 COPD-related genes identified as disease genes for COPD (Figure 3A). Further analysis of the 1226 inflammatory response genes obtained from the GeneCards database revealed 141 COPD-related inflammatory response genes (Figure 3B). DO enrichment analysis revealed that these 141 genes are associated with diseases such as chronic obstructive pulmonary disease, obstructive lung disease, bacterial infectious disease, tuberculosis, bronchial disease, and interstitial lung disease (Figure 3C). GO enrichment analysis revealed biological processes such as regulation of inflammatory response, positive regulation of inflammatory response, and reactive oxygen species metabolic process, occurring in locations such as the cytoplasmic vesicle lumen, membrane raft, caveolae, and endoplasmic reticulum lumen and involving cytokine receptor binding, growth factor activity, lipopolysaccharide binding, and hormone activity (Figure 3D). KEGG pathway enrichment primarily focused on the JAK-STAT signaling pathway, MAPK signaling pathway, NF-kappa B signaling pathway, IL-17 signaling pathway, TNF signaling pathway, and T-cell receptor signaling pathway (Figure 3F–H). Additionally, a protein‒protein interaction (PPI) network was constructed (Figure 3I).

Selection of PPI Core Genes and Lasso Diagnostic Model with External Independent Validation

Using R language, we constructed a PPI network for 141 genes and identified core genes through betweenness centrality, resulting in a total of 37 core genes (Figure 4A). These genes include CD4, CSF3, CD69, IL1A, IL11, EGF, CCR7, TP53, MMP3, ACHE, FOS, PLG, CCR2, LRRK2, CCR6, MAPK11, CSF1, AGT, PTGS2, PTPN22, FASLG, TLR4, FOXP3, TLR7, SPP1, INS, POMC, ADIPOQ, CARD11, LTA, ACE2, NR4A1, IL10, GFAP, ABCB1, NOS1, and MYC. A heatmap was generated to further illustrate the variations and clustering relationships among these core genes (Figure 4B), and their interrelations were explored further (Figure 4C). A lasso diagnostic model was constructed for these core genes, and its efficacy was evaluated using receiver operating characteristic (ROC) curve analysis. The AUC values for all 37 core genes exceeded 0.75, with the overall training set achieving an AUC of 1, indicating a commendable diagnostic capability for COPD (Figure 4D–F). The GSE106986 dataset was used as a validation set, where the overall AUC for the core genes exceeded 0.65, further confirming its diagnostic utility for COPD (Figure 4G).
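To make the AUC values concrete, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The scores and labels are hypothetical and the function name is illustrative; this is not the ROC code used in the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical model scores (assumption): 1 = COPD, 0 = healthy control.
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(toy_scores, toy_labels)
print(round(auc, 3))
```

Under this definition, an AUC of 1 means that every case scored above every control in that dataset, so a training-set AUC of 1 is best interpreted together with the performance on the independent validation set.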

Figure 4 The core genes of the PPI network (A); heatmap of the core genes (B); correlations among the core genes (C); lasso diagnostic model (D); ROC curve analysis of the core genes (E); overall ROC analysis of the training set (F); overall ROC analysis of the validation set (G).

Connection Between Core Genes and Immune Infiltration and Construction of the TF-miRNA‒mRNA Network

The differences in the expression of the core genes between the COPD group and the control group were significant (Figure 5A); the expression of the core genes ACHE, CARD11, CD4, CSF1, CSF3, FOS, FOXP3, GFAP, IL11, INS, LTA, MAPK11, MYC, POMC, NR4A1, and NOS1 decreased in the COPD group, while the expression of the remaining core genes increased in the COPD group. The differences in core gene expression between the two groups were also significant in the validation set (Figure 5B). A TF-miRNA‒mRNA network was constructed for the core genes to better explore the mechanisms by which core genes regulate transcription factors such as ETS1, CREB1, and TFAP2A and are involved in COPD through interactions with hsa-miR-543, hsa-miR-181c, and hsa-miR-200a (Figure 5C). By employing the CIBERSORT method to analyze the correlation between core genes and immune cells, it was discovered that core genes might influence the progression of COPD by affecting various immune cells, such as eosinophils, M2 macrophages, plasma cells, and activated NK cells (Figure 5D).

Figure 5 Expression of core genes in both groups (A); expression of core genes in both groups within the validation set (B); TF-miRNA‒mRNA network diagram of core genes (C); correlation between core genes and immune cells (D). The significance markers are shown as: #P>0.05; *P<0.05; **P<0.01; ***P<0.001.

Selection and Correlation Analysis of Core Immune Cells

Using the CIBERSORT method, the characteristics of immune cells in the COPD group were compared with those in the healthy control group, and the differences in expression between the two groups were investigated. This analysis revealed the relative proportions and expression trends of 22 immune cell types (Figure 6A and B) and included a principal component analysis (PCA) of immune cells (Figure 6C). There were differences in the number of immune cells between the COPD group and the healthy control group, particularly in 11 types of immune cells, including plasma cells, activated memory CD4+ T cells, follicular helper T cells, regulatory T cells (Tregs), gamma delta T cells, activated NK cells, monocytes, M0 macrophages, M2 macrophages, resting dendritic cells, and eosinophils (Figure 6D). Through LASSO regression, 8 types of immune cells were selected: memory B cells, plasma cells, activated memory CD4+ T cells, gamma delta T cells, resting NK cells, activated NK cells, M2 macrophages, and eosinophils (Figure 6E). Together, these findings revealed 6 core immune cell types in COPD, namely, plasma cells, activated memory CD4+ T cells, gamma delta T cells, activated NK cells, M2 macrophages, and eosinophils (Figure 6F). Further analysis of the correlation between core genes and core immune cells revealed that plasma cells were positively correlated with CD4, CSF1, and POMC and negatively correlated with PLG, while eosinophils were negatively correlated with NOS1, among others (Figure 6G and H).
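The selection of core immune cells is a plain intersection of the two lists reported above, as the following sketch shows. The lists are transcribed from this section; cell-type names are lightly normalized so that they match, which is an assumption of the sketch.

```python
# Immune cell types that differed between groups (CIBERSORT comparison).
differential = [
    "plasma cells", "activated memory CD4+ T cells", "follicular helper T cells",
    "regulatory T cells", "gamma delta T cells", "activated NK cells", "monocytes",
    "M0 macrophages", "M2 macrophages", "resting dendritic cells", "eosinophils",
]

# Immune cell types selected by LASSO regression.
# Assumption: names are normalized to match the list above.
lasso_selected = [
    "memory B cells", "plasma cells", "activated memory CD4+ T cells",
    "gamma delta T cells", "resting NK cells", "activated NK cells",
    "M2 macrophages", "eosinophils",
]

# Keep the order of the differential list while intersecting.
lasso_set = set(lasso_selected)
core_immune_cells = [cell for cell in differential if cell in lasso_set]
print(len(core_immune_cells), core_immune_cells)
```

The intersection contains six cell types, in agreement with the core immune cells reported in Figure 6F.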

Figure 6 Bar chart of immune cells (A); heatmap of immune cells (B); PCA plot of immune cells (C); differential analysis of immune cells (D); selection of immune cells (E); selection of core immune cells (F); heatmap of core immune cells and core genes (G); and correlation analysis of core immune cells (H).

Differential and Immune-Related Analysis of COPD Patients Combined with Ferroptosis

From the FerrDb and GeneCards databases, 396 genes related to ferroptosis were identified. Differential analysis of ferroptosis between the COPD group and the healthy control group revealed 181 DEGs. Further intersection with the core genes revealed TP53, PTGS2, TLR4, and NR4A1, suggesting that COPD could be treated from the perspective of ferroptosis by modulating these four genes (Figure 7A–C). Further analysis of the correlation between ferroptosis and immune infiltration revealed differences in the relative proportions and expression trends of 22 immune cells in the ferroptosis gene set (Figure 7D and E), with differences observed in 9 types of immune cells. Compared to the healthy control group, the COPD group exhibited greater numbers of activated memory CD4+ T cells, gamma delta T cells, M0 macrophages, resting dendritic cells, and eosinophils but lower numbers of plasma cells, follicular helper T cells, regulatory T cells (Tregs), activated NK cells, and M2 macrophages; plasma cells and M2 macrophages were negatively correlated with ferroptosis (Figure 7F).

Figure 7 Differential and ssGSEA of the COPD combined with ferroptosis gene set (A); GSEA of ferroptosis (B); analysis of key genes associated with COPD combined with ferroptosis (C); analysis of immune cells related to ferroptosis (D); differential analysis of immune cells related to ferroptosis (E); and correlation between ferroptosis and immune cells (F). The significance markers are shown as: ns, P>0.05; *P<0.05; **P<0.01; ***P<0.001.

Machine Learning Selection of COPD Characteristic Genes and Prediction of Diagnostic Models

To identify the characteristic genes of chronic obstructive pulmonary disease (COPD), we applied three machine learning algorithms to the core genes. The SVM-RFE algorithm identified 22 characteristic genes, namely, ABCB1, EGF, PTPN22, INS, NR4A1, ACE2, SPP1, LTA, POMC, TLR4, CD4, TP53, ACHE, CSF3, CARD11, PTGS2, CCR7, PLG, CD69, FOXP3, TLR7, and MMP3. The LASSO technique yielded 8 characteristic genes, namely, EGF, FOS, PLG, PTPN22, TLR7, SPP1, NR4A1, and MYC. The random forest method identified 20 characteristic genes, including EGF, POMC, PLG, IL1A, ABCB1, CD4, NR4A1, MYC, MAPK11, and PTPN22 (Figure 8A). The intersection of these methods yielded 4 characteristic genes shared by all three: EGF, PLG, PTPN22, and NR4A1 (Figure 8B). A coexpression heatmap of these genes revealed a positive correlation among EGF, PLG, and PTPN22, whereas NR4A1 was negatively correlated with them (Figure 8D). The construction of a diagnostic model based on these COPD characteristic genes, along with ROC curve analysis (Figure 8C and E), revealed an overall AUC of 0.947, indicating the high diagnostic value of these genes, particularly NR4A1, which has emerged as a pivotal gene. The expression of these four genes was significantly different between the COPD group and the healthy control group (P<0.05), with EGF, PLG, and PTPN22 being overexpressed in the COPD group, whereas NR4A1 was underexpressed. Validation in a separate dataset confirmed these four genes as potential diagnostic markers for COPD and targets for its treatment (Figure 8F and G).

Figure 8 Machine learning-based selection of characteristic genes (A); identification of characteristic genes (B); nomogram-based predictive model of characteristic genes (C); correlation of characteristic genes (D); ROC curve analysis of characteristic genes (E); expression of characteristic genes (F); and expression of characteristic genes in the validation set (G). The significance markers are shown as: *P<0.05; **P<0.01; ***P<0.001.

Single-Gene GSEA Analysis of Characteristic Genes and Their Connections with Specific Gene Sets

Single-gene GSEA of the four characteristic genes revealed that all four genes were predominantly associated with the Notch signaling pathway, mirroring the findings of previous differential gene enrichment analyses. Thus, it is plausible that these characteristic genes may influence the mechanisms of COPD through the Notch signaling pathway (Figure 9A). In the analysis of the expression of 50 specific gene sets between the COPD group and the healthy control group, differences were observed in the expression of 16 gene sets between the two groups. These include the REACTIVE OXYGEN SPECIES PATHWAY, PI3K AKT MTOR SIGNALING, NOTCH SIGNALING, WNT BETA CATENIN SIGNALING, and CHOLESTEROL HOMEOSTASIS, among others (Figure 9B). The relationships between the four characteristic genes and specific gene sets revealed that EGF, PLG, and PTPN22 were negatively correlated with NOTCH signaling, whereas NR4A1 was positively correlated with it (Figure 9C). This further corroborates the hypothesis that characteristic genes may exert their influence on COPD mechanisms through the Notch signaling pathway.

Figure 9 GSEA of characteristic genes (A); expression of gene sets characteristic of both groups (B); heatmap of the expression of characteristic genes in specific gene sets (C). The significance markers are shown as: ns and #P>0.05; *P<0.05; **P<0.01; ***P<0.001.

Prediction and Molecular Docking Validation of Traditional Chinese Medicine for COPD Core Genes

By utilizing the TCMSP database for the analysis of 37 core genes, targets without corresponding compounds were eliminated, resulting in 23 actual genes (Table 1) and 78 candidate compounds. This facilitated the construction of a “target-compound” network diagram (Figure 10A), with TP53, FOS, IL10, MMP3, MYC, FASLG, IL1A, EGF, INS, and ADIPOQ ranking high in degree. Further analysis through TCMSP identified 437 traditional Chinese medicines corresponding to the 78 compounds, leading to the creation of a “target-compound-traditional Chinese medicine” network diagram (Figure 10B). Prominent traditional Chinese medicines in terms of degree value include Ephedra, Ginkgo biloba leaf, Coriander, Perilla, Mulberry leaf, Chrysanthemum, Sea buckthorn, and Kudzu flower, with compounds MOL000069, MOL000098, MOL000675, MOL000008, MOL000511, MOL000006, MOL000415, MOL000879, MOL000223, and MOL000432 being noteworthy. The compounds MOL000069, MOL000098, and MOL000008, which have a high degree of value, were selected for molecular docking validation with TP53 and MMP3 (Figure 10C and Table 2).

Table 1 Actual Genes

Table 2 Candidate Compounds and Potential Traditional Chinese Medicines

Figure 10 Depicts the target-compound diagram (A); Illustrates the target-compound-traditional Chinese medicine diagram (B); Presents the molecular docking diagram (C).

Study of Mendelian Randomization for Immune Cells and Genes

Mendelian randomization revealed that 36 immune cell traits had a causal relationship with COPD, while the others did not. Basophil AC, CD16+ monocyte % monocyte, CD3− lymphocyte AC, and TCRgd AC were associated with a reduced risk of COPD, whereas CD27 on CD24+ CD27+, CD27 on sw mem, and CD62L on monocytes may increase the risk of COPD. The selected core genes showed no causal relationship with COPD, indicating that they neither increase nor decrease the risk of developing COPD (Figure 11, Tables 3 and 4).

Table 3 Relevant Genetic Data for the Core Genes

Table 4 MR Heterogeneity and Pleiotropy Analysis

Figure 11 Mendelian randomization of immune cells (A); Mendelian randomization of core genes (B).

Discussion

COPD is a complex process of environmental-genetic-host interactions that evolves over time, manifesting as systemic chronic inflammation.35 As time progresses, the narrowing and disappearance of small airways occur,36,37 with even the terminal and respiratory bronchioles being compromised and lost before the gradual decline in lung function becomes apparent in early-stage COPD.38 This leads to irreversible damage, fixed airflow obstruction, and chronic respiratory symptoms.2 With the field of COPD research advancing toward the early stages of the disease, there has also been a shift in the paradigm of COPD prevention and treatment toward prevention and early intervention. Advancing research into the biological mechanisms of COPD, the trajectory of disease progression, and early intervention can positively impact the prognosis of COPD patients, thereby aiming to reduce the prevalence and mortality rate of the disease. The majority of PRISm cases may develop COPD, and conditions such as chronic bronchitis and emphysema may represent stages in the progression of COPD. By intervening early in chronic bronchitis and emphysema, it may be possible to prevent the onset of COPD. TCM can treat COPD through various mechanisms, including inhibiting inflammatory responses, reducing oxidative stress,39 inhibiting cell apoptosis, improving airway remodeling,40 and regulating the gut microbiota.41 This approach can improve related symptoms, reduce side effects of medications, alleviate economic burdens, and enhance quality of life, especially for patients who have not been diagnosed with COPD, potentially halting its progression in advance. Therefore, the study of immune cells in COPD, potential biomarkers, and the targeted prediction of traditional Chinese medicine for the early diagnosis and intervention of COPD is highly important for reducing the prevalence of this disease.

The Notch signaling pathway plays a pivotal role in the development, differentiation, and regeneration of the lungs following injury;42 participates in the induction of goblet cells and mucosal secretion; and is associated with chronic inflammation.43 Notch signal transduction can control the balance between ciliated cells and secretory cells in COPD, thereby altering chronic inflammation in the airway epithelium,44 regulating alveolar morphology,45 affecting bronchial stem and progenitor cells,46 modulating macrophages,47 and inducing immune imbalance.48 The downregulation of Notch pathway gene expression is associated with smoking and COPD.49 Ferroptosis is related to COPD,50,51 especially age-related disease,52 and the mechanism of COPD is time dependent. Ferroptosis represents a novel target for treating inflammatory diseases,53 complementing inflammation, intervening in immune system regulation of cell death,54 alleviating airway inflammation,55 and regulating macrophage M2 polarization,56 among other aspects involved in the mechanisms of COPD. Morning symptoms are most common in COPD patients,57 with symptom variability over time related to fluctuations in lung function due to circadian rhythms.58 Circadian rhythms are linked to the pathophysiology of COPD,59,60 with symptoms worsening over time and affecting sleep.61 Circadian rhythms play a significant role in chronic inflammatory lung diseases through oxidative stress, inflammatory responses, and alterations in lung function.62

Screening identified 141 genes associated with the inflammatory response in COPD, and disease ontology (DO) enrichment analysis revealed their significant enrichment in pulmonary diseases such as COPD, obstructive lung disease, tuberculosis, and interstitial lung disease, further substantiating the substantial correlation of these 141 genes with COPD. Cytokines play a crucial role in coordinating and maintaining the chronic airway inflammation characteristic of COPD.63 Lipopolysaccharide, through modulation of the gut microbiome,64 can ameliorate COPD. Growth factors are vital for lung morphogenesis and are associated with increased susceptibility to COPD.65 The persistent inflammatory response forms the basis of COPD pathogenesis65 and is central to its progression;66 thus, inhibiting this inflammatory response is an effective method for treating COPD.67 The JAK-STAT signaling pathway activates cytokines in the inflammatory response,68 playing a significant role in COPD, while the IL-17 signaling pathway coordinates pulmonary immune defense in COPD.69 Further screening identified 37 core genes, validating their diagnostic capability for COPD. 

CD4 regulates T cells to control autoimmunity, thereby managing pulmonary inflammation in COPD.70,71 CD69 regulates various pulmonary inflammatory events and is an early activation marker of lymphocytes;72 increased levels of CD4 and CD69 in the blood can reduce airway obstruction in COPD patients.73 CSF3, a member of the IL-6 family, controls the production, differentiation, and function of granulocytes and is a survival and proliferation factor for macrophages and neutrophils; genetic variations in CSF3 are associated with lung function in smokers with COPD.74 IL11 is related to genetic susceptibility to COPD75 and represents a potential target for treating chronic lung diseases.76 MMP3 may serve as a serum marker for COPD77 and plays a crucial role in small airway remodeling.78 Chemokine receptors such as CCR2 and CCR6 participate in the regulation of COPD through chronic inflammation or bronchial remodeling.79 NOS1 is associated with the inflammatory response in COPD patients80 and can alleviate pulmonary hypertension in COPD patients.81 ACE2 is highly expressed in the blood of COPD patients.82 MicroRNAs (miRNAs) are intimately associated with COPD and are pivotal biomarkers for both diagnosis and prognosis. The potential of miRNAs as therapeutic targets for COPD treatment has a broad spectrum of application prospects, underscoring their significance in the medical field.83,84 The establishment of a core gene TF-miRNA‒mRNA network revealed that the transcription factor CREB1 could reduce susceptibility to COPD,85 hsa-miR-543 can regulate the progression of COPD by targeting IL33,86 and hsa-miR-181c87 plays a significant role in influencing the inflammatory response, neutrophil infiltration, reactive oxygen species production, and inflammatory factors in COPD. MiR-146a-5p, through the negative modulation of IL1A, can provoke the induction of IL8, thus leading to persistent inflammatory conditions in the pulmonary regions affected by COPD.88

Through immune cell infiltration analysis and screening, six characteristic immune cell types associated with COPD were identified: plasma cells, activated memory CD4+ T cells, activated gamma delta T cells, activated NK cells, M2 macrophages, and eosinophils. The presence of plasma cells is related to excessive mucus secretion in COPD patients.89 Activated memory CD4+ T cells play a role in immune protection of the lungs90 and participate in the inflammatory response in COPD.91 Gamma delta T cells serve as potential biomarkers for COPD.92 Activated NK cells regulate proinflammatory factors in lung injury93 and modulate the inflammatory response in both the upper and lower respiratory tracts.94 Activation of M2 macrophages in the alveoli can reduce the airway inflammatory response in COPD patients.94 Eosinophils have become potential therapeutic targets for COPD95 and are associated with airway inflammation; targeting eosinophils in COPD patients can achieve better therapeutic effects.96 Ferroptosis analysis revealed that TLR4 is associated with COPD97 and can regulate the airway inflammatory response in COPD patients.98 Inhibiting ferroptosis in bronchial epithelial cells through PTGS2 can improve airway remodeling in COPD.99 The repair and regeneration of alveolar epithelial cells are associated with ferroptosis,100 and immune cells can interact synergistically with ferroptosis.101 M2 macrophages can clear senescent red blood cells and recycle iron to participate in iron homeostasis while also triggering ferroptosis in macrophages, thereby limiting their immune activity.102

Using three machine learning techniques, four genes associated with the inflammatory response in COPD patients were identified: EGF, PLG, PTPN22, and NR4A1. NR4A1 acts as a negative regulator of signal transduction, modulating immune tolerance, iron homeostasis, and the inflammatory response of lung macrophages in COPD.103 A common pathological feature of COPD is excessive mucus secretion,104 with EGF being related to the expression of MUC5AC in the airway epithelium.105,106 PLG plays a crucial role in various inflammatory conditions,107 participating in cell migration and tissue remodeling.108 PTPN22 is involved in immune responses109,110 and various inflammatory reactions.111 Correlation analysis of specific gene sets revealed that both characteristic and core genes are significantly associated with the Notch signaling pathway. IL10, known for its broad-spectrum anti-inflammatory effects, is related to rapid FEV1 decline112 and the severity of COPD.113 FOS is involved in cell proliferation, death, and inflammation,114 particularly in the inflammatory response of COPD.115 The predicted compounds targeting the core genes included palmitic acid, quercetin, apigenin, and luteolin. In COPD patients, palmitic acid is increased116 and plays a role in delaying COPD exacerbation.117 Quercetin acts on COPD through anti-inflammatory and antioxidant mechanisms,118 alleviating the progression of COPD.119 Moreover, apigenin reduces the impact of senescent cells on COPD through its antioxidant and anti-inflammatory effects.120 The predicted traditional Chinese medicines included ephedra, mulberry leaf, chrysanthemum, sea buckthorn, and kudzu flower. 
Ephedra protects against airway and pulmonary inflammation in COPD,121 improving COPD by inhibiting inflammation and cell apoptosis through endoplasmic reticulum stress.122 Moreover, chrysanthemum has certain anti-inflammatory effects,123 and sea buckthorn can improve airway inflammation124 and can improve COPD through modulation of the gut microbiome.125
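
The screening with three machine learning techniques mentioned at the start of the preceding paragraph can be pictured as a consensus step. The sketch below assumes that each method returns its own candidate list and that the characteristic genes are those selected by all three; this section does not restate the methods or the selection rule, so both are assumptions. The per-method lists are hypothetical, and only the four shared genes come from the text.

```python
def consensus_genes(*selections):
    """Return the genes selected by every method, sorted alphabetically."""
    if not selections:
        return []
    common = set(selections[0])
    for genes in selections[1:]:
        common &= set(genes)  # keep only genes also chosen by this method
    return sorted(common)


if __name__ == "__main__":
    # Hypothetical outputs of three unnamed feature-selection methods.
    # The extra genes (IL10, FOS, CD69) are illustrative placeholders.
    method_a = ["EGF", "PLG", "PTPN22", "NR4A1", "IL10"]
    method_b = ["NR4A1", "EGF", "FOS", "PLG", "PTPN22"]
    method_c = ["PTPN22", "PLG", "CD69", "EGF", "NR4A1"]
    print(consensus_genes(method_a, method_b, method_c))
```

Taking the intersection is a conservative choice: a gene survives only if every method independently ranks it as informative, which trades sensitivity for robustness.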

Finally, through Mendelian randomization (MR), we explored the causal relationships between immune cells, genes, and COPD. We found evidence of a causal association between 36 immune cell phenotypes and COPD, with 22 showing a negative association and 14 a positive association. Reverse MR revealed no causal relationship. Chronic inflammation persists in patients with COPD.126 Furthermore, an increase in B cell content is observed in the blood of COPD patients,127 which correlates with the subgroups of TBNK lymphocytes,128 potentially influencing the remodeling of the bronchiolar walls129 and the reduction of terminal bronchioles.130 The increase in B cells is associated to some extent with small airway dysfunction and impaired pulmonary function.131 Regulatory T cells (Tregs) are linked to lung function,132 and modulating these cells may ameliorate pulmonary inflammation.133 Monocytes are pivotal drivers of pulmonary inflammation and tissue remodeling.134 Through MR analysis, we also investigated the causal relationship between core genes and COPD. No causal relationship was detected between them; this outcome does not conflict with the prior analyses, since those analyses assessed association rather than causation. This null result should be interpreted cautiously, as it may stem from differences among populations, from correlations with particular subtypes and stages of COPD, or from differences between databases. It is also conceivable that the core genes contribute indirectly, rather than directly, to the mechanisms driving the development of COPD. Consistent with our MR results, a survey of European populations revealed no significant correlation between IL1A and the rate of decline in lung function.135 
Further research suggests that this may be related to the dysfunction of COPD airway epithelial cells,136 which are major regulators of mucosal repair,137 affecting the secretion of mucins138 and serving as potential immunological markers of lung injury.139 Thus, IL1A might have an indirect causal relationship with COPD through pathways involving the regulation of airway epithelial cell function, mucosal repair, mucin secretion, and immune function. Additional studies have found that regulating TP53 can improve pulmonary inflammation140 and has therapeutic effects on chronic bronchitis,141 a stage in the development of COPD, thereby linking TP53 to COPD; MYC, a gene associated with allergen sensitization,142 may relate to COPD through its effects on eosinophils.143
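
To make the MR reasoning above concrete, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate from per-variant summary statistics. This section does not restate which MR estimator was used, so IVW is shown only as a commonly used choice and is an assumption here; the function name and the summary statistics in the example are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    Returns (beta, standard error, two-sided p-value), with the p-value
    taken from a normal approximation.
    """
    if not beta_exposure or not (
        len(beta_exposure) == len(beta_outcome) == len(se_outcome)
    ):
        raise ValueError("inputs must be non-empty and of equal length")
    # Weighted regression of outcome effects on exposure effects through the origin.
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    beta = num / den
    se = math.sqrt(1.0 / den)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, p


if __name__ == "__main__":
    # Hypothetical variant effects on an exposure (e.g. an immune cell
    # phenotype) and on the outcome (COPD), with outcome standard errors.
    bx = [0.12, 0.08, 0.15, 0.10]
    by = [0.030, 0.018, 0.041, 0.022]
    se = [0.010, 0.012, 0.015, 0.011]
    print(ivw_estimate(bx, by, se))
```

A null IVW result, as reported here for the core genes, indicates only that the genetic instruments provide no evidence of a direct causal effect; it does not rule out association or an indirect contribution.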

Conclusion

Through the integration of bioinformatics, machine learning, and Mendelian randomization analysis, we identified potential biological markers and core immune cells for COPD. This approach enabled the targeted prediction of compounds and traditional Chinese medicines related to COPD. By analyzing these medicines at the miRNA and gene levels, we have provided new insights into the intricate molecular mechanisms underlying the development of COPD. This study also lays a foundation for further advances in clinical drug development. Mendelian randomization analysis suggested causal relationships between certain immune cell phenotypes and COPD, whereas the core genes identified through our screening did not exhibit a causal relationship with COPD; this finding requires further validation.

Data Sharing Statement

Publicly available datasets were analyzed in this study. These data can be found in the GEO database and GWAS database.

Ethics Statement

This study has been reviewed by the Ethics Committee of the Affiliated Hospital of Changchun University of Chinese Medicine. The study uses publicly available databases whose data were collected with fully informed consent. In accordance with the requirements of China’s National Health Commission, research conducted using legally obtained public data, or data generated through observation without interfering with public behavior, may be exempt from ethical review. Therefore, this study does not require additional IRB approval.

Funding

The author(s) received no specific funding for this work.

Disclosure

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

References

1. Global Initiative for Chronic Obstructive Lung Disease (GOLD). Global strategy for the diagnosis, management and prevention of chronic obstructive lung disease (2024 Report). Available from: https://goldcopd.org/. Accessed September 19, 2024.

2. Labaki WW, Rosenberg SR. Chronic obstructive pulmonary disease. Ann Intern Med. 2020;173(3):ITC17–ITC32. doi:10.7326/AITC202008040

3. Divo MJ, Liu C, Polverino F, Castaldi PJ, Celli BR, Tesfaigzi Y. From pre-COPD to COPD: a Simple, Low cost and easy to IMplement (SLIM) risk calculator. Eur Respir J. 2023;62(3). doi:10.1183/13993003.00806-2023

4. Celli BR, Wedzicha JA. Update on clinical aspects of chronic obstructive pulmonary disease. N Engl J Med. 2019;381(13):1257–1266. doi:10.1056/NEJMra1900500

5. Meghji J, Mortimer K, Agusti A, et al. Improving lung health in low-income and middle-income countries: from challenges to solutions. Lancet. 2021;397(10277):928–940. doi:10.1016/S0140-6736(21)00458-X

6. Collaborators GBDCRD. Prevalence and attributable health burden of chronic respiratory diseases, 1990-2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet Respir Med. 2020;8(6):585–596. doi:10.1016/S2213-2600(20)30105-3

7. World Health Organization. Global health estimates: leading causes of death. Cause-specific mortality 2000–2019. Available from: https://www.who.int/data/gho/data/themes/mortality-and-global-health-estimates/ghe-leading-causes-of-death. Accessed September 19, 2024.

8. Adeloye D, Song P, Zhu Y, et al. Global, regional, and national prevalence of, and risk factors for, chronic obstructive pulmonary disease (COPD) in 2019: a systematic review and modelling analysis. Lancet Respir Med. 2022;10(5):447–458. doi:10.1016/S2213-2600(21)00511-7

9. Adeloye D, Chua S, Lee C, et al. Global and regional estimates of COPD prevalence: systematic review and meta-analysis. J Glob Health. 2015;5(2):020415. doi:10.7189/jogh.05.020415

10. Wang C, Xu J, Yang L, et al. Prevalence and risk factors of chronic obstructive pulmonary disease in China (the China Pulmonary Health [CPH] study): a national cross-sectional study. Lancet. 2018;391(10131):1706–1717. doi:10.1016/S0140-6736(18)30841-9

11. Christenson SA, Smith BM, Bafadhel M, Putcha N. Chronic obstructive pulmonary disease. Lancet. 2022;399(10342):2227–2242. doi:10.1016/S0140-6736(22)00470-6

12. Soriano JB, Polverino F, Cosio BG. What is early COPD and why is it important? Eur Respir J. 2018;52(6). doi:10.1183/13993003.01448-2018

13. Casas M, den Dekker HT, Kruithof CJ, et al. The effect of early growth patterns and lung function on the development of childhood asthma: a population based study. Thorax. 2018;73(12):1137–1145. doi:10.1136/thoraxjnl-2017-211216

14. Bui DS, Lodge CJ, Burgess JA, et al. Childhood predictors of lung function trajectories and future COPD risk: a prospective cohort study from the first to the sixth decade of life. Lancet Respir Med. 2018;6(7):535–544. doi:10.1016/S2213-2600(18)30100-0

15. Belgrave DCM, Granell R, Turner SW, et al. Lung function trajectories from pre-school age to adulthood and their associations with early life factors: a retrospective analysis of three population-based birth cohort studies. Lancet Respir Med. 2018;6(7):526–534. doi:10.1016/S2213-2600(18)30099-7

16. Martinez FJ, Han MK, Allinson JP, et al. At the root: defining and halting progression of early chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2018;197(12):1540–1551. doi:10.1164/rccm.201710-2028PP

17. Zhang J, Xu H, Qiao D, et al. A polygenic risk score and age of diagnosis of COPD. Eur Respir J. 2022;60(3). doi:10.1183/13993003.01954-2021

18. Gordon SB, Bruce NG, Grigg J, et al. Respiratory risks from household air pollution in low and middle income countries. Lancet Respir Med. 2014;2(10):823–860. doi:10.1016/S2213-2600(14)70168-7

19. Zhang PD, Zhang XR, Zhang A, et al. Associations of genetic risk and smoking with incident COPD. Eur Respir J. 2022;59(2). doi:10.1183/13993003.01320-2021

20. Cho MH, Hobbs BD, Silverman EK. Genetics of chronic obstructive pulmonary disease: understanding the pathobiology and heterogeneity of a complex disorder. Lancet Respir Med. 2022;10(5):485–496. doi:10.1016/S2213-2600(21)00510-5

21. Agusti A, Melen E, DeMeo DL, Breyer-Kohansal R, Faner R. Pathogenesis of chronic obstructive pulmonary disease: understanding t
