Proteomics and metabolomics profiling reveal panels of circulating diagnostic biomarkers and molecular subtypes in stable COPD

Clinical characteristics of participants

Blood samples were collected from 70 patients with advanced COPD and 70 healthy controls from the GIRD COPD Biobank (Table 1). COPD patients and healthy controls differed significantly in room ventilation, heart disease, and pulmonary function (P < 0.05), whereas sex, smoking, pack years, age, height, weight, BMI, kitchen fan use, preserved food consumption, cooking, other comorbidities, and family history did not differ significantly between the two groups (P > 0.05). Additional proteomics analysis was conducted using an independent validation set comprising COPD patients (n = 29) and controls (n = 31) (Table 1, validation stage). In this set, COPD patients had significantly lower mean height, weight, and pulmonary function than healthy individuals (Pmax = 0.001). Moreover, COPD patients, but not healthy controls, reported complications related to chronic respiratory disease (CRD) at admission (P = 0.02). Pulmonary function (except post-FVC_%Pred) and mean platelet volume (MPV) were significantly lower in COPD patients than in healthy individuals (Pmax = 0.025). In contrast, monocyte levels were significantly higher in COPD patients than in healthy subjects (P = 0.027) (Table 2). Next, we used the proteomics results to stratify COPD patients into 3 subgroups (Table 3): subtype I mainly comprised COPD without other respiratory diseases (simplex COPD, n = 19), subtype II largely comprised COPD co-existing with bronchiectasis (COPD-BE, n = 9), and subtype III largely comprised COPD co-existing with metabolic syndrome (COPD-MD, n = 12). Further analysis revealed that participants in the COPD-BE and COPD-MD groups had significantly lower room ventilation and a significantly lower proportion of COPD without chronic respiratory disease than the simplex COPD group (P = 0.002 and 0.014, respectively), while those in the COPD-BE group had significantly lower pre_FEV1_%Pred than those in the other groups (P = 0.025).

Table 2 Blood count of validation cohorts for targeted proteomics

Table 3 Characteristics of proteomics-driven subtype cohorts by using the discovery proteomics data

Proteomic profiles and functional alterations related to COPD

Serum samples were obtained from 40 patients with advanced COPD and 40 healthy controls for TMT-labeled proteomic analysis. The serum proteomic patterns of COPD patients were distinct from those of healthy controls. A total of 1432 proteins were identified and quantified. For quality control, the lengths and mass errors of peptides, as well as the coverage and sequence distribution of the proteins, were calculated (Additional file 1: Fig. S1A–D). In total, 251 differentially expressed proteins (DEPs) were identified between the two groups, of which 151 were significantly up-regulated and 100 significantly down-regulated (fold change ≥ 1.2 or ≤ 0.83 and P < 0.05) (Additional file 1: Fig. S1E). Moreover, 31.43% of these proteins were associated with the extracellular matrix, whereas 29.29% and 18.57% had functions in the cytoplasm and nucleus, respectively (Additional file 1: Fig. S1F). The DEPs were divided into Q1–Q4 according to fold change, and the heatmap of the enrichment analysis (GO and KEGG) is shown in Additional file 2: Fig. S2. In sum, the quality control analysis showed that the data were acquired with a high degree of consistency and reproducibility, and the predominance of up-regulated DEPs indicated a general activation of biological processes in COPD.
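
The DEP criterion stated above (fold change ≥ 1.2 or ≤ 0.83 together with P < 0.05) can be sketched as a simple filter. This is only an illustration under stated assumptions: the function name `classify_protein`, the protein labels, and the numeric rows are hypothetical and do not come from the study data.

```python
def classify_protein(fold_change, p_value, up=1.2, down=0.83, alpha=0.05):
    """Label a protein 'up', 'down', or 'ns' (not significant).

    Thresholds default to the cut-offs quoted in the text; the function
    itself is a hypothetical helper, not the authors' pipeline.
    """
    if p_value >= alpha:
        return "ns"
    if fold_change >= up:
        return "up"
    if fold_change <= down:
        return "down"
    return "ns"


# Hypothetical (protein, fold change, P value) rows, for illustration only.
example_rows = [
    ("PROT_A", 1.45, 0.003),  # large increase, significant
    ("PROT_B", 0.70, 0.010),  # large decrease, significant
    ("PROT_C", 1.10, 0.001),  # significant but below the fold-change cut-off
    ("PROT_D", 1.60, 0.200),  # large increase but not significant
]

labels = {name: classify_protein(fc, p) for name, fc, p in example_rows}
print(labels)
```

Both conditions must hold at once, so a protein with a large fold change but P ≥ 0.05, or a small P value but a modest fold change, is not counted as a DEP.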

The detailed data processing protocols for COPD and healthy controls are shown in Fig. 2A. In total, 251 dysregulated DEPs were identified between the two groups (Fig. 2B). In terms of biological processes, these proteins were mainly involved in immune response, myeloid leukocyte activation, neutrophil mediated immunity, granulocyte activation, platelet activation, and homotypic cell–cell adhesion. In terms of molecular functions, the proteins were primarily involved in seven categories, namely identical protein binding, structural molecule activity, cadherin binding, oxidoreductase activity, structural constituent of cytoskeleton, tetrapyrrole binding, and heme binding. Most of these proteins were located in the vesicle lumen and secretory granules (Fig. 2C). KEGG pathway analyses revealed that these DEPs were significantly enriched in carbon metabolism and glycolysis/gluconeogenesis. Heatmap analyses showed higher antioxidant activity and activated glycosaminoglycan binding in COPD compared to healthy controls (Fig. 2D). Among these DEPs, the final dysregulated proteins were selected according to differential significance levels (Fig. 2E) and validated by targeted proteomics, including ORM1, HP, HBB, VCL, TPIA, HBA1, CA2, SOD1, FGA, PRDX2, CDH5, ALDOA, CA1, TNC, CAT, and LRG1 (Fig. 2F). In total, 16 DEPs associated with COPD relative to healthy controls were selected.

Fig. 2

Proteomic profiles and functional alterations related to COPD. A Data processing. B Venn plot showing identification of the COPD-specific proteins in COPD vs healthy controls. C Gene Ontology annotation and KEGG enrichment analysis of differentially expressed proteins (DEPs). D Heatmap showing the DEPs. In the heatmap, red denotes higher expression and the second color denotes lower expression. E The final selected dysregulated proteins. F Protein validation via targeted proteomics (PRM)

Protein validation via targeted proteomics (PRM)

The COPD-related proteomics and functional alteration results from the discovery study were then used to develop protein marker panels for accurate prediction of COPD severity. To this end, we ranked members of the upregulated functional groups by lowest P value and largest fold change. Finally, 16 DEPs with confident quantitation data were validated in an additional cohort comprising 29 COPD patients and 31 healthy controls (Additional file 6: Table S1). Considering the challenge of quantifying dozens of protein candidates in parallel, we employed a medium-throughput mass spectrometry-based approach, Parallel Reaction Monitoring (PRM), to analyze 176 tryptic peptides. This targeted proteomic approach detected 16 protein candidates with robust signal across the validation set. The trends of the marked proteins in COPD samples corroborated the results from the discovery study (Fig. 2F). To sum up, 5 significantly dysregulated proteins were finally selected after validation via targeted proteomics: ORM1, HP, HBB, VCL, and CDH5.

Proteomic subtypes of COPD and their association with clinical outcomes

Consensus clustering based on the 107 most variable proteins in COPD (each disease sample normalized to healthy controls; SD > 0.5) identified three proteomic subtypes (Fig. 3A). They were designated as subtype I (simplex COPD, n = 19), subtype II (COPD-BE, n = 9), and subtype III (COPD-MD, n = 12). The resulting heatmap revealed that, in subtype I, the DEPs were significantly enriched in metabolic pathways and complement and coagulation cascades. Moreover, 5 highly expressed proteins were identified in this subtype: B4GAT1, GNPTG, ADAMTSL4, CFP, and EXTL2. Subtype II was enriched in metabolic pathways, biosynthesis of antibiotics, carbon metabolism, biosynthesis of amino acids, and glycolysis/gluconeogenesis, and the proteins involved included SOD1, PRDX2, CAT, PRDX6, HBB, GSTO1, and HBA1. In subtype III, the complement and coagulation cascades were significantly enriched, and the proteins involved were HP, LBP, SERPINA1, SERPINA3, SAA1, CRP, ORM1, and ORM2. GO enrichment analysis was performed to annotate the putative functional implications of the grouped DEPs (Fig. 3B–D, F).

Fig. 3

Proteomic subtypes of COPD and their association with clinical outcomes. A Data processing. B Heatmap showing the DEPs among the 3 COPD subtypes. Proteome-based stratification of COPD revealed three subtypes (subtype I–III) related to different clinical outcomes and molecular features: subtype I were patients with simplex COPD, subtype II were COPD mainly co-existing with bronchiectasis, and subtype III were COPD largely co-existing with metabolic syndrome. In the heatmap, red denotes higher expression and the second color denotes lower expression. C Enriched pathways for the dysregulated proteins. D Gene Ontology annotation and KEGG enrichment analysis of DEPs among the 3 COPD subtypes. E A three-dimensional plot from Principal Component Analysis (PCA) showing the configuration of indexes on COPD and its co-morbidities. F The final selected dysregulated proteins. G ROC analysis of PCA and the combination of (RRM1, SUPV3L1, KRT78). H The corresponding information on blood tests

A three-dimensional PCA plot showing the configuration of indexes on COPD and COPD with co-morbidities is presented in Fig. 3E. Individual component scores were plotted for the first (PC1), second (PC2), and third (PC3) principal components. PC1, PC2, and PC3 showed clear separation of COPD from the COPD subtypes, and together they explained 58.4% of the total variance. Based on the selected protein panel, ROC analysis was performed for the PCA and for the combination of RRM1 + SUPV3L1 + KRT78 (Fig. 3G), yielding auROC values of 0.95 and 0.96, respectively. There was no significant difference between the PCA and the combination of RRM1 + SUPV3L1 + KRT78 (P > 0.05). In addition, basophil count was able to distinguish COPD from COPD-BE or COPD-MD, while white blood cell count and neutrophil ratio were able to distinguish COPD from COPD-BE, as well as COPD-BE from COPD-MD (Fig. 3H).
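
For readers less familiar with the auROC values reported here, the quantity can be read as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case. The sketch below computes it by direct pair counting; the function name `auroc` and the score lists are hypothetical, and the source does not state which software was used for its ROC analysis.

```python
def auroc(scores_pos, scores_neg):
    """Area under the ROC curve by pair counting.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties contribute 0.5.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, for illustration only.
pos = [0.9, 0.8, 0.4]  # samples that truly belong to the target group
neg = [0.7, 0.3, 0.2]  # samples that do not

auc = auroc(pos, neg)
print(round(auc, 3))
```

An auROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.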

In sum, COPD was divided into three subtypes that corresponded to distinct clinical outcomes. We also found that both the PCA and the combination of RRM1 + SUPV3L1 + KRT78 could effectively differentiate COPD from COPD with co-morbidities.

Metabolomic profiles and functional alterations associated with COPD

A library of known metabolite standards (APPLIED PROTEINS TECHNOLOGY Co. Ltd) was employed to identify 210 differentially expressed metabolites (DEMs) in COPD compared to healthy controls. In addition, quality control analyses were carried out based on correlation distributions for the metabolites in total and separately (or by group). EBAM plots, normalization, PLS-DA, and t tests were performed (Additional file 3: Fig. S3A, B). Results indicated that PLS-DA produced a model that could separate positive and negative metabolites. Heatmaps depicting clustering of total and selected metabolites in positive and negative modes, respectively, are shown in Fig. 4A, B. Notably, 44 differentially expressed metabolites between the two groups were identified, among which 15 and 29 were positive and negative metabolites, respectively. The selected metabolites were displayed on VIP and volcano plots; they were palmitoylethanolamide, trans-dehydroandrosterone, decanoyl-l-carnitine, betaine, pseudouridine, camphor, 1-stearoyl-2-hydroxy-sn-glycero, hypoxanthine, theophylline, l-isoleucine, pregnenolone, androsterone sulfate, azelaic acid, sunitinib, and bisindolymalemide1 (Fig. 4C, D). Using weighted gene co-expression network analysis (WGCNA) as a complementary approach, we identified several co-expression modules (Additional file 5: Fig. S5). Betaine, in the MEcyan module, was found to be significantly associated with COPD. In summary, 8 positive and 6 negative metabolites were selected by metabolomic analysis.

Fig. 4

Metabolomic profiles and functional alterations associated with COPD. A Heatmaps depicting clustering of total and selected metabolites in positive mode. B Heatmaps depicting clustering of total and selected metabolites in negative mode. C The selected positive metabolites depicted using variable importance in the projection (VIP) values and volcano plots. D The selected negative metabolites depicted using VIP values and volcano plots

Integrated analyses of proteomics and metabolomics data

Correlation analysis

After appropriate sample quality control (QC) and normalization procedures, we performed PCA on the proteomics and metabolomics data. All datasets effectively distinguished COPD from healthy controls, with the best separation observed for the combined proteomics and metabolomics analysis (Fig. 5E). A considerable number of proteins and metabolites were jointly involved in mineral absorption, proximal tubule bicarbonate reclamation, inflammatory mediator regulation, lysosome, neuroactive ligand-receptor interaction, cAMP signaling pathway, biosynthesis of amino acids, purine metabolism, fructose and mannose metabolism, and glycolysis/gluconeogenesis (Fig. 5C). Using a P value of 0.05 as the cutoff, the significantly enriched pathways across both the proteomics and metabolomics data are shown in Fig. 5A, B. Heatmap analyses of the differentially expressed proteins and metabolites identified relatively strong and relatively weak protein-metabolite correlations, which are detailed in Fig. 5F. The final DEPs and DEMs were selected as the target proteins and metabolites. To analyze the interactions between them, the network between DEPs and DEMs was constructed with Cytoscape, and the results are detailed in Fig. 5G. In sum, enrichment analyses of DEPs and DEMs were performed to investigate potential correlations between them, and the results showed that the strength of correlation varied across protein-metabolite pairs.
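
A protein-metabolite correlation table of the kind summarized in a heatmap can be sketched as follows. This is a minimal illustration that assumes Pearson correlation; the source does not state which correlation coefficient was used, and the feature names and abundance values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical abundances across the same four samples (illustration only).
proteins = {
    "PROT_A": [1.0, 2.0, 3.0, 4.0],
    "PROT_B": [4.0, 3.0, 2.0, 1.0],
}
metabolites = {
    "MET_X": [2.0, 4.0, 6.0, 8.0],
    "MET_Y": [1.0, 3.0, 2.0, 4.0],
}

# One coefficient per (protein, metabolite) pair: the cells of the heatmap.
corr = {
    (p, m): pearson(p_vals, m_vals)
    for p, p_vals in proteins.items()
    for m, m_vals in metabolites.items()
}
for pair, r in corr.items():
    print(pair, round(r, 2))
```

Pairs with coefficients near +1 or -1 would appear as strong correlations in such a heatmap, and pairs near 0 as weak ones.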

Fig. 5

Integrated analyses of proteomics and metabolomics data. A KEGG enrichment analysis of differentially expressed proteins (DEPs) and differentially expressed metabolites (DEMs). B The number of DEPs and DEMs. C Number of proteins and metabolites jointly involved in one pathway. D Validation study of the final predicted model on mild-to-moderate COPD patients. E PCA analysis of proteomics and/or metabolomics data. F Heatmap analyses of DEPs and DEMs identifying relatively strong or weak protein-metabolite correlations. G The network analysis between DEPs and DEMs. H Establishment of predictive panels for COPD (single and combined biomarkers analysis)

Establishment of diagnostic panels for COPD (diagnostic efficacy of single biomarkers)

Before the biomarkers were integrated, the profile of each biomarker was first analyzed separately (Fig. 5Ha, Hb). Subsequently, ROC models were applied to calculate the auROC, specificity, and sensitivity of single biomarkers. The calculations were performed using the following formulas: %sensitivity = [true-positive/(true-positive + false-negative)] * 100; %specificity = [true-negative/(true-negative + false-positive)] * 100. Thereafter, 7 positive and 7 negative metabolites, alongside 6 proteins that had shown significant changes in COPD patients (Pmax = 0.029), were individually subjected to ROC analysis to evaluate their sensitivity and specificity in discriminating COPD from healthy controls. As shown in Additional file 7: Table S2, among the positive metabolites palmitoylethanolamide had the maximal auROC of 78.0%, with a sensitivity and specificity of 68.0% and 72.0%, respectively. Among the negative metabolites, 1-stearoyl-sn-glycerol had an auROC of 78.0% and a sensitivity of 71.0% against controls. For the proteomics data, CDH5 recorded the maximum auROC of 85.0%, with a sensitivity and specificity of 80.0% and 78.0%, respectively. Results from routine blood tests showed that MPV had an auROC of 64.0%, with a sensitivity and specificity of 59.0% and 53.0%, respectively, while monocytes recorded an auROC of 68.0%, with a sensitivity of 62.0% and a specificity of 63.0%. In sum, the diagnostic efficacy of single biomarkers was established for metabolites, proteins, and routine blood tests. The results indicated that palmitoylethanolamide, 1-stearoyl-sn-glycerol, and CDH5 had the highest auROC values among positive metabolites, negative metabolites, and proteins, respectively.
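
The sensitivity and specificity formulas quoted above translate directly into code. The sketch below applies them to hypothetical labels (1 = COPD, 0 = control); the function name and the example vectors are illustrative assumptions, not study data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (%sensitivity, %specificity) from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn) * 100
    specificity = tn / (tn + fp) * 100
    return sensitivity, specificity


# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

In this toy example 3 of 4 cases and 4 of 5 controls are classified correctly, giving 75% sensitivity and 80% specificity.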

Diagnostic capability of combined biomarkers

The data in Additional file 8: Table S3 and Fig. 5H show that a combination of 4 positive metabolites, namely palmitoylethanolamide, trans-dehydroandrosterone, decanoyl-l-carnitine, and betaine, achieved an auROC of 91.0%, with a sensitivity and specificity of 83.0% and 85.0%, respectively. When 4 negative metabolites (theophylline, l-isoleucine, 1-stearoyl-sn-glycerol, and hypoxanthine) were combined, an auROC of 95.9% was obtained, with a sensitivity of 90.0% and a specificity of 90.0%. Combining scores from 3 positive metabolites (palmitoylethanolamide, decanoyl-l-carnitine, and betaine) with those from 2 negative ones (theophylline and hypoxanthine) resulted in an ROC curve with an auROC of 97.0%, a sensitivity of 88.0%, and a specificity of 93.0%. A logistic model built on the observed differential abundance of these 5 markers, dubbed diagnostic P5, predicted advanced COPD as follows:

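$$\log_{e}\frac{P(Y = 1)}{P(Y = 0)} = -14.645 + \left( \text{palmitoylethanolamide} + 1.41 \times \text{decanoyl-l-carnitine} - 4.83 \times \text{betaine} + 0.15 \times \text{theophylline} + 1.17 \times \text{hypoxanthine} \right)/10000.$$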

Using this P5 score, participants with advanced COPD could be distinguished with high sensitivity and specificity, and the auROC reached 0.97 in our data set (Fig. 5Hc).

Combining scores from all proteins resulted in an auROC of 93.6%, with a sensitivity and specificity of 88.0 and 90.0%, respectively (Fig. 5Hd). The 3-protein (ORM1, CDH5, and PRDX2) based logistic model generated a dichotomous score, dubbed diagnostic P3, which allowed classification of each participant. The relationship between the probability score of a participant being positively diagnosed with advanced COPD and the log2 intensity value of each protein marker was defined as follows:

$$\log_{e}\frac{P(Y = 1)}{P(Y = 0)} = -10.323 + 2.354 \times \text{ORM1} + 6.834 \times \text{CDH5} + 1.694 \times \text{PRDX2}.$$
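
To show how a logistic score of this form turns into a probability, the sketch below evaluates the P3 model with the intercept (−10.323) and coefficients (2.354, 6.834, 1.694) given above, assuming they apply to ORM1, CDH5, and PRDX2 in the order the proteins are listed in the text. The input values are hypothetical placeholders; the source does not specify how intensities were scaled before fitting, so the printed numbers are purely illustrative.

```python
import math

# Intercept and coefficients as reported for the P3 model; the mapping of
# coefficients to proteins follows the order given in the text (assumption).
P3_INTERCEPT = -10.323
P3_COEFFS = {"ORM1": 2.354, "CDH5": 6.834, "PRDX2": 1.694}


def p3_logit(values):
    """Linear predictor (log-odds) of the P3 model for one participant."""
    return P3_INTERCEPT + sum(coef * values[name] for name, coef in P3_COEFFS.items())


def p3_probability(values):
    """Convert the log-odds into a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-p3_logit(values)))


# Hypothetical marker values for one participant (illustration only).
participant = {"ORM1": 1.0, "CDH5": 1.0, "PRDX2": 1.0}

logit = p3_logit(participant)
prob = p3_probability(participant)
print(round(logit, 3), round(prob, 3))
```

A dichotomous call then follows from comparing the probability with a chosen cut-point, for example 0.5 or a threshold selected from the ROC curve.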

Combining scores from 3 metabolites and that of 1 protein resulted in a high auROC value of 98.0%, with a sensitivity of 94.0%, and specificity of 95.0% (Additional file 9: Table S4). The final logistic model, dubbed P4, comprised palmitoylethanolamide, theophylline, hypoxanthine, and CDH5 (all auROCmin > 0.724), and was expressed as follows:

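$$\log_{e}\frac{P(Y = 1)}{P(Y = 0)} = -17.934 + \left( \text{palmitoylethanolamide} + 0.13 \times \text{theophylline} + 0.77 \times \text{hypoxanthine} \right)/10000 + 8.340 \times \text{CDH5}.$$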

The scores from the P4 model had significantly higher power than scores from the other models in predicting advanced COPD. The sensitivity, specificity, and auROC of P4 for COPD prediction were the highest among the models (Fig. 5He). The highest Youden index (0.835), which reflects the model's ability to correctly identify patients with advanced COPD, was achieved at the cut-point. Taken together, results from the logistic models indicated that a combination of palmitoylethanolamide, theophylline, hypoxanthine, and CDH5 was the best signature of serum biomarkers for predicting advanced COPD.
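
The Youden index mentioned here is defined as J = sensitivity + specificity − 1, and the optimal cut-point is the threshold that maximizes J. A minimal sketch of that search is given below; the function name, scores, and labels are hypothetical, and the source does not describe the software it used to locate its cut-point.

```python
def youden_best_cutoff(scores, labels):
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    A sample is called positive when its score is >= the cutoff.
    Each observed score is tried as a candidate cutoff.
    """
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, label in zip(scores, labels) if label == 1 and s >= cut)
        tn = sum(1 for s, label in zip(scores, labels) if label == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical model scores and true labels (1 = case, 0 = control).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

cutoff, j_stat = youden_best_cutoff(scores, labels)
print(cutoff, j_stat)
```

J ranges from 0 for an uninformative test to 1 for a test with perfect sensitivity and specificity.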

Validation study of the final predicted model

The final predictors were further verified in mild-to-moderate COPD patients and healthy controls. CDH5 expression was not significantly different between COPD patients and controls (Fig. 5D). The clinical and demographic characteristics of participants are presented in Table 4. A total of 30 patients with COPD and 30 healthy controls were enrolled. Of note, there were statistically significant differences between COPD patients and controls in terms of pack_years, age, heart disease, and pulmonary function (P < 0.05), but no significant difference was found between the two groups in terms of sex, smoking, BMI, respiratory symptoms, chronic respiratory diseases, poison exposure, room ventilation, cooking, other comorbidities, and family history of cancers (P > 0.05). Among the metabolites, theophylline was not significantly different between the two groups, but hypoxanthine showed significant differences in the validation cohort (data missing for palmitoylethanolamide) (Fig. 5D). The detailed clinical and demographic characteristics of the participants were described in the previous study [36].

Table 4 Characteristics of validation cohorts on mild-to-moderate COPD for CDH5

In sum, theophylline and CDH5 were not significantly different between mild-to-moderate COPD patients and healthy controls.
