Controlling the confounding effect of metabolic gene expression to identify actual metabolite targets in microsatellite instability cancers

Characteristics of cancer cell lines and matching

Metabolite and gene expression data for 902 cancer cell lines were identified in the DepMap database. Of these cell lines, 827 were MSS and 75 were MSI. We matched MSI to MSS cell lines at a 1:3 ratio, giving 75 MSI and 225 MSS cell lines (Additional file 4: Table S1). Matching was based on APC mutation status, TP53 mutation status, and cancer cell lineage: gastrointestinal (GI), breast, gynecologic (GYN), hematologic (Hema), genitourinary (GU), and other. Additional file 4: Table S1 compares the matched MSI and MSS cell lines. APC mutations were present in 25.3% of MSI and 20.4% of MSS cell lines. TP53 mutation frequency and lineage distribution did not differ between the two groups: TP53 mutations were present in 32% of both MSI and MSS cell lines, and GI cancers accounted for 33.3%, breast and gynecologic cancers for 34.7%, and hematologic cancers for 17.3% of cell lines. CATCH model analysis was then applied to these 300 cell lines to adjust the metabolite data, with gene expression data as covariates.
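The 1:3 matching described above can be illustrated with a minimal sketch. The text does not specify the matching algorithm, so exact matching within strata of lineage, APC status, and TP53 status is an assumption made here for illustration only, and the record keys (`id`, `lineage`, `apc`, `tp53`) are hypothetical names.

```python
import random
from collections import defaultdict


def match_one_to_three(msi, mss, ratio=3, seed=0):
    """Pair each MSI cell line with `ratio` MSS cell lines from the same stratum.

    Assumption (not from the source): a stratum is the exact combination of
    lineage, APC status, and TP53 status. Each cell line is a dict with the
    hypothetical keys 'id', 'lineage', 'apc', and 'tp53'.
    """
    rng = random.Random(seed)

    # Group the MSS candidates by stratum and shuffle within each stratum
    # so that the selection is random but reproducible.
    pool = defaultdict(list)
    for line in mss:
        pool[(line["lineage"], line["apc"], line["tp53"])].append(line)
    for candidates in pool.values():
        rng.shuffle(candidates)

    matches = {}
    for line in msi:
        key = (line["lineage"], line["apc"], line["tp53"])
        candidates = pool[key]
        if len(candidates) < ratio:
            raise ValueError(f"not enough MSS cell lines in stratum {key}")
        # Draw without replacement so no MSS line is matched twice.
        matches[line["id"]] = [candidates.pop()["id"] for _ in range(ratio)]
    return matches
```

Applied to 75 MSI cell lines, a procedure of this kind would return 225 distinct MSS cell lines whenever every stratum contains enough candidates. A less strict rule (for example, matching on lineage and TP53 only) would be needed where strata are sparse.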

Identification of adjusted metabolite features affecting MSI cancer status by the CATCH model

To predict MSI cancer status with the CATCH method, the 225 metabolite measurements were arranged as \(15\times 15\) tensor data, and 87 metabolic genes were treated as confounding covariates (Fig. 1A). The 87 metabolic genes were selected from the four major metabolic pathways associated with the 225 metabolites (Additional file 5: Table S2 and Additional file 6: Table S3), namely, the amino acid, carbohydrate, lipid, and nucleotide metabolic pathways [31]. The variable selection algorithm in the CATCH model selected the eight most significant adjusted metabolite features for predicting MSI cancer status (Additional file 7: Table S4 and Fig. 2). These adjusted metabolite features distinguished MSI from MSS cancers. Their direct effects on MSI cancer status, given by the coefficient \(B\) in the CATCH model, ranged from −0.17 to 0.56 (Table 1).

Fig. 2

Adjusted metabolite features with confounding covariates in MSI and MSS cancers. The heatmap illustrates the relationship between adjusted metabolite features and microsatellite instability (MSI)/microsatellite stable (MSS) cancer status. The Y-axis displays the adjusted metabolite features and their levels from the CATCH model. Eight adjusted metabolite features were found to distinguish between MSI and MSS cancers. The X-axis displays the cancer cell lineages (gastrointestinal (GI), breast, gynecologic (GYN), hematologic (Hema), genitourinary (GU), and others), MSI/MSS cancer status, and APC and TP53 mutation status (mutation/wild type: M/W). Positive values for adjusted metabolite features indicate a stronger association with MSI cancers, whereas negative values indicate a stronger association with MSS cancers. The eight metabolite features were Hippurate, 3-phosphoglycerate, cholesterol ester (CE, C14:0), lysophosphatidylethanolamine (LPE, C18:0), 6-phosphogluconate, phosphatidylcholine (PC, C36:1), reduced glutathione (GSH), and sarcosine

Table 1 Direct effect of adjusted metabolite features

Positive coefficient values implied that a metabolite feature was more relevant to MSI cancer status. Seven adjusted metabolite features had positive coefficients and were therefore associated with MSI cancer: 3-phosphoglycerate, cholesterol ester (CE, C14:0), lysophosphatidylethanolamine (LPE, C18:0), 6-phosphogluconate, phosphatidylcholine (PC, C36:1), reduced glutathione (GSH), and sarcosine.

3-phosphoglycerate is a glycolytic intermediate related to cellular energy production and is a source of serine and sarcosine. The oncometabolite sarcosine has been associated with invasive prostate cancer cells [32]. 6-phosphogluconate affects nucleotide metabolism, which aids cell growth. CE, LPE, and PC are related to lipid metabolism in cancer [33]. Glutathione is associated with the survival of cancer cells through reactive oxygen species (ROS) mechanisms, and clinical studies have linked glutathione to chemotherapy resistance [34].

If the coefficient value was negative, then the metabolite feature exhibited greater relevance to MSS cancer. One metabolite feature, Hippurate, was associated with MSS cancer. Based on the CATCH model, we demonstrated the direct effect of adjusted metabolites on the prediction of MSI cancer status.

Performance of the CATCH model

Using metabolomic and genomic data, we compared the performance of the CATCH model with that of random forest (RF), a widely used machine-learning classification algorithm. To evaluate performance, we randomly split the entire dataset into training and testing datasets, maintaining the 1:3 ratio of MSI to MSS cancer cell lines in both. The training dataset contained 90% of the data and the testing dataset contained 10%. The splitting process was repeated for 100 iterations, and the performance metrics were averaged. Table 2 shows the performance of the RF and CATCH models. The CATCH model had an accuracy of 0.82, sensitivity of 0.66, specificity of 0.88, precision of 0.65, and F1 score of 0.65. For RF, the metabolite and gene expression data were used simultaneously to predict MSI cancer status. The RF model had an accuracy of 0.77, sensitivity of 0.10, specificity of 0.99, precision of 0.81, and F1 score of 0.26. The CATCH model therefore outperformed the RF model in accuracy, sensitivity, and F1 score, whereas the RF model had higher specificity and precision.
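The evaluation protocol in this paragraph (a class-stratified 90/10 split followed by confusion-matrix metrics) can be sketched as follows. This is an illustrative outline rather than the code used in the study: the function names are hypothetical, labels are assumed to be coded 1 for MSI and 0 for MSS, and the classifier itself is left out.

```python
import random


def stratified_split(labels, test_fraction=0.1, seed=0):
    """Return (train_idx, test_idx), sampling each class separately so the
    MSI:MSS ratio is approximately preserved in both parts."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, int(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity, precision, and F1 from labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Report 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

Repeating the split with 100 different seeds and averaging the returned values gives the kind of summary described above. As a consistency check, with a 1:3 class ratio the accuracy equals 0.25 × sensitivity + 0.75 × specificity, which gives about 0.825 for the CATCH figures and about 0.77 for the RF figures, in line with the reported accuracies.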

Table 2 Performance of CATCH and random forest models

The significance of metabolite data with or without adjustment

To better understand the confounding effects of gene expression covariates on metabolite features, we compared the significance of each feature between the non-adjusted and CATCH-adjusted metabolite data. Additional file 2: Fig. S2 displays boxplots comparing the non-adjusted, standardized, and CATCH-adjusted metabolite data between MSI and MSS cancers. After accounting for the confounding covariates of metabolic genes, all eight adjusted metabolite features were significantly associated with MSI or MSS cancers (p < 0.05, Additional file 2: Fig. S2). Three metabolite features, namely, 3-phosphoglycerate (non-adjusted and standardized, p = 0.855), LPE (C18:0) (non-adjusted and standardized, p = 0.056), and GSH (non-adjusted and standardized, p = 0.25), were not significantly associated with MSI or MSS cancers before adjustment but were significantly associated with MSI cancers after adjustment (p < 0.001). The other five features, Hippurate, CE (C14:0), 6-phosphogluconate, PC (C36:1), and sarcosine, were already significantly associated with MSI or MSS cancers in the non-adjusted and standardized data (p < 0.05).

Hippurate, for example, had a higher level in MSS cancers (non-adjusted and standardized, p = 0.026) (Fig. 3A, 3B). After adjustment, it was more significantly associated with MSS cancers (CATCH-adjusted, p < 0.001) (Fig. 3C). Without adjustment for metabolic gene expression, the level of 6-phosphogluconate was negatively associated with MSI cancers (non-adjusted and standardized, p = 0.008) (Fig. 3D, 3E). In contrast, it was positively associated with MSI cancers once the confounding effect of metabolic gene expression was removed (CATCH-adjusted, p < 0.001) (Fig. 3F). Sarcosine had a higher level in MSI cancers (non-adjusted and standardized, p = 0.001) (Fig. 3G, 3H). After adjustment for metabolic gene expression, sarcosine was more significantly associated with MSI cancers (CATCH-adjusted, p < 0.001) (Fig. 3I).
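The reversal reported for 6-phosphogluconate can look counterintuitive, so the toy sketch below shows how removing a confounder can change the sign of a group difference. It is not the CATCH model: it uses ordinary least-squares residualization on a single covariate, and the numbers are synthetic values chosen for illustration, not data from the study.

```python
from statistics import mean


def residualize(y, x):
    """Residuals of y after an ordinary least-squares fit on one covariate x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def group_difference(values, is_msi):
    """Mean of the MSI group minus mean of the MSS group."""
    msi = [v for v, g in zip(values, is_msi) if g]
    mss = [v for v, g in zip(values, is_msi) if not g]
    return mean(msi) - mean(mss)


# Synthetic example: the metabolite rises with gene expression, MSI lines
# have lower expression, and MSI status itself adds +1 to the metabolite.
is_msi = [False, False, False, True, True, True]
gene = [4.0, 5.0, 6.0, 1.0, 2.0, 3.0]
metabolite = [g + (1.0 if s else 0.0) for g, s in zip(gene, is_msi)]

raw_diff = group_difference(metabolite, is_msi)                           # negative
adjusted_diff = group_difference(residualize(metabolite, gene), is_msi)  # positive
```

In this toy setting the unadjusted metabolite looks lower in MSI lines only because their gene expression is lower. Once the expression effect is removed, the positive contribution of MSI status becomes visible, which mirrors the direction of change described for 6-phosphogluconate.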

Fig. 3

CATCH model-adjusted versus non-adjusted metabolite data in MSI and MSS cancers. Boxplots compare the non-adjusted, standardized, and CATCH-adjusted metabolite levels, with p values, between MSI and MSS cancers. Boxplots of non-adjusted, standardized, and CATCH-adjusted Hippurate levels and p values are shown in A, B, and C, respectively. Boxplots of non-adjusted, standardized, and CATCH-adjusted 6-phosphogluconate levels and p values are shown in D, E, and F, respectively. Boxplots of non-adjusted, standardized, and CATCH-adjusted sarcosine levels and p values are shown in G, H, and I, respectively

The relationship between adjusted metabolite features and metabolic genes

We quantified the relationship between adjusted metabolite levels and metabolic gene expression in cancer cell lines to identify potential metabolic pathways in MSI cancers. The α coefficients are listed in Additional file 8: Table S5. Figure 4 presents a heatmap of the eight adjusted metabolite features and 87 metabolic genes. Table 3 lists, for each of the eight adjusted metabolite features, the metabolic genes in the same metabolic pathway. Hippurate was associated with the expression of QDPR, FAH, PAOX, MPST, and SLC7A5, which are involved in the metabolism of amino acids and their derivatives. 3-phosphoglycerate was associated with the expression of ST3GAL2, PFKP, HS3ST1, HPSE, and PGM1, which are involved in carbohydrate metabolism. LPE was associated with the expression of PTGS1, CHPT1, SC5D, PLA2G3, and DHCR24, which are involved in lipid metabolism. Sarcosine was associated with the expression of ALDH4A1, GPT2, AGMAT, ASL, AADAT, and MPST, which are involved in amino acid metabolism. These relationships between adjusted metabolite features and metabolic gene expression suggest potential biological relevance in cancer metabolic pathways.

Fig. 4

The relationships between adjusted metabolite features and metabolic gene expression. The heatmap shows the correlation between the eight significant adjusted metabolite features and the 87 metabolic genes. The Y-axis displays the eight adjusted metabolite features: Hippurate, 3-phosphoglycerate, CE (C14:0), LPE (C18:0), 6-phosphogluconate, PC (C36:1), GSH, and sarcosine. The X-axis displays the 87 metabolic genes, grouped by metabolic pathway (amino acid, carbohydrate, lipid, and nucleotide)

Table 3 The relationship between adjusted metabolites and metabolic genes in the same metabolic pathway

Cancer metabolism in MSI and MSS cancers

Figure 5 displays the results of the metabolic pathway analysis using the HMDB and KEGG databases [28, 29]. The analysis involves the eight adjusted metabolites and four metabolic genes, and the pathways identified are related to glycolysis and to nucleotide, glutamate, and lipid metabolism. In MSI cancers, the four major metabolic pathways are the serine synthesis pathway (3-phosphoglycerate and sarcosine), the pentose phosphate pathway (6-phosphogluconate), the glutamate pathway (GSH), and the lipid metabolism pathway (CE, LPE, and PC). After integrating the adjusted metabolite features and metabolic genes in the glycolytic and glutamate metabolic pathways, we found that 3-phosphoglycerate increased with phosphofructokinase 1 (PFKP) metabolic gene expression in the CATCH model. Sarcosine was associated with the expression of the ALDH4A1 and GPT2 metabolic genes (Table 3). ALDH4A1 is involved in the conversion of proline to glutamate, and GPT2 is involved in the conversion of 2-oxoglutarate to glutamate. These findings suggest that an increase in sarcosine levels may result from glycolytic and glutamate metabolism. CHPT1 is involved in the conversion of choline to PC, which increases LPE metabolism. These results indicate that dysregulation of PFKP, ALDH4A1, GPT2, and CHPT1 metabolic gene expression may contribute to cancer metabolism in MSI cancer cell lines.

Fig. 5

Metabolic pathways in MSI and MSS cancers. MSI cancer cells exhibit glycolytic metabolism, including the serine synthesis (sarcosine synthesis) and pentose phosphate (nucleotide synthesis) pathways, with elevated levels of sarcosine, 3-phosphoglycerate, and 6-phosphogluconate. Lipid metabolism and GSH synthesis were also observed in MSI cancer metabolism, with elevated levels of PC, LPE, CE, and GSH. Phosphofructokinase 1 (PFKP), ALDH4A1, GPT2, and CHPT1 are involved in MSI cancer metabolism. These metabolic pathways promote cancer cell proliferation, energy production, and survival. Mutations in DNA repair genes drive cancer metabolism, and sarcosine damages DNA, so sarcosine and genomic alterations can regulate each other. In MSS cancers, environmental factors, such as the microbiota, may play a crucial role in Hippurate synthesis
