To identify a signature panel that characterises prostate cancer metastasis, we performed proteomic analysis on rapid autopsy samples (two localised prostate cancer and eight prostate cancer metastasis samples). These samples were obtained from four patients who were diagnosed with androgen-independent metastatic castration-resistant prostate cancer (mCRPC) [14, 23]. The clinical information of these four patients is described in Table 1 and includes age at diagnosis, Gleason score, and treatment history. The samples utilised in this study were part of the Rapid Autopsy Program at the University of Michigan. The eight samples of prostate cancer metastases were collected from various metastatic sites, including one sample from the right lung, one sample from the peritoneal lymph node, two samples from the mediastinal lymph node, one sample from liver, one sample from kidney, one sample from the periaortic lymph node, and one sample from the dura (Fig. 1a). Samples were flash-frozen and prepared for liquid chromatography–mass spectrometry (LC-MS) (Fig. 1b). Each of the ten rapid autopsy samples was analysed in triplicate by LC-MS, and the protein expression profiles of the metastasis samples were compared to the protein profile of the localised prostate cancer group. The adjusted P-values were computed using the Benjamini–Hochberg (BH) procedure.
To reduce the background signal of the proteomic results and to increase the relevance of the proteomic analysis, a false discovery rate (FDR) threshold of adjusted P < 0.05 and a fold change (FC) threshold of |FC| > 1.5 were applied to the proteomic results, revealing 154 protein candidates with increased levels in the metastasis group relative to the localised prostate group, and 129 candidates with decreased levels in the metastasis group (Fig. 1c, d). To select metastasis candidates that capture the worse clinical prognosis of prostate cancer metastasis, we utilised multiple publicly available prostate cancer patient datasets to further screen the 154 proteins with increased expression in the metastasis group (Supplementary Fig. S1A). We assessed these 154 proteins in the TCGA Firehose Legacy dataset and the BS Taylor, Cancer Cell, 2010 dataset [17, 18] to search for candidates that correlate with prostate cancer biochemical recurrence and predict shorter disease-free survival (Supplementary Fig. S1A). Elevated levels of 28 candidates from the 154 proteins were identified to correlate with prostate cancer recurrence in at least one of the two datasets, and increased levels of 37 protein candidates correlate with worse disease-free survival in at least one of the two datasets (Fig. 1c). This identified 12 candidates that positively correlated with biochemical recurrence and shorter disease-free survival in at least one dataset (Fig. 1c). To reduce dataset-specific candidates, a selection criterion was set to include candidates that displayed at least one set of consistent positive correlations (P < 0.05) in both datasets (RUVBL1, HDGF, POSTN, STMN1, ASPN, CA2, H2AC1) (Supplementary Fig. S1B). Candidates that were implicated in a single dataset (U2AF2, FABP4, XPO1, DDX39B) were only included in further analyses if they satisfied a more stringent FDR threshold of P < 0.01 in their association with biochemical recurrence and worse disease-free survival (Supplementary Fig. S1B).
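The filtering step described above (BH adjustment followed by FDR and fold-change thresholds) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study; the function names, the signed fold-change convention, and the protein names in the usage note are all hypothetical and are assumed only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_candidates(proteins, fdr=0.05, min_abs_fc=1.5):
    """Split proteins into increased and decreased candidate lists.

    `proteins` maps a name to (fold_change, raw_p). The fold change is
    assumed to be signed: +2.0 means 2-fold higher in metastasis and
    -2.0 means 2-fold lower (an assumption, not stated in the text).
    """
    names = list(proteins)
    adjusted = benjamini_hochberg([proteins[n][1] for n in names])
    increased, decreased = [], []
    for name, q in zip(names, adjusted):
        fc = proteins[name][0]
        if q < fdr and abs(fc) > min_abs_fc:
            (increased if fc > 0 else decreased).append(name)
    return increased, decreased
```

For example, calling `select_candidates({"PROT_A": (2.0, 0.001), "PROT_B": (-3.0, 0.002), "PROT_C": (1.2, 0.001), "PROT_D": (2.5, 0.9)})` with these made-up values keeps PROT_A as increased and PROT_B as decreased, while PROT_C fails the fold-change threshold and PROT_D fails the FDR threshold.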
These 11 candidates were then further analysed in the Chandran UR, BMC Cancer, 2007 dataset [19] to discover candidates associated with metastasis relative to localised prostate cancer and normal samples in these datasets [17, 19] (Supplementary Fig. S1A, B). This led to the identification of 7 genes that fit these criteria (U2AF2, RUVBL1, HDGF, FABP4, XPO1, POSTN, and STMN1) (Fig. 1c, d). These 7-gene candidates were chosen due to their elevated levels in metastatic prostate cancer at both the protein and mRNA levels and their association with worse clinical prognosis in terms of increased risk of biochemical recurrence and worse patient disease-free survival outcome (Supplementary Fig. S1A, B).
Elevated levels of the 7-gene candidates, U2AF2, RUVBL1, HDGF, FABP4, XPO1, POSTN, and STMN1, correlate with prostate cancer biochemical recurrence and worse patient disease-free survival
The 7-gene candidates we identified demonstrated a positive correlation with recurrent prostate cancer in at least one of the two datasets with clinical information of biochemical recurrence (Fig. 2). In the TCGA Firehose Legacy dataset, U2AF2 (P = 0.0045), RUVBL1 (P = 0.0038), HDGF (P = 0.042), XPO1 (P = 0.0021), POSTN (P = 0.044), and STMN1 (P = 0.0008) were elevated in the recurrent group (N = 58) relative to the non-recurrent group (N = 371) (Fig. 2a, b-left, c, e, f-left, g-left). In the BS Taylor, Cancer Cell, 2010 dataset, RUVBL1 (P = 0.013), FABP4 (P = 0.0005), POSTN (P = 0.0073), and STMN1 (P < 0.0001) were significantly elevated in recurrent prostate cancer (N = 36) relative to non-recurrent prostate cancer (N = 104) (Fig. 2b-right, d, f-right, g-right). There was no difference in FABP4 levels between the non-recurrent and recurrent groups in the TCGA Firehose Legacy dataset (Supplementary Fig. S2A). There was also no significant difference in U2AF2, HDGF, and XPO1 levels between the non-recurrent and recurrent groups in the BS Taylor, Cancer Cell, 2010 dataset, potentially due to a smaller sample size (Supplementary Fig. S2B–D).
Fig. 2: The 7 candidates, U2AF2, RUVBL1, HDGF, FABP4, XPO1, POSTN, and STMN1, are highly expressed in recurrent prostate cancer relative to non-recurrent prostate cancer. a Scatter dot plot shows the mRNA expression of U2AF2 in recurrent prostate cancer (N = 58) and non-recurrent prostate cancer (N = 371) from the TCGA Firehose Legacy dataset. b Scatter dot plots show the mRNA expression of RUVBL1 in recurrent vs non-recurrent prostate cancer from the TCGA Firehose Legacy dataset (left) and the BS Taylor, Cancer Cell, 2010 dataset (right). In the BS Taylor, Cancer Cell, 2010 dataset, N = 36 for recurrent prostate cancer and N = 104 for non-recurrent prostate cancer. c Scatter dot plot of HDGF mRNA expression profile in recurrent vs non-recurrent prostate cancer from the TCGA Firehose Legacy dataset. d Scatter dot plot of FABP4 mRNA expression profile in recurrent vs non-recurrent prostate cancer from the BS Taylor, Cancer Cell, 2010 dataset. e Scatter dot plot of XPO1 mRNA expression profile in recurrent vs non-recurrent prostate cancer from the TCGA Firehose Legacy dataset. f Scatter dot plots of POSTN mRNA expression levels in recurrent vs non-recurrent prostate cancer in the TCGA Firehose Legacy dataset (left) and the BS Taylor, Cancer Cell, 2010 dataset (right). g Scatter dot plots of STMN1 mRNA expression in recurrent and non-recurrent prostate cancer in the TCGA Firehose Legacy dataset (left) and the BS Taylor, Cancer Cell, 2010 dataset (right). The Student’s t-test was performed with *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001 for all comparisons between the two groups. The P-values are labelled correspondingly on each of the scatter dot plots.
In addition, increased expression of these 7-gene candidates was also associated with worse prostate cancer disease-free survival in the TCGA Firehose Legacy dataset, the BS Taylor, Cancer Cell, 2010 dataset, or both (Fig. 3) [17, 18]. Patient disease-free survival was selected as an inclusion criterion since metastasis is the major contributor to prostate cancer-driven mortality. Our results identified that an increased level of U2AF2 was associated with worse disease-free survival in the TCGA Firehose Legacy dataset with P = 0.0015 (Fig. 3a). The median expression of U2AF2 was used as the cutoff threshold to determine the U2AF2 high (N = 246) and U2AF2 low (N = 245) groups. HDGF, RUVBL1, XPO1, POSTN, and STMN1 also displayed the same trend with P = 0.0157, 0.0005, 0.0025, 0.0116, and 0.0001 respectively (Fig. 3b-left, c–e, f-left). In the BS Taylor, Cancer Cell, 2010 dataset, elevated levels of HDGF, STMN1, and FABP4 correlated with worse disease-free survival with P = 0.0429, 0.0016, and 0.0079 respectively (Fig. 3b-right, f-right, g). The median expression levels of each candidate were also used as the cutoff threshold, and 70 samples were included in each of the high and low-expression groups. Due to variations between datasets, FABP4 expression did not predict patient disease-free survival in the TCGA Firehose Legacy dataset, while U2AF2, RUVBL1, XPO1, and POSTN did not predict disease-free survival in the BS Taylor, Cancer Cell, 2010 dataset (Supplementary Fig. S2E–I). In addition, six of the seven candidates (all except FABP4) were also associated with higher Gleason scores when assessed in the TCGA Firehose Legacy dataset (Supplementary Fig. S3). Notably, U2AF2 and STMN1 could differentiate between all Gleason scores ranging from 6 to 9+ (Supplementary Fig. S3A, G). This further suggests an association between increased expression of the candidates and worse clinical risks and prognosis.
The statistically significant correlation between these 7-gene candidates and clinical prognostic factors such as biochemical recurrence and patient disease-free survival suggests their clinical potential as prognostic indicators for worse outcomes.
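The survival comparisons above rest on two steps: splitting patients at the median expression of a candidate, then comparing the high and low groups with a log-rank test. The following Python sketch illustrates both steps under stated assumptions. It is not the software used in the study; the function names are hypothetical, and the handling of samples exactly at the median (assigned to the low group here) is an assumption.

```python
import math


def median_split(expression):
    """Label each sample 'high' if strictly above the median, else 'low'."""
    ordered = sorted(expression)
    n = len(ordered)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    return ["high" if x > median else "low" for x in expression]


def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi_square, p_value).

    times:  follow-up time per patient
    events: 1 if recurrence was observed, 0 if censored
    groups: 'high' or 'low' per patient
    """
    observed_minus_expected = 0.0
    variance = 0.0
    # Evaluate the risk set at every distinct time with an observed event.
    for t in sorted({tt for tt, e in zip(times, events) if e}):
        at_risk = [(g, tt, e) for g, tt, e in zip(groups, times, events) if tt >= t]
        n = len(at_risk)
        n_high = sum(1 for g, _, _ in at_risk if g == "high")
        d = sum(1 for _, tt, e in at_risk if tt == t and e)
        d_high = sum(1 for g, tt, e in at_risk if tt == t and e and g == "high")
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value
```

In practice a dedicated survival analysis package would normally be preferred; the sketch only makes the median cutoff and the log-rank statistic explicit.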
Fig. 3: High expressions of the 7 candidates correspond with worse patient disease-free survival in prostate cancer patients. a Kaplan–Meier plot of prostate cancer disease-free survival outcomes based on high and low U2AF2 expressions in the TCGA Firehose Legacy dataset. The high (N = 246) and low (N = 245) groups were determined using the median expression level of U2AF2 as the cutoff threshold. b Kaplan–Meier plots of HDGF expressions and patient disease-free survival outcome in the TCGA Firehose Legacy dataset (left) and the BS Taylor, Cancer Cell, 2010 dataset (right). In the BS Taylor, Cancer Cell, 2010 dataset, N = 70 for both HDGF high and HDGF low groups. The groups were determined using the median HDGF expression level as cutoff. c Kaplan–Meier plot of RUVBL1 expression and prostate cancer patient disease-free survival outcome in the TCGA Firehose Legacy dataset. d Kaplan–Meier plot of XPO1 expression and prostate cancer patient disease-free survival outcome in the TCGA Firehose Legacy dataset. e Kaplan–Meier plot shows the correlation between POSTN expression and patient disease-free survival outcome in the TCGA Firehose Legacy dataset. f Kaplan–Meier plots show the correlation between STMN1 expression and prostate cancer patient disease-free survival outcome in both the TCGA Firehose Legacy dataset (left) and BS Taylor, Cancer Cell, 2010 dataset (right). g Kaplan–Meier plot of FABP4 expression and prostate cancer patient disease-free survival outcome in the BS Taylor, Cancer Cell, 2010 dataset. For all Kaplan–Meier plots, the Log-rank P-values are computed and labelled on the corresponding plots. *P < 0.05, **P < 0.01, and ***P < 0.001.
The 7-gene candidates are elevated in metastatic prostate cancer
In addition, the levels of the 7-gene candidates were increased in metastasis samples relative to localised prostate cancer, benign prostate tissue adjacent to cancer, and normal prostate tissues (Fig. 4). The Chandran UR, BMC Cancer, 2007 dataset included 18 normal prostate tissues, 63 benign prostate tissues adjacent to tumour, 65 localised prostate cancer, and 25 prostate cancer metastasis [19]. In this dataset, all the 7-gene candidates exhibited a significant increase of mRNA expression in the metastasis group relative to localised tumours, with RUVBL1 exhibiting near statistical significance (U2AF2 P = 0.0106, RUVBL1 P = 0.051, HDGF P < 0.0001, FABP4 P = 0.0005, and STMN1 P = 0.032) (Fig. 4a–g). In addition to differentiating between metastasis and localised groups, the candidates also demonstrated the ability to stratify between normal and metastatic groups and between benign adjacent to tumour tissues and metastatic groups. The expressions of U2AF2 (P = 0.014, P = 0.0001), RUVBL1 (P = 0.005, P < 0.0001), HDGF (P = 0.0038, P < 0.0001), XPO1 (P < 0.0001, P < 0.0001), POSTN (P = 0.0068, P = 0.0002), and STMN1 (P = 0.0058, P < 0.0001) were all significantly elevated in the metastasis group relative to both normal prostate and benign adjacent to tumour groups (Fig. 4a–g). However, while the increased expression of FABP4 was statistically significant between metastasis and benign tissues adjacent to the tumour (P = 0.0004), this difference was not statistically significant between the metastasis and the normal group (P = 0.088) (Fig. 4d).
Fig. 4: The 7 candidates are highly expressed in prostate cancer metastasis relative to localised prostate cancer and normal prostate tissues. a Expression profiles of U2AF2 in the Chandran, BMC Cancer, 2007 dataset (N = 18 for normal prostate tissues, N = 63 for normal prostate tissues adjacent to tumour, N = 65 for localised prostate cancer tumours, and N = 25 for metastatic prostate cancer). b–g Expression profiles of RUVBL1 (b), HDGF (c), FABP4 (d), XPO1 (e), POSTN (f), and STMN1 (g) in the Chandran, BMC Cancer, 2007 dataset described in (a). h Expression levels of U2AF2 in localised prostate cancer tumours vs prostate cancer metastases using the BS Taylor, Cancer Cell, 2010 dataset (N = 131 for localised prostate cancer samples and N = 19 for prostate cancer metastasis samples). i–l The expression levels of RUVBL1 (i), HDGF (j), FABP4 (k), and STMN1 (l) in localised prostate tumours vs prostate cancer metastases as described in (h). For all comparisons between the two groups, Student’s t-test was performed with ns = non-significant, *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001.
To further assess the positive association between the 7-gene candidates and prostate cancer metastasis, we also compared their expressions in the BS Taylor, Cancer Cell, 2010 dataset (N = 131 for localised prostate cancer samples and N = 19 for metastasis samples). In the analysis of this dataset, five of the seven candidates displayed increased expression in the metastasis group relative to localised prostate cancer with U2AF2 P = 0.0007, RUVBL1 P < 0.0001, HDGF P = 0.0237, FABP4 P < 0.0001, and STMN1 P < 0.0001 (Fig. 4h–l) [17]. The difference in expression levels of XPO1 and POSTN did not reach statistical significance, likely due to the relatively small sample size of the metastasis group (N = 19) (Supplementary Fig. S2J–K). These results indicate that the 7-gene candidates selected have the potential to distinguish metastatic prostate cancer from localised prostate cancer. This, coupled with their association with worse clinical prognosis, suggests their clinical potential to assist in prognosis prediction and therapy selection.
Novel 5-gene signature panel predicts worse patient disease-free survival relative to individual candidates
To develop a metastasis signature panel to best predict patient outcome, we further assessed whether different combinations of the seven candidates would achieve improved prediction of worse outcome relative to individual candidates. To determine the best combinations, we first performed principal component analysis on the expression profiles of the seven candidates in the TCGA Firehose Legacy dataset (Fig. 5a). We illustrated that four of the seven candidates (U2AF2, RUVBL1, STMN1, HDGF) have expression profiles in a cluster, suggesting that these candidates exhibit similar profiles that associate with similar features (Fig. 5a). To identify a metastasis signature panel, we assessed the hazard ratios and statistical significance captured by a variety of combinations using Kaplan–Meier plots in the TCGA Firehose Legacy and the BS Taylor, Cancer Cell, 2010 datasets (Fig. 5b–e, Supplementary Fig. S4, S5). We identified that the 4-gene panel (U2AF2, RUVBL1, STMN1, HDGF) achieved an improved prediction in the TCGA Firehose Legacy dataset relative to individual candidates (Figs. 3, 5c-left). However, this 4-gene panel did not improve prediction in the BS Taylor, Cancer Cell, 2010 dataset since both STMN1 (P = 0.0016, HR = 2.879 [1.493–5.552]) and FABP4 (P = 0.0079, HR = 2.434 [1.263–4.69]) achieved better statistical significance and hazard ratios relative to the combined panel (P = 0.01, HR = 2.374 [1.23–4.584]) (Figs. 3f-right, g-right, 5b, c-right). To improve the separation between patient survival outcomes, we combined all 7 genes to capture more relative risk by including more features (Fig. 5b, d). However, the 7-gene panel only improved outcome prediction in the BS Taylor, Cancer Cell, 2010 dataset and is not consistent in the TCGA Firehose Legacy dataset (Fig. 5b, d).
To avoid adding features that detract from the prediction, we added features from the additional three candidates, FABP4, XPO1, and POSTN, to the 4-gene panel to find the optimal combination that best captures patient disease-free survival outcomes. After assessing all 5-gene and 6-gene panels, we discovered a combination that consistently improved disease-free survival outcome prediction in both TCGA Firehose Legacy and BS Taylor, Cancer Cell, 2010 (Fig. 5b–e, Supplementary Fig. S4, S5). The combination that demonstrated the best disease-free survival prediction was the 5-gene panel comprised of U2AF2, RUVBL1, STMN1, HDGF, and FABP4 (Fig. 5e). In both datasets, this combination achieved a separation with smaller P-values relative to all individual candidates, suggesting its ability to achieve a lower false positive rate in outcome prediction (Figs. 3a–c, f, g, 5e). The improvement of the statistical significance of this 5-gene signature panel relative to all single candidates in terms of P-values and \(\chi^2\) statistics was also the most consistent across the two datasets when compared to all other combinations (Fig. 5c–e, Supplementary Fig. S4, S5). Additionally, this 5-gene panel also consistently captured more relative risk in its hazard ratios relative to all other combinations in both datasets, suggesting not only more confident, statistically significant predictions, but also increased risk associated with an elevation in its expression (Fig. 5b–e, Supplementary Fig. S4, S5).
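An equally weighted panel score of the kind described above can be sketched in Python as follows. The text states only that the genes contribute equally; averaging per-gene z-scores is one common way to achieve this and is an assumption here, as are the function names and the toy values in the usage note.

```python
import statistics


def z_scores(values):
    """Standardise one gene's expression across samples (sample SD)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def panel_score(expression_by_gene, panel):
    """Average per-gene z-scores so that every gene contributes equally.

    expression_by_gene: dict mapping gene name -> list of expression
    values, with samples in the same order for every gene.
    panel: list of gene names to combine.
    """
    per_gene = [z_scores(expression_by_gene[g]) for g in panel]
    n_samples = len(per_gene[0])
    return [sum(z[i] for z in per_gene) / len(panel) for i in range(n_samples)]
```

With made-up values such as `{"U2AF2": [1, 2, 3], "RUVBL1": [10, 20, 30]}`, both genes standardise to the same z-scores, so the panel score ranks the three samples identically to either gene alone. The resulting score can then be split at its median to define high- and low-expression groups.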
Fig. 5: The 5-gene signature panel demonstrates improved ability to capture prostate cancer patient disease-free survival outcome. a Principal component analysis of the expression profiles of the seven candidates (U2AF2, RUVBL1, STMN1, HDGF, FABP4, XPO1, and POSTN) in the TCGA Firehose Legacy dataset. b Comparison of the hazard ratios of patient disease-free survival outcomes (with confidence intervals) using the seven individual candidates, the 4-gene panel (U2AF2, RUVBL1, STMN1, HDGF), the 7-gene panel, and the 5-gene signature panel (U2AF2, RUVBL1, STMN1, HDGF, and FABP4). The left panel shows the hazard ratios from the TCGA Firehose Legacy dataset, and the right panel compares the hazard ratios from the BS Taylor, Cancer Cell, 2010 dataset. c Kaplan–Meier plots that show the association between the expression levels of the 4-gene panel (U2AF2, RUVBL1, STMN1, and HDGF) and patient disease-free survival outcomes in the TCGA Firehose Legacy dataset (left) and the BS Taylor, Cancer Cell, 2010 dataset (right). For the left panel, N = 245 for the low-expression group and N = 246 for the high-expression group. For the right panel, N = 70 for both the high- and low-expression groups. The high and low-expression groups are determined assuming equal contributions of the four genes, and the median level was used as the cutoff threshold. The Log-rank P-value, hazard ratio (HR) with confidence intervals, and the chi-square statistics (\(\chi^2\)) are labelled correspondingly. d Kaplan–Meier plots using the same datasets as (c) but with the 7-gene panel (U2AF2, RUVBL1, STMN1, HDGF, FABP4, XPO1, and POSTN). e Kaplan–Meier plots using the same datasets as (c) but with the 5-gene signature panel (U2AF2, RUVBL1, STMN1, HDGF, and FABP4). For all, *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001.
We then attempted to improve this 5-gene signature by using an elastic net model to find weighted coefficients for the signature genes. We used the glmnet package in R to generate three elastic net models. Model 1 was trained on the TCGA Firehose Legacy dataset, model 2 was trained on the BS Taylor, Cancer Cell, 2010 dataset, and model 3 was trained on the two datasets combined. However, we did not observe significant improvement in the weighted models relative to the original equally weighted 5-gene signature panel (Fig. 5e, Supplementary Fig. S6). Models 1 and 2 only significantly improved prediction in the datasets that they were trained on, which suggests overfitting to their training datasets (Supplementary Fig. S6A, B, Fig. 5e). Model 3 generated results comparable to those of the original 5-gene signature panel in both TCGA Firehose Legacy (model 3 HR = 2.52, unweighted HR = 2.37) and BS Taylor, Cancer Cell, 2010 datasets (model 3 HR = 3.28, unweighted HR = 3.01) (Supplementary Fig. S6C, Fig. 5e). However, the original unweighted signature demonstrates a better separation between high and low groups in the Kaplan–Meier plots. Thus, the unweighted 5-gene signature panel was selected for further analyses.
The 5-gene signature panel also correlates with metastatic prostate cancer in additional patient datasets
With the 5-gene prostate cancer metastasis signature panel comprised of U2AF2, RUVBL1, STMN1, HDGF, and FABP4, we further assessed the power of prediction of this gene signature panel in two different, independent public patient datasets (Grasso CS, Nature, 2012; Varambally S CS, Cancer Cell, 2005) [20, 21]. The Grasso CS, Nature, 2012 dataset included 28 benign prostate tissues, 59 localised prostate cancer tissues, and 35 metastatic castration-resistant prostate cancer (mCRPC) tissues [20]. In this dataset, 4 of the 5 signature genes (except FABP4) displayed a positive association with the onset of mCRPC relative to benign prostate and localised prostate cancer groups (U2AF2 P = 1.71 × 10⁻⁹, 8.75 × 10⁻¹²; RUVBL1 P = 1.13 × 10⁻¹², 4.15 × 10⁻⁹; HDGF P = 2.06 × 10⁻⁸, 3.36 × 10⁻¹¹; STMN1 P = 3.89 × 10⁻⁷, 2.49 × 10⁻¹¹; FABP4 P = 0.0606, 0.1268) (Supplementary Fig. S7A–E). The Varambally S CS, Cancer Cell, 2005 dataset included six benign prostate, seven localised prostate cancer, and six metastatic prostate cancer samples [21]. The increased expression of the metastasis group relative to localised prostate cancer did not reach statistical significance in U2AF2 (P = 0.090), STMN1 (P = 0.087), and FABP4 (P = 0.093), potentially due to small sample sizes (Supplementary Fig. S7F–H). However, we still observed a statistically significant increase in the expressions of RUVBL1 (P = 0.0073, 0.012) and HDGF (P = 0.025, 0.025) when comparing the metastasis group to both benign and localised groups (Supplementary Fig. S7I, J). In addition, while the elevation of FABP4 in metastasis relative to benign prostate tissues did not reach statistical significance (P = 0.15), the expression of U2AF2 (P = 0.013) and STMN1 (P = 0.011) was significantly increased in the metastasis group relative to the benign group (Supplementary Fig. S7F–H).
After characterising the expression profiles of the individual signature candidates in different stages of prostate cancer progression, we also tested the ability of the 5-gene signature panel to separate the metastatic group from the benign and localised groups (Fig. 6a, b). In both datasets, the 5-gene signature panel achieved improved separation between the metastasis and the localised groups relative to all individual candidates (Fig. 6a, b, Supplementary Fig. S7A–J). In the Grasso CS, Nature, 2012 dataset, the 5-gene signature panel displayed improved statistical significance when comparing the metastasis group to both benign (5-gene P = 1.88 × 10⁻¹⁷; single gene U2AF2 P = 1.71 × 10⁻⁹, RUVBL1 P = 1.13 × 10⁻¹², HDGF P = 2.06 × 10⁻⁸, STMN1 P = 3.89 × 10⁻⁷, FABP4 P = 0.061) and localised groups (5-gene P = 6.17 × 10⁻²²; single gene U2AF2 P = 8.75 × 10⁻¹², RUVBL1 P = 4.15 × 10⁻⁹, HDGF P = 3.36 × 10⁻¹¹, STMN1 P = 2.49 × 10⁻¹¹, FABP4 P = 0.1268) (Fig. 6a, Supplementary Fig. S7A–E). Similarly, analysis in the Varambally S CS, Cancer Cell, 2005 dataset revealed the same improvement when comparing the mean of the metastasis group to that of the benign (5-gene P = 0.0035; single gene U2AF2 P = 0.0132, STMN1 P = 0.0111, FABP4 P = 0.1475, RUVBL1 P = 0.0073, and HDGF P = 0.0247) and localised groups (5-gene P = 0.0071; single gene U2AF2 P = 0.0899, STMN1 P = 0.0868, FABP4 P = 0.0932, RUVBL1 P = 0.0124, HDGF P = 0.0246) (Fig. 6b, Supplementary Fig. S7F–J). These results suggest that the 5-gene signature panel displays a more reliable association with prostate cancer metastasis than any single candidate by outperforming all single candidates across datasets.
Fig. 6: The 5-gene signature panel also displays improved ability to predict prostate cancer progression and the onset of prostate cancer metastasis in other independent datasets. a Box and whisker plot that shows the mRNA expression profile (z-score) of the 5-gene signature panel in the Grasso CS, Nature, 2012 dataset. This dataset is composed of 28 benign prostate samples, 59 localised prostate cancer samples, and 35 metastatic prostate cancer samples. The combined 5-gene expressions are calculated assuming equal contributions from the five signature genes. b Box and whisker plot that shows the mRNA expression profile of the 5-gene signature using the Varambally S CS, Cancer Cell, 2005 dataset. This dataset includes six benign prostate samples, seven localised prostate cancer tumours, and six metastatic prostate cancer samples. For all comparisons between the means of the two groups, Student’s t-test is performed with ns = non-significant, *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001. The corresponding P-values are labelled accordingly. c Receiver-operating characteristic (ROC) analysis of the 5-gene signature panel in the Grasso CS, Nature, 2012 dataset. In this analysis, N = 35 for patients with prostate cancer metastasis and N = 59 for patients with localised prostate cancer. The area under the curve (AUC) and the P-value are labelled. d–h Receiver-operating characteristic (ROC) analysis using each of the 5 single signature genes.
In addition, we performed receiver-operating characteristic (ROC) analysis in the Grasso CS, Nature, 2012 dataset (Fig. 6c–h). Our results demonstrated that the 5-gene signature panel displays an area under the curve (AUC) of 97.38% (P < 0.0001), higher than that of each single candidate (U2AF2 87.17%, P < 0.0001; RUVBL1 81.74%, P < 0.0001; HDGF 83.73%, P < 0.0001; FABP4 59.68%, P = 0.25; and STMN1 86.1%, P < 0.0001) (Fig. 6c–h). The 5-gene signature panel also associates with biochemical recurrence in the TCGA Firehose Legacy dataset (P = 0.0002) and an independent dataset, Gerhauser, Cancer Cell, 2018 (P = 0.019) (Supplementary Fig. S8A–C) [22]. The 5-gene signature demonstrates improved statistical significance relative to all single candidates in the TCGA dataset (Supplementary Fig. S8A, B). While the 5-gene signature panel is not the most statistically significant in the Gerhauser, Cancer Cell, 2018 dataset, it demonstrates the smallest 95% confidence interval (Supplementary Fig. S8A, D–H). With its consistent association with worse patient outcomes and the metastatic phenotype across multiple datasets, the 5-gene signature can assist in making more reliable prognostic predictions by reducing the variation and enhancing the statistical significance of the risks associated with its expression.
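The AUC reported in the ROC analysis can be read as the probability that a randomly chosen metastasis sample has a higher panel score than a randomly chosen localised sample. The Python sketch below computes the AUC directly from that definition. It is an illustration only, with a hypothetical function name, and is not the software used to produce Fig. 6.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores_positive: scores for the positive class (e.g. metastasis)
    scores_negative: scores for the negative class (e.g. localised)
    Tied pairs count as half a correct ranking.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))
```

An AUC of 1.0 corresponds to perfect separation of the two groups and 0.5 to no discrimination, which is why FABP4 alone (59.68%) adds little on its own despite contributing to the combined panel.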