Single-cell and spatial transcriptomics identify COL6A3 as a prognostic biomarker in undifferentiated pleomorphic sarcoma

We identified a cohort of 639 dermal-based soft tissue sarcomas (544 AFX and 95 PDS), biopsied between 2018 and 2023, from a national pathology company. Patient and tumor characteristics are included in Supplemental Table 1. In accordance with prior studies, we found that both AFX and PDS occurred more frequently in males [2]. However, while AFX and PDS were more likely to occur on the head and neck in males (94.8% and 72.1%, respectively), PDS was more likely to occur on the trunk and extremities in females (61.5%).

Next, we selected five tumors, biopsied 2019–2023, all located on the scalp, that had paired biopsy and excision specimens with clear histopathologic evidence of either invasion into the subcutis (PDS, n = 2) or confinement to the dermis (AFX, n = 3) (Fig. 1A, Supplemental Fig. 1). For one of the five excision specimens, we performed paired scRNA-seq with 10X Genomics Chromium Single Cell Gene Expression Flex and spatial transcriptomics with 10X Genomics Visium (Fig. 1B). From the excision specimen, we identified 15 clusters (Fig. 1B). We classified the clusters into the following cell types using the indicated marker genes: three fibroblast clusters (PDGFRA), three keratinocyte clusters (KRT5 and KRTDAP), tumor cells (PDGFB), T cells (CD3D and CD8A), B cells (CD79A), M2/Tumor-associated macrophages (MSR1), M1 macrophages (AIF1), pericytes (RGS5), melanocytes (MLANA), eccrine cells (AQP5), and plasmacytoid dendritic cells (CLEC4C) (Supplemental Fig. 2A) [8].
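The marker-based labelling described above can be sketched in a few lines. This is only an illustrative simplification: the marker genes are those listed in the text, but the scoring rule (highest mean marker expression per cluster), the function name `annotate_cluster`, and the toy expression values are assumptions, not the procedure used in this study.

```python
# Marker genes as listed in the text; the scoring rule below is an assumption.
MARKERS = {
    "fibroblast": ["PDGFRA"],
    "keratinocyte": ["KRT5", "KRTDAP"],
    "tumor cell": ["PDGFB"],
    "T cell": ["CD3D", "CD8A"],
    "B cell": ["CD79A"],
    "M2/tumor-associated macrophage": ["MSR1"],
    "M1 macrophage": ["AIF1"],
    "pericyte": ["RGS5"],
    "melanocyte": ["MLANA"],
    "eccrine cell": ["AQP5"],
    "plasmacytoid dendritic cell": ["CLEC4C"],
}


def annotate_cluster(mean_expr):
    """Label a cluster with the cell type whose markers have the highest
    average expression. `mean_expr` maps gene -> mean expression in the
    cluster; genes that are absent count as 0."""
    def score(genes):
        return sum(mean_expr.get(g, 0.0) for g in genes) / len(genes)

    return max(MARKERS, key=lambda cell_type: score(MARKERS[cell_type]))


# Hypothetical per-cluster mean expression values (toy numbers).
toy_clusters = {
    "cluster_0": {"CD79A": 3.0, "PDGFRA": 0.1},
    "cluster_1": {"KRT5": 2.0, "KRTDAP": 4.0, "PDGFRA": 1.0},
}
labels = {name: annotate_cluster(expr) for name, expr in toy_clusters.items()}
```

In practice a single marker per cell type is rarely sufficient, and clusters that co-express markers (as the tumor cells here do for PDGFRA and KRTDAP) would need manual review.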

Fig. 1

Overview of scRNA-seq in AFX and PDS specimens. (A) Schematic of two PDS and three AFX tumors that were analyzed with single-cell and spatial transcriptomics. (B) UMAP of scRNA-seq of Pt 1 PDS excision specimen. Clusters are labeled by predicted cell type. (C) UMAP of scRNA-seq of the five biopsy specimens in panel A. Clusters are labeled by predicted cell type. (D) Prognostic value of the top 30 genes enriched in PDS over AFX for overall survival in 17 cancer types included in the Human Protein Atlas. The number of cancer types for which each gene has an unfavorable prognostic value is in red to the left and the number of cancer types for which each gene has a favorable prognostic value is in green to the right. (E) PDGFRA (fibroblast), PDGFRB (tumor cells), and COL6A3 (prognostic marker) expression in cells from the five biopsy specimens in panel A.

The top differentially expressed genes in the tumor cells, by logistic regression, are included in Supplemental Fig. 2B. Of note, the most enriched gene, CD74, has been proposed as a marker to distinguish AFX and PDS from other undifferentiated pleomorphic sarcomas [9, 10]. While many of the top differential genes identified in the scRNA-seq are novel, collagen subunit genes were enriched in tumor cells in both scRNA-seq and bulk RNA-seq: COL4A1 and COL4A2 in scRNA-seq from PDS vs. other cell types (Supplemental Table 2), and COL3A1 in bulk RNA-seq from AFX vs. fibroblasts and keratinocytes [11]. Of note, the AFX samples with bulk RNA-sequencing included cases with invasion into the subcutis, which would now be classified as PDS. Moreover, tumor cells express markers of both fibroblasts (PDGFRA) and differentiated keratinocytes (KRTDAP) (Supplemental Fig. 2C-D), supporting the view that these tumor cells share features of both cell types.

For the five biopsy specimens, we performed scRNA-seq using 10X Chromium Flex. While Flex is a probe-based design and traditional scRNA-seq is reverse-transcription based, gene expression and cell population analysis correlate highly between fixed and fresh assays [12]. Nevertheless, in fixed tissue, RNA degradation and probe design can introduce biases in expression data. We integrated all five biopsy scRNA-seq samples with Scanorama’s batch correction for a total of 21,822 cells passing filter (11,013 from AFX and 10,809 from PDS). We annotated known cell types using marker genes (Fig. 1C, Supplemental Fig. 3A). We captured all cell types in both tumor types (Supplemental Fig. 3B-C). We also performed differential gene expression analysis between the AFX and PDS tumor cells with Scanpy’s logistic regression function (Supplemental Fig. 3D). The three most significant Gene Ontology enrichments for the top 30 genes enriched in PDS were cell-matrix adhesion (p = 9.32e-10), blood vessel morphogenesis (p = 6.87e-9), and regulation of epithelial-mesenchymal transition (p = 3.16e-7). To address whether these genes have prognostic value in tumorigenesis broadly, we analyzed data from the Human Protein Atlas, which includes mRNA expression and cancer patient survival for 17 different cancers in nearly 8,000 patients [13]. Twenty-five of the top 30 genes had prognostic value for at least one cancer in the Human Protein Atlas. Moreover, these 30 genes were enriched for negative prognostic value in other cancers compared to the background distribution (p = 3.9e-5, Fig. 1D). This finding suggests that our results from AFX and PDS may be applicable to other tumors.

We were particularly interested in whether our findings apply to UPS, which falls in the same broader family of dedifferentiated sarcomas as AFX and PDS, although these are now considered biologically distinct tumors. To identify biomarkers predictive of outcomes specifically for UPS tumors, we focused on an independent cohort of 46 patients with UPS in The Cancer Genome Atlas (TCGA) [14]. We performed survival analyses based on gene expression of the top nine differential genes identified with Scanpy’s logistic regression (TNXB, COL6A3, THY1, ZFP36L2, PRSS23, BGN, CD47, F2RL2, and FSTL1) (Supplemental Fig. 4A, Supplemental Methods). All nine of these genes are enriched in bulk RNA-sequencing of AFX compared to adjacent healthy skin (Supplemental Fig. 4B) [13].

While seven of these nine genes trended toward worse survival with high expression, only two reached significance. COL6A3, the second most significant gene in our scRNA-seq data, was most predictive of overall survival in the 46 patients with UPS (p = 0.00232), followed by BGN (p = 0.00843) (Fig. 2A, Supplemental Table 2). COL6A3 is expressed in the PDGFRA- and PDGFRB-positive cells in all AFX and PDS samples in this study, as well as in a mouse UPS model, suggesting that it is expressed by tumor cells (Fig. 1E, Supplemental Fig. 5A-B) [15]. However, it was also expressed in fibroblasts and endothelial cells, such that the signal could have contributions from other cell types or could be diluted in bulk RNA-sequencing (Supplemental Fig. 2A). Nevertheless, expression of COL6A3 in bulk tumor is prognostic of overall survival in this cohort. We hope that as single-cell RNA-sequencing becomes more prevalent, assays such as this will show greater prognostic value. Both COL6A3 and BGN are expressed throughout the depth of the tumor based on 10X Visium spatial transcriptomics (Supplemental Fig. 6).
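Survival comparisons of this kind typically split patients into high- and low-expression groups and compare the groups with a log-rank test. The following is a minimal sketch of that idea, not the analysis pipeline used here: the median cutoff, the function names `split_by_median` and `logrank_test`, and the toy data are all assumptions made for illustration, and the cutoff actually applied is not stated in this section.

```python
import math
import statistics


def split_by_median(expr, times, events):
    """Split patients into high (> median) and low (<= median) expression
    groups. Each group is a list of (time, event) pairs, event=1 for death."""
    cutoff = statistics.median(expr)
    high = [(t, e) for x, t, e in zip(expr, times, events) if x > cutoff]
    low = [(t, e) for x, t, e in zip(expr, times, events) if x <= cutoff]
    return high, low


def logrank_test(group_a, group_b):
    """Two-group log-rank test. Returns (chi2, p) with 1 degree of freedom."""
    data = [(t, e, 0) for t, e in group_a] + [(t, e, 1) for t, e in group_b]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0       # hypergeometric variance summed over event times
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Toy data (hypothetical): expression, follow-up time, and event indicator.
expr = [9.1, 8.7, 8.9, 9.4, 9.0, 2.1, 2.5, 1.9, 2.2, 2.8]
times = [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]
events = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
high, low = split_by_median(expr, times, events)
chi2, p = logrank_test(high, low)
```

With real cohorts, an established survival package would be the safer choice, since it also handles stratification and reports confidence intervals.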

Fig. 2

Prognostic value of COL6A3 in two independent cohorts of patients with UPS. (A) Cohort 1: Overall survival of 46 patients with UPS included in The Cancer Genome Atlas, separated by expression of COL6A3, with log-rank p-value. (B) Cohort 2: Top row displays metastasis-free survival in 74 non-UPS sarcomas separated by CINSARC class (left) or COL6A3 expression (right). Bottom row displays metastasis-free survival for 38 UPS tumors, separated by CINSARC class (left) or COL6A3 expression (right). (C) Cohort 2: ROC curves for metastasis in 74 non-UPS sarcomas (top) or 38 UPS tumors (bottom) based on COL6A3 expression.

To test the prognostic value of COL6A3 expression and compare its performance to CINSARC, we performed survival analyses on an independent cohort of 112 sarcoma patients with metastasis data, CINSARC classification, and bulk RNA-sequencing [16]. On the complete set of 112 patients, both CINSARC and COL6A3 expression predicted metastasis-free survival (p = 0.0092 and 0.017, respectively) (Supplemental Fig. 7). However, within the 38 patients diagnosed with UPS, COL6A3 expression significantly separated metastasis-free survival (p = 0.0069) while CINSARC classification did not (p = 0.369) (Fig. 2B). Of note, BGN also separated metastasis-free survival in this cohort (p = 0.024). COL6A3 expression had an AUC of 0.696 for predicting metastasis in UPS patients (Fig. 2C). In the 74 non-UPS sarcomas, CINSARC separated metastasis-free survival (p = 0.016) while COL6A3 did not (p = 0.395) (Fig. 2B-C). We therefore propose that COL6A3 is a prognostic marker in UPS and, within this soft-tissue sarcoma subtype, outperforms CINSARC.
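For readers less familiar with the AUC reported above, it can be read as the probability that a randomly chosen patient who developed metastasis has higher COL6A3 expression than a randomly chosen patient who did not. The sketch below computes it directly from that definition; the function name `roc_auc` and the toy values are assumptions for illustration and do not reproduce the cohort data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison: the fraction of
    (positive, negative) pairs in which the positive case has the higher
    score, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical): expression scores and metastasis labels (1 = yes).
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = roc_auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which places the reported 0.696 in the range of modest discriminative ability.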

COL6A3 encodes the alpha-3 chain of type VI collagen. Type VI collagen is expressed in several cancers and has been shown to enhance tumorigenesis and epithelial-mesenchymal transition [17, 18]. Moreover, type VI collagen levels have been associated with chemotherapy resistance in breast, lung, and pancreatic cancers [18]. Therefore, COL6A3 expression may have a role in predicting treatment response along with its prognostic value in UPS.
