Clinical and molecular predictors of very late recurrence in oestrogen receptor-positive breast cancer patients

Patients and clinicopathologic data

From an initial list of 1335 patients diagnosed between 1988 and 1998, 194 controls and 96 cases were reviewed on the Electronic Patient Record system (Supplementary Figure 1). After tissue-related exclusions, 50 cases (recurrence between years 10 and 20) and 67 controls (disease free beyond 20 years) had RNA and DNA extracted from the tumour. After further exclusions on the basis of insufficient material, ER-negative status or HER2-amplified status, 98 samples (44 cases, 54 controls) had RNA expression data and 71 samples (38 cases, 34 controls) had DNA sequencing data. Clinicopathological parameters are described in Table 1. Median age was 50 years, with no significant difference between cases and controls. Cases had significantly larger tumours than controls (21 mm vs 16 mm, p = 0.01) and a significantly greater proportion of node-positive disease (p = 0.0002). A larger proportion of cases than controls were treated with chemotherapy (76% vs 42%, p = 0.0007). The CTS5 was calculated for the subset of cases and controls with all relevant data available and was significantly higher in cases than in controls (3.63 vs 2.91, p = 0.0003). Histological subtype was evenly distributed between cases and controls, with 75% of all cancers being invasive ductal carcinoma. Data on menopausal status were lacking for many patients, as these data were rarely codified in older patient record systems. In the overall patient population, 28 patients were pre-menopausal, 27 were post-menopausal and 43 had unknown menopausal status. The most common site of metastasis was bone (55%), followed by liver, lung and nodal tissue; 40% of patients had more than one site of metastasis. Local recurrence was more common in cases (42%) than in controls (34%).
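For readers unfamiliar with the score, the following is a minimal sketch of how a CTS5 value could be computed from age, tumour size, grade and nodal status. The coefficients, the nodal coding (0 = node negative, 1 = one positive node, 2 = two to three, 3 = four to nine, 4 = more than nine) and the 30 mm cap on tumour size are not given in this section; they are reproduced from the published CTS5 definition as an assumption and should be checked against the original CTS5 publication before any use. The function and argument names are hypothetical.

```python
def cts5(age_years, size_mm, grade, nodes_group):
    """Sketch of the CTS5 score (coefficients assumed from the published definition).

    age_years   : age at diagnosis in years
    size_mm     : tumour size in mm (assumed to be capped at 30 mm)
    grade       : histological grade (1, 2 or 3)
    nodes_group : nodal category coded 0-4 (assumed coding, see lead-in)
    """
    size = min(size_mm, 30.0)  # assumption: sizes above 30 mm are treated as 30 mm
    return 0.438 * nodes_group + 0.988 * (
        0.093 * size - 0.001 * size ** 2 + 0.375 * grade + 0.017 * age_years
    )


# Hypothetical patient: 50 years old, 20 mm grade 2 tumour, node negative.
example = cts5(age_years=50, size_mm=20, grade=2, nodes_group=0)
print(round(example, 2))  # 3.02
```

Under these assumptions the hypothetical patient scores about 3.02, which sits between the median values reported above for controls (2.91) and cases (3.63).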

Table 1 Baseline characteristics of patients within the VLR study and METABRIC group > 10 yr

The demographic information for the > 10 yr recurrence and > 20 yr non-recurrence groups in METABRIC is also shown in Table 1. The METABRIC dataset showed a similar trend for differences in tumour size and nodal status and a highly significant difference in CTS5 between the cases and controls. The demographic information for the 0–5 yr and 5–10 yr groups is described in Supplementary Table 2.

The relationship between these clinicopathologic data and risk of DR in VLR and METABRIC is shown in Fig. 1A–E. The data from METABRIC for the time periods of 0–5 and 5–10 years after diagnosis are also shown to allow comparison with the relationships of the clinicopathologic parameters sooner after diagnosis. Cases with a DR beyond 10 years showed a clear and persistent excess of high nodal status, large tumour size and, to a lesser extent, high grade compared to controls, and this excess was consistent between the VLR and METABRIC datasets.

Fig. 1

Percentage of cases and controls in METABRIC and VLR according to A nodal status, B tumour size and C histopathologic grade. D Percentage of cases and controls in METABRIC and VLR according to PAM50 subtypes. E Percentage of cases and controls in METABRIC and VLR according to the age at diagnosis by decade. Three time intervals after diagnosis are shown for METABRIC: recurrence (R) in years 0–5 (0–5 yr R) vs no recurrence in years 0–5 (0–5 yr noR); recurrence in years 5–10 (5–10 yr R) vs no recurrence in years 0–10 (0–10 yr noR); recurrence after 10 years (10 yr R) vs no recurrence in years 0–20 (0–20 yr noR). Note that the 0–5 yr noR and 0–10 yr noR groups include patients who went on to recur at a later time

Copy number alteration (CNA)

There was no significant difference in the percentage of genome with a CNA between cases and controls. Cases in METABRIC similarly showed no significant difference in CNAs from controls (Supplementary Figure 2). In METABRIC, tumours from patients showing recurrence by 5 years had a significantly greater number of CNAs than those recurrence free beyond 5 years (p < 0.0001), but there was no difference between those recurring between 5 and 10 years and those who were recurrence free in years 0–10 (p = 0.23), consistent with the VLR data (Supplementary Figure 2). Thus, the importance of CNA for prognosis seems to be largely lost by 10 years, and possibly as early as 5 years, after diagnosis.

Although the exomic analysis of the limited gene set conducted in VLR provided less sensitivity for gains and losses than the pan-genome analysis conducted in METABRIC, the overall patterns of gains and losses are similar and there are no significantly altered regions in the late recurrence METABRIC data. No chromosomal regions were altered significantly differently between cases (> 10 yr R) and controls (0–20 yr noR) in either VLR or METABRIC (Supplementary Figure 3). Similarly, there are no regions with significant differences after correction for multiple testing between cases (5–10 yr R) and controls (0–10 yr noR) in METABRIC, in contrast to the many large chromosomal regions with highly significant differences for cases with earlier recurrences (Supplementary Figure 3).

Mutation detection

Overall, there was no difference in somatic mutational burden between cases and controls. There were trends for greater numbers of MAP3K1 and GATA3 mutations in controls compared to cases (p = 0.07 and 0.07, respectively; Fig. 2A). These trends were not significant after correction for multiple testing, but the pattern for both genes was also seen in the METABRIC dataset, strikingly so for GATA3. The trend towards a greater proportion of GATA3-mutated tumours in controls than in cases with increasing time to recurrence is evident in the METABRIC dataset (Fig. 2B), in contrast to TP53, which showed highly significant differences in early recurrences (p < 0.00001) but not in later recurrences (> 10 yr p = 0.1 and > 20 yr p = 0.82) (Fig. 2C). PIK3CA was the most commonly mutated gene in both cases and controls, concordant with data from METABRIC. A combined analysis of the VLR and METABRIC data is shown in Fig. 2D and emphasizes the apparently protective effect of GATA3 mutations (p = 0.005) for late DR, with little difference in the incidence of the other mutations.

Fig. 2

A Percentage of cases (red) and controls (blue) in VLR and METABRIC (> 10 yr R and > 20 yr noR) with a mutation in genes with at least 4 mutations overall in VLR; B, C in the combined VLR and METABRIC data for GATA3 and TP53, respectively, in comparison with earlier time intervals for METABRIC; D in the combined VLR and METABRIC data for the 13 genes common to both analyses and with at least 4 mutations in the VLR cohort

Gene expression

Figure 3 shows intrinsic molecular subtyping for VLR cases and controls. Distribution was largely as expected for an ER+ population with little difference between cases and controls. In particular, there was a similar proportion of cases and controls from within each of the luminal A and luminal B subtypes indicating no prognostic significance of these intrinsic subtypes beyond 10 years. The METABRIC data similarly showed no substantial differences in intrinsic subtypes between controls and cases after 10 years but did show the expected excess of luminal A tumours that were non-recurrent up to 5 years and between 5 and 10 years.

Fig. 3

A Heatmap with unsupervised clustering of all samples analysed according to the patterns of expression of genes found to be significant and B with samples ordered by DR. Molecular subtype shown by coloured bars (dark blue—luminal A, pale blue—luminal B, pink—HER2-enriched, red—basal, green—normal-like). Recurrence is shown by black bar, non-recurrence shown by no bar

Of particular note, while both ESR1 and PGR showed higher expression in non-recurrent tumours in the first 5 years of follow-up in METABRIC, neither showed a significant difference after 10 years in either METABRIC or VLR (Supplementary Figure 4A and B). Similarly, proliferation (based on the average expression of the 18 proliferation genes of the PAM50 gene set [26]) showed significantly higher expression in patients with recurrences at 0–5 and 5–10 years in METABRIC, but not after 10 years in either METABRIC or VLR (Supplementary Figure 4C).
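The proliferation measure described above is an average over a fixed gene list, which can be sketched as follows. The gene identifiers and expression values are hypothetical placeholders, not the 18 PAM50 proliferation genes or data from this study, and the handling of unmeasured genes is an assumption.

```python
from statistics import mean


def proliferation_score(expression, proliferation_genes):
    """Average the expression of the listed genes for one sample.

    expression          : dict mapping gene name -> expression value
    proliferation_genes : gene names to average (hypothetical placeholders here)
    Genes missing from the sample are skipped (an assumption of this sketch).
    """
    values = [expression[g] for g in proliferation_genes if g in expression]
    if not values:
        raise ValueError("none of the proliferation genes were measured")
    return mean(values)


# Hypothetical sample with three placeholder proliferation genes.
sample = {"GENE_A": 8.0, "GENE_B": 10.0, "GENE_C": 9.0, "OTHER": 2.0}
score = proliferation_score(sample, ["GENE_A", "GENE_B", "GENE_C"])
print(score)  # 9.0
```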

Sixty-five individual genes were differentially expressed between cases and controls by univariate analysis (unpaired t-test; Supplementary Table 3, Supplementary Figure 5). After correction for multiple testing, none of these remained significant. Similarly, there were no significantly differentially expressed genes in METABRIC (> 10 years recurrence vs > 20 years no recurrence) after correction for multiple testing.
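The section does not state which multiple-testing correction was applied. As an illustration only, the sketch below implements the Benjamini-Hochberg false discovery rate adjustment, one widely used choice, on hypothetical p-values; it shows how nominally significant univariate results can be revised upwards once the number of tests is taken into account.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, not taken from the study.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(q, 3) for q in benjamini_hochberg(raw)])  # [0.02, 0.04, 0.04, 0.02]
```

With only four hypothetical tests the adjustment is mild; the penalty grows with the number of genes tested, which is how a set of nominally significant genes can lose significance after correction.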

Unsupervised hierarchical clustering of the samples according to the expression of all analysed genes showed 2 distinct clusters, which separated according to molecular subtype rather than DR outcome status (Supplementary Figure 6).

Unsupervised hierarchical clustering of just the 65 significant genes highlighted three separate clusters of samples (Fig. 3A). The prevalence of recurrence differed significantly among the 3 clusters (χ² = 9.74, p = 0.0077). The most distinct of the 3 clusters contained a subset of 24 samples (extreme left-hand side of Fig. 3A). This cluster was enriched for luminal A and normal-like subtypes. Only 4 of the 24 patients (21%) and only 2 of the 14 (14%) luminal A in this cluster had a DR. These samples were characterized by high expression of immune-related genes and low expression of proliferation genes. Like the first cluster, the second cluster was dominated by luminal A subtype tumours, but it had a higher DR rate: 20/45 (63%) of the patients in this cluster and 16/22 (73%) of those with a luminal A tumour had a DR. Cluster 2 had a gene expression pattern largely opposite to that of the first cluster. The third cluster was dominated by the luminal B subtype and 20/45 patients had a DR. In general, this cluster showed less distinct gene expression groupings, but of note there was high expression of cell cycle and DNA replication-related genes. The second and third groups clustered more closely to one another than to the first, which itself had the most distinct pattern of gene expression. When samples were ordered by DR status (Fig. 3B) there was no distinct gene expression pattern.
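As a quick check on the reported cluster statistic, the sketch below converts a chi-square value into a p-value. It assumes a standard test of independence on a 3 (clusters) by 2 (DR vs no DR) table, which has (3 − 1) × (2 − 1) = 2 degrees of freedom; for 2 degrees of freedom the upper tail probability has the closed form exp(−χ²/2), so no statistics library is needed. The function name is hypothetical.

```python
import math


def chi2_upper_tail_df2(statistic):
    """Upper tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


clusters, outcomes = 3, 2
degrees_of_freedom = (clusters - 1) * (outcomes - 1)  # equals 2 for a 3 x 2 table

p_value = chi2_upper_tail_df2(9.74)
print(degrees_of_freedom, round(p_value, 4))  # 2 0.0077
```

Under this assumption the reported χ² of 9.74 corresponds to p ≈ 0.0077, matching the value given in the text.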

Gene set enrichment analysis

Significantly reduced expression of gene sets involved in epigenetic regulation and in cytokine and chemokine signalling was associated with recurrence in the NanoString Breast Cancer 360 module analysis (Table 2). Genes involved in cell immune response and apoptosis were identified from the gene ontology analysis, apoptotic genes from the KEGG gene set analysis, and cell cycle inhibition and DNA damage response genes from the Hallmarks gene set enrichment analysis. Following correction for multiple testing, none of these remained significant.

Table 2 Gene set enrichment data. Gene ratio describes the ratio of genes within each gene set to total significant genes either up (a) or down (b) in cases

None of the breast cancer 360 modules were significantly increased in cases. From the gene ontology analysis, KEGG and Hallmark gene sets, gene sets associated with DNA replication and cell cycle progression were found to be significantly increased in expression in cases compared to controls (Table 2, Supplementary Figure 7). Following correction for multiple testing none of these remained significant.
