BIRC5 expression by race, age and clinical factors in breast cancer patients

Study population

The Carolina Breast Cancer Study (CBCS) [28] is a population-based study that utilized rapid case ascertainment with the North Carolina Central Cancer Registry to identify women aged 20–74 years across 44 counties diagnosed with first primary breast cancer. Recruitment occurred in three phases: 1993–1996 (Phase 1), 1996–2001 (Phase 2), and 2008–2013 (Phase 3). Black women and younger women (< 50 years of age) were oversampled using randomized recruitment [28], such that the final study population is 50% Black and 50% < 50 years old at diagnosis. Out of 4806 invasive breast cancer cases enrolled across all phases, 2174 bulk tumor samples were profiled by NanoString (Phase 1: N = 259; Phase 2: N = 491; Phase 3: N = 1424). Exclusions included samples with depleted tissue (n = 1188, predominantly from CBCS Phases 1 and 2) or samples with low-quality or insufficient RNA (n = 241). This study was approved by the University of North Carolina at Chapel Hill (UNC-CH) School of Medicine Institutional Review Board in accordance with the revised U.S. Common Rule, and participants provided written informed consent.

Demographic and clinical characteristics

Health history and demographic variables were collected by a nurse during in-home interviews. Race was self-reported and categorized as Black and non-Black; > 94.7% of non-Black participants self-reported as White (n = 1005), while < 5.3% self-identified as either multiracial (n = 9, 0.85%), Hispanic (n = 15, 1.41%), American Indian/Eskimo (n = 8, 0.75%), Asian or Pacific Islander (n = 23, 2.17%) or Arab (n ≤ 5, < 1%). Importantly, we interpret race herein under a cells-to-society framework [29, 30] that defines race as a social construct, representing the culmination of biological, social (individual and community-level), and environmental exposures that differ by self-reported race. Tumor size, AJCC stage, estrogen receptor (ER), progesterone receptor (PR), and HER2 receptor status were abstracted from medical records and pathology reports.

Recurrence data were available for CBCS Phase 3 (2008–2013; n = 1424). Recurrence-free survival (RFS) was defined as the time from the date of diagnosis to the first local, regional, or distant breast cancer recurrence, as verified through medical record review. Recurrence data are complete through October 2019, with a 5-year follow-up completed for all study participants. Among 1424 eligible women, 50 participants were stage IV at diagnosis and were excluded from the recurrence analysis. Among the remaining 1374 patients with stage I-III breast cancer, 159 recurrences were identified within 5 years.

Gene expression data

Normalization, molecular subtyping, and BIRC5

RNA was isolated from bulk tumor tissue using the Qiagen FFPE RNeasy isolation kit (Germantown, MD), assayed using NanoString nCounter technology (Seattle, Washington), and normalized using Remove Unwanted Variation (RUV) as previously described [31, 32, 33]. PAM50 molecular subtyping was performed using a research version of the predictor to classify tumors as Luminal A, Luminal B, HER2-Enriched, Basal-like, or Normal-like, and to generate proliferation and risk of recurrence scores (ROR-PT) incorporating tumor size, proliferation and subtype [31, 34].

BIRC5 was considered as both a continuous and categorical variable. For continuous measures of BIRC5, log2-transformed gene expression was utilized in all analyses. Standardized clinical cutpoints do not exist for survivin/BIRC5, and while it is a target of both OncotypeDX [18] and Prosigna [19] multi-gene assays, single gene levels are not established. Therefore, for use as a categorical variable, BIRC5 expression was dichotomized into BIRC5-low and BIRC5-high expression categories using the upper limit of the third expression quartile as a cut point (Log2 3rd quartile cutpoint: CBCS = 7.6; TCGA = 9.4). Differences in the expression of BIRC5 between CBCS and TCGA are likely a result of the different mRNA platforms used in each study (i.e., NanoString in CBCS, RNAseq in TCGA). All tumors were treatment naïve at the time of collection and prior to NanoString assay assessing BIRC5 mRNA expression.
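The dichotomization described above can be illustrated with a minimal sketch. It is written in Python for illustration only (the study's analyses were run in R), the input values are hypothetical rather than CBCS or TCGA data, and both the quantile interpolation method and the choice to label values strictly above the cut point as BIRC5-high are assumptions not stated in the text.

```python
import math
import statistics


def dichotomize_at_q3(expression):
    """Log2-transform positive expression values and split them at the third quartile.

    Assumption: values strictly above the cut point are labelled BIRC5-high;
    the text does not say how ties at the cut point were handled.
    """
    log2_expr = [math.log2(x) for x in expression]
    # statistics.quantiles with n=4 returns the three cut points [Q1, Q2, Q3].
    q3 = statistics.quantiles(log2_expr, n=4, method="inclusive")[2]
    labels = ["BIRC5-high" if v > q3 else "BIRC5-low" for v in log2_expr]
    return q3, labels


# Hypothetical normalized expression values, not study data.
example = [64, 80, 100, 128, 150, 190, 256, 400]
cutpoint, groups = dichotomize_at_q3(example)
print(f"log2 Q3 cut point: {cutpoint:.2f}")
print(groups)
```

Because the cut point is computed within each dataset, it is platform-specific, which is consistent with the different CBCS and TCGA cut points reported above.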

Statistical analysis

Continuous BIRC5 expression levels were compared across race and clinical tumor characteristics using Welch’s two-sample t-tests. Generalized linear models (glm) with binomial distribution and the identity link function were used to calculate relative frequency differences (RFDs) and 95% confidence intervals (CIs) as the measure of association between BIRC5 expression categories and covariates of interest in CBCS. RFDs are defined as the percentage difference between index and referent groups, that is, the difference in the relative frequency of BIRC5-high tumors between levels of each demographic or clinical variable. Because of the smaller sample sizes in TCGA, RFDs could not be computed for that dataset, as several models failed to converge. Thus, to measure the strength of association between BIRC5-high and covariates of interest in both CBCS and TCGA, multivariable logistic regression was used to calculate odds ratios (ORs) and 95% CIs. Multivariable models were adjusted for age and race according to the CBCS randomized recruitment design in reduced models, and additionally adjusted for ER status and tumor stage in full models. In models comparing age or race, age comparisons were only adjusted for race, and race comparisons were only adjusted for age. Similarly, in models additionally adjusting for ER status and stage, ER comparisons were only adjusted for stage, and stage comparisons were only adjusted for ER status. Multivariable analyses relied on complete case analysis as rates of missingness were < 1.3% for all covariates. Normal-like tumors were excluded from analyses because this subtype arises from insufficient tumor cellularity [31].
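To make the two effect measures concrete, the sketch below computes a crude (unadjusted) RFD and OR from a single 2×2 table. It rests on several assumptions: it is written in Python rather than the R used in the study, the counts are hypothetical, and it omits the age, race, ER status, and stage adjustments described above. For a single binary covariate the identity-link binomial model is saturated, so its RFD estimate equals the simple difference in proportions, which is what the code uses.

```python
import math


def crude_rfd_and_or(high_index, n_index, high_ref, n_ref, z=1.96):
    """Crude RFD (percentage points) and OR with Wald 95% CIs from a 2x2 table.

    Unadjusted illustration only; the study fitted adjusted regression models.
    """
    p_index = high_index / n_index
    p_ref = high_ref / n_ref

    # RFD: difference in the proportion of BIRC5-high tumors (index - referent).
    rfd = p_index - p_ref
    se_rfd = math.sqrt(p_index * (1 - p_index) / n_index
                       + p_ref * (1 - p_ref) / n_ref)
    rfd_ci = (rfd - z * se_rfd, rfd + z * se_rfd)

    # OR: cross-product ratio, with the Wald interval built on the log scale.
    a, b = high_index, n_index - high_index
    c, d = high_ref, n_ref - high_ref
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (math.exp(math.log(odds_ratio) - z * se_log_or),
             math.exp(math.log(odds_ratio) + z * se_log_or))

    return {
        "rfd_pct": 100 * rfd,
        "rfd_ci_pct": tuple(100 * x for x in rfd_ci),
        "or": odds_ratio,
        "or_ci": or_ci,
    }


# Hypothetical counts: 60 of 200 index tumors and 40 of 200 referent tumors are BIRC5-high.
result = crude_rfd_and_or(60, 200, 40, 200)
print(result)
```

The RFD is an absolute measure on the percentage scale, whereas the OR is a relative measure. Identity-link binomial models can fail to converge when fitted probabilities approach 0 or 1, which is one plausible reason such models are harder to fit in smaller datasets.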

Kaplan–Meier curves and log-rank tests were used to compare time to recurrence across BIRC5 categories in stage I-III cases (n = 1374). Recurrence analyses were performed overall, across all tumor subtypes, and stratified by clinical breast cancer subtype (i.e., ER-positive/HER2-negative and TNBC). Hazard ratios (HRs) and 95% CIs were calculated using crude and multivariable Cox proportional hazards models adjusted for patient age and tumor stage. The Wald p-value was used to assess the assumption of proportionality. While there was evidence of non-proportional hazards, point estimates did not differ substantially between models. All statistical analyses were performed in R version 4.0.3.
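As an illustration of the Kaplan–Meier and log-rank methods named above, the following sketch implements the product-limit estimator and a two-group log-rank test in plain Python. It is a teaching example under stated assumptions, not the study's R code: the follow-up times (in months, censored at 60 to mirror 5-year follow-up) and recurrence indicators are hypothetical, and Cox models are not covered.

```python
import math


def kaplan_meier(times, events):
    """Product-limit estimate: [(event time, recurrence-free probability)]."""
    surv, curve = 1.0, []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        if d > 0:
            surv *= 1 - d / at_risk
            curve.append((t, surv))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 df) and its p-value."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})

    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n  # observed minus expected in group A
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Hypothetical data: months to recurrence (event = 1) or censoring (event = 0).
high_times, high_events = [6, 12, 20, 30, 60], [1, 1, 1, 0, 0]
low_times, low_events = [24, 48, 60, 60, 60], [1, 0, 0, 0, 0]

km_high = kaplan_meier(high_times, high_events)
chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
print(km_high, chi2, p_value)
```

The log-rank test compares the whole recurrence-free survival curves across groups by summing observed-minus-expected recurrences at every event time, rather than comparing a single summary such as a mean.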

Data availability

RNA sequencing and clinical data from the TCGA breast cancer dataset, including 1095 primary tumors, were used to compare and validate BIRC5 relationships identified in CBCS. These data are publicly available under dbGaP accession phs000178.v1.p1, with additional data available at https://gdc.cancer.gov/about-data/publications/PanCan-CellOfOrigin [35]. CBCS data are available upon request (https://unclineberger.org/cbcs).
