Risk Stratification of Early‐Stage Cervical Cancer with Intermediate‐Risk Factors: Model Development and Validation Based on Machine Learning Algorithm

Introduction

Every year, more than 500,000 women are diagnosed with cervical cancer (CC), and more than 300,000 deaths occur due to this disease worldwide [1]. Although cervical screening strategies have decreased the incidence of CC, data from global population-based CC registries have revealed that 5-year survival has improved only slightly in recent decades [2]. Furthermore, nearly 90% of deaths from CC occur in developing and low-resource countries [3]. The prognosis of patients with CC is closely related to the clinical staging system determined by the International Federation of Gynecology and Obstetrics (FIGO). Radical hysterectomy with pelvic lymphadenectomy is the preferred surgical plan for patients with early CC [4]. Surgical risk factors and lymph node status were first included in the FIGO staging system in 2018 [5]. In addition to lymph node status, other pathological risk indicators widely recognized to affect survival and recurrence in CC include parametrial infiltration and positive resection margins [6-9]. Furthermore, the Gynecologic Oncology Group (GOG) defined lymphovascular space invasion (LVSI), stromal invasion (SI), and tumor size as “Sedlis criteria,” which are intermediate-risk factors used to guide adjuvant treatment decisions [10].

For patients with early-stage CC who meet the Sedlis criteria, controversies remain regarding adjuvant therapy after surgical treatment. The European Society for Medical Oncology clinical practice guidelines for CC recommend that patients with intermediate-risk factors do not need further adjuvant therapy (evidence level II, B) [3]. The FIGO CC report recommends postoperative radiotherapy, but not chemotherapy, if a patient exhibits any two of the following risk factors: tumor size more than 4 cm, LVSI, and deep SI [5]. The National Comprehensive Cancer Network clinical practice guidelines for CC recommend pelvic external beam radiation therapy (category 1) with or without concurrent platinum-containing chemotherapy (category 2B for chemotherapy) for lymph node–negative, postsurgery patients who were diagnosed at stage IA2, IB1, or IIA1 and have large primary tumors, deep SI, and/or LVSI [11].

Therefore, evaluation of intermediate-risk factors and adjuvant therapy remains controversial, and potential intermediate-risk factors for both recurrence and survival may include factors beyond the Sedlis criteria. Patients with early-stage CC typically have a favorable prognosis, and radiotherapy is often associated with considerable adverse effects during adjuvant therapy [12]. Thus, treating clinicians must identify a suitable postoperative management strategy to avoid overtreatment. At present, no published models predict overall survival (OS) or disease-free survival (DFS) in patients with early-stage CC from pathological intermediate-risk factors.

The objective of our study is to establish prognostic evaluation models and stratify the prognostic risk in patients with CC with intermediate-risk factors. Our results provide a more individualized reference for planning postoperative adjuvant treatment in clinical practice.

Materials and Methods

Patients

After the study obtained approval from the Ethical Committee of Qilu Hospital of Shandong University (protocol number 2018 066) and received a waiver for informed consent, 481 patients with CC who had received treatment at Qilu Hospital of Shandong University between January 2005 and December 2016 were included in our study. All patients met the following inclusion criteria: (a) stage IB–IIA CC according to the 2009 FIGO staging system [13]; (b) primary treatment by radical or modified radical hysterectomy and pelvic lymphadenectomy. The exclusion criteria were (a) lymph node metastasis, parametrial involvement, or positive resection margin after surgery; (b) the presence of other primary malignant tumors; (c) insufficient medical records.

Predictors and Endpoints

The following clinical characteristics were included: age, FIGO stage (2009), histology, histological grade, LVSI, SI, tumor size, adjuvant therapy after surgery, and survival and recurrence information. Recurrence and death were defined as the primary outcomes of this study. DFS was defined as the time interval from surgery to the first evidence of any recurrence or to last follow-up, and OS as the time interval from surgery to death or to last follow-up. The tumor size was measured by clinical palpation.
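The endpoint definitions above amount to a time-to-event record with a censoring indicator. The following is a minimal sketch of that bookkeeping, not the authors' code: the function names, the month conversion factor, and the example dates are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional, Tuple

AVERAGE_DAYS_PER_MONTH = 30.4375  # assumed conversion factor (365.25 / 12)


def months_between(start: date, end: date) -> float:
    """Approximate interval in months between two dates."""
    return (end - start).days / AVERAGE_DAYS_PER_MONTH


def time_to_event(surgery: date, event: Optional[date],
                  last_follow_up: date) -> Tuple[float, int]:
    """Return (time in months, indicator).

    indicator is 1 when the event (recurrence for DFS, death for OS) was
    observed and 0 when the patient is censored at last follow-up.
    """
    if event is not None:
        return months_between(surgery, event), 1
    return months_between(surgery, last_follow_up), 0


# Hypothetical patient: recurrence observed, alive at last follow-up.
surgery = date(2010, 1, 15)
recurrence = date(2012, 1, 15)
last_seen = date(2014, 1, 15)

dfs_time, dfs_event = time_to_event(surgery, recurrence, last_seen)
os_time, os_event = time_to_event(surgery, None, last_seen)
print(round(dfs_time, 1), dfs_event, round(os_time, 1), os_event)
```

In this toy case the patient contributes an observed event to the DFS analysis and a censored observation to the OS analysis.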

Model Development

The workflow of this study is presented in Figure 1. Each characteristic was estimated using univariable Cox survival analysis for both DFS and OS, and results were described as hazard ratios (HRs), with associated 95% confidence intervals (CIs), and p values. After selection of risk factors, the age, pathological risk factors as defined in the Sedlis criteria (LVSI, SI, and tumor size), and adjuvant treatment methods were entered into a multivariable Cox proportional hazards regression analysis to construct intermediate-risk prediction models. The time-dependent receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC) were used to evaluate the discrimination ability of the model. Nomogram lists were developed to predict the risk of 2- and 5-year DFS, as well as 2- and 5-year OS. We then calculated the risk score of each patient according to the nomogram lists, selecting the median risk score as the cutoff value. Patients were divided into low-risk and high-risk groups according to risk score. A heatmap was generated based on the distribution of risk factors in the two groups. The Kaplan-Meier method with the log-rank test was used to compare the abilities of the traditional Sedlis criteria and the new risk groups from the developed model to distinguish prognoses.
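For readers unfamiliar with the Kaplan-Meier method used in this workflow, the product-limit estimate can be sketched in a few lines. The study itself ran its survival analyses in R; the pure-Python function below, its name, and the toy follow-up data are illustrative assumptions only.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate as a list of (event time, survival probability).

    events[i] is 1 if the event was observed at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in months; 1 = event, 0 = censored.
times = [6, 12, 12, 20, 30, 45]
events = [1, 0, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients leave the risk set without lowering the curve, which is why the survival probability only steps down at times with an observed event.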

Figure 1. The workflow of this study. Abbreviations: Ada, AdaBoost; DFS, disease-free survival; DT, decision tree; KNN, k-nearest neighbor; LR, logistic regression; ML, machine learning; NB, naïve Bayes; OS, overall survival; RF, random forest; ROC, receiver operating characteristic; SVM, support vector machine.

Model Validation

Currently, machine learning (ML) is often used in the development and validation of prediction models in clinical research [14]. We divided patients into four groups according to whether there was recurrence or death within 2 and 5 years of the primary surgery. ML algorithms, including logistic regression (LR), support vector machine (SVM), random forest (RF), decision tree, k-nearest neighbor, naïve Bayes, and AdaBoost, were used for model validation. Fivefold cross-validation was applied for each algorithm. All patients were randomly partitioned into five equal-sized subsamples. In each round, four subsamples were used as training data, and the remaining subsample was used as validation data for testing. AUCs were calculated over multiple rounds of cross-validation to assess the models.
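The fivefold partition and the AUC calculation described above can be sketched with the Python standard library alone. This is a stand-in for illustration, not the scikit-learn pipeline the study used; the function names, the seed, and the toy scores are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle indices 0..n-1 and return k (training, validation) index pairs."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    splits = []
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, folds[i]))
    return splits


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


splits = k_fold_indices(481, k=5, seed=42)
print([len(validation) for _, validation in splits])
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

With 481 patients the folds cannot be exactly equal, so one fold holds 97 patients and the others 96. With rare outcomes, a fold may contain no events at all, in which case this `auc` helper raises an error; stratified splitting is one common way to avoid that.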

Statistical Analysis

The descriptive statistical analysis, univariate and multivariate Cox proportional hazard regression analysis, nomogram lists, ROC analysis, and log-rank test were conducted with R (version 3.6.1). The ML algorithms were conducted in Python (version 3.6.4) using the machine learning library scikit-learn (version 0.19.1).

Results

Patient Characteristics

Patient characteristics are reported in Table 1. Of the 481 patients with early-stage CC, 344 (71.5%) women were diagnosed at stage IB1, 94 (19.5%) at stage IB2, 25 (5.2%) at stage IIA1, and 18 (3.7%) at stage IIA2. Most patients underwent laparotomy with radical hysterectomy (n = 385, 80%), and 389 (80.9%) patients received adjuvant therapy after surgery. After a median follow-up period of 31 months (range, 9–145 months), 35 (7.3%) patients experienced recurrences, and 20 (4.2%) died. The 2-year DFS and OS were 94.1% and 98.0%, respectively, and the 5-year DFS and OS were 77.9% and 82.7%, respectively.

Table 1. Characteristics of patients

| Characteristic | Total (n = 481) | Recurrence (n = 35, 7.3%) | Death (n = 20, 4.2%) |
|---|---|---|---|
| Age, years | | | |
| ≤40 | 137 (28.5) | 5 (14.3) | 2 (10.0) |
| >40 | 344 (71.5) | 30 (85.7) | 18 (90.0) |
| FIGO stage (2009) | | | |
| IB1 | 344 (71.5) | 24 (68.6) | 15 (75.0) |
| IB2 | 94 (19.5) | 9 (25.7) | 4 (20.0) |
| IIA1 | 25 (5.2) | 2 (5.7) | 1 (5.0) |
| IIA2 | 18 (3.7) | 0 (0.0) | 0 (0.0) |
| Operation method | | | |
| Laparotomy | 385 (80.0) | 30 (85.7) | 18 (90.0) |
| Laparoscopy | 96 (20.0) | 5 (14.3) | 2 (10.0) |
| Histology | | | |
| Squamous | 390 (81.1) | 9 (25.7) | 5 (25.0) |
| Nonsquamous | 91 (18.9) | 26 (74.3) | 15 (75.0) |
| Histological grade | | | |
| I (well differentiated) | 41 (8.5) | 2 (5.7) | 13 (65.0) |
| II (moderately differentiated) | 152 (31.6) | 9 (25.7) | 5 (25.0) |
| III (poorly differentiated) | 288 (59.9) | 24 (68.6) | 2 (10.0) |
| LVSI | | | |
| No | 376 (78.2) | 27 (77.1) | 16 (80.0) |
| Yes | 105 (21.8) | 8 (22.9) | 4 (20.0) |
| Stromal invasion | | | |
| Superficial 1/3 | 134 (27.9) | 7 (20.0) | 4 (20.0) |
| Middle 1/3 | 185 (38.5) | 9 (25.7) | 4 (20.0) |
| Deep 1/3 | 162 (33.7) | 19 (54.3) | 12 (60.0) |
| Tumor size, cm | | | |
| <2 | 102 (21.4) | 5 (14.3) | 2 (10.0) |
| ≥2 | 280 (58.2) | 17 (48.6) | 10 (50.0) |
| ≥4 | 58 (12.1) | 7 (20.0) | 5 (25.0) |
| ≥5 | 40 (8.3) | 6 (17.1) | 3 (15.0) |
| Adjuvant therapy | | | |
| None | 92 (19.1) | 8 (22.9) | 5 (25.0) |
| Chemotherapy | 222 (46.2) | 14 (40.0) | 6 (30.0) |
| Radiotherapy | 20 (4.2) | 2 (5.7) | 1 (5.0) |
| Chemotherapy and radiotherapy | 147 (30.6) | 11 (31.4) | 8 (40.0) |

Values are presented as n (%). Abbreviations: FIGO, International Federation of Gynecology and Obstetrics; LVSI, lymphovascular space invasion.

Predictor Assessment of DFS and OS

As shown in Table 2, there was a significant association between age older than 40 years and postoperative recurrence (HR 2.63, 95% CI 1.02–6.78, p = .046), whereas the association between this age group and death did not reach statistical significance (HR 3.95, 95% CI 0.92–17.02, p = .065). Furthermore, the FIGO stage (2009), operation method, histology, histological grade, LVSI, SI, tumor size, and adjuvant therapy were not significantly associated with either DFS or OS.
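The hazard ratios, confidence intervals, and p values for age can be cross-checked against one another. The sketch below assumes the reported intervals are Wald intervals on the log-HR scale, which the paper does not state; the helper name is hypothetical. Under that assumption the point estimate is the geometric mean of the interval bounds, and the standard error and two-sided p value can be recovered from the interval.

```python
import math
from typing import Tuple

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_from_ci(hr: float, lower: float, upper: float) -> Tuple[float, float, float]:
    """Recover the log-HR standard error, z statistic, and two-sided p value
    from a hazard ratio and its 95% confidence interval."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_975)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Age > 40 years, values taken from Table 2.
_, _, p_dfs = wald_from_ci(2.63, 1.02, 6.78)
_, _, p_os = wald_from_ci(3.95, 0.92, 17.02)

print(round(math.sqrt(1.02 * 6.78), 2))  # geometric mean of the DFS interval bounds
print(round(p_dfs, 3), round(p_os, 3))
```

The recovered p values land close to the reported .046 and .065. Small differences are expected because the published HR and CI are rounded and the software may have used a different test.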

Table 2. Univariable Cox proportional hazards regression analysis for DFS and OS

| Characteristic | DFS HR (95% CI) | DFS p value | OS HR (95% CI) | OS p value |
|---|---|---|---|---|
| Age, years | | .046 | | .065 |
| ≤40 | Reference | | Reference | — |
| >40 | 2.63 (1.02–6.78) | | 3.95 (0.92–17.02) | |
| FIGO stage (2009) | | .989 | | .933 |
| IB1 | Reference | — | Reference | — |
| IB2 | 1.06 (0.49–2.28) | .891 | 0.70 (0.23–2.10) | .519 |
| IIA1 | 1.29 (0.30–5.44) | .733 | 1.06 (0.14–8.02) | .956 |
| IIA2 | — | — | — | — |
| Operation method | | .350 | | .374 |
| Laparotomy | Reference | | Reference | |
| Laparoscopy | 1.61 (0.60–4.32) | | 2.02 (0.43–9.45) | |
| Histology | | .378 | | .640 |
| Squamous | Reference | | Reference | |
| Nonsquamous | 0.71 (0.33–1.52) | | 0.79 (0.29–2.16) | |
| Histological grade | | .561 | | .648 |
| I (well differentiated) | Reference | — | Reference | — |
| II (moderately differentiated) | 1.47 (0.35–6.23) | .600 | 0.73 (0.16–3.22) | .672 |
| III (poorly differentiated) | 0.99 (0.21–4.60) | .992 | 0.49 (0.10–2.54) | .398 |
| LVSI | | .385 | | .551 |
| No | Reference | | Reference | |
| Yes | 1.42 (0.64–3.14) | | 1.40 (0.47–4.20) | |
| Stromal invasion | | .130 | | .202 |
| Superficial 1/3 | Reference | — | Reference | — |
| Middle 1/3 | 1.04 (0.39–2.78) | .946 | 0.82 (0.21–3.30) | .784 |
| Deep 1/3 | 2.02 (0.85–4.82) | .111 | 2.04 (0.66–6.33) | .218 |
| Tumor size, cm | | .126 | | .239 |
| <2 | Reference | — | Reference | — |
| ≥2 | 1.24 (0.46–3.35) | .678 | 1.82 (0.40–8.30) | .441 |
| ≥4 | 2.29 (0.73–7.23) | .157 | 3.88 (0.75–20.02) | .105 |
| ≥5 | 3.09 (0.94–10.13) | .063 | 4.00 (0.67–23.96) | .129 |
| Adjuvant therapy | | .822 | | .363 |
| None | Reference | — | Reference | — |
| Chemotherapy | 0.71 (0.30–1.69) | .435 | 0.49 (0.15–1.60) | .237 |
| Radiotherapy | 0.72 (0.15–3.43) | .682 | 0.52 (0.06–4.50) | .555 |
| Chemotherapy and radiotherapy | 0.97 (0.39–2.41) | .942 | 1.21 (0.40–3.70) | .740 |

Abbreviations: CI, confidence interval; DFS, disease-free survival; FIGO, International Federation of Gynecology and Obstetrics; HR, hazard ratio; LVSI, lymphovascular space invasion; OS, overall survival.

Model Development of DFS and OS

Multivariable Cox proportional hazards regression used age, LVSI, SI, tumor size, and adjuvant treatment method to develop prediction models for DFS and OS. The two models are shown as nomogram lists in Figure 2.

Risk score for DFS = 100 × (age >40 years) + 66.8 × (LVSI [+]) + 3.5 × (middle 1/3 invasion) + 62.8 × (deep 1/3 invasion) + 14.4 × (tumor size ≥2 cm) + 79.5 × (tumor size ≥4 cm) + 98.4 × (tumor size ≥5 cm) + 66.8 × (no adjuvant treatment) + 12.2 × (chemotherapy) + 19.7 × (chemotherapy and radiotherapy).

Risk score for OS = 100 × (age >40 years) + 51.9 × (LVSI [+]) + 13.3 × (superficial 1/3 invasion) + 54.8 × (deep 1/3 invasion) + 37.1 × (tumor size ≥2 cm) + 99.3 × (tumor size ≥4 cm) + 77.6 × (tumor size ≥5 cm) + 80.5 × (no adjuvant treatment) + 10.4 × (chemotherapy) + 60.3 × (chemotherapy and radiotherapy).
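The two risk score formulas can be evaluated as simple point lookups. The sketch below transcribes the published coefficients and rests on two assumptions that the text does not spell out: each category that is absent from a formula scores zero, and the tumor size bands are mutually exclusive (a patient falls into exactly one of <2, ≥2, ≥4, or ≥5 cm, as in Table 1). The dictionary keys and the function name are hypothetical.

```python
from typing import Dict

PointTable = Dict[str, Dict[str, float]]

# Points transcribed from the DFS formula; unlisted categories assumed to be 0.
DFS_POINTS: PointTable = {
    "age": {"<=40": 0.0, ">40": 100.0},
    "lvsi": {"no": 0.0, "yes": 66.8},
    "stromal_invasion": {"superficial": 0.0, "middle": 3.5, "deep": 62.8},
    "tumor_size": {"<2": 0.0, ">=2": 14.4, ">=4": 79.5, ">=5": 98.4},
    "adjuvant": {"none": 66.8, "chemotherapy": 12.2,
                 "radiotherapy": 0.0, "chemoradiotherapy": 19.7},
}

# Points transcribed from the OS formula; unlisted categories assumed to be 0.
OS_POINTS: PointTable = {
    "age": {"<=40": 0.0, ">40": 100.0},
    "lvsi": {"no": 0.0, "yes": 51.9},
    "stromal_invasion": {"superficial": 13.3, "middle": 0.0, "deep": 54.8},
    "tumor_size": {"<2": 0.0, ">=2": 37.1, ">=4": 99.3, ">=5": 77.6},
    "adjuvant": {"none": 80.5, "chemotherapy": 10.4,
                 "radiotherapy": 0.0, "chemoradiotherapy": 60.3},
}


def risk_score(patient: Dict[str, str], points: PointTable) -> float:
    """Sum the points for the single category the patient has in each factor."""
    return sum(points[factor][patient[factor]] for factor in points)


# Hypothetical patient, not taken from the study cohort.
patient = {
    "age": ">40",
    "lvsi": "yes",
    "stromal_invasion": "deep",
    "tumor_size": ">=4",
    "adjuvant": "chemotherapy",
}
print(round(risk_score(patient, DFS_POINTS), 1))
print(round(risk_score(patient, OS_POINTS), 1))
```

For this hypothetical patient the DFS score is 100 + 66.8 + 62.8 + 79.5 + 12.2 = 321.3 and the OS score is 100 + 51.9 + 54.8 + 99.3 + 10.4 = 316.4.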

Figure 2. Nomogram lists of risk prediction models for DFS (A) and OS (B). Abbreviations: DFS, disease-free survival; LVSI, lymphovascular space invasion; OS, overall survival.

ROC curves for both models are shown in supplemental online Figure 1. The recurrence model yielded an AUC of 0.74 for 2-year DFS and 0.66 for 5-year DFS (supplemental online Fig. 1A). The death model yielded an AUC of 0.87 for 2-year OS and 0.69 for 5-year OS (supplemental online Fig. 1B). The time-dependent ROC curve is shown in supplemental online Figure 1C. To determine the effect of risk score on clinical outcome, we divided the cohort at the median risk score. In the recurrence model, patients with a risk score higher than 167 points were placed in the high risk of recurrence group (supplemental online Fig. 2A). In the death model, patients with a risk score higher than 194 points were placed in the high risk of death group (supplemental online Fig. 2B). The survival and recurrence time of each patient is shown in supplemental online Figure 2C, 2D. We compared the distribution of prognostic factors in the two groups by heatmap (supplemental online Fig. 2E, 2F), in which the red color indicates higher significance. Patients in the high-risk groups were older, had larger tumors, and more often had positive LVSI and deep cervical stromal invasion.
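The median split described above can be sketched as follows. The cutoffs of 167 and 194 points are the values reported in the text; the function names and the toy scores are assumptions, and "higher than" is read as a strict inequality.

```python
from statistics import median
from typing import List

DFS_CUTOFF = 167.0  # reported median risk score, recurrence model
OS_CUTOFF = 194.0   # reported median risk score, death model


def risk_group(score: float, cutoff: float) -> str:
    """Assign 'high' when the score is strictly above the cutoff, else 'low'."""
    return "high" if score > cutoff else "low"


def split_by_median(scores: List[float]) -> List[str]:
    """Derive the cutoff from the cohort itself and label every patient."""
    cutoff = median(scores)
    return [risk_group(s, cutoff) for s in scores]


# Hypothetical DFS risk scores for five patients.
scores = [100.0, 114.4, 166.8, 181.2, 321.3]
print(split_by_median(scores))
print(risk_group(321.3, DFS_CUTOFF), risk_group(167.0, DFS_CUTOFF))
```

Because ties at the cutoff go to the low-risk group, the two groups need not be exactly equal in size, which is consistent with the uneven group sizes a median split can produce when many patients share the same score.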

Comparison of the Traditional Sedlis Criteria Groups and New Risk Groups

Patient distribution according to the traditional Sedlis criteria groups and the new risk groups defined in this study is shown in Table 3. The Kaplan-Meier analysis showed that the Sedlis criteria could not distinguish DFS or OS in patients with CC (p > .05; Fig. 3A–3D), whereas the high-risk groups for both recurrence and death were significantly associated with poor DFS (p = .001; Fig. 3E) and OS (p = .011; Fig. 3F).
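The group comparisons above rely on the log-rank test. A minimal two-group version is sketched below in pure Python to show what the statistic measures; the study ran this test in R, and the function name and the toy data are assumptions.

```python
import math
from typing import List, Tuple


def logrank_test(times_a: List[float], events_a: List[int],
                 times_b: List[float], events_b: List[int]) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, p value, 1 df)."""
    all_times = times_a + times_b
    all_events = events_a + events_b
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A just before t
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B just before t
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = observed_minus_expected ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p


# Hypothetical follow-up in months; 1 = event, 0 = censored.
high_times, high_events = [5, 8, 12, 20, 24], [1, 1, 1, 1, 0]
low_times, low_events = [30, 36, 40, 48, 60], [0, 1, 0, 0, 0]
chi2, p = logrank_test(high_times, high_events, low_times, low_events)
print(round(chi2, 2), round(p, 3))
```

At each event time the test compares the events observed in one group with the number expected if both groups shared the same hazard, so two groups with identical follow-up give a statistic of zero.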

Table 3. The information of patients in traditional Sedlis criteria groups and new risk groups

| Group | Total (n = 481) | Recurrence (n = 35, 7.3%) | Death (n = 20, 4.2%) |
|---|---|---|---|
| Sedlis criteria (detailed) | | | |
| None | 337 (70.1) | 22 (62.9) | 13 (65.0) |
| LVSI + Deep 1/3 | 32 (6.7) | 3 (8.6) | 1 (5.0) |
| LVSI + Middle 1/3 + Tumor size ≥2 cm | 39 (8.1) | 3 (8.6) | 2 (10.0) |
| LVSI + Superficial 1/3 + Tumor size ≥5 cm | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Middle or deep 1/3 + Tumor size ≥4 cm | 73 (15.2) | 7 (20.0) | 4 (20.0) |
| Sedlis criteria | | | |
| No | 337 (70.1) | 22 (62.9) | 13 (65.0) |
| Yes | 144 (29.9) | 13 (37.1) | 7 (35.0) |
| Risk group of recurrence | | | |
| Low-risk group | 224 (46.6) | 7 (20.0) | 4 (20.0) |
| High-risk group | 257 (53.4) | 28 (80.0) | 16 (80.0) |
| Risk group of death | | | |
| Low-risk group | 248 (51.6) | 10 (28.6) | 4 (20.0) |
| High-risk group | 233 (48.4) | 25 (71.4) | 16 (80.0) |

Values are presented as n (%). Abbreviation: LVSI, lymphovascular space invasion.

Figure 3. Kaplan-Meier analysis of the Sedlis criteria (detailed) with DFS (A) and OS (B); Sedlis criteria with DFS (C) and OS (D); and risk group with DFS (E) and OS (F). Abbreviations: DFS, disease-free survival; LVSI, lymphovascular space invasion; OS, overall survival.

Model Training and Validation Based on ML Algorithms

For further validation, ML algorithms were applied to verify the discrimination ability of the two models. Figure 4 shows the results from the fivefold cross-validation. The AUC of the 2-year DFS prediction model ranged from 0.61 to 0.69, with the highest AUC produced by the SVM algorithm (Fig. 4A). In the 5-year DFS model, the AUC ranged from 0.64 to 0.69, and the LR algorithm yielded the highest AUC (Fig. 4B). The best AUC was obtained for the 2-year OS prediction model, for which it ranged from 0.84 to 0.88 (Fig. 4C), and the AUC of the 5-year OS prediction model ranged from 0.60 to 0.63 (Fig. 4D). Overall, the LR and SVM algorithms showed better discrimination than the other five algorithms in the fivefold cross-validation process.

Figure 4. ROC curves of the ML-based validation for patient recurrence within 2 years after surgery (A) and 5 years after surgery (B), and for patient death within 2 years after surgery (C) and 5 years after surgery (D). Abbreviations: Ada, AdaBoost; AUC, area under the receiver operating characteristic curve; DT, decision tree; KNN, k-nearest neighbor; LR, logistic regression; NB, naïve Bayes; RF, random forest; SVM, support vector machine.

Discussion

The present paper describes the development of prediction models for DFS and OS in patients with early-stage CC based on pathological intermediate-risk factors and postoperative adjuvant therapy. The prediction models were constructed from age, LVSI, depth of SI, tumor size, and postoperative adjuvant therapy. We found significant differences in DFS and OS between patients in the high-risk and low-risk groups and showed that risk grouping can be used to stratify the prognostic risk of early-stage CC in patients with intermediate-risk factors. Finally, we used ML algorithms to validate the models. Our results suggest that this prognostic risk assessment model could be used in clinical practice as a potential postoperative evaluation tool for patients with early-stage CC.

The Sedlis criteria, which were proposed by the GOG through a prospective study, are currently widely used and include three factors (LVSI, depth of SI, and tumor size) [10]. A retrospective study of CC showed that although 50% of recurrences in the study occurred in patients who did not meet t
