Automated graded prognostic assessment for patients with hepatocellular carcinoma using machine learning

Patient characteristics

A total of 555 patients (mean age, 63.8 years ± 8.9 [standard deviation]; 118 females) with treatment-naïve HCC and multiphasic contrast-enhanced MRI at the time of diagnosis were included in the study. Patients were excluded if they had no MRI at baseline (n = 501), were younger than 18 years (n = 2), had missing clinical information (n = 16), had no triphasic image acquisition (n = 84), or had a non-diagnostic MRI (n = 14) (Fig. 1). Patient baseline characteristics are summarized in Table 1, and MRI parameters are reported in Supplemental Table 1. HCC was proven either by imaging criteria or by histopathology.

Fig. 1

Flowchart of patient inclusion and exclusion. From an institutional database with 1172 patients, 555 patients (118 females, 437 males, 63.8 ± 8.9 years) with imaging- or histopathologically proven treatment-naïve hepatocellular carcinoma and baseline multiphasic contrast-enhanced magnetic resonance imaging at the time of diagnosis were included in the study

Table 1 Patient baseline characteristics

A total of 287 (51.7%) patients died after a median of 14.40 months (range, 0.20–97.12 months; interquartile range [IQR], 22.23 months) after the date of imaging, and patients were followed up for a median of 32.47 months (range, 0.20–118.90 months; IQR, 61.5 months) after the date of imaging. The median time between the laboratory results and the imaging date was 10 days (IQR, 28.3 days). First treatments, based on the institution’s multidisciplinary tumor board decisions, were as follows: 192 (34.6%) patients received transarterial chemoembolization, 138 (24.9%) thermal ablation, 82 (14.8%) hepatectomy, 68 (12.3%) a combination of transarterial chemoembolization and thermal ablation, 24 (4.3%) sorafenib, 24 (4.3%) best supportive care, 20 (3.6%) transarterial radioembolization with yttrium-90, and 7 (1.3%) liver transplantation. For model development and validation, 471 (85%) patients were randomly allocated to the development cohort and 84 (15%) to the independent validation cohort.

Survival model

Figure 2 summarizes the entire model development pipeline. The proposed model attained C-indices of 0.8503 and 0.8234 in the development and validation cohorts, respectively. Table 2 summarizes all performance metrics for the proposed model and the conventional clinical staging systems. To aid interpretability, Fig. 3 depicts the variable importance scores of the included variables. On average, the proposed framework required a running time of 1.11 min per patient (automated liver segmentation, 0.70 s; extraction of the 23 included radiomic features, 1.09 min; model prediction, 0.42 s).

Fig. 2

Model development. An automated liver segmentation framework was adopted for radiomic feature extraction after automated image co-registration. To predict overall survival, a random survival forest was fit from a combination of clinical and radiomic variables. Model performance was evaluated using Harrell’s C-index and the area under the time-dependent receiver operating characteristic curve (AUC). Patients were stratified into low-, intermediate-, and high-risk groups based on their predicted risk scores
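Harrell’s C-index, the headline metric above, can be illustrated with a short sketch. The function name, the toy data, and the tie handling (equal risk scores count as half-concordant, equal follow-up times are skipped) are assumptions made for illustration; this is not the authors’ evaluation code.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[bool],
                    risk_scores: Sequence[float]) -> float:
    """Fraction of comparable patient pairs ranked correctly by risk score.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (death). It is concordant when that patient also
    has the higher predicted risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:  # patient i died before j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # assumption: ties get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, event flag, predicted risk.
toy_times = [5.0, 10.0, 15.0, 20.0]
toy_events = [True, True, False, True]
toy_risk = [200.0, 150.0, 100.0, 50.0]
print(harrell_c_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-indices should be read.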

Table 2 Performance evaluation

Fig. 3

Variable importance scores. The bar chart shows the mean variable importance score (error bars show standard deviation) of each included variable of the final risk prediction model obtained by 10 permutations of random shuffling. Naming convention of radiomic features: The prefix specifies the image type (original image or filter-derived MR image (“log”: Laplacian of Gaussian) with extraction parameters); the suffix specifies the MR contrast phase (“_pre”: pre-contrast phase, “_art”: late arterial phase, “_pv”: portal venous phase, “_del”: delayed phase). Equations for the calculation of each radiomic feature are available in ref. [22]. (AFP: Alpha-fetoprotein; INR: international normalized ratio; PTT: partial thromboplastin time)
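The idea behind permutation-based variable importance can be sketched as follows: shuffle one variable, re-score the model, and record the drop in performance, repeating the shuffle several times to obtain a mean and a standard deviation. The helper name, the toy scoring function, and the use of the sample standard deviation are assumptions for illustration; the source does not state these details.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def permutation_importance(
    score_fn: Callable[[Sequence[List[float]]], float],
    rows: Sequence[List[float]],
    feature_index: int,
    n_repeats: int = 10,
    seed: int = 0,
) -> Tuple[float, float]:
    """Mean and SD of the score drop after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = score_fn(rows)
    drops = []
    for _ in range(n_repeats):
        column = [row[feature_index] for row in rows]
        rng.shuffle(column)  # break the link between this feature and outcome
        shuffled = [
            list(row[:feature_index]) + [value] + list(row[feature_index + 1:])
            for row, value in zip(rows, column)
        ]
        drops.append(baseline - score_fn(shuffled))
    return statistics.mean(drops), statistics.stdev(drops)


# Hypothetical toy setup: the "model" predicts the target from feature 0 only.
targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rows = [[t, 10.0 * i] for i, t in enumerate(targets)]


def toy_score(data: Sequence[List[float]]) -> float:
    # Negative mean absolute error of a model that simply returns feature 0.
    return -sum(abs(r[0] - t) for r, t in zip(data, targets)) / len(targets)


used_mean, used_sd = permutation_importance(toy_score, rows, feature_index=0)
unused_mean, unused_sd = permutation_importance(toy_score, rows, feature_index=1)
print(used_mean, used_sd, unused_mean, unused_sd)
```

In this toy setup the variable the model ignores receives an importance of zero, while shuffling the variable it relies on lowers the score.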

Mortality risk predictions and graded prognostic assessment

The distribution of risk scores in the development and validation cohorts is shown in Fig. 4. The mean (± standard deviation) predicted risk score was 121.57 (± 65.31) in the development cohort and 135.64 (± 67.59) in the validation cohort. Cox proportional hazards regression analysis showed a highly significant association between the predicted risk score and overall survival (OS) in the development cohort (coefficient, 0.021658 [p < .00001]; hazard ratio [HR], 1.022 [95% CI: 1.02, 1.024]) and in the validation cohort (coefficient, 0.021676 [p < .00001]; HR, 1.022 [95% CI: 1.016, 1.028]). The cutoff values determined by the hierarchical method to stratify patients into low-, intermediate-, and high-risk groups based on the proposed model’s predicted risk scores were 93.08 and 172.73. Detailed results of the Cox proportional hazards regression analysis for each risk group can be found in Supplemental Table 2. Example cases are shown in Fig. 5. In the development cohort, 193 (41%) patients were assigned to the low-risk group, 185 (39%) to the intermediate-risk group, and 93 (20%) to the high-risk group. In the validation cohort, 27 (32%) patients were assigned to the low-risk group, 32 (38%) to the intermediate-risk group, and 25 (30%) to the high-risk group. Supplemental Table 3 shows a cross-tabulation analysis of the proposed risk groups across conventional clinical staging systems.
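The reported hazard ratios follow from the Cox coefficients by exponentiation, HR = exp(coefficient), which gives the multiplicative change in hazard per one-point increase in risk score. The snippet below is a small arithmetic check; the function name and the 10-point illustration are assumptions added here, not values reported in the source.

```python
import math


def hazard_ratio(coefficient: float, delta: float = 1.0) -> float:
    """Hazard ratio for a `delta`-point increase in the covariate."""
    return math.exp(coefficient * delta)


dev_hr = hazard_ratio(0.021658)  # development cohort coefficient
val_hr = hazard_ratio(0.021676)  # validation cohort coefficient
print(round(dev_hr, 3), round(val_hr, 3))  # 1.022 1.022

# Illustration only: the implied hazard ratio for a 10-point increase.
print(round(hazard_ratio(0.021658, delta=10.0), 2))
```

Under the proportional hazards assumption, a 10-point higher risk score therefore corresponds to a hazard roughly 1.24 times as high.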

Fig. 4

Distribution of predicted risk scores in the development and independent validation cohorts. Based on the risk score predictions of the survival model in the development cohort, we derived two cutoff points (93.08 and 172.73) to stratify patients into low-, intermediate-, and high-risk groups. We applied the same cutoff points for stratification in the independent validation cohort. For plotting, a Gaussian smoothing kernel was used
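Applying the two cutoff points to a predicted risk score amounts to a simple threshold rule, sketched below. The function name and the boundary handling (a score exactly equal to a cutoff goes to the higher-risk group) are assumptions; the source does not specify how boundary values were assigned.

```python
LOW_CUTOFF = 93.08    # cutoff between low and intermediate risk (from the text)
HIGH_CUTOFF = 172.73  # cutoff between intermediate and high risk (from the text)


def risk_group(score: float) -> str:
    """Map a predicted risk score to a risk group.

    Assumption: scores equal to a cutoff fall into the higher-risk group.
    """
    if score < LOW_CUTOFF:
        return "low"
    if score < HIGH_CUTOFF:
        return "intermediate"
    return "high"


# The cohort mean scores reported in the text both fall in the middle group.
print(risk_group(121.57), risk_group(135.64))

# Hypothetical individual scores.
print([risk_group(s) for s in (40.0, 93.08, 150.0, 200.0)])
```

Because the cutoffs were derived in the development cohort only, the same constants are reused unchanged for the validation cohort.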

Fig. 5

Example cases. Standard-of-care clinical data and axial pre-contrast, late arterial, portal venous, and delayed-phase MRI with corresponding automated liver segmentations overlaid in blue. Low-risk group: A 59-year-old male patient presenting with a focal 4.5 cm lesion in the right liver lobe. The patient was censored after 8.75 years. Intermediate-risk group: A 54-year-old male patient presenting with a focal 3.1 cm lesion in the right liver lobe. The patient died 15 months after diagnosis. High-risk group: A 54-year-old male patient presenting with a focal 5.7 cm lesion in the right liver lobe. The patient died 6.7 months after diagnosis

Survival analysis

Median overall survival (mOS) times with 95% confidence intervals for the various strata are listed in Table 3. The mOS of the entire study cohort was 32.47 (95% CI: 28.40, 37.83) months. In the development and validation cohorts, the mOS was 32.57 (95% CI: 29.13, 38.50) months and 24.57 (95% CI: 14.73, NA) months, respectively. Survival rates (± standard error) for the proposed risk groups in the development and validation cohorts are summarized in Table 4. Survival times did not differ significantly between the development and validation cohorts (p = .29). In the development cohort, mOS was 90.40 (95% CI: 62.97, NA) months in the low-risk group, 25.8 (95% CI: 23.50, 31.90) months in the intermediate-risk group, and 6.40 (95% CI: 5.03, 7.73) months in the high-risk group. In the validation cohort, mOS was not reached in the low-risk group (NA; 95% CI: 48.03 months, NA) and was 19.60 (95% CI: 14.23, NA) months in the intermediate-risk group and 5.40 (95% CI: 3.80, 12.63) months in the high-risk group. Kaplan-Meier curves for the developed risk groups can be found in Fig. 6. The low-, intermediate-, and high-risk groups demonstrated significantly different survival times in both cohorts (development cohort, p < .0001; validation cohort, p < .0001). Notably, no statistical difference was found when comparing the survival times of each risk group between the development and validation cohorts (low-risk group, p = 1.0; intermediate-risk group, p = 1.0; high-risk group, p = 1.0), suggesting that the risk groups’ survival times generalize to new data. Complete results of the logrank test for pairwise OS comparisons between the proposed risk groups can be found in Supplemental Table 4. Kaplan-Meier curves for the conventional staging scores can be found in Supplemental Figure 1.

Table 3 Median overall survival times across staging systems

Table 4 Survival rates (± standard error) for the proposed risk groups

Fig. 6

Kaplan-Meier curves of the proposed risk groups in the development and validation cohorts
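A minimal sketch of the Kaplan-Meier estimator shows how a median overall survival time is read off the curve, and why it can be unavailable ("NA") when the estimated survival probability never falls to 50% during follow-up. The function names and toy data are hypothetical; confidence intervals and the logrank test are not covered here.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    curve = []
    survival = 1.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still under observation
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(times: Sequence[float],
                    events: Sequence[bool]) -> Optional[float]:
    """First time at which estimated survival is at most 50%, else None."""
    for t, s in kaplan_meier(times, events):
        if s <= 0.5:
            return t
    return None  # median not reached within follow-up


# Hypothetical follow-up times in months; False marks a censored patient.
months = [2.0, 4.0, 6.0, 8.0, 10.0]
print(median_survival(months, [True, True, True, False, False]))
print(median_survival(months, [True, False, False, False, False]))
```

In the second call only one of five patients has an observed event, so the curve stays above 50% and no median can be reported, which mirrors the NA entries for groups with few deaths.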
