Multiparametric MRI based deep learning model for prediction of early recurrence of hepatocellular carcinoma after SR following TACE

Baseline characteristics

A total of 511 patients (age, 54.5 ± 9.7 years; 364 men) were enrolled in this study. According to a 4:1 ratio, the DC consisted of 413 patients who underwent SR, and the VC consisted of 98 patients. Table 1 compares the baseline characteristics of the DC and VC. All baseline clinical variables were well balanced between the two cohorts (all P > 0.05). In the DC, the mean number of TACE sessions per patient was 3.5 ± 0.8, and the median interval between the first TACE session and SR was 4.2 months (IQR, 1.8–7.2 months). After TACE, the mean maximum tumor diameter decreased from 6.8 ± 1.5 cm to 3.2 ± 1.1 cm. In the VC, the mean number of TACE sessions per patient was 3.0 ± 0.5, and the median interval between the first TACE session and SR was 3.5 months (IQR, 2.0–7.2 months). After TACE, the mean maximum tumor diameter decreased from 6.8 ± 1.2 cm to 3.0 ± 0.4 cm. The median follow-up period in this study was 22.8 months (IQR, 11.7–44.2 months).

Table 1 Baseline characteristics of patients with HCC undergoing SR after TACE

Independent prognostic factors

We collected 16 clinical variables, plus microvascular invasion (MVI) as a histopathologic variable, to identify the independent prognostic factors of ER after TACE plus SR. The risk factors for ER were assessed by univariate and multivariate analyses (Table S2). In the univariate analysis, tumor size (HR: 0.860; 95% CI: 0.462, 0.934; P = 0.039), tumor number (HR: 1.795; 95% CI: 1.147, 3.962; P = 0.002), AFP (HR: 1.871; 95% CI: 1.229, 2.848; P = 0.003), and MVI (HR: 0.435; 95% CI: 0.284, 0.666; P < 0.001) were significant factors for ER. The multivariate analysis showed that the factors significantly affecting ER were tumor number (HR: 1.690; 95% CI: 1.112, 2.564; P = 0.004), AFP (HR: 1.527; 95% CI: 1.041, 2.248; P = 0.030), and MVI (HR: 0.461; 95% CI: 0.310, 0.628; P < 0.001).

After the DL score was added, as shown in Table 2, the multivariate regression analysis indicated that tumor number (HR: 3.42; 95% CI: 2.75, 4.31; P = 0.003), MVI (HR: 9.21; 95% CI: 6.24, 32.14; P < 0.001), and DL score (HR: 17.46; 95% CI: 12.94, 23.57; P < 0.001) were independent predictors of ER after TACE plus SR. Furthermore, a nomogram was built from these variables (Fig. 3A), described by the formula: Y = − 8.992 + 3.298 × tumor number (0: Single; 1: Multiple) + 4.282 × MVI (0: Presence; 1: Absence) + 8.291 × DL score, where Y indicates the predicted probability of ER after initial TACE plus SR in uHCC. The calibration curves of the nomogram for ER prediction in the DC and VC are shown in Fig. 3B, C. Agreement between the two pathologists on the MVI status used in the nomogram was excellent (κ = 0.918). In both cohorts, the models showed good agreement between predicted and observed ER. Further, DCA indicated that the DL + Clinical model provided a larger net benefit across the range of reasonable threshold probabilities than the DL model or the clinical model alone in both cohorts (Fig. 3D, E).
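As a rough illustration of how the nomogram formula above could be evaluated for one patient, the following sketch computes the linear term Y from the reported coefficients. The function names are hypothetical, and the logistic mapping from Y to a probability is an assumption made here for illustration only; the text does not state how Y is converted to a probability.

```python
import math

# Coefficients copied from the formula reported in the text.
INTERCEPT = -8.992
COEF_TUMOR_NUMBER = 3.298  # tumor number: 0 = single, 1 = multiple
COEF_MVI = 4.282           # MVI: coded 0/1 exactly as stated in the text
COEF_DL_SCORE = 8.291      # DL score: continuous output of the DL model


def nomogram_linear_term(tumor_number: int, mvi: int, dl_score: float) -> float:
    """Return Y = -8.992 + 3.298*tumor_number + 4.282*MVI + 8.291*DL_score."""
    return (INTERCEPT
            + COEF_TUMOR_NUMBER * tumor_number
            + COEF_MVI * mvi
            + COEF_DL_SCORE * dl_score)


def logistic(y: float) -> float:
    """ASSUMPTION: map Y to (0, 1) with a logistic link (not stated in the text)."""
    return 1.0 / (1.0 + math.exp(-y))


if __name__ == "__main__":
    # Hypothetical patient: multiple tumors, MVI coded 1, DL score 0.5.
    y = nomogram_linear_term(tumor_number=1, mvi=1, dl_score=0.5)
    print(f"Y = {y:.4f}, assumed probability = {logistic(y):.4f}")
```

The binary coding of the two categorical inputs follows the text as written; the DL score value used in the example is invented.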

Table 2 Multivariable regression analysis of predictors of early recurrence

Fig. 3

Nomogram for predicting ER status in HCC patients. A Independent variables of the model included number of tumors, MVI, and DL signature. B, C Calibration curves of the nomogram model for predicting ER status in the DC and VC. D, E Net benefit of the nomogram for predicting ER status in the DC and VC. HCC hepatocellular carcinoma; ER early recurrence; DC derivation cohort; VC validation cohort
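The net benefit plotted in a decision curve is conventionally defined as TP/n − FP/n × pt/(1 − pt), where pt is the threshold probability. The sketch below applies that standard definition to a toy data set; the function name and the example values are hypothetical, and this is not the authors' code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    Uses the standard decision-curve definition:
        NB = TP/n - FP/n * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy labels (1 = ER) and predicted risks, invented for illustration.
    labels = [1, 1, 0, 0]
    risks = [0.9, 0.6, 0.7, 0.2]
    for pt in (0.1, 0.3, 0.5):
        print(pt, round(net_benefit(labels, risks, pt), 3))
```

Evaluating the function over a grid of thresholds for each model, together with the "treat all" and "treat none" strategies, yields curves of the kind shown in panels D and E.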

Predictive model comparison

The AUCs of the clinical model, the DL models based on T1WI + C, T2WI, DWI, and all MRI sequences, and the nomogram are compared in Table 3. The T1WI + C-based DL model provided better predictive performance than the T2WI- and DWI-based DL models, indicating that T1WI + C may contain more important information for predicting ER after SR following TACE. The DL model based on all three MRI sequences performed better than any single-sequence DL model. Table 3 also indicates that predictive ability improved after the clinical variables were added to the DL-based model: the AUC increased from 0.868 to 0.872 in the DC and from 0.855 to 0.862 in the VC. Additionally, all DL-based models performed better than the clinical model in all cohorts (all P < 0.001).
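For readers unfamiliar with the metric, the AUC equals the probability that a randomly chosen ER patient receives a higher score than a randomly chosen non-ER patient. The sketch below computes it by direct pairwise comparison on invented data; the function name is hypothetical, and the authors' statistical software is not specified here.

```python
def auc_score(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute an AUC")
    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels (1 = ER) and model scores, invented for illustration.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.6, 0.7, 0.2]
    print(auc_score(labels, scores))
```

The pairwise form is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients such as the DC and VC.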

Table 3 The performance comparison of different models

Survival risk stratification

To facilitate clinical use of the nomogram model, we divided the HCC patients into a high-risk group and a low-risk group according to their nomogram risk scores. We identified the cutoff value (66.23) in the DC using X-tile and verified it in the VC. This pragmatic visualization of the risk level could help in choosing TACE-based comprehensive treatment strategies for uHCC patients. According to the risk scores from the nomogram model, the cumulative 1-, 3-, and 5-year OS rates of patients in the low-risk group in the DC were 87.4%, 53.6%, and 53.6%, respectively, which differed significantly from those in the high-risk group (53.1%, 11.0%, and 11.0%, respectively; P < 0.001) (Fig. 4A). Similarly, the cumulative 1-, 3-, and 5-year OS rates of patients in the low-risk group in the VC were 47.8%, 42.0%, and 42.0%, respectively, which differed significantly from those in the high-risk group (63.1%, 10.4%, and 10.4%, respectively; P < 0.001) (Fig. 4B).
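The stratification step can be sketched as follows: each patient's nomogram score is compared with the reported cutoff of 66.23, and cumulative OS in each group is then estimated with the Kaplan-Meier product-limit method. The function names and the toy follow-up data are hypothetical, and assigning scores strictly above the cutoff to the high-risk group is an assumption; the text does not say how a score exactly at the cutoff is handled.

```python
CUTOFF = 66.23  # cutoff value reported in the text


def risk_group(score, cutoff=CUTOFF):
    """ASSUMPTION: scores strictly above the cutoff are labeled high-risk."""
    return "high" if score > cutoff else "low"


def kaplan_meier(times, events, at):
    """Kaplan-Meier survival probability at time `at`.

    times  : follow-up time per patient (e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    """
    survival = 1.0
    # Distinct observed event times up to and including `at`.
    event_times = sorted({t for t, e in zip(times, events) if e == 1 and t <= at})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        survival *= 1.0 - deaths / at_risk
    return survival


if __name__ == "__main__":
    # Toy cohort, invented for illustration: (score, months, event).
    cohort = [(80.1, 6, 1), (70.4, 12, 0), (50.2, 18, 1), (40.0, 24, 0)]
    for label in ("low", "high"):
        group = [(t, e) for s, t, e in cohort if risk_group(s) == label]
        times = [t for t, _ in group]
        events = [e for _, e in group]
        print(label, kaplan_meier(times, events, at=12))
```

Evaluating `kaplan_meier` at 12, 36, and 60 months for each group gives 1-, 3-, and 5-year OS estimates of the kind reported above.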

Fig. 4

Comparison of overall survival between the high-risk and low-risk ER groups. A derivation cohort; B validation cohort

Interpretation for DL Model

To better explore the hidden patterns learned by the network, the heatmaps were divided into high- and low-risk ER groups. Overall, the intensity of the feature maps in the high-risk group was higher than that in the low-risk group, so the DL features in the high-risk group generated a larger DL signature value, corresponding to a higher recurrence risk. Additionally, in the low-risk group the high-intensity regions were mainly distributed within the tumor area, whereas in the high-risk group they were distributed both within the tumor area and in the area surrounding the tumor. Six representative patients in the DC were selected, and the feature maps learned by the DL model are shown and compared in Fig. 5.

Fig. 5

Visualization of the feature maps learned by the deep neural network. Overall, the intensity of the feature maps in the ER group was higher than that in the non-ER group. In the feature maps, the high-intensity voxels were mainly distributed in the area of the tumor center.
