Development and clinical utility analysis of a prostate zonal segmentation model on T2-weighted imaging: a multicenter study

Zonal segmentation is important in the management of prostatic diseases. Many studies have demonstrated the feasibility of training CNN models for zonal segmentation. However, they lack validation in non-public datasets and do not consider the patients’ characteristics. The performance of such models in patient cohorts with different clinicopathological characteristics therefore remains unknown. Moreover, factors influencing the segmentation performance have rarely been investigated. In this study, we trained a 3D U-Net model for prostate zonal segmentation and applied it to two external testing datasets to assess its clinical utility in different patient cohorts. The model yielded good performance in all testing groups and outperformed the junior radiologist in PZ segmentation, with a higher DSC and a higher ICC for volume estimation. The model performance was shown to be sensitive to prostate morphology and MR scanner parameters.

Our trained U-Net model showed good performance in zonal segmentation in both ETDpub and ETDpri. Previously reported mean DSCs for CG and PZ segmentation in public external datasets were 0.80–0.90 and 0.64–0.81, respectively [8, 19, 20]. Our model also performed well in a public dataset, with mean DSCs of 0.889 and 0.755 for CG and PZ, respectively. In the private external testing dataset, which consisted of patients with advanced prostate cancer, the U-Net model also showed promising results. Regardless of tumor extension, the U-Net model recognized the natural borders of the prostate anatomical zones with high consistency with the radiologist (Fig. 4), which could serve as the foundation for localizing prostate tumors and identifying extraprostatic cancer. Compared with previous studies testing CNN models’ performance in private external testing datasets [21, 22], our study applied the model to patients in different clinical scenarios and considered the patients’ clinicopathological characteristics. Furthermore, even without the fine-tuning process [21], our trained model still showed good performance in external testing. Our study also showed that segmentation at the extremes of the prostate is challenging. Specifically, among the different testing groups, the mean DSCs in the apex, base, and midgland of the prostate were 0.811–0.856, 0.847–0.901, and 0.916–0.941 for CG, and 0.625–0.788, 0.739–0.832, and 0.818–0.896 for PZ, respectively. Other studies have also reported a significantly lower DSC in the apex and base of the prostate even for radiologists’ manual delineation, with DSCs of 0.85 in the apex, 0.87 in the base, and 0.89 in the midgland [23].
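For reference, the DSC values discussed above follow the standard Dice definition, DSC = 2|A ∩ B| / (|A| + |B|), where A and B are the voxel sets assigned to one zone by the model and by the reference segmentation. The following is a minimal sketch of that calculation, not the evaluation code used in this study. The label encoding (0 = background, 1 = CG, 2 = PZ), the flattened-list input, and the function name are assumptions made for illustration only.

```python
def dice_coefficient(pred, truth, label):
    """Dice similarity coefficient for one label.

    pred, truth: equally long flat sequences of integer voxel labels.
    label: the zone to score (assumed encoding: 1 = CG, 2 = PZ).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must contain the same number of voxels")

    size_pred = sum(1 for p in pred if p == label)
    size_truth = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)

    if size_pred + size_truth == 0:
        # Neither mask contains the label. Returning 1.0 is a convention
        # chosen here, not something stated in the source.
        return 1.0
    return 2.0 * overlap / (size_pred + size_truth)


# Hypothetical toy masks, six voxels each.
pred = [0, 1, 1, 2, 2, 2]
truth = [0, 1, 2, 2, 2, 0]
dsc_cg = dice_coefficient(pred, truth, label=1)
dsc_pz = dice_coefficient(pred, truth, label=2)
```

Scoring the apex, base, and midgland separately, as in the comparison above, would amount to calling the same function on the voxels of each slice group; how the slices are divided into those three regions is not specified in this section.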

The U-Net model outperformed the junior radiologist in PZ segmentation, with a significantly higher DSC and better agreement in volume estimation, and was comparable to the junior radiologist in CG segmentation. In our study, the ICCs for volume estimation between the U-Net model and the expert radiologist’s manual segmentation were 0.836 for PZ and 0.953 for CG, which were close to the literature-reported values for a radiologist’s manual segmentation between two MR scans in one patient cohort (0.888 for PZ and 0.988 for non-PZ) [24]. The volume calculation variability was higher in PZ than in CG, due to the irregular morphology of PZ. The ICC for CG volume estimation by the junior radiologist in our study was excellent, while the ICC for PZ volume estimation showed moderate agreement. The junior radiologist lacked a good grasp of the prostate anatomy and included some periprostatic fat as PZ, which led to overestimation of the PZ volume. Prostate volume is an important biomarker for multiple clinical applications [25, 26]. Lee et al. suggested that volume measurement by an automated network provided reliable volume estimates of the prostate compared with those obtained with the ellipsoid formula [10]. Our study demonstrated that our automated network was able to provide faster and more accurate prostate zonal volume calculation than the junior radiologist, especially in PZ, which could serve as a useful tool for accurate prostate-specific antigen density calculation and for the analysis of patients’ obstructive symptoms.
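The volume-related quantities mentioned above can be illustrated with a short sketch. It assumes that a segmented zone is available as a voxel count with known voxel spacing, that volume is the voxel count multiplied by the voxel volume, that the ellipsoid formula is the usual length × width × height × π/6, and that prostate-specific antigen density is serum PSA divided by prostate volume. The function names and the example numbers are hypothetical and are not taken from the study.

```python
import math


def mask_volume_ml(voxel_count, spacing_mm):
    """Volume of a segmented zone in mL from its voxel count.

    spacing_mm: (dx, dy, dz) voxel spacing in millimetres.
    1 mL equals 1000 cubic millimetres.
    """
    dx, dy, dz = spacing_mm
    return voxel_count * dx * dy * dz / 1000.0


def ellipsoid_volume_ml(length_cm, width_cm, height_cm):
    """Ellipsoid-formula volume in mL from three diameters in cm."""
    return math.pi / 6.0 * length_cm * width_cm * height_cm


def psa_density(psa_ng_per_ml, volume_ml):
    """PSA density in ng/mL per mL of prostate volume."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return psa_ng_per_ml / volume_ml


# Hypothetical example: 100,000 whole-gland voxels at 0.5 x 0.5 x 3.0 mm.
whole_gland_ml = mask_volume_ml(100_000, (0.5, 0.5, 3.0))
psad = psa_density(7.5, whole_gland_ml)
ellipsoid_ml = ellipsoid_volume_ml(5.0, 4.0, 4.0)
```

Because the mask-based volume sums every labeled voxel, a segmentation that includes periprostatic fat as PZ inflates the PZ volume directly, which is the error pattern described for the junior radiologist.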

Prostate morphology affected the segmentation performance of the U-Net model. In our study, the DSC for both CG and PZ was higher for larger CGv, while the DSC for PZ was lower for larger CGv/WGv. Prostate hyperplasia is common in men and is age-related, and it contributes to the increase in CGv and the compression of PZ. Therefore, the recognition of CG was easier, but with extreme compression of PZ, the segmentation of PZ became more difficult. Prostate morphology also influences the variance of manual segmentation. Montagne et al. [7] reported the variability of manual prostate zonal segmentation by seven radiologists on T2WI, with DSC values of 0.88–0.91 for TZ, and analyzed factors that may influence it. Their results showed that the DSC was lower for smaller prostates (Spearman correlation ρ > 0.8). Nai et al. [12] found that CNN auto-segmentation was difficult for special cases, and most difficult for cases with transurethral resection of the prostate. However, since the number of special cases was small in their study (only four subjects), no statistical analysis was provided. Rouvière et al. [17] reported a discordant result: the mean DSC value for CG segmentation decreased significantly when the CG volume increased. The decreased performance of their model for larger prostates might be due to the different training process, which combined model-based and deep learning–based approaches.
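A rank correlation such as the Spearman coefficient cited for Montagne et al. is one way to quantify an association between per-case DSC and a morphology measure such as CGv or CGv/WGv. The sketch below computes Spearman's ρ as the Pearson correlation of average ranks. It is illustrative only: this section does not state which statistical test was used in our analysis, and the function names and sample values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical per-case values: CG volume (mL) and CG DSC.
cg_volume_ml = [10.0, 20.0, 30.0, 40.0]
cg_dsc = [0.80, 0.85, 0.90, 0.95]
rho = spearman_rho(cg_volume_ml, cg_dsc)
```

A positive ρ would correspond to the pattern reported here (higher DSC for larger CGv), and a negative ρ to the pattern reported by Rouvière et al. for CG.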

MR imaging parameters also significantly influenced the model’s auto-segmentation performance. The DSC values for PZ and CG were significantly higher in images acquired on scanners from the same vendor as the training group. Furthermore, the DSC value for CG was significantly higher for images from 3.0T MR scanners. In a previous study, Rouvière et al. [17] found that the scanner used for imaging significantly influenced the mean DSC for CG segmentation, with an odds ratio of 0.69 (1.5T vs. 3.0T). Since the MR scanner affects the model’s auto-segmentation, further training of the model with heterogeneous datasets might be necessary. In our study, the patients’ clinicopathological information was less likely to affect the segmentation performance. The reason may be the relatively homogeneous clinicopathological data in the ETDpri, since all patients were diagnosed with advanced prostate cancer. Further studies using a larger cohort with heterogeneous patient data might be necessary. Additionally, a previous study reported that the use of an endorectal coil changes prostate morphology [27]. However, none of the patients included in our study were imaged with an endorectal coil. Whether our model is applicable to such patients should be analyzed in future studies.

Our study has some limitations. First, the private external testing dataset was small, so further external testing using larger datasets is needed. Second, other structures of the prostate, such as the anterior fibromuscular stroma and seminal vesicles, were not segmented because they are difficult to outline. However, for more accurate prostate cancer staging, the segmentation of seminal vesicles should be considered in future studies. Finally, the manual segmentation needed to generate the ground truth is time-consuming, which also limited the number of cases used for analysis; thus, using the model to generate the ground truth in future studies is worth trying.

In conclusion, we validated the model’s utility for prostate zonal segmentation on T2WI in different external testing datasets. The model yielded good performance regardless of the variations in the patients’ clinicopathological characteristics. The model showed better performance than the junior radiologist in PZ segmentation. Prostate morphology and MR scanner parameters, especially CGv and vendor, impact zonal segmentation performance.
