Deep learning-based 3D cerebrovascular segmentation workflow on bright and black blood sequences magnetic resonance angiography

The institutional review board approved this study (approval number 2021110623007), which was conducted in accordance with the Declaration of Helsinki. The researchers clearly informed the subjects about the experimental procedures, obtained their voluntary consent to participate, and collected signed informed consent forms.

Dataset

Cerebrovascular Segmentation Dataset (CSD)

Chen et al. [26] curated a cerebrovascular segmentation dataset, which is accessible at xzbai.buaa.edu.cn/datasets.html. The dataset comprises 45 volumes of TOF-MRA data acquired with a 1.5 T GE MRI scanner, drawn from the IXI dataset [27]. For accurate ground truth annotation, multiple radiologists, each with more than 3 years of clinical experience, meticulously labeled each volume at the voxel level. All volumes were 1024 × 1024 × 92 in size, with a spatial resolution of 0.264 mm × 0.264 mm × 0.8 mm.

MCA M1 segment with LSA dataset (MLD)

We collected data from 107 patients, including outpatients and inpatients, at our hospital between 2014 and 2018. All 3D T1 VISTA images were 480 × 480 × 140 in size, with a spatial resolution of 0.4 mm × 0.4 mm × 0.4 mm. Voxel-level labeling of the image data was performed in ITK-SNAP by a radiologist with more than three years of experience, and the labeled MCA and LSA were then thoroughly reviewed by another radiologist with more than ten years of experience. The physicians who processed the data were blinded to the patients’ clinical status to avoid bias.

Table 1 shows the details of the different datasets. N/A means not provided; note that 16 individuals did not provide sex information in the CSD.

Table 1 Details of different datasets

Deep learning-based segmentation workflow

The workflow encompasses five sequential steps: dataset analysis, data preprocessing, model training, model validation, and postprocessing with analysis. The overall flowchart is illustrated in Fig. 1.

Fig. 1

3D deep learning-based cerebrovascular segmentation workflow

Dataset analysis

First, we thoroughly analyze the dataset, focusing on its overall size and the distributions of key variables. This includes examining demographic distributions (such as age and sex), clinical characteristics (such as disease types and stages, and scanning devices or protocols), and image feature diversity (such as size, shape, contrast, and intensity distribution). The goal is to gauge the dataset’s representativeness, detect biases or imbalances, and confirm that findings are robust and relevant to the target population or condition. We then evaluate factors such as image clarity, noise level, and the presence of artifacts against predefined criteria: for adequate clarity, images must exhibit sharp, well-defined edges and structures; for low noise, images must exhibit minimal random variations or “graininess”; any distortions or anomalies that could interfere with accurate interpretation are treated as artifacts. Images failing to meet these standards, such as those with excessive noise or significant artifacts such as motion blur or ghosting, are excluded to maintain the integrity and reliability of the dataset. Additionally, we verify the accuracy and consistency of the labels to ensure that the cerebrovascular structures of interest are correctly annotated. Evaluating the dataset from these perspectives provides comprehensive insight into its characteristics, quality, and feasibility. This process yields detailed statistical information that guides experimenters in critical aspects of cerebrovascular segmentation. Specifically, this information can be used to analyze the variability in vessel size, shape, and branching patterns, which is crucial for accurate segmentation.
Experimenters should consider the distributions of any pathologies within the dataset, such as areas of stenosis or aneurysms, as these features may require specialized segmentation approaches. The statistical data also assist in assessing the heterogeneity of the patient population, ensuring that the segmentation method is robust across diverse patients. This thorough analysis is vital for refining and validating the segmentation method, yielding more reliable and clinically applicable results.

Data preprocessing and augmentation

The preprocessing stage involves several essential steps: denoising, smoothing, resampling, and contrast enhancement. Denoising techniques, such as median or mean filtering, reduce image noise, minimize interference from low-quality images during segmentation, and improve the algorithm’s performance. Smoothing operations, such as Gaussian smoothing, eliminate discontinuous edges in the image and yield more continuous and recognizable blood vessel structures. Resampling ensures a standardized spatial resolution across images, reducing generalization errors caused by differences between scanning devices and providing a consistent size and resolution for model learning and inference. Contrast enhancement techniques, such as histogram equalization or adaptive histogram equalization, improve the visibility of vascular structures. Specifically, we used mean filtering (kernel size: 3 × 3), Gaussian smoothing (standard deviation: 1, kernel size: 3 × 3), intensity normalization (Z-score), and adaptive histogram equalization (block size: 8 × 8, clip limit: 0.1) for preprocessing. In addition, since all images within each dataset shared the same resolution (CSD: 0.264 mm × 0.264 mm × 0.8 mm, MLD: 0.4 mm × 0.4 mm × 0.4 mm), no resampling was performed.
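The stated preprocessing parameters can be sketched as follows (illustrative only, not the authors' implementation; the adaptive histogram equalization step is omitted, and applying the 3 × 3 kernels slice-wise to the 3D volume is an assumption):

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def preprocess_volume(vol: np.ndarray) -> np.ndarray:
    """Denoise, smooth, and intensity-normalize one MRA volume (sketch)."""
    # Mean filtering with a 3 x 3 in-plane kernel (slice-wise; an assumption)
    vol = uniform_filter(vol, size=(3, 3, 1))
    # Gaussian smoothing with standard deviation 1, as stated in the text
    vol = gaussian_filter(vol, sigma=1)
    # Z-score intensity normalization
    return (vol - vol.mean()) / (vol.std() + 1e-8)
```

Adaptive histogram equalization could be appended with, for example, `skimage.exposure.equalize_adapthist`.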

Another critical step in the workflow is data augmentation, in which the dataset is expanded by applying various geometric and intensity transformations. These techniques, including image rotation, flipping, and intensity changes, increase the diversity and robustness of the data. In our experiments, we augmented the data by flipping and rotating the images and scaling their intensities to 0.9–1.1 times those of the original images. Each transformation was applied with a probability of 0.1 per image during training.
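These augmentations can be sketched as below (a hypothetical NumPy implementation; restricting rotations to 90° in-plane steps is an assumption, as the rotation angles are not stated in the text):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(vol: np.ndarray, label: np.ndarray, p: float = 0.1):
    """Randomly flip, rotate, and rescale intensity, each with probability p."""
    if rng.random() < p:                 # flip along one random spatial axis
        axis = int(rng.integers(0, 3))
        vol, label = np.flip(vol, axis), np.flip(label, axis)
    if rng.random() < p:                 # 90-degree in-plane rotation (assumed)
        k = int(rng.integers(1, 4))
        vol = np.rot90(vol, k, axes=(0, 1))
        label = np.rot90(label, k, axes=(0, 1))
    if rng.random() < p:                 # intensity scaling in [0.9, 1.1]
        vol = vol * rng.uniform(0.9, 1.1)
    return vol, label
```

The same spatial transform must be applied to image and label so the voxel-wise correspondence is preserved.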

Model training

We implemented fourfold cross-validation at this stage. Specifically, we partitioned the data into training and test sets at an 8:2 ratio. The training set was divided into four equal subsets, with three subsets used as training data and one as validation data, for model training and validation. This process was repeated four times, employing a different subset as validation data in each iteration so that the entire training set was covered. Cross-validation allowed us to make effective use of the limited dataset for training and validation while mitigating the risk of overfitting the model to a specific data distribution. Specifically, the CSD encompassed 45 volumes; 36 were used for cross-validation (27 for the training set and 9 for the validation set), and 9 were used for testing. The MLD encompassed 107 volumes; 85 were used for cross-validation (64 for the training set and 21 for the validation set), and 22 were used for testing. In this phase, we selected four network models for comparative training: U-Net [28], V-Net [29], UNETR [30], and SwinUNETR [31]. U-Net is a classic convolutional neural network that enhances segmentation accuracy through its encoder-decoder structure and skip connections. V-Net utilizes a residual network and multiscale residual module to capture fine details and contextual information. UNETR is a transformer-based model that leverages self-attention mechanisms to model pixel relationships effectively. SwinUNETR combines the Swin Transformer [32] and UNETR, where the Swin Transformer is a variant of the Transformer mechanism based on a local perceptual window. All the models used were three-dimensional segmentation models. During training, a 192 × 192 × 64 image patch was cropped from each full 3D volume and used as input to the models.
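The split described above can be sketched as follows (illustrative; the exact assignment of volumes is not specified, so random shuffling with a fixed seed is an assumption):

```python
import numpy as np

def fourfold_splits(n_volumes: int, seed: int = 0):
    """80/20 train/test split, then fourfold CV over the training portion."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_volumes)
    n_test = round(0.2 * n_volumes)
    test, train = idx[:n_test], idx[n_test:]
    folds = np.array_split(train, 4)          # four (near-)equal subsets
    cv = []
    for i in range(4):
        val = folds[i]                        # one subset for validation
        trn = np.concatenate([folds[j] for j in range(4) if j != i])
        cv.append((trn, val))
    return cv, test
```

For the CSD (45 volumes), this yields 9 test volumes and four folds of 27 training / 9 validation volumes each, matching the counts above.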

Due to the small proportion of cerebral vessels in the image, the foreground and background pixels are significantly imbalanced. We use a weighted combined variant of dice loss and focal loss [33] to address this imbalance and enhance cerebrovascular segmentation. The dice loss effectively handles the foreground–background pixel imbalance, while the focal loss focuses on hard-to-classify pixels by adjusting sample weights. By modifying the weights of the dice loss and focal loss and combining them using a weighted summation to form the final loss function, we balance their contributions to the segmentation results, leading to improved accuracy and robustness in cerebrovascular segmentation.
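A minimal NumPy sketch of this combined loss (the focal-loss parameters γ and α and the equal weights are assumptions, as their values are not stated in the text):

```python
import numpy as np

def dice_loss(pred: np.ndarray, target: np.ndarray, eps: float = 1e-6) -> float:
    """Soft dice loss; pred holds foreground probabilities, target is binary."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def focal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-8):
    """Focal loss down-weights easy voxels via the (1 - p_t)^gamma factor."""
    p_t = np.where(target == 1, pred, 1.0 - pred)
    a_t = np.where(target == 1, alpha, 1.0 - alpha)
    return float(np.mean(-a_t * (1.0 - p_t) ** gamma * np.log(p_t + eps)))

def combined_loss(pred, target, w_dice=0.5, w_focal=0.5):
    """Weighted sum of dice and focal losses (weights are assumed values)."""
    return w_dice * dice_loss(pred, target) + w_focal * focal_loss(pred, target)
```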

We utilized a workstation with 6 RTX 3090 GPUs, 2 Intel(R) Xeon(R) Silver 4310 CPUs, and 256 GB of RAM for model training and testing. The training process employed the AdamW optimizer with an initial learning rate of 0.0001 for 400 training epochs, utilizing a batch size of 1. To expedite model convergence and reduce training time, we also implemented a warmup strategy.
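The warmup strategy can be illustrated as below (only the initial learning rate of 0.0001 and the 400 epochs are stated; the warmup length and the cosine decay after warmup are assumptions):

```python
import math

def lr_with_warmup(epoch: int, base_lr: float = 1e-4, warmup_epochs: int = 10,
                   total_epochs: int = 400) -> float:
    """Linear warmup to base_lr, then cosine decay (schedule shape assumed)."""
    if epoch < warmup_epochs:
        # learning rate ramps linearly from base_lr/warmup_epochs to base_lr
        return base_lr * (epoch + 1) / warmup_epochs
    t = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))
```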

Model validation

We evaluated the model on the validation set every five training epochs within each cross-validation fold. The entire image was predicted using a sliding window of size 192 × 192 × 64, and the Dice similarity coefficient (DSC) was calculated as an internal performance metric. We then adjusted the model’s hyperparameters, such as the learning rate, network structure, or regularization parameters, based on the model’s performance on the validation set. We tested different hyperparameter combinations and selected the model with the best validation performance, i.e., the highest DSC.
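Sliding-window inference over a full volume can be sketched as follows (illustrative; assumes non-overlapping windows with a border-aligned final window and volume dimensions at least as large as the patch):

```python
import numpy as np

def sliding_window_predict(vol, model, patch=(192, 192, 64)):
    """Tile the volume with patches, average predictions where windows overlap."""
    out = np.zeros(vol.shape, dtype=np.float64)
    cnt = np.zeros(vol.shape, dtype=np.float64)
    starts = [list(range(0, max(s - p, 0) + 1, p))
              for s, p in zip(vol.shape, patch)]
    # shift the final window so it ends exactly at the volume border
    for axis, (s, p) in enumerate(zip(vol.shape, patch)):
        if starts[axis][-1] + p < s:
            starts[axis].append(s - p)
    for x in starts[0]:
        for y in starts[1]:
            for z in starts[2]:
                sl = (slice(x, x + patch[0]),
                      slice(y, y + patch[1]),
                      slice(z, z + patch[2]))
                out[sl] += model(vol[sl])   # model maps a patch to probabilities
                cnt[sl] += 1
    return out / cnt
```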

We comprehensively compared the U-Net, V-Net, UNETR, and SwinUNETR models on the CSD and MLD datasets and evaluated their performance with several metrics: DSC, average surface distance (ASD), precision (PRE), sensitivity (SEN), and specificity (SPE).
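The voxel-wise metrics follow directly from the confusion-matrix counts, as sketched below (ASD requires surface extraction and is omitted; the sketch assumes both binary masks are non-empty):

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    """Voxel-wise DSC, precision, sensitivity, specificity for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)      # foreground voxels correctly segmented
    fp = np.sum(pred & ~gt)     # background voxels wrongly labeled as vessel
    fn = np.sum(~pred & gt)     # vessel voxels missed by the model
    tn = np.sum(~pred & ~gt)    # background voxels correctly left unlabeled
    return {
        "DSC": 2 * tp / (2 * tp + fp + fn),
        "PRE": tp / (tp + fp),
        "SEN": tp / (tp + fn),
        "SPE": tn / (tn + fp),
    }
```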

Postprocessing and analysis

Postprocessing was conducted to enhance the accuracy and quality of the image segmentation results. We employed operations such as edge smoothing, region merging, splitting, and filtering on every image in the validation set and test set. These operations fill voids and connect disjointed edges, merge small neighboring regions into more comprehensive areas, improve segmentation consistency and connectivity, and eliminate artifacts, isolated points, or mislabeling in the segmentation results. Specifically, we first performed mean filtering on the entire image and then removed regions smaller than 50 pixels and spliced neighboring regions that were no more than 10 pixels away from each other. Finally, we quantitatively and qualitatively evaluated the segmentation results by calculating various evaluation metrics on the test set and comparing them with the physician’s criteria for manual segmentation. The segmentation outcomes were visualized in three dimensions, enabling the feasibility and accuracy of the segmentation results to be observed.
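A sketch of the connected-component postprocessing is given below (removing regions under 50 voxels is as stated; implementing the 10-voxel splicing step as morphological closing is an assumption about the authors' exact operation):

```python
import numpy as np
from scipy.ndimage import label, binary_closing

def postprocess(mask: np.ndarray, min_size: int = 50, gap: int = 10) -> np.ndarray:
    """Drop components under min_size voxels, then bridge nearby components."""
    labeled, n = label(mask)
    sizes = np.bincount(labeled.ravel())   # voxel count per component
    keep = sizes >= min_size
    keep[0] = False                        # background stays background
    mask = keep[labeled]
    # closing with ~gap/2 iterations can merge regions up to `gap` voxels apart
    mask = binary_closing(mask, structure=np.ones((3, 3, 3)),
                          iterations=gap // 2)
    return mask.astype(np.uint8)
```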

Statistical analysis

First, we calculated the performance metrics for each model and performed a Shapiro–Wilk test on the DSC, ASD, PRE, SEN, and SPE values of the four models on the two test sets. If the p value exceeded 0.05, the data were considered normally distributed; otherwise, we examined the quantile–quantile plot, and if the points fell approximately along a straight line, normality was still assumed; if not, the data were considered non-normally distributed. We then used one-way ANOVA to compare the performances of the different models, calculating the F statistic and the corresponding p value. A p value less than 0.05 was considered to indicate a statistically significant difference between model performances. All analyses were performed with IBM SPSS Statistics (version 27).
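The normality check and ANOVA can be reproduced with SciPy as follows (a sketch; the analysis was performed in SPSS, and the quantile–quantile plot inspection described above is visual and is not automated here):

```python
import numpy as np
from scipy.stats import shapiro, f_oneway

def compare_models(scores_by_model: dict, alpha: float = 0.05):
    """Shapiro-Wilk normality check per model, then one-way ANOVA across models."""
    normality = {name: shapiro(s).pvalue > alpha
                 for name, s in scores_by_model.items()}
    f_stat, p = f_oneway(*scores_by_model.values())
    return normality, f_stat, p
```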
