AI-based MRI auto-segmentation of brain tumor in rodents, a multicenter study

The study was executed as shown in Fig. 1, including image acquisition, data preparation, model training and model validation.

Fig. 1

Flow chart and 3D U-Net architecture of the current study. Collection and allocation of data for model training, validation and testing (A). All data were manually segmented and pre-processed (B), followed by data augmentation and model training (C). The trained models were challenged with Gaussian-noise-added images and evaluated quantitatively (D). AI-assisted segmentation was demonstrated against ground truth produced by two radiologists (D). The 3D U-Net architecture shared by the two models is shown (E). Abbreviations: AI: artificial intelligence; RV: volume ratio; HD: Hausdorff distance; MSD: mean surface distance; DSC: Dice similarity coefficient

Collections of datasets

For the Leuven dataset, the animal model of metastatic brain tumor was established with proper laboratory animal care, after ethical committee approval by KU Leuven (P046/2019) (Fig. 1A). The rat rhabdomyosarcoma cell line, kindly provided by the Nanohealth and Optical Imaging group at KU Leuven, was cultured in DMEM (Gibco, USA) supplemented with 10% FBS and 1% penicillin/streptomycin at 37 °C in a 5% CO2 atmosphere. Mycoplasma contamination was excluded with the e-Myco PCR kit (Boca Scientific, USA). The cell line was chosen based on the following considerations. Firstly, it is natively compatible with immunocompetent WAG/Rij rats, allowing the cancer-immunity interaction to be reproduced. Secondly, the derived animal model exhibits MRI manifestations similar to those of clinical patients [22]. Thirdly, we aimed at developing models for brain or brain tumor segmentation, rather than elaborating on the biological disparity between different types of brain metastasis. The brain metastasis model was induced by surgical implantation, as published before [22].

MRI scans were performed on a 3.0T magnet (MAGNETOM Prisma; Siemens, Erlangen, Germany) with a 16-channel phased-array wrist coil, under gas anesthesia with a mixture of 3% isoflurane, 80% air and 17% oxygen, using MRI sequences optimized from clinically used ones (Table 1). To ensure generalizability and translational potential, commonly used sequences were adopted, including T1-weighted imaging (T1WI), T2-weighted imaging (T2WI) and contrast-enhanced T1-weighted imaging (CE-T1WI). These sequences provide high-resolution, undistorted anatomical information in the brain. To increase the generalizability of the model, cases with various tumor sizes, cases with or without ventriculomegaly, and cases with intra-tumoral necrosis were included. Cases with missing or unsatisfactory MRI images were excluded.

Table 1 Summary of MRI scanning parameters

The Cancer Imaging Archive (TCIA) dataset consists of MRI images (Philips 3.0T magnet, the Netherlands) from genetically engineered mouse models of high-grade astrocytoma, including glioblastoma multiforme, and a surgically implanted orthotopic model based on the U87 cell line. In the genetically engineered mouse models, the most dysregulated networks in glioblastoma multiforme, including RB, KRAS and PI3K signaling, are perturbed. These genetic aberrations induce the development of mouse high-grade astrocytoma similar to that in humans. Thus, the TCIA dataset is more diverse in terms of tumor induction methods and pathological and genetic profiles. Two out of 48 cases were excluded from the TCIA dataset due to incompleteness of sequences (Fig. 1). Cases with ambiguous tumor lesions were excluded from model training.

In total, 46 cases from the TCIA dataset and 57 cases from the Leuven dataset were included. For model 1, responsible for segmentation of the tumor-bearing brain, 57 cases from KU Leuven were used (46 for training and 11 for validation) and 46 cases from TCIA were used (28 for training and 18 for validation). For model 2, responsible for segmentation of brain lesions, 48 cases from KU Leuven were used (40 for training and 8 for validation) and 42 cases from TCIA were used (30 for training and 12 for validation).

Manual segmentation

Ground truth for both the Leuven and TCIA datasets was generated in ITK-SNAP (http://www.itksnap.org), facilitated by intensity-based thresholding and region-growing algorithms, by two co-authors, Yuanbo Feng and Yicheng Ni, each with more than 10 years of experience in experimental and clinical radiology (Fig. 1B) [26]. The brain and the tumor were segmented separately. Brain segmentation was mainly based on T2WI and propagated to the other sequences. Tumor segmentation was mainly based on CE-T1WI, with reference information from the other sequences. For each segmentation task (either brain or tumor), Yicheng Ni and Yuanbo Feng performed segmentation independently, and consensus was reached through discussion whenever there was a disagreement.
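
As a rough illustration of the semi-automatic pre-labeling described above, the following Python sketch combines intensity thresholding with seeded region growing using SimpleITK, which exposes the same ITK filters that underpin ITK-SNAP. The file names, seed coordinates and intensity bounds are hypothetical placeholders, and the output is only a pre-label intended for manual refinement, not the exact workflow used in the study.

```python
import SimpleITK as sitk

# Load the CE-T1WI volume used for tumor pre-labeling (file name is a placeholder).
image = sitk.ReadImage("ce_t1wi.nii.gz", sitk.sitkFloat32)

# Intensity-based thresholding: keep voxels within a hypothetical enhancing range.
threshold_mask = sitk.Cast(
    sitk.BinaryThreshold(image, lowerThreshold=300, upperThreshold=1500,
                         insideValue=1, outsideValue=0),
    sitk.sitkUInt8)

# Region growing from a seed placed inside the enhancing lesion (index is hypothetical).
seed = (64, 70, 20)  # (x, y, z) voxel index
region_mask = sitk.Cast(
    sitk.ConnectedThreshold(image, seedList=[seed], lower=300, upper=1500),
    sitk.sitkUInt8)

# Keep voxels supported by both cues; the result is then refined manually in ITK-SNAP.
pre_label = sitk.And(threshold_mask, region_mask)
sitk.WriteImage(pre_label, "tumor_prelabel.nii.gz")
```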

AI model architecture

We adopted a stepwise solution for the segmentation tasks: firstly, developing a model to segment the tumor-bearing brain from images of the head and neck region; and secondly, developing a model to segment the tumor from the brain images, for both datasets (Fig. 1C). These models are referred to as model 1 (segmentation of the tumor-bearing brain) and model 2 (segmentation of the brain tumor). The stepwise solution was adopted with future applications in mind. Segmentation of only the brain tissue, namely skull stripping, highlights brain morphology. In quantitative imaging analyses, intra-individual comparison between the brain tumor and the contralateral brain tissue is widely adopted; once both the brain and the tumor have been segmented, the contralateral brain tissue can be derived easily, as in the sketch below.
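
The following minimal sketch, assuming each trained 3D U-Net is wrapped as a callable returning a probability map, illustrates how the two steps chain together and how a normal-brain mask falls out of simple mask arithmetic once both segmentations exist. The function names and the 0.5 threshold are illustrative, not part of the published pipeline.

```python
import numpy as np

def stepwise_segmentation(head_volume, brain_model, tumor_model):
    """Illustrative two-step pipeline: skull stripping first, tumor second.

    `brain_model` and `tumor_model` are placeholders for the trained 3D U-Nets
    (model 1 and model 2); each is assumed to return a voxel-wise probability map.
    """
    brain_mask = brain_model(head_volume) > 0.5   # model 1: tumor-bearing brain
    brain_only = head_volume * brain_mask         # skull-stripped input for model 2
    tumor_mask = tumor_model(brain_only) > 0.5    # model 2: brain tumor

    # With both masks available, normal (e.g. contralateral) brain tissue is
    # simply the brain mask with the tumor removed.
    normal_brain_mask = brain_mask & ~tumor_mask
    return brain_mask, tumor_mask, normal_brain_mask
```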

The models were optimized from the basic 3D U-Net architecture (Fig. 1E) [3]. The network weights were optimized with the Adam optimizer at an initial learning rate of 10⁻⁴. A loss function combining Dice loss and focal loss was adopted to address the class imbalance caused by the small volume of the ROI. The loss function assigned weights of 0.75 and 0.25 to ROI and non-ROI voxels, respectively.
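
A minimal sketch of such a combined loss is given below (PyTorch, binary segmentation). The 0.75/0.25 voxel weighting follows the description above; the focal gamma and the simple summation of the two terms are assumptions rather than the exact formulation used in the study.

```python
import torch
import torch.nn.functional as F

def dice_focal_loss(logits, target, roi_weight=0.75, bg_weight=0.25, gamma=2.0, eps=1e-6):
    """Combined soft Dice + focal loss for a binary 3D segmentation (illustrative)."""
    prob = torch.sigmoid(logits)

    # Soft Dice loss over the whole volume.
    intersection = (prob * target).sum()
    dice = (2.0 * intersection + eps) / (prob.sum() + target.sum() + eps)
    dice_loss = 1.0 - dice

    # Focal loss with class weights on ROI (target == 1) and non-ROI voxels.
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    p_t = prob * target + (1.0 - prob) * (1.0 - target)
    alpha = roi_weight * target + bg_weight * (1.0 - target)
    focal_loss = (alpha * (1.0 - p_t) ** gamma * bce).mean()

    return dice_loss + focal_loss
```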

Model training and validation

To train our 3D U-Net models [3], we first established a training dataset by random selection, with the remaining data serving as the test dataset. Before training, data preprocessing was performed, including intensity normalization and resampling to isotropic 0.5 mm by B-spline interpolation. Data were augmented by rotations in multiples of 90 degrees and by vertical and horizontal flips. Although previous studies have shown that a larger patch size is generally associated with superior model performance [6, 9], a patch size of 64 × 64 × 64 was selected here as a trade-off between run time, resource constraints and information loss. To confirm applicability in different MRI settings, training and validation were performed on the two MRI datasets, Leuven and TCIA, with noise-added images for extra validation (Fig. 1D) [4, 12]. For the noise addition, Gaussian white noise was added at sigma levels from 1 to 15, in steps of 1, after normalization of the images to the range 0–255. Contralateral normal brain tissue and background areas on T2WI images were selected for the calculation of the signal-to-noise ratio (SNR) (Additional file 1: Fig. S1). The SNR of both datasets decreased to approximately 1 when the sigma value was around 15 (Additional file 1: Fig. S2).
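
The preprocessing and noise-validation steps could look like the following sketch (SimpleITK/NumPy). The 0.5 mm B-spline resampling, the 0–255 normalization and the sigma range follow the text; the SNR formula (mean of the tissue ROI divided by the standard deviation of the background ROI) is one common definition and is an assumption here.

```python
import numpy as np
import SimpleITK as sitk

def resample_isotropic(image, spacing=0.5):
    """Resample an MRI volume to isotropic 0.5 mm using B-spline interpolation."""
    original_spacing = image.GetSpacing()
    original_size = image.GetSize()
    new_size = [int(round(sz * sp / spacing))
                for sz, sp in zip(original_size, original_spacing)]
    return sitk.Resample(image, new_size, sitk.Transform(), sitk.sitkBSpline,
                         image.GetOrigin(), [spacing] * 3, image.GetDirection(),
                         0.0, image.GetPixelID())

def add_gaussian_noise(volume, sigma):
    """Normalize to 0-255, then add Gaussian white noise with the given sigma (1-15)."""
    norm = (volume - volume.min()) / (volume.max() - volume.min()) * 255.0
    noisy = norm + np.random.normal(0.0, sigma, size=norm.shape)
    return np.clip(noisy, 0, 255)

def snr(volume, tissue_mask, background_mask):
    """SNR from a contralateral-brain ROI and a background ROI (assumed definition)."""
    return volume[tissue_mask].mean() / volume[background_mask].std()
```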

Quantitative evaluation of AI performance

The Dice similarity coefficient (DSC) was adopted to quantify the volume-based similarity between the ground truth and the AI-derived segmentation [29]. The DSC, which always lies between 0 and 1, increases with the overlap between the two segmentations. The volume ratio (RV) is the ratio of the ROI volumes of the two segmentations, defined as RV(seg1, seg2) = V1/V2, where V1 and V2 are the volumes of the two segmentations. Mean surface distance (MSD) and Hausdorff distance (HD) measure the surface-based difference between two segmentations [7]. MSD computes the average distance between the two segmentation surfaces, whereas HD computes the largest distance between them.

$$ \begin{aligned} & Dice\,Similarity\,Coefficient = \frac{2\left| {Seg_{1} \cap Seg_{2} } \right|}{\left| {Seg_{1} } \right| + \left| {Seg_{2} } \right|} \\ & Volume\,ratio = \frac{V_{1} }{V_{2} } \\ & Mean\,surface\,distance = \frac{1}{n_{S} + n_{S^{\prime}} }\left( {\mathop \sum \limits_{p = 1}^{n_{S} } d\left( {p,S^{\prime} } \right) + \mathop \sum \limits_{p^{\prime} = 1}^{n_{S^{\prime}} } d(p^{\prime} ,S)} \right) \\ \end{aligned} $$

where p: pixel; S, S′: surfaces of the model segmentation and the ground truth; d(p, S′): the minimum Euclidean distance between p and all pixels p′ on surface S′; n_S, n_S′: the number of surface pixels on S and S′.

$$ Hausdorff\,distance = max\left\{ {\mathop {max}\limits_{p \in S} d\left( {p,S^{\prime} } \right),\mathop {max}\limits_{p^{\prime} \in S^{\prime} } d\left( {p^{\prime} ,S} \right)} \right\} $$
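
For reference, the four metrics defined above can be computed from two binary masks as in the following NumPy/SciPy sketch. Surfaces are approximated by mask boundaries obtained through binary erosion, and the 0.5 mm isotropic spacing matches the resampling described earlier; this is an illustrative implementation, not the exact code used in the study.

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def segmentation_metrics(seg, gt, spacing=(0.5, 0.5, 0.5)):
    """DSC, volume ratio, mean surface distance and Hausdorff distance of two binary masks."""
    seg, gt = seg.astype(bool), gt.astype(bool)

    dsc = 2.0 * np.logical_and(seg, gt).sum() / (seg.sum() + gt.sum())
    rv = seg.sum() / gt.sum()

    # Surface voxels: mask minus its erosion.
    seg_surface = seg & ~binary_erosion(seg)
    gt_surface = gt & ~binary_erosion(gt)

    # Distance maps to each surface (in mm, using the voxel spacing).
    dist_to_gt = distance_transform_edt(~gt_surface, sampling=spacing)
    dist_to_seg = distance_transform_edt(~seg_surface, sampling=spacing)

    d_seg_to_gt = dist_to_gt[seg_surface]   # d(p, S') for every p on the predicted surface
    d_gt_to_seg = dist_to_seg[gt_surface]   # d(p', S) for every p' on the ground-truth surface

    msd = (d_seg_to_gt.sum() + d_gt_to_seg.sum()) / (len(d_seg_to_gt) + len(d_gt_to_seg))
    hd = max(d_seg_to_gt.max(), d_gt_to_seg.max())
    return dsc, rv, msd, hd
```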

Practicability of AI-assisted segmentation

To illustrate whether AI-assisted segmentation can reduce inter-observer disparity, the inter-observer disparity of fully manual segmentations was compared with that of AI-assisted segmentations. The inter-observer disparity was calculated by comparing the native masks drawn by the two radiologists (Yicheng Ni and Yuanbo Feng), whereas the disparity of AI-assisted segmentations was calculated by comparing masks that were first generated by the AI models and then modified by the two radiologists. Additionally, the time required for de novo manual segmentation and for AI-assisted segmentation was compared.
