One-shot neuroanatomy segmentation through online data augmentation and confidence-aware pseudo labels

Brain structure segmentation from magnetic resonance (MR) scans is a fundamental step in neuroimage analysis. In particular, fine-grained brain structure segmentation provides a wealth of information for the diagnosis of brain disorders (Iglesias et al., 2016, García-Lorenzo et al., 2013), treatment planning (Burgos et al., 2017, Bauer et al., 2013), and clinical assessment (van Erp et al., 2016). However, manual segmentation of 3D MRI is highly laborious, and the results may suffer from poor reproducibility due to inter-operator variability. Although classical tools such as FreeSurfer (Fischl, 2012) and SAMSEG (Puonti et al., 2016) can also perform whole-brain segmentation, they are usually computationally expensive and time-consuming.

Deep learning-based brain segmentation methods (Huo et al., 2019, Kamnitsas et al., 2017, Li et al., 2017, Moeskops et al., 2016, Zhang et al., 2015, Roy et al., 2019, Li et al., 2021b, Li et al., 2021a, Wachinger et al., 2018, Chen et al., 2018) have been widely studied in recent years. These methods usually build segmentation models on 3D convolutional neural networks (CNNs) that take image patches as input; other methods (Li et al., 2021b, Roy et al., 2019) operate on 2D slices. Although such fully supervised methods achieve fast and accurate whole-brain segmentation, they require many high-quality labeled scans for training. Training deep models with only one or a few labeled samples is therefore highly desirable for practical deployment. Data augmentation is a key strategy for generating new labeled images. Traditional augmentation methods, such as elastic transformation, can only generate simple variations of the training samples. Recent studies (Sandfort et al., 2019, Mahapatra et al., 2018, Bailo et al., 2019, Chaitanya et al., 2019) leverage generative adversarial networks (GANs) to synthesize more realistic labeled samples, but GAN training may suffer from mode collapse.
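For concreteness, the traditional elastic augmentation mentioned above can be implemented in a few lines. The following is a minimal NumPy/SciPy sketch (the function name and parameter values are illustrative, not taken from any cited work); because the displacement field is merely smoothed random noise, it carries no anatomy-specific variation, which is why such augmentation yields only simple samples.

    import numpy as np
    from scipy.ndimage import gaussian_filter, map_coordinates

    def elastic_augment(image, label, alpha=10.0, sigma=4.0, seed=None):
        # Apply the same random elastic deformation to a 3D image
        # and its discrete label map.
        rng = np.random.default_rng(seed)
        shape = image.shape
        # Smooth random displacement field, one component per spatial axis.
        disp = [alpha * gaussian_filter(rng.uniform(-1, 1, shape), sigma)
                for _ in range(3)]
        coords = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
        warped = [c + d for c, d in zip(coords, disp)]
        aug_image = map_coordinates(image, warped, order=1)  # linear for intensities
        aug_label = map_coordinates(label, warped, order=0)  # nearest for labels
        return aug_image, aug_label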

Classical atlas-based segmentation methods (Heckemann et al., 2006, Collins et al., 1995, Artaechevarria et al., 2009, Jia et al., 2012) reduce the amount of labeled data required. However, they remain time-consuming and are unsuitable for applications that demand fast processing. Recent work has enhanced atlas-based segmentation with CNNs. Specifically, Wang et al. (2020) and Dinsdale et al. (2019) used a deep learning model to learn the correspondence from the atlas to the target image and then employed the predicted correspondence to transfer the labels. Zhao et al. (2019) developed a one-shot segmentation method that first learns the spatial and appearance transformations from the reference atlas to unlabeled images and then uses these transformations to synthesize extra labeled images for segmentation. To increase the variety of the transformations, later studies (Ding et al., 2021, Tomar et al., 2022) learn the spatial and appearance transformation distributions and generate more diverse labeled images by sampling from the learned priors. However, these methods require additional networks to learn the transformations, making the training process cumbersome. Moreover, the unlabeled scans are discarded during the training of the segmentation network.
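The label-transfer step shared by these CNN-based registration approaches amounts to resampling the atlas volume under the predicted deformation. A minimal PyTorch sketch is given below; the tensor layout and the voxel-displacement convention are our assumptions rather than the exact code of the cited papers.

    import torch
    import torch.nn.functional as F

    def warp_label(atlas_vol, flow):
        # Resample a volume of shape (1, C, D, H, W) under a dense
        # displacement field flow of shape (1, 3, D, H, W), given in
        # voxels with channel order z-y-x (an assumed convention).
        _, _, D, H, W = atlas_vol.shape
        zz, yy, xx = torch.meshgrid(torch.arange(D), torch.arange(H),
                                    torch.arange(W), indexing="ij")
        base = torch.stack((xx, yy, zz), dim=-1).float()       # (D, H, W, 3), x-y-z
        disp = flow.permute(0, 2, 3, 4, 1)[0][..., [2, 1, 0]]  # reorder to x-y-z
        grid = base + disp
        # Normalize voxel coordinates to [-1, 1] as grid_sample expects.
        size = torch.tensor([W - 1, H - 1, D - 1], dtype=torch.float32)
        grid = 2.0 * grid / size - 1.0
        return F.grid_sample(atlas_vol, grid.unsqueeze(0),
                             mode="bilinear", align_corners=True)

With mode="nearest" the same routine transfers discrete label maps directly; alternatively, one-hot labels can be warped bilinearly and then re-discretized.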

To address these challenges, we propose a unified deep learning framework for one-shot neuroanatomy segmentation. As illustrated in Fig. 1, our method differs from previous approaches (Zhao et al., 2019, Ding et al., 2021, Tomar et al., 2022): we couple the deformation modeling and segmentation tasks and train the network end to end, so that the shared encoder can capture appearance variations between the atlas and unlabeled images. Specifically, the atlas and an unlabeled image are fed to the shared encoder and the deformation modeling head to estimate the atlas-to-image spatial deformation field. The predicted deformation field is then used to generate an augmented image pair, which in turn trains the segmentation branch. To make full use of the unlabeled images, we also predict segmentation maps for them and introduce confidence-aware pseudo labels for supervision. Experiments on three benchmark datasets demonstrate that our method consistently outperforms competing methods.
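A high-level sketch of one training step of this pipeline follows. All module and function names are hypothetical placeholders (warp_label is the resampling routine sketched above, and confidence_pseudo_loss is detailed after the contribution list); the paper's exact architecture and losses may differ.

    import torch
    import torch.nn.functional as F

    def training_step(encoder, deform_head, seg_head,
                      atlas_img, atlas_onehot, unlabeled_img):
        # 1. The shared encoder sees the atlas and an unlabeled image
        #    together, capturing the appearance variation between them.
        feats = encoder(torch.cat([atlas_img, unlabeled_img], dim=1))
        # 2. The deformation head predicts the atlas-to-image displacement field.
        flow = deform_head(feats)
        # 3. Online augmentation: warp the atlas image and its one-hot labels
        #    with the flow to synthesize a new labeled pair on the fly.
        aug_img = warp_label(atlas_img, flow)
        aug_lbl = warp_label(atlas_onehot, flow).argmax(dim=1)
        # 4. Supervised loss on the augmented pair trains the segmentation branch.
        sup_loss = F.cross_entropy(seg_head(aug_img), aug_lbl)
        # 5. The unlabeled image contributes via confidence-aware pseudo labels.
        unsup_loss = confidence_pseudo_loss(seg_head(unlabeled_img))
        return sup_loss + unsup_loss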

In summary, this paper presents a new deep learning model for brain structure segmentation in the one-shot setting. The main contributions are as follows:

We present an end-to-end unified framework for one-shot neuroanatomy segmentation that couples the deformation modeling and segmentation tasks.

We develop multi-scale deformation modeling to accurately capture the correspondence from the atlas to unlabeled images, and use online data augmentation to generate augmented image pairs.

We introduce confidence-aware pseudo labels that retain only the most reliable voxels of the unlabeled images, further improving segmentation performance (see the sketch after this list).

We conduct experiments on three public datasets: CANDI, multi-center ABIDE, and OASIS. The results show that our method significantly outperforms other state-of-the-art methods, demonstrating its superiority and robustness.
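As a concrete illustration of the confidence-aware pseudo labels in the contributions above, a minimal self-training loss that masks out low-confidence voxels might look as follows; the threshold tau and the exact confidence criterion are assumptions, not the paper's specification.

    import torch
    import torch.nn.functional as F

    def confidence_pseudo_loss(logits, tau=0.9):
        # logits: (N, C, D, H, W) predictions for an unlabeled image.
        probs = F.softmax(logits, dim=1)
        conf, pseudo = probs.max(dim=1)      # per-voxel confidence and pseudo label
        mask = (conf >= tau).float()         # keep only the most reliable voxels
        loss = F.cross_entropy(logits, pseudo, reduction="none")
        return (loss * mask).sum() / mask.sum().clamp(min=1.0)

Note that gradients flow only through the masked voxels, and the pseudo labels are argmax indices and thus implicitly detached from the computation graph.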
