AnNoBrainer, An Automated Annotation of Mouse Brain Images using Deep Learning

Data Collection

Data from a study that characterized the aSyn pre-formed fibril surgical model of synucleinopathy in A30P transgenic mice were used for model validation (Schultz et al. 2018). Image selection depended largely on stain quality and tissue integrity. Sections with significant tears or deformation of the tissue, which can dramatically affect detection of morphological boundaries, were excluded. Likewise, images were excluded where stain quality was insufficient for detection of morphological boundaries, or where it differed substantially from the staining intensity of neighboring sections. The initial dataset consisted of 313 brain samples on 19 slides. AnNoBrainer was eventually tested on only 229 brain samples, since some brain samples did not pass the initial quality control, e.g., due to processing artifacts (such as a missing half of a brain). Selected brain samples on each slide were annotated by expert neuroscientists and were limited to three brain regions: CP (caudoputamen), SNr (substantia nigra), and PAG+AQ (periaqueductal gray + cerebral aqueduct). The final dataset for quantitative analysis consisted of 89 brain samples from 17 different layers for CP, 105 brain samples from 18 different layers for SNr, and 35 brain samples from 7 different layers for PAG+AQ. The distribution of brain sections in each layer within each region is shown in Supplementary Fig. S2.

The primary objective of our study was to conduct a qualitative and quantitative analysis of AnNoBrainer, which requires both the registration results and corresponding ground truth annotations. Manual annotation of brain regions can be challenging for individuals without specialized training in neuroscience; we therefore sought the assistance of expert neuroscientists to provide ground truth annotations. Our ground truth dataset encompasses annotations only for the three regions that were specifically investigated in our experiments. While this might at first seem a limitation, we demonstrate the ability of AnNoBrainer to annotate every region of interest in the brain sample (Fig. S1).

We also conducted a qualitative analysis of eight additional brain regions of interest (ROIs) commonly used in neuroscience: ACB (Nucleus Accumbens), aco (Anterior Commissure), cc (Corpus Callosum), HY (Hypothalamus), CTX (Cerebral cortex), LSr (Lateral septal nucleus – rostral part), MS (Medial septal nucleus) and HIP (Hippocampus region). These regions cover a significant portion of the brain.

Manual Annotations

Manual annotations of the caudoputamen (CP) were made based on morphological boundaries. The lateral edge of the CP was delineated by the corpus callosum and the medial edge by the lateral ventricle. The dorsal CP was separated from the ventral CP by drawing a horizontal line from the most ventral aspect of the lateral ventricle, where it often meets the anterior commissure, across to the lateral edge of the CP. The full striatal area was not captured manually for two reasons. First, the injection into the CP targeted only the dorsal portion of the CP. Second, accurately and quickly discerning the ventral CP from the nucleus accumbens using a hematoxylin stain is more challenging than identifying the lateral and medial boundaries of the CP. Using only well-defined morphological markers for manual annotations enabled consistency and minimized biases in drawing the region of interest. However, this also illustrates why manual annotations often sacrifice capturing the full anatomical region of interest in favor of accommodating technical ease.

Qualitative Analysis

In addition to manual annotations, which can be time-consuming and impractical for every brain region, we conducted a qualitative analysis of additional brain regions (see Data Collection section). During the evaluation, each annotation was rated independently by two expert neuroscientists on a scale from 1 to 5 representing the degree of manual adjustment the automated annotation would require: category 1, no manual adjustment; category 2, minor adjustments; category 3, medium adjustments; category 4, significant manual adjustments; and category 5, a completely mis-annotated ROI. This qualitative analysis proved to be too granular, as it reflected disagreement between the expert annotators, highlighting the challenges in annotating brain images even for experts. We therefore grouped these data into two categories, "useful" and "not useful". Annotations falling into category 1 (no manual adjustment) or category 2 (minor adjustment) were considered "useful", as these instances required minimal effort and indicated a fairly accurate annotation. Annotations rated 3 and above were considered "not useful", as they required significant adjustments, raising concerns about their accuracy and reliability.
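The grouping of the five rating categories into the binary useful/not useful labels can be sketched as follows (the function name is illustrative, not part of the pipeline):

```python
def to_usefulness(rating: int) -> str:
    """Collapse the 1-5 adjustment rating into a binary label.

    Ratings 1 (no adjustment) and 2 (minor adjustment) count as
    "useful"; ratings 3 and above required substantial rework and
    count as "not useful".
    """
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be 1-5, got {rating}")
    return "useful" if rating <= 2 else "not useful"

# Example: one annotator's ratings for a set of ROIs
ratings = [1, 2, 3, 5, 2]
labels = [to_usefulness(r) for r in ratings]
```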

Allen Mouse Brain Atlas

The Allen mouse brain atlas (Lein et al., 2007) consists of 132 coronal sections evenly spaced at 100 µm intervals and annotated in detail to delineate numerous brain regions. It was designed to be easily integrated into digital applications in the field of automated histological segmentation, and it is considered a reliable source of mouse brain anatomy in which regions of interest are mapped and labeled.

Brain Detection

Detection of brains and handwritten notes on a slide is treated as an object detection problem, which consists of two subproblems: (i) localizing the objects of interest (brains and handwritten notes) with bounding boxes and (ii) labeling each detected box with its category. For this purpose, a transfer learning strategy (Zhuang et al., 2021; Talo, 2019) was followed: a pre-trained Mask R-CNN model (He et al., 2017) was modified by replacing its last fully connected layer with a set of new fully connected layers retrained on 20 manually labeled slides to detect two distinct categories – 1) mouse brains and 2) handwritten notes. For training, the ADAM (Adaptive Moment Estimation) optimizer was used with 60 epochs and a batch size of 2. Following brain detection, bounding boxes were generated for each individual mouse brain on a slide, and centroids were calculated to form a brain centroid grid.

Linking of Detected Brain Images

To differentiate between experimental and control group brains, an experimental table was manually created for each slide. To link the detected brains from the slides to their corresponding descriptions in the experimental table, the following approach was employed. First, the brain centroid grid from the brain detection step was obtained. A similar grid was then constructed from the experimental table using its row and column coordinate structure. Both grids were then standardized to ensure that they were in the same unit of measurement. To connect the corresponding points on the two grids, the Hungarian algorithm (Kuhn, 2005) was applied. The algorithm pairs the points from both grids in such a way that the total sum of distances is minimized.
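The linking step can be sketched with SciPy's implementation of the Hungarian algorithm (the standardization scheme and example coordinates below are illustrative):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def link_brains(detected_centroids, table_grid):
    """Match detected brain centroids to experimental-table positions.

    Both point sets are standardized (zero mean, unit variance per axis)
    so pixel coordinates and row/column indices become comparable; the
    Hungarian algorithm then finds the assignment minimizing the total
    pairwise distance.
    """
    def standardize(pts):
        pts = np.asarray(pts, dtype=float)
        return (pts - pts.mean(axis=0)) / pts.std(axis=0)

    cost = cdist(standardize(detected_centroids), standardize(table_grid))
    rows, cols = linear_sum_assignment(cost)
    return dict(zip(rows, cols))  # detected index -> table index

# A 2x2 slide layout: pixel centroids vs. table (column, row) positions
detected = [(100, 100), (500, 100), (100, 400), (500, 400)]
table = [(0, 0), (1, 0), (0, 1), (1, 1)]
mapping = link_brains(detected, table)
```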

Matching with Reference Atlas Layer

Matching of brains to their respective reference (z-slice) layers is a manual, non-trivial and subjective task that requires expert knowledge. Multiple convolutional neural network architectures for classification were trained and tested, using a standard train/test split approach and image augmentations, including rotation, random brightness, channel flip, median blur, and elastic transforms (Buslaev et al., 2020). The last of these is an important component of the training, since individual brains are cut by hand and a certain level of asymmetry is generally present. The ResNet34 (He et al., 2015) and EfficientNet (Tan & Le, 2020) architectures were used. Since this task can also be interpreted as a regression problem, a Label Smoothing Cross-Entropy loss function (Müller et al., 2020) was used for the training. The training used models pre-trained on the ImageNet dataset and only replaced the last fully connected layers that are responsible for the classification itself. These were fully trained from scratch using a batch size of 12 and the ADAM optimizer for 12 epochs.

Image Registration Process

Image registration is an iterative optimization process that searches for the most suitable geometric transformation to spatially align two images, a reference image and a target image. The image registration process implemented in our pipeline consists of successive steps of geometric alignment between a reference Allen mouse brain atlas layer and a mouse brain identified on a slide.

The AirLab library (Sandkühler et al., 2020) and the detection of sparse cross-domain correspondences (Aberman et al., 2018) were the main tools utilized for AnNoBrainer's custom registration pipeline. AirLab provides flexibility with various image registration methods, loss functions, and regularization techniques, all built on PyTorch, and can be run on a GPU cluster. Affine registration was the first step in our procedure, followed by elastic registration to correct local nonlinear disparities. The objective function for image registration was extended by considering sparse correspondences detected in the target image and in the template image via the method of Aberman et al. (2018). Thereafter, the elastic registration regularization term was extended to minimize the distance between the corresponding landmarks while ensuring a smooth geometric transformation.

Affine Registration

Affine registration is a technique used to align two images by applying 2D linear transformations and translations. These transformations involve operations like scaling, reflection, and rotation. The objective is to gradually transform the registered image to closely resemble the template image, minimizing the loss function. To preserve the original shape of the image, a similarity transformation is utilized, involving simple operations like rotation and reflection. The affine registration process employs the Normalized Cross-Correlation loss as its objective function. AirLab's backend, which enables iterative GPU-based optimization, is used for the registration: the optimizer gradually adjusts the registered image to maximize its similarity to the template image. The ADAM optimizer is utilized with a learning rate of 0.01 and 1000 iterations.
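A stripped-down version of this step, written directly in PyTorch rather than through AirLab's API, optimizes a 2×3 affine matrix against an NCC objective (the synthetic images and the shorter iteration count are illustrative; the pipeline uses 1000 iterations on real sections):

```python
import torch
import torch.nn.functional as F

def ncc_loss(moving, fixed, eps=1e-8):
    """Negative normalized cross-correlation (lower is better)."""
    m = moving - moving.mean()
    f = fixed - fixed.mean()
    return -(m * f).sum() / (m.norm() * f.norm() + eps)

# Synthetic pair: a bright square, shifted between fixed and moving image.
fixed = torch.zeros(1, 1, 64, 64)
fixed[..., 20:40, 20:40] = 1.0
moving = torch.zeros(1, 1, 64, 64)
moving[..., 26:46, 26:46] = 1.0

# Affine parameters, initialized to the identity transform.
theta = torch.tensor([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]], requires_grad=True)
opt = torch.optim.Adam([theta], lr=0.01)

losses = []
for _ in range(200):
    opt.zero_grad()
    grid = F.affine_grid(theta.unsqueeze(0), fixed.shape, align_corners=False)
    warped = F.grid_sample(moving, grid, align_corners=False)
    loss = ncc_loss(warped, fixed)
    loss.backward()
    opt.step()
    losses.append(loss.item())
```

Because `grid_sample` is differentiable with respect to the sampling grid, the gradient flows back to the affine parameters and the loss decreases as the squares align.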

Elastic Registration

Following the affine registration alignment, the elastic registration process takes place. Elastic registration estimates a smooth, continuous function to map points from the registered image to the template image. There are three commonly used transformation models: linear, non-linear/interpolating, and dense. AnNoBrainer employs the non-linear/interpolating approach: to transform a point x in the image, the displacement f(x) is interpolated from neighboring control points using a basis function, for which a B-spline kernel is used. The diffeomorphic approach is employed to maintain topology preservation. Additionally, a regularization term is introduced on top of the affine pre-alignment: the displacement field is subjected to a diffusion regularizer (DiffusionRegularizer) that penalizes abrupt changes in the transformation. As in the affine registration, the normalized cross-correlation loss function is used for optimization, and the ADAM optimizer is employed with a learning rate of 0.01 and 1000 iterations for optimizing the transformation field.
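The diffusion regularizer mentioned above penalizes spatial variation in the displacement field; a minimal version over a dense 2-D field can be written as follows (the field layout is an assumption for illustration):

```python
import torch

def diffusion_regularizer(displacement):
    """Sum of squared finite differences of a displacement field.

    displacement: tensor of shape (H, W, 2) holding per-pixel x/y
    displacements. A constant (rigid) shift costs nothing; only spatial
    variation of the field is penalized, keeping the deformation smooth.
    """
    dy = displacement[1:, :, :] - displacement[:-1, :, :]
    dx = displacement[:, 1:, :] - displacement[:, :-1, :]
    return (dy ** 2).sum() + (dx ** 2).sum()

flat = torch.full((8, 8, 2), 3.0)   # uniform shift -> zero penalty
bumpy = torch.rand(8, 8, 2)         # varying field -> positive penalty
```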

Landmark Regularization Term

During the registration process, two common edge cases arise: excessive or insufficient distortion. These cases typically result from the balance between the loss function and the regularization term. To achieve realistic and anatomically reasonable distortions, smaller distortions with higher regularization are commonly preferred. However, when the tissue cut is not perpendicular to the z-axis of the brain, anatomical differences arise between the brain hemispheres and alignment is imperfect in some areas, especially when higher regularization is used. Addressing this requires a higher level of flexibility. Consequently, an additional step was implemented to adjust the regularization term for specific parts of the image, allowing for higher distortion locally.

Image landmarks play a crucial role in this process. In image processing, landmarks are characteristic features or structures in an image; in the context of this work, landmarks are points at the same semantic locations across two images, such as the left edge of an organ. When the landmarks are aligned, the regularization term behaves as it would without landmarks. However, when there are significant differences between the landmarks of the template image and the registered image, the effect of the regularization term is locally reduced, which can improve the overall image alignment. To identify these landmark points, a pre-trained neural network from Neural Best-Buddies: Sparse Cross-Domain Correspondence (Aberman et al., 2018) was utilized. This network finds semantically matching areas across a cross-domain image pair (two brain samples in our case). A regularization term derived from the landmarks was then constructed, with the objective of minimizing the squared Euclidean distance between corresponding landmarks. The squared Euclidean distance was preferred over the classic square-root version because it preserves the smoothness of the function at all points, which is crucial for gradient-based optimization.

The landmark regularization term (\(LRT\)) is defined as:

$$LRT\left(p, q\right) = \sum\limits_{i=1}^{N}\sum\limits_{d=1}^{D}{\left(p_{i,d}-q_{i,d}\right)}^{2}$$

where \(p\) denotes a warped landmark point from the moving image, \(q\) denotes a landmark point from the fixed image, \(N\) represents the number of landmark points, and \(D\) represents the dimension, corresponding to the x and y positions in the image.
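The landmark term defined above translates directly into code (shapes follow the definition: N landmarks in D = 2 dimensions):

```python
import torch

def landmark_regularization(p, q):
    """Sum of squared Euclidean distances between corresponding landmarks.

    p: (N, 2) warped landmarks from the moving image.
    q: (N, 2) landmarks from the fixed image.
    Using the squared distance (no square root) keeps the gradient
    smooth everywhere, including at exact alignment.
    """
    return ((p - q) ** 2).sum()

p = torch.tensor([[0.0, 0.0], [2.0, 1.0]])
q = torch.tensor([[1.0, 0.0], [2.0, 3.0]])
# (0-1)^2 + (0-0)^2 + (2-2)^2 + (1-3)^2 = 1 + 0 + 0 + 4 = 5
```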

The elastic registration loss uses Normalized Cross-Correlation (\(NCC\)) extended with a Diffusion Regularizer term (\(DF\)). Thus, the loss function extended with the landmark regularization term is defined as follows:

$$Loss = NCC + \lambda_{DF} \cdot DF + \lambda_{LRT} \cdot LRT$$

where the weights \(\lambda_{DF}\) and \(\lambda_{LRT}\) are both set to 0.5.
