A deep learning model for generating [18F]FDG PET Images from early-phase [18F]Florbetapir and [18F]Flutemetamol PET images

Demographic information

The research cohort comprised patients referred to Geneva University Hospitals, ranging from cognitively unimpaired (CU) individuals to patients with mild cognitive impairment (MCI) or dementia. Approval was obtained from the local ethics committee, ensuring adherence to the ethical principles outlined in the Declaration of Helsinki and the good clinical practice standards established by the International Conference on Harmonization. All patients gave written informed consent in accordance with specific guidelines.

A total of 166 patients were included in our study and were categorized into: CU (N = 72), MCI (N = 73), and AD (N = 21) following standardized criteria for clinical staging. The inclusion criteria encompassed having at least one 3-dimensional (3D) T1-weighted MRI, undergoing dual-phase amyloid PET scans using either Fluorine-18 Florbetapir ([18F]FBP) (210 ± 18.77 MBq) or Fluorine-18 Flutemetamol ([18F]FMM) (166 ± 16.73 MBq), undergoing an [18F]FDG PET (203.89 ± 15.62 MBq) scan, and having an interval of less than 1 year between imaging procedures.

Table 1 presents the demographic and clinical information of our cohort. The mean time intervals between amyloid PET and [18F]FDG PET, between MRI and [18F]FDG PET, and between MRI and amyloid PET were 2.15 months (standard deviation, SD = 3.06), 1.89 months (SD = 4.15), and 2.76 months (SD = 3.40), respectively.

Table 1 Patient demographics of the dataset used in this study

As a comparison group for the single-subject analyses, we included 112 healthy controls (HCs) who underwent [18F]FDG PET and had normal visual and semiquantitative [18F]FDG PET assessments; these controls had already been validated and included in previous studies [16]. We performed separate evaluations for early-phase [18F]FBP (eFBP) and early-phase [18F]FMM (eFMM), and the results are reported separately.

MRI acquisition

High-resolution anatomical 3D T1-weighted imaging was performed at Geneva University Hospitals’ Division of Radiology on two 3 Tesla MRI scanners (Magnetom Skyra, Siemens Healthineers, Erlangen, Germany, and GE Healthcare, Milwaukee, Wisconsin), with matrix sizes of 256 × 256 and 254 × 254, slice thicknesses of 0.9 mm and 1 mm, and repetition times of 1930 ms and 7.2 ms, respectively.

PET acquisition

The [18F]FDG PET and amyloid brain PET scans were conducted at the Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospitals, on clinical PET scanners: Biograph 128 mCT, Biograph 128 Vision 600 Edge, Biograph 40 mCT, or Biograph 64 TruePoint (Siemens Medical Solutions). All of these scanners were comparable in performance. The [18F]FDG PET scans followed the guidelines of the European Association of Nuclear Medicine [17]. For amyloid PET imaging, we used either [18F]FBP (94 cases) or [18F]FMM (72 cases). The amyloid status (Aβ+/Aβ−) of each late-phase image was determined by a physician experienced in nuclear medicine, following the standard operating procedures approved by the European Medicines Agency.

For the early phase amyloid PET scan (eFBP and eFMM), image acquisition commenced promptly after the injection of the tracer to obtain a static image over 5 min for eFBP and 10 min for eFMM [18, 19]. The details of the PET acquisition protocol are depicted in Supplementary Fig. 1.

MRI and PET normalization processing

The MRI 3D T1 sequences were registered to the Montreal Neurological Institute (MNI) space with a 12-degrees-of-freedom transformation in Statistical Parametric Mapping (SPM12), run within MATLAB R2018b, version 9.5 (MathWorks Inc.). The [18F]FDG and eFBP/eFMM images were aligned with each subject's T1 MRI and then normalized to the MNI space using the transformation matrix from the MRI registration. PET images were spatially smoothed with a 3D 8 mm Gaussian kernel. All procedures followed established protocols [20].
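The kernel width quoted above can be turned into the standard deviation that most smoothing routines expect. The sketch below assumes the 8 mm figure is a full width at half maximum (FWHM), which is the usual SPM convention but is not stated explicitly in the text; the 2 mm isotropic voxel size is a hypothetical value used only for illustration.

```python
import math

def fwhm_to_sigma(fwhm_mm: float) -> float:
    """Convert a Gaussian FWHM (mm) to its standard deviation (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def sigma_in_voxels(fwhm_mm: float, voxel_size_mm: tuple) -> tuple:
    """Per-axis sigma in voxel units for the given voxel size (mm)."""
    sigma_mm = fwhm_to_sigma(fwhm_mm)
    return tuple(sigma_mm / v for v in voxel_size_mm)

# 8 mm kernel from the text, read as FWHM (assumption).
sigma_mm = fwhm_to_sigma(8.0)
# Hypothetical 2 mm isotropic grid; the actual voxel size is not given.
sigma_vox = sigma_in_voxels(8.0, (2.0, 2.0, 2.0))
```

Under this reading, an 8 mm FWHM corresponds to a sigma of roughly 3.4 mm.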

For the quantification of Standardized Uptake Value Ratio (SUVR), we employed Automated Anatomic Labeling atlas 3 (AAL3) [21] with 166 Regions of Interest (ROIs). SUVR values were computed by standardizing the uptake within these regions against the combined mean values of the pons and cerebellar vermis, serving as the reference region. The resulting intensity-normalized PET images were saved for subsequent analyses.
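The regional SUVR computation can be sketched as follows. This is a minimal illustration on flattened lists rather than the pipeline actually used: the label numbers are hypothetical placeholders, not real AAL3 indices, and the "combined mean" of pons and vermis is assumed to be the mean over the pooled voxels of both regions.

```python
from statistics import fmean

def regional_suvr(uptake, labels, reference_labels):
    """Return {roi_label: SUVR} for a flattened image.

    uptake           : voxel uptake values
    labels           : integer atlas label per voxel (0 = background)
    reference_labels : set of labels that form the reference region
    """
    # Pool all reference voxels and take a single mean (assumption).
    ref_values = [u for u, lab in zip(uptake, labels) if lab in reference_labels]
    if not ref_values:
        raise ValueError("reference region is empty")
    ref_mean = fmean(ref_values)

    # Group voxel values by ROI, skipping background.
    by_roi = {}
    for u, lab in zip(uptake, labels):
        if lab != 0:
            by_roi.setdefault(lab, []).append(u)

    return {roi: fmean(vals) / ref_mean for roi, vals in by_roi.items()}

# Toy data: labels 8 and 9 stand in for pons and vermis (hypothetical numbers).
uptake = [2.0, 4.0, 6.0, 6.0, 3.0, 3.0, 0.5]
labels = [1, 1, 2, 2, 8, 9, 0]
suvr = regional_suvr(uptake, labels, reference_labels={8, 9})
```

Dividing every voxel by the same reference mean would give the intensity-normalized image mentioned above.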

SwinUNETR model implementation

Our study introduces a novel convolution-free transformer architecture, drawing inspiration from prior works [22,23,24]. Our architecture features an encoder, bottleneck, decoder, and skip connections, predominantly centered around the Swin-transformer (Shifted windows) module [23].

The image processing starts by dividing input images (SUVR eFMM/eFBP images) into non-overlapping 4 × 4 blocks, linearly projecting them to create sequences for network input. The encoder utilizes patch-merging blocks for down-sampling and Swin-transformer blocks for representation learning, forming a hierarchical structure akin to the U-Net's architecture. The symmetric decoder employs Swin-transformer layers and patch-expanding units.
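To make the hierarchy concrete, the sketch below lists the token grid and channel width at each encoder stage for a 2D input. Only the 4 × 4 patch size comes from the text; the 224 × 224 input size, the embedding dimension of 96, and the four stages are hypothetical values borrowed from common Swin configurations. The rule that patch merging halves each spatial dimension and doubles the channel count follows the Swin-transformer design [23].

```python
def encoder_stages(height, width, patch=4, embed_dim=96, n_stages=4):
    """Return [(tokens_h, tokens_w, channels), ...] for each encoder stage."""
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    # Patch partition + linear projection: one token per patch.
    h, w, c = height // patch, width // patch, embed_dim
    stages = [(h, w, c)]
    for _ in range(n_stages - 1):
        if h % 2 or w % 2:
            raise ValueError("token grid must be even before patch merging")
        # Patch merging: each 2x2 group of neighbouring tokens becomes one
        # token, and the channel count doubles.
        h, w, c = h // 2, w // 2, c * 2
        stages.append((h, w, c))
    return stages

# Hypothetical 224 x 224 input slice.
stages = encoder_stages(224, 224)
```

The decoder would traverse the same list in reverse, with patch-expanding units undoing each merge.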

To facilitate signal transmission, skip connections are established between the encoder and decoder. At the encoder's core, a bottleneck comprises two consecutive Swin-transformer blocks, serving as an additional connection between the encoder and decoder without involving up- or down-sampling operations.

The Swin-transformer block, inspired by shifted-windows [23], employs patch division at one level and a shifted version at the next, enabling connections between different window shapes via self-attention mechanisms. Comprising layer normalization (LN), multi-head self-attention (MSA), multi-layer perceptrons (MLP), and multiple skip-connections, this block ensures efficient information flow. The architecture of SwinUNETR is illustrated in Fig. 1.
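In the notation of the Swin-transformer paper [23], a pair of consecutive blocks can be written as below. The symbols are those of that paper rather than of this text: W-MSA and SW-MSA denote window-based self-attention with regular and shifted window partitioning, and \hat{z}^{l} and z^{l} are the outputs of the attention and MLP sub-layers of block l.

```latex
\begin{aligned}
\hat{z}^{l}   &= \text{W-MSA}\big(\text{LN}(z^{l-1})\big) + z^{l-1}, \\
z^{l}         &= \text{MLP}\big(\text{LN}(\hat{z}^{l})\big) + \hat{z}^{l}, \\
\hat{z}^{l+1} &= \text{SW-MSA}\big(\text{LN}(z^{l})\big) + z^{l}, \\
z^{l+1}       &= \text{MLP}\big(\text{LN}(\hat{z}^{l+1})\big) + \hat{z}^{l+1}.
\end{aligned}
```

Each added term on the right-hand side is one of the skip connections mentioned above.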

Fig. 1 Overview of the Swin UNETR architecture. The input to our model is a single early-phase eFBP/eFMM image (separate models were trained for eFBP and eFMM) and the output is the synthetic [18F]FDG image. The Swin UNETR creates non-overlapping patches of the input data and uses a patch partition layer to create windows of a desired size for computing the self-attention. The encoded feature representations in the Swin transformer are fed to a CNN-decoder via skip connections at multiple resolutions

The model was trained with the mean squared error (MSE) loss using five-fold cross-validation (60%, 20%, and 20% of the data for training, validation, and testing, respectively). The images were kept in SUVR units without any further normalization. Training ran for 300 epochs, by which point the loss curve had reached a plateau.
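One way to obtain a 60/20/20 allocation inside five-fold cross-validation is to rotate the folds so that one serves for testing, one for validation, and the remaining three for training. The sketch below assumes this rotation scheme, which the text does not spell out; the function names and the seed are illustrative.

```python
import random
from statistics import fmean

def five_fold_splits(subject_ids, n_folds=5, seed=0):
    """Return one {'train', 'val', 'test'} split per fold (rotation scheme)."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        val_k = (k + 1) % n_folds  # next fold is used for validation (assumption)
        train = [s for j, fold in enumerate(folds)
                 if j not in (k, val_k) for s in fold]
        splits.append({"train": train, "val": folds[val_k], "test": folds[k]})
    return splits

def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return fmean([(p - t) ** 2 for p, t in zip(pred, target)])

# 94 subjects, matching the number of [18F]FBP cases reported in the text.
splits = five_fold_splits(range(94))
```

With this scheme every subject appears in exactly one test fold, so test predictions are available for the whole cohort.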

Assessment of image quality

The predicted images were first inspected visually for artifacts and abnormalities; where any were found, their likely causes were investigated and documented. The performance of the DL model was then evaluated with the Structural Similarity Index (SSIM), Root Mean Squared Error (RMSE), and Peak Signal-to-Noise Ratio (PSNR). These metrics were computed for the early-phase amyloid image and for the synthetic [18F]FDG image, in both cases taking the actual [18F]FDG scan as the ground truth. The metrics for each section were then averaged to derive a comprehensive assessment. A paired t-test was computed separately for each group at a predetermined significance level of 0.05. The distributions of SUVR for all regions and patients were visualized with Bland–Altman plots for eFBP/eFMM versus [18F]FDG and for synthetic [18F]FDG versus [18F]FDG.
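Two of the metrics above, together with the quantities drawn in a Bland–Altman plot, can be sketched in plain Python. This is an illustration under stated assumptions rather than the evaluation code used in the study: the PSNR data range defaults to the range of the reference image, and the limits of agreement use the conventional bias ± 1.96 SD. SSIM is omitted because it is usually taken from a library such as scikit-image (`skimage.metrics.structural_similarity`).

```python
import math
from statistics import fmean, stdev

def rmse(pred, ref):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(fmean([(p - r) ** 2 for p, r in zip(pred, ref)]))

def psnr(pred, ref, data_range=None):
    """PSNR in dB; data_range defaults to max(ref) - min(ref) (assumption)."""
    if data_range is None:
        data_range = max(ref) - min(ref)
    err = rmse(pred, ref)
    if err == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(data_range / err)

def bland_altman(a, b):
    """Return (bias, lower_limit, upper_limit) of the differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = fmean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Toy SUVR values: reference [18F]FDG and a synthetic prediction.
ref = [1.0, 2.0, 3.0, 4.0]
pred = [1.0, 2.0, 3.0, 5.0]
error = rmse(pred, ref)
quality_db = psnr(pred, ref)
bias, lower, upper = bland_altman(pred, ref)
```

In a Bland–Altman plot, each point is the difference between two measurements plotted against their mean, with horizontal lines at the bias and at the two limits.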

Clinical evaluation

To evaluate the performance of our model in a clinical setting, two experienced nuclear medicine physicians (V.G with 18 years and G.M with 5 years’ experience in nuclear medicine and reading of brain PET scans) compared the physiological aspects and biodistribution patterns of the synthetic [18F]FDG and eFBP/eFMM images with those of the actual [18F]FDG images, which served as the standard of reference. We hypothesized that if the synthetic [18F]FDG images are more similar to the actual [18F]FDG images than the eFBP/eFMM images are, our model can improve the accuracy of clinical diagnosis. To test this hypothesis, 30 subjects were selected randomly, and the synthetic [18F]FDG and eFBP/eFMM images were anonymized while the [18F]FDG image remained identified. We asked the physicians to view the images side by side (actual [18F]FDG beside image unknown-1 and image unknown-2) and to assign each unknown image a clinical similarity score (CSS) from 1 to 3 relative to the actual [18F]FDG image. The scores were defined as follows:

1. No clinical similarity: The unknown image compared to actual [18F]FDG does not represent similar clinical information.

2. Slightly similar: The unknown image compared to actual [18F]FDG leads to a partially similar diagnosis; some important information was missed.

3. Similar: The unknown image compared to actual [18F]FDG leads to a similar diagnosis; the necessary information was preserved.

An Intraclass Correlation Coefficient (ICC) was calculated between the two physicians to measure the agreement and consistency between assigned ranks.
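The text does not say which form of the ICC was used. As an illustration only, the sketch below computes ICC(2,1) (two-way random effects, absolute agreement, single rater) from the standard two-way ANOVA mean squares; the choice of this form and the toy scores are assumptions.

```python
from statistics import fmean

def icc_2_1(ratings):
    """ICC(2,1) for a table of ratings[subject][rater]."""
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a full table with >= 2 subjects and >= 2 raters")

    grand = fmean([x for row in ratings for x in row])
    row_means = [fmean(row) for row in ratings]
    col_means = [fmean([row[j] for row in ratings]) for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_err) / denom

# Toy CSS scores (1 to 3) from two readers for four subjects.
scores = [[1, 2], [2, 3], [3, 3], [1, 1]]
icc = icc_2_1(scores)
```

A consistency-type ICC would drop the rater term from the denominator, so the two forms differ whenever one reader scores systematically higher than the other.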

Single-subject voxel-wise analyses

According to a validated SPM-based single-subject procedure [16], each PET and synthetic PET image was tested for relative hypometabolism/hypoperfusion by means of a 2-sample t-test in comparison with [18F]FDG PET images of 112 HC subjects. The statistical threshold for the resulting hypometabolic and hypoperfusion SPM maps was set at a P-value of 0.05, uncorrected for multiple comparisons, considering significant clusters containing more than 100 voxels. SPM maps were then binarized for further Dice coefficient analyses.

Statistical analysis

Dice coefficients were calculated using FSL software [25] to quantify the whole-brain spatial overlap between hypometabolic ([18F]FDG PET) and hypoperfused (eFBP/eFMM) binary maps at the single-subject level, as well as between [18F]FDG PET and synthetic [18F]FDG PET hypometabolic binary maps. The Dice coefficient for binary maps A and B is defined as Dice = 2 |A ∩ B| / (|A| + |B|), where |·| denotes the number of suprathreshold voxels. It equals 1 when the two maps contain exactly the same voxels (high concordance) and 0 when they share none (null concordance). It is interpreted as follows: < 0.2, poor; 0.2–0.4, fair; 0.4–0.6, moderate; 0.6–0.8, good; and > 0.8, excellent agreement.
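The overlap measure and its interpretation bands can be sketched as follows. The study used FSL for this step; the plain-Python version below is only an illustration on flattened binary maps, and the handling of values that fall exactly on a band boundary is an assumption.

```python
def dice(a, b):
    """Dice overlap of two equal-length binary maps (0/1 or bool values)."""
    if len(a) != len(b):
        raise ValueError("maps must have the same number of voxels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    if total == 0:
        return float("nan")  # both maps empty: overlap is undefined
    return 2.0 * intersection / total

def agreement_label(d):
    """Map a Dice value to the bands listed in the text."""
    if d < 0.2:
        return "poor"
    if d < 0.4:
        return "fair"
    if d < 0.6:
        return "moderate"
    if d <= 0.8:  # boundary handling is an assumption
        return "good"
    return "excellent"

# Toy maps: one shared voxel out of two suprathreshold voxels in each map.
fdg_map = [1, 1, 0, 0]
early_map = [1, 0, 1, 0]
overlap = dice(fdg_map, early_map)
label = agreement_label(overlap)
```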

General linear models were fitted to assess the correlation between eFBP/eFMM SUVR in the AAL ROIs and the corresponding [18F]FDG SUVR, as well as between [18F]FDG SUVR and the corresponding synthetic [18F]FDG SUVR, in the whole sample. To test for differences between paired measurements, namely eFBP/eFMM vs. reference [18F]FDG and synthetic [18F]FDG vs. reference [18F]FDG, we performed a paired-samples t-test. A P-value less than 0.05 was used as the threshold for statistical significance.
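The paired-samples test statistic can be sketched in a few lines. This illustration returns only the t value and the degrees of freedom; obtaining the P-value requires the t distribution, for which a library routine such as `scipy.stats.ttest_rel` would normally be used. The toy SUVR values are made up.

```python
import math
from statistics import fmean, stdev

def paired_t(x, y):
    """Return (t, degrees_of_freedom) for paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("t is undefined when all differences are equal")
    t = fmean(diffs) / (sd / math.sqrt(n))
    return t, n - 1

# Toy regional SUVR values: synthetic vs. reference [18F]FDG.
synthetic = [1.0, 2.0, 3.0, 5.0]
reference = [1.0, 2.0, 3.0, 4.0]
t_value, dof = paired_t(synthetic, reference)
```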
