Combatting the effect of image reconstruction settings on lymphoma [18F]FDG PET metabolic tumor volume assessment using various segmentation methods

Study population

In this study, we used baseline [18F]FDG PET/CT scans from 19 patients from two different datasets. The first dataset consists of 14 patients scanned at Amsterdam UMC, whose scans were retrospectively obtained from ongoing studies with a waiver for informed consent from the Medical Ethics Review Committee of Amsterdam UMC, location VUmc (IRB2018.029). From these 14 patients, 9 patients were diagnosed with DLBCL, 3 were diagnosed with Hodgkin lymphoma, 2 were diagnosed with T cell lymphoma and 1 was diagnosed with post-transplant lymphoproliferative disorder (PTLD). The second dataset consists of 5 DLBCL patients who were recruited at the outpatient clinics of the department of Hematology of the Amsterdam UMC, location VUmc, and the outpatient clinics of the department of Hematology of the Amstelland Hospital in Amstelveen (IRB2019.278). These trials enrolled patients aged 18 years or older diagnosed with DLBCL with at least one tumor with a diameter of 3 cm or more. Patients who had undergone chemotherapy in the past 4 weeks, had multiple malignancies or metal implants, or were pregnant or lactating were excluded from the study.

Quality control of scans

The quality control (QC) check of the scans followed the EANM guidelines: the liver SUVmean should be between 1.3 and 3.0, and the plasma glucose should be lower than 11 mmol/L [12]. Furthermore, as in [2], scans were excluded during QC if they were incomplete, if the total image activity (MBq) was not between 50 and 80% of the total injected FDG activity, or if any DICOM data were missing.
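The QC criteria above can be written as a single pass/fail check. The following is a minimal sketch, not the procedure used in the study: the `ScanQC` container and its field names are hypothetical, and treating the range limits as inclusive is an assumption, since the text does not say how boundary values are handled.

```python
from dataclasses import dataclass


@dataclass
class ScanQC:
    """Hypothetical container for the QC inputs of one scan."""
    liver_suv_mean: float
    plasma_glucose_mmol_l: float
    image_activity_mbq: float
    injected_activity_mbq: float
    is_complete: bool = True
    has_all_dicom_data: bool = True


def passes_qc(scan: ScanQC) -> bool:
    """Return True when the scan meets every QC criterion listed above."""
    # Fraction of the injected FDG activity that is recovered in the image.
    activity_fraction = scan.image_activity_mbq / scan.injected_activity_mbq
    return (
        1.3 <= scan.liver_suv_mean <= 3.0          # limits assumed inclusive
        and scan.plasma_glucose_mmol_l < 11.0
        and 0.50 <= activity_fraction <= 0.80      # limits assumed inclusive
        and scan.is_complete
        and scan.has_all_dicom_data
    )
```

A scan failing any single criterion is rejected, which matches the "and/or" wording of the exclusion rule.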

Image processing

To analyze the impact of reconstruction methods, we used [18F]FDG PET baseline scans derived from three different reconstruction methods: one reconstruction which followed locally clinically preferred protocols (high resolution or HR reconstruction), another reconstruction following EARL1 standards (EARL1 reconstruction) and a third reconstruction following EARL2 standards (EARL2 reconstruction). EARL2 standards were established with the implementation of PSF into the original EARL image reconstruction capabilities [13]. PSF is a resolution modeling algorithm which improves image resolution and contrast [14]. In comparison with the EARL standards, the most substantial differences in the HR reconstruction are a pixel spacing of 2 mm instead of 4 mm and a higher spatial resolution. Table 1 contains a summary of the parameters related to the reconstruction methods used in this study.

Table 1 Summary of parameters characterizing each reconstruction method

The MTV of lesions was calculated and analyzed using ACCURATE software [15]. ACCURATE enables the calculation of MTV of lesions on PET scans automatically and allows the users to apply multiple segmentation methods or volumes of interest (VOI) [15]. Nineteen lymphoma patients were included in the analysis. For each PET baseline study, 3 different reconstructions were investigated (EARL1, EARL2 and HR). We delineated on average 3 lesions per PET scan, which resulted in a total of 56 lesions across all of the included patients. Nine different semiautomatic segmentation methods were applied to delineate each of these lesions. Since each PET scan was available in 3 reconstructions, a total of 1512 delineations and MTV measurements (56 lesions × 9 segmentation methods × 3 reconstructions) were included in the analysis.

For each reconstructed scan, the following segmentation methods were applied: segmentation based on fixed thresholds using a standardized uptake value of 4.0 (SUV4.0), an SUV of 2.5 (SUV2.5) and 41% of SUVmax (41M); segmentation based on adaptive thresholding using 50% of the peak voxel value adapted for local background (A50P); majority vote approaches that segment the voxels detected by at least 2 (MV2) or 3 (MV3) out of these 4 methods [16]; lesional-based methods that identify the optimal method based on SUVmax (L2A, L2B) [17]; and a contrast-oriented method, Nestle segmentation [18]. For the L2A method, a SUV4.0 contour is used for SUVmax > 10 and MV3 for SUVmax < 10. For L2B, MV2 is used instead of SUV4.0 when SUVmax > 10. The majority vote approaches are based on the agreement between SUV4.0, SUV2.5, A50P and 41M. A detailed description of the methods can be found in [19].
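The threshold, majority vote and lesional-based rules described above can be illustrated on an array of SUV values for one lesion VOI. This is a simplified sketch under stated assumptions and not the ACCURATE implementation: all function names are hypothetical, A50P is not implemented (its mask is passed in, because the local background correction is not specified here), inclusive thresholds are an assumption, and assigning a SUVmax of exactly 10 to the MV3 branch is also an assumption because the text only defines the cases above and below 10.

```python
import numpy as np


def fixed_threshold(suv, threshold):
    """Voxels at or above a fixed SUV threshold (e.g. 4.0 or 2.5)."""
    return suv >= threshold


def percent_of_max(suv, fraction=0.41):
    """Voxels at or above a fraction of the VOI's SUVmax (41M by default)."""
    return suv >= fraction * suv.max()


def majority_vote(masks, min_votes):
    """Voxels selected by at least `min_votes` of the supplied masks."""
    votes = np.sum(np.stack(masks), axis=0)
    return votes >= min_votes


def lesional_select(suv_max, high_uptake_mask, mv3_mask, cutoff=10.0):
    """L2A/L2B-style choice: one mask above the SUVmax cutoff, MV3 otherwise.

    Pass the SUV4.0 mask as `high_uptake_mask` for L2A, or the MV2 mask for L2B.
    """
    return high_uptake_mask if suv_max > cutoff else mv3_mask


def mtv_ml(mask, voxel_volume_ml):
    """Metabolic tumor volume: number of selected voxels times voxel volume."""
    return float(mask.sum() * voxel_volume_ml)
```

For example, with `masks = [suv4_mask, suv25_mask, m41_mask, a50p_mask]`, MV2 is `majority_vote(masks, 2)` and MV3 is `majority_vote(masks, 3)`.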

ComBat harmonization

ComBat harmonization was applied to align the MTV measurements from the three different reconstructions used in this study. As mentioned above, ComBat was first described in the field of genomics to remove batch effects [10, 20]. The ComBat method assumes that the deviation introduced by the batch effect is removed once the means and the variances are standardized across the different batches. The value of the feature Y for a specific VOI j and scanner i is expressed as follows:

$$Y_{ij} = \alpha + \gamma_{i} + \delta_{i} \varepsilon_{ij} ,$$

(1)

where \(\alpha\) represents the mean value of the feature Y, \(\gamma\) represents the additive effect of the scanner, \(\delta\) is the multiplicative effect of the scanner, and \(\varepsilon\) is the error. In this case, the feature Y would be the MTV and the VOI j the delineated lesion. This harmonization method uses the empirical Bayes framework to estimate the batch/scanner effect terms, \(\gamma_{i}\) and \(\delta_{i}\). Subsequently, the corrected Y value \(Y_{ij}^{\text{ComBat}}\) is calculated in Eq. (2) where \(\hat{\alpha}\), \(\widehat{\gamma_{i}}\) and \(\widehat{\delta_{i}}\) are estimations of parameters \(\alpha\), \(\gamma_{i}\) and \(\delta_{i}\), respectively.

$$Y_{ij}^{\text{ComBat}} = \frac{Y_{ij} - \hat{\alpha} - \widehat{\gamma_{i}}}{\widehat{\delta_{i}}} + \hat{\alpha}$$

(2)
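The location and scale correction of Eqs. (1) and (2) can be illustrated for a single feature such as MTV. The sketch below is deliberately simplified and is not the ComBat code of Fortin et al.: it uses plain per-batch sample estimates instead of empirical Bayes shrinkage, ignores covariates, and the function name is hypothetical. Here the batch is the reconstruction method.

```python
import numpy as np


def location_scale_harmonize(values, batches):
    """Simplified, non-Bayesian version of the correction in Eq. (2).

    values  : feature values Y_ij for all lesions and batches
    batches : batch label (e.g. reconstruction name) for each value
    """
    values = np.asarray(values, dtype=float)
    batches = np.asarray(batches)

    alpha = values.mean()  # estimate of the overall mean

    # Remove the additive batch effect: the residual is Y_ij - alpha - gamma_i,
    # which equals the value minus its own batch mean.
    residuals = np.empty_like(values)
    for b in np.unique(batches):
        sel = batches == b
        residuals[sel] = values[sel] - values[sel].mean()

    # Pooled spread of the residuals, used as the common target scale.
    pooled_sd = np.sqrt(np.mean(residuals ** 2))

    corrected = np.empty_like(values)
    for b in np.unique(batches):
        sel = batches == b
        # Multiplicative batch effect, relative to the pooled spread.
        delta = residuals[sel].std() / pooled_sd
        corrected[sel] = residuals[sel] / delta + alpha
    return corrected
```

After this correction every batch has the same mean and the same standard deviation. The sketch assumes each batch has non-zero spread; a constant batch would give a zero `delta` and a division by zero.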

To understand how ComBat affects our MTV values, we implemented multiple versions and compared them to the original data. Initially, we applied the regular implementation of ComBat, which derives the transformation by aligning the mean and standard deviation of the data groups pertaining to different reconstructions (‘Regular ComBat’). This implementation of ComBat assumes a normal distribution of the data. Since medical data are rarely normally distributed, we also implemented the version of ComBat which applies a logarithmic transformation to attain normal distributions (‘Log-transformed ComBat’). When this transformation is applied, the returned values have already been transformed back exponentially so that they are comparable with the other results. Details of these two ComBat versions can be found in Table 2. Another approach to address the non-normal data distribution is to standardize the median and interquartile range instead of the mean and the standard deviation. Furthermore, we investigated whether excluding outliers affects the harmonization of the data. ComBat was applied using R version 4.0.5 based on the code provided by Fortin et al. [21].
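The difference between these variants can be sketched with one function that aligns each batch to a common center and spread, optionally on the log scale and optionally using the median and interquartile range. This is an illustration only, written in Python rather than the R code used in the study: the function names are hypothetical, the common target is taken from all data pooled together (an assumption), and no empirical Bayes estimation is performed.

```python
import numpy as np


def _center_and_scale(x, robust):
    """Return (median, IQR) when robust, otherwise (mean, standard deviation)."""
    if robust:
        q1, median, q3 = np.percentile(x, [25, 50, 75])
        return median, q3 - q1
    return x.mean(), x.std()


def align_batches(values, batches, log_transform=False, robust=False):
    """Align every batch to the center and spread of the pooled data.

    log_transform : work on log values, then exponentiate the result so the
                    output is on the original scale (needs values > 0)
    robust        : standardize median and IQR instead of mean and SD
    """
    y = np.asarray(values, dtype=float)
    batches = np.asarray(batches)
    if log_transform:
        y = np.log(y)

    target_center, target_scale = _center_and_scale(y, robust)

    out = np.empty_like(y)
    for b in np.unique(batches):
        sel = batches == b
        center, scale = _center_and_scale(y[sel], robust)
        out[sel] = (y[sel] - center) / scale * target_scale + target_center

    # Back-transform so log-scale results are comparable with the others.
    return np.exp(out) if log_transform else out
```

With `log_transform=True` the output is always positive, which a correction on the raw scale does not guarantee for small volumes. A batch with zero spread (or zero IQR in the robust case) would cause a division by zero and is not handled here.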

Table 2 Description of characteristics of ComBat implementations

Statistical analysis

We first compared the MTV values across the 9 different segmentation methods. For each one of the lesions, we compared the MTVs obtained from EARL2 or HR reconstructions to those from EARL1 using MTV volume ratios. Since EARL1 is used as the reference reconstruction method, in these ratios, EARL1 results are given in the denominator as shown in the following equations:

$$\text{MTV ratio EARL2} = \frac{\text{MTV}_{\text{EARL2}}}{\text{MTV}_{\text{EARL1}}}$$

(3)

$$\text{MTV ratio HR} = \frac{\text{MTV}_{\text{HR}}}{\text{MTV}_{\text{EARL1}}}$$

(4)

Equations (3) and (4) were calculated for all 9 segmentation methods, which resulted in one MTV ratio per lesion and segmentation method for each of the EARL2 and HR reconstructions. MTV ratios were used to compare the effect of the different reconstructions across the segmentation methods before and after applying ComBat.
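As a small illustration of Eqs. (3) and (4), the ratios for one lesion and one segmentation method can be computed as follows. The function name and the dictionary layout are hypothetical, and the volumes in the usage example are made-up numbers, not study data.

```python
def mtv_ratios(mtv_by_reconstruction, reference="EARL1"):
    """Ratio of each reconstruction's MTV to the reference reconstruction's MTV.

    mtv_by_reconstruction : mapping from reconstruction name to MTV (mL)
                            for one lesion and one segmentation method
    """
    reference_mtv = mtv_by_reconstruction[reference]
    return {
        name: mtv / reference_mtv
        for name, mtv in mtv_by_reconstruction.items()
        if name != reference
    }


# Illustrative values only.
example = mtv_ratios({"EARL1": 10.0, "EARL2": 9.0, "HR": 8.0})
```

A ratio of 1 means the reconstruction yields the same volume as EARL1; values below or above 1 indicate a smaller or larger MTV, respectively.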
