Imaging quality of an artificial intelligence denoising algorithm: validation in 68Ga PSMA-11 PET for patients with biochemical recurrence of prostate cancer

Patients

We retrospectively included 30 consecutive patients aged over 18 years who presented with biochemical recurrence of prostate cancer at the Jean Perrin Cancer Center (Clermont-Ferrand, France) for 68Ga-PSMA PET from July 17 to October 20, 2020. As per French regulations, every patient had already undergone 18F-choline PET/CT with a negative or inconclusive result [15]. The data collected were age, weight, Gleason score, ISUP score, prostate cancer treatment, initial prostate-specific antigen (PSA) level, and last known PSA level before the PET scan.

Every patient received an information letter validated by the Jean Perrin Cancer Center data protection department and was free to refuse use of their data throughout the study.

Image acquisition

The tracer used was 68Ga PSMA-11, also called HBED-CC, Glu-urea-Lys(Ahx)-HBED-CC, or PSMA-HBED-CC [8]. Image acquisition usually began 60 min after IV injection of 68Ga-PSMA-11 with an average activity of 2.0 MBq/kg [16] (118–453 MBq), at 4 min per scan step from the vertex to the upper third of the femur. Excluding one patient, the activity range was narrower (118–224 MBq). A low-dose CT scan was performed for attenuation correction and localization.

Fifteen scans were performed on a PMT-based Discovery 710 Optima 660® scanner, and 15 were performed on a SiPM-based Discovery MIDR® scanner (GE Healthcare, Milwaukee, WI). No cross-validation was performed across the two scanners. Both machines have a 700-mm field of view, a 256 × 256 matrix, and 2.7 × 2.7 × 3.27 mm³ voxels. The CT parameters of the SiPM-based PET were a mean tube voltage of 124 kV (range 120–140 kV), a mean tube current of 76.2 mA (range 58–159 mA), a pitch of 0.98, a noise index of 28.2, and 40% iterative reconstruction. The CT parameters of the PMT-based PET were a mean tube voltage of 128 kV (range 120–140 kV), a mean tube current of 76.2 mA (range 55–123 mA), a pitch of 0.98, a noise index of 27.3, and 40% iterative reconstruction. For each scan, two standard images were reconstructed, the first with VPFX (ordered subset expectation maximization (OSEM) + time of flight (TOF)) and the second with Q.Clear (OSEM + TOF + point spread function (PSF)) [17]. We named these standard reconstructions VP4 and QC4, respectively. The VPFX series were reconstructed with two iterations and 24 subsets.

The algorithm uses a residual learning approach optimized for quantitative accuracy (L1 norm) as well as structural similarity. It learns to separate and suppress the noise components while preserving and enhancing the structural components. The networks were trained with paired low- and high-count PET series coming from a wide range of clinical indications and patient BMIs and from a large variety of PET/CT and PET/MR devices (10 General Electric, 5 Siemens, and 2 Philips models). The training data included millions of paired image patches derived from hundreds of patient scans with multi-slice PET data and data augmentation. List-mode acquisition allows retrospective reconstruction of images with artificially reduced count statistics by keeping only the data acquired in a given time interval. For each patient, we used version 1.3.0 of the SubtlePET® post-processing denoising algorithm to create, for each reconstruction type, 4 image series equivalent to one-minute, two-minute, three-minute, and four-minute steps. We named these series SVP1, SVP2, SVP3, and SVP4 for the VPFX-derived series and SQC1, SQC2, SQC3, and SQC4 for the Q.Clear-derived series, respectively. These series were anonymized by a medical physicist so that the image-interpreting team remained blinded.
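The time-interval selection from list-mode data described above can be illustrated with a toy sketch. It assumes that events are available as a plain list of timestamps in seconds, measured from the start of a bed position; the function name `truncate_list_mode` and the synthetic event stream are hypothetical. Real list-mode files use vendor-specific formats, and this is not the reconstruction tool used in the study.

```python
def truncate_list_mode(event_times_s, keep_seconds):
    """Keep only events recorded in the first `keep_seconds` of a bed position.

    event_times_s: timestamps in seconds from the start of the bed position
                   (hypothetical representation of list-mode data).
    """
    if keep_seconds <= 0:
        raise ValueError("keep_seconds must be positive")
    return [t for t in event_times_s if 0 <= t < keep_seconds]


# Synthetic stream: one event every 0.5 s over a 240 s (4 min) scan step.
events = [i * 0.5 for i in range(480)]

# Emulate a one-minute step by keeping the first 60 s of data.
one_minute = truncate_list_mode(events, 60)
fraction_kept = len(one_minute) / len(events)
print(fraction_kept)
```

With a uniform event rate, the fraction of counts kept is simply the ratio of the retained time to the full step duration.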

Image analysis

Three nuclear medicine physicians blindly interpreted every series using the PET VCAR module bundled with General Electric's ADW® software. These 3 readers had different degrees of experience: one was a PSMA reference reader (16 years of experience), one was a senior physician (4 years of experience), and one was a resident nuclear medicine physician (1 year of experience). All three readers had experience reading PSMA PET scans in their daily practice. The VP4 and QC4 series analyzed by the most experienced reader served as the gold standard benchmark for assessing the other series.

Data for lesions suspected of malignancy were collected by classifying the lesions according to anatomical location, i.e., prostate bed, pelvic lymph nodes, secondary bone lesions, or secondary extra-osseous lesions. In the event of lymph node or bone involvement, the precise locations were recorded to facilitate comparison between series. Lesions were classified into 5 levels [18]: negative, equivocal negative, equivocal, equivocal positive, or positive. Lesions labeled as equivocal positive or positive were considered significant.

Image quality

We evaluated the quality of each series based on 3 criteria: overall image quality, interpretability, and whether suspected lesions were visualized. For overall image quality, we used a 5-point Likert scale [5, 19] covering visually estimated noise level, contrast, and signal-to-noise ratio (1 = uninterpretable, 2 = bad, 3 = correct, 4 = good, 5 = excellent). We considered levels 3, 4, and 5 as usable in daily practice and classified the corresponding series as interpretable.

We also evaluated image quality using two binary indexes: series interpretability and lesion detectability.
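The mapping from the 5-point Likert score to the binary interpretability index can be written as a short sketch. The function name `is_interpretable` is hypothetical; the labels and the threshold of 3 are taken from the scale described above.

```python
# Likert labels as listed in the text.
LIKERT_LABELS = {
    1: "uninterpretable",
    2: "bad",
    3: "correct",
    4: "good",
    5: "excellent",
}


def is_interpretable(score):
    """Return True when a Likert score is considered usable in daily practice (3, 4, or 5)."""
    if score not in LIKERT_LABELS:
        raise ValueError("Likert score must be an integer from 1 to 5")
    return score >= 3


# Illustrative scores for five series (not study data).
scores = [5, 3, 2, 4, 1]
flags = [is_interpretable(s) for s in scores]
print(flags)
```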

Quantitative analysis

Quantitative analyses focused on SUVmean and SUVmax measurements, where SUV is ‘standardized uptake value.’ SUVmean measures the mean activity in a volume, whereas SUVmax retains only the value of the hottest voxel. To evaluate the influence of the algorithm on SUV measurements, we recorded the SUVmax for every lesion found on each examination and for every background measurement performed.

SUVmax was measured by drawing regions of interest on each lesion using a semi-automatic method. For homogenization purposes, the analyses for each specific anatomical location only considered the most intense lesions of each patient. SubtlePET®-induced bias in lesion SUVmax measurements was evaluated using the following formula: (study series SUVmax − reference series SUVmax) / reference series SUVmax [1].
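As a worked illustration of the bias formula above, the following sketch computes the relative SUVmax bias of a study series against its reference series. The function name `suvmax_relative_bias` and the numerical values are hypothetical and are not taken from the study data.

```python
def suvmax_relative_bias(study_suvmax, reference_suvmax):
    """Relative bias: (study SUVmax - reference SUVmax) / reference SUVmax."""
    if reference_suvmax <= 0:
        raise ValueError("reference SUVmax must be positive")
    return (study_suvmax - reference_suvmax) / reference_suvmax


# Illustrative values: a lesion at SUVmax 10.0 on the reference series
# and 9.0 on a denoised short-duration series.
bias = suvmax_relative_bias(9.0, 10.0)
print(f"{bias:.1%}")
```

A negative value means the study series underestimates SUVmax relative to the reference series, and a positive value means it overestimates it.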

Background measurements were performed by the junior reader using a 2-cm-diameter sphere in the right liver and right gluteal muscle and a 1-cm-diameter circle in the aortic arch [20, 21]. We chose to take the gluteal region as background reference as most lesions in this specific application of prostate pathology are located in the pelvic area. The SUVmean and SUVmax of the background uptake in these regions of interest were measured. Signal-to-background ratio (SBR) was defined by the following formula: lesion SUVmax/background uptake SUVmean [22]. We also performed subgroup analysis based on PSA level, weight, and camera type.
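The SBR definition above can be sketched for the three background regions at once. The function name `signal_to_background_ratios`, the region keys, and the SUV values are hypothetical and serve only to show the arithmetic.

```python
def signal_to_background_ratios(lesion_suvmax, background_suvmeans):
    """Compute SBR = lesion SUVmax / background SUVmean for each background region."""
    ratios = {}
    for region, suvmean in background_suvmeans.items():
        if suvmean <= 0:
            raise ValueError(f"background SUVmean must be positive for {region}")
        ratios[region] = lesion_suvmax / suvmean
    return ratios


# Illustrative background SUVmean values (not study data).
background = {"liver": 4.0, "gluteal_muscle": 0.5, "aortic_arch": 1.25}
sbr = signal_to_background_ratios(10.0, background)
print(sbr)
```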

Diagnostic performance of PET series using the SubtlePET® denoising algorithm

To evaluate the diagnostic performance of the SubtlePET® series, we performed a series-by-series analysis per reader. Our gold standard benchmarks were our most experienced reader and the routine-process reconstruction series, i.e., not reprocessed by the denoising algorithm.

True positives were defined as lesions classified as positive by the most experienced reader in the usual reconstruction series (QC4 and VP4) as well as by all readers in all other series. True negatives were defined as lesions classified as negative by the most experienced reader in the usual series and negative by all readers in all other series. False positives were defined as lesions classified as negative or not found by the most experienced reader in the usual series but classified as positive in the other series. False negatives were defined as lesions classified as positive by the most experienced reader in the usual reconstruction series but classified as negative in the other series.

We were thus able to calculate the sensitivity, specificity, and accuracy of each of the series processed by SubtlePET® for each reader.
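The calculation can be sketched as follows. This is a simplified illustration that compares one reader's calls on one series against the reference calls, lesion by lesion; the function names and the example calls are hypothetical, and the sketch does not reproduce the across-readers conditions in the definitions above.

```python
def confusion_counts(reference_calls, reader_calls):
    """Count TP, TN, FP, FN for paired boolean lesion calls (True = positive)."""
    if len(reference_calls) != len(reader_calls):
        raise ValueError("call lists must have the same length")
    tp = tn = fp = fn = 0
    for ref, call in zip(reference_calls, reader_calls):
        if ref and call:
            tp += 1
        elif not ref and not call:
            tn += 1
        elif not ref and call:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def diagnostic_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy; NaN when a denominator is zero."""
    nan = float("nan")
    total = tp + tn + fp + fn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / total if total else nan,
    }


# Illustrative calls for six candidate lesions (not study data).
reference = [True, True, True, False, False, True]
reader = [True, True, False, False, True, True]
counts = confusion_counts(reference, reader)
metrics = diagnostic_metrics(*counts)
print(counts, metrics)
```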

Statistical analysis

To lend clarity to the statistical analysis, we pooled lesions classified as equivocal positive into the positive group and lesions classified as equivocal and equivocal negative into the negative group.
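The pooling rule above amounts to a fixed lookup from the 5-level classification to a binary call. A minimal sketch, with a hypothetical function name `pool_classification`:

```python
# Pooling of the 5-level classification into two groups, as described in the text.
POOLED_POSITIVE = {
    "negative": False,
    "equivocal negative": False,
    "equivocal": False,
    "equivocal positive": True,
    "positive": True,
}


def pool_classification(label):
    """Map a 5-level lesion label to True (positive group) or False (negative group)."""
    key = label.strip().lower()
    if key not in POOLED_POSITIVE:
        raise ValueError(f"unknown classification: {label!r}")
    return POOLED_POSITIVE[key]


labels = ["positive", "equivocal", "equivocal positive", "negative"]
pooled = [pool_classification(x) for x in labels]
print(pooled)
```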

Subgroup analyses were stratified by patient weight, PSA values, and the two different cameras.

Analyses involving image quality (categorical variables) were compared using Cochran’s Q test for differences between all readers and McNemar’s test for pairwise comparison.
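For readers unfamiliar with these tests, the following sketch computes the Cochran's Q statistic and an exact two-sided McNemar p value using only the Python standard library. The study itself used R, and the text does not state whether the exact or the chi-square form of McNemar's test was applied, so the exact form here is an assumption. The function names and example data are hypothetical.

```python
from math import comb


def cochran_q(table):
    """Cochran's Q statistic for binary outcomes.

    table: list of rows, one per rated item (e.g. series), each row holding
    one 0/1 outcome per reader. Under the null hypothesis, Q approximately
    follows a chi-square distribution with (number of readers - 1) degrees
    of freedom.
    """
    k = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        return float("nan")  # every row is all 0s or all 1s: Q is undefined
    return (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom


def mcnemar_exact(b, c):
    """Exact two-sided McNemar p value from the two discordant pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Illustrative interpretability calls (1 = interpretable) by 3 readers on 5 series.
ratings = [[1, 1, 0], [1, 1, 1], [1, 0, 0], [1, 1, 0], [0, 0, 0]]
q = cochran_q(ratings)

# Illustrative discordant pair counts between two readers.
p = mcnemar_exact(1, 6)
print(q, p)
```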

Analyses involving quantitative parameters (continuous variables) were compared by the Student’s paired-samples t test and the Wilcoxon signed rank test.
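As an illustration of the paired comparison, the sketch below computes the paired-samples t statistic from two matched lists of measurements using the standard library only. It stops at the statistic and its degrees of freedom; the p value, and the Wilcoxon signed rank test, would normally come from a statistics package (the study used R). The function name and the SUV values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(reference, study):
    """Return (t, degrees of freedom) for paired samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the pairwise differences.
    """
    if len(reference) != len(study):
        raise ValueError("paired samples must have the same length")
    diffs = [s - r for r, s in zip(reference, study)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical: t is undefined")
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Illustrative SUVmax values for four lesions on two series (not study data).
reference_series = [1.0, 2.0, 3.0, 4.0]
study_series = [1.5, 2.5, 3.0, 5.0]
t_stat, dof = paired_t_statistic(reference_series, study_series)
print(t_stat, dof)
```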

A p value adjustment was performed to account for the multiplicity of tests. Statistical analyses were performed with R software version 4.1.0 (R-Project, GNU GPL, http://cran.r-project.org/).
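The text does not say which adjustment method was used. Purely as an illustration of how a multiplicity adjustment works, the sketch below implements the Holm step-down procedure; choosing Holm is an assumption made for this example, and the function name and p values are hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p values, returned in the original order.

    NOTE: Holm is an assumed choice for illustration; the source does not
    name the adjustment method.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply the (rank+1)-th smallest p value by (m - rank), cap at 1,
        # and enforce monotonicity with a running maximum.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
print(adj)
```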
