Dual time point imaging in locally advanced head and neck cancer to assess residual nodal disease after chemoradiotherapy

In ECLYPS, 123 patients received DTPI, of whom 12 were excluded due to protocol violations: SUV values in the liver exceeding normal limits (n = 4), exceeding the time limit between scans (n = 5), motion artifacts (n = 2), or a difference in time/bed position between scans (n = 1). Additionally, nine patients were excluded because nodal status could not be assessed: five patients had recurrence at the primary tumor site or a distant relapse without confirmation of neck status at 12 weeks or beyond, and four patients were lost to follow-up or withdrew their informed consent. This left 102 patients evaluable for this analysis. Patient and tumor characteristics are summarized in Table 1. The mean intervals from therapy to scanning and to end of follow-up were 12.6 weeks (95% CI: 12.3–12.9) and 22.1 months (95% CI: 20.3–24.0), respectively. In this cohort, 16 patients (15.7%) had confirmed residual or recurrent nodal disease, of whom 15 had residual or recurrent nodal disease within 12 months after therapy. The mean interval from scanning to detection of nodal recurrence was 113.4 days (95% CI: 54.8–172.0). Overall, the mean uptake times of the first and second dedicated head and neck acquisitions were 64.0 min (95% CI: 63.1–64.9) and 123.1 min (95% CI: 121.6–124.6), respectively, resulting in a mean additional uptake time of 59.1 min (95% CI: 57.8–60.5). The mean administered activity of FDG was 277.0 MBq (95% CI: 266.6–287.4).

Table 1 Patient and tumor characteristics

Early (SUV1) versus delayed (SUV2) SUV measurements

The optimal SUV cutoff to differentiate benign from malignant nodes was 2.2 for both early and late images and was independent of the chemotherapy schedule used. Both SUV1 and SUV2 were significantly higher in malignant lymph nodes compared to benign nodes (P = 0.01), although there was a clear overlap (Table 2, Fig. 1A). In malignant nodes, SUV2 was significantly higher compared to SUV1 (median SUV2 = 2.7; IQR 1.9–4.7 vs. SUV1 = 2.6; IQR 1.6–5.9; P = 0.04). In contrast, FDG uptake did not differ significantly between the delayed and early acquisition in benign nodes (median SUV2 = 1.7; IQR 1.5–2.2 vs. SUV1 = 1.8; IQR 1.4–2.2; P = 0.28). An SUV1 cutoff at ≥ 2.2 resulted in a sensitivity of 66.7% (95% CI: 38.4–88.2%), specificity of 80.5% (95% CI: 70.6–88.2%), PPV of 37.0% (95% CI: 25.2–50.7%), and NPV of 93.3% (95% CI: 87.2–96.7%), AUROC 0.74 (Table 3). The SUV2 threshold (optimal cutoff ≥ 2.2) resulted in similar accuracy with one fewer false-negative (FN) case at the expense of 1 additional false-positive (FP) case.
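The percentages reported for the SUV1 cutoff are consistent with a 2×2 table of 10 true positives, 5 false negatives, 17 false positives, and 70 true negatives among the 15 malignant and 87 benign cases within 12 months. These counts are not stated in the text; they are back-calculated from the reported percentages and should be read as an assumption. A minimal Python sketch of the calculation:

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV and NPV (in percent) from 2x2 counts."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }

# Assumed counts, back-calculated from the reported percentages (not given in the text).
suv1 = diagnostic_metrics(tp=10, fn=5, fp=17, tn=70)

# SUV2: one fewer FN and one additional FP than SUV1, as described above.
suv2 = diagnostic_metrics(tp=11, fn=4, fp=18, tn=69)

for name, value in suv1.items():
    print(f"SUV1 {name}: {value:.1f}%")
```

Under these assumed counts, the SUV1 values round to the reported 66.7%, 80.5%, 37.0%, and 93.3%.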

Table 2 Median (IQR) SUV70 measurements

Fig. 1 Boxplot of SUV70 measurements and RI of the whole study population and of visually equivocal patients. Top panel: boxplot of SUV70 measurements (A) and RI (B) of the whole study population (n = 102). Bottom panel: boxplot of SUV70 measurements (C) and RI (D) of visually equivocal patients (n = 24). The boxes represent the interquartile range, and the horizontal line represents the median. The whiskers represent the minimal (Q1 − 1.5*IQR) and maximal (Q3 + 1.5*IQR) values. The dots and asterisk indicate outliers and extreme outliers (beyond Q1 − 3*IQR or Q3 + 3*IQR), respectively. In panels B, C, and D, an extreme outlier was excluded to improve scaling.
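The whisker and outlier limits in this caption can be sketched as follows. The quartile method is an assumption (statistical packages differ in how they interpolate quartiles), and the function names and sample values are hypothetical, not study data.

```python
from statistics import quantiles

def tukey_fences(values):
    """Inner (1.5*IQR) and outer (3*IQR) limits around the quartiles."""
    # Assumption: "inclusive" quartiles; other software may interpolate differently.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3.0 * iqr, q3 + 3.0 * iqr),
    }

def classify(value, fences):
    """Label a value as within the whiskers, an outlier, or an extreme outlier."""
    low_outer, high_outer = fences["outer"]
    low_inner, high_inner = fences["inner"]
    if value < low_outer or value > high_outer:
        return "extreme outlier"
    if value < low_inner or value > high_inner:
        return "outlier"
    return "within whiskers"

# Hypothetical values for illustration only.
fences = tukey_fences([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(fences["inner"], fences["outer"])
print(classify(14, fences), classify(20, fences))
```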

Table 3 Diagnostic performance of SUV70 measurements regarding nodal recurrence within 12 months after CCRT

The retention index

In benign lymph nodes, median RI was negative although highly variable (median RI = − 2.6; IQR − 16.7–4.5; 21.2), while in malignant nodes median RI was positive (median RI = 12.3; IQR − 11.6–25.6; 37.2) and significantly higher (P = 0.018) compared to benign nodes, although there was a clear overlap (Table 2, Fig. 1B). Exploration of potential RI cutoffs, when used in combination with the SUV1 threshold (≥ 2.2), yielded optimal results at RI ≥ 3%. This combined threshold (SUV1 + RI) significantly reduced FP cases by 53% (n = 9) at the expense of increasing FN cases by 20% (n = 1) compared to the SUV1 threshold alone (McNemar exact P = 0.02). The combination consequently led to a marked increase in specificity (90.8% vs. 80.5%, + 10.3%) and PPV (52.9% vs. 37.0%, + 15.9%), while NPV remained comparably high (92.9% vs. 93.3%, − 0.4%) (Table 3). However, the difference in AUROC, as an overall measure of benefit in diagnostic accuracy, was not significant (P = 0.62).
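The combined rule and the McNemar comparison can be sketched in Python. RI is not defined in this section, so the sketch assumes the conventional definition, RI = 100 × (SUV2 − SUV1) / SUV1. It also assumes that the McNemar test was run on the 10 reclassified cases (9 moved from FP to TN, 1 moved from TP to FN); under that assumption the two-sided exact P value agrees with the reported 0.02. The function names and example SUV values are hypothetical.

```python
from math import comb

SUV1_CUTOFF = 2.2  # cutoff reported above
RI_CUTOFF = 3.0    # percent, cutoff reported above

def retention_index(suv1, suv2):
    """Percentage change in SUV from early to delayed scan (assumed definition)."""
    return 100.0 * (suv2 - suv1) / suv1

def combined_positive(suv1, suv2):
    """Positive only if both the SUV1 and the RI criterion are met."""
    return suv1 >= SUV1_CUTOFF and retention_index(suv1, suv2) >= RI_CUTOFF

def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from the two discordant-pair counts."""
    n, k = b + c, min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical nodes, for illustration only (not study data).
print(combined_positive(3.0, 3.3))  # SUV1 above cutoff, RI about 10%
print(combined_positive(3.0, 2.9))  # SUV1 above cutoff, RI negative
print(combined_positive(2.0, 2.6))  # RI high, but SUV1 below cutoff

# Assumed discordant pairs: 9 cases FP -> TN, 1 case TP -> FN.
print(round(mcnemar_exact(9, 1), 3))  # 0.021
```

Requiring both criteria can only remove positives relative to SUV1 alone, which is why the combination trades false positives for false negatives.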

The “visually equivocal” cohort

Visual assessment of the most intense nodal lesion on the early scan assigned a score of 1 to 72 LNs (70.6%), score 2 to 11 LNs (10.8%), score 3 to 10 LNs (9.8%), score 4 to 3 LNs (2.9%), and score 5 to 6 LNs (5.9%). Excluding patients with either a score of 1 or 5 (clear negative and positive cases, respectively) resulted in a cohort of 24 equivocal cases, of which 6 (25%) patients had residual or recurrent lymph node disease within 12 months after the end of chemoradiation. In this subgroup, 10 patients (41.7%) had HPV-associated OPSCC and 4 patients (16.7%) had HPV-negative OPSCC. On both the early and the delayed acquisition, SUV was significantly higher in malignant nodes compared to benign nodes (Table 2). However, neither malignant nor benign nodes showed significant changes in FDG uptake over time (Table 2, Fig. 1C). Consequently, RI was not significantly higher in malignant nodes compared to benign nodes (P = 0.2) (Table 2, Fig. 1D). Applying the optimal SUV threshold on the early acquisition (SUV1 ≥ 2.2) resulted in a sensitivity of 83.3% (95% CI: 35.9–99.6%), specificity of 50.0% (95% CI: 26.0–74.0%), PPV of 22.3% (95% CI: 13.8–34.0%), and NPV of 94.6% (95% CI: 73.3–99.1%) with an AUROC of 0.67 (95% CI: 0.39–0.89). The same threshold on the delayed acquisition reduced FN and FP cases by n = 1 (AUROC = 0.78), leading to an improved sensitivity (100% vs. 83.3%) and specificity (55.6% vs. 50.0%) as compared to the early time point. However, the difference in AUROC was not statistically significant (P = 0.29). Combining the SUV1 threshold with an RI cutoff at 3% yielded an increase in specificity (77.8% vs. 50.0%, + 27.8%) by reducing FP cases (n = 5) at the cost of 1 additional FN case (sensitivity of 66.7% vs. 83.3%; − 16.6%) (Table 3).
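The PPV and NPV reported for this subgroup do not follow from the subgroup's own 2×2 counts (5 true positives among 14 test-positive cases would give 35.7%). They are reproduced, however, when sensitivity and specificity are combined through Bayes' theorem with the whole-cohort prevalence of 15/102. That this is how the values were derived is an inference from the numbers, not something stated in the text, and the subgroup counts used below are likewise back-calculated.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV (as fractions) from Bayes' theorem at a given prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    return tp / (tp + fp), tn / (tn + fn)

# Assumption: prevalence of nodal disease within 12 months in the whole cohort.
PREVALENCE = 15 / 102

# Assumed subgroup counts: 5 of 6 malignant and 9 of 18 benign cases classified correctly.
ppv, npv = predictive_values(5 / 6, 9 / 18, PREVALENCE)
print(f"PPV {100 * ppv:.1f}%, NPV {100 * npv:.1f}%")  # PPV 22.3%, NPV 94.6%
```

The same calculation with a sensitivity of 5/6 and a specificity of 12/15 gives 41.8% and 96.5%, matching the values reported for HPV-negative OPSCC.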

Impact of human papillomavirus

Out of the 102 patients in our study, 54 patients had oropharyngeal squamous cell cancer (OPSCC), of whom 32 patients had HPV-associated OPSCC, 21 patients were HPV-negative, and in one patient HPV status was not assessed (Table 1). In lymph nodes of patients with HPV-negative OPSCC, SUV1 and SUV2 were significantly higher in malignant nodes compared to benign nodes, whereas in nodes of HPV-associated OPSCC, SUV was not significantly different (Table 4, Fig. 2). Moreover, in nodes of HPV-negative OPSCC, delayed imaging revealed a significant decrease in SUV in benign nodes (P = 0.02) and a borderline significant increase in SUV in malignant nodes (P = 0.07). In contrast, nodes of HPV-associated OPSCC patients had no significant change in SUV over time in either benign or malignant nodes (Table 4, Fig. 2), although the small number of malignant nodes after treatment (n = 3) in HPV-associated disease precludes any firm conclusions.

Table 4 Median (IQR) SUV70 measurements in OPSCC stratified by HPV status

Fig. 2 Boxplot of SUV70 measurements and RI in HPV-negative and HPV-associated OPSCC. Top panel: boxplot of SUV70 measurements (A) and RI (B) in HPV-negative OPSCC. Bottom panel: boxplot of SUV70 measurements (C) and RI (D) in HPV-associated OPSCC. The boxes represent the interquartile range, and the horizontal line represents the median. The whiskers represent the minimal (Q1 − 1.5*IQR) and maximal (Q3 + 1.5*IQR) values. The dots represent outliers. In panel D, two extreme outliers were excluded to improve scaling.

In HPV-negative OPSCC, the optimal threshold of the SUV1 parameter (≥ 2.2) resulted in a sensitivity of 83.3% (95% CI: 35.9–99.6%), specificity of 80.0% (95% CI: 51.9–95.7%), PPV of 41.8% (95% CI: 19.7–67.8%), and an NPV of 96.5% (95% CI: 82.1–99.4%), with an AUROC of 0.82. On the delayed acquisition, specificity increased to 93.3% (+ 13.3%), while sensitivity was preserved. This corresponded to a marked increase (+ 26.5%) in PPV (68.3%; 95% CI: 23.9–93.7%), while NPV remained comparably high (97.0%; 95% CI: 84.4–99.5%) (Table 3). Analogous to our previous analyses, an optimal RI cutoff was explored in combination with the SUV1 parameter, revealing an optimal cutoff at 3%. However, the diagnostic accuracy of the combined threshold was identical to that of the SUV2 (≥ 2.2) parameter on the delayed acquisition. There was no statistically significant difference in AUROC values. In patients with HPV-associated OPSCC, the diagnostic performance of the SUV1 parameter suffered from low sensitivity (33.3%; 95% CI: 0.8–90.6%). The SUV2 threshold improved the sensitivity, while specificity remained comparable. A combination of the SUV1 threshold with an RI cutoff yielded no benefit in diagnostic performance, as it improved specificity only at the cost of an unacceptable reduction in sensitivity (Table 3).
