Improving detection performance of hepatocellular carcinoma and interobserver agreement for liver imaging reporting and data system on CT using deep learning reconstruction

This retrospective study was approved by our Institutional Review Board, and the requirement for obtaining written informed consent was waived.

Patients

We searched the picture archiving and communication system for all consecutive patients who underwent CT for the evaluation of HCC. Figure 1 summarizes the patient inclusion process.

Fig. 1

Flowchart of patient inclusion process and image analysis. HCC hepatocellular carcinoma

For the HCC group, patients who underwent abdominal dynamic contrast-enhanced CT between October 2021 and March 2022 in which one or more HCCs were identified were included in the study. Patients with four or more HCCs were excluded; according to the guideline [18], such patients are potential candidates for systemic treatment, and identifying every lesion offers little clinical benefit relative to the burden it places on radiologists. A total of 26 patients and 42 HCCs were identified. There were also 5 hemangiomas and 2 focal nodular hyperplasias. However, since the main purpose of this study was to evaluate the detection performance for HCCs, these lesions were not evaluated in the following analyses. Two radiologists (A and B, with 5 and 12 years of imaging experience, respectively) established the reference standard for the diagnosis of HCC based on the following: histopathological report (8 lesions), CT image findings with chronological change in size (≥ 50% size increase in ≤ 6 months) (9 lesions) (Fig. 2), combinations of CT and MRI image findings (13 lesions), and CT image findings (12 lesions). All lesions were treated with surgery (7 lesions), liver transplantation (1 lesion), or radiofrequency ablation (34 lesions).

Fig. 2

Size change in HCC showing a ≥ 50% size increase in ≤ 6 months. The figure shows the tumor growth rate and the time interval
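The growth criterion used for the reference standard (≥ 50% size increase in ≤ 6 months) can be written as a simple check. The following is a minimal sketch; the function name and the assumption that size is a single diameter in millimetres are illustrative, since the text does not specify how size was measured.

```python
def meets_growth_criterion(size_before_mm, size_after_mm, interval_months):
    """Return True if a lesion grew by >= 50% within <= 6 months.

    Assumption: size is one diameter in mm measured on both examinations.
    """
    if size_before_mm <= 0 or interval_months <= 0:
        raise ValueError("baseline size and interval must be positive")
    relative_increase = (size_after_mm - size_before_mm) / size_before_mm
    return relative_increase >= 0.5 and interval_months <= 6


# Hypothetical lesion: 10 mm at baseline, 15 mm six months later.
print(meets_growth_criterion(10, 15, 6))
```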

The inclusion criterion for the non-HCC group was the absence of HCC on abdominal dynamic contrast-enhanced CT in February and March 2022 (the study period differed between the HCC group and the non-HCC group in order to balance the number of patients in the two groups). The absence of HCC was confirmed based on the following: histopathology (1 patient who underwent liver transplantation), no chronological change on examinations including MRI over 4 months (3 patients), no chronological change on CT examinations over 4 months (18 patients), and image findings at a single CT examination (1 patient). Consequently, 23 patients met the criterion.

A total of 49 patients (26 patients in the HCC group and 23 patients in the non-HCC group) were included in the final analyses (qualitative image analyses parts 1 and 2, and quantitative image analyses, as described later). Table 1 shows age, sex, body mass index, hepatitis B viral status, hepatitis C viral status, the presence of histopathologically proven cirrhosis, and CT dose index volume in the HCC and non-HCC groups.

Table 1 Demographic and Clinical Characteristics in the HCC and Non-HCC Groups

CT imaging

All patients underwent CT with a multi-detector row CT scanner (Aquilion ONE; Canon Medical Systems, Otawara, Japan). CT scanning parameters were as follows: tube voltage, 120 kVp; tube current, automatic tube current modulation with SD set at 13.0 Hounsfield units; helical pitch, 0.8125:1; and gantry rotation time, 0.5 s. The concentration and volume of the contrast material were determined based on body weight: 300 mgI/mL and body weight × 2 mL, respectively, for those weighing < 50 kg; 350 mgI/mL and 100 mL, respectively, for those weighing between 50 and 60 kg; and 370 mgI/mL and 100 mL, respectively, for those weighing > 60 kg. Contrast material was injected via a peripheral vein within 30 s. The arterial, portal, and delayed phase images were acquired with the following delays: arterial phase, using a bolus tracking system (threshold attenuation of 200 Hounsfield units in the descending aorta at the level of the diaphragm); portal phase, 40 s after the arterial phase; and delayed phase, 180 s after the beginning of contrast agent injection. From the source data, images were reconstructed with the following algorithms: DLR (AiCE body sharp standard, Canon Medical Systems) and Hybrid IR (AIDR 3D enhanced standard with kernel of FC03, Canon Medical Systems). The following image reconstruction parameters were the same across all the image sets: field of view, 35–40 cm (adjusted to body size), and slice thickness/interval, 3/3 mm.
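The weight-based contrast protocol described above can be summarized as a small lookup. This is a minimal sketch for illustration only; the names `ContrastDose` and `contrast_dose` are hypothetical, and treating exactly 50 kg and 60 kg as part of the middle band is an assumption, because the text says only "between 50 and 60 kg".

```python
from typing import NamedTuple


class ContrastDose(NamedTuple):
    concentration_mgI_per_ml: int
    volume_ml: float


def contrast_dose(body_weight_kg: float) -> ContrastDose:
    """Map body weight to contrast concentration and volume."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if body_weight_kg < 50:
        # 300 mgI/mL at 2 mL per kg of body weight
        return ContrastDose(300, body_weight_kg * 2.0)
    if body_weight_kg <= 60:
        # Assumption: the 50-60 kg band includes both boundary values
        return ContrastDose(350, 100.0)
    return ContrastDose(370, 100.0)


dose = contrast_dose(45)
total_iodine_g = dose.concentration_mgI_per_ml * dose.volume_ml / 1000
print(dose, total_iodine_g)
```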

CT images were anonymized and exported from the picture archiving and communication system in Digital Imaging and Communications in Medicine format.

Qualitative image analyses

Two other radiologists (readers 1 and 2, with 4 and 7 years of imaging experience, respectively; reader 2 specialized in abdominal radiology) performed the qualitative image analyses. These analyses comprised two parts: the HCC detection test and LI-RADS scoring (part 1) and image quality evaluation (part 2). In both parts, a single image set, consisting of arterial, portal, and delayed phases, was evaluated at a time. The two readers evaluated the images using ImageJ (https://imagej.nih.gov/ij/).

HCC detection test and LI-RADS scoring (part 1)

In this part, the two readers independently identified HCCs by scoring diagnostic confidence (5, definitely present; 4, probably present; 3, possibly present but uncertain; 2, probably not present; 1, definitely not present), assigning LI-RADS categories (LR-1 to LR-5) based on version 2018 [6], and recording the location (CT slice number and liver segment). The two readers were also asked to measure the size of each HCC. LI-RADS categories were assigned by evaluating the major features: non-rim arterial phase hyper-enhancement (APHE), nonperipheral washout appearance, enhancing capsule appearance, and size of the HCC. However, because previous CT images were not available for all patients, threshold growth was not considered in this study. Ancillary features and tiebreaking rules were applied to upgrade or downgrade the category [6], and the final LI-RADS category was used in the analyses.

To avoid overestimating the detection performance with DLR, the two readers were asked to evaluate the DLR image sets in session 1, followed by the Hybrid IR image sets in session 2, with a 2-week washout period between the two sessions. Each session was completed within a single day. The readers were blinded to the image reconstruction algorithm. Furthermore, they were not informed of the purpose of the study. The order of the image sets within the DLR and Hybrid IR sessions was randomized by radiologist A before the readers’ evaluation. The time required to evaluate one image set was also measured.
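The per-session randomization of reading order can be illustrated as follows. This is only a sketch: the text does not state how the randomization was carried out, and the case identifiers, seeds, and the function name `reading_order` are hypothetical.

```python
import random


def reading_order(case_ids, seed):
    """Return a reproducible random permutation of the case IDs."""
    rng = random.Random(seed)  # a fixed seed keeps the order reproducible
    order = list(case_ids)
    rng.shuffle(order)
    return order


# 49 hypothetical case IDs, one per patient.
cases = [f"case{n:02d}" for n in range(1, 50)]
session_1 = reading_order(cases, seed=1)  # DLR image sets
session_2 = reading_order(cases, seed=2)  # Hybrid IR image sets, read 2 weeks later
```

Using a separate seed per session gives each reconstruction its own case order, so a reader cannot rely on case position to recall findings from the earlier session.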

After completion of the two sessions, the readers were asked to score LI-RADS categories for the missed HCCs (the diagnostic confidence was not scored in this process).

Image quality (part 2)

After part 1, the two readers independently evaluated the image sets, in terms of the following:

Depiction of major features of HCC on LI-RADS (APHE, nonperipheral washout, and enhancing capsule) on a 5-point scale (5, clear depiction; 4, clearer than standard; 3, standard; 2, more blurred than standard; and 1, very blurred).

Subjective image noise for the arterial, portal, and delayed phases separately on a 5-point scale (5, almost no noise; 4, less than standard noise; 3, standard noise; 2, more than standard noise; and 1, severe noise).

Image quality on a 5-point scale (5, excellent; 4, better than standard; 3, standard; 2, worse than standard; 1, poor).

In this part, all image sets (including both the DLR and Hybrid IR) were randomized by radiologist A before the evaluation by the two readers. The two readers evaluated images of one reconstruction algorithm at a time (i.e., not in a side-by-side way). The two readers were also blinded to the image reconstruction algorithm.

Quantitative image analyses

Radiologist A placed regions of interest of approximately 100 mm² on the abdominal aorta at the level of the celiac artery origin. The SD of the CT attenuation, an indicator of image noise, was recorded in the arterial, portal, and delayed phases. These measurements were also performed with ImageJ (https://imagej.nih.gov/ij/).
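The noise measurement can be sketched as the SD of attenuation values inside a circular region of interest. The code below is an illustration under stated assumptions, not the procedure used in ImageJ: a circular ROI, a 512 × 512 reconstruction matrix, and the sample (n − 1) SD are all assumptions, and the function names are hypothetical.

```python
import math
import statistics


def roi_radius_px(area_mm2, fov_mm, matrix=512):
    """Radius in pixels of a circular ROI with the given area.

    Assumption: square pixels with spacing fov_mm / matrix.
    """
    pixel_mm = fov_mm / matrix
    return math.sqrt(area_mm2 / math.pi) / pixel_mm


def roi_noise_sd(image, center_row, center_col, radius_px):
    """Sample SD of pixel values (HU) inside a circular ROI.

    `image` is a 2D list of attenuation values indexed as image[row][col].
    """
    values = [
        image[r][c]
        for r in range(len(image))
        for c in range(len(image[r]))
        if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius_px ** 2
    ]
    return statistics.stdev(values)


# A 100 mm^2 ROI at a 350 mm field of view spans a radius of about 8 pixels.
print(round(roi_radius_px(100, 350), 2))
```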

Statistical analysis

Statistical analyses were performed with EZR version 4.0.0 (https://www.jichi.ac.jp/saitama-sct/SaitamaHP.files/statmed.html) [19], which is a graphical user interface for R version 4.2.0 (https://www.r-project.org/) (R Foundation for Statistical Computing, Vienna, Austria).

Fisher’s exact test and the Mann–Whitney U test were used to compare the demographic and clinical characteristics in the HCC and non-HCC groups.

The results for continuous variables and ordinal scales were compared between DLR and Hybrid IR with the paired t test and Wilcoxon signed-rank test, respectively, except for the depiction of enhancing capsule, for which the Mann–Whitney U test was performed. To assess the diagnostic performance in detecting HCCs with the diagnostic confidence score, jackknife alternative free-response receiver operating characteristic analysis was performed with the R package “RJafroc,” and the figure of merit (FOM), which corresponds to the area under the curve in conventional receiver operating characteristic analysis, was calculated. In calculating the sensitivity for the detection test, diagnostic confidence scores of 3 or more were regarded as positive for the presence of a lesion. The sensitivities were compared between DLR and Hybrid IR with McNemar’s test. For these comparisons, a P value < 0.05 was considered statistically significant.
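The sensitivity calculation and the paired comparison can be sketched as follows. This is an illustration rather than the EZR procedure: the text does not say whether the exact or the chi-square form of McNemar’s test was used, so the exact binomial form is an assumption, and all scores below are made-up example data. A missed lesion is represented as `None`.

```python
from math import comb


def is_detected(score, threshold=3):
    """A lesion counts as detected when its confidence score is >= threshold."""
    return score is not None and score >= threshold


def sensitivity(scores, threshold=3):
    """Fraction of true lesions detected at the given confidence threshold."""
    return sum(is_detected(s, threshold) for s in scores) / len(scores)


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar P value from the two discordant pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up confidence scores for the same six lesions on each reconstruction.
dlr = [5, 4, 3, 4, 3, None]
hybrid = [5, 2, None, 4, 2, None]

# Discordant pairs: detected on one reconstruction but not on the other.
b = sum(is_detected(x) and not is_detected(y) for x, y in zip(dlr, hybrid))
c = sum(is_detected(y) and not is_detected(x) for x, y in zip(dlr, hybrid))
print(sensitivity(dlr), sensitivity(hybrid), mcnemar_exact_p(b, c))
```

Only the discordant pairs enter the test, which is why lesions detected on both reconstructions, or missed on both, do not affect the P value.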

For the LI-RADS categories, interobserver agreement between the two readers was evaluated with Cohen’s weighted kappa (quadratic weights were used). Based on Cohen’s article [20], the 95% CI for the kappa value was also calculated. Kappa values of 0.00–0.20, 0.21–0.40, 0.41–0.60, 0.61–0.80, and 0.81–1.00 indicated poor, fair, moderate, good, and excellent agreement, respectively.
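Quadratic-weighted kappa and the agreement bands above can be sketched in a few lines. This is a minimal illustration, not the routine used in the study: the function names are hypothetical, the 95% CI is not computed, and labelling negative kappa values as "poor" is an assumption because the stated bands start at 0.00.

```python
def quadratic_weighted_kappa(ratings_1, ratings_2, categories):
    """Cohen's kappa with quadratic disagreement weights for ordinal ratings."""
    k = len(categories)
    index = {cat: i for i, cat in enumerate(categories)}
    n = len(ratings_1)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_1, ratings_2):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            weight = ((i - j) / (k - 1)) ** 2  # 0 on the diagonal, 1 at the extremes
            num += weight * observed[i][j]
            den += weight * row[i] * col[j]
    if den == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - num / den


def agreement_label(kappa):
    """Map a kappa value to the agreement bands used in the text."""
    for upper, label in [(0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "good")]:
        if kappa <= upper:
            return label
    return "excellent"


# Made-up LI-RADS categories (1 to 5) from two readers for five lesions.
reader_1 = [5, 4, 3, 5, 4]
reader_2 = [5, 4, 4, 5, 3]
kappa = quadratic_weighted_kappa(reader_1, reader_2, [1, 2, 3, 4, 5])
print(round(kappa, 2), agreement_label(kappa))
```

Quadratic weights penalize a two-category disagreement four times as heavily as a one-category disagreement, which suits an ordinal scale such as LI-RADS.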
