Impact of an AI software on the diagnostic performance and reading time for the detection of cerebral aneurysms on time of flight MR-angiography

Institutional review board approval was obtained for this retrospective study, and the need for written informed consent was waived.

No statistical power analysis was performed. To ensure that the dataset contained a sufficient number of aneurysms, we performed a full-text search of our radiology information system (RIS) for known aneurysm cases. The cases identified by this search were reviewed by a neuroradiologist and then added to a set of consecutive MRI studies that were acquired at our institution and included a TOF-MRA, resulting in a dataset of 186 MRI studies in total. A larger number of aneurysms could not be included for two reasons: most patients with cerebral aneurysms are referred to our institution with external imaging, which is rejected by our version of mdbrain, and most emergency imaging at our institution is done with CT.

The imaging technique and parameters were the same as previously described [4]. In brief, the imaging studies were acquired using two clinical 3 T scanners (Achieva, Philips Healthcare; Discovery, GE Healthcare) and one clinical 1.5 T scanner (Achieva, Philips Healthcare), using routine protocols of our institutions. For the acquisition of 3D TOF images, TR ranged from 19.33 to 20.12 ms and TE ranged from 3.68 to 3.80 ms. The slice thickness was 1 mm, and the increment was 0.5 mm. The field of view and matrix size were chosen by the radiology technician according to the patient’s characteristics. To establish the reference standard, two neuroradiologists with 8 and 15 years of experience, respectively, in reading MR imaging studies of the brain reviewed the imaging studies for the presence of cerebral aneurysms, taking into full consideration the patient’s medical records as well as previous and follow-up imaging, including digital subtraction angiography (DSA). Where the two readers disagreed, a consensus was reached.

Details of the training of the AI model have been published before and can be found in supplemental material S1. The imaging studies were analyzed by the artificial intelligence-based software mdbrain, version 4. Written reports and annotated series were automatically created by the software and imported into the institute’s picture archiving and communication system (PACS). For each patient, two hanging protocols were created in the institute’s clinical PACS viewer (Deep Unity, Dedalus Healthcare Group AG, Bonn, Germany). The first included axial TOF images, maximum intensity projection (MIP) images of the TOF-MRA, and the findings of the software; the second included the same TOF images but not the findings of the software. In addition, an Excel sheet (Excel, Microsoft Corporation, Redmond, WA, USA) was created which contained a schematic image of the circle of Willis with checkboxes at typical locations of cerebral aneurysms. The findings reported in the Excel sheet were automatically exported to an Excel table, and the reading time was calculated as the difference between the time when the reading was started and the time when it was finished.

Six readers (three medical students with no experience in image interpretation, one neuroradiology resident with 2 years of experience in diagnostic neuroradiology, one radiologist with 6 years of experience in diagnostic radiology and neuroradiology, and one neuroradiologist with 12 years of experience in diagnostic neuroradiology) were asked to review the imaging studies using the hanging protocols and to report their findings in the above-mentioned Excel sheet. The readers were only allowed to review the images included in the respective hanging protocol, but they were allowed to create multiplanar reconstructions when needed. The readers had no knowledge of the original reports, the patients’ medical histories, or prior and follow-up imaging. First, the readers reviewed the hanging protocols including the software’s reports. After a washout period of at least 6 weeks, they read the imaging studies again, this time without the assistance of the software.

Statistical analyses were performed with R statistical and computing software, Version 4.0.3 (http://www.r-project.org/) and R Studio, Version 1.2.5033 (http://rstudio.org/download/desktop). On the patient level, sensitivity and specificity were calculated for each reader with and without the use of the software. Only cases where all aneurysms were detected by the reader without any false positive findings were counted as true positives. Cases where both the reader and the reference standard reported no aneurysm were counted as true negatives. When at least one false positive finding was reported, the case was counted as false positive. When at least one aneurysm was missed by the reader, the case was counted as false negative. We also measured the sensitivity on lesion level, the rate of false positives per case, and the reading times. To avoid counting aneurysms that the readers detected correctly but assigned to a neighboring location as false findings, we grouped aneurysms originating from the internal carotid artery (ICA), the carotid terminus, and the posterior communicating artery (PcoA) as ICA aneurysms, and aneurysms originating from the basilar artery, the basilar tip, the distal vertebral arteries, the posterior inferior cerebellar arteries (PICA), the anterior inferior cerebellar arteries (AICA), the superior cerebellar artery (SCA), or the posterior cerebral artery (PCA) as posterior circulation aneurysms. The readers’ findings with and without AI support were compared with McNemar’s test, and the reading times of each reader with and without the use of the AI software were compared by using the Mann–Whitney U test. Normal distribution was evaluated by using the Shapiro–Wilk test. The level of statistical significance was set at p = 0.05. The readers’ diagnostic performances compared to the diagnostic reference standard were evaluated using confusion matrices. The study design is summarized in Fig. 1.
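The patient-level counting rules and the location grouping described above can be illustrated with a minimal Python sketch. This is not the analysis code of the study, which was carried out in R. The location labels, the function names, and the matching of findings by grouped location are hypothetical. The text also does not state how a case with both a missed aneurysm and a false positive finding was counted; the sketch assumes that the miss takes precedence. Sensitivity and specificity are assumed to follow the standard formulas TP / (TP + FN) and TN / (TN + FP).

```python
from collections import Counter

# Location grouping as described in the text; the label strings are hypothetical.
LOCATION_GROUPS = {
    "ICA": "ICA", "carotid terminus": "ICA", "PcoA": "ICA",
    "basilar artery": "posterior", "basilar tip": "posterior",
    "vertebral artery": "posterior", "PICA": "posterior",
    "AICA": "posterior", "SCA": "posterior", "PCA": "posterior",
}


def group(location):
    """Map a location to its group; locations outside the grouping are unchanged."""
    return LOCATION_GROUPS.get(location, location)


def classify_case(reference, reported):
    """Return 'TP', 'TN', 'FP' or 'FN' for one case.

    reference, reported: lists of aneurysm locations, one entry per aneurysm.
    """
    ref = Counter(group(loc) for loc in reference)
    rep = Counter(group(loc) for loc in reported)
    missed = ref - rep    # reference aneurysms the reader did not report
    spurious = rep - ref  # reported findings without a matching reference aneurysm
    if missed:
        return "FN"       # assumption: a miss takes precedence over a false positive
    if spurious:
        return "FP"
    return "TP" if ref else "TN"


def sensitivity_specificity(labels):
    """Compute patient-level sensitivity and specificity from case labels."""
    c = Counter(labels)
    pos, neg = c["TP"] + c["FN"], c["TN"] + c["FP"]
    sens = c["TP"] / pos if pos else float("nan")
    spec = c["TN"] / neg if neg else float("nan")
    return sens, spec
```

Under this sketch, a PcoA aneurysm that a reader marks at the ICA is still counted as detected, because both locations fall into the same group.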

Fig. 1

Flow chart of the study design. NRad neuroradiologist, Rad radiologist
