Artificial intelligence-assisted assessment for Forrest classification of peptic ulcer bleeding: hype or reality?

Forrest et al. first described the evolution of peptic ulcer bleeding (PUB) in 1974, and their description was subsequently widely adopted as the gold standard for risk stratification of PUB [1]. Current guidelines recommend endoscopic therapy for patients with Forrest Ia (active spurting), Ib (active oozing), and IIa (nonbleeding visible vessel) lesions [2]; however, interobserver and intraobserver variation in applying this classification system may limit its role in practice [3] [4]. It is sometimes difficult to differentiate IIa from IIb lesions (with adherent clot) without proper irrigation and clot removal. Moreover, it can be controversial whether a lesion is a less prominent vessel (IIa) or just a pigmented spot (IIc); the latter does not require any endoscopic intervention.
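The treatment rule described above can be written down as a simple lookup. The following is only a minimal sketch of what this paragraph states, not a clinical decision tool: the names `THERAPY_INDICATED` and `endoscopic_therapy_indicated` are hypothetical, and IIb is deliberately left as undetermined because the paragraph notes only that such lesions need irrigation and clot removal before they can be classified reliably.

```python
from typing import Optional

# Hypothetical lookup built only from the statements in the paragraph above.
# True  = endoscopic therapy recommended (Ia, Ib, IIa)
# False = no endoscopic intervention required (IIc)
# None  = not determined here (IIb: adherent clot, reassess after irrigation)
THERAPY_INDICATED = {
    "Ia": True,    # active spurting
    "Ib": True,    # active oozing
    "IIa": True,   # nonbleeding visible vessel
    "IIb": None,   # adherent clot
    "IIc": False,  # pigmented spot
}


def endoscopic_therapy_indicated(forrest_class: str) -> Optional[bool]:
    """Return True/False when the paragraph states a rule, else None."""
    if forrest_class not in THERAPY_INDICATED:
        raise ValueError(f"Forrest class not covered by this sketch: {forrest_class}")
    return THERAPY_INDICATED[forrest_class]


decision_iia = endoscopic_therapy_indicated("IIa")
decision_iic = endoscopic_therapy_indicated("IIc")
```

Written this way, the clinical weight of the IIa versus IIc distinction is explicit: the two labels lead to opposite decisions, so a misclassification between them changes management.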

With the rapid advance of artificial intelligence (AI) for image and pattern recognition, including endoscopic images, it is timely to test whether AI can be applied to the interpretation of the Forrest classification of PUB, in a manner similar to that which has been achieved for colorectal polyp detection and characterization [5]. In this issue of Endoscopy, He et al. report their work on the development of a real-time deep convolutional neural network (DCNN) system for assessment of the Forrest classification of PUB [6]. Using a training dataset of 3868 still endoscopic images and an internal validation set of another 834 still images, they trained the DCNN system to classify the Forrest type. The system was further validated on an external set of 521 endoscopic images, as well as 46 short endoscopic video clips of PUB. The overall accuracies in the still image and video datasets were 91.2% and 92.0%, respectively. The accuracy achieved was also numerically higher than that of both junior and senior endoscopists.

Similar findings have been reported by Yen et al., who showed that a deep learning model performed better than inexperienced endoscopists in the interpretation of the Forrest classification [7]. Despite the promising performance of the reported AI systems, there are some limitations to the study of He et al. [6]. As the system was trained on selected endoscopic images collected retrospectively, validation should ideally have been performed on unselected prospective patients with PUB, rather than on selected endoscopic images and video clips. These images and videos were usually captured by experienced endoscopists for illustrative purposes or clinical records, and poor-quality images were typically excluded. Because endoscopy for upper gastrointestinal bleeding (UGIB) is a dynamic procedure, typically with a poor endoscopic view owing to the presence of blood in the stomach and the patient’s unstable condition, validation based on these highly selected videos could lead to a falsely high accuracy in a very controlled and artificial setting. Moreover, not all patients with UGIB have PUB. Therefore, real-time prospective validation in an unselected cohort of UGIB patients is mandatory to establish the actual accuracy of this AI model in a real-world setting.

In daily clinical practice, there is little controversy in classifying actively bleeding ulcers (Forrest Ia or Ib). Inconsistency usually arises with Forrest II lesions, particularly in terms of whether endoscopic therapy is indicated. Among the various lesion types, the lowest accuracy in this study was found for IIa lesions, with the sensitivities for IIa and IIb lesions both being <86% [6]. The positive predictive values (PPVs) for all type II lesions were <90%, with the lowest being 81.3% for type IIc lesions. It is also highly likely that the classification would change during the actual endoscopic procedure after adequate irrigation and suction of blood clots, meaning that a IIb lesion might become a IIa or IIc lesion. Therefore, the actual benefits of this system, particularly in patients who would need better delineation of their Forrest classification, are still uncertain in the absence of a proper prospective evaluation in real patients.
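For readers less familiar with how per-class sensitivity and PPV are derived, the following minimal sketch may help. It assumes paired lists of reference and predicted Forrest labels; the function name `per_class_metrics` and the toy labels are hypothetical and chosen purely for illustration. They are not data from He et al. [6].

```python
from collections import Counter


def per_class_metrics(true_labels, predicted_labels):
    """Compute sensitivity and PPV for each class from paired label lists."""
    classes = sorted(set(true_labels) | set(predicted_labels))
    # Count each (reference label, predicted label) pair.
    pair_counts = Counter(zip(true_labels, predicted_labels))
    metrics = {}
    for c in classes:
        tp = pair_counts[(c, c)]
        # False negatives: reference is c, prediction is something else.
        fn = sum(n for (t, p), n in pair_counts.items() if t == c and p != c)
        # False positives: prediction is c, reference is something else.
        fp = sum(n for (t, p), n in pair_counts.items() if p == c and t != c)
        metrics[c] = {
            "sensitivity": tp / (tp + fn) if (tp + fn) else None,
            "ppv": tp / (tp + fp) if (tp + fp) else None,
        }
    return metrics


# Hypothetical toy labels, NOT study data.
true_labels = ["IIa", "IIa", "IIa", "IIa", "IIb", "IIb", "IIc", "IIc"]
predicted_labels = ["IIa", "IIa", "IIa", "IIc", "IIb", "IIa", "IIc", "IIc"]
metrics = per_class_metrics(true_labels, predicted_labels)
```

In this toy example, one IIa lesion labelled as IIc lowers the sensitivity for IIa and also lowers the PPV for IIc, which is why overall accuracy alone can hide errors in the classes where the treatment decision is most contested.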

Given that even experts sometimes cannot agree on the classification of bleeding stigmata in the Forrest classification [3] [4], the primary end point of an ideal future clinical trial should be a solid clinical outcome (e.g. the need for endoscopic therapy or the rebleeding risk), rather than the accuracy of the system, which lacks translational impact. A properly designed randomized controlled trial using a clinical outcome is therefore urgently needed.

With the advance of endoscopic imaging systems, it would also be interesting to see whether new image-enhancement technology, such as red dichromatic imaging (RDI), which is designed for better visualization of a bleeding point [8], could help to boost the accuracy of the AI system for Forrest classification.

While AI development in endoscopy is exciting, its role and impact should be evaluated by actual clinical outcomes, rather than purely measuring the accuracy of labelling lesions. Future studies should also evaluate the application and integration of AI assistance into our daily endoscopy practice, particularly its impact on training and quality assurance.

Publication History

Article published online:
13 March 2024

© 2024. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany
