The color variation of hematoxylin and eosin (H&E)-stained tissues has presented a challenge for applications of artificial intelligence (AI) in digital pathology. Many color normalization algorithms have been developed in recent years to reduce the color variation between H&E images. However, previous efforts to benchmark these algorithms have produced conflicting results, and none has sufficiently assessed how well the various color normalization methods improve the diagnostic performance of AI systems. In this study, we systematically investigated eight color normalization algorithms for AI-based classification of H&E-stained histopathology slides, using images both from a single center and from multiple centers. Our results show that color normalization does not consistently improve classification performance when both training and testing data come from a single center. However, using four multi-center datasets of two cancer types (ovarian and pleural) and objective functions, we show that color normalization can significantly improve the classification accuracy of images from external datasets (ovarian cancer: 0.25 AUC increase, p = 1.6e-05; pleural cancer: 0.21 AUC increase, p = 1.4e-10). Furthermore, we introduce a novel augmentation strategy that mixes color-normalized images produced by three easily accessible algorithms and consistently improves the diagnosis of test images from external centers, even when the individual normalization methods give varied results. We anticipate that our study will be a starting point for the reliable use of color normalization to improve AI-based, digital pathology-empowered cancer diagnosis on data sourced from multiple centers.
This article is protected by copyright. All rights reserved.