Machine learning identification of thresholds to discriminate osteoarthritis and rheumatoid arthritis synovial inflammation

Study design and cohort

We compared knee synovial histologic features from two different cohorts of patients undergoing TKR for OA or RA at a high-volume, tertiary care hospital. This was a secondary analysis of OA and RA patients that were identified via electronic medical records or physician referral and enrolled during their preoperative screening visit.

The OA patients were enrolled in the OA subtypes cohort from November 2018 through October 2019. Patients over the age of 45 that met ACR Clinical/Radiographic Criteria, ACR Clinical/Laboratory Criteria [9], or Kellgren-Lawrence (KL) Radiographic Criteria (grades 2–4) for knee OA [9, 10] were included in the study. Patients who had a fracture in the operative knee, a diagnosis of a systemic rheumatic disease such as RA, or any disease other than OA as an indication for TKR were excluded from the study. In addition, three patients were excluded from the study sample after TKR because the pathologist assessment of the arthroplasty explant revealed a rheumatic disease diagnosis masked as OA.

As previously described, RA patients were enrolled in the RA Perioperative FLARE Study from October 2013 to October 2021 [7, 11, 12]. Inclusion criteria for this cohort were patients above the age of 18 who met the American College of Rheumatology (ACR)/European League Against Rheumatism 2010 classification criteria for RA [13] and/or the ACR 1987 criteria for RA [14]. Patients who had any other systemic rheumatic disease or crystalline arthropathy were excluded.

Written informed consent was obtained from all participants. Patients meeting the inclusion/exclusion criteria were enrolled in the respective OA and RA cohorts. Demographic characteristics such as patient age, race, sex, and body mass index (BMI) were collected. Erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), rheumatoid factor (RF), and cyclic citrullinated peptide (CCP) were measured in all OA and RA patients. In RA patients, RF and CCP were measured as part of the standard of care or, if unavailable, by serum ELISA as in OA patients.

As per institutional policy, ethical approval for this study was provided by the Institutional Review Board at the Hospital for Special Surgery (IRB #2018-0895 and #2014-233), and the research was performed in accordance with the relevant guidelines and regulations. The study methods and results are described in accordance with the Strengthening of Reporting in Observational studies in Epidemiology (STROBE) guidelines for cohort studies [15].

Tissue processing and histologic scoring

Synovial samples were obtained intra-operatively from 147 OA patients and 60 RA patients. As per the study protocol, orthopedic surgeons were requested to preferentially obtain a research sample from grossly abnormal-looking synovium. Tissue for histological examination was chosen by a pathologist on the basis of gross features including the smoothness and granularity of the synovial surface, red or brown discoloration, and the clarity, dullness, or opacity of the synovial layer, preferentially avoiding regions of electro-cautery effect.

Synovial samples were preferentially obtained from the most grossly inflamed (dull and opaque) area of the synovium. If there was no obviously inflamed synovium, samples were obtained from standard locations: the femoral aspects of the medial and lateral gutters and the central supratrochlear region of the suprapatellar pouch. OA synovial tissue samples were formalin-fixed and paraffin-embedded, and the RA tissues were fresh-frozen in optimal cutting temperature compound. Each tissue biopsy was sectioned at 5-μm thickness and stained with Harris-modified hematoxylin solution and eosin Y (H&E) manufactured by Epredia in Kalamazoo, MI. An expert musculoskeletal pathologist (ED) scored fourteen synovial histologic features in a single section for each patient: lymphocytic inflammation, mucoid change, fibrosis, fibrin, germinal centers, lining hyperplasia, neutrophils, detritus, plasma cells, binucleated plasma cells, Russell bodies, sub-lining giant cells, synovial lining giant cells, and mast cells. Detailed methods for scoring these features are included in the Appendix, some of which are described in prior studies [8] and available at www.hss.edu/pathology-synovitis.

Computer vision analysis of cell density

Pathology slides were digitized using an Aperio AT Turbo Scanner manufactured by Leica Biosystems in Deer Park, IL, USA, producing whole slide images at 20× magnification. As previously described [8], we applied computer vision techniques to the whole slide images to count the cell nuclei and quantify the amount of tissue present. The whole slide images were divided into smaller image tiles, each covering an area of approximately 0.25 mm². These tiles were converted to grayscale, analyzed for different intensity levels, and assigned a metric based on the proportion of the tile determined to contain tissue. Using a combination of techniques, including Otsu’s method [16], the watershed algorithm, and local adaptive thresholding, the cell nuclei were isolated from the tissue within the image. Final nuclei counts were refined using shape filtering, and nuclei density was calculated by normalizing the total count of individual nuclei by the tissue area. This method yields a continuous value of mean cell count per mm² of tissue. Pre-processing the whole slide image into tiles takes an average of 40 min, after which nuclei density can be computed in under a minute. The open-access code can be downloaded here: https://github.com/sgmitre/ai-histology. See Fig. 1 for representative histological images of varying nuclei densities.
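The per-tile steps described above (tissue mask, Otsu and local adaptive thresholding, watershed splitting, shape filtering, normalization by tissue area) can be sketched with scikit-image as follows. This is a minimal illustration on a synthetic tile, not the published implementation: the function name `nuclei_density`, the pixel size of 0.5 µm, the background cutoff of 0.9, the block size, and the area limits are all assumptions chosen for the example.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.filters import threshold_local, threshold_otsu
from skimage.measure import regionprops
from skimage.segmentation import watershed

# Assumed pixel size: 0.5 um per pixel, i.e. 2.5e-7 mm^2 per pixel.
MM2_PER_PIXEL = 0.0005 ** 2


def nuclei_density(gray, min_area=30, max_area=400):
    """Return (nuclei_count, tissue_area_mm2, nuclei_per_mm2) for one tile.

    gray is a 2-D float array in [0, 1]; stained nuclei are dark.
    All cutoffs are illustrative assumptions, not the published values.
    """
    tissue = gray < 0.9  # near-white pixels are treated as empty slide
    tissue_area_mm2 = tissue.sum() * MM2_PER_PIXEL
    if tissue_area_mm2 == 0:
        return 0, 0.0, 0.0

    # Candidate nuclei: darker than a global Otsu cutoff and the local mean.
    dark_global = gray < threshold_otsu(gray[tissue])
    dark_local = gray < threshold_local(gray, block_size=51)
    nuclei = tissue & dark_global & dark_local

    # Watershed on the distance transform splits touching nuclei.
    distance = ndi.distance_transform_edt(nuclei)
    peaks = peak_local_max(distance, min_distance=5)
    markers = np.zeros(gray.shape, dtype=np.int32)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    labels = watershed(-distance, markers, mask=nuclei)

    # Shape filtering: keep regions whose area is plausible for a nucleus.
    count = sum(min_area <= r.area <= max_area for r in regionprops(labels))
    return count, tissue_area_mm2, count / tissue_area_mm2


def disc(shape, center, radius):
    """Boolean mask of a filled disc, used only to draw synthetic nuclei."""
    yy, xx = np.ogrid[:shape[0], :shape[1]]
    return (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2


# Synthetic tile: stroma at 0.7, an empty-slide strip at 1.0, nuclei at 0.2.
tile = np.full((128, 128), 0.7)
tile[:, :16] = 1.0
for center in [(30, 40), (30, 100), (90, 60), (90, 70)]:  # last two overlap
    tile[disc(tile.shape, center, 6)] = 0.2
tile[110:112, 110:112] = 0.2  # small debris speck, dropped by the area filter

count, area_mm2, density = nuclei_density(tile)
print(count, area_mm2, density)
```

The two overlapping discs form a single connected blob after thresholding; the watershed step is what separates them into two nuclei, and the area filter discards the speck. On real slides the thresholds and area limits would need to be tuned to the stain and scanner.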

Fig. 1

Representative images of varying nuclei densities

Data analysis

Demographic characteristics of the OA and RA patients are reported as frequencies, means, standard deviations (SD), medians, and interquartile ranges (IQR). Chi-square tests were used to compare the fourteen pathologist-graded histology scores between OA and RA patients. Logistic regression models were fitted with OA vs RA as the outcome, adjusting for fibrosis and mast cell scores with lymphocytic infiltrates.
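As an illustration of the chi-square comparison, the sketch below tests whether the distribution of one ordinal histology score differs between the two groups. The study itself used Stata; SciPy is used here only for illustration. The counts are hypothetical, invented for the example, and are not study data.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts of patients per score grade (0, 1, 2, 3) for one feature.
# Rows: OA, RA. These numbers are made up for illustration only.
table = np.array([
    [60, 50, 30, 7],
    [5, 10, 20, 25],
])

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3g}")
```

The degrees of freedom equal (rows - 1) × (columns - 1), so a two-group comparison of a four-level score has 3 degrees of freedom.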

Supervised machine learning analysis

A supervised machine learning model was built to classify OA vs RA samples using random forests (Fig. 2). The model inputs were either all fourteen pathologist scores, the computer vision score alone, or both sets of scores combined. Models were selected according to the area under the receiver operating characteristic curve (AUC). The random forest hyperparameters we tuned were the number of trees and the depth of each tree, which were optimized with a nested 5-fold cross-validation process (5-fold for the outer loop and 5-fold for the inner loop) [17] from candidate values [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200] and [5, 6, 8, 10, 12, 14, 16, 18, 20], respectively. The outer loop separated the data into 5 equal folds with stratified partitioning. In each iteration, one fold was used for testing and the remaining 4 folds for training. A second 5-fold cross-validation was then performed on the training set to estimate the optimal model hyperparameters. The final results were reported using macro-AUC and micro-AUC on the testing data. For micro-AUC, we computed the AUC of each fold and reported the average AUC and standard deviation (SD). For macro-AUC, we concatenated the test-fold predictions from all folds and computed a single AUC [18]. Such a nested cross-validation process helps obtain a robust estimate of the model’s generalization performance [17].
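A nested cross-validation of this kind can be sketched with scikit-learn as below. This is an illustration under stated assumptions rather than the study code: the data are synthetic (generated with `make_classification`), the hyperparameter grid is shortened so the example runs quickly, and the pooled AUC reflects one reading of the macro-AUC description, namely that test-fold predictions are pooled before a single AUC is computed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in for 207 samples with 15 features (assumption, not study data).
X, y = make_classification(n_samples=207, n_features=15, n_informative=5,
                           weights=[0.71], random_state=0)

# Shortened grid for speed; the text lists 20 tree counts and 9 depths.
param_grid = {"n_estimators": [10, 50], "max_depth": [5, 10]}

outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

fold_aucs = []
pooled_scores = np.zeros(len(y))
for train_idx, test_idx in outer.split(X, y):
    # Inner loop: pick hyperparameters using the training folds only.
    search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                          scoring="roc_auc", cv=inner)
    search.fit(X[train_idx], y[train_idx])

    # Outer loop: score the held-out fold with the refitted best model.
    scores = search.predict_proba(X[test_idx])[:, 1]
    pooled_scores[test_idx] = scores
    fold_aucs.append(roc_auc_score(y[test_idx], scores))

# Per-fold average and SD (called micro-AUC in the text).
per_fold_mean, per_fold_sd = np.mean(fold_aucs), np.std(fold_aucs)
# Single AUC over pooled held-out predictions (assumed reading of macro-AUC).
pooled_auc = roc_auc_score(y, pooled_scores)
print(f"per-fold AUC {per_fold_mean:.3f} (SD {per_fold_sd:.3f}), pooled AUC {pooled_auc:.3f}")
```

Because hyperparameters are chosen inside each outer training set, the held-out fold never influences model selection, which is what keeps the outer AUC an honest estimate.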

Fig. 2

Overview of the analysis pipeline. OA osteoarthritis, RA rheumatoid arthritis, AUC area under receiver operating characteristic curves. Created with BioRender.com

Additionally, to determine the discriminative power of each individual pathology feature in distinguishing OA vs RA, we treated the feature values themselves as prediction scores for generating the receiver operating characteristic (ROC) curve, based on which the AUC value was calculated. Then, to determine the optimal threshold for a given feature to distinguish OA vs RA, Youden’s J statistic was calculated to obtain the optimal point on the ROC curve, the optimal threshold, sensitivity, and specificity [19]. Finally, feature importance was calculated for the model combining all fourteen pathologist scores and computer vision-generated cell density.
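The single-feature threshold search can be sketched as follows: the raw feature values serve as prediction scores, and the cutoff maximizing Youden's J = sensitivity + specificity - 1 is selected. The helper name `youden_threshold` and the example scores are hypothetical, and the sketch assumes that higher scores point toward RA (coded as 1).

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve


def youden_threshold(y_true, feature_values):
    """Return (auc, threshold, sensitivity, specificity) at the maximum of Youden's J."""
    fpr, tpr, thresholds = roc_curve(y_true, feature_values)
    j = tpr - fpr  # equals sensitivity + specificity - 1
    best = int(np.argmax(j))
    auc = roc_auc_score(y_true, feature_values)
    return auc, thresholds[best], tpr[best], 1.0 - fpr[best]


# Hypothetical ordinal scores (0 to 3) for one feature; 1 = RA, 0 = OA.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0, 0, 1, 1, 1, 2, 1, 2, 3, 3])

auc, thr, sens, spec = youden_threshold(y_true, scores)
print(f"AUC {auc:.3f}, threshold >= {thr}, sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In scikit-learn, a sample is classified as positive when its score is greater than or equal to the returned threshold. For a feature that is lower in RA, the scores would need to be negated first.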

A p-value less than 0.05 was considered statistically significant. Python 3.6 with Scikit-Learn 0.24.2 was used for the machine learning analysis, Python Scikit-image 0.17.2 was used for the computer vision analysis, and Stata version 14.0 was used for descriptive statistics and logistic regression models [20].
