A clinical–radiomics model based on noncontrast computed tomography to predict hemorrhagic transformation after stroke by machine learning: a multicenter study

Patient selection

This multicenter retrospective analysis was approved by the institutional review board of our hospital, and the requirement for informed patient consent was waived.

Clinical data and NCCT images were collected from seven hospitals between June 2012 and December 2021. A total of 822 consecutive patients with AIS were initially considered. Patients were included if they (1) underwent IVT in accordance with AIS management guidelines, (2) completed an NCCT examination before IVT, and (3) underwent follow-up MRI or NCCT within 36 h after IVT. Patients with head trauma, primary cerebral hemorrhage or brain tumors, hemorrhagic infarction on admission, insufficient data, or severe artifacts on NCCT images were excluded.

Finally, a total of 517 patients (282 without HT and 235 with HT) were enrolled. The dataset from six hospitals (the First Affiliated Hospital of Chongqing Medical University, Chongqing General Hospital, Haikou Affiliated Hospital of Central South University Xiangya School of Medicine, the Second People’s Hospital of Hunan Province/Brain Hospital of Hunan Province, the First Affiliated Hospital of Hainan Medical University, and Changsha Central Hospital (the Affiliated Changsha Central Hospital, Hengyang Medical School, University of South China)) was randomly divided into a training cohort (n = 355) and an internal validation cohort (n = 90). Data from the seventh hospital (People's Hospital of Yubei District of Chongqing City), comprising 33 patients with HT and 39 patients without HT, were reserved as an independent external validation cohort. The patient selection flowchart is shown in Fig. 1.

Fig. 1 Flowchart of patient selection (IVT, intravenous thrombolysis; HT, hemorrhagic transformation)

Obtaining clinical data

Clinical data (demographic characteristics and laboratory results) were extracted from the electronic medical record system. These comprised admission measurements (blood pressure, blood glucose, and blood lipid levels), the initial National Institutes of Health Stroke Scale (NIHSS) score, onset-to-CT time, medical history (smoking (smoking index), drinking (drinking index), previous stroke, diabetes mellitus, and atrial fibrillation), and the Trial of ORG 10172 in Acute Stroke Treatment (TOAST) classification of stroke etiology.

Imaging acquisition

The CT scanner models and scanning parameters used at the seven institutions are presented in Additional file 1: Table S1.

Reference standard

HT was determined based on the European Co-operative Acute Stroke Study-II trial [26] as high-density lesions on CT images, classified as hemorrhagic infarction (HI) or parenchymal hemorrhage (PH). Two neuroradiology staff members (X.H. and L.B.Y., directors with 10 years of experience in neuroradiology), blinded to the patient outcome, independently evaluated HT on follow-up NCCT or MRI obtained within 36 h after IVT for all training and testing datasets. Any discrepancy was resolved by consensus. HT was differentiated from contrast agent extravasation by comparison with prior CT or MRI images, and the determination was confirmed by examinations performed 2–7 days after treatment.

Data preprocessing

Clinical data were processed using Z-score normalization after missing values were imputed with the K-nearest neighbor (KNN) method. The NCCT images were normalized as follows: (a) every NCCT image was resampled to a uniform voxel size of 1.0 × 1.0 × 1.0 mm³; (b) the intensity of every NCCT image was normalized by gray-level discretization with a fixed number of bins (256). These two steps were intended to minimize potential effects of different scanners and scanning parameters. In addition, the NCCT images were displayed with a fixed head window (window level = 50 Hounsfield units (HU); window width = 110 HU) to reduce variability during manual lesion delineation.
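A minimal sketch of this preprocessing pipeline is given below, assuming scikit-learn for the clinical data and SimpleITK for the NCCT images; the file paths and column handling are illustrative, not taken from the study.

```python
import numpy as np
import pandas as pd
import SimpleITK as sitk
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler

# --- Clinical data: KNN imputation of missing values, then Z-score scaling ---
clinical = pd.read_csv("clinical_training.csv")          # hypothetical file
numeric_cols = clinical.select_dtypes("number").columns
clinical[numeric_cols] = KNNImputer(n_neighbors=5).fit_transform(clinical[numeric_cols])
clinical[numeric_cols] = StandardScaler().fit_transform(clinical[numeric_cols])

# --- NCCT image: resample to 1.0 x 1.0 x 1.0 mm voxels ---
def resample_to_isotropic(image, new_spacing=(1.0, 1.0, 1.0)):
    old_spacing, old_size = image.GetSpacing(), image.GetSize()
    new_size = [int(round(osz * osp / nsp))
                for osz, osp, nsp in zip(old_size, old_spacing, new_spacing)]
    return sitk.Resample(image, new_size, sitk.Transform(), sitk.sitkLinear,
                         image.GetOrigin(), new_spacing, image.GetDirection(),
                         0.0, image.GetPixelID())

img = resample_to_isotropic(sitk.ReadImage("ncct.nii.gz"))  # hypothetical path

# --- Fixed head window (level 50 HU, width 110 HU), then 256 gray-level bins ---
arr = sitk.GetArrayFromImage(img).astype(np.float32)
low, high = 50 - 110 / 2, 50 + 110 / 2                       # window: [-5, 105] HU
arr = np.clip(arr, low, high)
arr = np.floor((arr - low) / (high - low) * 255).astype(np.uint8)
```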

Radiomics analysis

The region of interest (ROI) of the cerebral infarct was manually delineated slice by slice along its perimeter on the axial NCCT images using 3D-Slicer software. If the lesion border was not clearly visible on the NCCT image, diffusion-weighted images acquired within 6 h were used to guide the delineation.

To ensure the reproducibility of the radiomics features, intra- and interobserver intraclass correlation coefficients (ICCs) were computed using ROIs from 20 randomly selected patients. The intraobserver ICC was calculated by comparing the features of ROIs delineated twice by radiologist 1, one month apart; the interobserver ICC was calculated by comparing the features of ROIs delineated by radiologists 1 and 2. Features with both ICCs ≥ 0.95 were considered reliable and entered into the subsequent analysis (Additional file 1: Figure S1).
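The reproducibility screening can be sketched as follows, assuming the pingouin package for ICC computation; the feature tables, their alignment by patient, and the ICC(3,1) form are assumptions for illustration only.

```python
import pandas as pd
import pingouin as pg

def feature_icc(reader_a: pd.DataFrame, reader_b: pd.DataFrame, feature: str) -> float:
    """ICC for one feature between two segmentation sets of the same patients.
    Both DataFrames are assumed to be indexed by patient in the same order."""
    long = pd.DataFrame({
        "patient": list(reader_a.index) * 2,
        "rater":   ["A"] * len(reader_a) + ["B"] * len(reader_b),
        "value":   list(reader_a[feature]) + list(reader_b[feature]),
    })
    icc = pg.intraclass_corr(data=long, targets="patient", raters="rater", ratings="value")
    return icc.loc[icc["Type"] == "ICC3", "ICC"].item()   # two-way mixed, single rater

def select_stable_features(r1_first, r1_second, r2, threshold=0.95):
    """Keep features whose intra- and inter-observer ICCs are both >= threshold."""
    return [f for f in r1_first.columns
            if feature_icc(r1_first, r1_second, f) >= threshold
            and feature_icc(r1_first, r2, f) >= threshold]
```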

By applying the ROI masks, radiomics features were extracted with the 3D-Slicer package (version 4.13.0) (https://www.slicer.org/). Eight categories of radiomics features were obtained: first order; shape; shape 2D; gray-level co-occurrence matrix (GLCM); gray-level run length matrix (GLRLM); gray-level size zone matrix (GLSZM); neighboring gray-tone difference matrix (NGTDM); and gray-level dependence matrix (GLDM).
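The study used the 3D-Slicer radiomics tooling; a scripted equivalent can be sketched with the PyRadiomics engine, which underlies the Slicer radiomics extension. Paths, the case list, and the exact settings below are illustrative assumptions.

```python
import pandas as pd
from radiomics import featureextractor

settings = {
    "binCount": 256,                          # fixed number of gray-level bins
    "resampledPixelSpacing": [1.0, 1.0, 1.0], # isotropic 1 mm resampling
    "interpolator": "sitkLinear",
}
extractor = featureextractor.RadiomicsFeatureExtractor(**settings)
extractor.disableAllFeatures()
# 3D feature classes; the shape-2D class additionally requires the force2D setting
for cls in ["firstorder", "shape", "glcm", "glrlm", "glszm", "ngtdm", "gldm"]:
    extractor.enableFeatureClassByName(cls)

rows = []
cases = [("001", "ncct_001.nii.gz", "roi_001.nii.gz")]    # hypothetical file list
for pid, img_path, roi_path in cases:
    result = extractor.execute(img_path, roi_path)
    # keep the numeric feature values, drop the diagnostic metadata entries
    rows.append({"patient": pid,
                 **{k: float(v) for k, v in result.items()
                    if not k.startswith("diagnostics_")}})
features = pd.DataFrame(rows)
```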

Finding the best method to select features

First, for the clinical data, a t-test was used to identify characteristics that differed significantly between the HT and non-HT groups in the training cohort. Five common dimensionality reduction methods (Least Absolute Shrinkage and Selection Operator (LASSO), Select From Model, Recursive Feature Elimination with Cross-Validation (RFECV), Recursive Feature Elimination (RFE), and Logistic Regression (LR)) were then compared by tenfold cross-validation in the training cohort, and the best-performing method was used to select the most important clinical features (the blue part of Fig. 2).
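The univariate screening step might look like the following sketch, assuming a pandas DataFrame of the training cohort with a binary HT label; the column and label names are illustrative.

```python
import pandas as pd
from scipy.stats import ttest_ind

def screen_by_ttest(train: pd.DataFrame, label: str = "HT", alpha: float = 0.05):
    """Return the numeric clinical variables differing significantly between groups."""
    ht, non_ht = train[train[label] == 1], train[train[label] == 0]
    keep = []
    for col in train.select_dtypes("number").columns:
        if col == label:
            continue
        _, p = ttest_ind(ht[col], non_ht[col], nan_policy="omit")
        if p < alpha:
            keep.append(col)
    return keep
```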

Fig. 2 Flowchart of selection of the most important features (the numbers in parentheses are the numbers of features; ICC, intraclass correlation coefficient; LASSO, Least Absolute Shrinkage and Selection Operator; RFECV, Recursive Feature Elimination with Cross-Validation; RFE, Recursive Feature Elimination; LR, Logistic Regression; Linear SVC, Linear Support Vector Classification; SGD, Stochastic Gradient Descent; SVM, Support Vector Machine; RF, Random Forest; XGB, eXtreme Gradient Boosting)

Second, for the radiomics features, non-reproducible features below the ICC threshold were removed from the 1037 radiomics features obtained after delineation, leaving 778 features. The best of five popular methods (LASSO, Linear Support Vector Classification, RFECV, RFE, and a tree-based model) was chosen by tenfold cross-validation in the training cohort to select the most significant radiomics features (the red part of Fig. 2).
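The comparison of candidate feature-selection methods by tenfold cross-validation could be sketched as below; the scikit-learn estimators, hyperparameters, and the logistic-regression scoring model are assumptions chosen to mirror the methods named above.

```python
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE, RFECV, SelectFromModel
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

def compare_selectors(X, y, cv=10):
    """Score each candidate selector (followed by a logistic-regression classifier)
    by ten-fold cross-validated AUC and return the best-performing method."""
    base = LogisticRegression(max_iter=5000)
    selectors = {
        "LASSO":      SelectFromModel(LassoCV(cv=cv)),
        "LinearSVC":  SelectFromModel(LinearSVC(C=0.1, dual=False, max_iter=5000)),
        "RFECV":      RFECV(LogisticRegression(max_iter=5000), cv=cv),
        "RFE":        RFE(LogisticRegression(max_iter=5000), n_features_to_select=10),
        "Tree-based": SelectFromModel(ExtraTreesClassifier(n_estimators=200)),
    }
    scores = {}
    for name, selector in selectors.items():
        pipe = make_pipeline(StandardScaler(), selector, base)
        scores[name] = cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc").mean()
    return max(scores, key=scores.get), scores
```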

Finding the best ML algorithm to build models

Before modeling, five ML algorithms (eXtreme Gradient Boosting (XGB), Support Vector Machine (SVM), LR, Stochastic Gradient Descent (SGD), and Random Forest (RF)) were compared by tenfold cross-validation in the training cohort (the yellow part of Fig. 2) to identify the best algorithm for developing the HT prediction model.
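A sketch of this algorithm comparison, assuming scikit-learn and xgboost implementations with illustrative default hyperparameters:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from xgboost import XGBClassifier

def pick_best_algorithm(X_train, y_train, cv=10):
    """Compare the five candidate classifiers by ten-fold cross-validated AUC."""
    candidates = {
        "XGB": XGBClassifier(eval_metric="logloss"),
        "SVM": SVC(probability=True),
        "LR":  LogisticRegression(max_iter=5000),
        "SGD": SGDClassifier(loss="log_loss"),
        "RF":  RandomForestClassifier(n_estimators=500),
    }
    aucs = {name: cross_val_score(clf, X_train, y_train, cv=cv, scoring="roc_auc").mean()
            for name, clf in candidates.items()}
    return max(aucs, key=aucs.get), aucs
```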

Building models

Independent clinical predictors of HT (p < 0.05) were obtained with the best dimensionality reduction method described above (Fig. 2) and used to develop a clinical model with the best ML algorithm. The model was then evaluated in the internal and external validation cohorts.

Using the same process as for the clinical model, a radiomics model was constructed in the training cohort from the finally selected radiomics features and was likewise verified in the internal and external validation cohorts.

Finally, the clinical–radiomics model, which combined the selected clinical risk factors and the important radiomics features, was developed in the training cohort and then validated independently in the internal and external validation cohorts.
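A minimal sketch of assembling and validating the combined model, assuming the selected clinical and radiomics feature tables are available as aligned DataFrames and that XGB was the best algorithm; variable names are illustrative.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier   # assuming XGB was the best-performing algorithm

def build_combined_model(clin_train, rad_train, y_train, clin_val, rad_val, y_val):
    """Concatenate the selected clinical and radiomics features, fit the model,
    and report the AUC on a held-out (internal or external) validation cohort."""
    X_train = pd.concat([clin_train, rad_train], axis=1)
    X_val = pd.concat([clin_val, rad_val], axis=1)
    model = XGBClassifier(eval_metric="logloss").fit(X_train, y_train)
    auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
    return model, auc
```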

Model evaluation

To assess each model’s performance, the receiver operating characteristic (ROC) curve was plotted and the area under the curve (AUC) was calculated.

A calibration curve was plotted to assess the calibration of the clinical–radiomics model, i.e., the agreement between predicted probabilities and observed outcomes. Decision curve analysis (DCA) was used to assess the clinical utility of the combined model. The workflow of ROI radiomics analysis and model building is shown in Fig. 3.
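The evaluation step can be sketched as follows: the ROC/AUC and calibration curve use scikit-learn, and the decision-curve net benefit is computed from its standard definition. The study's exact plotting code is not described, so this is illustrative.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import roc_auc_score, roc_curve

def evaluate(y_true, y_prob):
    """Return the AUC, the ROC curve points, and the calibration curve points."""
    fpr, tpr, _ = roc_curve(y_true, y_prob)
    auc = roc_auc_score(y_true, y_prob)
    frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=10)
    return auc, (fpr, tpr), (mean_pred, frac_pos)

def net_benefit(y_true, y_prob, thresholds=np.linspace(0.01, 0.99, 99)):
    """Decision-curve analysis: net benefit of the model at each threshold probability."""
    y_true, n = np.asarray(y_true), len(y_true)
    nb = []
    for pt in thresholds:
        pred = y_prob >= pt
        tp = np.sum(pred & (y_true == 1))
        fp = np.sum(pred & (y_true == 0))
        nb.append(tp / n - fp / n * pt / (1 - pt))
    return thresholds, np.array(nb)
```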

Fig. 3 Workflow of the clinical–radiomics model for predicting HT after IVT (IVT, intravenous thrombolysis; HT, hemorrhagic transformation; DCA, decision curve analysis)

Statistical analysis

All statistical analyses were performed with R software (version 4.1.3) (https://www.r-project.org/). Normally distributed data were presented as mean ± standard deviation, and qualitative data were presented as numbers and percentages. The chi-squared test, two-sample t-test, or Mann–Whitney U test was used to compare clinical characteristics, as appropriate. The DeLong test was used to compare the AUCs of the different models. A two-sided p value < 0.05 was considered statistically significant for all analyses.
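The study performed these tests in R; for illustration, an equivalent group comparison in Python with SciPy might look like the sketch below (the DeLong AUC comparison, available for example via R's pROC package, is not reproduced here). The normality heuristic and column names are assumptions.

```python
import pandas as pd
from scipy.stats import chi2_contingency, mannwhitneyu, ttest_ind

def compare_groups(df: pd.DataFrame, var: str, label: str = "HT") -> float:
    """Return the p value for one clinical variable compared between HT groups."""
    g1, g0 = df[df[label] == 1][var], df[df[label] == 0][var]
    if df[var].dtype == object or df[var].nunique() <= 2:
        # categorical variable: chi-squared test on the contingency table
        _, p, _, _ = chi2_contingency(pd.crosstab(df[var], df[label]))
    elif abs(g1.skew()) < 1 and abs(g0.skew()) < 1:
        # roughly symmetric distributions: two-sample t-test (illustrative normality check)
        _, p = ttest_ind(g1, g0)
    else:
        _, p = mannwhitneyu(g1, g0)
    return p
```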
