A deep learning‐based model for prediction of hemorrhagic transformation after stroke

1 INTRODUCTION

Acute ischemic stroke (AIS) carries high mortality and disability rates and poses a serious threat to human life [1]. Endovascular thrombectomy (EVT) has been proven to benefit AIS patients with large vessel occlusion (LVO) [2-4]. However, hemorrhagic transformation (HT), a serious complication, often occurs after EVT [5]. Recent studies have shown that the incidence of HT after EVT can be as high as 31.9% [6] and that HT is associated with increased morbidity and mortality [7]. Predicting this complication could help to adjust periprocedural management, especially by anticipating the need for intensive care among patients at high risk of HT.

Neuroimaging, especially magnetic resonance imaging (MRI), has been shown to be effective in identifying tissue at risk of infarction [8]. Additionally, diffusion-weighted imaging (DWI) and perfusion-weighted imaging (PWI) can detect tissue ischemia earlier than other conventional neuroimaging modalities in experimental and clinical settings, which may inform the assessment of HT risk [9]. Previous studies have reported that a large initial lesion volume on DWI, a large area of perfusion loss, and regions with very low cerebral blood volume (CBV) are important for predicting HT [6, 10-12]. However, as a single measure, none of these MRI-based indices has been shown to reliably identify tissue at risk of HT prior to EVT. In addition, automated MRI PWI–DWI mismatch estimates may differ significantly in individual patients when different software packages are used [13]. Thus, because of the variety of risk factors, postprocessing software, settings, and chosen parameters, it remains difficult to predict HT after EVT early in AIS patients within the clinical workflow.

Deep learning (DL) is a form of representation learning, in which a machine is fed raw data and develops the representations it needs for pattern recognition, and it is composed of multiple layers of representations [14]. Although DL algorithms require a large amount of data, they have exceeded the capabilities of classical statistical machine-learning (ML) techniques on specific imaging tasks such as multiclass classification [14, 15]. Convolutional neural networks (CNNs), a type of DL algorithm, have become central to this field. CNN methods take image data as input and iteratively transform it through a series of convolutional and nonlinear operations until the original raw data matrix becomes a probability distribution over potential image classes [16]. CNNs have shown strong performance in the prediction of tissue outcome [17] and the detection of penumbral tissue in AIS [18]. They also have the advantage of incorporating multiple input biomarkers and spatial information simultaneously, and of modeling complex interplays between the input images. Overall, CNN outputs are probabilities, which provide a much-needed measure of certainty. CNNs may therefore be suitable for predicting HT in AIS patients receiving EVT, an application that has not yet been reported.

In this study, we developed and validated DL models to automatically predict HT in AIS patients receiving EVT by using multiple parameters derived from DWI and PWI images. We hypothesized that the CNN model could provide predictive information before therapy to assist periprocedural management in AIS patients undergoing EVT.

2 MATERIALS AND METHODS

2.1 Patient selection and clinical data

From January 2016 to October 2019, data were collected from Nanjing First Hospital and the Affiliated Jiangning Hospital of Nanjing Medical University. Patients with AIS were included in this study if (1) they had a first-ever AIS and presented within 24 h of onset, (2) DWI and PWI examinations were performed before EVT therapy, (3) they received EVT therapy or bridging therapy (both intravenous thrombolysis [IVT] and EVT) according to the guidelines for managing AIS, and (4) follow-up MRI or noncontrast CT was performed within 24 h after EVT therapy. Patients with previous intracranial hemorrhage, brain surgery, large territorial lesion, or subarachnoid hemorrhage after EVT therapy were excluded. All patients in this study provided written informed consent before examination and treatment. The study was approved by the local ethics committee of Nanjing Medical University. In total, 338 patients from Nanjing First Hospital (data set 1) and 54 patients from the Affiliated Jiangning Hospital of Nanjing Medical University (data set 2) were included. Data set 1 was used to train and test the models, and data set 2 was reserved as an independent external validation set. The flowchart of this study is shown in Figures 1 and 2.

FIGURE 1. The demographic data of the cohorts

FIGURE 2. The flowchart of the study. (1) Data acquisition: all patients with suspected acute stroke underwent head CT to exclude hemorrhage. Clinical data were collected and normalized by the min–max normalization method. Patients who met the criteria for intravenous thrombolysis (IVT) received IVT after CT scanning, followed immediately by MRI. Patients who met the criteria for endovascular thrombectomy (EVT) then underwent EVT immediately. All patients underwent CT within 24 h after EVT. According to the follow-up CT, the patients were divided into a hemorrhagic transformation (HT) group and a no HT group. (2) Model training: data set 1 was split into training (75%) and testing (25%) subsets by stratified random sampling, and data set 2 was used for validation. Models were trained and validated with two forms of labels (VOI data sets and slice data sets), and the final result was then evaluated.

HT was independently reviewed on follow-up noncontrast CT or cranial MRI within 24 h after EVT therapy by two neuroradiology staff (Y-CC, attending doctor with 4 years of experience in neuroradiology, and XY, director with 10 years of experience in neuroradiology). When the two readers disagreed, a consensus was reached. HT was categorized according to the Heidelberg Bleeding classification (HBC) [19] and the European Collaborative Acute Stroke Study (ECASS) classification [20] (see Table 1). HT and contrast agent extravasation were distinguished by comparison with the previous CT or MRI scans, and the result was confirmed on CT or MRI scans at 2–7 days after the therapy.

TABLE 1. Overview of bleeding events, categorized with HBC and the ECASS classification, according to anatomical, descriptive, and clinical features

| HBC | ECASS | Patients with HT (data set 1; n = 88) | Patients with HT (data set 2; n = 15) | Description |
|---|---|---|---|---|
| 1a | HI1 | 4 (4.55%) | 0 | Scattered small petechia, no mass effect |
| 1b | HI2 | 7 (7.95%) | 1 (6.67%) | Confluent petechia, no mass effect |
| 1c | PH1 | 13 (14.77%) | 2 (13.33%) | Hematoma within infarcted tissue, occupying <30%, no substantive mass effect |
| 2 | PH2 | 34 (38.64%) | 6 (40.00%) | Hematoma occupying 30% or more of the infarcted tissue, with obvious mass effect |
| 3a | – | 2 (2.27%) | 0 | Parenchymal hematoma remote from infarcted brain tissue |
| 3b | – | 0 | 0 | Intraventricular hemorrhage |
| 3c | – | 27 (30.68%) | 6 (40.00%) | Subarachnoid hemorrhage |
| 3d | – | 1 (1.14%) | 0 | Subdural hemorrhage |

Abbreviations: ECASS, European Collaborative Acute Stroke Study; HBC, Heidelberg Bleeding classification; HI, hemorrhagic infarction; HT, hemorrhagic transformation; PH, parenchymatous hematoma.

2.2 MRI protocols and analysis

MRI on admission and follow-up MRI were performed on a 3.0 Tesla MRI scanner (Ingenia, Philips Medical Systems) with an eight-channel receiver array head coil. The DWI images were acquired using a spin-echo (SE) sequence with the following parameters: repetition time (TR), 2501 ms; echo time (TE), 98 ms; acquisition matrix, 152 × 122; three directions; field of view (FOV), 230 × 230 mm; flip angle (FA), 90°; slices, 18; slice thickness, 6 mm; intersection gap, 1.3 mm; and b values, 0 and 1000 s/mm². DSC-PWI images were acquired using a T2*-weighted gradient recalled echo (T2*GRE) sequence with the following parameters: TR, 2000 ms; TE, 30 ms; acquisition matrix, 96 × 93; FOV, 224 × 224 mm; FA, 90°; slice thickness, 4 mm; and scan time, 88 s. Fifty phases were acquired, with 20 images per phase. During dynamic acquisition, a dose of 0.1 mmol/kg of contrast agent (Magnevist, Bayer Schering Pharma, Germany) was injected at a rate of 4 mL/s.

The PWI data were analyzed on a Philips advanced workstation. The arterial input function (AIF) was selected by manually identifying the M2 segment of the middle cerebral artery (MCA) ipsilateral to the acute infarction. The cerebral blood flow (CBF), CBV, time to peak (TTP), and mean transit time (MTT) maps were generated by circular singular value decomposition of the concentration–time curve.
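For orientation, the standard indicator-dilution relations behind these four maps can be written as follows. This is the textbook formulation, not necessarily the exact implementation of the vendor workstation. The symbols are assumptions introduced here: $C_t$ is the tissue concentration–time curve, $C_a$ is the AIF, and $R$ is the tissue residue function with $R(0) = 1$. Density and hematocrit correction constants are omitted for brevity.

```latex
\begin{align}
C_t(t) &= \mathrm{CBF}\int_0^{t} C_a(\tau)\,R(t-\tau)\,d\tau \\
\mathrm{CBV} &= \frac{\int_0^{\infty} C_t(t)\,dt}{\int_0^{\infty} C_a(t)\,dt} \\
\mathrm{MTT} &= \frac{\mathrm{CBV}}{\mathrm{CBF}} \\
\mathrm{TTP} &= \arg\max_{t}\; C_t(t)
\end{align}
```

Deconvolution by singular value decomposition estimates the product $\mathrm{CBF}\cdot R(t)$ from the first relation, and CBF is then taken as its maximum. The circular variant is used to reduce sensitivity to delay between the AIF and the tissue curve.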

2.3 Preprocessing

Because DICOM images from different modalities can have different pixel intensity ranges, we constrained the pixel intensities of DWI to [−2200, 2800] and of PWI to [0, 4095] so that the intensity range was consistent within each modality. Lesions were displayed best, with consistent contrast and brightness across patients, when intensities were mapped to a display range of 0–255. We therefore linearly compressed the pixel intensity range of all DICOMs into [0, 255]. We then applied histogram equalization with the OpenCV library to enhance contrast, and the images were saved as PNG files.
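The intensity steps above can be illustrated with a minimal sketch. It uses only the Python standard library so that it runs anywhere. It assumes that values outside the stated range are clipped, and it applies the textbook global histogram equalization formula in place of the OpenCV call used in the study. The function names `window_to_uint8` and `equalize_histogram` are hypothetical.

```python
def window_to_uint8(pixels, lo, hi):
    """Clip each pixel to [lo, hi] (assumed) and map linearly to 0..255."""
    span = hi - lo
    return [[round((min(max(p, lo), hi) - lo) * 255 / span) for p in row]
            for row in pixels]


def equalize_histogram(image):
    """Global histogram equalization of an 8-bit image given as rows of ints."""
    flat = [p for row in image for p in row]
    hist = [0] * 256
    for p in flat:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(flat)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to equalize.
        return [row[:] for row in image]

    # Look-up table that spreads the occupied intensities over 0..255.
    lut = [max(0, round((c - cdf_min) * 255 / (total - cdf_min))) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# DWI range from the text; 9999 lies outside the range and is clipped.
dwi_8bit = window_to_uint8([[-2200, 2800, 9999]], -2200, 2800)
enhanced = equalize_histogram([[10, 10, 10], [20, 20, 30]])
```

In this toy example the three occupied gray levels (10, 20, 30) are stretched to 0, 170, and 255, which shows how equalization increases contrast in a narrow-range image.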

Furthermore, clinical data (age, gender, NIHSS score on admission, time from onset to MRI, time from MRI to EVT therapy, and history of hypertension, diabetes mellitus, hyperlipidemia, homocysteine levels, and atrial fibrillation) were collected (Table 2). Each patient's clinical data were entered into the CNN model in text form. Age was normalized. Sex, smoking, alcohol drinking, diabetes mellitus, hypertension, atrial fibrillation, hyperlipidemia, and homocysteine were encoded using a one-hot method. The NIHSS score was encoded as a one-dimensional vector. In total, a 10-dimensional feature vector was obtained for each patient.
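One way to arrive at a 10-dimensional vector from these variables is sketched below. The layout is an assumption: one min–max scaled value for age, one 0/1 indicator for each of the eight categorical variables, and one min–max scaled value for the NIHSS score. The field names, the function `encode_patient`, and the ranges in the example are hypothetical. In practice the ranges would be taken from the training set only.

```python
# Hypothetical field names for the eight categorical variables.
BINARY_FIELDS = [
    "male", "smoking", "alcohol", "diabetes", "hypertension",
    "atrial_fibrillation", "hyperlipidemia", "homocysteine",
]


def min_max(value, lo, hi):
    """Scale value into [0, 1] given the range observed in the training set."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)


def encode_patient(record, age_range, nihss_range):
    """Return a 10-dimensional clinical feature vector (assumed layout)."""
    vector = [min_max(record["age"], *age_range)]
    vector += [1.0 if record[name] else 0.0 for name in BINARY_FIELDS]
    vector.append(min_max(record["nihss"], *nihss_range))
    return vector


patient = {"age": 70, "nihss": 21, "male": True, "smoking": False,
           "alcohol": False, "diabetes": True, "hypertension": True,
           "atrial_fibrillation": False, "hyperlipidemia": False,
           "homocysteine": False}
features = encode_patient(patient, age_range=(40, 100), nihss_range=(0, 42))
```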

TABLE 2. Comparison of no HT and HT group in acute stroke patients after EVT

| Variable | Data set 1: No HT (n = 250) | Data set 1: HT (n = 88) | p value | Data set 2: No HT (n = 39) | Data set 2: HT (n = 15) | p value |
|---|---|---|---|---|---|---|
| Gender (male), n (%) | 165 (66.00%) | 48 (55.55%) | 0.072 | 23 (58.97%) | 8 (53.33%) | 0.765 |
| Age (years), mean ± SD | 65.78 ± 9.68 | 70.93 ± 11.29 | <0.001 | 64.22 ± 8.57 | 71.38 ± 10.19 | <0.001 |
| Time from MRI to onset (min), mean ± SD | 202.72 ± 77.36 | 218.17 ± 114.19 | 0.159 | 217.19 ± 89.39 | 220.03 ± 102.41 | 0.211 |
| Time from EVT to onset (min), mean ± SD | 285.25 ± 137.27 | 314.47 ± 104.56 | 0.070 | 301.36 ± 124.71 | 319.07 ± 128.11 | 0.127 |
| Smoking, n (%) | 57 (22.80%) | 16 (18.18%) | 0.452 | 7 (17.95%) | 2 (13.33%) | 1.000 |
| Alcohol drinking, n (%) | 25 (10.00%) | 15 (17.05%) | 0.086 | 4 (10.26%) | 2 (13.33%) | 1.000 |
| Diabetes mellitus, n (%) | 60 (24.00%) | 25 (28.41%) | 0.475 | 8 (20.51%) | 4 (26.67%) | 0.719 |
| Hypertension, n (%) | 197 (78.80%) | 72 (81.82%) | 0.645 | 31 (79.49%) | 12 (80.00%) | 1.000 |
| Atrial fibrillation, n (%) | 102 (40.80%) | 46 (52.27%) | 0.080 | 14 (35.90%) | 7 (46.67%) | 0.541 |
| Hyperlipidemia, n (%) | 22 (8.80%) | 9 (10.23%) | 0.672 | 3 (7.69%) | 2 (13.33%) | 0.610 |
| Homocysteine, n (%) | 18 (7.20%) | 7 (7.95%) | 0.815 | 3 (7.69%) | 2 (13.33%) | 0.610 |
| NIHSS on admission, mean ± SD | 10.91 ± 3.87 | 13.90 ± 2.46 | <0.001 | 11.26 ± 4.41 | 14.79 ± 6.65 | <0.001 |
| Reperfusion therapy, n (%) | | | 0.362 | | | 1.000 |
| EVT | 195 (78.00%) | 73 (82.95%) | | 30 (76.92%) | 12 (80.00%) | |
| IVT + EVT | 55 (22.00%) | 15 (17.05%) | | 9 (23.08%) | 3 (20.00%) | |

Abbreviations: EVT, endovascular thrombectomy; HT, hemorrhagic transformation; IVT, intravenous thrombolysis; NIHSS, National Institutes of Health Stroke Scale.

All DWI and PWI images of AIS patients were exported in DICOM format and then converted into NII format by using MRIcron software (https://www.nitrc.org/projects/mricron). The high-intensity infarction area on DWI and the abnormal perfusion area on CBF, CBV, MTT, and TTP were drawn manually as volumes of interest (VOIs) using ITK-SNAP software (http://www.itksnap.org/pmwiki/pmwiki.php). All VOIs were drawn in consensus by two board-certified neuroradiologists (LJ, 9 years of experience in neuroradiology; H-YC, 8 years of experience in neuroradiology) who were blinded to the clinical data. We developed and evaluated the HT prediction model in two forms. First, the VOIs of DWI, CBF, CBV, MTT, and TTP were analyzed as labels individually or together (VOI data set). Then, the axial images of the VOIs (DWI, CBF, CBV, MTT, and TTP) were analyzed as labels individually or together (slice data set).

2.4 Deep neural network architecture and experiments

2.4.1 Single parameter model

To assess the value of clinical features for predicting HT in AIS patients after EVT, a clinical model was built that consisted of three fully connected (FC) layers and one sigmoid layer. The network used for the radiomics model was Inception V3 [21], which offers a compact end-to-end CNN structure that maintains high-resolution multiscale features. The schematic diagram of the basic CNN model is shown in Figure 3. The feature vectors (1024 × 1 × N) extracted from all the images of a given single parameter (N is the number of images for one patient) were merged by a max pooling layer into a one-dimensional vector (1024 × 1), which formed the input tensor of the patient-level model. This layer produced a fixed-length representation regardless of the number of images per patient. At the end of the model, two FC layers were added (the first with 1024 nodes and the second with two nodes, for HT and no HT), and the HT risk was predicted with the Softmax function.
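The pooling step can be illustrated with a short sketch. It assumes element-wise max pooling across the N per-image feature vectors, which is what lets patients with different numbers of images share one patient-level input size. The function name `max_pool_over_images` is hypothetical, and the example uses 4-dimensional vectors instead of 1024 for readability.

```python
def max_pool_over_images(features):
    """Merge N per-image feature vectors into one vector by element-wise max.

    `features` is a list of N equal-length lists (1024 values each in the
    paper; any length works here).
    """
    if not features:
        raise ValueError("at least one image feature vector is required")
    return [max(column) for column in zip(*features)]


# Two patients with different numbers of images yield vectors of equal length.
patient_a = [[0.1, 0.9, 0.0, 0.3],
             [0.4, 0.2, 0.5, 0.3],
             [0.2, 0.1, 0.7, 0.8]]
patient_b = [[0.6, 0.0, 0.1, 0.2]]

pooled_a = max_pool_over_images(patient_a)
pooled_b = max_pool_over_images(patient_b)
```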

FIGURE 3. The schematic diagram of the basic CNN model

2.4.2 Multiparameter model

Based on the single parameter models (DWI, CBF, CBV, MTT, and TTP), a multiparameter model was established by jointly considering each patient's multiparameter image features (Figure 4A). The multiparameter models were ensembles of single parameter models and comprised four types: the multiparameter PWI model (combining the two PWI parameters with the best predictive performance); the clinical + PWI parameters model (combining clinical data and these two PWI parameters); the DWI + PWI parameters model (combining the DWI parameter and these two PWI parameters); and the clinical + MRI parameters model (combining clinical data, the DWI parameter, and these two PWI parameters). The procedure was as follows. First, the separate single-sequence models were trained and optimized individually. All models were CNNs trained on the combined set of training images. We used Inception V3, pretrained on ImageNet, as the convolutional neural network architecture. The features extracted from each pretrained single parameter model were then concatenated into one tensor. Two fully connected layers and a Softmax layer were added after the concatenation layer to classify HT versus no HT. Furthermore, a compound model was developed and validated by incorporating the multiparameter radiomic and clinical features (Figure 4B). The clinical feature tensor was obtained through three fully connected layers. The weights of each pretrained single parameter model and of the clinical model were frozen as the weights of the corresponding channel, and only the subsequent fully connected layers were refined during training. Of note, “MT” represents the multiparameter model of MTT and TTP; “MTC” represents the multiparameter model of MTT, TTP, and clinical data; “DMT” represents the multiparameter model of DWI, MTT, and TTP; and “DMTC” represents the multiparameter model of DWI, MTT, TTP, and clinical data.
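The fusion step described above can be sketched as a forward pass in plain Python. For brevity the sketch uses a single linear layer before the Softmax, whereas the paper uses two fully connected layers. The branch features stand in for the outputs of the frozen single parameter and clinical models. All names, dimensions, and weights are hypothetical.

```python
import math


def concatenate(*feature_vectors):
    """Join the per-branch feature vectors into one fused vector."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


def dense(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output node."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over the class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fusion_head(branch_features, weights, bias):
    """Concatenate frozen branch features, then classify HT vs. no HT."""
    return softmax(dense(concatenate(*branch_features), weights, bias))


# Toy example: two branches with 2 and 1 features, two output classes.
mtt_features, clinical_features = [1.0, 0.0], [0.5]
probabilities = fusion_head([mtt_features, clinical_features],
                            weights=[[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
                            bias=[0.0, 0.0])
```

Because the branch weights are frozen, only `weights` and `bias` of the fusion head would be updated during training in this sketch.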

FIGURE 4. An overview of the patient-level multiparameter model. (A) Patient-level single parameter module: first, intraimage features (1024 × 1) are learned from each of one patient's N images by the image-level model (Inception V3) and concatenated into a two-dimensional array (1024 × N). This array is then merged into a one-dimensional vector by the max pooling layer. Finally, the interimage features are learned through two FC layers. Based on the patient-level single parameter module, the patient-level multiparameter module was developed and validated by considering the patient's multiparameter radiomic features. (B) A concatenation layer joins the multiparameter radiomic features and the clinical characteristics, and two fully connected layers and a Softmax layer are then added.

2.4.3 Training details

Because our data set was imbalanced, the HT group was dynamically oversampled at the patient level during training to balance the HT and no HT groups. The training data set was augmented by adding rotation (rotation range = [−20°, 20°]), two shifts (width shift range = [0, 0.2]; height shift range = [0, 0.2]), and zoom (zoom ratio = 0.2) variants of each image. Volumes were randomly extracted from the preprocessed input and label images for training. This data augmentation was intended to reduce the risk of overfitting.
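Patient-level oversampling can be sketched as follows. The sketch assumes that "dynamic" means the minority class is resampled with replacement at the start of every epoch until both classes are equally represented. The function name and the patient identifiers are hypothetical. The class sizes in the example follow from a 75% stratified split of data set 1 (66 HT and 187 no HT patients).

```python
import random


def oversample_minority(patients, rng):
    """Return one epoch's patient list with both classes equally represented.

    `patients` is a list of (patient_id, label) pairs, label 1 = HT, 0 = no HT.
    Minority patients are drawn with replacement, so calling this again with
    the same `rng` (for example once per epoch) yields different duplicates.
    """
    positives = [p for p in patients if p[1] == 1]
    negatives = [p for p in patients if p[1] == 0]
    if len(positives) < len(negatives):
        minority, majority = positives, negatives
    else:
        minority, majority = negatives, positives
    if not minority:
        return list(patients)

    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    epoch = majority + minority + extra
    rng.shuffle(epoch)
    return epoch


train = [(f"ht{i}", 1) for i in range(66)] + [(f"no{i}", 0) for i in range(187)]
epoch_patients = oversample_minority(train, random.Random(0))
```

Sampling whole patients, not individual images, keeps all images of a patient together, which matches the patient-level design of the model.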

The stochastic gradient descent (SGD) optimizer was used for model training. The learning rate was 0.01, the momentum was 0.9, and the decay rate was 1.0 × 10⁻⁶. The mini-batch size was set to 16. Training was stopped when the validation loss decreased by less than 0.03% within 10 epochs or when 200 epochs were reached. All models were implemented using the Keras framework, and all experiments were performed on a workstation equipped with Intel(R) Xeon(R) E5-2650 v4 CPUs @ 2.20 GHz (two CPUs, 24 cores, two threads/core, 128 GB of memory) and an NVIDIA Tesla M4.
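The stopping rule can be expressed as a small check on the validation loss history. This is one possible reading of the criterion: the best loss over the last 10 epochs is compared with the loss just before that window, and training stops if the relative improvement is below 0.03%. The function name and parameter names are hypothetical.

```python
def should_stop(val_losses, patience=10, min_rel_drop=0.0003, max_epochs=200):
    """Decide whether to stop training given one validation loss per epoch.

    Stops when the epoch limit is reached, or when the best loss over the
    last `patience` epochs improved on the loss just before that window by
    less than `min_rel_drop` (0.0003 = 0.03%).
    """
    epochs = len(val_losses)
    if epochs >= max_epochs:
        return True
    if epochs <= patience:
        return False

    reference = val_losses[-patience - 1]
    best_recent = min(val_losses[-patience:])
    rel_drop = (reference - best_recent) / reference if reference > 0 else 0.0
    return rel_drop < min_rel_drop


plateau = [1.0] * 11
improving = [1.0 - 0.05 * i for i in range(11)]
```

With these inputs, `should_stop(plateau)` is true because the loss has not moved for 10 epochs, while `should_stop(improving)` is false because the loss halved over the same window.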

2.4.4 Performance assessment

Data set 1 was split into training (75%) (no HT: 187 patients; HT: 66 patients) and testing (25%) (no HT: 63 patients; HT: 22 patients) subsets by stratified random sampling, and we ensured that images from the same patient remained in the same split to avoid training and testing on the same patient. The models were then trained and tested with the two forms of labels (VOI and slice image) separately. Model parameter exploration was performed by five-fold cross-validation on the training data set. During the validation phase, all metrics were calculated as the average of the five-fold cross-validation results. The better model structure was then chosen, the model was retrained with all the training data, and the model parameters were saved. To evaluate the performance of the HT classification model, its predictions were compared with the neuroradiologists' conclusions using several metrics, including accuracy (ACC), sensitivity (SEN), specificity (SPC), negative predictive value (NPV), positive predictive value (PPV), receiver-operating characteristic (ROC) curves, and the area under the ROC curve (AUC). Finally, the model with the best performance was chosen and validated in the external validation set (data set 2).
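The reported metrics follow directly from the confusion matrix and the predicted scores. The sketch below uses the standard definitions, with HT as the positive class. The AUC is computed as the probability that a randomly chosen HT patient receives a higher score than a randomly chosen no HT patient, with ties counted as one half. The function names and the toy data are hypothetical and are not taken from the study.

```python
def classification_metrics(y_true, y_pred):
    """ACC, SEN, SPC, PPV, and NPV from binary labels (1 = HT, 0 = no HT)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "ACC": ratio(tp + tn, len(pairs)),
        "SEN": ratio(tp, tp + fn),
        "SPC": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (HT, no HT) pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute the AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.3, 0.2, 0.6]
predictions = [1 if s >= 0.5 else 0 for s in scores]
metrics = classification_metrics(labels, predictions)
auc = roc_auc(labels, scores)
```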

2.5 Statistical analysis

All statistical analyses of clinical data were conducted using commercially available software (SPSS for Windows, version 19.0). Continuous data are shown as the mean ± SD and were analyzed by using an independent samples t-test, whereas categorical variables are presented as absolute and relative frequencies and were analyzed by using the χ² test or Fisher's exact test. p < 0.05 was considered statistically significant. Kappa values were used to determine interrater agreement. The differences between the ROC curves of the testing set and the validation set were evaluated according to DeLong et al. [22].
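As an illustration of the interrater statistic, Cohen's kappa for two readers can be computed as below. This sketch uses the standard unweighted definition and is not the SPSS procedure used in the study. The function name and the toy ratings are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty case list")

    # Observed agreement: fraction of cases with identical labels.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n

    # Chance agreement: product of the raters' marginal label frequencies.
    labels = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(label) / n) * (rater_b.count(label) / n)
                   for label in labels)

    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# 1 = HT, 0 = no HT; the readers disagree on the second case only.
reader_1 = [1, 1, 0, 0]
reader_2 = [1, 0, 0, 0]
kappa = cohens_kappa(reader_1, reader_2)
```

Here the observed agreement is 0.75 and the chance agreement is 0.5, so kappa is 0.5. Values close to 1 indicate near-perfect agreement.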

3 RESULTS

3.1 Clinical and demographic information

Of the 338 patients enrolled in data set 1, 88 (26.04%) had HT after EVT therapy. Of the 54 patients enrolled in data set 2, 15 (27.78%) had HT after EVT therapy. The interobserver agreement for HT was κ = 0.96 (95% CI, 0.93–0.99). Bleeding classification specifics are shown in Table 1. The clinical and demographic information is shown in Table 2. In both data sets, patients in the HT group were older than those in the no HT group (data set 1: 70.93 ± 11.29 vs. 65.78 ± 9.68, p < 0.001; data set 2: 71.38 ± 10.19 vs. 64.22 ± 8.57, p < 0.001) and had a higher NIHSS score on admission (data set 1: 13.90 ± 2.46 vs. 10.91 ± 3.87, p < 0.001; data set 2: 14.79 ± 6.65 vs. 11.26 ± 4.41, p < 0.001). There were no significant differences in gender, smoking, alcohol drinking, diabetes mellitus, hypertension, atrial fibrillation, hyperlipidemia, or homocysteine between the two groups in either data set (p > 0.05).

3.2 Evaluation of single parameter model

The results are listed in Tables 3 and 4. In the VOI data set, the clinical model performed worst among the single parameter models for predicting HT after EVT (AUC = 0.680; ACC = 0.659), and every model based on image features outperformed it. The VOI data set test AUCs of MTT (AUC = 0.933) and TTP (AUC = 0.916) were larger than those of the other parameters (Figure 5A), with ACCs of 0.843 and 0.873, respectively. The slice data set test AUCs of MTT (AUC = 0.945) and TTP (AUC = 0.889) for predicting HT after EVT in AIS patients were also larger than those of the other parameters (Figure 5B), with ACCs of 0.833 and 0.818, respectively. Other metrics such as SEN, SPC, PPV, and NPV in Tables 3 and 4 showed similar trends to the AUCs. For the single parameter models, the results of the slice data set were generally worse than those of the VOI data set.

TABLE 3. Performance comparison of the VOI CNN methods by using different MRI parameters

| Model | Parameters | ACC | SEN | SPC | PPV | NPV | AUC |
|---|---|---|---|---|---|---|---|
| The single parameter model | Clinical | 0.659 | 0.778 | 0.613 | 0.420 | 0.885 | 0.680 |
| | DWI | 0.815 | 0.700 | 0.847 | 0.560 | 0.910 | 0.830 |
| | CBF | 0.773 | 0.9375 | 0.720 | 0.517 | 0.973 | 0.835 |
| | CBV | 0.823 | 0.815 | 0.827 | 0.629 | 0.925 | 0.878 |
| | MTT | 0.843 | 0.926 | 0.813 | 0.641 | 0.968 | 0.933 |
| | TTP | 0.873 | 0.889 | 0.867 | 0.706 | 0.956 | 0.916 |
| The multiparameter model | MT | 0.863 | 0.926 | 0.840 | 0.676 | 0.969 | 0.924 |
| | MTC | 0.882 | 0.895 | 0.880 | 0.630 | 0.973 | 0.924 |
| | DMT | 0.873 | 0.852 | 0.867 | 0.700 | 0.941 | 0.933 |
| | DMTC | 0.892 | 0.864 | 0.900 | 0.704 | 0.960 | 0.948 |
| | DMTC* | 0.884 | 0.859 | 0.890 | 0.703 | 0.955 | 0.939 |

Abbreviations: ACC, accuracy; AUC, area under the receiver-operating characteristic curve; CBF, cerebral blood flow; CBV, cerebral blood volume; CNN, convolutional neural network; DMT, DWI & MTT & TTP; DMTC*, DWI & MTT & TTP & clinical of external validation set; DMTC, DWI & MTT & TTP & clinical; DWI, diffusion-weighted imaging; MT, MTT & TTP; MTC, MTT & TTP & clinical; MTT, mean transit time; NPV, negative predictive value; PPV, positive predictive value; SEN, sensitivity; SPC, specificity; TTP, time to peak; VOI, volume of interest.

TABLE 4. Performance comparison of the Slice CNN methods by using different MRI parameters

| Model | Parameters | ACC | SEN | SPC | PPV | NPV | AUC |
|---|---|---|---|---|---|---|---|
| The single parameter model | Clinical | 0.659 | 0.778 | 0.613 | 0.420 | 0.885 | 0.680 |
| | DWI | 0.667 | 0.579 | 0.690 | 0.333 | 0.860 | 0.609 |
| | CBF | 0.742 | 0.625 | 0.78 | 0.476 | 0.867 | 0.689 |
| | CBV | 0.667 | 0.929 | 0.58 | 0.417 | 0.967 | 0.702 |
| | MTT | 0.833 | 1.000 | 0.78 | 0.592 | 1.000 | 0.945 |
| | TTP | 0.818 | 0.875 | 0.80 | 0.583 | 0.952 | 0.889 |
| The multiparameter model | MT | 0.853 | 0.925 | 0.827 | 0.658 | 0.969 | 0.896 |
| | MTC | 0.863 | 0.882 | 0.859 | 0.556 | 0.973 | 0.913 |
| | DMT | 0.871 | 0.818 | 0.919 | 0.786 | 0.932 | |

BRAIN PATHOLOGY
