Machine learning approach for predicting post-intubation hemodynamic instability (PIHI) index values: towards enhanced perioperative anesthesia quality and safety

Study sample and data description

This single-center retrospective study was approved by the institutional ethics committee of Chinese Academy of Medical Sciences & Peking Union Medical College (I-23PJ746), with a waiver of informed consent. All data were de-identified to preserve privacy and ensure data confidentiality. The manuscript adheres to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines for observational studies.

The original dataset comprises 56,083 adult (age ≥ 18 yr) surgical patients who underwent general anesthesia with intravenous induction and endotracheal intubation at Peking Union Medical College Hospital over 9 years, from 2012 to 2020. Because our primary objective was to establish a general framework for evaluating post-intubation hemodynamic instability, independent of specific surgical categories, we retained only patients with American Society of Anesthesiologists (ASA) scores of 1 and 2. This inclusion criterion was employed to mitigate the disproportionate influence of severe complications in patients with ASA scores of 3 or 4 on model complexity and study outcomes. Furthermore, after excluding outliers and records with missing values for key variables, the dataset was refined to a cohort of 23,305 patients, thereby ensuring data integrity and enhancing the robustness of subsequent analysis. Using the hospital's Electronic Health Record System (EHRS) and Anesthesia Information Management System (AIMS), we retrospectively obtained comprehensive data for the study sample, including preoperative patient information, initial drug infusion details, and perioperative hemodynamic profiles.

Preoperative patient information

Differences in patient characteristics greatly influence the course of the operation and notably affect the patient’s anesthetic response. A fundamental tenet of feature selection is to prioritize attributes derived from basic information that is readily obtainable for every surgical procedure. This approach ensures the applicability of the prediction framework across a wider spectrum of clinical scenarios. The study utilized basic patient information from the EHRS, including age (median = 47 years, range = 18–91 years), sex (0 = female, 1 = male, female proportion 67.39%), height (median = 164 cm, 25th percentile = 160 cm, 75th percentile = 170 cm), weight (median = 64 kg, 25th percentile = 56 kg, 75th percentile = 72 kg), body mass index (BMI) (median = 23.71 kg/m², 25th percentile = 21.45 kg/m², 75th percentile = 26.16 kg/m²) and hypertension (0 = no hypertension, 1 = hypertension, hypertension proportion 14.49%). These parameters were considered in the modeling process according to the empirical experience of anesthetists and data availability. Furthermore, the preoperative physiological characteristics of patients, including systolic pressure (SP), diastolic pressure (DP) and heart rate (HR), were retrospectively obtained from the first stable measurements recorded in the AIMS. The median (25th percentile, 75th percentile) values of these preoperative physiological characteristics were SP = 130 (117, 144) mmHg, DP = 78 (70, 86) mmHg, and HR = 78 (69, 88) BPM (beats per minute), respectively.

Initiatory drug infusion

The initiatory drug infusion, which is the primary pharmacological intervention undertaken by the anesthetist during surgery, constitutes a fundamental step towards achieving favorable anesthesia safety. The initiatory drug infusion data included the dosages of four drugs frequently used in general anesthesia: fentanyl, lidocaine, propofol, and rocuronium. The drug dosage is determined by the anesthetist’s evaluation of the patient’s clinical status and is conventionally expressed as the amount of drug administered (in micrograms or milligrams) per kilogram (kg) of the patient’s body weight. Table 1 displays the ranges of both absolute and converted drug dosages in the whole dataset, along with their corresponding means and standard deviations. In this study, the converted drug dosages of fentanyl, lidocaine, propofol, and rocuronium are used as input features of the model.

Table 1 Drug characteristics. The range and converted range of the drug doses are represented as (minimum, maximum). The median (25th percentile, 75th percentile) of the converted doses for different drugs are also provided

Integrated coefficient of variation

Hemodynamic profiles serve as the primary foundation for constructing the PIHI index. The systolic pressure, diastolic pressure and heart rate were measured six times per minute during each operation and stored in the AIMS as sequences of hemodynamic characteristics. As shown in Eq. 1, the coefficient of variation (CV), which describes the extent of variability relative to the mean value, was applied to indicate the instability of a sequence. Because the CV is a ratio, it eliminates the adverse effects of the varying scales and dimensions of different types of sequences. To integrate the CV values of the different hemodynamic characteristics, we estimate their weights according to Eq. 2. The weight of each hemodynamic parameter is calculated from the corresponding information entropy [26], which generates a smaller weight for a more homogeneous sequence set. In this study, the CV values captured from the three hemodynamic response sequences were merged to obtain the integrated coefficient of variation (ICV) value as the PIHI index, according to Eq. 3.

$$y_{is}=\frac{\sqrt{\frac{1}{K}\sum \limits_{k=1}^K{\left(x_{ik}^s-\frac{1}{K}\sum \limits_{k=1}^K x_{ik}^s\right)}^2}}{\frac{1}{K}\sum \limits_{k=1}^K x_{ik}^s}$$

(1)

$$w_s=1+\frac{1}{\log N}\sum \limits_{i=1}^N\left[\left(y_{is}/\sum \limits_{i=1}^N y_{is}\right)\times \log \left(y_{is}/\sum \limits_{i=1}^N y_{is}\right)\right]$$

(2)

$$z_i=\sum \limits_{s=1}^S\left(w_s y_{is}/\sum \limits_{s=1}^S w_s\right)$$

(3)

where \(y_{is}\) is the CV value of the s-th sequence of patient i; \(x_{ik}^s\) is the k-th measured value in the s-th sequence of patient i; \(w_s\) is the overall weight of the s-th sequence set; \(z_i\) is the ICV value of patient i; K represents the sequence length; N is the number of patients; and S is the number of hemodynamic characteristics, which in this paper is 3.
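The computation described by Eqs. 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes a population standard deviation (division by K) in the CV and the standard entropy-weight normalization by log N, neither of which can be confirmed from the text. All function names are hypothetical.

```python
import math


def coefficient_of_variation(seq):
    """Eq. 1 (assumed form): population standard deviation divided by the mean."""
    k = len(seq)
    mean = sum(seq) / k
    variance = sum((x - mean) ** 2 for x in seq) / k
    return math.sqrt(variance) / mean


def entropy_weights(cv_matrix):
    """Eq. 2 (assumed form): one entropy-based weight per hemodynamic characteristic.

    cv_matrix[i][s] holds the CV of sequence s for patient i.
    """
    n = len(cv_matrix)
    n_seq = len(cv_matrix[0])
    weights = []
    for s in range(n_seq):
        column = [row[s] for row in cv_matrix]
        total = sum(column)
        # p * log(p) tends to 0 as p -> 0, so zero entries are skipped.
        plogp = sum((y / total) * math.log(y / total) for y in column if y > 0)
        weights.append(1 + plogp / math.log(n))
    return weights


def integrated_cv(cv_matrix, weights):
    """Eq. 3: weighted average of each patient's CV values."""
    w_sum = sum(weights)
    return [sum(w * y for w, y in zip(weights, row)) / w_sum for row in cv_matrix]


# Toy CV values for 3 patients x 3 characteristics (illustrative, not study data).
toy_cv = [[0.10, 0.20, 0.05], [0.10, 0.40, 0.05], [0.10, 0.90, 0.10]]
toy_weights = entropy_weights(toy_cv)
toy_icv = integrated_cv(toy_cv, toy_weights)
```

In the toy data the first characteristic has the same CV for every patient, so its entropy weight is essentially zero. This matches the statement that a more homogeneous sequence set receives a smaller weight.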

In general, a larger ICV value indicates that the patient may be at higher risk of an unexpected PIHI. The original minimum and maximum ICV values were 0.0055 and 0.6746, respectively. For convenience, the ICV values are normalized to values between 0 and 1 and proposed as the anesthetic risk index of PIHI in this study. To depict the whole picture of the sequence around the intubation, 5 minutes of data (around 30 data points) before and after the intubation time are collected and used to calculate the ICV index. Figure 1 shows the sequences for different patients and the resulting ICV values. After intubation, patient 5 experienced significant unexpected changes in systolic pressure (Fig. 1A), diastolic pressure (Fig. 1B) and heart rate (Fig. 1C), whereas patient 1 showed more stable anesthetic responses. Figure 1D illustrates the contrast between patient 1, with a lower ICV value of 0.10, and patient 5, with a larger ICV value of 0.99 due to the unstable or sudden change of hemodynamic characteristics.
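The text does not state which normalization was used. A minimal sketch, assuming ordinary min-max scaling with the reported extremes, is shown below; the function name is hypothetical.

```python
ICV_MIN = 0.0055  # smallest raw ICV reported in the text
ICV_MAX = 0.6746  # largest raw ICV reported in the text


def normalize_icv(raw_icv, lo=ICV_MIN, hi=ICV_MAX):
    """Min-max scale a raw ICV value onto [0, 1] (assumed normalization)."""
    return (raw_icv - lo) / (hi - lo)
```

Under this assumption the patient with the smallest raw ICV maps to exactly 0 and the patient with the largest maps to exactly 1.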

Fig. 1 Sequences of hemodynamic characteristics around intubation and the calculated ICV values. Plots were generated by using the anesthesia monitoring data of 5 specific patients. In A, B and C, the sequences of the systolic pressure, diastolic pressure and heart rate of the 5 patients are presented with a time step of 10 seconds. The time of intubation is plotted as a dotted line to distinguish the pre- and postintubation periods. The calculated ICV values of the patients, as the anesthetic risk index of PIHI, are provided in D. BPM beats per minute, ICV integrated coefficient of variation

Statistical analysis

A correlation analysis was conducted on the features of the original data, and their Pearson correlation coefficients are depicted in Fig. 2. The variables are correlated to varying degrees. Weight exhibits the strongest negative correlation with rocuronium, with a correlation coefficient of −0.582 (p < 0.01), and the highest positive correlation with BMI, with a correlation coefficient of 0.862 (p < 0.01). The proposed ICV index demonstrates a moderate correlation with other variables. Specifically, it displays a relatively higher correlation with age, indicated by a significant Pearson correlation coefficient of 0.204 (p < 0.01).
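For reference, the Pearson coefficient reported above can be computed as in the following sketch. The weight and height values are invented for illustration and are not study data; the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy example: weight (kg) against BMI derived from weight and height (m).
toy_weight = [52.0, 58.0, 64.0, 70.0, 81.0]
toy_height = [1.58, 1.66, 1.64, 1.72, 1.70]
toy_bmi = [w / h ** 2 for w, h in zip(toy_weight, toy_height)]
r_weight_bmi = pearson_r(toy_weight, toy_bmi)
```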

Fig. 2 Pearson correlation coefficients among the variables. Different colors are assigned to symbolize the diverse coefficient values. Fentanyl, Lidocaine, Propofol, and Rocuronium represent the converted doses of drugs at initial infusion. PreSP preoperative systolic pressure, PreDP preoperative diastolic pressure, PreHR preoperative heart rate, BMI body mass index

Model architecture

To better understand the nonlinear impact of different variables on post-intubation hemodynamic instability, a machine learning framework was constructed for predicting ICV. The model architecture is given in Fig. 3. The input features of the prediction models consist of preoperative patient information (age, sex, height, weight, BMI, hypertension, preoperative systolic pressure, preoperative diastolic pressure, and preoperative heart rate) and the converted doses for different drugs in initiatory infusion (fentanyl, lidocaine, propofol, and rocuronium), which are extracted from the EHRS and AIMS. The PIHI index, that is, the ICV value of the patient, is taken as the output feature in the models. We chose the variables that may affect post-intubation hemodynamic stability according to the collective experience of anesthetists and data availability. For example, preoperative volemia assessment results are not always available in our dataset and therefore could hardly be used to train a more general model. Comorbidities other than hypertension are not taken into account. Hypertension is included because chronic high blood pressure can lead to increased collagen production in the arterial wall [27], resulting in arterial rigidity and decreased vascular elasticity, which may influence hemodynamic instability. Even though hypertension may overlap with preoperative systolic pressure to some extent, machine learning methods with more complex architectures have been recommended to better mitigate multicollinearity [28].

Fig. 3 Model architecture with preoperative patient information, perioperative initiatory drug infusion and ICV. EHRS electronic health record system, BMI body mass index, AIMS anesthesia information management system, MLR multiple linear regression, SVR support vector regression, ETR extra tree regression, MLP multilayer perceptron, XGBoost extreme gradient boosting, SMOTETomek synthetic minority over-sampling technique with Tomek links identification, ICV integrated coefficient of variation

Furthermore, the original dataset is insufficient for comprehensive model training because the ICV values to be predicted are imbalanced. Balancing the data distribution, e.g., by oversampling and undersampling, could therefore be helpful. To make use of established and widely used techniques for class imbalance, the continuous ICV values are first converted into a typical class imbalance problem. The ICV values are classified into 5 classes, with ranges of [0, 0.2), [0.2, 0.4), [0.4, 0.6), [0.6, 0.8) and [0.8, 1.0]; this exposes the class imbalance in the dataset, with class 1 (10.08%), class 3 (30.29%), class 4 (3.32%) and class 5 (0.16%) as minority classes and class 2 (56.15%) as the majority class. The SMOTETomek (synthetic minority over-sampling technique with Tomek links identification) technique [29] is then used to address this class imbalance; SMOTE creates new minority class data by interpolating between adjacent original data in the minority class, and Tomek links are used to identify and remove noisy or borderline samples caused by the creation of these new data.
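The binning step can be sketched as follows. The helper names are hypothetical. The resampling function assumes the third-party imbalanced-learn package, which the text does not name, and imports it lazily so that the binning code runs without it. The text also does not say how continuous ICV targets are recovered after resampling, so that step is left out.

```python
from bisect import bisect_right
from collections import Counter

# Upper edges of classes 1-4; class 5 covers the closed interval [0.8, 1.0].
BIN_EDGES = [0.2, 0.4, 0.6, 0.8]


def icv_class(icv):
    """Map a normalized ICV in [0, 1] to a class label from 1 to 5."""
    if not 0.0 <= icv <= 1.0:
        raise ValueError("normalized ICV must lie in [0, 1]")
    return bisect_right(BIN_EDGES, icv) + 1


def class_shares(labels):
    """Return the fraction of samples falling into each class."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in sorted(counts.items())}


def resample_with_smotetomek(features, labels, seed=0):
    """Balance the classes with SMOTE followed by Tomek-link removal.

    Assumption: imbalanced-learn is installed; it is imported only when called.
    """
    from imblearn.combine import SMOTETomek

    sampler = SMOTETomek(random_state=seed)
    return sampler.fit_resample(features, labels)
```

A value on a boundary, such as 0.2, falls into the higher class because every interval except the last is half-open on the right.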

Following the resolution of the imbalanced data issue, the extended dataset was randomly divided into training and testing sets in a 9:1 ratio. The statistics of the input features in the training and testing sets are given in Table 2. We found that the training and testing sets exhibited a comparable distribution, with no significant differences. Ten-fold cross-validation was performed with the training dataset to ensure the robustness of the prediction models by assessing the model performance with averaged evaluation indices over different fold partitions.
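The 9:1 split and the ten-fold partition of the training set can be illustrated with index lists alone. This is a sketch with hypothetical function names and an arbitrary seed; the study's actual shuffling and fold assignment are not described in the text.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.1, seed=0):
    """Shuffle sample indices and hold out a fraction as the testing set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=10):
    """Yield (training, validation) index lists; each sample validates once."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


train_idx, test_idx = train_test_split_indices(1000)
fold_sizes = [len(val) for _, val in k_fold_indices(train_idx)]
```

Averaging an evaluation index over the ten validation folds gives the cross-validated estimate described above, while the held-out testing indices are never seen during that procedure.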

Table 2 The statistics of the input features in the training and testing sets

Machine learning models

Five typical machine learning models were created to predict the ICV value using multiple linear regression (MLR), support vector regression (SVR), extra tree regression (ETR), multilayer perceptron neural network (MLP), and eXtreme Gradient Boosting (XGBoost) regression methods. MLR was used as a baseline method, as it establishes a linear relationship between the input features and output features. The SVR method produces a classical model that provides a nonlinear solution by mapping input features into a higher-dimensional feature space [30]. As the most commonly used mapping kernel, the radial basis function (RBF) was used in this work to establish the SVR model. ETR [31] is an extension of random forest regression and has been shown to be more reliable than random forest regression in terms of resisting overfitting [32]. MLP is a typical artificial neural network with multiple hidden layers and neuron units and is mainly trained by the backpropagation algorithm [33]. The network parameters were adjusted and updated through iterative computation of their gradients, that is, partial derivative calculations. Finally, XGBoost is a method that operates under the gradient boosting framework [34]. Its decision tree ensembles are composed of sequentially additive trees that learn the residual errors of predictions. All these models were implemented using the mature packages in Python 3.7 [35, 36].

Model hyperparameters

Grid search and cross-validation were adopted to determine the model hyperparameters. The SVR model was trained with an RBF kernel coefficient of 100 and a penalty parameter of 1.5 (L2 penalty) to ensure robust model regularization. The optimal ETR model was trained with 890 estimators and without a maximum tree depth. The MLP model used two hidden layers of 66 and 65 neuron units with the ReLU activation function. The learning rate and momentum coefficient for the MLP model were 0.001 and 0.7, respectively. The optimal XGBoost model used 648 estimators, a maximum tree depth of 16 and a minimum child weight of 9.
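One way to record these settings is as keyword arguments for common Python estimators. The mapping below is an assumption: the text does not name the packages, and the correspondence between the reported values and scikit-learn or XGBoost parameter names (for example, kernel coefficient to `gamma`, penalty parameter to `C`) is inferred. The `solver="sgd"` entry is also an assumption, added because scikit-learn applies `momentum` only with that solver.

```python
# Reported hyperparameters expressed as estimator keyword arguments (assumed mapping).
HYPERPARAMETERS = {
    "SVR": {"kernel": "rbf", "gamma": 100, "C": 1.5},
    "ETR": {"n_estimators": 890, "max_depth": None},
    "MLP": {
        "hidden_layer_sizes": (66, 65),
        "activation": "relu",
        "solver": "sgd",  # assumption: required for momentum to take effect
        "learning_rate_init": 0.001,
        "momentum": 0.7,
    },
    "XGBoost": {"n_estimators": 648, "max_depth": 16, "min_child_weight": 9},
}


def build_models(params=HYPERPARAMETERS):
    """Instantiate the five regressors; third-party imports happen only on call."""
    from sklearn.ensemble import ExtraTreesRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.neural_network import MLPRegressor
    from sklearn.svm import SVR
    from xgboost import XGBRegressor

    return {
        "MLR": LinearRegression(),
        "SVR": SVR(**params["SVR"]),
        "ETR": ExtraTreesRegressor(**params["ETR"]),
        "MLP": MLPRegressor(**params["MLP"]),
        "XGBoost": XGBRegressor(**params["XGBoost"]),
    }
```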

Performance metrics

The mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE) and R-squared index (R2) were used to evaluate the prediction performance of the different models based on Eqs. 4–7.

$$MAE=\frac{1}{n}\sum \limits_{i=1}^n\left|{\hat{y}}_i-y_i\right|$$

(4)

$$RMSE=\sqrt{\frac{1}{n}\sum \limits_{i=1}^n{\left({\hat{y}}_i-y_i\right)}^2}$$

(5)

$$MAPE=\frac{1}{n}\sum \limits_{i=1}^n\left|\frac{{\hat{y}}_i-y_i}{y_i}\right|$$

(6)

$$R^2=1-\frac{\sum \limits_{i=1}^n{\left({\hat{y}}_i-y_i\right)}^2}{\sum \limits_{i=1}^n{\left(\overline{y}-y_i\right)}^2}$$

(7)

in which \({\hat{y}}_i\) and \(y_i\) are the predicted and observed ICV values of the anesthetic responses, respectively, of patient i; n is the number of patients; and \(\overline{y}\) is the average ICV value of the anesthetic responses.
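A compact sketch of the four metrics in Eqs. 4 to 7 follows. It assumes MAPE is reported as a fraction rather than multiplied by 100, which the text does not confirm. The function name and the toy values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, RMSE, MAPE (as a fraction) and R2 for paired values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_observed = sum(observed) / n
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((mean_observed - o) ** 2 for o in observed)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        # Undefined when an observed value is exactly 0.
        "MAPE": sum(abs(e / o) for e, o in zip(errors, observed)) / n,
        "R2": 1 - ss_res / ss_tot,
    }


# Toy ICV values (illustrative only).
toy_metrics = regression_metrics([0.2, 0.4, 0.6, 0.8], [0.25, 0.35, 0.65, 0.75])
```

Because MAPE divides by the observed value, any sample whose normalized ICV is exactly 0 would raise a division error in this sketch and needs separate handling.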
