Twenty-eight centers belonging to the Italian Society of Colorectal Surgery (SICCR) joined the study. The database comprised 1731 patients who underwent elective surgery between January 2016 and February 2020, of whom 1681 met the defined criteria. Overall, 171 records (10%) had missing data for one or more variables, leaving 1510 complete cases (90%) for the analysis. The mean age of the patients (59.7% males and 40.3% females) was 53 years (SD: 13.0).
Based on the Clavien–Dindo classification, we considered grades 0 and I as “no complications” and grades II, III, IV, and V as “presence of complications”. According to this definition, approximately 10% of patients (148 of 1510) had complications. Characteristics of the sample stratified by complications are reported in Table 1.
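The dichotomization described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name, the handling of sub-grades such as "IIIa", and the sample records are assumptions.

```python
# Grades treated as "no complications" in the text; all other grades count as a complication.
NO_COMPLICATION_GRADES = {"0", "I"}
VALID_GRADES = {"0", "I", "II", "III", "IV", "V"}


def has_complication(grade: str) -> bool:
    """Map a Clavien-Dindo grade to the binary outcome used in the analysis."""
    # Assumption: sub-grades such as "IIIa" or "IVb" collapse onto their parent grade.
    base = grade.strip().upper().rstrip("AB")
    if base not in VALID_GRADES:
        raise ValueError(f"unknown Clavien-Dindo grade: {grade!r}")
    return base not in NO_COMPLICATION_GRADES


# Hypothetical records, for illustration only.
grades = ["0", "I", "II", "IIIa", "0", "V"]
labels = [has_complication(g) for g in grades]
toy_rate = sum(labels) / len(labels)

# Complication rate reported in the text: 148 of 1510 complete cases.
reported_rate_percent = round(148 / 1510 * 100, 1)
```

With the reported counts, the rate works out to 9.8%, which the text rounds to 10%.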
Table 1 Demographic and clinical characteristics of the sample stratified by complications (yes/no)
Table 1 shows that all the collected variables differed significantly between the “complications” and “no-complications” groups at univariate analysis, so all of them were eligible for inclusion in the regression analyses. However, because of the structure of the administered questionnaire, we observed collinearity between the variable “preoperative score” and the variables “preoperative prolapse”, “preoperative bleeding”, “preoperative manual reduction”, and “preoperative pain/discomfort”. Backward stepwise selection showed that retaining the variable “preoperative score” lowered both the AIC and the BIC, so we chose a model that included this variable and excluded the other four.
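For reference, the two information criteria used for model selection have the following standard definitions, where $k$ is the number of estimated parameters, $n$ the number of observations, and $\hat{L}$ the maximized likelihood of the model (these symbols are introduced here and do not appear in the original text). Lower values indicate a better trade-off between fit and complexity.

```latex
\begin{align}
\mathrm{AIC} &= 2k - 2\ln\hat{L} \\
\mathrm{BIC} &= k\ln n - 2\ln\hat{L}
\end{align}
```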
Then the dataset was randomly split into “train” and “test” groups, and the characteristics of the sample stratified by group are reported in Table 2.
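A random split of this kind can be sketched with the Python standard library alone. The 20% test fraction is inferred from the reported test size (302 of 1510 cases). The seed, the function name, and the absence of stratification are assumptions, since the text does not describe the splitting procedure in more detail.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle row indices 0..n-1 and return (train_indices, test_indices)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_fraction)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


# 1510 complete cases, as reported in the text.
train_idx, test_idx = train_test_split_indices(1510)
```

With 1510 cases this yields 1208 training and 302 test indices, matching the test-set size reported for the analysis.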
Table 2 Demographic and clinical characteristics of the sample stratified by group (train or test)
In this case, no variable differed significantly between the two groups, which were therefore considered comparable.
The logistic regression analysis performed on the “train” group is shown in Table 3.
Table 3 Multivariate logistic regression analysis of predictors for complications
The AUC was 0.83 (Fig. 1), and the optimal cut-off value of p defined by the Youden index was 0.24. The model was then applied to the “test” dataset (302 patients, of whom 30 had complications), with the performance metrics reported in Table 4.
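The Youden index selects the probability threshold that maximizes J = sensitivity + specificity - 1. A minimal sketch of that selection follows. The function name and the toy labels and probabilities are hypothetical and are not the study data.

```python
def youden_cutoff(y_true, y_prob):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is predicted positive when its probability is >= threshold.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(y_prob)):  # every observed probability is a candidate cut-off
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= t)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < t)
        j = tp / pos + tn / neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Toy example (hypothetical values): 1 = complication, 0 = no complication.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.60, 0.80]
cutoff, j = youden_cutoff(y_true, y_prob)
```

On this toy input the selected cut-off is 0.30, where sensitivity is 3/3 and specificity is 4/5, giving J = 0.8.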
Fig. 1 ROC curve of the logistic regression model
Table 4 Comparison among model performances
Decision Tree (DT), Support Vector Machine (SVM), and XGBoost (XGB) algorithms were then applied. The confusion matrix of each model is reported in Fig. 2, and the discrimination performance of the different models was finally compared (Table 4).
Fig. 2 Confusion matrices of the included models
The AUC of the multivariate logistic regression model was 0.83, and the balanced test accuracy was 88%, with sensitivity and specificity of 0.83 and 0.92, respectively.
The AUC of the DT model was 0.84, and the balanced test accuracy was 84%, with sensitivity and specificity of 0.80 and 0.89, respectively.
The AUC of the SVM model was 0.84, and the balanced test accuracy was 84%, with sensitivity and specificity of 0.87 and 0.82, respectively.
The AUC of the XGBoost model was 0.88, and the balanced test accuracy was 88%, with sensitivity and specificity of 0.80 and 0.93, respectively.
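Balanced accuracy is the mean of sensitivity and specificity, which makes it less sensitive to class imbalance than plain accuracy. The sketch below shows the computation from confusion-matrix counts. The counts are illustrative values chosen to be consistent with the reported test-set size (30 complications, 272 without) and with the logistic regression metrics. They are not read from Fig. 2.

```python
def sensitivity(tp, fn):
    """True positive rate: share of complications correctly flagged."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of uncomplicated cases correctly cleared."""
    return tn / (tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


# Illustrative counts (assumption): 30 positives and 272 negatives in the test set.
tp, fn, tn, fp = 25, 5, 250, 22
sens = round(sensitivity(tp, fn), 2)
spec = round(specificity(tn, fp), 2)
bal_acc = round(balanced_accuracy(tp, fn, tn, fp), 2)
```

These counts give a sensitivity of 0.83, a specificity of 0.92, and a balanced accuracy of 0.88.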
The models also reported input feature importance, which quantifies the contribution of each feature to the risk assessment, except for the SVM algorithm, which does not provide this information. For the LR analysis, feature importance was reported in terms of odds ratio (OR) and p value for significance; for the ML models, it was reported in terms of relative importance (the higher the score for a feature, the larger its effect on the model's prediction).
According to the LR model, the three most important significant variables associated with increased surgical risk were:
Anesthesia (OR 10.1, p < 0.001)
Mucosal prolapse (OR 3.8, p = 0.002)
Preoperative score (OR 1.1, p < 0.001)
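An odds ratio is the exponential of a logistic regression coefficient, so for a continuous predictor its effect compounds multiplicatively. The sketch below assumes that the reported OR of 1.1 for the preoperative score refers to a one-point increase and that the effect is linear on the log-odds scale. Neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Odds ratio for a change of `delta` units in a predictor with coefficient `beta`."""
    return math.exp(beta * delta)


# Coefficient implied by the reported OR of 1.1 per unit (assumed to mean per point).
beta_score = math.log(1.1)

or_per_point = odds_ratio(beta_score)        # recovers the reported OR
or_per_10_points = odds_ratio(beta_score, 10)  # equals 1.1 ** 10
```

Under these assumptions, a 10-point higher preoperative score corresponds to an odds ratio of about 2.59, which shows why an OR close to 1 can still matter for a score with a wide range.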
According to DT, the three most important variables that led to increased surgical risk were:
Anesthesia
Preoperative score
Treatment
According to XGB, the three most important variables that led to increased surgical risk were:
Preoperative score
Anesthesia
Treatment
Shapley values of the ML models are reported in Fig. 3. Note that this approach also yields a feature ranking for the SVM model, which does not natively provide feature importance. In these plots, for each feature, all samples of the dataset are distributed horizontally according to their Shapley values, so that the wider the horizontal spread, the stronger the impact of that feature on the final prediction (occurrence of complications). Features are ranked from top (most important) to bottom (least important). Moreover, the color of each sample marker indicates whether a feature value tends to push the associated prediction positively (i.e., toward “complication”) or negatively (toward “no complications”).
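To make the notion of a per-sample Shapley value concrete, the sketch below computes exact Shapley values for a tiny hypothetical model by enumerating all feature subsets, with absent features replaced by a single baseline value. This is a didactic illustration only. It is not the software used in the study (which the text does not name), and practical tools rely on faster approximations. The model, inputs, and function names are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x`, relative to one baseline sample."""
    n = len(x)

    def value(subset):
        # Features outside the subset are replaced by their baseline value.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for s in combinations(others, size):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical linear risk score over three features."""
    return 0.5 * z[0] + 2.0 * z[1] - 1.0 * z[2]


x = [3.0, 1.0, 2.0]
baseline = [1.0, 0.0, 2.0]
phi = shapley_values(toy_model, x, baseline)
```

For a linear model each value reduces to the coefficient times the distance from the baseline, and the values sum to the difference between the prediction at `x` and the prediction at the baseline. A positive value pushes the prediction up and a negative value pushes it down, which is what the marker positions in such plots encode.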
Fig. 3 Shapley values analysis of ML models