This study developed and validated a machine learning model to predict functional outcomes at discharge for patients with aSAH, using data available from the time of admission to the initiation of physical therapy. Unlike models such as SAFIRE and SAHIT, which focus on long-term outcomes (e.g., 2–3-month functional recovery), the SEASAH study targets the transition phase from acute care to rehabilitation, thereby providing actionable insights for early intervention. The model demonstrated excellent predictive accuracy both when restricted to features that reached statistical significance (AUC, 0.88) and when using all extracted features (AUC, 0.88). The model’s optimal performance was achieved using eight key features: (1) age, (2) WFNS grade, (3) mFS score, (4) SI, (5) presence of intracerebral hemorrhage, (6) presence of symptomatic cerebral vasospasm, (7) presence of higher brain dysfunction (including attention deficits and aphasia), and (8) presence of complications (e.g., pneumonia and perioperative issues), yielding an AUC of 0.89. By incorporating novel features such as higher brain dysfunction and SI, which are often overlooked in predictive models, SEASAH offers enhanced clinical applicability. Furthermore, the use of SHAP values enhances interpretability, allowing clinicians to understand the contribution of each variable to patient outcomes. This transparency supports better-informed decision-making and distinguishes SEASAH as a valuable complement to existing models such as SAHIT and SAFIRE.
Comparison with previously reported predictive factors
At the cohort level, age was the most significant factor influencing aSAH outcomes. Previous studies have shown that patients aged 75 years are significantly more likely to experience poor outcomes [21]. Park et al. classified patients aged ≥ 75 years as high risk [22]. In the SAFIRE classification, individuals aged < 50 years received 0 points, whereas those in their 50s received 1 point, those in their 60s received 2 points, and those aged ≥ 70 years received 5 points, indicating a greater weighting for older age groups [11]. Beyond age as a continuous variable, whether a patient is elderly is itself an important factor, and age is the most important predictive factor in various models. However, according to the SHAP score distribution at the individual patient level, some elderly patients achieve good outcomes, underscoring the need for detailed analyses focused on elderly patients with aSAH in the future.
The prognosis of aSAH has long been determined based on the severity at onset. Previous studies using the Hunt and Hess grade have shown that survival rates at discharge or within 30 days are < 95% for grades I and II. They decrease to approximately 90% for grade III, 68%–76% for grade IV, and 30%–49% or lower for grade V [23, 24]. In some studies using the WFNS grade, > 70% of patients with grade I or II aSAH were able to return home, whereas approximately half of those with grade III or IV aSAH could not, and almost none with grade V were able to do so [25, 26]. In our study, the WFNS grade was adopted as an indicator of severity, based on the SAFIRE classification. Our study confirmed that severity at onset is a determinant of prognosis, with WFNS grades 1 and 5 showing a particularly strong impact on outcome determination.
In contrast, by incorporating factors related to executive dysfunction and higher brain dysfunction, which have not been the focus of previous studies on the intensive care period but are significant in the mid- to long-term [27], a robust model was constructed. The presence of attention deficits is a significant predictor at the cohort level according to the SHAP score. However, its contribution varies across individual patients. By comparison, the presence of aphasia in our study data clearly separated the good and poor outcome groups. Additionally, prior research suggests that aphasia causes greater functional impairment than hemiplegia [28]. Furthermore, aphasia is a predictor of poor outcomes even in patients with mild ischemic stroke [29]. Therefore, careful evaluation is recommended to detect higher brain dysfunction, including attention deficits and aphasia, at the start of physical therapy, taking into account the presence of consciousness disorders.
In addition to higher brain dysfunction, SI was included as a predictive variable in this study, reflecting the acute physiological stress response in patients with aSAH. SI, calculated as the ratio of blood glucose level to serum potassium level, serves as an integrated measure of metabolic and catecholaminergic activity, which is essential in the acute phase of aSAH. Elevated blood glucose levels and electrolyte imbalances, particularly hypokalemia, have been associated with poor outcomes in critical care settings, as shown in previous studies on hyperglycemia and potassium levels in patients admitted to the ICU [30, 31]. The inclusion of SI as a composite variable emphasizes its potential utility in capturing these interrelated physiological stress responses. Although SI is widely used as a prognostic indicator in Japan, its application in international studies remains limited. However, its simplicity and cost-effectiveness make it a promising tool for evaluating disease severity and guiding clinical decision-making during the transition from ICU care to rehabilitation. Recent studies, such as that by Yang et al. [32], further support the prognostic significance of hyperglycemia in critically ill patients, suggesting that SI can complement existing metrics to provide a more nuanced understanding of patient trajectories.
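As a minimal illustration of the SI definition given above (blood glucose divided by serum potassium), the calculation can be sketched as follows. The function name and the units (glucose in mg/dL, potassium in mmol/L) are assumptions made for this example; the units used in the study are not stated in this section.

```python
def stress_index(glucose_mg_dl: float, potassium_mmol_l: float) -> float:
    """Return the stress index (SI) as glucose divided by potassium.

    Units are an assumption for illustration: glucose in mg/dL and
    potassium in mmol/L. The text only states that SI is the ratio of
    blood glucose level to serum potassium level.
    """
    if glucose_mg_dl < 0:
        raise ValueError("glucose must be non-negative")
    if potassium_mmol_l <= 0:
        raise ValueError("potassium must be positive")
    return glucose_mg_dl / potassium_mmol_l


if __name__ == "__main__":
    # Hypothetical admission values, not taken from the study data.
    print(stress_index(150.0, 3.0))  # 50.0
```

Because the ratio rises with hyperglycemia and with hypokalemia, a single value reflects both of the changes discussed above.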
Limitations
First, the model was trained and evaluated using datasets extracted from the same population, without independent validation using an external dataset. This lack of external validation restricted our ability to fully assess the generalizability and robustness of the model and is therefore a significant limitation. External validation is essential to confirm the model’s broader applicability, reduce the risk of overfitting, and benchmark its performance against existing tools, such as SAFIRE, in diverse clinical settings. To partially address this limitation, additional internal validation was performed using k-fold cross-validation (k = 5). This approach allowed us to evaluate the model’s performance across multiple subsets of the dataset and mitigated the risk of overfitting. However, it remains challenging to obtain suitable external datasets with comparable variables and outcomes. The key variables in our model, such as the stress index and specific rehabilitation-related factors, are not widely used in other studies, thereby limiting the availability of compatible datasets. To overcome these challenges, we are planning to conduct a nationwide registry study in Japan. This initiative aims to provide a larger and more diverse dataset while addressing regional characteristics and hospital-specific evaluation standards, which should facilitate future external validation and broader applicability of the model. Nevertheless, it is important to recognize the unique contributions of the SEASAH model, which represents a significant step forward in outcome prediction for patients with aSAH. By focusing on discharge outcomes, SEASAH addresses a critical transition phase in patient care that is not the primary focus of other models, such as SAHIT and SAFIRE. This enables clinicians to make informed decisions early in the recovery process, particularly in individualizing rehabilitation strategies and optimizing resource allocation.
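The internal validation described above (k-fold cross-validation with k = 5, summarized by AUC) can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used in the study: the data are synthetic, the single-feature nearest-centroid scorer is a stand-in for the actual machine learning model, and all function names are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal indices round-robin across folds
    return folds


def fit_centroids(xs, ys):
    """Stand-in model (assumption): the mean feature value of each class."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)


def predict_score(model, x):
    """Higher score means x lies closer to the positive-class mean."""
    mean_pos, mean_neg = model
    return abs(x - mean_neg) - abs(x - mean_pos)


def cross_validated_auc(xs, ys, k=5, seed=0):
    """Train on k-1 folds, score the held-out fold, and return one AUC per fold."""
    fold_aucs = []
    for test_idx in stratified_folds(ys, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(ys)) if i not in held_out]
        model = fit_centroids([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores = [predict_score(model, xs[i]) for i in test_idx]
        fold_aucs.append(auc([ys[i] for i in test_idx], scores))
    return fold_aucs


if __name__ == "__main__":
    # Synthetic single-feature data; label 1 marks the outcome being predicted.
    rng = random.Random(42)
    ys = [0] * 30 + [1] * 30
    xs = [rng.gauss(0.0, 1.0) for _ in range(30)] + [rng.gauss(2.0, 1.0) for _ in range(30)]
    fold_aucs = cross_validated_auc(xs, ys, k=5)
    print("per-fold AUC:", [round(a, 2) for a in fold_aucs])
    print("mean AUC:", round(sum(fold_aucs) / len(fold_aucs), 2))
```

Each sample is held out exactly once, so the per-fold AUCs give a view of how stable performance is across subsets of the same population. As noted above, this does not replace validation on an external dataset.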
Second, this study primarily focused on discharge outcomes rather than long-term outcomes, such as 90-day or 2–3-month mRS scores. Discharge outcomes are clinically relevant for guiding early rehabilitation strategies and resource allocation. However, they only represent the initial stage of recovery. Long-term outcomes are essential for providing a more comprehensive understanding of recovery trajectories and assessing the sustained impact of early interventions. Thus, future studies should aim to incorporate long-term outcomes to complement the findings of this study and further validate the predictive value of the model across different recovery phases.
Third, the SEASAH study was a multicenter study conducted at five facilities in Japan. Although the analysis was performed using a relatively large dataset, a larger cohort could be beneficial. To further validate the robustness of the model, a dataset that considers regional characteristics while establishing standards for each hospital (fixed evaluation date) is important. In the future, these tools should be prospectively verified, and this approach must be further optimized with a large multicenter dataset. Because the current model is tailored specifically to Japanese patients with aSAH, its applicability to other countries may be limited due to differences in medical systems and acute care practices. However, this study was made possible because it was conducted in Japan, a country with one of the most aged populations worldwide. The insights gained from this study provide a unique perspective on managing elderly patients with aSAH and their rehabilitation needs. Over the coming years, these findings may serve as a valuable resource for other countries as they face similar demographic transitions and seek to optimize care for aging populations.
Fourth, this study excluded specific patient groups to maintain focus on a homogeneous cohort for predictive modeling. These exclusions included patients who died before starting physical therapy, those with recurrent aSAH, those who had received aneurysm treatment > 72 h after onset, those aged < 20 years, and those with an mRS score of 2–5 before aSAH onset. These criteria were essential to refine the study’s focus and improve the model’s generalizability to typical aSAH cases. However, they may have limited its applicability to more diverse clinical scenarios. In addition, follow-up data on excluded patients were incomplete, which prevented us from conducting a systematic comparison between the included and excluded groups. This limitation emphasizes the need for cautious interpretation when generalizing these findings to broader patient populations. Future prospective studies should aim to collect comprehensive data on the excluded groups to better evaluate the impact of these exclusions on predictive modeling and outcomes.
Fifth, this study did not account for WLST, which could influence outcomes and potentially introduce biases, including self-fulfilling prophecy bias. The lack of systematic data collection on WLST is a limitation of retrospective studies. Therefore, future studies should aim to systematically capture and analyze WLST decisions to better understand their impact on outcomes and improve the robustness of predictive modeling in SAH research.
Finally, the model in this study was constructed using features selected based on descriptive statistics and the SAFIRE classification. Nevertheless, there may be other important features. In particular, novel features that were not included in the dataset and would need to be measured and collected, such as troponin T levels upon admission [33] and hemodynamic response during endotracheal suctioning [34], may play an important role in the prediction accuracy and interpretation of the model. Therefore, further investigation should be performed.