Sepsis has complex, time-sensitive pathophysiology and important phenotypic subgroups. The objective of this study was to use machine learning analyses of blood and urine biomarker profiles to elucidate the pathophysiologic signatures of subgroups of surgical sepsis patients.
Methods
This prospective cohort study included 243 surgical sepsis patients admitted to a quaternary care center between January 2015 and June 2017. We applied hierarchical clustering to clinical variables and 42 blood and urine biomarkers to identify phenotypic subgroups in a development cohort. Clinical characteristics and short-term and long-term outcomes were compared between clusters. A naïve Bayes classifier predicted cluster labels in a validation cohort.
Results
The development cohort contained one cluster characterized by early organ dysfunction (cluster I, n = 18) and one cluster characterized by recovery (cluster II, n = 139). Cluster I was associated with higher Acute Physiologic Assessment and Chronic Health Evaluation II scores (30 versus 16, P < 0.001) and SOFA scores (13 versus 5, P < 0.001), and with a greater prevalence of chronic cardiovascular and renal disease (P < 0.001) and of septic shock (78% versus 17%, P < 0.001). Cluster I had higher mortality within 14 d of sepsis onset (11% versus 1.5%, P = 0.001) and within 1 y (44% versus 20%, P = 0.032), and a higher incidence of chronic critical illness (61% versus 30%, P = 0.001). The naïve Bayes classifier achieved 95% accuracy and identified two clusters that were similar to the development cohort clusters.
Conclusions
Machine learning analyses of clinical and biomarker variables identified an early organ dysfunction sepsis phenotype characterized by inflammation, renal dysfunction, endotheliopathy, and immunosuppression, as well as poor short-term and long-term clinical outcomes.
Introduction
Sepsis, a dysregulated host response to infection leading to life-threatening organ dysfunction, is responsible for more than $20 billion in annual US healthcare expenditures and is associated with 18%-28% mortality [1: Singer M. Deutschman C.S. Seymour C.W. et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3); 2: Angus D.C. Linde-Zwirble W.T. Lidicker J. Clermont G. Carcillo J. Pinsky M.R. Epidemiology of severe sepsis in the United States: analysis of incidence, outcome, and associated costs of care; 3: National inpatient hospital costs: the most expensive conditions by payer, 2011: Statistical Brief #160]. Optimal treatment involves early antibiotic administration, resuscitation, and source control of infection [4: Hotchkiss R.S. Moldawer L.L. Opal S.M. Reinhart K. Turnbull I.R. Vincent J.L.; 5: Kumar A. Roberts D. Wood K.E. et al. Duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock]. With widespread adoption and implementation of this approach, sepsis mortality has decreased over time but remains unacceptably high. Sepsis is a broad syndrome defined and classified by clinical criteria not necessarily reflective of underlying pathological processes, and most sepsis drug trials have failed. Given the heterogeneity of sepsis, identifying subgroups of sepsis patients with unique pathophysiological signatures and treatment responses may be necessary to develop successful targeted therapies [6: Seymour C.W. Kennedy J.N. Wang S. et al. Derivation, validation, and potential treatment implications of novel clinical phenotypes for sepsis].
Grouping routinely collected clinical variables and biomarkers from sepsis patients with clustering, an unsupervised machine learning technique, will allow the identification of entirely novel phenotypes [6; 7: Knox D.B. Lanspa M.J. Kuttler K.G. Brewer S.C. Brown S.M. Phenotypic clusters within sepsis-associated multiple organ dysfunction syndrome; 8: Sweeney T.E. Azad T.D. Donato M. et al. Unsupervised analysis of transcriptomics in bacterial sepsis across multiple datasets reveals three robust clusters]. Unsupervised machine learning has been used previously to identify subgroups in other complex syndromes, such as acute respiratory distress syndrome (ARDS) [9: Calfee C.S. Delucchi K.L. Sinha P. et al. Acute respiratory distress syndrome subphenotypes and differential response to simvastatin: secondary analysis of a randomised controlled trial]. Given the complex pathophysiology of sepsis and heterogeneity among sepsis patients, a battery of physiologic measurements of organ dysfunction obtained during routine clinical care and blood and urine metabolic and immunologic biomarkers are available to allow early identification of patients at risk for poor short-term and long-term outcomes [6]. Illness severity scoring systems such as the Sequential Organ Failure Assessment (SOFA) and Acute Physiologic Assessment and Chronic Health Evaluation II (APACHE II) scores and inflammatory and immunosuppressive biomarkers can forecast mortality within 24 h of sepsis onset and thus may help predict the underlying pathophysiology of various sepsis phenotypes [7; 10: Cioara A. Valeanu M. Todor N. Cristea V. Lupse M. Early sepsis biomarkers and their relation to mortality; 11: Simon L. Gauvin F. Amre D.K. Saint-Louis P. Lacroix J. Serum procalcitonin and C-reactive protein levels as markers of bacterial infection: a systematic review and meta-analysis; 12: Masson S. Caironi P. Spanuth E. et al. Presepsin (soluble CD14 subtype) and procalcitonin levels for mortality prediction in sepsis: data from the Albumin Italian Outcome Sepsis trial; 13: Liu B. Chen Y.X. Yin Q. Zhao Y.Z. Li C.S. Diagnostic value and prognostic evaluation of Presepsin for sepsis in an emergency department; 14: Liu Y. Hou J.H. Li Q. Chen K.J. Wang S.N. Wang J.M. Biomarkers for diagnosis of sepsis in patients with systemic inflammatory response syndrome: a systematic review and meta-analysis]. Validating the utility and accuracy of an unsupervised learning model on routinely available clinical variables may provide a basis for future studies differentiating sepsis management efficacy based on phenotype.
We used machine learning analyses of clinical variables, as well as blood and urine biomarker profiles, to identify phenotypic subgroups in a prospective, longitudinal cohort of surgical sepsis patients. Our objective was to elucidate the pathophysiologic signatures of phenotypic subgroups, with the rationale that a deeper understanding of sepsis phenotypes can inform the development of targeted therapies.
Discussion
Using 42 blood and urine biomarkers and routinely collected clinical data, we identified two major clusters of patients with surgical sepsis. Inflammatory, renal, and endothelial biomarkers that differentiated cluster I from cluster II included interleukin 8, tumor necrosis factor-alpha, serum creatinine, cystatin C, blood urea nitrogen, anion gap, fluid overload, lactate, angiopoietin 2, and fms-like tyrosine kinase. These biomarkers contributed significantly to differences in composite biomarker mosaics in both clusters, suggesting that systemic inflammation, renal dysfunction, and endotheliopathy were primary drivers of cluster differentiation. Leave-one-out analysis suggested that all biomarkers contributed significantly to primary cluster assignment because excluding any of these biomarkers would result in different cluster assignments. Agglomerative hierarchical clustering is dependent on data dimensionality (i.e., the number of clustering biomarkers), which makes it more likely that leave-one-out analysis produces different results as a statistical artifact. Nevertheless, a naïve Bayes classifier, which is more robust to dimensional changes, reproduced similar clusters in a validation cohort, suggesting that biomarker profiles reflect underlying septic pathology.
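The two analytic steps described above, a leave-one-out check on agglomerative clustering followed by a naïve Bayes classifier trained on the resulting cluster labels, can be illustrated with a minimal sketch. It is not the study's pipeline: the average linkage, the Euclidean distance, the Gaussian likelihoods, the function names, and the synthetic three-variable data are all assumptions chosen for illustration, and no patient data or study results are reproduced.

```python
import math
import random


def two_clusters(rows):
    """Average-linkage agglomerative clustering, stopped at two clusters."""
    clusters = [[i] for i in range(len(rows))]

    def linkage(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        total = sum(math.dist(rows[i], rows[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 2:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        p, q = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] += clusters[q]  # a merge is never undone in later steps
        del clusters[q]

    labels = [0] * len(rows)
    for i in clusters[1]:
        labels[i] = 1
    return labels


def changed_fraction(a, b):
    """Share of samples assigned differently, ignoring swapped label names."""
    diff = sum(x != y for x, y in zip(a, b))
    return min(diff, len(a) - diff) / len(a)


def fit_nb(rows, labels):
    """Gaussian naive Bayes: per-class log prior, feature means, variances."""
    model = {}
    for c in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        cols = list(zip(*members))
        means = [sum(col) / len(col) for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                     for col, m in zip(cols, means)]
        model[c] = (math.log(len(members) / len(rows)), means, variances)
    return model


def predict_nb(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(c):
        prior, means, variances = model[c]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=log_posterior)


# Synthetic data (hypothetical): variable 0 separates the groups, 1 and 2 are noise.
rng = random.Random(0)


def synthetic(n, center):
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


development = synthetic(8, [15.0, 0.0, 0.0]) + synthetic(20, [0.0, 0.0, 0.0])
validation = synthetic(3, [15.0, 0.0, 0.0]) + synthetic(10, [0.0, 0.0, 0.0])

base = two_clusters(development)

# Leave-one-out: drop each variable, recluster, and measure changed assignments.
loo = {k: changed_fraction(base, two_clusters([r[:k] + r[k + 1:] for r in development]))
       for k in range(3)}

# Train on the development cluster labels, then label the validation cohort.
model = fit_nb(development, base)
predicted = [predict_nb(model, r) for r in validation]

print("changed fraction per dropped variable:", loo)
print("validation labels:", predicted)
```

In this toy setting only the separating variable changes the cluster assignments when it is removed. With many correlated biomarkers, as noted above, removing any single variable can shift distances enough to alter the merges, so a changed assignment does not by itself establish that a variable is important.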
Consistent with these biomarker profiles, we observed a greater prevalence of chronic renal and cardiovascular disease in cluster I. Furthermore, cluster I had severe early multiorgan failure with a disproportionately high incidence of cardiovascular and renal disease. These results are consistent with a previous study by Garcia-Obregon et al. (2018) [25: Garcia-Obregon S. Azkargorta M. Seijas I. et al. Identification of a panel of serum protein markers in early stage of sepsis and its validation in a cohort of patients], in which a similar panel of ten proteins in a prospective cohort of 85 patients predicted sepsis with cardiovascular dysfunction. In addition, cluster I patients had an immunosuppressive phenotype manifested as increased interferon gamma-inducible protein 10 and soluble programmed death-ligand 1. Prior work has not consistently demonstrated concomitant inflammation and immunosuppression, as was observed in our study. In an analysis of peripheral blood leukocyte gene expression among patients with sepsis due to pneumonia, Davenport et al. (2016) [26: Davenport E.E. Burnham K.L. Radhakrishnan J. et al. Genomic landscape of the individual host response and outcomes in sepsis: a prospective cohort study] found that inflammation and immunosuppression occur in separate, distinct sepsis response signatures. However, a meta-analysis of 949 sepsis patients using hierarchical clustering demonstrated significant inflammation and immunosuppression in early sepsis, similar to results from our study [27: Schaack D. Siegler B.H. Tamulyte S. Weigand M.A. Uhle F. The immunosuppressive face of sepsis early on intensive care unit-A large-scale microarray meta-analysis].
The unique biologic signatures of clusters I and II corresponded to different illness severity and clinical outcomes. The acute physiology component of the APACHE score differentiated between clusters, similar to results obtained by Knox et al. (2015) [7]. Cluster I had higher Charlson comorbidity indices, suggesting that these patients had a greater chronic disease burden prior to the onset of sepsis, but the difference in the acute physiological score between clusters I and II was of greater magnitude. Cluster I had a higher incidence of septic shock, which could be explained by the cardiovascular physiological derangement. Cluster I had worse clinical outcomes with higher early and 1-year mortality rates. The 5% in-hospital mortality and 20% 1-year mortality rates observed in cluster II are lower than the mortality rates observed in other studies of contemporary populations with sepsis; this may be attributable to the early preserved hemostasis biomarker profile and phenotype observed in cluster II.
High heterogeneity in clinical and biomarker characteristics across phenotypes among sepsis patients may provide insight regarding failed sepsis drug trials. Seymour et al. (2019) [6] demonstrated that there are subgroups of sepsis patients with unique responses to treatments, offering compelling evidence that broadly applied monotherapies for sepsis and septic shock are likely to continue to fail. Our study does not address the hypothesis that different sepsis phenotypes have different treatment responses but supports the hypothesis that clustering analysis can identify hidden patterns and structures within sepsis patient data, identifying phenotypes with distinct short-term and long-term outcomes. Also, clustering analysis may help improve the performance of prediction and risk stratification for these outcomes by building separate prediction models for each cluster.
These observations were made in a prospective study of a relatively small group of patients, suggesting that it is feasible to perform these clustering techniques in clinical settings.
We demonstrate that the application of machine learning analytic methods to a battery of routine clinical and physiologic measurements of organ dysfunction, in concert with blood and urine biomarkers of renal function, tissue perfusion, inflammation, and immunosuppression, can identify surgical sepsis phenotypes. Although these analyses were performed in a small cohort as a proof of concept, the findings demonstrate the potential benefit of machine learning-derived clusters of sepsis in treatment. For example, because we used routine biomarkers, all procured within hours of ICU admission, machine learning may rapidly identify septic patients at high risk of acute kidney injury (AKI) requiring aggressive resuscitation or renal replacement therapy. Furthermore, as cluster I is defined by an inflammatory phenotype, these patients may benefit from anti-inflammatory therapy, which can be administered quickly and potentially improve surgical outcomes. Similar approaches have demonstrated efficacy for phenotyping other critical illnesses, suggesting broader implications for understanding the host response to critical illness [28: Heterogeneity and phenotypic stratification in acute respiratory distress syndrome]. Calfee et al. (2018) [9] performed a secondary, latent class analysis to identify acute respiratory distress syndrome subphenotypes in a multicenter, randomized controlled trial database. This analysis identified distinct hyperinflammatory and hypoinflammatory phenotypes with different biological features and clinical outcomes.
Perhaps more importantly, the administration of simvastatin conferred a survival advantage that was specific to the hyperinflammatory group, suggesting that the identification of phenotypes can guide patient-specific treatments. Similarly, Antcliffe et al. (2019) [29: Antcliffe D.B. Burnham K.L. Al-Beidh F. et al. Transcriptomic signatures in sepsis and a differential response to steroids from the VANISH randomized trial] performed a secondary analysis of a randomized clinical trial database of sepsis patients to determine whether phenotypes of sepsis patients have unique responses to corticosteroid administration. Patients with an immunocompetent phenotype had increased mortality after corticosteroid administration compared with placebo. These findings suggest that phenotyping techniques not only elucidate underlying pathophysiology but are also associated with unique treatment responses. Machine learning techniques may be ideal for representing complex disease syndromes like sepsis and acute respiratory distress syndrome because their underlying pathophysiology is beyond the reach of additive and linear statistical approaches [30: Loftus T.J. Upchurch G.R. Bihorac A. Use of artificial intelligence to represent emergent systems and augment surgical decision-making].
Study limitations
Our study has several limitations. First, we used a small data sample of surgical patients from a single institution, limiting the power and generalizability of these findings. In addition, perhaps the greatest value of phenotyping is the ability to assess responses to targeted therapies. Accomplishing this objective would require the application of biomarker signatures to data from randomized controlled trials, which is feasible but beyond the scope of this study. Second, the method we used to evaluate the importance of each biomarker to the clusters (leave-one-out analysis) was susceptible to false positives.
The merge step of agglomerative clustering cannot be reversed and depends on the distances between observations, and therefore on the dimensionality of the data. Because leave-one-out analysis reduces the dimensions of the dataset, repeating agglomerative clustering is more likely to yield different results. Despite this susceptibility of leave-one-out analysis to statistical artifacts, we still report these findings (1) to show the magnitude of dependence for each variable and (2) to account for the possibility that some biomarkers may be unimportant even given the clustering method's limitations. Likewise, we used a naïve Bayes classifier to predict clusters in our validation cohort, which assumes independence among the input variables. Although some collected variables, such as blood urea nitrogen (BUN) and creatinine (Cr), do not satisfy the independence assumption, naïve Bayes was chosen for its simplicity and because it can outperform alternatives even in some cases where the independence assumption is not met [31: Thottakkara P. Ozrazgat-Baslanti T. Hupf B.B. et al. Application of machine learning techniques to high-dimensional clinical data to forecast postoperative complications; 32: On the optimality of the simple Bayesian classifier under zero-one loss]. Third, we used a single time point rather than longitudinal data. Although we focused on capturing patient profiles shortly after sepsis diagnosis and excluded patients with advanced liver or heart disease, our approach did not account for the evolution of sepsis prior to and after ICU admission. Thus, it is possible that our clusters represent differences in the evolution of sepsis, with cluster II representing a resolved state. However, given that our study focused on exploring the ability of machine learning to derive sepsis phenotypes, longitudinal analyses were deemed out of scope.
Finally, we did not perform an external validation of our derived clusters using a large dataset, and this should be considered in the interpretation of our findings.
Conclusions
Machine learning analyses of clinical and biomarker variables identified an early organ dysfunction sepsis phenotype characterized by inflammation, renal dysfunction, endotheliopathy, and immunosuppression, as well as poor short-term and long-term clinical outcomes. These efforts to elucidate the pathophysiologic signatures of phenotypic subgroups may provide a deeper understanding of sepsis phenotypes that can inform the development of targeted therapies.
Author Contributions
AB conceived the original idea for the study and obtained funding. TOB and AB had full access to all the data in the study and took responsibility for the integrity of the data and the accuracy of the data analysis. Analyses: RWMAM, YR, QW, LA. Interpretation of data: All authors. The article was written by RWMAM, VP, TL, YR, TOB, HL, PR, and AB with input from all coauthors. All authors participated in critically revising the manuscript for important intellectual content and gave final approval of the version to be published. RWMAM, VP, TL, YR, and HJL contributed equally. AB, PR, and TOB served as senior authors. Azra Bihorac is the guarantor for this article.
Funding
A.B., T.O.B., Q.W., P.E., and M.S. were supported by Sepsis and Critical Illness Research Center Award P50 GM-111152 from the National Institute of General Medical Sciences. A.B. was supported by R01 GM110240 from the National Institute of General Medical Sciences (NIH/NIGMS), 1R01EB029699 and 1R21EB027344 from the National Institute of Biomedical Imaging and Bioengineering (NIH/NIBIB), 1R01NS120924 from the National Institute of Neurological Disorders and Stroke (NIH/NINDS), and R01 DK121730 from the National Institute of Diabetes and Digestive and Kidney Diseases (NIH/NIDDK). T.O.B. was supported by K01 DK120784, R01 DK123078, and R01 DK121730 from the National Institute of Diabetes and Digestive and Kidney Diseases (NIH/NIDDK), R01 GM110240 from the National Institute of General Medical Sciences (NIH/NIGMS), R01 EB029699 from the National Institute of Biomedical Imaging and Bioengineering (NIH/NIBIB), and R01 NS120924 from the National Institute of Neurological Disorders and Stroke (NIH/NINDS). P.R. was supported by the National Science Foundation CAREER award 1750192, 1R01EB029699 and 1R21EB027344 from the National Institute of Biomedical Imaging and Bioengineering (NIH/NIBIB), R01GM-110240 from the National Institute of General Medical Sciences (NIH/NIGMS), 1R01NS120924 from the National Institute of Neurological Disorders and Stroke (NIH/NINDS), and R01 DK121730 from the National Institute of Diabetes and Digestive and Kidney Diseases (NIH/NIDDK). T.J.L. was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number K23 GM140268.