Stratification of Alzheimer's Disease Patients Using Knowledge-Guided Unsupervised Latent Factor Clustering with Electronic Health Record Data

Abstract

Background: People with Alzheimer's disease (AD) exhibit varying clinical trajectories. There is a need to predict future AD-related outcomes, such as morbidity and mortality, from the clinical profile available at the point of care.

Objective: To stratify AD patients based on baseline clinical profiles (up to two years prior to AD diagnosis) and to update the model after AD diagnosis to prognosticate future AD-related outcomes.

Methods: Using the electronic health record (EHR) data of a large healthcare system (2011-2022), we first identified patients with ≥1 diagnosis code for AD or related dementia and applied a validated unsupervised phenotyping algorithm to assign AD diagnosis status. Next, we applied an unsupervised latent factor clustering approach, guided by knowledge graph embeddings of relevant EHR features up to the baseline, to cluster patients into two groups at AD diagnosis. We then prognosticated the risk of two readily ascertainable and clinically relevant AD-related outcomes (i.e., nursing home admission, indicating greater need for assistance, and mortality), adjusting for baseline confounders (e.g., age, gender, race, ethnicity, healthcare utilization, and comorbidities). For patients remaining at risk one year post-diagnosis, we updated their group membership and repeated the prognostication.

Results: We stratified 16,411 algorithm-identified AD patients into two groups based on their baseline clinical profiles (41% Group 1, 59% Group 2). Patients in Group 1 were marginally older at AD diagnosis (age Mean [SD]: 81.4 [9.3] vs 81.0 [8.7], p=.007), exhibited greater comorbidity burden (Elixhauser comorbidity index Mean [SD]: 11.3 [10.3] vs 7.5 [8.6], p<.0001), and more frequently received AD-related medications (47.7% vs 40.9%, p<.0001) than those in Group 2. Compared to Group 1, Group 2 had a lower risk of nursing home admission (HR [95% CI]=0.804 [0.765, 0.844], p<.001), while the two groups had similar mortality risk (HR [95% CI]=1.008 [0.963, 1.056], p=.733). One year after AD diagnosis, 12,606 patients remained at risk (45.7% Group 1, 54.3% Group 2). Consistent with baseline findings, in the updated model Group 2 had a lower risk of nursing home admission than Group 1 (HR [95% CI]=0.815 [0.766, 0.868], p<.001) and a similar mortality risk (HR [95% CI]=0.977 [0.922, 1.035], p=.430).

Conclusions: It is feasible to stratify patients based on readily available clinical profiles before AD diagnosis and, crucially, to update the model one year after diagnosis to effectively prognosticate future AD-related outcomes.
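The hazard ratios above are reported with 95% confidence intervals and p-values, so a reader can sanity-check that the three quantities are mutually consistent. The following is a minimal sketch, not the authors' analysis code. It assumes the intervals are Wald intervals that are symmetric on the log-hazard scale, which is the usual default for Cox-type models but is not stated in the abstract. The function name `wald_p_from_ci` is hypothetical.

```python
from math import erfc, log, sqrt
from statistics import NormalDist


def wald_p_from_ci(hr, lower, upper, level=0.95):
    """Two-sided p-value implied by a hazard ratio and its confidence interval.

    Assumption (not confirmed by the source): the interval is a Wald
    interval, i.e. log(HR) +/- z_crit * SE on the log scale.
    """
    # Critical value for the given confidence level (about 1.96 for 95%).
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # Back out the standard error of log(HR) from the interval width.
    se = (log(upper) - log(lower)) / (2 * z_crit)
    # Wald z statistic for the null hypothesis HR = 1.
    z = log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Values copied from the abstract.
p_nursing_baseline = wald_p_from_ci(0.804, 0.765, 0.844)
p_mortality_baseline = wald_p_from_ci(1.008, 0.963, 1.056)
p_mortality_updated = wald_p_from_ci(0.977, 0.922, 1.035)

print(f"nursing home admission, baseline: p = {p_nursing_baseline:.3g}")
print(f"mortality, baseline:              p = {p_mortality_baseline:.3f}")
print(f"mortality, one-year update:       p = {p_mortality_updated:.3f}")
```

Under this assumption, the recomputed mortality p-values should agree with the reported ones to about two decimal places, with small differences attributable to rounding of the published hazard ratios and interval bounds.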

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study was supported by the National Institutes of Health under award numbers R01 NS098023 and R01 NS124882 from the National Institute of General Medical Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The IRB of the University of Pittsburgh Medical Center gave ethical approval for this work.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

All data produced in the present study are available upon reasonable request to the authors.
