MACE prediction using high-dimensional machine learning and mechanistic interpretation: A longitudinal cohort study in US veterans

Abstract

High-dimensional predictive models of Major Adverse Cardiac Events (MACE), which include heart attack (AMI), stroke, and death caused by cardiovascular disease (CVD), were built using four longitudinal cohorts of Veterans Administration (VA) patients created from VA medical records. We considered 247 variables (risk factors) measured across 7.5 years for millions of patients in order to compare predictions of the first reported MACE event using six distinct modelling methodologies. The best-performing methodology varied across the four cohorts. Model coefficients related to disease pathophysiology and treatment were relatively constant across cohorts, while coefficients dependent upon the confounding variables of age and healthcare utilization varied considerably across cohorts. In particular, models trained on a retrospective case-control (Rcc) cohort (where controls are matched to cases by date-of-birth cohort and overall level of healthcare utilization) emphasize variables describing pathophysiology and treatment, while predictions based on the cohort of all active patients at the start of 2017 (C-17) rely much more on age and variables reflecting healthcare utilization. Consequently, directly using an Rcc-trained model to evaluate the C-17 cohort resulted in poor performance (C-statistic = 0.65). However, a simple reoptimization of model dependence on age, demographics, and five other variables improved the C-statistic to 0.74, nearly matching the 0.76 obtained on C-17 by a C-17-trained model. The dependence of MACE risk on biomarkers for hypertension, cholesterol, diabetes, body mass index, and renal function in our models was consistent with the literature. At the same time, including medications and procedures provided important indications of both disease severity and the level of treatment. More detailed study designs will be required to disentangle these effects.
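For readers unfamiliar with the C-statistic quoted above, the following is a minimal illustrative sketch, not the authors' code. It computes the statistic for a binary outcome as the fraction of (case, control) pairs in which the case receives the higher predicted risk, counting ties as one half. The function name `c_statistic` and the toy risk scores and outcomes are assumptions made purely for illustration and are not taken from the study.

```python
def c_statistic(scores, labels):
    """Concordance (C) statistic for a binary outcome.

    Returns the fraction of (case, control) pairs in which the case has
    the higher predicted score; tied scores count as half-concordant.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data: predicted risks and observed outcomes (1 = event).
toy_risk = [0.9, 0.3, 0.35, 0.6, 0.2, 0.1]
toy_outcome = [1, 1, 0, 1, 0, 0]

print(round(c_statistic(toy_risk, toy_outcome), 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from controls, which gives a sense of scale for the 0.65, 0.74, and 0.76 values reported above. The pairwise loop is quadratic in cohort size, so it is only suitable for small illustrations.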

Competing Interest Statement

I have read the journal's policy and the authors of this manuscript have the following competing interests: CJO is an employee of Novartis Institute for Biomedical Research.

Funding Statement

BHM and SD were awarded funding from the MVP Champion initiative, which was a Congressional allocation to VA & DOE, described at: https://www.energy.gov/articles/doe-and-va-team-improve-healthcare-veterans and administered through the Million Veteran Program at the VA ORD. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This research was conducted with the written approval of the VA Central IRB with project number MVP014.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

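All calculations underlying the figures will be made available in the Supplementary Information (SI). The underlying electronic medical records (EMR) contain protected health information (PHI) and cannot be made available.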
