Systolic blood pressure, chronic obstructive pulmonary disease and cardiovascular risk

Introduction

Systolic blood pressure (SBP) is a well-known risk factor for cardiovascular diseases.1–3 However, in subgroups with complex underlying health conditions, the association of SBP with cardiovascular outcomes is less well understood. Often, in these patient groups, a so-called J-shaped association is reported, where the association between SBP and risk of cardiovascular events has an optimum, above and below which the risk increases.4 5

In patients with chronic obstructive pulmonary disease (COPD), the relationship remains unclear. Independently, SBP and COPD have both been associated with a higher risk of cardiovascular disease (CVD).2 3 6 7 However, conclusive evidence on the relationship between SBP and risk of cardiovascular end points in patients with COPD is lacking. A J-shaped association between SBP and risk of cardiovascular events was found in a previous observational analysis using traditional statistical modelling in patients with COPD who were at risk of developing CVD.4 However, observational studies using conventional statistical modelling might be limited in investigating this question. The adjustment variables must be chosen manually and their relationships assumed by researchers, which exposes models to residual confounding. Additionally, in subgroups of patients with multiple comorbidities at baseline and a large number of interacting risk and protective factors, confounding factors are less well understood; as a result, conventional statistical models with insufficient adjustment can produce confounded or spurious J-shaped associations.2 8–10

With the availability of comprehensive electronic health records (EHR) and the advancement of deep learning (DL) causal modelling, the opportunity for more accurate modelling of associations among subgroups with poorer health has arisen.10–12 While traditional modelling requires manual confounder selection, DL approaches such as Targeted Bidirectional EHR Transformer (T-BEHRT) automatically extract latent features that are confounding the association and more accurately estimate risk ratio (RR) in observational settings.10 12

In this study, we applied the T-BEHRT model to evaluate the association between SBP and risk of cardiovascular outcomes in a cohort of 39 602 patients with COPD.

Methods

Study setting and participants

We used retrospective anonymised EHR data from Clinical Practice Research Datalink (CPRD), an EHR database representative of the UK population that has been validated for epidemiological research.13 14 We used EHR from two data sources within CPRD to identify a cohort of 39 602 individuals with COPD: primary care and secondary care (Hospital Episode Statistics (HES)). Those between 55 and 90 years of age with at least one blood pressure measurement taken between the years 1990 and 2009 were included in this study with index date (baseline) being defined as the date of the first SBP measurement in this time period (online supplemental figure S1). COPD was identified at baseline using phenotyping methods validated for use on CPRD data.15

Figure 1

Cohort selection flow chart. Process for selecting cohort used in the study of the association between systolic blood pressure (BP) and risk of cardiovascular events in patients with chronic obstructive pulmonary disease (COPD) using observational data from the Clinical Practice Research Datalink (CPRD) database.

This cohort study followed the Strengthening the Reporting of Observational Studies in Epidemiology reporting guidelines.

Exposures

The exposure variable in this study was SBP, derived from the CPRD measurements dataset. Blood pressure measurement data are recorded by staff at the general practice (GP) during a visit/consultation.14 We extracted SBP values and excluded measurements <50 and >300 mm Hg, as recommended by previously published methods for cleaning measurement data.16 Next, the exposure status for each patient was calculated as the mean of the SBP measurements in the first 12 months after baseline (ie, the exposure period). Patients were categorised into six exposure categories based on this mean SBP over the exposure period: <120 mm Hg, 120–129 mm Hg (reference), 130–139 mm Hg, 140–149 mm Hg, 150–159 mm Hg and ≥160 mm Hg.
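The exposure derivation described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 365-day approximation of the 12-month exposure period and the half-open handling of non-integer means (eg, a mean of 129.5 mm Hg falling in the 120–129 group) are assumptions made for illustration.

```python
from datetime import date, timedelta
from statistics import mean


def sbp_category(mean_sbp):
    """Map a mean SBP (mm Hg) to one of the six exposure groups.

    Assumption: non-integer means are binned with half-open intervals,
    e.g. [120, 130) is labelled "120-129".
    """
    if mean_sbp < 120:
        return "<120"
    if mean_sbp < 130:
        return "120-129"  # reference group in the study
    if mean_sbp < 140:
        return "130-139"
    if mean_sbp < 150:
        return "140-149"
    if mean_sbp < 160:
        return "150-159"
    return ">=160"


def exposure_status(measurements, baseline):
    """Return (mean SBP, category) for one patient, or None if no valid data.

    measurements: list of (date, sbp_mm_hg) tuples.
    Assumption: the 12-month exposure period is approximated as 365 days.
    """
    window_end = baseline + timedelta(days=365)
    valid = [
        sbp
        for day, sbp in measurements
        # keep readings inside the exposure period and the 50-300 mm Hg range
        if baseline <= day < window_end and 50 <= sbp <= 300
    ]
    if not valid:
        return None
    average = mean(valid)
    return average, sbp_category(average)
```

For example, a patient with readings of 150 and 130 mm Hg inside the exposure period, plus an implausible 400 mm Hg reading and one reading taken after the period, would be assigned a mean of 140 mm Hg and the 140–149 mm Hg group.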

Outcomes

The primary outcome was fatal/non-fatal CVD, defined as a composite of ischaemic heart disease (IHD), heart failure, stroke and cardiovascular-related death. Secondary outcomes were the individual components of the primary outcome: (1) IHD, (2) heart failure and (3) stroke. We identified cardiovascular events using three data sources in CPRD: (1) primary care, (2) secondary care (HES) and (3) the Office for National Statistics (cause-specific mortality) using previously published phenotyping algorithms.15 Read codes were used to identify the conditions in the primary care setting, while International Classification of Diseases 10th Revision codes were used to identify cases in the secondary care and mortality settings. The follow-up period started 1 year from baseline (ie, following the exposure period). Events within 5 years of the start of follow-up (ie, between 1 year and 6 years after baseline) were captured for analysis; this design avoids estimating associations in a period that overlaps with the exposure period (ie, the first 12 months following baseline). Those who had events or left the study within the first 12 months following baseline were removed from the analysis, consistent with similar past studies.10
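The timing rules for the exposure and follow-up periods can be sketched as follows. This is an illustrative reading of the design rather than the study's implementation: the function name, the return labels and the use of 365-day years are assumptions.

```python
from datetime import date, timedelta

EXPOSURE_DAYS = 365        # assumed length of the 12-month exposure period
FOLLOW_UP_DAYS = 5 * 365   # assumed length of the 5-year follow-up window


def classify_patient(baseline, first_event=None, exit_date=None):
    """Return 'excluded', 'event' or 'no_event' for one patient.

    first_event: date of the first cardiovascular event, if any.
    exit_date: date the patient left the study, if any.
    """
    follow_up_start = baseline + timedelta(days=EXPOSURE_DAYS)
    follow_up_end = follow_up_start + timedelta(days=FOLLOW_UP_DAYS)

    # Patients with an event, or who left, during the exposure period are removed.
    if first_event is not None and first_event < follow_up_start:
        return "excluded"
    if exit_date is not None and exit_date < follow_up_start:
        return "excluded"

    # Only events between 1 and 6 years after baseline count as outcomes.
    if first_event is not None and first_event < follow_up_end:
        return "event"
    return "no_event"
```

Under these assumptions, a patient with a baseline of 1 January 2000 and a first event in June 2000 is excluded, one with a first event in 2003 is counted as an event, and one with a first event in 2007 contributes no event to the analysis.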

Statistical and deep learning analyses

For analyses of the primary and secondary outcomes, we implemented the DL model T-BEHRT.12 The T-BEHRT model is a DL approach that uses minimally processed EHR to estimate RR more accurately than other statistical and DL benchmark models.12 The model incorporates EHR records, specifically diagnoses and medications (both longitudinal in nature), along with a few static attributes of the patient (ie, sex, smoking status) and adjusts for confounding features in the medical history of the patient (online supplemental figure S2).12 In addition to adjusting for confounders and estimating risk of outcome, the T-BEHRT model estimates the probability of being assigned to a particular exposure status (propensity score).12 17 By conducting both outcome and propensity score prediction, the DL framework allows doubly robust estimation using propensity score modelling in order to limit issues of selection bias (further information in online supplemental methods).17
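To illustrate how outcome predictions and propensity scores can be combined in a doubly robust way, the sketch below uses a generic augmented inverse probability weighting (AIPW) estimator for a single exposure group versus the reference group. The estimator actually used with T-BEHRT is described in the online supplemental methods and may differ; AIPW is swapped in here purely for illustration, and the function and argument names are hypothetical.

```python
def aipw_risk_ratio(y, a, e, m1, m0):
    """Generic AIPW risk ratio for one exposure group versus the reference.

    y:  observed binary outcomes (0/1)
    a:  exposure indicator (1 = exposure group, 0 = reference group)
    e:  estimated propensity scores P(a = 1 | history), strictly in (0, 1)
    m1: predicted outcome risk if the patient were in the exposure group
    m0: predicted outcome risk if the patient were in the reference group
    Illustrative only; not necessarily the estimator used in the study.
    """
    n = len(y)
    # Outcome-model prediction plus a propensity-weighted residual correction.
    mu1 = sum(m1[i] + a[i] * (y[i] - m1[i]) / e[i] for i in range(n)) / n
    mu0 = sum(
        m0[i] + (1 - a[i]) * (y[i] - m0[i]) / (1 - e[i]) for i in range(n)
    ) / n
    return mu1 / mu0
```

The estimate remains consistent if either the outcome model or the propensity model is correctly specified, which is the sense in which such estimators are called doubly robust.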

Figure 2

Forest plot of risk ratio estimates of the Targeted Bidirectional EHR Transformer model with 95% CIs for the association of systolic blood pressure and the primary outcome. From the left, the six exposure groups are shown in first column. Number of events and total number of patients in each exposure group is shown in second column. The forest plot and corresponding risk ratio estimates are shown in the right-most column relative to the reference class, 120–129 mm Hg. The effect size is plotted on a logarithmic scale. For the reference category, there is no CI.

To compare our DL approach against established statistical modelling, we implemented logistic regression (LR) modelling to investigate the association between SBP and risk of cardiovascular outcomes in those with COPD. The SBP exposure group was included as a categorical variable. Since our work was motivated by the findings of Byrd et al, we adjusted for the same variables as those chosen in their research: sex, age, body mass index (BMI), smoking status (current, former, never a smoker), beta-blocker use, long-acting beta-agonist (LABA) use and inhaled corticosteroid use.4 In a second LR model with an expanded set of predictors including known cardiovascular risk factors, we additionally adjusted for triglycerides (TG), low-density lipoprotein (LDL), total cholesterol (TC), atrial fibrillation, rheumatoid arthritis, severe mental illness (psychosis, schizophrenia or bipolar disorder), chronic kidney disease and diabetes. Diagnoses and medication use were identified using validated phenotyping algorithms.15 18 19 For BMI, TC, TG and LDL, the average of the measurements recorded in the 36 months before baseline was computed to minimise issues of random measurement error.2 20 We imputed missing values to ensure a fairer comparison with the DL approach. Multiple imputation using chained equations was implemented (15 imputations) to impute the continuous and categorical variables with missing values: BMI, TC, TG, LDL and smoking status. Estimation of RR was conducted using the direct standardisation method (further elaboration in online supplemental methods).21
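One common way to obtain RR from a fitted logistic regression by direct standardisation is sketched below: each exposure category is assigned in turn to every patient, the predicted risks are averaged over the cohort, and the averages are divided by that of the reference category. The exact procedure used in the study is given in the online supplemental methods; the function names and all coefficient values here are hypothetical.

```python
import math


def predict_risk(intercept, exposure_coef, covariate_coefs, covariates):
    """Predicted outcome probability from a logistic regression model."""
    linear = intercept + exposure_coef + sum(
        c * x for c, x in zip(covariate_coefs, covariates)
    )
    return 1.0 / (1.0 + math.exp(-linear))


def standardised_risk_ratios(intercept, exposure_coefs, covariate_coefs,
                             cohort, reference="120-129"):
    """Risk ratios by direct standardisation over the whole cohort.

    exposure_coefs: dict mapping exposure category to its log-odds
                    coefficient (0.0 for the reference category).
    cohort: list of covariate vectors, one per patient.
    """
    risks = {}
    for category, coef in exposure_coefs.items():
        # Assign every patient to this category and average predicted risk.
        risks[category] = sum(
            predict_risk(intercept, coef, covariate_coefs, x) for x in cohort
        ) / len(cohort)
    return {category: r / risks[reference] for category, r in risks.items()}
```

Because the predicted risks are averaged over the same covariate distribution for every category, the resulting ratios compare categories as if each had the covariate mix of the full cohort. With multiply imputed data, the calculation would typically be repeated in each imputed dataset and the estimates pooled.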

Five sensitivity analyses were conducted in our study using the T-BEHRT model. First, we investigated the association of SBP and cardiovascular risk in patients who had not taken antihypertensives during the follow-up period. Antihypertensives are established medications for lowering high blood pressure and thereby potentially attenuate cardiovascular risk; hence, we conducted this sensitivity analysis to investigate the undiluted association between SBP and risk of cardiovascular outcomes in patients with COPD.22 Second, to investigate the effects of time period, we limited the investigation to those with baseline after 1 January 2001. Third and fourth, to mitigate issues of reverse causality, we investigated the primary outcome excluding individuals who had cardiovascular events in the first 12 and 24 months of the follow-up period, respectively. Fifth, to investigate the association in smokers, we limited the analysis to current and former smokers in the cohort.

Patient and public involvement

Patients were not involved in this research for the development of the research question, exposure definition or the outcome definition. They were not involved in any form for any possible recruitment, design or implementation of this study. There are no current plans to involve patients in the dissemination stage of this study.

Discussion

Using a DL approach for longitudinal EHR, we found that SBP was monotonically associated with cardiovascular risk in 39 602 patients with COPD. Individuals with SBP <120 mm Hg were found to have the lowest risk of both the primary and secondary outcomes with little material deviation in the trends found in the sensitivity analyses.

SBP is established to be log-linearly associated with cardiovascular risk in the general population, including at values naturally below the average blood pressure in industrialised communities.3 23 24 However, in groups with prior CVDs and associated risk factors, the relationship remains insufficiently described. In high-risk patients, such as those with diabetes, IHD and other risk factors at study entry, many observational studies reject the monotonic relationship between SBP and cardiovascular risk and report a J-shaped pattern instead.4 5 25 However, these observational studies are criticised for improperly dealing with manifestations of reverse causality and confounding. With cardiometabolic multimorbidity at baseline more prevalent in those with lower SBP than in those with higher SBP, additional variables capturing this poor baseline health and associated cardiovascular illnesses must be included for adjustment. Given the currently insufficient understanding of risk and protection in multimorbid patients, relying solely on expert selection of known confounders (eg, gender, age, BMI, known risk factors of CVD) exposes the modelling to residual confounding.26 As a result, unadjusted confounding due to multimorbidity in lower SBP groups can produce the J-shaped pattern, in which an optimum exists such that SBP both below and above it is associated with higher cardiovascular risk.4 27

In our own implementation of conventional regression modelling, adjusting for predictors as previously defined in Byrd et al, the results captured this described J-shaped pattern and rejected the established log-linear relationship between SBP and risk of cardiovascular outcomes.3 4 Even the fully adjusted LR model with the expanded set of predictors resulted in a non-monotonic trend across analyses of both primary and secondary outcomes.

Implementing the DL approach for assessing the studied association directly confronted these modelling issues. By using minimally processed diagnoses and medications data in routine clinical EHR, our DL approach accounts for a breadth of risk factors potentially confounding the exposure-outcome relationship. In our cohort with COPD and cardiometabolic multimorbidity at baseline, in which traditional approaches failed to sufficiently capture confounding factors in observational data, our approach was appropriately implemented to model the association between SBP and risk of cardiovascular events.

The monotonic association found in this work raises important clinical questions for cardiovascular care. What is the optimal SBP in patients with COPD? Does this threshold differ from the recommendations for the general population (<120 mm Hg)? While guidelines for hypertension indeed endorse blood pressure lowering in patients with concomitant COPD and high blood pressure, the recommendations suggest a treatment target of <130 mm Hg (<140 mm Hg in the elderly).28 Our results showed the lowest risk at SBP of <120 mm Hg, consistent with the established log-linear understanding of the association between SBP and cardiovascular risk.

Naturally, our investigation does not answer questions relating to antihypertensive treatment effects. Hence, while our study in isolation is insufficient for recommending revisions of hypertension guidelines, it sheds light on the aetiological relationship between SBP and CVD in those with COPD. This is especially important because randomised evidence on blood pressure-lowering therapies in patients with COPD is unavailable and likely to remain so in the near future. Two further steps are needed to comprehensively capture the relationships between blood pressure, antihypertensives and CVD risk in patients with COPD: (1) external validation of the studied association and (2) in-depth investigations of the association between antihypertensives and CVD (in at least the observational capacity). Our investigation serves as one source of well-adjusted evidence.

Strengths and limitations

First, in terms of data, the comprehensive information provided by CPRD is a strength of our research. The linkage capabilities of CPRD allow the capture of rich health encounters (eg, diagnoses, medications, measurements, static attributes) from various sources including primary care, secondary care and mortality-based datasets. With access to rich EHR, our DL approach could better extract confounders, both known and latent, in routine clinical data, as shown in past investigations of SBP and CVD risk in high-risk patients.10 12 Second, with access to repeated SBP measurements, we were able to derive a summary value (the mean of multiple SBP measurements), limiting issues of measurement error.20 Third, we were able to capture many more patients than prior studies investigating this association and, unlike previous studies of SBP and cardiovascular risk, we included older patients and those with cardiovascular multimorbidity at baseline.4 Exclusion from our study was limited, thereby allowing understanding of the association of SBP and cardiovascular outcomes in high-risk subgroups with COPD. Fourth, rich longitudinal data in CPRD afforded us the opportunity to follow patients for a median of 3.9 years, as opposed to the prior exploration of this association in patients with COPD, which reported a median follow-up of 1.9 years.4 With a longer follow-up period, potential biases in RR estimation due to reverse causation are mitigated. Fifth, we explored various sensitivity analyses to understand the role of unforeseen biases (eg, reverse causality) and to supplement the main results.

In terms of modelling, a strength of our work is the DL approach, which is capable of extracting and adjusting for confounding factors in rich annotated EHR.10 12 Additionally, we implemented two varieties of the conventional statistical approach with validated predictor sets, allowing direct comparison with the DL approach.4 By using superior confounding adjustment methods, we demonstrated the utility of DL modelling, ultimately rejecting the evidence of a J-shaped relationship.

In terms of limitations, while EHR data in CPRD have some degree of diagnostic recording error, past studies have validated the primary care, secondary care and mortality-based sources within the CPRD database for observational research.11 14 15 SBP variability is also a concern; we attempted to ameliorate issues of random measurement error by taking an average of repeat measurements over the 12 months following baseline, as recommended by previous works.20 Furthermore, more accurate consideration of the outcome and censoring with time-to-event modelling is needed. Given the nascent stage of deep survival modelling for EHR, further methodological innovation is required to fuse DL-based causal models with survival modelling frameworks.29 Methods that can interpret how T-BEHRT captures confounding would also be useful for fully characterising DL estimation processes. While the importance of adjusted variables can be readily assessed in the conventional approach, auxiliary methods to extract and decompose the confounders captured by T-BEHRT into explicit medical history variables would lend insight into shared risk factors of blood pressure and CVD. In terms of adjustment, while overadjustment (collider variable adjustment and M-structure bias) is a theoretical concern, empirical research has shown that conditioning on all pre-exposure variables in similar types of EHR studies does not lead to biased estimates.30 Additionally, we attempted to further mitigate this potential issue by defining a clear baseline and adjusting only for information recorded up to baseline. Lastly, as is true of all observational studies, residual confounding cannot be completely ruled out even with more complex confounding adjustment approaches (eg, T-BEHRT).
