Relative contributions of six lifestyle- and health-related exposures to epigenetic aging: the Coronary Artery Risk Development in Young Adults (CARDIA) Study

Study samples

The study participants were from the Coronary Artery Risk Development in Young Adults (CARDIA) study, a multicenter prospective cohort study examining the development and determinants of cardiovascular disease in young adults. At study baseline (Year 0; 1985–1986), 5115 Black and White adults (based on participants’ self-reported race) aged 18–30 years were enrolled across four field centers: Birmingham, AL; Chicago, IL; Minneapolis, MN; and Kaiser Permanente Health Plan in Oakland, CA. Participants have been examined nine times, at baseline and at eight follow-up visits, in study years (Y) Y0, Y2, Y5, Y7, Y10, Y15, Y20, Y25, and Y30. More details about the design and recruitment of the CARDIA study are described elsewhere [44]. In the current study, we included participants with complete information on DNA methylation and on six lifestyle- and health-related exposures measured at Y20: alcohol consumption, education, diet, physical activity, sleep, and smoking. Of 3549 participants in CARDIA Y20, 1200 with available whole blood samples were randomly selected for DNA methylation measurement. After excluding samples with low DNA amount or poor quality, DNA methylation levels were measured in 1092 participants, and the final methylation dataset included 957 participants after quality control (QC) procedures. Of these 957 participants, those with complete information on all six component variables were retained, giving a final analytic data set of 744 participants.

Lifestyle- and health-related components in CARDIA study

In the current study, we investigated six lifestyle- and health-related exposures as of Y20: smoking, alcohol consumption, education, sleep, diet, and physical activity. Smoking was defined as cumulative packs of cigarettes by Y20, which was calculated as the total number of cigarette packs over the study years. For example, smoking from Y15 to Y20 was calculated by taking the mean of the self-reported daily numbers of cigarettes at Y15 and Y20, multiplying by 5 (the interval in years), and dividing by 20 (the number of cigarettes in a pack). We repeated this calculation for each exam interval and summed the results. We defined alcohol consumption by Y20 as the sum, over each pair of consecutive examinations, of the average alcohol consumption multiplied by the time interval (in years) between them. For example, alcohol consumption from Y15 to Y20 was calculated by taking the mean of self-reported alcohol intakes (in ml/day) at Y15 and Y20 and multiplying by 5. We repeated this calculation for each interval (Y0-Y2, Y2-Y5, Y5-Y7, Y7-Y10, Y10-Y15, Y15-Y20) and summed the results to produce a measure of alcohol consumption from Y0 to Y20. For education, we used the self-reported maximum years of education at Y20. Sleep information was obtained from the Sleep Habits questionnaire at Y15 and Y20, which asked participants to report their hours of actual sleep during the month preceding the exam. We used the mean of the sleep hours reported at Y15 and Y20 for the analysis. Diet quality was measured with Healthy Eating Index (HEI) scores [45], derived from participants’ self-reported dietary history at Y0, Y7, and Y20. To calculate diet quality over the study years, we used the mean of the HEI scores across the exam years available for each participant [46]. Finally, for participants' physical activity levels, we used total intensity scores obtained from the self-reported Physical Activity questionnaire at Y0, Y2, Y5, Y7, Y10, Y15, and Y20. To calculate cumulative total intensity scores, we summed the average total intensity score for each pair of consecutive examinations multiplied by the time interval between them, following the same approach as for alcohol consumption.
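The interval-based accumulation described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function names and the example values are hypothetical, the handling of missing examinations is not specified in the text and is not modeled here, and the pack conversion assumes that the same interval-mean rule is applied to daily cigarettes.

```python
def cumulative_exposure(level_by_year):
    """Sum, over consecutive exam pairs, of the mean level times the interval in years.

    level_by_year maps an exam year (e.g. 15, 20) to the self-reported level
    at that exam (e.g. alcohol intake in ml/day).
    """
    years = sorted(level_by_year)
    total = 0.0
    for start, end in zip(years, years[1:]):
        mean_level = (level_by_year[start] + level_by_year[end]) / 2
        total += mean_level * (end - start)
    return total


def cumulative_packs(cigarettes_per_day_by_year, cigarettes_per_pack=20):
    """Assumption: the interval-mean rule applied to cigarettes/day, divided by 20."""
    return cumulative_exposure(cigarettes_per_day_by_year) / cigarettes_per_pack


# Hypothetical participant: 10 ml/day at Y15 and 20 ml/day at Y20.
alcohol_y15_y20 = cumulative_exposure({15: 10.0, 20: 20.0})   # mean 15 ml/day over 5 years
smoking_y15_y20 = cumulative_packs({15: 20.0, 20: 20.0})      # one pack a day over 5 years
print(alcohol_y15_y20, smoking_y15_y20)
```

Passing all seven exam years (Y0 through Y20) to the same function yields the Y0 to Y20 totals used for alcohol consumption and physical activity.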

Quality control of DNA methylation profiling and calculation of epigenetic age acceleration (EAA)

We performed QC with the R packages minfi [47] and ENmix [48] on the DNA methylation profiles of the 1092 participants in whom methylation was measured. We excluded CpGs with a detection rate below 95%. We also excluded samples in which more than 5% of methylation measurements were of low quality or in which the mean intensity of the bisulfite conversion probes was more than 3 standard deviations below the mean. We further applied Tukey's method to detect and exclude outlier samples [49]. After QC, the data were preprocessed with the preprocessIllumina function in the minfi package [47].

We used two EAA measurements, GrimAA and PhenoAA, to assess the association between the collective lifestyle- and health-related components and epigenetic aging. We focused on GrimAA and PhenoAA because these two recently developed measurements were designed to predict healthspan and have shown stronger associations with health outcomes than the first-generation EAA measurements [18, 19, 24]. GrimAge and PhenoAge were calculated with the published algorithms [18, 19], using Horvath's online DNA Methylation Age Calculator (https://dnamage.genetics.ucla.edu) and participants' DNA methylation data from the CARDIA study. GrimAA and PhenoAA were then derived as the residuals from regressing GrimAge and PhenoAge, respectively, on chronological age; these residuals capture the difference between epigenetic age and the age expected from chronological age.
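The residual-based definition of age acceleration can be illustrated with a simple least-squares fit. The sketch below assumes an ordinary linear regression of one epigenetic age measure on chronological age with no further covariates; the function name and the example ages are hypothetical.

```python
from statistics import mean


def age_acceleration(epigenetic_age, chronological_age):
    """Residuals from a least-squares regression of epigenetic age on chronological age."""
    x_bar = mean(chronological_age)
    y_bar = mean(epigenetic_age)
    sxx = sum((x - x_bar) ** 2 for x in chronological_age)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(chronological_age, epigenetic_age))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # A positive residual means the epigenetic age is higher than expected
    # for that chronological age.
    return [y - (intercept + slope * x)
            for x, y in zip(chronological_age, epigenetic_age)]


# Hypothetical values for four participants.
chronological = [40.0, 45.0, 50.0, 55.0]
epigenetic = [43.0, 44.0, 54.0, 55.0]
print(age_acceleration(epigenetic, chronological))
```

Because the residuals come from a fit with an intercept, they average to zero across the sample and are uncorrelated with chronological age.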

Statistical analysis

We performed descriptive analyses to explore the distribution of participants' chronological age, GrimAge, and GrimAA, as well as their lifestyle- and health-related exposures and potential confounders (body mass index [BMI, kg/m2], sex, race, and field center). Each of the six components was log-transformed and then standardized to have a mean of 0 and a standard deviation (SD) of 1. We also assessed Spearman correlation coefficients among the six individual components of interest. To assess the relative contributions and collective associations of the six exposures, we used quantile-based g-computation (QGC) and Bayesian kernel machine regression (BKMR). For comparison, we conducted additional conventional analyses with linear regression models that included all six components as independent variables and each EAA as the outcome. For consistency with the QGC and BKMR analyses, the linear regression models used the log-transformed and standardized (mean = 0, SD = 1) component variables and adjusted for the same covariates as the other two approaches. The slope coefficients from the linear models were interpreted as the mean change in EAA per 1-SD increase in the corresponding component. Statistical significance was defined as a p value < 0.05.
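The transformation applied to each component can be sketched as follows. The text does not state how zero values (for example, never smokers) were handled before taking logarithms, so the `offset` parameter below is a hypothetical placeholder rather than the authors' choice; the use of the sample SD is likewise an assumption.

```python
import math
from statistics import mean, stdev


def log_standardize(values, offset=0.0):
    """Log-transform, then center and rescale to mean 0 and SD 1.

    offset is a hypothetical constant added before the log so that zeros
    do not raise an error; the source does not specify one.
    """
    logged = [math.log(v + offset) for v in values]
    centre = mean(logged)
    spread = stdev(logged)  # sample SD (n - 1 denominator), an assumption
    return [(x - centre) / spread for x in logged]


# Hypothetical cumulative alcohol values for four participants.
print(log_standardize([1.0, 2.0, 4.0, 8.0]))
```

After this step, a regression slope for a component reads as the change in EAA per 1-SD increase on the log scale.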

We used SAS version 9.4 (SAS Institute Inc., Cary, NC) for descriptive analyses and traditional regression models, and R version 4.0.3 (R Core Team, 2020) for QGC and BKMR analyses. Because we observed stronger associations with GrimAA, we present GrimAA in our main results and PhenoAA in our supplementary tables for all analyses.

Quantile-based g-computation (QGC)

In QGC, each continuous component is transformed into its quantized version, \(X_{ji}^{q}\) (coded as 0, 1, 2, or 3), and fitted to a linear model as follows:

$$Y_{\mathrm{EAA}_i} = \beta_{0} + \psi \sum_{j=1}^{6} w_{j} X_{ji}^{q} + \epsilon_{i} = \beta_{0} + \sum_{j=1}^{6} \beta_{j} X_{ji}^{q} + \epsilon_{i}$$

where \(\beta_{j}\) is the effect size for the jth component, and \(\epsilon_{i}\) is the error term. The estimate of collective association, \(\psi = \sum\nolimits_{j=1}^{6} \beta_{j}\), is interpreted as the change in EAA per one-quartile change in all six components, controlling for covariates (age, sex, race, BMI, and field center). The weight for the kth component is defined as \(w_{k} = \beta_{k} / \sum\nolimits_{j=1}^{6} \beta_{j}\) when the directions are the same across all components. If the directions differ, the weights are defined separately for each direction, and thus sum to 1.0 for the positive direction and to −1.0 for the negative direction. The R package qgcomp [27] was used to obtain point estimates and 95% confidence intervals (95% CI) for QGC analyses.
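To make the quartile coding and the weight definition concrete, the following sketch codes a variable into 0 to 3 and derives the collective estimate and weights from a set of already-fitted coefficients. It is an illustration only and does not call the qgcomp package: the function names and coefficient values are hypothetical, the tie handling at the cut points is an assumption, and the model fitting itself is omitted.

```python
from statistics import quantiles


def quartile_code(values):
    """Code each value as 0, 1, 2, or 3 by the number of quartile cut points it exceeds."""
    cuts = quantiles(values, n=4, method="inclusive")  # three cut points
    return [sum(v > c for c in cuts) for v in values]


def qgc_summary(betas):
    """Collective estimate psi and direction-specific weights from component coefficients."""
    psi = sum(betas.values())
    positive_total = sum(b for b in betas.values() if b > 0)
    negative_total = sum(b for b in betas.values() if b < 0)
    weights = {}
    for name, b in betas.items():
        if b > 0:
            weights[name] = b / positive_total      # positive weights sum to 1.0
        elif b < 0:
            weights[name] = -(b / negative_total)   # negative weights sum to -1.0
        else:
            weights[name] = 0.0
    return psi, weights


# Hypothetical coefficients, not results from the study.
example_betas = {"smoking": 0.6, "alcohol": 0.2, "education": -0.3, "diet": -0.1}
psi, weights = qgc_summary(example_betas)
print(psi, weights)
```

When every coefficient has the same sign, the direction-specific totals coincide with the overall sum, so the weights reduce to each coefficient divided by psi.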

Bayesian kernel machine regression (BKMR)

We also used BKMR with a Gaussian kernel to investigate the association between the collective effect of six components and EAA. The equation of the BKMR model for this study can be represented as follows:

$$Y_{\mathrm{EAA}_i} = h\left( \mathbf{z}_{i} \right) + \mathbf{x}_{i}^{\mathrm{T}} \beta + \epsilon_{i}$$

where \(\mathbf{z}_{i} = \left( z_{i1}, \ldots, z_{i6} \right)\) is a vector of the six component variables for the ith participant, \(\mathbf{x}_{i}^{\mathrm{T}}\) is the vector of covariates (age, sex, race, BMI, and field center) for that participant, β is the vector of corresponding coefficients for the covariates, and \(\epsilon_{i}\) is the error term. In the context of this study, h() represents the unknown exposure–response relationship between the combination of components and EAA, which may incorporate nonlinearity and nonadditivity. The single exposure–response relationship for each component was defined as the relationship between that component and EAA with all other components fixed at their medians, controlling for the covariates. In BKMR, a kernel machine is used to specify the unknown exposure–response relationship h(), and component-wise variable selection with a Gaussian kernel within a Bayesian paradigm allows calculation of a posterior inclusion probability (PIP) for each component [28]. The estimate of collective association was defined as the change in mean EAA when all six components are fixed at their 75th percentiles compared with when all six are fixed at their 25th percentiles, controlling for covariates [50]. For all BKMR models, we ran the Markov chain Monte Carlo (MCMC) sampler for 50,000 iterations. The R package bkmr [50] was used to obtain point estimates and 95% credible intervals (95% CrI) and to produce plots for BKMR analyses.
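The role of the Gaussian kernel and of component-wise selection can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the bkmr implementation: it assumes the kernel takes the form exp(-Σ r_m (z_m - z'_m)²) with one non-negative weight r_m per component, and that a PIP is estimated as the share of MCMC draws in which a component's inclusion indicator equals 1. The function names and the toy draws are hypothetical.

```python
import math


def gaussian_kernel(z_a, z_b, r):
    """Similarity of two exposure profiles; r[m] = 0 removes component m from the kernel."""
    return math.exp(-sum(r_m * (a - b) ** 2 for r_m, a, b in zip(r, z_a, z_b)))


def posterior_inclusion_probabilities(indicator_draws):
    """Share of MCMC draws in which each component's inclusion indicator is 1."""
    n_draws = len(indicator_draws)
    n_components = len(indicator_draws[0])
    return [sum(draw[m] for draw in indicator_draws) / n_draws
            for m in range(n_components)]


# Two hypothetical standardized profiles that differ only in the second component.
profile_a = [0.5, -1.0, 0.0]
profile_b = [0.5, 1.0, 0.0]
print(gaussian_kernel(profile_a, profile_b, r=[1.0, 0.5, 1.0]))
print(gaussian_kernel(profile_a, profile_b, r=[1.0, 0.0, 1.0]))  # component 2 excluded

# Toy indicator draws for two components over four iterations.
print(posterior_inclusion_probabilities([[1, 0], [1, 1], [0, 0], [1, 0]]))
```

Under these assumptions, a component whose weight is frequently set to zero during sampling has little influence on h() and receives a low PIP.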

Definition of relative contributions to EAA

In this study, we defined the term ‘relative contribution’ to reflect the magnitude of importance of each component in the collective association of all six components with EAA. In QGC, the relative contribution is captured by the weight, which reflects the proportion of the effect attributable to an individual component among the components with the same direction. In BKMR, the relative contribution is captured by the PIP, which reflects the ranked importance of each component in association with EAA. A higher weight or PIP indicates greater importance in the collective association.

Sensitivity analyses

Because GrimAge incorporates DNA methylation-based surrogate markers for smoking pack-years [19], we investigated the relative contributions of lifestyle- and health-related components to GrimAA by participant smoking status at Y20. We grouped the participants into ever smokers and never smokers. Additionally, we performed separate analyses after further grouping the participants into current, former, and never smokers. As a further sensitivity analysis, we used alternative measures of physical activity, namely moderate and vigorous activity levels. We additionally performed analyses stratified by sex and race.
