An EWAS of dementia biomarkers and their associations with age, African ancestry, and PTSD

Participants and procedures

Analyses were based on existing clinical, genetic, and biomarker data collected under research protocols led by investigators at the Behavioral Sciences Division of the VA National Center for PTSD [18]. Data for this report were from 849 individuals, including 580 US military veterans and a subset of 269 of their intimate partners, collectively ranging from 19 to 75 years of age. Clinical and demographic characteristics of the sample are listed in Table 1. Each of the original studies, and the research presented in this report, was reviewed and approved by the appropriate institutional review boards. In each study, participants had blood drawn for future genetic and biomarker assays. Blood was collected in EDTA tubes, centrifuged to separate plasma, serum, and buffy coat, then aliquoted and stored at −80 °C until thawed for analysis. Participants also underwent psychiatric assessments using the Clinician-Administered PTSD Scale for DSM-IV or DSM-5 (CAPS [19, 20]) and Structured Clinical Interview for DSM-IV or DSM-5 (SCID [21, 22]), depending on the version of DSM in use at the time of study enrollment. For the CAPS, in addition to determining current diagnosis, frequency and intensity ratings of each symptom were summed to create a dimensional score reflecting current PTSD severity. We harmonized CAPS symptom severity across the two DSM versions by expressing each participant’s symptom severity as a proportion of the maximum possible severity score for the relevant version of the measure (yielding scores ranging from 0 to 1). All diagnostic interviews were video recorded. Inter-rater reliability was assessed for approximately 25% of participants; kappa for each diagnosis reported here was greater than 0.78.
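
The CAPS harmonization described above amounts to dividing each raw severity score by the maximum of the relevant instrument version. The following is a minimal sketch of that arithmetic; the function name and the maximum scores (136 for the DSM-IV version, 80 for the DSM-5 version) are assumptions made for illustration and are not stated in the text.

```python
# Maximum total severity per CAPS version. These values are ASSUMPTIONS
# for illustration; the source does not report them.
CAPS_MAX_SEVERITY = {"DSM-IV": 136.0, "DSM-5": 80.0}


def harmonized_severity(raw_score: float, version: str) -> float:
    """Return severity as a proportion (0 to 1) of the version's maximum."""
    max_score = CAPS_MAX_SEVERITY[version]
    if not 0 <= raw_score <= max_score:
        raise ValueError(f"score {raw_score} outside 0..{max_score}")
    return raw_score / max_score


if __name__ == "__main__":
    # Two hypothetical participants assessed with different versions.
    print(harmonized_severity(68, "DSM-IV"))
    print(harmonized_severity(40, "DSM-5"))
```

Scaling by the version-specific maximum puts both versions on a common 0 to 1 metric without assuming that individual items are interchangeable across versions.
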
History of psychological trauma was assessed using the Traumatic Life Events Questionnaire (TLEQ) [23], an inventory of 23 different types of traumatic experiences that were coded as positive if the participant endorsed (a) exposure to the event, and (b) experiencing “intense fear, helplessness, or horror” when it happened. Additional details regarding the samples, clinical assessments, and inter-rater reliability are available in previous reports [24, 25]. Neurocognitive function was not formally assessed; however, participation was discontinued if, in the clinician’s judgment, cognitive impairment interfered with the participant’s ability to complete the procedures. Finally, current psychiatric medication use was assessed using a self-report checklist and then classified into four categories: (a) SSRIs/SNRIs, (b) other antidepressants, (c) typical/atypical antipsychotics, and (d) sedatives, hypnotics, and anxiolytics.

Table 1 Sample descriptive statistics and bivariate correlations with the Simoa factor scores

Genotype and DNA methylation data

Genetic data were generated using methods described in prior publications [18]. Briefly, DNA was isolated on a Qiagen AutoPure instrument with Qiagen reagents and samples were normalized using PicoGreen assays (Invitrogen, Grand Island, NY, USA). Each DNA sample was run on an Illumina OMNI 2.5 microarray and scanned using an Illumina HiScan System (San Diego, CA, USA) according to the manufacturer’s protocol. Imputation was based on the 1000 Genomes Phase 3 reference panel [26]. Ancestry was determined using a pipeline [27, 28] that identified ancestral principal components using 100,000 randomly selected common single nucleotide polymorphisms (SNPs). APOE ε4 carrier status was called using the isoform-defining SNPs (rs7412 and rs429358), which were well imputed (r² = 0.96 and 0.99, respectively). We used “best guess” imputed genotypes with a 90% confidence threshold for these SNPs to derive the APOE ε4 genotypes.
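
As an illustration of how APOE genotypes can be derived from the two isoform-defining SNPs, the sketch below applies a best-guess threshold and then counts alleles. It relies on the standard definitions (ε4 carries C at rs429358; ε2 carries T at rs7412; ε3 carries neither) and assumes the very rare haplotype carrying both is absent. The function names and the input format (posterior probabilities for 0, 1, or 2 copies of the counted allele) are hypothetical and do not describe the authors' pipeline.

```python
from typing import Optional, Sequence


def best_guess_dosage(probs: Sequence[float], threshold: float = 0.9) -> Optional[int]:
    """Return the most likely allele count (0, 1, or 2), or None when the
    top posterior probability falls below the confidence threshold."""
    best = max(range(3), key=lambda k: probs[k])
    return best if probs[best] >= threshold else None


def apoe_genotype(rs429358_c: Optional[int], rs7412_t: Optional[int]) -> Optional[str]:
    """Map allele counts to an unphased APOE genotype such as 'e3/e4'.

    rs429358_c: copies of the C allele (one per e4 allele).
    rs7412_t:   copies of the T allele (one per e2 allele).
    Returns None for missing calls or combinations that would require the
    rare haplotype carrying both alleles (an assumption of this sketch).
    """
    if rs429358_c is None or rs7412_t is None:
        return None
    n_e3 = 2 - rs429358_c - rs7412_t
    if n_e3 < 0:
        return None
    alleles = ["e2"] * rs7412_t + ["e3"] * n_e3 + ["e4"] * rs429358_c
    return "/".join(alleles)


def is_e4_carrier(genotype: Optional[str]) -> Optional[bool]:
    """True if the genotype contains at least one e4 allele."""
    return None if genotype is None else "e4" in genotype


if __name__ == "__main__":
    c = best_guess_dosage((0.02, 0.95, 0.03))   # one C allele at rs429358
    t = best_guess_dosage((0.97, 0.02, 0.01))   # no T allele at rs7412
    print(apoe_genotype(c, t), is_e4_carrier(apoe_genotype(c, t)))
```

Calls that fail the threshold propagate as missing rather than being forced to the nearest genotype, which mirrors the intent of a confidence cutoff.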

DNA methylation (DNAm) studies involve measurement of a methyl group on the DNA strand at a cytosine-phosphate-guanine (CpG) site. DNAm data were generated using methods described in prior publications [29]. In brief, DNAm was measured using the Illumina Infinium Methylation EPIC BeadChips. Zymo EZ-96 DNA Methylation Kits (D5004) were used to bisulfite-convert batched samples. DNA conversion was confirmed via PCR using DAPK1 primers (Zymo) followed by gel electrophoresis of PCR products. Bisulfite-modified DNA was then whole-genome amplified, hybridized to the BeadChips, single-base extended, and stained using the Automated Protocol for the Illumina Infinium HD Methylation Assay. Assignment of individuals to chips and chip positions was balanced based on PTSD diagnosis and sex. We applied a quality control (QC) pipeline developed by the Psychiatric Genomics Consortium-PTSD Workgroup [30] prior to analysis (and recently updated as described at https://github.com/PGC-PTSD-EWAS/EPIC_QC). Proportional white blood cell (WBC) estimates (CD8-T and CD4-T cells, natural killer cells, B cells, monocytes) were calculated from the methylation data for use as covariates.

Simoa markers

Simoa assays were performed at the Quanterix Accelerator Lab (Quanterix Corporation, Billerica, MA) using plasma samples. Samples were thawed and diluted per manufacturer’s specifications, centrifuged to remove particulates and debris, then pipetted into 96-well plates, diluted 4×, and run in duplicate. All markers were tested using the HD-1 Analyzer. Aβ40, Aβ42, GFAP, and NfL were assayed using the N4PE advantage kit (Quanterix Item #103670). pTau181 was assessed using the pTau181 advantage v2 kit (Quanterix Item #103714). We initially intended to include plasma total Tau, but preliminary analyses showed weak bivariate associations between this analyte and the other Simoa markers (rs < 0.2). Furthermore, unlike the other five markers, which are primarily brain-derived, the Tau protein is also expressed in peripheral tissues [31], and a recent study estimated that only 20% of plasma total Tau originates in the brain [32]. Total Tau was therefore not included in the analyses. Calibration was conducted with reference samples, and QC procedures included evaluation of average enzyme per bead and coefficient of variation (CV). Samples that did not pass QC procedures were re-run, when possible, with the goal of minimizing freeze/thaw cycles. Samples were excluded (0–4.5%, varying by marker) if (a) the CV was > 25%, or (b) a duplicate was not available. Remaining concentrations were then multiplied by the dilution factor (× 4) prior to analysis. Results below the functional lower limit of quantification (fLLOQ) were set to the fLLOQ, and results above the functional upper limit of quantification (fULOQ) were set to the fULOQ. In total, data from 713 participants passed the DNAm and Simoa QC procedures, and of those, 704 also had genotype data.
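
The exclusion, dilution-correction, and censoring rules described above can be sketched as a single per-sample function. This is an illustrative outline only: the function name is hypothetical, and it assumes that the reported concentration is the mean of the duplicates, that the CV is the sample standard deviation divided by the mean (in percent), and that the quantification limits are expressed on the dilution-corrected scale.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def process_duplicates(
    replicates: Sequence[Optional[float]],
    flloq: float,
    fuloq: float,
    dilution: float = 4.0,
    max_cv: float = 25.0,
) -> Optional[float]:
    """Apply the QC rules to one sample's duplicate readings.

    Returns the dilution-corrected concentration clamped to
    [flloq, fuloq], or None if the sample is excluded.
    """
    values = [v for v in replicates if v is not None]
    if len(values) < 2:          # (b) duplicate not available
        return None
    avg = mean(values)
    if avg <= 0:                 # CV undefined; treated as failing QC here
        return None
    cv = 100.0 * stdev(values) / avg
    if cv > max_cv:              # (a) CV above 25%
        return None
    concentration = avg * dilution
    return min(max(concentration, flloq), fuloq)


if __name__ == "__main__":
    # Hypothetical readings and limits, for illustration only.
    print(process_duplicates([10.0, 10.4], flloq=1.0, fuloq=1000.0))
    print(process_duplicates([10.0, 20.0], flloq=1.0, fuloq=1000.0))
```

Setting out-of-range results to the limits, rather than dropping them, keeps those participants in the analysis at the cost of some censoring at the extremes.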

Data analyses

Exploratory and confirmatory factor analyses

Exploratory (EFA) and confirmatory factor analyses (CFA) of raw values for the five Simoa markers (Aβ40, Aβ42, GFAP, NfL, pTau181) were performed using Mplus v8.5 [33]. EFA is a method for identifying the structure and dimensionality underlying the covariation of a set of variables when that structure is not known a priori. In EFA, models with different numbers of latent variables (factors) are compared to determine which model best accounts for the covariation among the variables. It is similar to principal components analysis, except that EFA distinguishes true score variance from error variance and models the two separately. CFA, in contrast, enables examination of the degree to which data fit a predefined, or a priori hypothesized, structure for the number of factors underlying the covariation of variables and the loading of individual variables on each factor. In both approaches, the fit of the model to the data is evaluated using several commonly used fit indices, including the root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), and the comparative fit and Tucker-Lewis fit indices (CFI and TLI, respectively). The Bayesian information criterion (BIC) can be used to compare the fit of competing models.

We began by conducting an EFA of the five markers in a random half of the sample and evaluated one- and two-factor solutions using geomin rotation. (With five markers, we could only evaluate one- and two-factor solutions because a model with more factors would have been statistically under-identified.) The robust maximum likelihood estimator (MLR) was used in all analyses. The results of the best-fitting (two-factor) EFA were then used to inform the structure of a CFA that we tested in the other random half of the sample. After identifying a good-fitting two-factor CFA, we executed the same model in the full sample (N = 849), again evaluated fit, then saved those factor scores for use in the EWAS. The factor scores reflect each individual’s score on the latent variables (e.g., a higher score on the latent variable would account for higher values on the individual Simoa markers that load on that factor).
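
The identification constraint noted above can be checked with the standard degrees-of-freedom formula for an exploratory factor model with p indicators and m factors, df = ((p − m)² − (p + m)) / 2. The sketch below evaluates it for five markers and also shows one simple way to draw random halves; the function names and the seed are hypothetical and do not reflect how the split was actually performed.

```python
import random
from typing import List, Sequence, Tuple


def efa_degrees_of_freedom(n_indicators: int, n_factors: int) -> int:
    """Degrees of freedom of an EFA model: ((p - m)^2 - (p + m)) / 2.

    The numerator is always even, so integer division is exact.
    A negative value means the model is under-identified.
    """
    p, m = n_indicators, n_factors
    return ((p - m) ** 2 - (p + m)) // 2


def split_half(ids: Sequence[str], seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly split participant IDs into two non-overlapping halves."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, efa_degrees_of_freedom(5, m))  # 5, 1, then -2
    efa_half, cfa_half = split_half([f"P{i:03d}" for i in range(849)])
    print(len(efa_half), len(cfa_half))
```

With five indicators, the one- and two-factor models have 5 and 1 degrees of freedom, whereas a three-factor model would have −2, which is why it cannot be estimated.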

Epigenome-wide association analyses (EWAS)

We performed EWASs of scores from the two factors using linear models in the Bioconductor limma (Linear Models for Microarray Data) package [34], with the base 2 logit-transformed methylated proportion (known as the M value) as the response and the factor score as the predictor. Each EWAS included the following covariates: the top three ancestry principal components, age, sex, estimates of WBC proportions, a categorically coded batch variable representing the methylation project that each sample was assayed under, and a DNAm-based smoking score. The smoking score, which we have previously shown to be an important covariate in DNAm association analyses [29], was based on effect-size estimates for the top 39 probes from a smoking EWAS [35]. We computed false discovery rate (FDR) corrected p values [36], also known as Q values, to control for multiple testing (denoted “padj”). Finally, we examined the genes corresponding to the top 500 sites from each EWAS for enrichment of specific gene ontology (GO) term categories using the gometh function from the R missMethyl package [37]. This function is an extension of the GOseq method [38], which explicitly models the relationship between the number of CpG sites assessed within a gene and the probability of that gene appearing within the target list.
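
Two of the computations named above are simple enough to illustrate directly: the M value, M = log2(β / (1 − β)), where β is the methylated proportion, and the Benjamini-Hochberg step-up adjustment commonly used to obtain FDR-corrected p values. The sketch below is written in plain Python for illustration and does not reproduce limma's moderated statistics; the clamping constant `eps` and the function names are assumptions.

```python
import math
from typing import List, Sequence


def m_value(beta: float, eps: float = 1e-6) -> float:
    """Base 2 logit of a methylated proportion.

    beta is clamped to [eps, 1 - eps] to avoid division by zero;
    the clamp is an assumption of this sketch.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def bh_adjust(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p values, returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    print(m_value(0.5))   # a half-methylated site maps to 0
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

M values are unbounded and closer to homoscedastic than raw proportions, which is why they are generally preferred as the response in linear models.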

Structural equation model

Structural equation modeling (SEM) is a multivariate analytic method for simultaneously estimating the strength of associations between latent variables (e.g., in this case, the Simoa factors) and other observed variables in a causal structure containing direct (regressive) and/or indirect (mediated) paths in a single analysis. It is like path analysis, but involves paths between latent variables, as opposed to between observed variables. As with the factor analyses, this was performed in Mplus. This model examined (a) the strength of associations between the independent variables (i.e., the demographic and psychiatric variables and APOE ε4 genotype) and the dependent variables (i.e., the Simoa factors), (b) the influence of the independent variables on the M values from the CpG sites identified by the EWAS, and (c) the effects of covariates relevant to each variable in the model. In addition, given the role of DNAm in mediating effects of environmental factors on many forms of gene and protein expression, we also modeled the EWAS-significant CpG sites as mediators of the associations between the independent variables and the Simoa factors. Additional file 1: Fig. S1 illustrates the full structural equation model, including the latent variables, regressive paths, factor loadings, factor correlations, and covariates. Given the large number of parameters to be estimated and evidence that Type 1 error can become inflated in SEM analyses [39], we utilized a conservative p value threshold of 0.001 for all direct regressive paths in the model. Finally, for significant CpG associations, we also evaluated the significance of the indirect (mediated) effect of the independent variable (IV) on the dependent variable (DV) via the CpG loci using the “model indirect” command in Mplus, which computed the products of (a) the effect of each IV on each CpG, and (b) the effect of each CpG on the Simoa factor, along with the p values for each indirect path.
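
To make the product-of-paths logic concrete, the sketch below computes an indirect effect from two path estimates and tests it with a first-order delta-method (Sobel-type) standard error. The text does not specify how Mplus derived the standard errors, so the choice of this test, the function name, and the example values are assumptions made for illustration.

```python
import math
from typing import Tuple


def indirect_effect(
    a: float, se_a: float, b: float, se_b: float
) -> Tuple[float, float, float, float]:
    """Indirect effect a*b with a first-order delta-method test.

    a, se_a: effect of the IV on the CpG (mediator) and its standard error.
    b, se_b: effect of the CpG on the outcome factor and its standard error.
    Returns (estimate, standard error, z, two-sided p value).
    """
    estimate = a * b
    se = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    if se == 0:
        raise ValueError("standard error is zero; test is undefined")
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return estimate, se, z, p


if __name__ == "__main__":
    # Hypothetical path estimates, not taken from the study.
    est, se, z, p = indirect_effect(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
    print(f"indirect={est:.3f} se={se:.4f} z={z:.2f} p={p:.4f}")
```

Because the sampling distribution of a product is not exactly normal, bootstrap confidence intervals are a common alternative when a more robust test of mediation is needed.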
