Distinct sex-specific DNA methylation differences in Alzheimer’s disease

Description of study datasets

Our sex-specific meta-analysis included DNA methylation (DNAm) data measured by the Illumina EPIC arrays and generated from blood samples of 889 independent subjects (447 females and 442 males) older than 65 years of age (Table 1). The samples were collected at baseline, at the 18-month follow-up in the AIBL study, and at multiple follow-up visits ranging from 6 months to 60 months in the ADNI study [38]. A total of 632 female samples (188 cases, 444 controls) and 652 male samples (239 cases, 413 controls) were included in this study. For females, the mean ages were 77 and 73 years in the ADNI and AIBL studies, respectively. Similarly, for males, the mean ages were 79 and 73 years in these two studies.

Table 1 Demographic information of the study datasets

Sex-stratified and methylation-by-sex interaction analyses identified complementary sex-specific DNA methylation differences in AD

In sex-stratified analysis, after adjusting covariate variables age, batch, and immune cell-type proportions and correcting inflation in each dataset (Methods), inverse-variance fixed-effects meta-analysis identified 2 CpGs, mapped to the PRRC2A and RPS8 genes at 5% false discovery rate (FDR) (Table 2, Fig. 1) in the analysis of female samples. No CpGs reached 5% FDR in males. At the predefined suggestive threshold of P < 1×10−5, an additional 4 and 21 CpGs were identified in males and females, respectively (Fig. 2, Supplementary Table 2, Supplementary Figures 1 and 2).
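The pooling step described above can be illustrated with a minimal sketch of inverse-variance fixed-effects meta-analysis. The function name and the per-dataset numbers are hypothetical and chosen only for illustration; the sketch does not include the covariate adjustment or inflation correction that the study applied before pooling.

```python
import math

def fixed_effects_meta(estimates, std_errors):
    """Pool per-dataset log odds ratios with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    # Two-sided P-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value

# Hypothetical log odds ratios (per 0.01 increase in beta value) for one CpG
# in two datasets, with their standard errors.
pooled, pooled_se, p_value = fixed_effects_meta([0.10, 0.14], [0.03, 0.04])
odds_ratio = math.exp(pooled)  # back-transform to the odds-ratio scale
```

Because the weights are the reciprocals of the squared standard errors, the dataset with the more precise estimate contributes more to the pooled value, and the pooled standard error is smaller than either input.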

Table 2 Results of sex-specific meta-analyses of the blood samples in ADNI and AIBL datasets. Inverse-variance weighted fixed-effects meta-analysis models were used to combine dataset-specific results from logistic regression models that included methylation beta values and covariate variables age, batch (i.e., methylation plate), and estimated immune cell-type proportions. In females, two CpGs were significant in the Alzheimer’s disease (AD) vs. cognitive normal groups comparison at 5% false discovery rate (FDR). No CpG reached 5% FDR in males. Annotations include the location of the CpG based on hg19/GRCh37 genomic annotation (Chr, position), nearby genes based on GREAT and Illumina gene annotations, and overlap with enhancer regions identified in Nasser et al. [53] study (enhancer). Odds ratios and their 95% confidence intervals (OR, 95% CI) describe changes in odds of AD (on the multiplicative scale) associated with a one percent increase in methylation beta values (i.e., increase in methylation beta values by 0.01) after adjusting for covariate variables. Direction indicates hypermethylation (+) or hypomethylation (−) in AD samples in the ADNI and AIBL datasets

Fig. 1

Sex-specific meta-analysis of female samples identified 2 CpGs significantly associated with AD diagnosis at 5% false discovery rate (FDR). a The CpG cg18020072, located on the PRRC2A gene, is significantly associated with AD diagnosis in females (P-value = 3.02 × 10−8, FDR = 0.023). b The CpG cg24276069, located on the RPS8 gene, is also significantly associated with AD diagnosis in females (P-value = 9.62 × 10−8, FDR = 0.036). FDR: false discovery rate

Fig. 2

Sex-specific DNA methylation differences associated with AD diagnosis in males and females. The X-axis shows chromosome numbers. The Y-axis shows -log10 (P-value) of CpGs associated with AD diagnosis in males (above X-axis) or in females (below X-axis). The genes corresponding to the CpGs that reached P-value < 1×10−5 (indicated by the red lines) are highlighted

For these 27 AD-associated CpGs, the odds ratios (ORs) for hypermethylated CpGs in AD ranged from 1.059 to 1.328 in females and 1.181 to 1.199 in males, and the ORs for hypomethylated CpGs ranged from 0.839 to 0.935 in females and was 0.677 for the only hypomethylated CpG in males (Supplementary Table 2). Overall, the majority of these CpGs were hypermethylated in AD subjects (22 CpGs), located outside CpG islands or shores (24 CpGs), or in distal regions located greater than 2k bp from the TSS (23 CpGs). Only 4 of these 27 CpGs were located in gene promoters (at SLC5A8, DAZAP1, C16orf89 and MYOZ1 genes). A total of 10 CpGs (all of them in females) overlapped with enhancer regions [53], which are regulatory DNA sequences that transcription factors bind to activate gene expressions [53, 63].

Using the sex-specific meta-analysis P-values for individual CpGs as input, the comb-p software [48] identified 41 differentially methylated regions (DMRs) in females and 24 different DMRs in males at 5% Sidak multiple comparisons corrected P-value (Supplementary Tables 3–6). The median numbers of CpGs in these DMRs are 5 in females and 4 in males. A total of 13 DMRs (6 in females, 7 in males) overlap with enhancer regions. These DMRs are distinct from the AD-associated CpGs; there is no overlap between the DMRs and significant individual CpGs in either females or males (Supplementary Fig. 3). Among the significant DMRs, about half of them (21/41 in females, 13/24 in males) are hypermethylated in AD (Table 3, Supplementary Tables 3–6).
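For readers unfamiliar with the Sidak correction mentioned above, the general formula is p_adj = 1 − (1 − p)^m for m tests. The sketch below implements only this textbook formula; how comb-p determines the effective number of tests for a region is not described in this section, so the value of `n_tests` used here is a hypothetical input, not the software's behavior.

```python
import math

def sidak_adjust(p_value, n_tests):
    """Textbook Sidak adjustment: 1 - (1 - p)^m, computed stably for small p."""
    return -math.expm1(n_tests * math.log1p(-p_value))

# Hypothetical region-level P-value and number of tests.
adjusted = sidak_adjust(1e-6, 1000)
is_significant = adjusted < 0.05
```

The `log1p`/`expm1` form avoids the loss of precision that `1 - (1 - p) ** m` can suffer when p is very small.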

Table 3 In sex-specific meta-analysis of the blood samples in ADNI and AIBL datasets, the top 10 most significant DMRs associated with Alzheimer’s disease diagnosis identified by comb-p software at 5% Sidak adjusted P-values. CpG direction indicates hypermethylation (+) or hypomethylation (−) in AD subjects for each CpG located within the DMR, based on effect estimate in meta-analysis. Annotations include nearby genes based on GREAT and Illumina gene annotations. Highlighted in red are promoter regions of the genes mapped by the DMRs

Interestingly, AD-associated DNA methylation differences are largely distinct between the sexes. There is no overlap between the significant CpGs (or DMRs) identified in females and males (Supplementary Fig. 4). Among the 27 significant sex-specific CpGs, there was only a weak correlation between the effect estimates (i.e., odds ratios) obtained from meta-analyses of female and male samples (Spearman correlation R = 0.100) (Supplementary Fig. 4). About a third (9 out of 27) of the CpGs are in the same direction of change in both females and males (i.e., hypermethylated across all datasets or hypomethylated across all datasets) (Supplementary Table 2).
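The comparison of female and male effect estimates can be sketched as follows. This is a plain-Python illustration of a Spearman rank correlation and a direction-concordance count; the odds ratios are made up, and concordance is simplified here to a comparison of pooled odds ratios rather than of every individual dataset.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def same_direction_count(or_female, or_male):
    """Count CpGs whose odds ratios fall on the same side of 1 in both sexes."""
    return sum((f > 1.0) == (m > 1.0) for f, m in zip(or_female, or_male))

# Hypothetical pooled odds ratios for three CpGs.
or_female = [1.20, 0.90, 1.10]
or_male = [1.10, 1.30, 0.80]
rho = spearman(or_female, or_male)
n_concordant = same_direction_count(or_female, or_male)
```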

In methylation-by-sex interaction analysis, we identified significant interactions at 5 CpGs with P < 1×10−5 (Table 4). These CpGs mapped to the MYO19, ESRRB, APLNR genes, and intergenic regions. There was no overlap between significant CpGs identified in methylation-by-sex interaction and sex-stratified analyses. To understand this discrepancy, note that the interaction analysis identifies CpGs with large differences in sex-specific effect estimates that are in different directions, but these effects might not have reached the P < 1×10−5 significance threshold in sex-stratified analysis. Therefore, the results from sex-stratified analysis and methylation-by-sex interaction analysis complemented each other.
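The relationship between the two analyses can be made explicit by writing out the interaction model. The notation below is ours and the 0/1 coding of sex is an assumption: m_i is the methylation beta value, s_i is the sex indicator, and z_i collects the covariates (age, batch, and estimated immune cell-type proportions).

```latex
\operatorname{logit} P(\mathrm{AD}_i = 1)
  = \beta_0 + \beta_1 m_i + \beta_2 s_i + \beta_3 \,(m_i \times s_i)
    + \boldsymbol{\gamma}^{\top} \mathbf{z}_i

% Odds ratio per 0.01 increase in methylation beta value:
\mathrm{OR}_{s=0} = \exp(0.01\,\beta_1), \qquad
\mathrm{OR}_{s=1} = \exp\bigl(0.01\,(\beta_1 + \beta_3)\bigr)

% The interaction test evaluates H_0 : \beta_3 = 0.
```

Under this formulation, a CpG with effects of moderate size but opposite sign in the two sexes can yield a large \beta_3 even when neither sex-specific effect alone passes the stratified significance threshold.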

Table 4 Results from meta-analysis of methylation-by-sex interaction effect in the analysis of blood samples in ADNI and AIBL datasets. Inverse-variance weighted fixed-effects meta-analysis models were used to combine dataset-specific results from logistic regression models that included methylation beta values, sex, methylation beta values*sex and covariate variables age, batch (i.e., methylation plate), and estimated immune cell-type proportions. For each CpG, annotations include the location of the CpG based on hg19/GRCh37 genomic annotation (chr, position), Illumina gene annotations, overlap with enhancer regions identified in Nasser et al. [53] study (enhancer). Odds ratios and their 95% confidence intervals (OR, 95% CI) describe changes in odds of AD (on the multiplicative scale) associated with a one percent increase in methylation beta values (i.e., increase in methylation beta values by 0.01) after adjusting for covariate variables. Direction indicates hypermethylation (+) or hypomethylation (−) in AD samples in the ADNI and AIBL datasets

Cross-tissue meta-analysis prioritized sex-specific DNA methylation differences associated with both AD neuropathology and AD diagnosis

As changes in the brain are more relevant for cognitive disorders such as AD, we next prioritized sex-specific DNA methylation differences with changes in both blood and the brain, by performing cross-tissue analysis using two complementary approaches: (1) cross-tissue meta-analysis and (2) significant overlap.

In the first approach (i.e., cross-tissue meta-analysis), we performed a meta-analysis of the two blood sample datasets described above (i.e., AIBL and ADNI) with four additional prefrontal cortex datasets measured on brain samples, previously described by the ROSMAP [19], Mt. Sinai [23], London [20], and Gasparoni EWAS studies [64]. Supplementary Table 7 includes additional information on Braak stage, CERAD scores, clinical diagnosis, and postmortem interval for these brain samples. We previously meta-analyzed these four brain sample datasets and identified a number of CpGs and DMRs, many involved in immune processes, that are significantly associated with AD neuropathology [21, 36].

In the cross-tissue meta-analysis, no CpGs reached the 5% FDR significance threshold. At P-value < 1 × 10−5, we identified 28 CpGs and 12 CpGs in females and males, respectively (Fig. 3). We then prioritized 13 CpGs in females and 6 CpGs in males by additionally requiring these CpGs to also be nominally significant (i.e., P-value < 0.05) in the separate sex-specific meta-analyses of brain and blood samples (Tables 5a and 6). Among them, 8 CpGs were located in enhancer regions [53]. In females, 5 CpGs are located in promoter regions of the genes AGAP2, SLC44A2, LST1, VPS13D, and BLCAP. In males, 2 CpGs are mapped to promoters of the OAT and ADORA3 genes.

Fig. 3

Workflow for identifying sex-specific DNA methylation differences that are associated with both AD pathology (in prefrontal cortex brain samples) and AD diagnosis (in blood samples) using cross-tissue meta-analysis approach. Results for brain sample meta-analysis were obtained from Zhang et al. [36]

Table 5 Cross-tissue analysis of female samples prioritized a total of 25 significant CpGs. (a) A total of 13 CpGs reached a P-value < 10−5 in cross-tissue meta-analyses that included both brain and blood samples, and nominal significance (i.e., P-value < 0.05) in sex-specific meta-analyses of each tissue. The brain sample meta-analysis results were obtained from Zhang et al. [36]; (b) A total of 4 CpGs achieved P-value < 10−5 in blood sample meta-analysis and nominal significance in brain sample meta-analysis; (c) A total of 13 CpGs achieved P-value < 10−5 in brain sample meta-analysis and nominal significance in blood sample meta-analysis. Direction indicates hypermethylation (+) or hypomethylation (−) in AD samples in individual brain or blood sample datasets. Annotations include nearby genes based on GREAT annotation and overlap with enhancer regions identified in the Nasser et al. [53] study. Only 5 of the significant CpGs showed the same direction of change in brain and blood samples (highlighted in gray). Highlighted in red are gene promoter regions overlapped with the significant CpGs

Table 6 Cross-tissue analysis of male samples prioritized a total of 6 significant CpGs. These 6 CpGs reached a P-value < 10−5 in cross-tissue meta-analyses that included both brain and blood samples, and nominal significance (i.e., P-value < 0.05) in sex-specific meta-analyses of each tissue. The brain sample meta-analysis results were obtained from Zhang et al. [36]. Among the 6 CpGs, 2 CpGs also achieved P-value < 10−5 in brain sample meta-analysis and nominal significance in blood sample meta-analysis. Direction indicates hypermethylation (+) or hypomethylation (−) in individual brain or blood sample datasets. Annotations include nearby genes based on GREAT annotation and overlap with enhancer regions identified in the Nasser et al. [53] study. Highlighted in red are gene promoter regions overlapped with the significant CpGs

In the second approach (i.e., significant overlap), we identified CpGs that achieved P-value < 1×10−5 in the blood sample meta-analysis and nominal significance (i.e., P-value < 0.05) in the brain sample meta-analysis, and vice versa. In females, for the 23 significant sex-specific CpGs we discovered in blood sample meta-analysis, 4 CpGs, mapped to the promoter region of DAZAP1 and intergenic regions, also achieved nominal significance in brain meta-analysis (Table 5b). On the other hand, for the 116 CpGs with P-value < 1×10−5 in brain meta-analysis, 13 CpGs, mapped to the promoter regions of SLC44A2, AGAP2, RHOB, TRPV4, MTA3 genes, and intergenic regions, achieved nominal significance in blood sample meta-analysis (Table 5c). Among these 17 CpGs prioritized by the significant overlap approach, 5 CpGs were also identified by the cross-tissue meta-analysis approach.
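The significant overlap criterion can be expressed as a simple two-threshold filter applied in both directions. The following is a minimal sketch with hypothetical CpG identifiers and P-values; the function name is ours and does not correspond to any software used in the study.

```python
def prioritize(p_discovery, p_replication, strict=1e-5, nominal=0.05):
    """Keep CpGs below the strict threshold in one tissue and
    nominally significant in the other tissue."""
    return {
        cpg
        for cpg, p in p_discovery.items()
        if p < strict and p_replication.get(cpg, 1.0) < nominal
    }

# Hypothetical meta-analysis P-values keyed by CpG identifier.
blood_p = {"cg_a": 2e-6, "cg_b": 3e-7, "cg_c": 0.01, "cg_d": 0.2}
brain_p = {"cg_a": 0.03, "cg_b": 0.40, "cg_c": 4e-6, "cg_d": 1e-8}

blood_led = prioritize(blood_p, brain_p)   # strict in blood, nominal in brain
brain_led = prioritize(brain_p, blood_p)   # strict in brain, nominal in blood
overlap_set = blood_led | brain_led
```

A CpG missing from the second tissue is treated as non-significant there (P = 1.0), which is a conservative choice made for this sketch.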

In male samples, we did not identify any additional CpG using the significant overlap approach (Table 6). Among the 6 CpGs prioritized by cross-tissue meta-analysis, two CpGs, mapped to the OAT and ADORA3 genes, also achieved P-value < 1×10−5 in brain sample meta-analysis and nominal significance in blood sample meta-analysis.

Intriguingly, among the 25 CpGs in females and 6 CpGs in males prioritized by these two complementary analyses, the majority (20 in females, 6 in males) showed opposite directions of change in the brain and the blood. Of these, 11 CpGs in females and 2 CpGs in males were hypermethylated in the brain and hypomethylated in the blood of AD samples, and the rest were hypomethylated in the brain and hypermethylated in the blood of AD samples.

Correlation of sex-specific DNA methylation differences in AD with expression levels of nearby genes

To better understand the functional roles of the significant DNAm differences, we examined the correlation between CpG methylation (both significant individual CpGs and CpGs within significant DMRs) and the expression levels of nearby genes. To this end, we performed integrative analysis using matched methylation and expression data measured on blood samples from 265 independent subjects (120 females and 145 males) in the ADNI study. We first removed the effects of batch, age, and immune cell-type proportions from the methylation and gene expression data separately. Next, for CpGs in the promoter regions (i.e., within ± 2k bp from TSS), we tested the association between each CpG and the expression of its target gene. Similarly, for CpGs in the distal regions (i.e., > 2k bp from TSS), we tested the association between each CpG and the expression of the ten genes upstream and ten genes downstream that lie within 1M bp of the CpG location.
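The gene-selection rule described above can be sketched as follows. This is an illustrative reading, not the study's code: the function name and gene coordinates are hypothetical, all genes are assumed to lie on the same chromosome as the CpG, and "upstream" and "downstream" are defined purely by genomic coordinate, ignoring strand.

```python
def candidate_genes(cpg_pos, genes, promoter_bp=2_000,
                    window_bp=1_000_000, n_each_side=10):
    """Return the genes to test against one CpG.

    genes: list of (name, tss_position) on the CpG's chromosome.
    """
    # Promoter CpG: within +/- promoter_bp of a TSS, test that gene only.
    promoter = [name for name, tss in genes if abs(tss - cpg_pos) <= promoter_bp]
    if promoter:
        return promoter
    # Distal CpG: nearest genes on each side, limited to the window.
    upstream = sorted(
        (cpg_pos - tss, name) for name, tss in genes
        if 0 < cpg_pos - tss <= window_bp
    )[:n_each_side]
    downstream = sorted(
        (tss - cpg_pos, name) for name, tss in genes
        if 0 < tss - cpg_pos <= window_bp
    )[:n_each_side]
    return [name for _, name in upstream] + [name for _, name in downstream]

# Hypothetical gene TSS coordinates.
genes = [("GENE_A", 1_000), ("GENE_B", 5_000), ("GENE_C", 2_000_000)]
promoter_hit = candidate_genes(1_500, genes)
distal_hits = candidate_genes(10_000, genes)
```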

At 5% FDR, among the significant sex-specific AD-associated CpGs and CpGs located in AD-associated DMRs, in females, DNAm at 23 CpGs (mapped to 5 DMRs) in gene promoter regions were significantly associated with the expression of their target genes, including LGALS3BP, VAMP5, ALOX12, TAGLN3, and GABRG1 (Supplementary Table 8). Among CpGs located in distal regions, only 1 CpG (cg00271210) was significantly associated with the expression of its target gene RNASET2 at 5% FDR.

In males, DNAm at 12 CpGs (mapped to 2 DMRs) in gene promoter regions were significantly associated with the expression of their target genes PM20D1 and KCTD11 (Supplementary Table 9). Among CpGs in distal regions, 13 CpGs (mapped to 5 DMRs) were significantly associated with expressions of their target genes, including STK32C, TACSTD2, FANCA, OVGP1, and PGPEP1.

To further prioritize the target genes nominated by our sex-specific methylation-gene expression association analyses above, we also tested the association of the target genes with AD. In ADNI blood samples analysis, we found only 1 target gene, PM20D1, to be significantly upregulated in blood samples of male AD subjects (P-value = 2.60 ×10−3) (Fig. 4). In prefrontal cortex brain samples, we found several of these target genes, including LGALS3BP, RNASET2, TAGLN3, VAMP5, ALOX12 in females, and PGPEP1, KCTD11, STK32C, FANCA in males, are differentially expressed in AD (Supplementary Table 8c, 9c). The greater number of differentially expressed genes in brain samples compared to blood samples could be due to the larger sample size of brain samples available (502 brain samples in the meta-analysis of GSE33000 and GSE44772 vs. 265 ADNI blood samples).

Fig. 4

Differential DNA methylation and gene expression at the PM20D1 gene in blood samples of male AD and cognitively normal subjects. We first removed effects of age, estimated proportions of immune cell types, and batch effects in both DNA methylation and gene expression data separately, by fitting linear regression models and extracting residuals. The results showed that A DNA methylation at chr1:205819088-205819609 in the promoter region of PM20D1 is hypomethylated in AD subjects, B PM20D1 gene expression levels are significantly up-regulated in AD subjects, and C there is a strong negative association between DNA methylation and gene expression at this locus. Abbreviations: dnam, DNA methylation; CN, cognitively normal; rlm, robust linear model

Correlation and overlap with genetic susceptibility loci

To identify methylation quantitative trait loci (mQTLs) for the significant DMRs and CpGs, we next performed look-up analyses using the GoDMC database [58]. In females, among the 266 CpGs mapped to the AD-associated CpGs or located in AD-associated DMRs (Supplementary Tables 2, 3), 145 CpGs had cis mQTLs and 24 CpGs had both cis and trans mQTLs. In males, among the 126 CpGs mapped to the AD-associated CpGs or located in AD-associated DMRs (Supplementary Tables 2, 5), 67 CpGs had cis mQTLs, and 3 CpGs had both cis [58] and trans mQTLs. Among the 5 significant CpGs from interaction analysis, 2 CpGs had cis mQTLs. These results are consistent with the previous observation that a substantial proportion (about 45%) of the DNA methylation sites targeted by the Illumina 450k array are influenced by genetic variants in the blood [58].

Similarly, we also analyzed CpGs nominated by the cross-tissue analysis. In females, among the 25 significant CpGs prioritized in our cross-tissue analysis (Table 5), 19 CpGs had mQTLs in the blood, and 7 of the 19 CpGs also had mQTLs in the brain. In males, among the 6 significant CpGs in cross-tissue analysis (Table 6), 5 CpGs had mQTLs in the blood, and 2 of the 5 CpGs also had mQTLs in the brain. A total of 64 and 19 CpG–mQTL pairs in females and males, respectively, were significant in the analyses of both brain and blood samples (Supplementary Tables 10, 11).

To evaluate if these mQTLs overlapped with genetic risk loci implicated in AD, we compared them with the 24 LD blocks of genetic variants reaching genome-wide significance in a recent meta-analysis of AD GWAS [60]. We found that in females, 155 mQTLs (associated with the CpG cg14324675) overlapped with the LD block at 6:32395036-32636434, which included genetic variants mapped to the HLA-DRB1, HLA-DRA, HLA-DRB5, HLA-DQA1, and HLA-DQB1 genes (Supplementary Table 12). In males, 864 mQTLs (associated with the CpG cg06363485) overlapped with the LD block at chromosome 6:40706366-41365821, which included genetic variants mapped to the UNC5CL, TSPO2, APOBEC2, OARD1, NFYA, TREML1, TREM2, TREML2, TREML3P, TREML4, TREML5P, TREM1, and NCR2 genes [60] (Supplementary Table 13).

We also evaluated if the significant methylation differences overlapped with the genetic risk loci implicated in AD [60]. We found that in females, there was no overlap between AD-associated CpGs or DMRs with the genetic risk loci; in males, there was only 1 DMR that overlapped with the LD block at chromosome 6:40706366-41365821, where the TREM2 gene is located (Supplementary Table 14). The limited commonality between genetic and epigenetic loci in AD could be due to the low power in EWAS and/or GWAS but could also reflect the relatively independent roles of genetic variants and DNA methylation in influencing AD susceptibility [65, 66].

Out-of-sample validation of sex-specific DNA methylation differences in an independent external dataset

To evaluate the feasibility of the significant methylation differences for predicting AD diagnosis, we performed an out-of-sample validation using an independent external DNA methylation dataset measured by Illumina 450k arrays and generated by the AddNeuroMed study, which included 64 males (30 cases, 34 controls) and 107 females (53 cases and 54 controls) with ages greater than 65 years [28] (Table 1). We performed methylation risk score (MRS) analysis [67] for samples of each sex separately. More specifically, MRS was computed by summing methylation beta values of the significant sex-specific AD-associated CpGs weighted by their estimated effect sizes in the meta-analyses. Several logistic regression models were then estimated using the AIBL dataset (training dataset) and then tested on samples in the AddNeuroMed dataset (testing dataset). We considered logistic regression models with three sources of variations that might affect the prediction for AD diagnosis: age, estimated cell-type proportions for each sample, and MRS.
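The MRS computation described above is a weighted sum, which can be sketched in a few lines. The CpG identifiers, weights, and beta values below are hypothetical; restricting the sum to CpGs present on both arrays mirrors the constraint that the testing dataset was measured on a different array.

```python
def methylation_risk_score(betas, weights):
    """Sum of beta values weighted by meta-analysis effect sizes,
    restricted to CpGs that are available for the sample."""
    shared = [cpg for cpg in weights if cpg in betas]
    return sum(weights[cpg] * betas[cpg] for cpg in shared)

# Hypothetical effect sizes from the meta-analysis (weights) and one
# sample's beta values; "cg_c" is absent from this sample's array.
weights = {"cg_a": 2.0, "cg_b": -1.0, "cg_c": 5.0}
sample_betas = {"cg_a": 0.5, "cg_b": 0.2}
mrs = methylation_risk_score(sample_betas, weights)
```

The resulting score would then enter a logistic regression model alongside age and, where applicable, estimated cell-type proportions.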

In females, the most predictive model included MRS, age, and estimated immune cell-type proportions (AUC = 0.74, 95% CI: 0.65–0.83), significantly more predictive than a random classifier (P-value = 8.42×10−6). In contrast, the model without MRS (i.e., only age and estimated immune cell-type proportions) had an AUC of 0.68 (Fig. 5). Because samples in the testing dataset (i.e., AddNeuroMed) are measured by Illumina 450k arrays while samples in the training datasets (i.e., ADNI and AIBL) are measured by EPIC arrays, the MRS in the best-performing model included 9 of the 23 significant CpGs with P-value < 10−5 in meta-analysis of female samples (Supplementary Table 2) that are available in both training and testing datasets.

Fig. 5

Receiver Operating Characteristic curves (ROCs) for out-of-sample validation of logistic regression models predicting AD diagnosis in males and females. The training and testing samples included sex-specific samples from AIBL and AddNeuroMed datasets, respectively. In males, the best-performing logistic regression model included age and methylation risk score (MRS) (AUC = 0.70), compared to the model with age alone (AUC = 0.64), or the model with age and estimated immune cell-type proportions (AUC = 0.57). In females, the best-performing model included age, MRS, and estimated immune cell-type proportions (AUC = 0.74), compared to the model with age and estimated immune cell-type proportions (AUC = 0.68). MRS was computed as the sum of methylation beta values for significant CpGs weighted by their estimated effect sizes from sex-specific meta-analysis of AIBL and ADNI datasets. In males, significant CpGs for the MRS included 2 CpGs with P-value < 10−5 identified in the interaction analysis that are also available in the AddNeuroMed dataset; in females, significant CpGs for MRS included 9 CpGs with P-value < 10−5 identified in AD vs. CN comparison that are also available in AddNeuroMed dataset. Abbreviations: AUC = Area Under ROC curve, AD = Alzheimer's disease, CN = cognitive normal

In males, the most predictive model included MRS and age (AUC = 0.70, 95% CI: 0.56–0.82), significantly more predictive than a random classifier (P-value = 5.62×10−3). In contrast, the model without MRS (i.e., only age) had an AUC of 0.64 (Fig. 5). In the best-performing model, the MRS included 2 of the 5 significant CpGs with P-value < 1×10−5 in the meta-analysis of methylation-by-sex interaction effect (Table 4) that are available in both training and testing datasets.
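As a reminder of what the AUC values above measure, the sketch below computes AUC from its rank-based definition: the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as one half. The scores and labels are hypothetical, and this section does not state which software was used to compute AUC or its confidence interval.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))

# Hypothetical predicted probabilities for two cases (1) and two controls (0).
auc_perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
auc_chance = roc_auc([0.9, 0.1, 0.8, 0.3], [1, 1, 0, 0])
```

An AUC of 0.5 corresponds to the random classifier used as the reference in the significance tests reported above.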

Interestingly, while the best-performing prediction model for females included age, immune cell type proportions, and MRS, the best-performing prediction model for males included only age and MRS. When considered alone, immune cell type proportions achieved slightly higher prediction accuracy in females than in males (AUCfemale = 0.59, AUCmale = 0.55) (Supplementary Fig. 5), which might be due to a greater change in AD-associated B cell type proportions in females (Supplementary Fig. 6). To confirm this result, we also fitted a logistic regression model to data from all three datasets (ADNI, AIBL, AddNeuroMed). This model included AD status as the outcome, main effects B cell type proportion, sex, and B cell type proportion × sex, as well as covariate variables datasets and age. The results showed a significant B cell type proportion × sex interaction (P-value = 0.017), indicating the associations between B-cell type proportions and AD were significantly different between males and females. While previous studies observed a decrease in B cells in the blood samples of AD patients [68,69,70], our findings revealed that the diminishing B cells in AD is more pronounced in females, which is also consistent with the results of another recent sex-specific analysis of gene expression data in AD [71].

We also evaluated the robustness of the best-performing sex-specific logistic regression models with additional analyses. The results indicated the prediction performance of these models in males and females remained very similar when the ADNI dataset was additionally included as a training dataset in the development of the logistic regression models, or when CpGs from AD-associated DMRs and/or significant CpGs in cross-tissue analyses are also included in the computation of MRS, where MRS weights are based on effect sizes estimated in meta-analysis of ADNI and AIBL.

Additional sensitivity analyses

In addition to age, sex, and estimated cell-type proportions that we modeled, additional risk factors such as smoking and education could also influence AD risk [15, 72, 73] and thus may confound the association between methylation and AD. To evaluate the impact of smoking on our analysis results, we repeated our meta-analysis, additionally adjusting for smoking in our sex-specific logistic regression models. Because we did not have access to smoking information in the AIBL and AddNeuroMed datasets, we computed smoking scores using the SSc method, an objective measure shown to discriminate subjects with different smoking status in three independent datasets [43]. The results of our expanded logistic regression model that additionally included smoking score showed all 27 sex-specific CpGs (Supplementary Table 2) remained highly significant, with meta-analysis P-values ranging from 2.59 × 10−5 to 5.83 × 10−8 (Supplementary Table 15), indicating these CpGs are associated with AD independent of smoking.

Similarly, we also evaluated the impact of education by additionally including a covariate variable for years of education in the logistic regression model. Among the three public datasets (ADNI, AIBL, AddNeuroMed), we only had access to information on education in the ADNI dataset. Therefore, we compared results for the ADNI dataset using an expanded model that additionally includes years of education with those from our primary analysis that did not adjust for education. We found the estimated odds ratios (ORs) and P-values for all 27 sex-specific CpGs (Supplementary Table 2) based on the original model and the expanded model to be very similar (Supplementary Table 16). Also, in the ADNI dataset, years of education did not differ significantly between CN and AD subjects in females or males (Supplementary Fig. 7), so education is unlikely to be a confounder for AD.
