Identification of a specific APOE transcript and functional elements associated with Alzheimer’s disease

To elucidate the mechanism of AD risk variants and their connections with transcriptomic, genetic, and epigenetic features in AD, we used available multi-omics datasets from several brain regions and two ancestries. Table 1 provides demographic information on the participants in our analysis. Although parts of this dataset have previously been analyzed in studies of brain phenotypes [21, 32], those investigations emphasized genome-wide patterns. In contrast, the current study focuses on the regulatory mechanisms operating within the APOE locus. As Fig. 1A and Supplementary Fig. S3 illustrate, we first link AD genome-wide significant risk alleles (‘AD alleles’ hereafter) at the APOE locus to APOE gene and transcript expression. Then, we link AD alleles to DNA methylation levels. Finally, we use ChIP-seq to prioritize functional SNPs. We present, for the first time, associations between AD-associated risk SNPs and important functional elements at the APOE locus (Fig. 1B).

Fig. 1

Overview of APOE study in human postmortem brain (A) and novel AD risk factors (genetic, transcriptomic, and epigenetic elements) and their relative position at the APOE locus (B). Brain collection: ROSMAP, The Religious Orders Study and the Memory and Aging Project; LIBD, Lieber Institute for Brain Development. Ancestry: EA, European Ancestry; AA, African American. Brain region: DLPFC, dorsolateral prefrontal cortex; PCC, posterior cingulate cortex; AC, anterior cingulate cortex. Green boxes represent exons; blue boxes represent untranslated regions (UTRs); Me represents methylation site; peaks represent active chromatin regions; 3 solid blue lines represent genomic DNA

We began with the APOE locus in the ROSMAP dorsolateral prefrontal cortex (DLPFC) dataset (see data availability), from which we extracted transcriptomic, methylation, and histone modification features. For this brain region we used bulk tissue RNA-seq (n = 573), H3K9ac ChIP-seq for histone modification (n = 615), and DNA methylation measured on the Illumina 450 K array (n = 667). To capture expression profiles in other brain regions, we also analyzed two additional regions: the posterior cingulate cortex (PCC; n = 499) and the anterior cingulate cortex (AC; n = 433). The sample overlap across the three brain regions is shown in Supplementary Fig. S4. Applying the same methodology to the LIBD dataset (see Methods), we obtained additional DLPFC bulk RNA-seq data from European ancestry individuals (n = 376) and African Americans (n = 216).

Because the vast majority of genes are regulated by enhancers located nearby on the same chromosome (cis-regulation), we limited our transcriptional mechanism studies to the 2 Mb region [33] containing the APOE gene. To select potential functional variants in this region, we extracted the genotypes of 6,428 high-quality SNPs from ROSMAP whole-genome sequencing data, 6,483 SNPs from LIBD European ancestry samples, and 10,838 SNPs from LIBD African American samples for downstream analysis.

APOE jxn1.2.2 transcript is uniquely linked to specific AD risk-associated alleles in the APOE region

To distinguish the APOE mRNA transcripts, we used exon-exon junctions as the expression feature. Junctions tag specific transcripts and allow them to be quantified with high precision and specificity, as demonstrated in our recent postmortem brain studies [34,35,36]. After read alignment and quality control, we detected three distinct splice junctions connecting exon 1 and exon 2, one common junction spanning exon 2 and exon 3, and another common junction connecting exon 3 and exon 4 (Fig. 1B). We therefore focused on the junctions linking the alternative exon 1 and exon 2, because they distinguish the different APOE transcripts. We then combined the APOE expression data with the previously selected genomic variants to identify SNPs associated with the levels of these transcripts. Specifically, we tested the association of the selected variants with the global abundance of APOE expression (combining reads of all identified transcripts) as well as with the abundance of each spliced isoform. To this end, we fitted a linear regression model implemented in TensorQTL [22]. We used five principal components (PCs) derived from genotype data to correct for population stratification, and K PCs derived from expression data to correct for potential batch effects (detailed in Methods and Supplementary Tables S1 & S2). Across the five RNA-seq datasets (Table 1), we identified an average of 57 k SNP-gene pairs and 5 M SNP-junction pairs at the APOE locus, with about 6 k and 12 k cis-eQTLs at the gene and junction levels, respectively, at a false discovery rate (FDR) < 0.05.
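The association model described above can be illustrated with a minimal sketch of ordinary least squares with covariates. This is not TensorQTL's API and not the pipeline used in this study. It is a plain-Python illustration of regressing junction expression on SNP dosage while adjusting for covariates such as genotype and expression PCs. The function names (`solve_linear`, `eqtl_slope`) and the toy data are assumptions made for the example, and significance testing is omitted.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is square and non-singular (no rank check is done).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def eqtl_slope(expression, dosage, covariates):
    """Return the OLS effect of SNP dosage on expression, adjusted for covariates.

    covariates holds one list per sample (for example genotype PCs followed
    by expression PCs); an intercept column is added automatically.
    """
    rows = [[1.0, float(g)] + list(c) for g, c in zip(dosage, covariates)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, expression)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return beta[1]  # coefficient of the dosage column


# Toy data (made up): expression depends on dosage and on one covariate.
dosage = [0, 1, 2, 0, 1, 2, 1, 0]
pc1 = [0.1, -0.2, 0.3, 0.4, -0.5, 0.0, 0.2, -0.3]
expression = [2.0 + 0.5 * g + 3.0 * p for g, p in zip(dosage, pc1)]
slope = eqtl_slope(expression, dosage, [[p] for p in pc1])
print(round(slope, 6))  # close to 0.5 by construction
```

Solving the normal equations directly keeps the sketch dependency-free. A production analysis would rely on a dedicated QTL mapper, which also handles many features at once and reports test statistics.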

To link the APOE transcript-associated variants (eQTLs) to AD risk alleles, we co-localized the observed eQTLs with AD GWAS [5] SNPs. The integration yielded an average of 472 SNP-gene pairs and 885 SNP-junction pairs that were genome-wide significant for AD risk (p < 5e-8) and FDR-significant in the eQTL analysis (FDR < 0.05). Importantly, a particular junction between alternative exon 1 and exon 2 (named jxn1.2.2 and tagging the APOE transcript NM_001302688) was the top-hit junction at the APOE locus co-localizing with variants associated with AD risk (p < 1e-7) (Figs. 1B and 2A, Supplementary Fig. S5 and Table S5). We did not observe statistically significant associations between AD risk variants (GWAS p < 5e-8) and the other APOE transcripts (jxn1.2.1 and jxn1.2.3) or APOE gene-level expression (Fig. 2B, Supplementary Fig. S6A, Tables S6 & S7). In the other two brain regions, PCC and AC, AD alleles were not associated with jxn1.2.2 transcript expression (Supplementary Table S7). In contrast, the association between the AD alleles and jxn1.2.2 expression was replicated in the LIBD European ancestry DLPFC collection (Fig. 2C, Supplementary Fig. S7A & Table S5).
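The two-threshold integration described above (eQTL FDR < 0.05 and GWAS p < 5e-8) can be sketched as follows. The text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, as are the function names and the placeholder SNP and junction identifiers. The p-values are made up for illustration and are not results from this study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def ad_linked_eqtls(eqtl_p, gwas_p, fdr_cutoff=0.05, gwas_cutoff=5e-8):
    """Keep (snp, feature) pairs that pass both the eQTL FDR and GWAS cutoffs.

    eqtl_p maps (snp, feature) to a nominal eQTL p-value;
    gwas_p maps snp to its GWAS p-value (missing SNPs are treated as p = 1).
    """
    pairs = list(eqtl_p)
    fdr = benjamini_hochberg([eqtl_p[pair] for pair in pairs])
    return sorted(
        pair
        for pair, q in zip(pairs, fdr)
        if q < fdr_cutoff and gwas_p.get(pair[0], 1.0) < gwas_cutoff
    )


# Placeholder identifiers and made-up p-values.
eqtl_p = {
    ("snp_a", "jxn_x"): 1e-6,
    ("snp_b", "jxn_x"): 0.02,
    ("snp_c", "jxn_x"): 0.8,
    ("snp_a", "jxn_y"): 0.5,
}
gwas_p = {"snp_a": 1e-20, "snp_b": 1e-3, "snp_c": 1e-30}
print(ad_linked_eqtls(eqtl_p, gwas_p))  # [('snp_a', 'jxn_x')]
```

In this toy input, `snp_b` passes the eQTL FDR cutoff but not the GWAS cutoff, and `snp_c` shows the reverse, so only one pair survives both filters.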

Fig. 2

APOE jxn1.2.2 transcript is associated with Alzheimer’s disease (AD). A jxn1.2.2 expression (red) is the top hit compared to other transcripts at the APOE locus (blue) in the ROSMAP brain DLPFC region. The association of AD risk SNP, rs157580, with APOE gene level and its 3 transcripts (jxn1.2.1, jxn1.2.2, and jxn1.2.3) in ROSMAP considering global ancestry (B). Association of jxn1.2.2 and AD risk SNP in LIBD European ancestry (C) and African American (D). E Local ancestry analysis at the APOE locus

To assess the potential influence of ancestry on the relationship between APOE transcripts and AD alleles, we also analyzed RNA-seq data from the LIBD African ancestry DLPFC collection. The association persisted (Fig. 2D, Supplementary Fig. S8A), suggesting a significant link between the APOE jxn1.2.2 transcript and AD alleles in samples from two different ancestries. To avoid heterogeneity among ancestries, we analyzed the European and African populations separately; the analysis above was therefore based on global ancestry, estimated by Principal Component Analysis (PCA) after integrating the ROSMAP and LIBD genotype data separately with HapMap3 populations (see Methods). Our global ancestry analysis clearly indicated the homogeneous nature of our populations: ROSMAP European, LIBD European, and LIBD African (Supplementary Fig. S9). To further investigate whether the results were influenced by population admixture, we performed local ancestry analysis at the APOE locus. As expected, the local ancestry results were consistent with our global ancestry analysis (Fig. 2E, Supplementary Figs. S7C & S8C).

The APOE gene consists of four exons, and two SNPs located in exon 4 (rs429358 and rs7412) determine the three common protein isoforms of APOE (Fig. 1B). To determine whether the association of AD alleles with the jxn1.2.2 transcript is independent of the APOE2,3,4 alleles, we performed a conditional analysis by adding two variables, APOE4 (4 carriers and non-4 carriers) and APOE2 (2 carriers and non-2 carriers), to our regression Model-2 (Supplementary Table S2). The significant associations were unchanged relative to the original model without APOE4 and APOE2 in 3 independent datasets: ROSMAP, LIBD European, and LIBD African populations (Supplementary Figs. S6B, S7B, S8B & S10, Tables S6 & S7). This finding, that the association between AD alleles and the jxn1.2.2 transcript is independent of the APOE2,3,4 alleles, was replicated in the local ancestry analysis (Supplementary Figs. S7D & S8D). To further separate the effects of our candidate AD alleles on APOE jxn1.2.2 expression from those of APOE4 and APOE2, we performed an epistasis (statistical interaction) analysis and did not observe significant interactions between our candidate AD alleles and the APOE4/2 alleles (Supplementary Fig. S6C & S6D), indicating that the association between jxn1.2.2 expression and our candidate AD-risk alleles is not influenced by APOE4/2. The independence of jxn1.2.2 expression was further supported by the lack of association between the APOE2,3,4-determining SNPs (rs429358 and rs7412) and jxn1.2.2 expression (Supplementary Table S8).
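The carrier variables used in the conditional analysis above can be derived from the two exon 4 SNPs, as in the following minimal sketch. It relies on the standard haplotype definitions (the C allele of rs429358 marks ε4, the T allele of rs7412 marks ε2) and assumes that the very rare haplotype carrying both alleles can be ignored. The function name and the 0/1 coding are illustrative choices, not the coding documented for Model-2.

```python
def apoe_carrier_covariates(rs429358_c_count, rs7412_t_count):
    """Return (apoe4_carrier, apoe2_carrier) as 0/1 indicators.

    rs429358_c_count: number of C alleles at rs429358 (0, 1 or 2).
    rs7412_t_count: number of T alleles at rs7412 (0, 1 or 2).
    Assumes the rare haplotype carrying both alleles is absent.
    """
    for count in (rs429358_c_count, rs7412_t_count):
        if count not in (0, 1, 2):
            raise ValueError("allele counts must be 0, 1 or 2")
    return int(rs429358_c_count > 0), int(rs7412_t_count > 0)


# Example: append the two indicators to an existing covariate row
# (the starting values stand in for made-up principal components).
covariate_row = [0.12, -0.03]
covariate_row.extend(apoe_carrier_covariates(1, 0))  # e3/e4 genotype
print(covariate_row)  # [0.12, -0.03, 1, 0]
```

Adding these indicators as extra regression covariates tests whether the SNP effect on jxn1.2.2 remains after accounting for APOE4 and APOE2 carrier status.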

APOE jxn1.2.2 transcript expression levels are associated with AD pathology, cognitive impairment, and APOE4 allele in DLPFC

To explore the role of APOE transcript abundance in AD, we compared expression levels between AD cases and controls defined by four criteria: (1) the CERAD criterion, which evaluates neuritic plaques [37]; (2) the Braak criterion, which evaluates the density and distribution of neurofibrillary tangles (NFT) [38, 39]; (3) cognitive health [40], for which we evaluated mild cognitive impairment (MCI or dcfdx_lv) [41, 42] and cognitive status at the time of death (cogdx) [43]; and (4) the APOE4 genetic factor [4, 44], by comparing APOE expression between APOE4 carriers and APOE4 non-carriers.

At the gene level, combining all transcripts, APOE expression was marginally associated with cognitive impairment (dcfdx_lv, p = 0.0166; cogdx, p = 0.0432) in DLPFC. However, the APOE gene was not differentially expressed by the CERAD, Braak, or APOE4 criteria in the DLPFC, AC, or PCC brain regions. In addition to neurodegenerative phenotypes, we also compared APOE gene expression between neuropsychiatric diseases (schizophrenia, bipolar disorders [BP], major depression disorders [MDD]) and controls in LIBD European and African individuals, and did not find significant differences (Supplementary Table S9).

At the single-transcript level, analyzing the three transcripts separately, we found that the jxn1.2.2 transcript, unlike the other APOE transcripts, was differentially expressed between AD and controls in DLPFC (Fig. 3A). APOE jxn1.2.2 expression was uniquely associated with amyloid burden as characterized by CERAD pathology (p = 0.0472) and with NFT as characterized by Braak pathology (p = 0.0215). We did not detect differences for the other APOE transcripts (jxn1.2.1 and jxn1.2.3) in DLPFC (Fig. 3B, C). Furthermore, jxn1.2.2 was differentially expressed between APOE4 carriers and non-carriers in European populations from ROSMAP (p = 0.0001) (Fig. 3D) and LIBD (p = 0.0012, Supplementary Fig. S11A), with the same trend in the African population (p = 0.0591, Supplementary Fig. S11B). All three transcripts were significantly associated with cognitive impairment (dcfdx_lv and cogdx, p < 0.05) (Fig. 3A). In contrast, none of the three transcripts was associated with AD status in the PCC and AC brain regions. Additionally, they were not associated with schizophrenia, BP, or MDD in LIBD European and African populations (Supplementary Table S10).

Fig. 3

Differential expression of APOE at gene level and transcripts level. A Differential expression of APOE exon-exon junctions across different diagnosis criteria among diverse ethnic groups (European ancestry [EA] and African ancestry [AA]). BP, bipolar disorders; SZ, schizophrenia; MDD, major depression disorders. The dashed line indicates the threshold of p-value = 0.05. Bigger red dots are jxn1.2.2. B Differential analysis of APOE at gene and junction level between AD and controls in BRAAK diagnosis. Differential analysis of jxn1.2.2 transcript in CERAD diagnosis (C), APOE4 carriers vs. non-carriers (D). E APOE gene and transcripts expression during brain development in brain DLPFC region in European ancestry. F APOE gene and transcripts expression during brain development in brain DLPFC region in African American ancestry

To delineate the expression trajectory of APOE transcripts during brain development, we analyzed 227 brain samples across 16 brain regions from 42 human postmortem brains (Supplementary Fig. S2). We plotted the expression patterns of the 3 APOE transcripts across the 16 brain regions defined by Kang et al. [45] (Supplementary Figs. S12 & S13). We also visualized the expression trajectory by combining all APOE transcripts from the 16 brain regions using PC1, which explains the majority of the variance (> 67%) (Supplementary Fig. S14). All APOE transcripts showed low expression during prenatal stages, were upregulated during childhood (0 < age < 13), and were then slightly downregulated during adulthood (13+ years). We replicated the APOE expression trajectory in the LIBD European postmortem DLPFC region (Fig. 3E, Supplementary Table S11) and found that the trajectory is consistent between European and African ancestries (Fig. 3F). The AD-linked jxn1.2.2 transcript has intermediate expression compared to the most abundant transcript, jxn1.2.1, and the low-expressed transcript, jxn1.2.3, across developmental stages (fetus, child, and adult) and across the 16 brain regions (Supplementary Figs. S12 & S13 and Tables S4 & S11).

To investigate the differences between the APOE transcripts, we aligned the coding sequences of the three transcripts and found distinct 5’ untranslated regions, which give the transcripts different start sites. Owing to different start codon usage, the jxn1.2.2 transcript encodes an additional 26 amino acids compared to the other transcripts (Supplementary Fig. S15). To further characterize the APOE transcripts, we predicted their signal peptides using SignalP 6.0 [46]. While the jxn1.2.1 and jxn1.2.3 isoforms likely possess signal peptides around the 13th amino acid, the jxn1.2.2 isoform retains the same signal peptide following the 26 extra amino acids (Supplementary Fig. S16). We then used the GvH and ALOM subprograms in PSORT2, which predicted cleavage of the signal peptide at the 19th amino acid in the jxn1.2.1 and jxn1.2.3 isoforms and at the 44th amino acid in the jxn1.2.2 isoform (Supplementary Table S12). To gain further insight into the APOE coding sequences, we performed positive selection analysis, which revealed evidence of natural selection acting on APOE during evolution (see methods in Supplementary file, Figs. S17 & S18, Table S13).

To understand the cell-type-specific regulation of APOE levels in the human brain, we analyzed single nucleus RNA-seq data from 46 human postmortem DLPFC samples (European ancestry), focusing on six major cell types (see methods in Supplementary file and Fig. S19). We found that APOE was significantly upregulated in microglia of AD patients compared to healthy individuals when evaluated by neurofibrillary tangles using the Braak criterion, amyloid plaques using the CERAD criterion, and cognitive impairment using MCI and cogdx (Supplementary Fig. S20A, B, C, Table S14). Our results are in line with recent evidence that increased APOE expression in microglia is associated with AD phenotypes [47, 48]. We also observed differential APOE expression in excitatory neurons when stratified by the APOE4 allele (Supplementary Fig. S20D), indicating complex genetic-cellular interactions.

Next, to examine whether the jxn1.2.2 transcript encodes a stable protein, we generated a jxn1.2.2-Flag construct that overexpresses the full-length jxn1.2.2 transcript, with the same transcription initiation site and 5’ UTR found in the endogenous jxn1.2.2 transcript. To assist detection of protein expressed from the jxn1.2.2 transcript, the ORF that potentially encodes a ~ 38 kDa protein was Flag-tagged. Western blotting with anti-Flag antibodies indicated that the jxn1.2.2 construct, when overexpressed in SK-N-MC cells, was translated into a ~ 38 kDa protein, compared to a positive control encoding a Bb1-Flag protein (Supplementary Fig. S21).

Identifying functional SNPs using epigenetic data from brain tissues

To identify potential regulatory SNPs in the APOE region, we first identified CpG sites spanning the region. We obtained 788 CpG sites and performed association analysis between 7,937 SNPs and methylation levels at these sites (meQTL). After filtering at meQTL FDR < 0.05, we obtained 4,640 SNPs and 221 CpG sites. Subsequently, to link DNA methylation with AD, we integrated the selected CpG sites with AD variants and eQTL results. We identified 17 CpGs significantly associated with 31 SNPs that reached GWAS significance (p < 5e-8) and were associated with jxn1.2.2 abundance (FDR < 0.05) (Supplementary Table S5). We observed significant impacts of AD alleles on CpG methylation (FDR < 0.05) (Fig. 4A). To determine whether the effect on DNA methylation is modified by the APOE4 and APOE2 alleles, we performed a conditional analysis including APOE4 and APOE2 as covariates and found that the results were not influenced (Fig. 4B). We also tested for statistical interaction and, as expected, did not observe significant interactions between our candidate AD alleles and APOE4 or APOE2 on DNA methylation levels (Supplementary Fig. S22). Consistent with this independence, the APOE2,3,4-determining SNPs were not associated with methylation levels at our prioritized CpGs (Supplementary Table S8).

Fig. 4

Genotypic impact of candidate SNPs on DNA methylation levels in ROSMAP DLPFC. A Association of the candidate AD risk SNPs with CpG sites. B The association of the AD-linked CpGs is not affected by the APOE2 and APOE4 alleles in the conditional analysis

ChIP-seq experiments can determine which chromatin regions are actively involved in gene transcription. From the above analysis, we identified 31 SNPs associated with jxn1.2.2 transcript expression and DNA methylation (meQTL). We then took several steps to prioritize SNPs within active chromatin at the APOE locus. First, we identified 7 SNPs located within active chromatin regions by co-localizing the H3K9ac ChIP-seq peaks with the 31 SNPs. Second, because most enhancers exert their regulatory function through the binding of transcription factors (TFs), we performed an in-silico search of the DNA sequence around the 7 SNPs for putative TF binding sites using Motif Scan and Enrichment Analysis (MoSEA) and removed 1 SNP with no motif binding. Third, we reviewed the literature and found that the motifs affected by 3 SNPs (rs1871046, rs157580, and rs439401) have been reported to be involved in neuronal function (Fig. 5A, B, C, Supplementary Table S15). SOX4 and SMAD TF family members were predicted to bind at rs1871046, and SOX2 at rs439401. rs157580 was predicted to lie within binding sites of EGR4 and the vitamin D receptor (VDR).
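The first prioritization step above, keeping SNPs that fall inside ChIP-seq peaks, is an interval-overlap lookup. The sketch below shows one way to do it with a sorted list of peak starts and a binary search. It assumes all positions are on the same chromosome, that peaks are half-open intervals and do not overlap each other, and it uses made-up coordinates and placeholder SNP identifiers. The function name `snps_in_peaks` is hypothetical.

```python
from bisect import bisect_right


def snps_in_peaks(snp_positions, peaks):
    """Return the sorted ids of SNPs that lie inside any peak.

    snp_positions maps snp id to a base-pair position.
    peaks is a list of (start, end) half-open intervals on the same
    chromosome; peaks are assumed not to overlap each other.
    """
    peaks = sorted(peaks)
    starts = [start for start, _ in peaks]
    hits = []
    for snp, pos in snp_positions.items():
        # Index of the last peak whose start is <= pos (or -1 if none).
        i = bisect_right(starts, pos) - 1
        if i >= 0 and peaks[i][0] <= pos < peaks[i][1]:
            hits.append(snp)
    return sorted(hits)


# Made-up coordinates for illustration only.
peaks = [(500, 650), (100, 200)]
snps = {"snp_a": 150, "snp_b": 200, "snp_c": 640, "snp_d": 50}
print(snps_in_peaks(snps, peaks))  # ['snp_a', 'snp_c']
```

Here `snp_b` sits exactly at the end coordinate of a peak and is excluded because the intervals are treated as half-open. A different coordinate convention would change that boundary case.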

Fig. 5

Candidate SNPs at the APOE locus located within active chromatin affect transcription factor (TF) binding affinity. Left panel: (A) rs439401, (B) rs1871046, and (C) rs157580 are co-localized with H3K9ac ChIP-seq peaks from human postmortem brains. Right panel: Recognition sites of TFs involved in Alzheimer’s disease are influenced by the 3 SNPs. The red dashed box indicates the binding site of each SNP. D Linkage disequilibrium of candidate SNPs with other SNPs spanning APOE, including the two APOE2,3,4-determining SNPs

The 3 candidate SNPs were not significantly associated with global APOE levels in European and African populations across our 5 datasets (Supplementary Table S6). However, they were associated with the jxn1.2.2 transcript (FDR < 0.05) (Supplementary Table S7). Among the 3 SNPs associated with jxn1.2.2 expression levels in the European cohorts, two (rs157580 and rs439401) were also significantly associated with jxn1.2.2 expression levels in the African ancestry cohort, indicating regulatory mechanisms shared by both ancestries. To check the relationship between the 3 SNPs, we performed linkage disequilibrium analysis and found that they are relatively independent (weak correlation) (Fig. 5D). For example, the r2 of the meQTLs with rs439401 in Europeans is less than 0.4 (Supplementary Table S5). Importantly, the 3 SNPs may represent partially independent meQTLs associated with AD risk, given their weak linkage disequilibrium with the common AD-risk polymorphisms (rs7412 and rs429358, which define the APOE2,3,4 alleles; Fig. 5D).

CSF amyloid-beta 42 (Aβ42) and phosphorylated tau (pTau) are two major proteins implicated in the AD pathological process that can be assayed. We studied the genetic effects on CSF Aβ42 and pTau levels in a total of 13,116 individuals using GWAS data [49]. We found that rs157580 and rs439401 are associated with both biomarkers in CSF (p = 4.37e-74 and 1.97e-58, respectively), while rs1871046 is weakly associated (p = 1.64e-3) (Supplementary Fig. S23). Our epistasis analysis confirmed that APOE2 and APOE4 have no significant effects on the correlation between the two SNPs (rs157580 and rs439401) and DNA methylation (p > 0.05) (Supplementary Fig. S22).

Summary-based Mendelian Randomization (SMR) can evaluate the mediation effect of gene expression on the association between a SNP and a phenotype [31]. To further assess the causal effects of the alleles on the expression of APOE transcripts, we performed SMR, and the results were consistent (p < 1e-7) (Supplementary Table S16).

To extend our observations to other neurological diseases, we investigated the 3 prioritized SNPs and the two APOE2,3,4-determining SNPs across GWAS of neurodegenerative (e.g., Parkinson’s disease) and neuropsychiatric (e.g., schizophrenia) disorders. Interestingly, we found that these SNPs are specifically associated with AD (Supplementary Fig. S24 and Table S17).
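The pairwise linkage disequilibrium values reported in this section are r2 statistics. As a minimal sketch, r2 between two SNPs can be approximated by the squared Pearson correlation of their genotype dosages (0, 1, or 2 copies of an allele). This genotype-based estimate is an assumption for illustration. The study's LD software and settings are not described here, and haplotype-based estimators can give somewhat different values. The function name `ld_r2` and the toy genotypes are hypothetical.

```python
def ld_r2(dosage_a, dosage_b):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(dosage_a)
    if n != len(dosage_b) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 samples")
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b))
    var_a = sum((a - mean_a) ** 2 for a in dosage_a)
    var_b = sum((b - mean_b) ** 2 for b in dosage_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Made-up genotypes for four samples.
print(ld_r2([0, 1, 2, 1], [0, 1, 1, 2]))  # 0.25
print(ld_r2([0, 0, 2, 2], [0, 2, 0, 2]))  # 0.0
```

Values near 0 indicate that the two SNPs vary almost independently across samples, which is the sense in which the candidate SNPs are described as being in weak linkage disequilibrium.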
