African-specific alleles modify risk for asthma at the 17q12-q21 locus in African Americans

17q12-q21 haplotype associations with asthma in European American and African American individuals

We first examined associations between the common haplotypes (frequency ≥0.05) and asthma in the same parent-reported NHW and NHB subjects in the Children’s Respiratory and Environment Workgroup (CREW) cohorts included in our previous study [17]. Haplotypes were assigned based on five SNPs that tagged the core region of the 17q12-q21 locus. The SNPs included the missense SNP, rs2305480, in GSDMB, previously reported to be the lead SNP in these subjects [17] and in near perfect LD with a GSDMB splice variant, rs11078928, in both European-ancestry and African-ancestry populations [18]. The two common 5-SNP haplotypes in these samples, including the non-risk (haplotype 1) and risk (haplotype 2) haplotypes, accounted for 92% of haplotypes in NHW individuals and 57% of the haplotypes in the NHB individuals, with the same directions of effect (Fig. 1). Two other haplotypes were common in the NHB (frequencies 0.17 and 0.14 for haplotypes 3 and 4, respectively) but absent in the NHW (Fig. 1). An additional 13 haplotypes in the NHW and 11 in the NHB were present at frequencies less than 0.05 (Additional file 1: Table S2).
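The frequency filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `common_haplotypes`, the allele strings, and the counts are all invented, and the input is assumed to be already-phased 5-SNP haplotypes with one string per chromosome.

```python
from collections import Counter

def common_haplotypes(haplotypes, min_freq=0.05):
    """Return {haplotype: frequency} for haplotypes at or above min_freq."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {h: n / total for h, n in counts.items() if n / total >= min_freq}

# Toy phased data: each string is one chromosome's alleles at five tag SNPs.
# The strings and counts are invented for illustration only.
toy = ["AGCTA"] * 50 + ["GGCTA"] * 40 + ["GACTA"] * 7 + ["GGTTA"] * 3
freqs = common_haplotypes(toy)
print(freqs)
```

In this toy sample the fourth haplotype (frequency 0.03) falls below the 0.05 cutoff and would be grouped with the rare haplotypes.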

Consistent with our earlier study of individual SNPs [17], the haplotype carrying the rs2305480-G allele (haplotype 2) was associated with asthma in the NHW individuals (OR = 1.35 [95% CI 1.11, 1.65]; p = 0.0025). In the NHB individuals, a different haplotype also carrying the rs2305480-G allele (haplotype 4) was most strongly associated with asthma (OR = 1.67 [95% CI 1.10, 2.55]; p = 0.017), whereas two other haplotypes carrying the rs2305480-G allele (haplotypes 2 and 3) had estimated ORs greater than 1.0 but were not statistically significant in this sample. These results suggested that haplotype 4 carries additional variation that increases risk for asthma in NHB individuals.
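As a rough consistency check on the reported statistics, the p-value implied by an odds ratio and its 95% confidence interval can be approximated under a Wald (normal) assumption on the log-odds scale. The sketch below assumes this approximation, which is not stated in the text, and uses the hypothetical helper name `wald_p_from_or_ci`. Because the published values are rounded, the results should only roughly match the reported p-values.

```python
import math

def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided p-value implied by an OR and its 95% Wald CI."""
    # The CI width on the log scale is 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability via the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))

p_nhw = wald_p_from_or_ci(1.35, 1.11, 1.65)  # haplotype 2 in NHW
p_nhb = wald_p_from_or_ci(1.67, 1.10, 2.55)  # haplotype 4 in NHB
print(p_nhw, p_nhb)
```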

To identify variants on haplotype 4 that may be contributing to asthma risk, we first characterized the variation at the 17q12-q21 locus using whole-genome sequences from four datasets including either asthma cases only (APIC and EVE) or both asthma cases and controls (URECA and CAAPA) (Table 1). To maximize phasing accuracy, we first focused on sequences from asthma cases who were homozygous for the 5 common haplotype-defining SNPs (44 European Americans and 177 African Americans). Only haplotypes 1 and 2 were observed in the homozygous sequences of the European American asthma cases (Fig. 2A). However, in the African American asthma cases who were homozygous for the 5 haplotype-defining SNPs, the four common haplotypes (haplotypes 1, 2, 3, 4) and two others (haplotypes 5 and 6) were observed. Haplotypes 5 and 6 both carried the asthma-associated rs2305480-G allele and differed from the African American high-risk haplotype 4 by one or two of the other haplotype-defining SNPs, respectively (Fig. 2A).

To gain an initial overview of the sequence structure across the extended 17q12-q21 locus, we visualized haplotypes extending beyond the core region to include the proximal and distal regions [11] in each population using ChromoPainter [35] (see “Methods”). Nearly all haplotype “switching,” which represents historical recombination events, in the 88 European American sequences was at the boundaries of the core region or within the proximal and distal regions (Fig. 2B), consistent with LD patterns in a European-ancestry reference population (CEU). In contrast, the sequences revealed greater diversity and more historical recombination in African Americans, including numerous switches within the core region, also consistent with LD patterns in an African American reference population (ASW) (Fig. 2C, Additional file 1: Fig. S3). Haplotype frequencies in each whole-genome sequence dataset are shown in Additional file 1: Table S3.

Defining a critical region and African-specific risk variants on haplotype 4

Because variants in the proximal and distal regions were not associated with asthma in African Americans in our previous study [17] and because of the observed haplotype structures and LD patterns in African Americans (Fig. 2C), we focused on the sequences in asthma cases homozygous for the high-risk haplotype 4 (defined by 5 SNPs) to first identify the chromosomal region(s) shared by all haplotype 4 homozygotes (2N=36). Examining the ChromoPainter displays revealed a 23.9-kb region that was shared by haplotype 4 sequences and bounded by at least two recombination events on either side, providing more confidence in the boundaries (Additional file 1: Fig. S4). We then extended the region 1.2 kb (5%) on either side to capture any additional variants that may be excluded based on the small number of observed recombination events. Ultimately, we examined a 26.3-kb region that extended from intron 6 in GSDMB to 6.1 kb upstream of the ORMDL3 transcription start site (TSS). We refer to this 26.3-kb segment as the “critical region.”
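The size of the critical region follows directly from the numbers above; a short check using only the values reported in the text:

```python
shared_kb = 23.9  # shared haplotype 4 segment reported in the text
pad_kb = 1.2      # extension added on each side

critical_kb = round(shared_kb + 2 * pad_kb, 1)    # total region examined
pad_percent = round(100 * pad_kb / shared_kb, 1)  # padding relative to shared segment
print(critical_kb, pad_percent)
```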

To identify variants that were present in the critical region of haplotype 4 but not on any of the other homozygous haplotypes, we defined consensus sequences of this region for the European American haplotypes 1 and 2 and the African American haplotypes 1 through 4 (see “Methods”). We then conducted pairwise comparisons between sequence variants in the high-risk haplotype 4 critical region and each of the five other haplotypes. Among the 58 variant sites in the critical region, nine were specific to the haplotype 4 consensus sequence, occurring at frequencies between 0.75 and 0.97 in African Americans who were homozygous for haplotype 4 (Table 2). We then expanded the sample to include all sequences from cases and controls, not just those homozygous for the 5-SNP haplotypes. The nine variants were absent in 187 European American sequences, were highly enriched in 386 African American haplotype 4 sequences (frequencies between 0.495 and 0.756), and were present at lower frequencies on the other 1511 African American haplotypes (frequencies between 0.019 and 0.267) (Table 2). These frequency distributions are similar to those observed in worldwide populations (Additional file 1: Table S4). These data suggested that one or more of these nine variants contribute to asthma risk in African Americans.
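The pairwise comparison against the other consensus sequences amounts to a set difference. The sketch below assumes each consensus sequence has already been reduced to the set of variant alleles it carries. The haplotype labels, variant IDs, and the function name `haplotype_specific` are hypothetical and do not correspond to the real data.

```python
def haplotype_specific(consensus, target):
    """Variants carried by the target consensus but by no other consensus."""
    others = set().union(*(v for name, v in consensus.items() if name != target))
    return consensus[target] - others

# Hypothetical consensus variant sets; labels and variant IDs are invented.
consensus = {
    "EA_hap1": {"v1"},
    "EA_hap2": {"v2", "v3"},
    "AA_hap1": {"v1"},
    "AA_hap2": {"v2", "v3"},
    "AA_hap3": {"v3", "v4"},
    "AA_hap4": {"v3", "v5", "v6"},
}
specific = haplotype_specific(consensus, "AA_hap4")
print(sorted(specific))
```

Here `v3` is excluded because it also appears on other haplotypes, leaving only the variants unique to the target.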

Table 2 Location and frequencies of the nine novel African-specific SNPs in the full sample

Functionally characterizing the African American–specific variants on haplotype 4

The nine African-specific variants that were enriched on haplotype 4 spanned from intron 6 of GSDMB to an intergenic region between ORMDL3 and LRRC3C. We hypothesized that these variants modulate asthma risk by impacting the expression of cis genes. To test this hypothesis, we extracted processed RNA-seq data [39] for the 27 genes whose TSSs were within 500 kb of each of the nine variants and that were detected as expressed in upper airway (nasal) epithelial cells from 189 African American URECA children. Ancestry principal components (PCs) 1 and 2 for URECA children are shown in Additional file 1: Fig. S5. We then performed eQTL analyses of these genes and the nine African-specific variants. At a nominal (uncorrected) p-value <0.05, all nine novel variants were cis-eQTLs for only one of the 27 genes, gasdermin A (GSDMA). At a false discovery rate (FDR) of ≤0.10, seven of the nine variants remained significant (p ≤ 2.5×10⁻³; Table 3). The results for all analyses are shown in Additional file 2. The alleles on high-risk haplotype 4 were associated with increased expression of GSDMA (e.g., rs113282230-T p = 1.02×10⁻³, b = 0.086; Fig. 3A). This eQTL effect on GSDMA expression was replicated in nasal epithelial cell transcripts from 534 individuals of African ancestry in the CAAPA2 cohort (rs28623237-G p = 8.65×10⁻⁵, b = 0.116; Additional file 1: Fig. S6).
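The step from nominal p-values to an FDR threshold can be illustrated with the Benjamini-Hochberg procedure. The text does not name the FDR method used, so Benjamini-Hochberg is an assumption here, and the p-values below are invented rather than taken from Table 3.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented nominal p-values for five hypothetical variant-gene tests.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(toy_p)
significant = [p for p, qv in zip(toy_p, q) if qv <= 0.10]
print(q, significant)
```

In practice the adjusted values depend on the full set of tests entered into the correction (here, variant-gene pairs), which is why a nominally significant test may or may not survive.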

Table 3 cis-eQTL mapping results for the nine novel variants

Fig. 3

Functional characteristics of the African-specific novel variants on the high-risk asthma haplotype. A rs113282230, as a representative of the novel variants, is an eQTL for GSDMA but no other genes in upper airway epithelial cells (see Table 3 for results with all nine variants and Additional file 2 for results with all genes). B Upper panel: Chromosomal region from the 26.3-kb critical region (thick black bar) to the GSDMA gene on chromosome 17q12-q21. Vertical lines at the top show the locations of all variants in the critical region. The locations of the four genes in the region are shown, along with pcHi-C interactions (red arc) from a region in intron 1 of ORMDL3 to GSDMA. H3K27ac peaks (read counts; light blue tracks) in primary normal human epidermal keratinocytes (NHEK) (ENCODE) are shown in a region overlapping with the pcHi-C capture. Lower panel: Close-up of the 26.3-kb critical region. The nine African-specific variants that are enriched on haplotype 4 and are eQTLs for GSDMA are shown in red. The same H3K27ac peaks as in the upper panel, in addition to tracks of DNase clusters across all ENCODE cell lines, are shown. The darker the tracks, the denser the DNase cluster. Two of the nine variants, rs113282230 and rs113571956, overlap with the marks of an active enhancer (H3K27ac), open chromatin (DNase), and a putative enhancer (pcHi-C). See Fig. 4 and Additional file 1: Table S7 for additional annotations in airway epithelial cells and Additional file 1: Fig. S13 and Table S8 for additional annotations in immune cells

Although SNPs within GSDMA at the distal end of the locus were more significant eQTLs for GSDMA in the airway epithelial cells (Additional file 1: Table S5), the LD between the six most significant African-specific SNPs in the 17q12-q21 core region and the eQTL SNPs in the GSDMA gene was small (LD r² ≤ 0.28; Additional file 1: Fig. S7). Consistent with this observation, the eQTL effect of rs113282230-T was only modestly reduced when we included eQTL tag SNPs from each LD block in GSDMA as a covariate in the eQTL model for rs113282230 (p = 2.44×10⁻³, b = 0.744 conditioned on rs3859129 and p = 4.69×10⁻³, b = 0.708 conditioned on rs4795406; see Additional file 1: Fig. S8). These conditional analyses indicate that the observed eQTL effects of the novel 17q12-q21 SNPs are independent of the eQTL SNPs in the GSDMA gene.
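The logic of a conditional eQTL analysis can be sketched with ordinary least squares: the effect of the SNP of interest is estimated after the tag SNP has been regressed out of both expression and genotype, which gives the same coefficient as including the tag SNP as a covariate. This is a simplified illustration with invented genotype dosages and expression values. It omits the other covariates a real eQTL model would include, and the function names are hypothetical.

```python
def _slope_intercept(x, y):
    """Simple least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def _residuals(x, y):
    """Residuals of y after regressing out x."""
    slope, intercept = _slope_intercept(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def conditional_effect(expr, snp, covariate_snp):
    """Effect of snp on expr after adjusting both for covariate_snp."""
    r_expr = _residuals(covariate_snp, expr)
    r_snp = _residuals(covariate_snp, snp)
    return _slope_intercept(r_snp, r_expr)[0]

# Invented dosages (0/1/2) and noise-free expression for illustration.
snp = [0, 1, 2, 0, 1, 2, 0, 1]
tag = [0, 0, 1, 1, 2, 2, 1, 0]
expr = [1 + 0.5 * s + 0.2 * t for s, t in zip(snp, tag)]
b_conditional = conditional_effect(expr, snp, tag)
print(b_conditional)
```

If the SNP's effect were driven entirely by LD with the tag SNP, the conditional estimate would shrink toward zero; an estimate that is largely retained points to an independent signal.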

The novel variants with eQTL effects are enriched on haplotypes that also carry the main risk allele, rs2305480-G (Table 2), which is an eQTL for GSDMB [17]. To determine whether the effects of the novel variants on GSDMA expression were independent of the effects of rs2305480 on GSDMB expression, and whether each was a specific eQTL for a different member of the gasdermin gene family, we performed eQTL studies with rs2305480 on GSDMA expression and with rs113282230 (as a surrogate for the nine novel variants) on GSDMB expression in the airway epithelial cells from URECA African American children (Additional file 1: Fig. S9). These results indicated that rs2305480 is an eQTL for GSDMB but not for GSDMA and rs113282230 is an eQTL for GSDMA but not GSDMB, consistent with the LD pattern between rs113282230 and other common variants in the 17q12-q21 core region (r² < 0.11; Additional file 1: Fig. S10).
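For readers less familiar with the LD statistic quoted here, r² between two biallelic variants can be computed from phased haplotypes as D² divided by the product of the four allele frequencies. The sketch below uses invented haplotype counts and a hypothetical function name, and it assumes phased data with alleles coded 0/1.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic sites from phased (a, b) allele pairs coded 0/1."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Invented examples: two sites that always co-occur, and two weakly linked sites.
perfect = [(1, 1)] * 3 + [(0, 0)] * 7
weak = [(1, 1)] * 2 + [(1, 0)] * 3 + [(0, 1)] * 3 + [(0, 0)] * 2
print(ld_r2(perfect), ld_r2(weak))
```

A low r², as reported between rs113282230 and the other common core-region variants, means that genotype at one site carries little information about genotype at the other.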

Because of the strong LD between the nine novel variants, it was not possible to statistically determine which variants impart functional effects on gene regulation at this locus. Therefore, we examined an active enhancer mark (H3K27ac) and areas of open chromatin assessed by DNase in multiple cell lines from ENCODE [19] and by ATAC-seq in two airway epithelial cell lines (human bronchial epithelial cells, 16HBE, and small airway epithelial cells, SAEC). Two of the GSDMA eQTL variants, rs113282230 and rs113571956, overlapped with active enhancer marks, DNase clusters in multiple cell types, and ATAC-seq peaks in airway epithelial cells (Figs. 3B and 4). DNase hypersensitivity sites of open chromatin in all ENCODE cells and in immune cells are shown in Additional file 1: Table S6 and Fig. S11, respectively. Next, we extracted published data on promoter capture Hi-C (pcHi-C) in lower airway (bronchial) epithelial cells [20] and examined interactions with the region containing the novel variants. Two interactions were observed between the promoter of GSDMA and the genomic region characterized by marks of active enhancers and open chromatin, which included rs113282230 and rs113571956 (Capture HiC Analysis of Genomic Organization [CHiCAGO] scores = 6.01 and 5.07; Fig. 3B). Additional interactions and open chromatin marks are shown in Fig. 4 and Additional file 1: Table S7. These data suggest that rs113282230 and rs113571956 reside in an enhancer region that regulates the expression of GSDMA via chromatin looping and direct interaction with its promoter, and they provide a mechanistic explanation for how two novel variants in an intron of ORMDL3 regulate the expression of GSDMA, 33.5–54.5 kb away.

Fig. 4

pcHi-C loops and ATAC-seq peaks at the 17q12-q21 locus from IKZF3 to GSDMA. The region harboring the 9 novel variants is shown in yellow and the locations of the variants are shown as vertical lines under the genes. The two candidate variants are indicated by an orange arrow. H3K27ac marks in NHEK (skin) cells from ENCODE are shown as blue tracks (also see Fig. 3B and Additional file 1: Fig. S11). ATAC-seq tracks of open chromatin for two airway epithelial cell lines (16HBE and SAEC) are shown in green. All pcHi-C interactions within this view in airway epithelial cells are shown. Two interactions between GSDMA and three of the nine variants (±1 kb) were observed (shown as red loops). Two of those variants (orange arrow) were also eQTLs for GSDMA. All genes showing pcHi-C interactions with the 9 variants (±1 kb) are shown in Additional file 1: Table S7

To evaluate functional evidence for the 9 variants in immune cells, we used published eQTL data in the eQTL browser (https://fivex.sph.umich.edu/variant/eqtl/17_39927157?group_by=symbol&n_labels=5&study%5B%5D=Schmiedel_2018&tss_distance=500000&y_field=log_pvalue) and pcHi-C data in the Open Target browser (https://genetics.opentargets.org/variant/17_39927157_A_T). The three most significant eQTLs in immune cells were with increased expression of long non-coding RNAs AC08112.1 and AC090884.2 in naïve Tregs (p=0.0038) and Th1-17 memory cells (p=0.0043), respectively, and with decreased expression of RARA in Th1-17 memory cells (p=0.010) (Additional file 1: Fig. S12). None of the promoters of these eGenes interacted with regions that overlapped with the novel variants in pcHi-C data in immune cells (Additional file 1: Fig. S13 and Table S8). Thus, these combined data do not support a role for the novel variants regulating the expression of genes in immune cells.

Associations between African-specific variants and clinical correlates of asthma

The results described above highlighted two novel African-specific variants, rs113282230 and rs113571956, that were enriched on the asthma high-risk haplotype, were eQTLs for GSDMA, and mapped within a putative enhancer element that physically interacted with the promoter of GSDMA. These SNPs were in perfect LD in our sample (r² = 1; Additional file 1: Fig. S7), so we arbitrarily selected one (rs113282230) for further analyses with clinical measures. We first examined seven asthma-associated quantitative traits that were available for the African American children in both the APIC and URECA cohorts (n=607). Descriptions of these cohorts are shown in Additional file 1: Table S1; ancestry PCs 1–2 in each population are shown in Additional file 1: Fig. S5. These seven traits represented the lung function (pre-bronchodilator %predicted forced expiratory volume in 1 s [FEV1], n=607; FEV1/forced vital capacity [FVC], n=601; bronchodilator response, n=588), airway inflammation (fractional exhaled nitric oxide [FeNO], n=423), allergic (total immunoglobulin E [IgE], n=604), and immune cell (blood eosinophil count and blood neutrophil count, n=606) components of asthma.

Three phenotypes were associated with rs113282230 at nominal significance (p<0.05): %predicted FEV1 (p = 9.06×10⁻³), blood neutrophil count (p = 0.016), and total IgE (p = 0.042) (Fig. 5A). The asthma risk alleles were associated with lower values of FEV1, total IgE, and neutrophil counts. None of the tests were significant after adjusting for seven tests using the conservative Bonferroni correction (p <0.007). However, using the correlation between z-scores of the seven traits, we calculated the probability of observing at least three tests with p<0.05 by chance and rejected the global null hypothesis that none of the traits are associated with rs113282230 (p = 0.0089) [44]. The results for all nine variants and all seven traits are shown in Additional file 1: Table S9.
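The two multiple-testing calculations above can be illustrated with a short sketch. The Bonferroni threshold is simply 0.05 divided by the number of tests. For the global test, the code below computes only a simplified baseline that assumes the seven tests are independent, using a binomial tail probability. The published analysis instead accounts for the correlation between traits [44], so the value here is not expected to reproduce the reported p = 0.0089. The function name is hypothetical.

```python
import math

def prob_at_least_k(n_tests, k, alpha=0.05):
    """P(at least k of n_tests independent tests have p < alpha) under the null."""
    return sum(
        math.comb(n_tests, j) * alpha**j * (1 - alpha) ** (n_tests - j)
        for j in range(k, n_tests + 1)
    )

bonferroni_threshold = 0.05 / 7           # per-test threshold for seven traits
p_three_or_more = prob_at_least_k(7, 3)   # independence baseline only
print(bonferroni_threshold, p_three_or_more)
```

Correlation between traits generally makes clusters of small p-values more likely by chance than the independence baseline suggests, which is why a correlation-aware calculation is the more appropriate test.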

Fig. 5

Clinical phenotype associations with the novel variants and haplotype 4. A Correlation plot of the seven asthma-associated quantitative phenotypes in African American children from the URECA and APIC cohorts and their association with rs113282230 genotypes. B Bar plot showing the frequency of the 5-SNP high-risk haplotype 4 by STEP classification categories (mild, moderate, and severe) [47] in African American adults from Chicago. Severity categories and sample sizes are shown on the x-axis and the frequency of haplotype 4 is shown on the y-axis. Haplotype 4 was used as a surrogate for the nine novel SNPs because neither whole-genome sequences nor imputed genotypes for these variants were available for these individuals. C Bar plot of the frequencies of asthma severity categories in African American adults with asthma who carry at least one rs2305480-G allele, stratified by the presence or absence of haplotype 4 (x-axis). None of these individuals were homozygous for haplotype 4

Taken together with the chromatin annotations and the results of the haplotype studies in the CREW cohorts, these clinical data suggested that the rs113282230-T allele increases the risk of asthma in carriers of the rs2305480-G allele. To examine this more directly, we tested the additive effects of the rs113282230-T allele in rs2305480-AA (low risk) and rs2305480-GG (high-risk) homozygotes (coded as 0 or 1). If the rs113282230-T allele had no effect on risk, then rs2305480-GG homozygotes should have similar risk regardless of the number of rs113282230-T alleles. In contrast to this null expectation, we observed an increasing prevalence of asthma with increasing numbers of rs113282230-T alleles among rs2305480-GG homozygotes, although this effect did not reach significance in this sample (OR = 1.34, 95% CI 0.95, 1.88; P = 0.096) (Additional file 1: Fig. S14).

Associations between African-specific variants and asthma severity

To further generalize these results to asthma severity and to adults, we examined available data on severity for 63 African American asthmatic adults who have participated in genetic studies in Chicago [45]. Because we did not have sequence data for these individuals, we used as a surrogate the 5-SNP haplotype and tested for an association between the high-risk haplotype 4 and asthma severity, defined as mild, moderate, and severe based on lung function and steroid use [47]. Consistent with the analysis of APIC and URECA children described above, the frequency of haplotype 4 increased with increasing asthma severity in African American adults (ordinal logistic regression β=1.58, 95% CI 0.38, 2.79; p = 0.012) (Fig. 5B). To directly test whether the presence of haplotype 4 adds to the risk conferred by the rs2305480-G allele, we further stratified the 59 adults who carried at least 1 copy of rs2305480-G into two groups based on whether they also carried 0 (n=41) or 1 (n=18) copies of haplotype 4 (none of the subjects carried two copies of haplotype 4). We compared the number of subjects who were classified as mild, moderate, or severe within the two groups. If haplotype 4 did not impact asthma severity beyond the effects of the rs2305480-G allele, the distributions by severity should be similar in the two groups. However, among adult asthmatics with at least one copy of the rs2305480-G allele, there was a greater proportion of severe asthma cases among those also carrying haplotype 4 compared to those not carrying haplotype 4 (ordinal logistic regression β = 1.68, 95% CI 0.42, 2.93; p = 0.011; Fig. 5C, Additional file 1: Table S10). These combined results support a role for the novel variants, which are enriched on haplotype 4, in asthma severity in both children and adults.
