Weighted burden analysis of rare coding variants in 470,000 exome-sequenced UK Biobank participants characterises effects on hyperlipidaemia risk

Table 1 shows the results of the primary analysis. Three of the four protein-coding genes which reached exome-wide significance in the earlier study, LDLR, PCSK9 and ANGPTL3, show convincing evidence of association with hyperlipidaemia, with SLPs of 87.28, −29.08 and −7.48 respectively. However, the other gene, IFITM5, shows no evidence of association, with an SLP of only 0.44, so it seems reasonable to conclude that the original result for this gene represented a type 1 error. Of the remaining genes which were originally significant at p < 0.001, most show no evidence of association in this new sample and can be dismissed as chance findings. However, ABCG5, APOC3 and NPC1L1 all produce SLPs which are statistically significant after correcting for testing 47 genes, with values of 5.08, −11.01 and −3.42 respectively. These six genes were carried forward for secondary analyses along with ANGPTL4, which was considered to be of interest because of its similarity to ANGPTL3, even though it achieved SLPs of only −3.66 in the first sample and −1.95 in the second.

The original study considered 22,642 genes, meaning that for a gene-wise result to be considered exome-wide significant the magnitude of the SLP obtained should exceed −log10(0.05/22,642) = 5.66. For the seven genes carried forward, the results of weighted burden analysis in the entire sample of 106,091 cases and 363,674 controls are also shown in Table 1. All six genes which were statistically significant after correction for multiple testing in the second sample are also exome-wide significant in the full sample. However, the SLP for ANGPTL4 in the combined sample is only −4.79.
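The two significance thresholds used here follow from a Bonferroni correction expressed on the SLP scale. The sketch below is illustrative only: it takes the SLP to be a signed −log10 p value, as the threshold formula implies, and the helper names `slp_threshold` and `passes` are hypothetical. The SLP values are those quoted in this section.

```python
import math

def slp_threshold(n_tests, alpha=0.05):
    """Bonferroni threshold on the |SLP| scale: -log10(alpha / n_tests)."""
    return -math.log10(alpha / n_tests)

def passes(slp, n_tests, alpha=0.05):
    """True if the magnitude of the SLP exceeds the corrected threshold."""
    return abs(slp) > slp_threshold(n_tests, alpha)

# Second-sample SLPs for genes followed up from the first sample (47 tests).
second_sample = {"ABCG5": 5.08, "APOC3": -11.01, "NPC1L1": -3.42, "ANGPTL4": -1.95}
# Full-sample SLPs for the seven genes carried forward (22,642 tests).
full_sample = {"LDLR": 156.81, "ABCG5": 6.95, "PCSK9": -48.57, "NPC1L1": -7.60,
               "APOC3": -13.19, "ANGPTL3": -12.68, "ANGPTL4": -4.79}

print(f"47-gene threshold:    {slp_threshold(47):.2f}")
print(f"exome-wide threshold: {slp_threshold(22642):.2f}")
for gene, slp in second_sample.items():
    print(gene, slp, "passes 47-gene correction:", passes(slp, 47))
for gene, slp in full_sample.items():
    print(gene, slp, "exome-wide significant:", passes(slp, 22642))
```

On these figures, ANGPTL4 is the only one of the seven genes whose full-sample SLP falls short of the exome-wide threshold.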

To gain insight into the effects of different categories of variant within these seven genes of interest, counts of variants of each category in each subject were entered into a multiple logistic regression analysis along with sex and 20 principal components as covariates. These results are shown in Tables 2–4 and are summarised briefly as follows.
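To illustrate the form of this model, the following is a minimal sketch and not the study's pipeline: it regresses case status on a single variant-category count by Newton-Raphson, omitting sex, the 20 principal components and the other variant categories. The function `fit_logistic` and the toy counts are hypothetical.

```python
import math

def fit_logistic(rows, n_iter=25):
    """Fit logit P(case) = b0 + b1 * count by Newton-Raphson.

    rows is a list of (count, is_case, n_subjects) tuples, so subjects
    sharing the same count and case status form one weighted row.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, w in rows:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += w * (y - p)          # gradient of the log-likelihood
            g1 += w * (y - p) * x
            v = w * p * (1.0 - p)      # information-matrix weight
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: add inverse(information matrix) times gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data (made up): each subject carries 0 or 1 variants of one category.
toy = [(1, 1, 30), (1, 0, 10), (0, 1, 1000), (0, 0, 3000)]
b0, b1 = fit_logistic(toy)
print(f"OR per variant: {math.exp(b1):.2f}")
```

With a single 0/1 predictor and no covariates, the fitted odds ratio equals the crude odds ratio of the corresponding 2×2 table. Estimates from the full model will generally differ because they are conditioned on the other terms.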

Table 2 Results from logistic regression analysis showing the contribution different categories of variant within each gene make to risk of hyperlipidaemia
Table 3 Results of variant category analysis for PCSK9, NPC1L1 and APOC3
Table 4 Results of variant category analysis for ANGPTL3 and ANGPTL4

Variants in LDLR (SLP = 156.81) and ABCG5 (SLP = 6.95) increase risk of hyperlipidaemia and results for each variant category are shown in Table 2. Table 2A shows the results for LDLR and it can be seen that LOF variants are associated with hyperlipidaemia risk with OR > 20. A total of 113 participants carry a LOF variant and all but 16 of these are cases. Of note, there are also 19 subjects who carry an inframe indel and all but 2 of these are also cases, again yielding an OR over 20 though with a wide confidence interval. Detailed inspection of these results reveals that they are driven by two inframe deletions, 19:11105556 ATGG > A (rs121908027), which is carried by 8 participants who are all cases, and 19:11116925 ACGG > A (rs1221971156), which is carried by 7 participants, 6 of whom are cases. The first of these, rs121908027, is reported to be the most common familial hyperlipidaemia (FH) mutation in Ashkenazi Jews and was found in 35% of FH families in Israel [15]. As well as the large effect of these LOF and indel variants, there is statistically significant evidence for an overall small effect on risk of the much commoner variants in the “Protein altering” category (consisting mostly of nonsynonymous variants), with OR of 1.12, and a further modest increase in risk if these are annotated as deleterious by SIFT and/or possibly or probably damaging by PolyPhen, with ORs of 1.44, 1.27 and 1.61. While all these categories of variant are associated with increased risk of hyperlipidaemia, the category “Splice region” is actually associated with reduced risk, with OR of 0.82 and SLP of −19.81. This result is driven by 19:11120527 G > A (rs72658867), which has MAF 0.0126 in controls and 0.0096 in cases and which has previously been reported to lower HDL cholesterol and to be protective against coronary artery disease [16].
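As a rough cross-check on the LDLR carrier counts quoted above, crude odds ratios can be computed directly from 2×2 tables. This sketch assumes the counts refer to the full sample of 106,091 cases and 363,674 controls. It uses a Woolf (log-scale) confidence interval, which is a standard approximation and not necessarily how the intervals in Table 2 were obtained. Unlike the logistic regression estimates, it is unadjusted for sex and principal components. The function name `crude_or` is hypothetical.

```python
import math

N_CASES, N_CONTROLS = 106_091, 363_674

def crude_or(carrier_cases, carrier_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf 95% confidence interval."""
    a, b = carrier_cases, N_CASES - carrier_cases
    c, d = carrier_controls, N_CONTROLS - carrier_controls
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

# LOF variants: 113 carriers, all but 16 of them cases.
lof = crude_or(113 - 16, 16)
# Inframe indels: 19 carriers, all but 2 of them cases.
indel = crude_or(19 - 2, 2)

for label, (odds_ratio, lower, upper) in [("LOF", lof), ("Inframe indel", indel)]:
    print(f"{label}: OR = {odds_ratio:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Under these assumptions both crude estimates exceed 20, and the interval for the inframe indels is much wider because only two carriers are controls.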

Table 2B shows the results for ABCG5 and it can be seen that although a few hundred participants carry LOF variants, these do not appear to have any strong effect on hyperlipidaemia risk, with an OR of 1.2 which is not statistically significant. Instead, the signal for this gene seems to be driven largely by the “Splice region” category, with OR of 1.44 and SLP of 6.20. Although there are 74 variants in this category, the result seems to be mainly driven by three variants which are somewhat commoner in cases, 2:43813316 A > C (rs114780578), 2:43822939 G > C (rs370895243) and 2:43825025 A > T (rs201469377). The category “Protein altering” yields an OR of 1.05 and an SLP of 2.07 but there is no suggestion that nonsynonymous variants recognised as more severe by SIFT or PolyPhen are associated with increased risk. This result may be largely driven by 2:43813208 T > C (rs140374206), which has frequency 0.006049 in controls and 0.006246 in cases and which has previously been reported to be associated with raised non-HDL cholesterol and increased risk of gallstones [17].

Variants in PCSK9 (SLP = −48.57), NPC1L1 (SLP = −7.60) and APOC3 (SLP = −13.19) are protective against hyperlipidaemia and their detailed results are shown in Table 3. As can be seen in Table 3A, LOF variants in PCSK9 reduce hyperlipidaemia risk, with OR of 0.39. On average, protein altering variants in general have a mild effect on lowering risk, with OR of 0.92, but those which are additionally annotated as deleterious by SIFT have a larger effect, with OR of 0.69, whereas there is no additional effect associated with being characterised as possibly or probably damaging by PolyPhen.

The results in Table 3B show that the overall signal for NPC1L1 is mainly due to LOF variants, with OR of 0.64 and SLP of −5.20, with possibly some additional contribution from variants annotated as deleterious by SIFT, which have OR of 0.89 and SLP of −1.76. A similar scenario is seen for APOC3 in Table 3C, with LOF variants having an OR of 0.68 and SLP of −11.22 but with other categories not showing clear evidence of association.

Variants in ANGPTL3 (SLP = −12.68) and ANGPTL4 (SLP = −4.79) also appear to be protective against hyperlipidaemia, although the result for ANGPTL4 is not exome-wide significant. Nevertheless, as the products of both genes modulate the activity of lipoprotein lipase and as inactivating variants in both genes have previously been shown to be associated with hypolipidaemia, it seems appropriate to present the detailed results for both, as shown in Table 4 [18, 19]. For ANGPTL3 the signal is again mainly due to LOF variants, with OR of 0.59 and SLP of −8.36, but it can be seen that splice region variants also have OR of 0.69 and SLP of −4.70. This latter result is driven by 1:62598067 T > C (rs372257803), which has MAF 0.00100 in controls and 0.00069 in cases and which has been previously reported to be associated with lower non-HDL cholesterol and triglycerides [20]. The results for ANGPTL4 are not statistically significant after correction for multiple testing but it can be seen that they are consistent with the possibility that LOF variants lower hyperlipidaemia risk modestly, with OR of 0.78 and SLP of −1.84.
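For the individual variants highlighted in this section, an approximate allelic odds ratio can be recovered from the reported allele frequencies in cases and controls. This is a back-of-envelope sketch, not the analysis used in the study: the helper `allelic_or` is hypothetical and the result is unadjusted for covariates.

```python
def allelic_or(maf_cases, maf_controls):
    """Odds of the minor allele in cases relative to controls."""
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls

# (frequency in cases, frequency in controls) as reported in the text.
variants = {
    "rs72658867 (LDLR)": (0.0096, 0.0126),
    "rs140374206 (ABCG5)": (0.006246, 0.006049),
    "rs372257803 (ANGPTL3)": (0.00069, 0.00100),
}
for name, (cases, controls) in variants.items():
    print(f"{name}: allelic OR = {allelic_or(cases, controls):.2f}")
```

Values below 1 indicate an allele that is rarer in cases, as for the LDLR and ANGPTL3 splice region variants, and values above 1 an allele that is commoner in cases.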
