Mendelian randomization and genetic colocalization infer the effects of the multi-tissue proteome on 211 complex disease-related phenotypes

Inferring multi-tissue protein effects on disease-related phenotypes using cis study-wide significant pQTLs as IVs

Washington University cohort-specific analyses

First, we performed analyses using only the pQTLs identified in the WashU cohort, which included 835 CSF, 529 plasma, and 380 brain samples in which 713, 931, and 1079 proteins, respectively, were measured using the SOMAscan platform (1305 panel) and passed QC.

We initially performed the MR analyses using the study-wide significant cis pQTLs (103, 47, and 15 protein-locus pairs in the CSF, plasma, and brain, respectively). Horizontal pleiotropy can lead to false-positive results in MR analyses. Although cis pQTLs are known to be less susceptible to horizontal pleiotropy than trans pQTLs [1, 7, 9], we excluded pleiotropic cis pQTLs (defined as those associated with five or more proteins) from the IVs. We also performed colocalization analyses to examine confounding by LD; colocalization provides complementary evidence for the inference by decreasing the likelihood that an association is driven by LD. Furthermore, we used Steiger filtering to identify the correct direction of inference, and thereafter kept only protein-phenotype pairs in which the protein affects the phenotype (Figs. 2 and 3, Table 1, Additional file 2: Tables S2-S4).
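The two filters described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and input layout are hypothetical, the variance explained is approximated from the marginal t-statistic, and the direction check is a simplified form of Steiger filtering that compares the two variance-explained values without the accompanying significance test.

```python
from collections import defaultdict


def drop_pleiotropic(pqtls, max_proteins=5):
    """Keep (variant, protein) pairs whose variant is associated with
    fewer than `max_proteins` distinct proteins (five or more = pleiotropic)."""
    proteins_by_variant = defaultdict(set)
    for variant, protein in pqtls:
        proteins_by_variant[variant].add(protein)
    return [(v, p) for v, p in pqtls if len(proteins_by_variant[v]) < max_proteins]


def variance_explained(beta, se, n):
    """Approximate r^2 of a single variant as t^2 / (t^2 + n - 2) (an assumption)."""
    t2 = (beta / se) ** 2
    return t2 / (t2 + n - 2)


def steiger_direction_ok(beta_exp, se_exp, n_exp, beta_out, se_out, n_out):
    """True when the instrument explains more variance in the exposure (protein)
    than in the outcome (phenotype), i.e. the protein -> phenotype direction."""
    r2_exposure = variance_explained(beta_exp, se_exp, n_exp)
    r2_outcome = variance_explained(beta_out, se_out, n_out)
    return r2_exposure > r2_outcome
```

A pair would be retained only if its instrument survives `drop_pleiotropic` and `steiger_direction_ok` returns `True`.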

Fig. 2

Significant protein-phenotype associations identified using cis-only study-wide pQTLs as instrumental variables. Heatmaps were generated using the analyses on the WashU cohort only. A Thirty-three proteins against 37 diseases in CSF. B Thirteen proteins against 18 diseases in the plasma. C Five proteins against eight diseases in the brain. Colors were coded by 5 bins after cutting z-normalized beta MR estimate: below − 10 as dark blue, − 10 to − 5 as dodger blue, − 5 to 0 as cadet blue1, 0 to 5 as antique white1, and 5 to 10 as gold. Phenotype categories were listed on the left side as a bar plot (neurological diseases as blue, biological traits as red, blood traits as orange, cancers as purple, non-neurological diseases as green, and other risk factors as khaki)

Fig. 3

Miami plots for the cis-only study-wide pQTLs as IVs for all MR and colocalization analyses. Each dot represents the MR results for proteins on human phenotypes. A CSF. B Plasma. C Brain. Phenotype categories were color-coded: biological traits as red, blood traits as orange, cancers as purple, non-neurological diseases as green, neurological diseases as blue, and other risk factors as khaki; for protein-phenotype associations not significant or not colocalized, the color is dark/light gray

Table 1 Summary of replication on MR results using study-wide cis pQTLs with WashU cohort

Requiring both significant MR results (FDR < 0.05) and strong colocalization evidence (PP > 80%), we found that 33 CSF proteins were associated with 37 phenotypes (Figs. 2A and 3A), 13 plasma proteins were associated with 18 phenotypes (Figs. 2B and 3B), and five brain proteins were associated with eight phenotypes (Figs. 2C and 3C). In CSF (Fig. 2A), two proteins were associated with multiple phenotypes from the same category: (1) MSP was negatively associated with four general diseases (primary sclerosing cholangitis, Crohn’s disease, inflammatory bowel disease (IBD), ulcerative colitis (UC)) and three biological traits (years of schooling, forced vital capacity (FVC), forced expiratory volume in 1 second (FEV1)) and (2) TNFSF15 was negatively associated with Crohn’s disease, IBD, and UC.

In the plasma (Fig. 2B), six proteins were associated with multiple phenotypes from the same category: (1) ADAMTS-5 was positively associated with two biological traits (height and FEV1), (2) coagulation factor XI was positively associated with two diseases (deep venous thrombosis (DVT) and pulmonary embolism ± DVT), (3) CREL1 was negatively associated with two biological traits (height and weight), (4) KYMU was negatively associated with two biological traits (FVC and FEV1), (5) lysozyme was positively associated with two blood traits (hypertension and high cholesterol), and (6) WFKN2 was negatively associated with two biological traits (body mass index (BMI) and weight).

In the brain (Fig. 2C), protein CPNE1 was associated with multiple biological traits: it had a positive association with age at menopause and a negative association with hippocampus volume.
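The selection rule applied to these results (FDR < 0.05 for MR and PP > 80% for colocalization) can be sketched as below. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment is an assumption here, and the record keys (`pair`, `p_mr`, `pp_coloc`) are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_pairs(results, fdr_cutoff=0.05, pp_cutoff=0.80):
    """Keep pairs with adjusted MR p-value < fdr_cutoff and colocalization
    posterior probability > pp_cutoff."""
    fdr = benjamini_hochberg([r["p_mr"] for r in results])
    return [r["pair"] for r, q in zip(results, fdr)
            if q < fdr_cutoff and r["pp_coloc"] > pp_cutoff]
```

Both conditions must hold, so a pair with a strong MR signal but weak colocalization evidence is dropped.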

Among all these protein-phenotype pairs, we replicated previously reported findings in the plasma [7, 29] and brain [6] (Table 1, Additional file 2: Tables S2, S3). The replication rates in the CSF, plasma, and brain were 64%, 91%, and 86%, respectively, when compared with the plasma studies [7, 29]. On the other hand, our results did not replicate three previous findings in the brain after overlapping both proteins and phenotypes: CPNE1 on intelligence, cathepsin H on AD, and ALT on intelligence, where the study used a brain pQTL dataset from 144 samples [6].

Meanwhile, we uncovered 45%, 4%, and 12% novel protein-phenotype associations in the CSF, plasma, and brain, respectively (Table 1, Additional file 2: Table S4).

pQTL meta-analyses uncovered additional protein-phenotype pairs

Furthermore, to increase the power of our analyses, we performed two-sample MR using summary statistics from meta-analyses for CSF and plasma independently (Fig. 1B). The CSF meta-analysis combined two cohorts: PPMI, released in 2019 (N = 132) [20], and WashU [16]. We included the 709 CSF proteins shared by both studies.

For the plasma, we leveraged two cohorts, INTERVAL (N = 3301) [21] and SCALLOP (N = 30,931) [22], which were meta-analyzed with WashU [16]. For WashU and INTERVAL, we included the 746 plasma proteins shared by both studies, and for WashU, INTERVAL, and SCALLOP, we included the 49 plasma proteins shared by all three studies.
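As an illustration of how pQTL summary statistics from several cohorts can be combined, the sketch below uses a fixed-effect inverse-variance-weighted meta-analysis. This section does not state which meta-analysis model was used, so the choice is an assumption, as is the function name; effect alleles are assumed to be already aligned across cohorts.

```python
from math import erfc, sqrt


def ivw_meta(estimates):
    """Fixed-effect inverse-variance-weighted meta-analysis.

    estimates: list of (beta, se) for one variant-protein pair, one per cohort.
    Returns the pooled beta, its standard error, and a two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = sqrt(1.0 / total_weight)
    p = erfc(abs(beta / se) / sqrt(2.0))
    return beta, se, p
```

Because each cohort is weighted by 1/se², a large cohort dominates the pooled estimate, and the pooled standard error is never larger than that of the most precise cohort.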

These meta-analyses yielded 31 additional CSF and 215 additional plasma pQTLs, which led to 10 additional CSF proteins associated with 13 phenotypes (Additional file 1: Fig. S3A, Additional file 2: Tables S5-S7) with significant MR results and strong colocalization evidence. Our analyses also identified 12 additional plasma proteins associated with 14 phenotypes (Additional file 1: Fig. S3B, Additional file 2: Tables S5-S7). In CSF (Additional file 1: Fig. S3A), the protein IL-1 receptor type 1 (IL-1 sRI) was associated with multiple phenotypes: it was negatively associated with three general diseases (IBD, Crohn’s disease, UC) and positively associated with asthma. In the plasma (Additional file 1: Fig. S3B), the protein haptoglobin was negatively associated with two blood traits (LDL and total cholesterol) and positively associated with height. No protein-phenotype pair changed its direction of effect between the analyses before and after meta-analysis (36 pairs in CSF and four in plasma).

We successfully replicated the finding that CSF IL-1 sRI increased the risk of asthma, which was not found in our initial analyses (the meta-analysis strengthened the IL-1 sRI pQTL association, with the p-value improving from 1.09 × 10^−18 to 2.32 × 10^−25). An IL-1 receptor antagonist has been tested for attenuating asthmatic symptoms in animal models [35]. We also replicated the finding that plasma haptoglobin is associated with reduced LDL and total cholesterol levels, as reported by Boettger and colleagues [36]. Moreover, we highlighted the risk-increasing effect of plasma B7-H2 (or ICOS ligand) on rheumatoid arthritis (RA), as it has been validated in a mouse model of RA that anti-ICOS ligand domains can help reduce the disease symptoms [37].

Inferring multi-tissue protein effects on disease-related phenotypes using both cis and trans genome-wide significant pQTLs as IVs

In the previous section, we used only cis pQTLs that passed the most stringent study-wide threshold. This threshold, however, may miss real biological signals that a more permissive threshold could reveal. To increase the power of the MR analyses, we expanded them to include potentially non-pleiotropic cis and trans pQTLs that passed the genome-wide threshold as instrumental variables (p < 5 × 10^−8, F ≥ 10, and associated with fewer than five proteins).
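The three instrument criteria listed above can be expressed as a single filter. The sketch below is illustrative only: the record keys are hypothetical, and the F-statistic is approximated as the squared z-score of the pQTL association, which may differ from how the authors computed it.

```python
def f_statistic(beta, se):
    """Approximate single-variant F-statistic as (beta / se)^2 (an assumption)."""
    return (beta / se) ** 2


def select_instruments(pqtls, p_cutoff=5e-8, f_cutoff=10.0, max_proteins=5):
    """Return variants with p < p_cutoff, F >= f_cutoff, and fewer than
    `max_proteins` associated proteins.

    pqtls: list of dicts with keys 'variant', 'beta', 'se', 'p', 'n_proteins'.
    """
    return [
        q["variant"]
        for q in pqtls
        if q["p"] < p_cutoff
        and f_statistic(q["beta"], q["se"]) >= f_cutoff
        and q["n_proteins"] < max_proteins
    ]
```

Under the squared z-score approximation, a variant at p = 5 × 10^−8 already has F of roughly 30, so the F ≥ 10 condition would only bind if F is derived in another way, for example from the variance explained and the sample size.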

Washington University cohort-specific analyses

With this new threshold, 169, 116, and 50 cis and trans pQTLs in the CSF, plasma, and brain, respectively, were used for the MR and colocalization analyses of the WashU cohort (Fig. 4, Table 2, Additional file 2: Tables S8-S10). This led to the identification of 58 CSF proteins associated with 58 phenotypes (Fig. 4A), 32 plasma proteins associated with 44 phenotypes (Fig. 4B), and nine brain proteins associated with 16 phenotypes (Fig. 4C) with significant MR results (FDR < 0.05) and strong colocalization evidence (PP > 80%).

Fig. 4

Significant protein-phenotype associations identified using cis and trans genome-wide pQTLs as instrumental variables. Heatmaps were generated using the analyses on the WashU cohort only. A Fifty-eight proteins against 58 diseases in CSF. B Thirty-two proteins against 44 diseases in the plasma. C Nine proteins against 16 diseases in the brain. Colors were coded by 6 bins after cutting z-normalized beta MR estimate: below − 10 as dark blue, − 10 to − 5 as dodger blue, − 5 to 0 as cadet blue1, 0 to 5 as antique white1, 5 to 10 as gold, and above 10 as orange. Phenotype categories were listed on the left side as a bar plot (biological traits as red, blood traits as orange, cancers as purple, non-neurological diseases as green, neurological diseases as blue, personality traits as pink, and other risk factors as khaki)
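The six color bins described in this caption can be written as a small lookup. This is only a sketch of the binning rule: the caption does not say which bin a value exactly on a boundary falls into, so assigning boundary values to the upper bin is an assumption.

```python
from bisect import bisect_right

# Bin edges and color names as listed in the caption.
BIN_EDGES = [-10, -5, 0, 5, 10]
BIN_COLORS = ["dark blue", "dodger blue", "cadet blue1",
              "antique white1", "gold", "orange"]


def bin_color(z):
    """Map a z-normalized MR estimate to its heatmap color."""
    return BIN_COLORS[bisect_right(BIN_EDGES, z)]
```

For instance, `bin_color(-7)` falls in the − 10 to − 5 bin and `bin_color(12)` in the bin above 10.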

Table 2 Summary of replication on MR results using genome-wide cis and trans pQTLs with WashU cohort

Similar to the cis-only pQTL analyses above, we replicated findings reported previously in plasma studies [7, 29] and a brain study [6] (Table 2, Additional file 2: Table S8). In these new analyses, a total of 37, 37, and 10 CSF, plasma, and brain protein-phenotype pairs, respectively, had been previously reported. Several protein-phenotype associations, however, did not replicate because our instrumental variables were weaker than those in the prior studies (Table 2, Additional file 2: Table S9). Additional novel protein-phenotype findings (48, 12, and two in CSF, plasma, and brain, respectively) were also revealed after including both cis and trans genome-wide significant pQTLs (Table 2, Additional file 2: Table S10).

pQTL meta-analyses identified additional protein-phenotype pairs

We performed meta-analyses of two CSF cohorts and three plasma cohorts, leading to an additional 313 CSF and 711 plasma pQTLs as IVs. This approach identified 21 additional CSF proteins associated with 17 phenotypes (Additional file 1: Fig. S4A, Additional file 2: Tables S11-S13) and 15 additional plasma proteins associated with 15 phenotypes (Additional file 1: Fig. S4B, Additional file 2: Tables S11-S13). No protein-phenotype pair changed its direction of effect between the analyses before and after meta-analysis (70 pairs in CSF and three in plasma).

To identify what was absent from our initial analyses, which included only cis pQTLs, we compared the results obtained with the study-wide and the genome-wide p-value thresholds (Fig. 5, Additional file 2: Table S14). We identified additional associations for 45 CSF proteins with 42 phenotypes (Fig. 5A), 28 plasma proteins with 35 phenotypes (Fig. 5B), and five brain proteins with seven phenotypes (Fig. 5C).

Fig. 5

Additional significant protein-phenotype associations were identified after including cis and trans genome-wide pQTLs as instrumental variables. Heatmaps were generated using the analyses after meta-analyses. A Forty-five proteins against 42 diseases in CSF. B Twenty-eight proteins against 35 diseases in the plasma. C Five proteins against seven diseases in the brain. Colors were coded by 6 bins after cutting z-normalized beta MR estimate: below − 10 as dark blue, − 10 to − 5 as dodger blue, − 5 to 0 as cadet blue1, 0 to 5 as antique white1, 5 to 10 as gold, and above 10 as orange. Phenotype categories were listed on the left side as a bar plot (biological traits as red, blood traits as orange, cancers as purple, non-neurological diseases as green, neurological diseases as blue, and personality traits as pink)

In CSF (Fig. 5A), three proteins (DcR3, IL-1 sRII, Prekallikrein) were associated with more than two phenotypes within each category: (1) DcR3 was negatively associated with four biological traits (FVC, FEV1, diastolic blood pressure (DBP), systolic blood pressure (SBP)) and hypertension, while positively associated with BMI. (2) IL-1 sRII was positively associated with three diseases (Crohn’s disease, IBD, UC), while negatively associated with asthma. (3) Prekallikrein was positively associated with three diseases (Phlebitis and thrombophlebitis, DVT, and pulmonary embolism ± DVT).

In the plasma (Fig. 5B), two proteins (b-Endorphin, GRN) were associated with multiple phenotypes within each category: (1) b-Endorphin was negatively associated with four blood traits (HDL, total, high cholesterol, triglycerides) and positively associated with ER-negative breast cancer, and (2) GRN was positively associated with four blood traits (serum creatinine, LDL, total, high cholesterol) and negatively associated with HDL cholesterol. GRN also showed positive associations with three cardiovascular diseases (coronary heart disease (CHD), myocardial infarction, and angina) and negative associations with two biological traits (heel bone mineral density (BMD) and height).

In the brain (Fig. 5C), two proteins (PSP and OAS1) were concordantly associated with more than two phenotypes: (1) PSP was negatively connected to two general diseases (angina and asthma) and one biological trait (DBP), and (2) OAS1 was negatively related to one biological trait (BMI) and one neurological disease (AD). Particularly, we showed a consistent finding that the brain OAS1 is protective against AD risk as recently published by Magusali and colleagues [38]. Magusali et al. [38] reported that OAS1 is required to limit the pro-inflammatory response of human induced pluripotent stem cell-derived microglia.

Cross-tissue comparisons on tissue consistency of the protein-phenotype effects

To investigate whether the directions of effects were consistent across tissues given the same protein-phenotype pairs, we compared the significant MR results (FDR < 0.05) using meta-analyzed genome-wide significant cis and trans pQTLs with strong colocalization evidence (PP > 80%) across three tissues. We identified 15 pairs in more than one tissue, in which 13 pairs had consistent MR estimates (Fig. 6). Among these 13 tissue-consistent pairs, 10 were concordant between CSF and plasma (Fig. 6A, B), two between plasma and brain (Fig. 6A, C), and one between CSF and brain (Fig. 6A, D). For example, WFKN2 levels from CSF and plasma were consistently associated with two phenotypes: BMI and weight (Fig. 6A).

Fig. 6

Cross-tissue MR estimate comparisons. A Heatmaps were generated on the MR estimates given the same protein-phenotype pairs with a PP > 80% when performing colocalization. Colors were coded by 4 bins after cutting z-normalized beta MR estimate: − 10 to − 5 as dodger blue, − 5 to 0 as cadet blue1, 0 to 5 as antique white1, and 5 to 10 as gold. B Scatter plot of CSF vs plasma MR estimates on the same protein-phenotype associations. C Scatter plot of plasma vs brain MR estimates on the same protein-phenotype associations. D Scatter plot of CSF vs brain MR estimates on the same protein-phenotype associations

Two pairs showed discordant MR effect directions (Fig. 6A, B): (i) higher CSF ART (or AGRP, Agouti-related protein) was associated with higher levels of sodium in the urine, whereas higher plasma ART was associated with lower levels of the same trait; (ii) higher CSF TXD12 was associated with a higher risk of ER-positive breast cancer, whereas higher plasma TXD12 was associated with a lower risk of the same phenotype. Overall, we found that only a small proportion of protein effects on phenotypes were tissue-dependent.
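The concordance check described in this section amounts to comparing the sign of the MR estimate for the same protein-phenotype pair across tissues. A minimal sketch is given below; the data layout and function name are hypothetical, and the numeric values in the usage example are placeholders that only reproduce the directions reported in the text, not the actual estimates.

```python
from itertools import combinations


def cross_tissue_concordance(estimates):
    """Classify tissue pairs as concordant or discordant by sign.

    estimates: {(protein, phenotype): {tissue: beta_mr}}
    Returns two lists of (pair, tissue_1, tissue_2) tuples.
    """
    concordant, discordant = [], []
    for pair, by_tissue in estimates.items():
        # Pairs seen in a single tissue yield no combinations and are skipped.
        for (t1, b1), (t2, b2) in combinations(sorted(by_tissue.items()), 2):
            target = concordant if b1 * b2 > 0 else discordant
            target.append((pair, t1, t2))
    return concordant, discordant


# Placeholder magnitudes; only the signs follow the text.
example = {
    ("WFKN2", "BMI"): {"CSF": -1.0, "plasma": -1.0},
    ("ART", "urine sodium"): {"CSF": 1.0, "plasma": -1.0},
}
same, opposite = cross_tissue_concordance(example)
```

Here `same` holds the WFKN2-BMI pair and `opposite` holds the ART pair, mirroring the consistent and discordant examples given above.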

To estimate the enrichment of phenotypes in different tissues, we compared the phenotype-category proportions of the MR findings from each tissue (Fig. 7). Although the plasma protein MR findings showed a higher proportion of blood traits and the brain protein MR findings showed a higher proportion of neurological diseases, we found no statistically significant differences in phenotype-category proportions across tissues (Fig. 7A, C). As our previous study [16] suggested that trans, but not cis, pQTLs may be tissue-specific, we further split the MR findings from each tissue into cis-only and trans-additional findings to determine whether there is any tissue-specific phenotypic enrichment (Fig. 7B, C). In the proportion tests, the pairwise tissue comparisons involving the brain yielded larger p-values for the cis-only findings than for the trans-additional findings. This comparison may be underpowered, but the observation is partially consistent with our prior finding [16] that trans-pQTLs tend to be more tissue-specific than cis-pQTLs.

Fig. 7

Phenotype-category proportions of MR analyses from each tissue. Barplots were used to visualize the proportions of phenotype category per tissue and the percentage of each proportion was listed in the table in parallel. The MR results are from A combined analyses. B After splitting into cis-only and trans-additional findings by instrumental variables used. C Table summarizing the p-value of the proportion test (two-sided) for the overall phenotype-category proportions of MR analyses between each pair of three tissues. Phenotype categories were color coded as biological traits as red, blood traits as orange, cancers as purple, non-neurological diseases as green, neurological diseases as blue, personality traits as pink, and other risk factors as khaki

To determine whether the phenotype enrichment was affected by differences in statistical power across tissues, arising from differences in sample size and in the proteins measured, we first calculated the statistical power for pQTL identification in each tissue and then performed sensitivity analyses that kept only the same protein sets when estimating the disease enrichment. Our current study was well-powered (Additional file 1: Fig. S1) for cis-pQTL-based MR analyses given the same protein available across all tissues. However, for trans-pQTL-based MR analyses, the statistical power for CSF and brain was below 0.8 (Additional file 1: Fig. S1), indicating that we were underpowered to detect trans-pQTLs under the assumption that the protein should have at least one trans-pQTL in all tissues. Moreover, this assumption may be too stringent, as we cannot ensure that proteins from different tissues measured by the same platform share the same genetic structure. Thus, the phenotype enrichment analyses using trans-pQTL-based MR results may be underpowered to provide robust estimates.

Although we used the same panel (SOMAscan 1305 panel) in all three tissues, different subsets of proteins passed QC in each tissue, which could bias the downstream disease enrichment analysis. To correct for this bias, we performed the enrichment analysis using only the 411 proteins that passed QC in all three tissues (Additional file 1: Fig. S5). Although the proportion of neurological diseases was higher in the brain than in the plasma (brain: 10%; plasma: 3.3%) and the proportion of blood traits was higher in plasma than in the other two tissues (plasma: 23%; CSF: 7.8%; brain: 10%), these differences were not statistically significant (CSF vs plasma p-value = 0.2; plasma vs brain p-value = 0.952; CSF vs brain p-value = 0.780). This is consistent with the MR results that included all proteins passing QC in any tissue, indicating that the phenotype enrichments are not biased by the different protein sets.
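A two-sided test for a difference between two proportions, of the kind referred to above, can be sketched as a pooled z-test. The source does not specify the exact test variant (for example, whether a continuity correction was applied), so this form is an assumption, and the function name and inputs are illustrative.

```python
from math import erfc, sqrt


def two_proportion_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for a difference between two proportions.

    x1, x2: counts in the category of interest; n1, n2: total findings per tissue.
    Returns the z-statistic and the two-sided p-value. The pooled proportion
    must lie strictly between 0 and 1.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    return z, erfc(abs(z) / sqrt(2.0))
```

With small numbers of findings per tissue, as for the brain here, the normal approximation is rough and an exact test may be preferable.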

Overlap proteins with druggable genome

Moreover, we overlapped the proteins that had strong MR and colocalization evidence with the druggable genome reported by Finan and colleagues [31]. To assess the overlap between the proteins identified in our MR analyses and the druggable genome tiers, we performed an enrichment analysis as described before [7] (Additional file 2: Table S15). Of the proteins associated with the studied phenotypes, 86.3% (69/80), 82.7% (43/52), and 66.7% (6/9) of proteins in CSF (Additional file 1: Fig. S6A), plasma (Additional file 1: Fig. S6B), and brain (Additional file 1: Fig. S6C), respectively, intersected with the first three druggable genome tiers. These overlapping proteins were associated with seven, six, and four unique phenotypic categories, respectively (see the “Methods” section).
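The overlap percentages quoted above follow directly from the reported counts. The short check below recomputes them; half-up rounding to one decimal place is assumed, since 69/80 is exactly 86.25% and only rounds to 86.3% under that convention.

```python
from decimal import ROUND_HALF_UP, Decimal

# Counts as reported in the text: (proteins in tiers 1-3, proteins with MR evidence).
OVERLAP = {"CSF": (69, 80), "plasma": (43, 52), "brain": (6, 9)}


def percent(k, n):
    """Percentage k/n rounded half-up to one decimal place."""
    value = Decimal(100 * k) / Decimal(n)
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


percentages = {tissue: str(percent(k, n)) for tissue, (k, n) in OVERLAP.items()}
```

The resulting values match the 86.3%, 82.7%, and 66.7% given for CSF, plasma, and brain.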

Drug repurposing

Finally, to identify known drug compounds that could be repurposed for the phenotypes, we linked the inference results obtained using meta-analyzed genome-wide significant cis and trans pQTLs with two drug databases. We first used the DrugBank database [33] to assign compounds to protein targets (as curated by UniProt [34]) and then used the ChEMBL database [32] to retain only compounds with a maximum clinical trial phase of “4” in the indication information and no side effects. We identified two, three, and one proteins in CSF, plasma, and brain, respectively, each connected with at least one compound for one disease-related phenotype (Fig. 8, Additional file 1: Fig. S7, S8, Additional file 2: Table S16). For CSF proteins as targets (Fig. 8A), two drugs were predicted as inhibitors given the positive MR estimates; for proteins from the plasma (Fig. 8B) and brain (Fig. 8C), two and two drugs, respectively, were predicted as activators, whereas two and one, respectively, were inferred as inhibitors. For example, plasma N-terminal pro-BNP can be targeted by carvedilol to lower the SBP. CSF TSG-6 can be targeted by acetylsalicylic acid in treating retinal detachment. Brain CPNE1, which potentially regulates hippocampus volume and age at menopause, was found to be a target of the small-molecule drug theophylline.
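The linkage between MR results and drug-target records can be sketched as a simple join. The text states that a positive MR estimate implies an inhibitor; treating a negative estimate as implying an activator is inferred from that rule. All names, the tuple layout, and the placeholder records below are hypothetical and do not reflect the DrugBank or ChEMBL schemas.

```python
def predicted_action(beta_mr):
    """Map the sign of an MR estimate to a predicted drug action."""
    if beta_mr > 0:
        return "inhibitor"   # higher protein level raises the phenotype
    if beta_mr < 0:
        return "activator"   # higher protein level lowers the phenotype
    return "undetermined"


def repurposing_candidates(mr_hits, drug_targets, required_phase=4):
    """Join MR hits with drug-target records on the protein name.

    mr_hits: list of (protein, phenotype, beta_mr)
    drug_targets: list of (protein, drug, max_phase)
    Returns (drug, phenotype, predicted action) for drugs at `required_phase`.
    """
    candidates = []
    for protein, phenotype, beta in mr_hits:
        for target, drug, max_phase in drug_targets:
            if target == protein and max_phase == required_phase:
                candidates.append((drug, phenotype, predicted_action(beta)))
    return candidates


# Placeholder records for illustration only.
hits = [("PROT_A", "phenotype_1", 0.8), ("PROT_B", "phenotype_2", -0.3)]
targets = [("PROT_A", "drug_x", 4), ("PROT_B", "drug_y", 4), ("PROT_B", "drug_z", 2)]
pairs = repurposing_candidates(hits, targets)
```

This sketch does not check whether a drug's known mechanism agrees with the predicted action, nor does it apply the side-effect filter mentioned above.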

Fig. 8

Phenotype-drug pairs after integration of protein-phenotype associations from MR and drug-protein interactions from DrugBank & ChEMBL databases. Heatmaps were used to visualize drug-name against phenotype for the drug target repurposing purpose. The drug-predicted effects were listed as follows: A in CSF, two drugs can be used as an inhibitor given a positive estimate from MR analyses; B two activators and two inhibitors in plasma; and C two activators and one inhibitor in brain. Colors were coded: activator as magenta and inhibitor as black

Discussion

Here, our study revealed that 80 CSF, 52 plasma, and nine brain proteins were associated with 64, 49, and 15 human disease-related phenotypes, respectively. Of these, we identified 45.8%, 30.2%, and 12.5% novel protein-phenotype pairs in CSF, plasma, and brain, respectively. After integrating the published druggable genome results, we found that 66.7 to 86.3% of proteins, depending on tissues, could be potential therapeutic targets for a complex trait/phenotype. These results systematically tested the potential effects of proteins, as potential drug targets, on human diseases or risk factors by both MR and colocalization in a tissue-specific manner.

Our study is the first analysis to systematically evaluate cross-tissue protein effects on over 200 phenotypes using pQTLs from three tissues. Our results can be used as a complementary resource to the plasma proteome-by-phenome-wide MR studies [7, 14]. Our current study generated a multi-tissue MR atlas, and thus, we did not pre-select “tissue-specific” phenotypes a priori. This strategy is helpful for downstream comparisons of cross-tissue MR effects given the same protein-phenotype associations.

From the MR results using genome-wide significant pQTLs, we found 48, 12, and two novel protein-phenotype pairs in CSF, plasma, and brain, respectively (Table 2), which were absent from previous studies [6, 7]. The largest set of novel protein-phenotype associations came from CSF proteins, which may be explained by this study using the largest CSF pQTL dataset available at the time of the analyses. Our MR results revealed that plasma proteins, as well as CSF and brain proteins, can be prioritized in disease pathogenesis and further used as druggable targets. Our study expanded the inference of CSF and brain protein effects on diseases to the phenome-wide scale compared to prior protein-disease MR studies in CSF [11] and brain [6]. Our results highlight proteins with potential opportunities for developing treatments through clinical trials; however, further functional experiments, in vitro and in vivo, would be essential to validate these findings. We think additional preclinical and clinical studies are relevant as 67% of the F
