To determine a maximally discriminating concentration for survival of DGRP flies, we first measured the dose-response relationship for 4-methylimidazole-induced mortality in the standard laboratory CSB strain, separately for males and females. The LD50 concentrations were 15 mM for males and 30 mM for females, respectively, indicating sexual dimorphism in sensitivity to 4-methylimidazole exposure (Figure S1, Additional File 2 A, 2B). To assess whether the same concentrations would be optimally discriminating for an extensive toxicity screen of DGRP3 lines, we next measured 24-hour survival of 14 maximally divergent DGRP2 lines at 15 mM and 30 mM 4-methylimidazole for both sexes. We confirmed that 15 mM and 30 mM were indeed optimal concentrations for males and females, respectively, to reveal genetic variation for toxicant susceptibility across DGRP2 lines (Fig. 1, Additional File 3 A-C).
Fig. 1Optimally discriminating concentrations of 4-methylimidazole in the DGRP2. The proportion of flies surviving following 24-hour exposure to 15 mM and 30 mM 4-methylimidazole was quantified for 14 maximally diverse DGRP2 lines. Exposure to 30 mM for 24 h showed the greatest variation in female survival proportion (A, B) while exposure to 15 mM for 24 h showed the greatest variation for male survival proportion (C, D). Error bars represent standard errors of the mean
Based on these data, we proceeded to screen 204 DGRP3 lines for 24-hour survival following exposure to 15 mM (males) and 30 mM (females) 4-methylimidazole, with up to 80 individuals per line per sex. We found extensive genetic variation in survival in both females and males, ranging from 0 to 100% in each sex (Fig. 2, Additional File 4 A-C). However, the rank order of DGRP3 lines for mean survival proportion was different between the sexes, indicating variation in sexual dimorphism for survival (Figure S2; Additional File 4).
Fig. 2Variation in survival of females (A) and males (B) of 204 DGRP3 lines upon 24-hour exposure to 30 mM and 15 mM 4-methylimidazole, respectively. Datapoints are means of up to 80 individuals and error bars represent standard errors of the mean
We partitioned variation in survival following 24-hour exposure to 4-methylimidazole into the contribution of sex, DGRP3 line and the sex by DGRP3 line interaction, and DGRP3 line separately for males and females (Additional file 4 C). Broad sense heritabilities (\(\:^\)) were \(\:^\) = 0.80 for females and \(\:^\) = 0.83 for males. The sex by line interaction variance component was significant and ~ 70% that of the among line variance component from the pooled analysis across
sexes (Additional File 4 C), indicating substantial genotype by sex interaction. The cross-sex genetic correlation (\(\:_\)) was \(\:_\) = 0.588 ± 0.057 (SE). Since the cross-sex genetic correlation is significantly different from both 0 and 1, the genetic basis of variation in sensitivity to 4-methylimidazole is partially overlapping and partially distinct between males and females. The reaction norm plot of survival proportions of males and females of the same DGRP3 lines (Fig. 3) shows that for many lines, the responses to treatment with 4-methylimidazole are opposite in females and males.
Fig. 3Reaction norms of female and male survival proportions for each DGRP3 line. The x-axis indicates females and males, and the y-axis indicates survival proportion
Genome-wide association analyses for variation in susceptibility to 4-methylimidazoleGiven the significant genetic variation in susceptibility to 4-methylimidazole, we assessed associations of Wolbachia pipientis infection status, the karyotype of 31 polymorphic inversions and 2,514,186 segregating variants with allele frequencies greater than 0.015 with mean 24-hour survival following exposure to 4-methylimidazole in the 204 DGRP3 lines. We performed these analyses separately for females and males, for the average of the two sexes and for the difference (female – male) between the sexes.
Inversion 3L_P3173045_L13135796 (In(3 L)P) was significantly associated with 24-hour survival in females (P = 2.68 × 10− 7), males (P = 2.53 × 10− 3) and the average of females and males (P = 6.02 × 10− 6); and inversion 2R_P15391223_L4885174 (In(2R)NS) was associated with the difference between males and females in 24-hour survival (p = 7.45 × 10− 3) (Additional File 5). All covariates with P-values < 0.1 for association in each analysis were included in the GWA study for that respective analysis (Additional File 5).
We used a nominal P-value of P < 10− 5 to identify candidate variants and genes [13, 14] in the four GWA analyses. The female dataset had many observed P-values below this threshold (Fig. 4A and B), but the male dataset did not (Fig. 4C and D). The GWA analysis for the average of males and females also showed few observed P-values ≤ 10− 5 (Supplementary Figure S3); in contrast, the GWA analysis for the difference between males and females had substantially more associations at P-values ≤ 10− 5 (Fig. 4E and F). We identified 134 candidate molecular polymorphisms in or near (within 5 kb of the gene body) 89 unique genes for females, 10 molecular polymorphisms in or near 15 genes for males, 31 molecular polymorphisms in or near 32 genes for the average of females and males, and 85 molecular polymorphisms in or near 157 genes for the difference of females and males (Additional File 6 A). In total, we identified 241 molecular polymorphisms (211 single nucleotide polymorphisms (SNPs), and 30 insertion/deletion polymorphisms) in or near 273 genes (Additional File 6 A). Variants with lower minor allele frequencies tended to have larger effects than those with intermediate minor allele frequencies, as has been observed previously [13, 14].
Fig. 4Manhattan and QQ plots for genome-wide associations of molecular polymorphisms with variation in sensitivity to 4-methylimidazole for females (A, B), males (C, D), and the difference in the survival proportion of females and males (E, F). On the Manhattan plots (A, C, E), the x-axes show the Drosophila chromosome arms, and the horizontal dotted line designates the statistical threshold of P < (‒log10(P) = 5). The red diagonal line in the QQ plots (B, D, F) is the expected P-value under the null hypothesis of no significant associations; the black points are the observed P-values
One SNP (2R_21319198_T_C) was significant following a Bonferroni correction for multiple tests (P < 1.99 × 10− 8). This SNP could potentially affect three genes: it is located 842 bp downstream of CG30394; in the 3’ UTR of CG33655; and is a synonymous polymorphism in CG9485. Four other SNPs had relatively low P-values, although they did not reach Bonferroni level significance. 3L_12593961_A_G (P = 2.71 × 10− 8) is in an intron of ara; 2L_7230508_G_T (P = 4.92 × 10− 8) is a synonymous variant in Ndae; and 2L_4374432_G_A (P = 8.87 × 10− 8) is in an intron of Traf4. 2R_21329761_A_C (P = 4.10 × 10− 8) is localized to an intron of dom and an exon of snoRNA: Psi28S-3378, and is within 1 kb upstream of five other snoRNAs located in the same dom intron (Additional File 6). CG30394 is involved in transmembrane transport of amino acids; CG9485 is involved in the biosynthesis and catabolism of glycogen, and the function of CG33655 is not known. ara (araucan) is a transcriptional regulator involved in the morphogenesis, development and differentiation of neurons, the compound eye, muscles and wings. dom (domino) is also a transcriptional regulator that affects chromatin remodeling and alternative mRNA splicing as well as developmental processes. Ndae (Na+-driven anion exchanger 1) regulates cellular pH and the transmembrane transport of chloride, protons and sodium ions. Traf4 (TNF-receptor-associated factor 4) is a pleiotropic gene involved in multiple signaling pathways and developmental processes (Additional File 6B).
Most of the remaining candidate polymorphisms were also intronic, located up- or down-stream of candidate genes, or greater than 5 kb from the nearest gene (intergenic), and hence play putative regulatory roles associated with variation in viability following 24-hour exposure to 4-methylimidazole. These polymorphisms potentially affect multiple genes, highlighting the difficulty of determining causality in the densely packed Drosophila genome. The candidate genes included nine antisense RNAs, 27 long non-coding RNAs and 15 small nucleolar RNAs, which also regulate gene expression (Additional File 6 A). Indeed, 14 of the remaining 222 genes in or near significant molecular polymorphisms were annotated to affect transcriptional regulation [38] and 11 affect post-transcriptional processes (Additional File 6B). A total of 45 candidate genes affect nervous system development, morphogenesis and function, implicating natural genetic variation in susceptibility to neurotoxicity as a possible contributor to the variation in viability following exposure to 4-methylimidazole. Among these genes is bruchpilot (brp), which encodes a gene product that contributes to calcium channel clustering at presynaptic active zones [39, 40]. Perhaps surprisingly, 47 candidate genes have pleiotropic effects on development of many tissues and organ systems, implicating these genes in adult functions. In addition, two candidate genes affect DNA repair, consistent with 4-methylimidazole-induced chromosomal aberrations found in mice [8]; 14 affect reproduction; six affect immune/defense responses; 17 are involved in canonical signaling pathways; 13 affect metabolism and four are involved in detoxification (Additional File 6B). A total of 106 of the candidate genes had no previous Biological Process annotation. Of the 273 candidate genes, 129 had at least one human ortholog with a DIOPT [35, 36] score ≥ 3 (Additional File 6 C).
There was no overlap between candidate molecular polymorphisms detected in the female, male and sex difference GWA analyses (Additional File 6D). This is consistent with the ANOVA results that show a significant sex by line variance component and hence genotype by sex interaction for survival following exposure to 4-methylimidazole (Additional File 6D). The different significant molecular polymorphisms detected in separate GWA analyses of females and males have sex-specific effects. The significant molecular polymorphisms detected in the GWA analysis of the difference between females and males have sex-antagonistic effects. Although not significant in either the female or male GWA analyses, these molecular polymorphisms have opposite effects in the two sexes (Additional File 6D). There is some overlap between molecular polymorphisms detected in the single sex analyses and the average of the two sexes (Additional File 6D). These are technically sex-specific effects, since they are only significant in females or males, but the effect in the other sex was in the same direction as the significant sex, such that the average crossed the significance threshold (Additional File 6D).
Interaction networks associated with variation in susceptibility to 4-methylimidazoleTo explore functional contexts of the candidate genes from the GWA analyses, we constructed networks in which we recruited genes with known physical or genetic interactions for females (Fig. 5) and for the difference between females and males (Fig. 6). There were not sufficient candidate genes from the male and sex average GWA analyses to perform these analyses. The female network contains 88 genes, of which 32 are candidate genes from the GWA analysis (Fig. 5. Additional File 7 A). The network consists of two modules comprised of genes that distinguish the modules (Fig. 5A and B) and that overlap between them (Fig. 5C), and a set of unclustered genes. The sex-difference network consists of 96 genes, 31 of which were candidate genes from the GWA analysis for variation in survival (Fig. 6, Additional File 7 A). This network has a similar architecture to that derived from the female GWA, with two overlapping modules consisting of genes that are unique to each module (Fig. 6A and B), genes that are in common between the two modules (Fig. 6C), and unclustered genes (Fig. 6D). Although there is no overlap of molecular polymorphisms in the female and sex-average GWA analyses, a total of 17 genes overlap between the computationally recruited networks (Additional File 7 A).
Fig. 5An interaction network based on candidate genes from the female GWA analysis consisting of two modules (A, B), the overlapping genes between them (C), and unclustered genes (D). Candidate genes identified in the GWA analysis are indicated as purple boxes and computationally recruited genes are indicated as orange boxes. Bold font and black borders show genes with strong human orthologs (DIOPT score ≥ 7). Pink edges designate genetic interactions and blue edges indicate physical interactions
Fig. 6An interaction network based on candidate genes from the difference GWA analysis consisting of two modules (A, B), the overlapping genes between them (C), and several unclustered genes (D). Candidate genes identified through genome-wide association analysis are indicated as purple boxes; computationally recruited genes are indicated as orange boxes. Bold font and black borders show genes with human orthologs (DIOPT score ≥ 7). Pink edges designate genetic interactions; blue edges indicate physical interactions
We performed gene set enrichment analyses for the Biological Process Gene Ontology (GO) and PANTHER pathway categories [38] separately for the networks derived from the female (Additional File 7B) and sex difference (Additional File 7 C) GWA analyses. Although only 17 genes overlapped between these analyses, many GO and pathway terms were significantly over-represented in both analyses ((FDR < 0.05; Additional File 7D). These terms include the development, morphogenesis and differentiation of antennae, appendages (halteres, legs, wings), the digestive system, eyes, gametes, gonads, growth in general, heart, imaginal discs, muscles, Malpighian tubules, the nervous system and the open tracheal system. Over-represented terms common to both networks impinge on multiple cellular and organismal functions: apoptosis, asymmetric cell division, behaviors, biosynthesis, cell cycle, cell division, cellular responses to stimuli and stresses, circadian rhythm, cognition, DNA damage response, gene expression, hemopoiesis, immunity, learning, lifespan, locomotion, metabolism, memory, reproduction and sex differentiation (Additional File 7D). The genes encoding cadherin, Epidermal Growth Factor, Notch, p53 and Wingless were also over-represented in both networks. The network genes had a combined total of 808 human orthologs (DIOPT score ≥ 3) and 323 strong human orthologs (DIOPT score ≥ 7) (Additional File 7E).
We used a Monte Carlo permutation procedure to determine the statistical significance of the observed connectivity in the interaction networks constructed from our candidate genes. We compared the observed degree of connectivity (female survival proportion = 6.43; differences in survival proportions = 8.21) to a null distribution generated by repeated random sampling of subnetworks (10,000 permutations) from the FlyBase database network of known genetic and physical associations, with the number of genes in the subnetworks constrained to the number of genes in observed interaction networks. This procedure was performed separately for each interaction network. The observed connectivity indices for the interaction networks constructed from the candidate genes associated with female survival proportions (Supplementary Fig. 4A) and differences in survival proportions (Supplementary Fig. 4B) were found to be strongly statistically significant at P < 0.0001. This indicates that both networks are densely interconnected and unlikely to be random.
Comparison of protein-protein interactions between Drosophila and humansTo identify shared protein-protein interactions (PPI) between Drosophila and human data, we isolated the subnetwork of genes with strong human orthologs from Drosophila FlyBase interactions and used this gene set to filter the human PPI network from STRING database [41] using Cytoscape. To identify and organize the common edges between these two networks, we used the DyNet plugin [42] for Cytoscape. We identified 26 interactions shared with the human PPI network from STRING database, out of a total of 107 edges in the PPI FlyBase network constructed from the GWA analysis of differences in survival proportions (Supplementary Fig. 5). Similarly, we identified 9 interactions shared with the human PPI network, out of a total of 56 edges in the PPI FlyBase network constructed from the GWA analysis of female survival proportions (Supplementary Fig. 6). This results in an edge correctness (percentage of edges shared) of 24.3% and 16.1% for the two network comparisons.
留言 (0)