An integrative analysis of functional consequences of PKD2 missense variants on RNA and protein structures: a computational approach

PKD2 (polycystic kidney disease 2) is a gene associated with autosomal dominant polycystic kidney disease (ADPKD), a genetic disorder characterized by numerous cysts in the kidneys; PKD2 variants account for ~10–15% of ADPKD cases [26, 27]. PKD2-related ADPKD is a milder form of the disease than PKD1-related disease. Given the complexity of this disease, an in-depth understanding of ADPKD pathogenesis at the cellular and molecular level is crucial for determining the precise mechanisms behind its heterogeneous nature, elucidating genotype–phenotype correlations, predicting disease severity in ADPKD patients, and thereby developing patient-centered treatment and management [1, 28]. Studying the genetic variants that cause ADPKD and understanding their impact on RNA structure and protein structure dynamics is crucial in this direction. Analysis of the RNA structural changes and protein structure dynamics altered by these variants can provide valuable insights into potential disruptions in RNA–protein interactions, RNA processing and expression level, and overall RNA function, as well as in protein folding, stability, and interactions, all of which ultimately affect the function of the encoded protein [8, 9].

Considering the genetic and phenotypic diversity within ADPKD, we analyzed eight missense variants of PKD2 identified in our previous studies, evaluating their impact on RNA structure with computational tools and using molecular dynamics (MD) simulation to explore the associated protein structure dynamics. This analysis sheds light on the structural alterations occurring at both the RNA and protein levels. First, assessment of RNA secondary structures using the RNAStructure Web Server allowed visualization of the secondary structural changes induced by the variants (Fig. 1). This analysis revealed notable deviations from the wild-type RNA structure across various variants, suggesting disruptions in RNA folding. Further quantitative assessment with remuRNA demonstrated distinct patterns among the variants, with some exhibiting high relative entropy values (H(wt:mu)) while others showed minimal deviation from the wild-type RNA structure. Variants such as c.1109G > A (p.S370N) and c.1849C > A (p.L617I) had higher relative entropy values, indicative of considerable structural alterations affecting RNA stability, followed by c.568G > A (p.A190T) and c.646T > C (p.Y216H) with moderate relative entropy values. In contrast, variants with lower entropy values, including c.1789C > A (p.L597M), c.2494A > G (p.S832G), c.1354A > G (p.I452V), and c.915C > A (p.N305K), suggest relatively minor deviations from the wild-type RNA structure. The structural complexity of RNA varies across different regions within a single molecule, so the impact of a single-nucleotide polymorphism (SNP) depends on its location. SNPs located within structured elements, such as stems, are likely to have a stronger effect on RNA structure than those within loop regions.
Therefore, structure-altering disease-associated mutations can serve as indicators of critical structural elements whose perturbation contributes to disease pathogenesis [10]. In addition, the Circos plots and dot matrices provided a visual representation of base pair probabilities for both wild-type and mutant PKD2 RNA snippets (Fig. 2; Fig. S1: differential Circos plots). The differences in base-pairing probabilities are visualized as differential dot plots in Fig. 4. Variants such as c.1109G > A (p.S370N), c.1789C > A (p.L597M), c.1849C > A (p.L617I), c.646T > C (p.Y216H), and c.2494A > G (p.S832G) displayed prominent alterations in base pair probabilities, suggesting modifications in RNA structure. On the other hand, variants such as c.915C > A (p.N305K), c.1354A > G (p.I452V), and c.568G > A (p.A190T) exhibited relatively minor changes in base pair probabilities, indicating more subtle alterations in RNA structure. These alterations changed the accessibility profile of the RNA structures at various nucleotide positions (Fig. 3). Variants such as c.1789C > A (p.L597M), c.1109G > A (p.S370N), c.1849C > A (p.L617I), and c.646T > C (p.Y216H) caused major alterations of RNA accessibility at specific positions within the structure. These alterations in RNA accessibility could affect the binding of regulatory proteins or RNA-modifying enzymes, thereby influencing expression and downstream cellular processes. In contrast, variants such as c.915C > A (p.N305K), c.1354A > G (p.I452V), and c.568G > A (p.A190T) resulted in minor changes in the RNA accessibility profile. However, even subtle changes in RNA structure caused by a single-nucleotide change, including a synonymous change, can have functional implications because of the complex structural dynamics involved in RNA folding, processing, and expression, and may contribute to disease progression [9, 29, 30].
The variable impact of these single-nucleotide variants on RNA stability could affect the expression of the gene and downstream processes to varying degrees.
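To make the relative entropy measure H(wt:mu) discussed above more concrete, the following is a minimal sketch of a relative entropy (Kullback-Leibler divergence) between two Boltzmann-weighted structure ensembles. It is a toy illustration only and not the remuRNA implementation, which evaluates the full structural ensemble rather than a short enumerated list. The free energies, the `RT` constant, and the function names are assumptions introduced for the example.

```python
import math

RT = 0.616  # kcal/mol at roughly 37 °C (assumed constant for this toy example)

def boltzmann_probs(energies, rt=RT):
    """Convert free energies (kcal/mol) of candidate structures into probabilities."""
    weights = [math.exp(-e / rt) for e in energies]
    z = sum(weights)  # partition function over the enumerated structures only
    return [w / z for w in weights]

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in nats.

    Assumes q[i] > 0 wherever p[i] > 0, which holds for Boltzmann weights.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical free energies of the same three structures, evaluated on the
# wild-type and on the mutant sequence. These are not values from the study.
wt_energies = [-5.0, -4.0, -2.0]
mu_energies = [-3.0, -4.5, -2.0]

p_wt = boltzmann_probs(wt_energies)
p_mu = boltzmann_probs(mu_energies)
h_wt_mu = relative_entropy(p_wt, p_mu)
print(f"H(wt:mu) = {h_wt_mu:.3f} nats")
```

In this picture, a variant that leaves the ensemble unchanged gives a value of zero, and the value grows as the mutation shifts probability mass toward different structures.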

Turning to the impact of these variants on protein dynamics, a spectrum of effects was observed in the protein structures as well. The variant c.1789C > A (p.L597M) resulted in a slight decrease in root-mean-square deviation (RMSD), indicating a small structural deviation from the wild type (Table 2 and Fig. 4). This variant also showed increased flexibility (root-mean-square fluctuation, RMSF), a larger solvent-accessible surface area (SASA), and a greater number of hydrogen bonds. Conversely, the variant c.915C > A (p.N305K) caused an increase in RMSD, suggesting a considerable structural deviation from the wild type. This variant also led to increased flexibility (RMSF), a more extended shape (increased radius of gyration, Rg), a larger SASA, and more hydrogen bonds. The variant c.1354A > G (p.I452V) resulted in slight decreases in RMSD, RMSF, SASA, and hydrogen bonds, indicating a minor structural deviation from the wild type. Similarly, the variant c.1849C > A (p.L617I) led to a decrease in RMSD, suggesting a structural deviation from the wild type; it showed a larger SASA and more hydrogen bonds, with negligible change in flexibility (RMSF) and Rg. Lastly, the variant c.1109G > A (p.S370N) caused a large increase in RMSD, indicating a considerable structural deviation from the wild type. This variant also led to an increase in SASA and a reduction in the number of hydrogen bonds. Comparing the wild-type PKD2 protein with the mutant carrying the c.2494A > G (p.S832G) variant also revealed differences in structural properties. The mutant displayed a slightly higher RMSD (1.97 nm) than the wild type (1.94 nm), indicating a marginal increase in structural deviation. Similarly, RMSF was slightly elevated in the mutant (0.72 nm) compared to the wild type (0.69 nm), suggesting a minor increase in flexibility.
Among the other structural parameters, Rg was unchanged (2.44 nm for both wild type and mutant), while SASA decreased slightly (228.87 nm² for the mutant compared to 231.92 nm² for the wild type). The number of hydrogen bonds showed minimal alteration. Comparing the wild-type PKD2 protein with each of the c.646T > C (p.Y216H) and c.568G > A (p.A190T) variants revealed distinct differences in their structural properties. The mutant carrying the c.646T > C (p.Y216H) variant exhibited a decrease in RMSD (0.60 nm) compared to the wild type (0.72 nm), indicating a structural deviation. The RMSF increased slightly in the mutant (0.27 nm) compared to the wild type (0.23 nm), suggesting increased flexibility. The Rg also increased to 2.24 nm in the mutant, indicating a more extended shape than the wild type (2.11 nm). The SASA also increased in the mutant (209.22 nm²) compared to the wild type (198.60 nm²). The mutant carrying the c.568G > A (p.A190T) variant showed no change in RMSD (0.72 nm) compared to the wild type. RMSF increased slightly in the mutant (0.27 nm) compared to the wild type (0.23 nm). The Rg (2.17 nm), SASA (207.74 nm²), and hydrogen bond (324.08) values also differed from those of the wild type.
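For readers less familiar with the MD parameters compared above, the following is a minimal sketch of how RMSD, RMSF, and Rg are defined on a trajectory. It is not the tooling used in the study: it assumes frames are already superimposed on the reference, treats all atoms as having equal mass, and uses a made-up two-atom trajectory. All function names are illustrative.

```python
import math

def rmsd(frame, reference):
    """Root-mean-square deviation between two pre-aligned frames.

    Each frame is a list of (x, y, z) tuples; no superposition is performed here.
    """
    sq = [math.dist(a, b) ** 2 for a, b in zip(frame, reference)]
    return math.sqrt(sum(sq) / len(sq))

def mean_position(trajectory, atom):
    """Time-averaged position of one atom across all frames."""
    n = len(trajectory)
    return tuple(sum(frame[atom][k] for frame in trajectory) / n for k in range(3))

def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation around each atom's mean position."""
    values = []
    for i in range(len(trajectory[0])):
        avg = mean_position(trajectory, i)
        sq = [math.dist(frame[i], avg) ** 2 for frame in trajectory]
        values.append(math.sqrt(sum(sq) / len(sq)))
    return values

def radius_of_gyration(frame):
    """Radius of gyration, assuming equal atomic masses (a simplification)."""
    n = len(frame)
    center = tuple(sum(atom[k] for atom in frame) / n for k in range(3))
    return math.sqrt(sum(math.dist(atom, center) ** 2 for atom in frame) / n)

# Hypothetical two-atom, two-frame trajectory in nm, for illustration only.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
]
reference = trajectory[0]

print([round(rmsd(frame, reference), 3) for frame in trajectory])
print([round(value, 3) for value in rmsf(trajectory)])
print(round(radius_of_gyration(reference), 3))
```

RMSD summarizes how far a whole frame has drifted from the reference, RMSF reports the mobility of each atom over time, and Rg reflects how compact or extended the structure is.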

Table 2 Average values of structural parameters from MD simulation of PKD2 protein

Variants such as c.1789C > A (p.L597M), c.1109G > A (p.S370N), c.1849C > A (p.L617I), and c.646T > C (p.Y216H) were found to induce major alterations not only in RNA structure and accessibility profile but also in protein structure dynamics. In contrast, variants such as c.915C > A (p.N305K), c.1354A > G (p.I452V), and c.568G > A (p.A190T) resulted in minor changes in RNA structure and accessibility profile but had noticeable effects on certain parameters of protein structure dynamics. This emphasizes the need for a broader understanding of the molecular mechanisms underlying disease progression.
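One simple way to put the mutant and wild-type MD averages on a common scale is to express each as a percent change. The sketch below does this using only the values quoted in the text for p.S832G, p.Y216H, and p.A190T. It assumes that p.A190T shares the wild-type Rg and SASA reference quoted for p.Y216H, since both are compared against the same wild-type RMSD and RMSF. The dictionary layout and function names are illustrative.

```python
# Averages quoted in the text (RMSD, RMSF, Rg in nm; SASA in nm^2).
# Each variant is paired with the wild-type reference reported alongside it.
# Assumption: p.A190T uses the same wild-type Rg and SASA as p.Y216H.
reported = {
    "p.S832G": {
        "wild_type": {"RMSD": 1.94, "RMSF": 0.69, "Rg": 2.44, "SASA": 231.92},
        "mutant": {"RMSD": 1.97, "RMSF": 0.72, "Rg": 2.44, "SASA": 228.87},
    },
    "p.Y216H": {
        "wild_type": {"RMSD": 0.72, "RMSF": 0.23, "Rg": 2.11, "SASA": 198.60},
        "mutant": {"RMSD": 0.60, "RMSF": 0.27, "Rg": 2.24, "SASA": 209.22},
    },
    "p.A190T": {
        "wild_type": {"RMSD": 0.72, "RMSF": 0.23, "Rg": 2.11, "SASA": 198.60},
        "mutant": {"RMSD": 0.72, "RMSF": 0.27, "Rg": 2.17, "SASA": 207.74},
    },
}

def percent_change(mutant, wild_type):
    """Signed percent change of the mutant value relative to the wild type."""
    return 100.0 * (mutant - wild_type) / wild_type

def summarize(data):
    """Return {variant: {metric: percent change rounded to one decimal place}}."""
    return {
        variant: {
            metric: round(percent_change(values["mutant"][metric], wt_value), 1)
            for metric, wt_value in values["wild_type"].items()
        }
        for variant, values in data.items()
    }

for variant, changes in summarize(reported).items():
    print(variant, changes)
```

Percent changes of this kind are only descriptive; they do not account for the fluctuation of each parameter over the trajectory, so they should not be read as a test of significance.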

Overall, considering the heterogeneity of PKD1 and PKD2 genetic variants and their variable impact on RNA and protein structure dynamics as well as on functional characteristics [31], these findings serve as essential clues toward understanding the mechanisms driving ADPKD pathology and its variability. They also highlight the importance of considering not only the direct effects of variants on protein structure but also their impact at the RNA level. By constructing a stability profile of the RNA using relative entropy, base-pairing probability, and accessibility profiles, together with the protein structure dynamics, we can discern the structural impact of mutations at each nucleotide position or protein motif. Such integrated analyses may aid in prioritizing variants for further functional investigation and in pinpointing sensitive regions within RNA or protein structures, providing valuable insight into the molecular basis of the disease and its heterogeneous nature, and facilitating the development of targeted therapeutic strategies aimed at restoring normal structure and function.

This study is limited by its confinement to a subset of variants and by its reliance on computational tools for structural predictions and simulations, which may not fully reflect the real-world complexity of RNA and protein dynamics. Nevertheless, it emphasizes the utility of these tools in integrated analyses, as they can help prioritize variants for further investigation.

Translational significance of the study

This computational study helps us understand how changes in genes can affect the disease. By looking closely at how these changes affect RNA and protein structures, we learn more about why people with different variants in the same gene may have different presentations of the disease. This knowledge can lead to new treatments tailored to each person’s specific genetic changes, helping doctors better manage kidney disease and improve the health of patients. The study can also aid clinicians and researchers in identifying high-impact variants in terms of pathogenicity, in prioritizing variants for further functional investigation, and in evaluating them as potential drug targets. Ultimately, this research can guide the development of personalized medicine targeting the RNA or protein level, where treatments are customized for each individual based on their unique genetic makeup, supporting precise therapeutic approaches that seek to slow or halt the progression of ADPKD and possibly other genetic disorders.
