To evaluate whether SBNA could be meaningfully applied to human proteins, we first applied the approach to four well-studied human proteins—BRCA1, HRAS, PTEN, and ERK2—which were selected for analysis due to the availability of high-quality structural data. In addition, these proteins had published in vitro saturation mutagenesis experiments, which allowed us to extract the functional consequence of all missense variants and quantify mutational tolerance36,37,38,39. We next generated network scores for all amino acid residues in the available protein databank (PDB) files (Fig. 1a) and evaluated the correlation with mutational tolerance (Fig. 1b). All four proteins showed a strong inverse correlation between mutational tolerance and network score, which was consistent with previous findings for viral proteins and other model proteins29. Of note, only 17% of BRCA1 has been crystallized (the N- and C-terminal ends), but SBNA scores still performed reasonably well (Spearman correlation coefficient −0.514, p = 3.45e-22) despite limited structural data.
Fig. 1: Residue network score correlates with mutation intolerance and distinguishes pathogenic variants from benign variants in well-studied human proteins.a Structural representations show network scores at each residue. Sphere radius and color corresponds to network score magnitude at a particular position. b Comparison between functional data from saturation mutagenesis experiments and network scores, with Spearman correlation coefficients and p-values displayed for each plot. Points are colored based on available clinical phenotype data. c Pooled comparison between network scores for variants with available clinical phenotype data for all four well-studied human proteins. d Individual comparisons between network scores for variants with available clinical phenotype data for all four well-studied human proteins.
We next compared network scores to pathogenicity categorizations derived from human data using the ClinVar and gnomAD genetic databases for these same four proteins (Fig. 1c). Missense variants in all subsequent analyses were categorized with respect to human clinical data in line with the American College of Medical Genetics and Genomics/Association of Molecular Pathology (ACMG/AMP)24 as benign (encompassing “benign” or “likely benign” within ClinVar), VUS or pathogenic (encompassing “pathogenic” or “likely pathogenic” within ClinVar). We restricted our analysis to ClinVar missense variants with at least two-star level evidence, and gnomAD was used to identify additional relatively benign missense variants (variants with at least 250 alternative allele counts across 100,000 individuals). Across all four proteins, network scores assigned to pathogenic variants were significantly greater than those assigned to benign variants (median network scores for benign missense variants −0.936 and pathogenic variants 0.866, p = 1.836e-5, Fig. 1c). Scores assigned to VUSs fell in between those assigned to benign and pathogenic variants. These trends were observed consistently within individual proteins, though smaller sample sizes limit statistical power (Fig. 1d). Overall, network scores correlate with available clinical phenotype data for the four well-studied human proteins (Spearman correlation coefficient 0.228, p = 2.116e-23), suggesting that SBNA can be meaningfully applied to human proteins. Of note, we found that SBNA seems to capture structural properties beyond simple solvent accessibility (i.e., proximity to the core) because relative solvent accessibility (RSA) scores show less correlation with mutational tolerance scores than did SBNA (Supplementary Fig. 1).
SBNA predicts variants in IRD genes associated with pathogenic clinical phenotypesHaving validated SBNA on four canonical, well-studied human proteins, we then applied SBNA to additional human proteins. We analyzed the relationship between network scores and pathogenicity designations from high-quality ClinVar variants and benign gnomAD variants for the 47 human genes associated with IRDs, for which high-quality structural data is available for the encoded protein. This dataset includes both membrane proteins (e.g., ABCA4 and RHO) as well as cytoplasmic proteins (e.g., RPE65 and RPGR) (Supplementary Table 1). We found that pathogenic variants were assigned significantly greater network scores compared to both benign variants and VUSs (median benign −0.841, median VUS −0.188, median pathogenic 0.947 p = 3.140e-29 for benign vs. pathogenic, p = 1.753e-60 for VUS vs. pathogenic; Fig. 2a and Supplementary Fig. 2), which is similar to the pattern observed for the canonical human proteins. Because the number of high-quality ClinVar entries for missense variants in each of the 47 IRD-associated proteins varies considerably (Fig. 2b), we wanted to evaluate whether the difference between median benign and pathogenic network scores remained statistically significant even for proteins with limited variant data available in ClinVar. We grouped proteins by the amount of available high-quality entries in ClinVar and calculated the difference between the median benign and pathogenic network scores across each group of proteins. The magnitude of these differences was robust in the setting of differing levels of available clinical data across genes and was detectable down to 40 high-quality entries per gene (Fig. 2c).
Fig. 2: Structure-based network analysis identifies pathogenic variants in inherited retinal disease proteins.a Pooled comparison between network scores for variants with available clinical phenotype data for all 47 inherited retinal disease proteins. b Distribution of available high-quality evidence in ClinVar across all 47 inherited retinal disease proteins. The bins reflect increasing numbers of high-quality entries in ClinVar, and the height of each bar reflects the number of proteins in each category. c Comparison between median benign and pathogenic network scores assigned to variants with available high-quality evidence in ClinVar, grouped by level of available high-quality evidence for each inherited retinal disease protein. Stars above columns indicate statistical significance (*p < 0.05, **p < 0.01, ***p < 0.001, ***p < 0.0001). The statistical significance of the difference in network scores between benign and pathogenic variants is lost between 20 and 40 high-quality ClinVar evidence entries.
Incorporating network scores and BLOSUM62 scores successfully predicts variant pathogenicityTo move from aggregate statistics to prediction of pathogenicity using network scores, we constructed a modified score that incorporates not only the SBNA network score but also the degree of chemical and side chain dissimilarity between the reference and mutant amino acid at that position (since missense variants vary in this regard). To capture the latter effect, we subtracted the BLOSUM62 matrix score from the SBNA score (which we will now refer to as the modified SBNA score) to allow for a distinction between non-conservative substitutions (e.g., ILE → TRP) and conservative ones (e.g., ILE → LEU)40. BLOSUM62 scores for non-synonymous substitutions range from −4 to 3, while 95% of the raw SBNA scores range from −2.88 to 4.97; thus, the two metrics are on similar scales and can be combined with simple subtraction. We then calculated receiver operating characteristic (ROC) curve statistics for high-quality ClinVar variants and benign gnomAD variants based on the modified SBNA score. Modified SBNA scores predicted variant pathogenicity (AUC 0.851) and outperformed network scores alone, BLOSUM62 scores alone, and RSA scores alone (Fig. 3a, b and Supplementary Fig. 3). We tested multivariable logistic regression modeling with 500 iterations of a 70/30% train-test split as well as a leave-one-out approach using the labels derived from high-quality ClinVar and gnomAD variants and found similar performance to the raw scores (AUC 0.835, Supplementary Figs. 3, 4). Given the advantage that using the raw scores has over a trained approach (which can be subject to poor phenotypic labeling and data circularity20,23), all downstream clinical applications described here use the raw modified SBNA score.
Fig. 3: Modeling using SBNA and BLOSUM62 is superior to SBNA network scores and BLOSUM62 scores alone.a ROC curves for network scores alone (red), BLOSUM62 scores alone (blue), and the difference between network scores and BLOSUM62 scores (purple). ROC curves were determined using all variants with available clinical phenotype data for all 47 inherited retinal disease proteins. AUC values are shown for each curve. b Pooled comparison between the difference between network scores and BLOSUM62 scores for variants with available clinical phenotype data for all 47 inherited retinal disease proteins. c ROC as shown in (a) with added comparison to EVE scores (orange) and the sum of EVE scores and the scaled difference between network scores and BLOSUM62 scores (light purple). ROC curves were determined using all variants with available clinical phenotype data for all 47 inherited retinal disease proteins. AUC values are shown for each curve. d Pooled comparison between the sum of EVE scores and the scaled difference between network scores and BLOSUM62 scores (“combined score”) for variants with available clinical phenotype data for all 47 inherited retinal disease proteins.
Comparison of SBNA to existing methods reveals similar performance without dependence on phenotype labels or evolutionary sequence dataWe set out to compare SBNA to three tools that are built from different underlying data: Polyphen2, AlphaMissense, and EVE. Polyphen2 is a widely used computational prediction tool for variant pathogenicity41. It is important to note however that PolyPhen2 is used by the ACMG/AMP to assess pathogenicity designations in databases such as ClinVar and HumDiv, so using ClinVar as ground truth might overestimate the performance of PolyPhen2. AlphaMissense32 incorporates structural data from AlphaFold242, protein language modeling, and evolutionary multiple sequence alignments into a machine-learning model using ClinVar labels to train. Similar to PolyPhen2, there is a risk of over-fitting and circularity with AlphaMissense. EVE is a variant pathogenicity approach which relies only on evolutionary sequence data rather than clinical labeling for model training and is thus not subject to circularity, similar to SBNA26. EVE performs well on two of the well-studied human proteins (Spearman correlation ranging from −0.478 to −0.513, benign versus pathogenic P < 0.05 for all; Supplementary Fig. 5); EVE predictions are not available for ERK2. To minimize bias, all algorithms were tested on an independent dataset of 2800 rare variants derived from patients with an IRD who were seen at MEE, though of note, ground truth is still determined using the ClinVar database in accordance with the clinical standard in the field. Thus, methods that train on ClinVar must be interpreted with caution. The modified SBNA scores were compared to results generated using PolyPhen2 trained on two different datasets, HumDiv and HumVar12,41, as well as EVE scores26. All methods showed a significant difference between benign and pathogenic variants, and the modified SBNA scores correlated with the scores from other methods (Supplementary Figs. 6A, B, 7A, B, Spearman correlation coefficient range 0.510–0.571, p < 5e-24 for all). With a threshold of 1.5 for modified SBNA scores, the sensitivity was 0.548, specificity 0.908, positive predictive value 0.963, and negative predictive value 0.312.
ROC curves were generated by testing each of the methods on the dataset of 2800 variants present in MEE patients (AUC range: 0.788 [modified SBNA], 0.829–0.833 [PolyPhen2], 0.819 [AlphaMissense], and 0.809 [EVE]), Supplementary Figs. 6C, 7C). The modified SBNA, PolyPhen2, AlphaMissense, and EVE performed similarly, though PolyPhen2 and AlphaMissense scores may be inflated due to training on ClinVar pathogenicity labels.
Combining SBNA and EVE outperforms all methods individuallyThe correlation and prediction results suggest that structural information and sequence conservation provide distinctly important insight into pathogenicity, and thus, incorporating orthogonal metrics into a single score may help to improve model correlation with phenotypic benchmarks. We thus sought to use the two unbiased methods (modified SBNA and EVE). The modified network scores were scaled and added to EVE scores to create a combined score with a range of 0 to 2 (as EVE scores fall between 0 and 1, and SBNA scores were scaled based on the maximum and minimum values across all proteins such that they fell between 0 and 1 before adding to the EVE scores). When applied to the 2800 rare variants from MEE patients as well as the 47 IRD genes, this combined score distinguished pathogenicity (Supplementary Fig. 7D, benign vs. pathogenic p = 3.056e-17; Fig. 3d, benign vs. pathogenic p = 8.839e-69) and outperformed all other models with an AUC of 0.859 (Supplementary Fig. 7E) for the 2800 variants and 0.927 (Fig. 3c) for the IRD genes. With a threshold of 1.0 for the combination score, the sensitivity was 0.765, specificity 0.899, positive predictive value of 0.976 and negative predictive value of 0.416. While we note that EVE alone performed quite well (AUC = 0.915), adding the modified SBNA improves performance and, importantly, unlike EVE alone, offers a direct structural explanation for the mechanism through which variation may affect phenotype.
Model incorporating the modified SBNA scores identifies putative disease-causing variants with an unclear genetic basis for clinical diseaseA significant percentage of patients with clinical presentations consistent with IRD lack an identified genetic basis for their phenotype, and this is also observed for patients who receive care from the Inherited Retinal Disease Service at MEE. We therefore evaluated genetic data from 3621 probands with a clinical diagnosis of an IRD based on visual acuity, visual field testing, clinical exam, fundus autofluorescence imaging, optical coherence tomography and electroretinogram in individuals who underwent targeted or whole-exome sequencing. Mitochondrial causes of IRD were excluded. Missense variants of interest were defined using variant ranking software43 and residence in one of the studied 47 IRD genes. There were 2948 unique variants identified and categorized as either “pathogenic”, “likely pathogenic”, “VUS”, or “benign” based on a known variant consequence in the literature using ACMG/AMP criteria44 and ClinVar designations20 (Supplementary Fig. 8). Variants were further categorized in the context of individual patients as “likely causal” if they were pathogenic or likely pathogenic, the zygosity was consistent with known modes of inheritance, and the clinical presentation was consistent with the known consequence of the affected gene. Of all the reviewed patients, 455 patients carried variants of interest in the 47 IRD genes. Before applying SBNA, 357 were found to have variants that were “likely causal”, while 63 patients harbored one or more VUSs that prohibited a molecular diagnosis. The remaining 35 patients had non-missense variants or variants within a region that lacked available structural data and were therefore excluded.
We generated pathogenicity predictions for the 2,948 IRD gene variants discovered in the probands (Fig. 4a). Variants receiving a modified SBNA score of >1.5 (calibrated using probability estimates from the regression modeling) were deemed pathogenic. Scores were considered in the context of the identified genetic variants in known IRD genes for each patient with a clinical presentation consistent with IRD. Variants were matched with any phenotypic data available in ClinVar to roughly benchmark the pathogenicity scores. Similar to the ClinVar analysis of IRD genes, there were observable differences between the modified SBNA scores assigned to known pathogenic variants as compared to benign variants and VUS/variants without any available clinical data (benign median −1.478, VUS median −0.656, pathogenic median 2.148; benign vs. pathogenic p = 1.713e-30).
Fig. 4: SBNA helps identify pathogenic variants in patients with inherited retinal disease.a Categorization of results from the application of modified structure-based network analysis (SBNA) to a dataset of possibly solving patient variants. Results were further subdivided into those from patients with known putative genetic causes of disease (b) and those from patients with only VUSs in known inherited retinal disease-associated genes (c). d Representation of network scores for a sample structure with putative solving genetic variants. Sphere radius corresponds to network score magnitude at a particular position. A patient with clinical evidence of ABCA4 disease (d) as evidenced by bilateral foveal pigmentary changes (arrowhead) on color fundus photo and bilateral RPE atrophy (star) and hyper autofluorescent flecks throughout the posterior pole on fundus autofluorescence imaging (arrows) but with no complete genetic explanation was fully solved using SBNA which highlighted two variants (Pro1380Leu and Arg1097Ser) that score highly in the ABCA4 protein structure (e). Arg1097Ser was a VUS and is indicated in red within the structure.
For the 357 patients who harbored known pathogenic variants sufficient to cause disease, the modified SBNA scores were concordant with these pathogenicity categorizations in 96.0% of cases (Fig. 4b). For the 63 patients with VUSs as categorized by ACMG/AMP standards44 and/or ClinVar, the modified SBNA scores offered support for a genetic cause of disease for 40 patients (23 unique variants, Fig. 4c). By contrast, EVE scores offered support for a genetic cause of disease in 25 patients (20 in common with SBNA), PolyPhen2 scores trained on HumVar offered support in 33 patients (27 in common with SBNA), and AlphaMissense offered support in 40 patients (34 in common with SBNA). Combined EVE and modified SBNA scores offered support in 23 patients (20 in common with SBNA), but this analysis was limited because EVE does not provide scores for 15 patients. In the 15 patients where modified SBNA scores suggested a putative causative variant but EVE scores did not, one patient had a variant for which no EVE score was provided in the database, and the remainder had at least one candidate variant with an EVE score below the predicted pathogenicity threshold. Modes of inheritance included autosomal recessive (in combination with a known pathogenic variant or a second VUS with a high estimated probability of pathogenicity; n = 15), autosomal dominant (n = 2), and X-linked recessive (n = 5). For example, a patient with autosomal recessive ABCA4-related disease was found to have variants p.(Pro1380Leu) (known pathogenic) and p.(Arg1097Ser) (VUS). The p.(Arg1097Ser) variant scored highly (SBNA score 3.672, BLOSUM62 score -1, score difference 4.672), suggesting it is likely pathogenic and thus completing the genetic solution for this patient. Similarly, the VUS p.(Cys302Tyr) in RPGR was found in a hemizygous patient with phenotypic findings consistent with X-linked IRD and also scored highly (SBNA score 2.154, BLOSUM62 score -2, score difference 4.154) (Supplementary Fig. 9). For 19 patients, the modified SBNA scores contributed towards identifying a possible but not completely solved genetic cause, such as only one heterozygous VUS receiving a strong score. Finally, for the four remaining unsolved patients, SBNA was not possible due to lack of crystal structure data for those portions of the IRD-associated proteins or due to the presence of non-missense variants.
AlphaFold2 can improve structural coverage for SBNADespite evidence of strong performance when applied to IRD-associated proteins, SBNA remains broadly limited by the availability of high-quality structural data for proteins of interest. This structural coverage must also overlap with the availability of high-quality phenotypic data from ClinVar, limiting the scope of analysis (Fig. 5a). Applying AlphaFold2 may provide a path toward overcoming this limitation. For example, NR2E3 was excluded from the set of 47 IRD genes because the only available structural data is from a chimera formed between one domain of NR2E3 and an unrelated bacterial protein (PDB: 4LOG). SBNA performs poorly on this non-physiologic structure with relatively poor coverage of NR2E3 (47%) (Fig. 5b). However, when SBNA is applied to the full AlphaFold2-generated human NR2E3 structure, performance improves considerably (Fig. 5c).
Fig. 5: AlphaFold2 improves coverage for SBNA.a The percentages of high confidence ( ≥ 2 stars) ClinVar variants for each of 47 IRD genes captured with SBNA are widely distributed (blue); the per-protein percentages of solved structure are widely distributed (orange). b NR2E3 only has structural data available for a single domain as part of a chimera. SBNA scores for NR2E3 correlate poorly with pathogenicity because the structure is partial and non-physiologic, but (c) the performance improves when using AlphaFold2 to model the full human protein. d Ten IRD-associated genes without available structural data were selected, and e SBNA scores were calculated for the full AlphaFold2 structures.
To further establish that AlphaFold2 can help to overcome the limited availability of high-quality structural data for SBNA, we selected ten IRD-associated genes without available structural data that had a considerable amount of data in ClinVar (Fig. 5d). We performed SBNA on the AlphaFold2-generated structures for these genes and found a significant difference between the SBNA scores assigned to benign and pathogenic variants (p = 1.201e-7; benign median −0.539, VUS median −0.491, pathogenic median 0.076) (Fig. 5e). However, the magnitude of difference between the median benign and pathogenic network scores was decreased compared to SBNA performed on the structural data from the 47 IRD-associated genes. Still, these results suggest that AlphaFold2 could potentially be useful in expanding the applicability of SBNA, although it is not clear that the quality of this analysis would be superior to that performed on experimentally generated structural data.
留言 (0)