Exploring the role of splicing in TP53 variant pathogenicity through predictions and minigene assays

Splicing assay results

Table 1 summarises the number of positive predictions and the splicing assay results for the selected TP53 variants encoding synonymous and missense substitutions, as well as for the positive and negative control variants. Variant-specific results, including all transcripts produced, are grouped by predicted effect and location as follows: splice site positive controls (Table 2), synonymous variants outside splice regions (Table 3), missense variants outside splice regions (Table 4), and missense variants inside splice regions (Table 5). Electropherograms are shown in Supplementary Fig. 2.

Table 1 Summary of splicing assay results according to test variant groups and controls

Variants in all variant groups produced aberrant transcripts (i.e. not FL) for which PVS1 (RNA)_Variable weight was assigned (summarised in Table 1). The proportion of variants producing aberrant transcripts was highest for positive control variants located at splice site dinucleotide positions and expected to have high impact (8/8), and lowest for synonymous variants.

Negative and positive controls

The wild-type minigene construct mgTP53_2-9 produced the expected FL transcript of 1,202 nucleotides at 100% expression. All eight splice site dinucleotide positive control variants produced aberrant transcripts, yielding 0% FL transcript and variable percentages of PTC and/or in-frame transcripts, with three variants additionally producing an extremely low level of uncharacterised transcripts (Tables 1 and 2).

Table 2 Splicing assay results with corresponding assigned PVS1 (RNA)_Variable weight for the splice site positive controls

All positive controls met PVS1 (RNA)_Variable weight based on TP53 specifications v2, with expression of the PVS1-assigned transcripts higher than 92.6%, except for c.783-1G > A. For this variant, half of the abnormal expression (50.2%) corresponded to a very short in-frame transcript ▼(E8p3) (p.Ser261_Gly262insSer) with no obvious damaging impact and a benign prediction from MutationTaster (no prediction was available from BayesDel), to which we subsequently assigned PVS1_NA. The other two transcripts produced by this control variant, △(E8p24) [48.2%] and ▼(I7) [1.6%], were assigned PVS1_Strong and PVS1, respectively.

Apparent synonymous variants outside splice regions

Of the nine synonymous variants assessed, four had FL transcript expression at 100%, and an aberrant transcript meeting PVS1 at any weight was observed for the remaining five variants (Tables 1 and 3).

Table 3 Splicing assay results with corresponding assigned PVS1 (RNA)_Variable weight for the synonymous variants outside splice regions

Only the c.597A > T variant (absent in ClinVar) led to an aberrant transcript expression level higher than 50% (66.9%), based on three transcripts which all met PVS1. The remaining four variants each led to the expression of aberrant transcripts, with total levels ranging from 5.3% to 22.8%, and a PVS1 (RNA) annotation of either PVS1_Strong (c.879G > T only) or PVS1.

Apparent missense variants outside splice regions

Of the 24 missense variants outside splice regions, 18 led to an aberrant transcript meeting PVS1 at any weight, four produced only uncharacterised transcripts at an extremely low level (< 3.3%), and the remaining two had FL transcript expression at 100% (Tables 1 and 4).

Table 4 Splicing assay results with corresponding assigned PVS1 (RNA)_Variable weight for the missense variants outside splice regions

Of the 18 variants for which PVS1 was assigned, 12 produced aberrant transcripts with a total expression higher than 50%, of which ten had PVS1 assigned at full strength. Of these 12, seven had an expression higher than 80%, of which two had a full expression of 100%.

Apparent missense variants inside splice regions

Of the 26 missense variants inside splice regions, an aberrant transcript was observed in 20 variants, with the remaining six having FL transcript expression at 100% (Tables 1 and 5).

Table 5 Splicing assay results with corresponding assigned PVS1 (RNA)_Variable weight for the missense variants inside splice regions

All variants producing aberrant transcripts met PVS1, except one producing a transcript (ΔE9p3 [100%]) for which PVS1_NA was assigned based on the BayesDel bioinformatic predictions for single amino acid deletions at positions spanning the p.Ala307Leu308delinsVal variant (as per current TP53 VCEP specifications). Of the 19 variants with PVS1-assigned aberrant transcripts, 13 produced transcripts with a total expression higher than 50%, eight had a total expression higher than 80%, and three had a full expression of 100%.

SpliceAI demonstrates highest predictive performance at currently used VCEP cutoffs

Performance for SpliceAI and MES was evaluated against variants without splice impact (i.e., FL transcript expression at 100%) and variants with a high splice impact (PVS1-assigned aberrant transcript expression > 80%), as well as other variants with intermediate aberrant expression ranges (Table 6). Overall, of the variants not predicted to alter splicing, SpliceAI had the highest proportion of true negatives at the ≤ 0.1 cutoff (80%), compared with the < 0.5 cutoff (41.2%) and MES (18.2%). On the other hand, of the variants predicted to alter splicing, the proportion of true positives was not markedly different for the three approaches: 33–38% for SpliceAI prediction at either cutoff or MES when considering only variants resulting in a high splice impact of > 80% aberrant transcripts, and 55–57% when considering all variants producing an aberrant transcript with expression higher than 50%.
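The proportions described above can be illustrated with a minimal sketch, under stated assumptions. It treats a true negative as a variant called negative by the predictor that shows no aberrant transcript in the assay, and a true positive as a variant called positive whose aberrant expression exceeds a chosen level. The `Variant` class, the helper names, the toy records and the positive-call cutoff of 0.5 are all hypothetical and are not data or code from this study.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Variant:
    score: float         # predictor score, e.g. SpliceAI maximum delta score
    aberrant_pct: float  # % PVS1-assigned aberrant transcript in the assay


def proportion(hits: int, total: int) -> Optional[float]:
    """Return hits/total, or None when nothing was called."""
    return hits / total if total else None


def true_negative_proportion(
    variants: List[Variant],
    predicted_negative: Callable[[Variant], bool],
) -> Optional[float]:
    """Among variants predicted NOT to alter splicing, the fraction
    with no aberrant transcript (FL at 100%)."""
    called = [v for v in variants if predicted_negative(v)]
    true_neg = [v for v in called if v.aberrant_pct == 0.0]
    return proportion(len(true_neg), len(called))


def true_positive_proportion(
    variants: List[Variant],
    predicted_positive: Callable[[Variant], bool],
    min_aberrant_pct: float,
) -> Optional[float]:
    """Among variants predicted to alter splicing, the fraction whose
    aberrant expression exceeds min_aberrant_pct."""
    called = [v for v in variants if predicted_positive(v)]
    true_pos = [v for v in called if v.aberrant_pct > min_aberrant_pct]
    return proportion(len(true_pos), len(called))


# Hypothetical toy records, for illustration only.
toy = [
    Variant(0.05, 0.0), Variant(0.08, 0.0), Variant(0.10, 12.0),
    Variant(0.60, 95.0), Variant(0.70, 0.0), Variant(0.90, 60.0),
]

tn = true_negative_proportion(toy, lambda v: v.score <= 0.1)
tp80 = true_positive_proportion(toy, lambda v: v.score >= 0.5, 80.0)
tp50 = true_positive_proportion(toy, lambda v: v.score >= 0.5, 50.0)
```

Passing the call rule as a predicate keeps the same functions usable for either SpliceAI cutoff or for an MES-based call.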

Table 6 Comparison of predictive performance of each splicing predictor category

Lack of strong correlation between intermediate SpliceAI scores and aberrant transcript expression

With regards to the correlation between SpliceAI maximum delta scores and level of PVS1-assigned aberrant transcript expression, there was a positive moderate correlation of 0.50 using Pearson’s correlation analysis (Fig. 1A). Upon analysing the data using box plots for different score bins, it was evident that there was a clear distinction in the transcript expression distribution for variants with SpliceAI scores below 0.2, as most of these variants exhibited 0% expression of PVS1-assigned aberrant transcripts (Fig. 1B). In contrast, no significant differences were observed among the other three score groups, although variants with SpliceAI scores over 0.8 generally exhibited somewhat higher aberrant transcript expression.
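As a rough illustration of this kind of analysis, the sketch below computes Pearson's correlation coefficient and groups expression values into four score bins. Only the lowest (below 0.2) and highest (over 0.8) bins are stated in the text; the middle edge of 0.5, the handling of scores exactly on a bin edge, the function names and the toy values are assumptions made for the example.

```python
from math import sqrt
from statistics import median
from typing import Dict, List, Sequence

# Bin edges: 0.2 and 0.8 come from the text; 0.5 is an assumed middle edge.
BIN_EDGES = (0.2, 0.5, 0.8)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def score_bin(score: float) -> int:
    """Index of the score bin: 0 is the lowest bin, 3 the highest."""
    for i, edge in enumerate(BIN_EDGES):
        if score < edge:
            return i
    return len(BIN_EDGES)


def median_by_bin(scores: Sequence[float],
                  expression: Sequence[float]) -> Dict[int, float]:
    """Median aberrant expression (%) within each score bin."""
    groups: Dict[int, List[float]] = {}
    for s, e in zip(scores, expression):
        groups.setdefault(score_bin(s), []).append(e)
    return {b: median(vals) for b, vals in sorted(groups.items())}


# Hypothetical toy values, for illustration only.
toy_scores = [0.05, 0.10, 0.30, 0.60, 0.90]
toy_expression = [0.0, 0.0, 40.0, 20.0, 100.0]

r = pearson_r(toy_scores, toy_expression)
medians = median_by_bin(toy_scores, toy_expression)
```

Comparing per-bin medians alongside the single correlation coefficient shows how a moderate overall correlation can coexist with little separation between the intermediate bins.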

Fig. 1

Correlation between SpliceAI maximum scores and the expression level of the corresponding PVS1-assigned aberrant transcripts, using individual scores (A) and score ranges (B)

SpliceAI-10k calculator is effective at predicting specific aberrant transcripts

Specific types of splicing aberrations were predicted using the SAI-10k-calc algorithm, which interprets the combination of all SpliceAI delta scores and delta positions. Results showed that SAI-10k-calc can predict specific aberrant transcripts with at least 3% of overall expression (Supplementary Table 3). For all positive control variants, the predicted splicing aberration matched at least one of the observed transcripts, except for one variant (c.782 + 1G > A), for which SpliceAI predicted the loss of the donor site but not the loss of its acceptor site pair, so the Δ(E7) observed in the assay was not predicted.

We compared the predicted transcripts against the observed assay results at different levels (> 5%, > 50%, and > 80%) of total characterised aberrant transcripts for control and test variants (Fig. 2, Supplementary Table 3). Of the 49 variants with > 5% aberrant transcript expression, 40 (81.6%) had at least one predicted transcript that matched with assay results. All nine variants with observed splicing impact but no predicted aberration were located inside the splice region. Variants inside the splice region, including those located at the splice site dinucleotides, often alter splicing through loss of donor/acceptor site. SpliceAI correctly predicted the loss of donor site for six variants inside the donor splice region, but did not accurately predict the precise location of the loss of their acceptor site pair, resulting in a lower number of matched transcripts with exon skipping or intron retention. Transcript prediction was better for variants located outside of the splice region at all three levels of aberrant transcript expression compared to variants inside the splice region. Outside the splice region, all variants with > 5% aberrant transcript expression had matching predicted aberrant transcript(s). Therefore, the SAI-10k-calc default settings had high sensitivity for exonic variants outside the splice region that lead to usage of new or cryptic splice sites, generating transcripts with partial exon deletion or partial intron retention.

Fig. 2

Predictive performance of SpliceAI-10k calculator inside or outside the splice region and at different percentage aberrant transcript cutoffs. Data presented for variants inside the splice region includes the splice site dinucleotide variants. A variant having a predicted aberrant transcript that matches with at least one variant-induced transcript in the assay is counted as a concordant observation
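The concordance rule in this caption can be sketched as a set intersection between predicted and observed transcripts, counted only for variants above a given aberrant expression cutoff. The record layout, the transcript labels and the toy values below are hypothetical and are not taken from Supplementary Table 3.

```python
from typing import Dict, List, Tuple

# Hypothetical toy records; transcript labels are illustrative only.
toy_variants: List[Dict] = [
    {"region": "outside", "aberrant_pct": 66.0,
     "predicted": {"partial_exon_deletion"},
     "observed": {"partial_exon_deletion", "intron_retention"}},
    {"region": "inside", "aberrant_pct": 90.0,
     "predicted": {"partial_intron_retention"},
     "observed": {"exon_skipping"}},
    {"region": "inside", "aberrant_pct": 4.0,
     "predicted": set(),
     "observed": {"exon_skipping"}},
]


def concordance(variants: List[Dict], min_aberrant_pct: float,
                region: str = "") -> Tuple[int, int]:
    """Return (concordant, eligible) counts.

    A variant is eligible when its total aberrant expression exceeds
    min_aberrant_pct (and it lies in `region`, if one is given). It is
    concordant when at least one predicted transcript is also observed.
    """
    eligible = [
        v for v in variants
        if v["aberrant_pct"] > min_aberrant_pct
        and (not region or v["region"] == region)
    ]
    matched = [v for v in eligible if v["predicted"] & v["observed"]]
    return len(matched), len(eligible)


overall = concordance(toy_variants, 5.0)
outside = concordance(toy_variants, 5.0, region="outside")
inside_high = concordance(toy_variants, 80.0, region="inside")
```

Returning the counts rather than a ratio keeps the denominator visible, which matters when a subgroup contains only a few variants.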

Of the 16 test variants that did not induce any characterised aberrant transcript, 10 (62.5%), with SpliceAI maximum delta scores ranging from 0.01 to 0.17, were correctly predicted as having no splicing impact (Fig. 1, Supplementary Table 3). The other six variants, located in exon 4 or 8 and with SpliceAI maximum delta scores ranging from 0.5 to 0.71, were incorrectly predicted to generate transcripts with partial exon deletion. Further inspection using SpliceAI-visual [21] revealed that exons 4 and 8 are bounded by strong native splice sites with SpliceAI reference scores > 0.97. At any predicted splice site position, SpliceAI provides the alternate score, which reflects the splice site strength after introducing a variant into the reference sequence. The six false positive variants had SpliceAI alternate scores for donor/acceptor gain ranging from 0.38 to 0.72, which were weaker than the alternate scores for the native splice sites (Supplementary Fig. 3). This suggests that the false positive predictions arose because the default algorithm does not consider the splice site competition condition (i.e., strength of native vs new splice site).

However, the splice site competition condition was not applicable to certain prioritised variants. The SAI-10k-calc default algorithm correctly predicted the transcripts with partial exon deletion for nine variants although the alternate scores for the new donor/acceptor were lower by more than 0.2 compared to the native splice sites (Supplementary Table 4). For example, c.559G > T led to donor gain with an alternate score of 0.41, a considerably weaker score than the native donor (0.97), but usage of the new donor site was still observed in the assay.
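To make the trade-off concrete, the sketch below applies a simple, hypothetical competition filter: a predicted new splice site is kept only if its alternate score is not weaker than the native site's score by more than a fixed margin. The function name and the use of 0.2 as a filter margin are assumptions for illustration; this is not the SAI-10k-calc algorithm or a setting it provides.

```python
def passes_competition(new_site_alt: float, native_site_alt: float,
                       max_deficit: float = 0.2) -> bool:
    """Hypothetical competition filter.

    Keep a predicted donor/acceptor gain only when the new site's
    alternate score is at most `max_deficit` below the native site's
    score. The 0.2 default is an assumed margin, not a published setting.
    """
    return (native_site_alt - new_site_alt) <= max_deficit


# Scores reported in the text for c.559G > T: new donor 0.41, native donor 0.97.
kept_c559 = passes_competition(0.41, 0.97)

# Hypothetical near-equal pair, for contrast.
kept_strong = passes_competition(0.90, 0.97)
```

With these inputs the filter would discard the c.559G > T donor gain even though usage of the new donor was observed in the assay, which is the limitation described above: a competition rule that removes false positives can also remove true positives.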

Splicing assay results can contribute to TP53 variant classification

Focusing on variants with a very strong splicing impact (PVS1 (RNA)_Variable weight expression higher than 80%), addition of publicly available data indicates that at least 14 of the test variants could be classified as P/LP using TP53 specifications v2 (Table 7). All these variants had SpliceAI scores ≥ 0.4, all were absent from the FLOSSIES database, and four were observed in cancer probands in the TP53 database. Only one of these variants (c.318C > G (p.Ser106Arg)) was already classified as LP in ClinVar, with a single submission from Invitae which mentioned a positive splicing prediction. One variant, c.314G > T (p.Gly105Val), is currently classified in ClinVar as LP by a single submitter with no evidence summary provided; in the absence of privately held clinical data, it remained a VUS in this study with 5 points (PVS1_Strong and PM2_Supporting codes applied).

Table 7 TP53 variants classified as pathogenic or benign after incorporating splicing assay results

Similarly, we found that experimental evidence of no splicing impact could contribute to variant classification by allowing use of the relevant functional and computational codes. Specifically, of the 16 variants predicted to alter splicing by either SpliceAI or MES but which did not produce any aberrant transcript according to our results, 13 could now be classified using TP53 specifications v2: five missense variants as P/LP, four missense variants as LB, and four synonymous variants as LB. This evaluation would add nine new clinically-relevant classifications.
