The missing link: ARID1B non-truncating variants causing Coffin-Siris syndrome due to protein aggregation

ARID1B variants included in the study

Overall, we analysed seven non-truncating variants located in the EHD2 domain of ARID1B (Fig. 1C, Table S1). Variants were annotated to the ARID1B reference transcript NM_020732.3 (GRCh37/hg19). All variants were absent from gnomAD, with the exception of E2011V (one heterozygous carrier). The novel inframe variant 2129del4 (c.6385_6397delinsA p.(Glu2129_Ala2133delinsThr)) (Fig. 1C) occurred de novo in an individual with coarse facial features, mild developmental delay (DD)/intellectual disability (ID), speech deficits, autistic behaviour, muscular hypotonia, complete agenesis of the corpus callosum and hydrocephalus internus (Fig. 1A, File S1 “clinical_table”, File S2 “clinical reports”). Coffin-Siris syndrome (CSS) was clinically suspected. Given its non-truncating nature (as shown by RT-PCR analysis, Fig. S1), it was initially classified as a variant of unknown significance (VUS; PM2_supporting, PM4_supporting, PS2_supporting). The second variant described herein is a de novo frameshift deletion, 2188ter (c.6463_6473del p.(Ser2155Leufs*33)), which escapes nonsense-mediated decay (NMD), leading to the generation of an aberrant transcript (Vasileiou et al. 2015) (Fig. 1C). It was identified in a mildly affected CSS individual (Hoyer et al. 2012) (Fig. 1B, File S1 “clinical_table”).

Five additional amino acid substitutions in EHD2 were extracted from the literature (Fig. 1C), and their classification in the respective studies was adopted. Variant C1945R (c.5833T > C p.(Cys1945Arg)) was identified de novo in an individual with clinical suspicion of CSS and was initially classified as a VUS. However, in silico analysis including evolutionary conservation and protein predictors suggested a deleterious effect, and a methylation assay revealed a BAFopathy episignature (Aref-Eshghi et al. 2018a). Variant I2031N (c.6092T > A p.(Ile2031Asn)) occurred de novo in an individual with mild DD/ID and dysplasia with agenesis of the splenium of the corpus callosum. In silico assessment supported pathogenicity, and it was interpreted as likely pathogenic (Yan et al. 2019). Variant I2031T (c.6092T > C p.(Ile2031Thr)) was reported as causative in a CSS individual with complete agenesis of the corpus callosum and mild DD/ID. It was inherited from the affected mother, who also presented with mild ID but no callosal anomalies (Mignot et al. 2016). In ClinVar it was listed as likely pathogenic. Variant H2054P (c.6161A > C p.(His2054Pro)) was found de novo in an individual with complete agenesis of the corpus callosum and mild DD/ID and was classified as likely pathogenic (Miyamoto et al. 2021). The last EHD2 variant, E2011V (c.6032A > T p.(Glu2011Val)), was characterised as a VUS. Although no clinical data were provided, the variant did not show a CSS methylation profile (Aref-Eshghi et al. 2018a) and was used herein as a negative control. Available clinical and genetic data of all individuals are described in File S1 “clinical_table” and “variants”.

To exclude any artefacts in functional experiments, we analysed additional variants located outside of the EHD2 domain (Fig. 1C). Two of them were amino acid changes located in the globular ARID domain, sourced from the ClinVar database: D1099V (c.3296A > T p.(Asp1099Val)) and G1112D (c.3335G > A p.(Gly1112Asp)). They were not observed in gnomAD, and in silico prediction programmes categorised them as deleterious. Neither clinical information nor the inheritance pattern was available, but both were listed as likely pathogenic. The third variant, D1727N (c.5179G > A p.(Asp1727Asn)), lies outside any functional domain and was present in gnomAD (11 heterozygous carriers). It was initially classified as a VUS but subsequently downgraded to likely benign because it did not show a BAFopathy methylation pattern (Aref-Eshghi et al. 2018a) (File S1 “clinical_table” and “variants”, Table S1).
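The short labels used for the missense variants (for example C1945R) are one-letter abbreviations of the HGVS protein-level descriptions given in parentheses. As a minimal illustrative sketch, the mapping can be expressed as follows; the function name `short_to_hgvs` is hypothetical and not part of the study, and the helper only handles simple single-residue substitutions, not the indel or frameshift labels (2129del4, 2188ter).

```python
import re

# Standard one-letter to three-letter amino acid abbreviations.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def short_to_hgvs(label: str) -> str:
    """Convert a short substitution label such as 'C1945R' into the
    predicted protein-level form 'p.(Cys1945Arg)'.

    Hypothetical helper for illustration only; it rejects anything that
    is not <residue><position><residue>.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    ref, pos, alt = match.groups()
    if ref not in THREE_LETTER or alt not in THREE_LETTER:
        raise ValueError(f"unknown amino acid code in {label!r}")
    return f"p.({THREE_LETTER[ref]}{pos}{THREE_LETTER[alt]})"


for name in ("C1945R", "I2031T", "D1099V"):
    print(name, "->", short_to_hgvs(name))
```

The parentheses around the protein change follow the HGVS convention for consequences that are predicted from the DNA change rather than observed at the protein level.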

Expression and methylation profiles are consistent with BAFopathy

Despite the initial classification of the indel variant 2129del4 as a VUS, the strong resemblance of the individual's presentation to CSS warranted further investigation. To assess its potential clinical significance, we performed transcriptome analysis, comparing the individual's expression profile to those of six CSS individuals harbouring pathogenic NMD-inducing ARID1B variants and nine healthy controls. The expression pattern clustered together with that of the ARID1B truncating alterations and was distinct from that of controls (Fig. 2A). An RNA sample for testing of the 2188ter deletion was not available. Nevertheless, a previous transcriptome analysis including this variant revealed a similar clustering with pathogenic NMD-inducing ARID1B variants (Vasileiou et al. 2015). Additionally, array-based DNA methylation analysis using the EpiSign assay was applied to samples of both individuals and revealed a genome-wide DNA methylation profile consistent with BAFopathy syndromes (Fig. 2B-D). More specifically, as indicated by Euclidean clustering, multidimensional scaling and an elevated MVP score (1.0 in both cases), the methylation signatures of both the inframe insertion-deletion and frameshift deletion individuals were concordant with those observed in individuals with ARID1A, ARID1B, SMARCB1, SMARCA4 and SMARCA2 variants.

Fig. 2

RNAseq and methylation analyses. (A) Heat map of differential expression profiles generated from blood of Ind1 (2129del4), six Coffin-Siris (CSS) individuals with truncating variants in ARID1B (A1-A6) and nine controls (C1-C9). Gene expression is scaled across columns. Note that the expression pattern of Ind1 clusters together with CSS individuals and separately from controls. (B-D) EpiSign (DNA methylation) analysis in peripheral blood from two cases with variants 2129del4 and 2188ter in ARID1B. (B) Hierarchical clustering and (C) multidimensional scaling plots indicate that Ind1 (2129del4) (red) and Ind2 (2188ter) (black) both have a DNA methylation profile similar to subjects with a confirmed BAFopathy episignature (blue) and distinct from controls (green). (D) MVP score, a multi-class supervised classification system capable of discerning between multiple episignatures by generating a probability score for each episignature. The BAFopathy score of 1.0 for both cases indicates an episignature similar to BAFopathy reference cases, including those with Coffin-Siris syndrome 1
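The idea of a sample lying closer, in Euclidean terms, to one reference group than to another can be illustrated with a toy nearest-centroid comparison. This is a deliberately simplified sketch and not the EpiSign procedure: hierarchical clustering and multidimensional scaling operate on pairwise distances between all samples, and the MVP score is a supervised classifier. All values and group names below are invented for illustration and do not come from the study.

```python
import math


def centroid(profiles):
    """Mean profile of a group: average each probe across samples."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_group(sample, groups):
    """Return the name of the group whose centroid is closest to sample."""
    return min(groups, key=lambda name: euclidean(sample, centroid(groups[name])))


# Hypothetical methylation levels (beta values, 0..1) at three probes.
groups = {
    "episignature_reference": [[0.80, 0.20, 0.70], [0.75, 0.25, 0.65]],
    "controls": [[0.30, 0.60, 0.20], [0.35, 0.55, 0.25]],
}
test_sample = [0.78, 0.22, 0.68]  # invented profile of a sample under test

print(nearest_group(test_sample, groups))
```

In real data the comparison is made across many probes selected for the episignature, which is why distance-based plots can separate cases from controls so clearly.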

EHD2 variants do not generally impact the interaction with SMARCA4

ARID1B interacts with SMARCA4 via its EHD2 domain (Inoue et al. 2002). Interestingly, it has previously been shown that the NMD-escaping frameshift variant 2188ter leads to weaker interaction with SMARCA4 (Vasileiou et al. 2015). Considering an impaired interaction with SMARCA4 as a plausible cause of pathogenicity, we explored whether this was also the case for other EHD2 variants. To this end, ARID1B-T7 expression vectors harbouring the different EHD2 domain variants were generated. We overexpressed the vectors together with SMARCA4-FLAG in HEK293T cells and analysed the interaction through proximity ligation assay (PLA) as well as co-immunoprecipitation (CoIP) assays. While the PLA showed qualitative interaction of all tested ARID1B variants with SMARCA4 (Fig. S2A), quantitative CoIP confirmed that this interaction was indeed markedly reduced for the frameshift variant 2188ter. No effect was observed for the remaining EHD2 variants (Fig. S2B).

Variants in the EHD2 and ARID domains are prone to misfolding and aggregation

As amino acid substitutions and NMD-escaping deletions can affect protein folding and structure, we addressed whether this holds true for variants in the EHD2 domain of ARID1B. To this end, the subcellular localization was examined upon overexpression in HEK293T cells via immunofluorescence staining. Depending on the cell cycle, ARID1B was either homogeneously distributed or in a punctate pattern throughout the nucleus (Vasileiou et al. 2015) (Fig. 3A, wild type; WT). Four of the five EHD2 missense variants (C1945R, I2031T, I2031N, H2054P) as well as the indel and frameshift variants predominantly showed protein accumulation in circular cytoplasmic formations in 66–93% of the examined cells, depending on the variant. Such formations were observed in only 16% of cells expressing wild type protein, most likely as a result of cellular protein overload due to overexpression (Fig. 3A-B). The EHD2 missense variant E2011V and the variant D1727N, which lies outside of known functional domains, did not show significantly increased cytoplasmic aggregation, with only ~30% of observed cells affected (Fig. 3A-B). Surprisingly, the aggregation was more pronounced for the two ARID substitutions (D1099V, G1112D), which exhibited not only cytoplasmic aggregates (in 61 to 88% of cells), but also smaller nuclear aggregates (12% and 39%). As a result, fewer than 1% of observed cells displayed the normal nuclear ARID1B distribution (Fig. 3A-B).

Fig. 3

Immunofluorescence analysis shows protein aggregation. (A) Representative microscopy images of intracellular localization of ARID1B wild type (WT) and mutants overexpressed in HEK293T cells. Scale bar: 10 μm. WT ARID1B is distributed either evenly throughout the nucleus or in nuclear puncta. Variants in the ARID domain (D1099V, G1112D) exhibit cytoplasmic as well as nuclear aggregation. Variant D1727N located outside of any functional domain as well as the EHD2 variant E2011V show normal distribution, whereas all remaining EHD2 variants aggregate in the cytoplasm. Cell counts for quantification are shown in a contingency table. (B) Quantification of aggregation. Statistical analysis: in three independent experiments, 100 transfected cells each were analyzed for ARID1B localization. Bars show the fraction of cells with the respective localization pattern: normal (blue), nuclear aggregation (light green), and cytoplasmic aggregation (black). P-values were generated using a chi-squared test and corrected for multiple testing. Significant aggregation compared to the wild type was observed for all variants except D1727N and E2011V. *** p < 0.001
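A comparison of this kind, variant versus wild type on counts of cells per localization category followed by a multiple-testing correction, can be sketched as below. The sketch makes several assumptions not stated in the source: it collapses the categories to a 2x2 table (aggregated vs. normal), uses Pearson's chi-squared test without continuity correction, and applies a Bonferroni correction (the source does not name the correction method). The counts and variant names are invented and are not the data behind Fig. 3.

```python
import math


def chi2_2x2(table):
    """Pearson chi-squared test for a 2x2 contingency table.

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability is erfc(sqrt(statistic / 2)). No continuity correction.
    Assumes every row and column total is non-zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, math.erfc(math.sqrt(stat / 2))


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


# Hypothetical counts: (cells with aggregates, cells with normal pattern),
# 300 cells per construct (three experiments x 100 cells).
wild_type = (50, 250)
variants = {"variant_x": (240, 60), "variant_y": (55, 245)}

raw = {name: chi2_2x2([wild_type, counts])[1] for name, counts in variants.items()}
adjusted = bonferroni(raw)

for name, p in adjusted.items():
    print(f"{name}: adjusted p = {p:.3g}")
```

With these invented counts, `variant_x` differs strongly from wild type while `variant_y` does not; a table with three categories (normal, nuclear, cytoplasmic) would use the same statistic with two degrees of freedom.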

The cytoplasmic aggregates resembled structures previously described as aggresomes. These are juxtanuclear inclusion bodies in close proximity to the microtubule organising centre (MTOC) and are surrounded by the intermediate filament protein vimentin (Johnston et al. 1998; Johnston and Samant 2021). Co-staining of transfected HeLa cells for vimentin and γ-tubulin (centrosome marker) revealed both the characteristic vimentin cage-like structure around the cytoplasmic formations and a co-localisation with the MTOC, supporting our hypothesis (Fig. 4, Fig. S3).

Fig. 4

Identification of aggregates as aggresomes. Co-staining of ARID1B WT and variant proteins with the cytoskeletal filament protein vimentin and the centrosome protein γ-tubulin shows inclusion of aggresomes in a vimentin cage and close proximity to the MTOC, indicated by arrows. Scale bar: 10 μm

Furthermore, except for the ARID variant D1099V that showed significantly reduced protein expression, the total protein levels were comparable between wild type and protein variants according to western blot analysis (Fig. S4).

Aggregation is likely caused by exposure of amylogenic protein stretches

Computational analysis showed that the EHD2 domain contains amylogenic sequences (Fig. 5A, Table S2). The four aggregating missense variants (C1945R, I2031T, I2031N, H2054P) are located in the globular part of the EHD2 domain near the amylogenic segments. Since these variants are predicted to severely disrupt the domain structure (Table S3), the amylogenic sequence stretches are expected to become exposed, thereby likely leading to protein aggregation (Teng and Eisenberg 2009). A similar mode of action is likely for the 2129del4 and 2188ter variants, which are predicted to cause a complete loss of the three-dimensional EHD2 domain structure.

Fig. 5

Structural analysis of ARID1B variants. (A) Structure of the EHD2 domain indicating the sites of mutation as black balls. The stretch of the E2129_A2133delinsThr mutation is highlighted as black ribbon. The amylogenic sequence stretches are marked in blue. (B) Structure of the ARID domain indicating the sites of mutation as black balls. The amylogenic sequence stretches are marked in blue

The ARID missense variants (D1099V, G1112D) were also predicted to be deleterious by AlphaMissense and Vipur (Table S3). They flank a sequence stretch (L1100-V1105) that is predicted to exhibit amylogenic properties (Fig. 5B, Table S2). Similar to the EHD2 variants, the two substitutions in the ARID domain are expected to disrupt the three-dimensional structure, thereby offering an explanation for the experimentally observed aggregation.

The two remaining missense alterations (E2011V, D1727N) showed no significantly increased aggregation in the functional assays, consistent with their classification as not causative. This property most likely results from their location within the ARID1B structure. Variant E2011V is located in a long disordered loop of the EHD2 domain (Fig. 5A). Therefore, the effect of the exchange is likely less severe than that of the variants in the globular part of the EHD2 domain. Variant D1727N is located outside of the globular domains (Fig. 5A), so the exchange is not expected to have a critical impact on ARID1B structure and aggregation properties.
