Clinical and genetic analysis of familial neuromyelitis optica spectrum disorder in Chinese: associated with ubiquitin-specific peptidase USP18 gene variants

Introduction

Neuromyelitis optica spectrum disorders (NMOSD) are autoimmune diseases of the central nervous system characterised by relapsing attacks of optic neuritis (ON), myelitis and lesions in specific brain areas. Autoimmunity against the cell membrane protein aquaporin 4 (AQP4) is responsible for their pathogenesis.1 2 The estimated prevalence of NMOSD is 0.3–4.4 per 100 000 worldwide,1 and notably, NMOSD occurs more frequently in Asian than in Caucasian populations.3 The majority of NMOSD cases are sporadic, and familial cases are estimated to account for only 3% of all NMOSD.4 5 A series of familial NMOSD cases have been reported since the first case was described in 1938.6–11 Most NMOSD families had no more than three patients, who were related as siblings, parent–child or aunt–niece pairs. Intriguingly, familial NMOSD was almost the same as sporadic NMOSD in clinical characteristics such as onset age, sex bias and AQP4-IgG-positive rate.4

The observed familial aggregation and the higher prevalence in Asian populations suggest a potential genetic influence on NMOSD risk. Nevertheless, genetic research on NMOSD, especially familial NMOSD, remains limited. Previous studies generally focused on the human leucocyte antigen (HLA) genes or on deleterious rare variants in non-HLA genes12 13 because a rare or minor allele is more likely to be a causal factor. It is currently accepted that NMOSD is a non-Mendelian disease, for which a pathogenetic major allele may occur owing to genetic drift or frequency-dependent selection. Interactions between different genes, and between genes and environments, also contribute to the development of disease. The indifferent or beneficial effect of the major or common allele can thus be reversed.14–16

For familial NMOSD from different ethnic origins, shared HLA haplotypes have been reported in a number of patients within the same family, but few risk loci outside HLA have been found.12 In China, a functional deletion variant of the cellular adhesion molecule gene nectin-like molecule 2 (NECL2) was revealed by whole-exome sequencing (WES) in two familial NMOSD cases.17 Furthermore, the first case report of patients across three generations (a father, his son and the father’s maternal aunt) implied genetic heterogeneity of the disease. In this family, only the aunt was seropositive for AQP4-IgG, and none of the patients carried the NECL2 deletion variant described in the aforementioned report.18

In the present study, we identified 10 Chinese families with NMOSD aggregation through questionnaires for neuroimmunology diseases in four regional medical centres from 2016 to 2020. We studied the important clinical characteristics of 22 cases with familial NMOSD and compared them with 459 cases with sporadic NMOSD. We further performed a genetic study by WES for seven families composed of 13 cases and 13 controls. Family-based linkage and association analyses were conducted to reveal potential genes affecting familial NMOSD, and heterogeneity tests for the linkage analysis were performed. We then predicted the functional impacts of single-nucleotide polymorphisms (SNPs) in the ubiquitin-specific peptidase USP18 gene, whose genetic variants were related to the disease. Finally, we analysed WES data of the USP18 gene from sporadic NMOSD, composed of 228 cases and 1 400 matched healthy controls, and examined correlations between NMOSD phenotypes and SNP genotypes to further confirm our findings. The data analysis flowchart applied to detect and verify the significant variants in the USP18 gene is shown in figure 1.

Figure 1

Flowchart of detection and verification of significant variants in ubiquitin-specific peptidase USP18 related to neuromyelitis optica spectrum disorder (NMOSD). BH, Benjamini and Hochberg test (1995); BQSR, Base Quality Score Recalibration; GVCF, Genomic Variant Call Format; Holm, Holm test (1979); HWE, Hardy-Weinberg equilibrium test; indel, insertion/deletion; LOD, LOD score of a test; p, p value of a test; SNP, single-nucleotide polymorphism; VCF, Variant Call Format; VQSR, Variant Quality Score Recalibration. #Refer to Estrada et al.24 $Refer to Zhong et al.19

Materials and methods

Individuals

Patients who met the international consensus diagnostic criteria for NMOSD5 were surveyed at four large regional centres in China between 2016 and 2020: The Third Affiliated Hospital of Sun Yat-Sen University, The Second Affiliated Hospital of Guangzhou Medical University, West China Hospital and Huashan Hospital. Questionnaires covering the family history of NMOSD were completed with the patients’ permission. Family members were contacted and asked to fill out the questionnaires with appropriate consent. A total of 22 patients with familial NMOSD from 10 different families were enrolled, and 13 of them, together with 13 healthy family members, underwent WES for the genetic analysis.

Additionally, we retrieved the clinical data (459 patients) and WES data (228 of the 459 patients and 1 400 matched healthy controls) from our previous study.19 Information on all patients is listed in online supplemental table S1. Written informed consent was obtained from each individual. For patients who were younger than 18 years old or deceased, consent was obtained from the parents or first-degree relatives.

Table 1

Clinical characteristics of familial NMOSD

Figure 2

Pedigrees for families (A–J) with neuromyelitis optica spectrum disorders (NMOSD). Individuals who underwent exome sequencing are marked by *, and genotypes for single-nucleotide polymorphism (SNP) rs2252257 in USP18 are shown below the numbers.

Clinical assessment

Demographic information, disease duration, onset age, episode types and annualised relapse rate (ARR) were retrieved from medical records. Expanded disability status scale (EDSS) scores of each episode were evaluated by more than one experienced neurologist. AQP4-IgG detection was performed using a cell-based assay (EUROIMMUN Medizinische Labordiagnostika, Lübeck, Germany).

DNA Samples

Genomic DNA was extracted from peripheral blood using the Qiagen Blood and Tissue Kit (Qiagen, Germantown). Genomic DNA was randomly sheared into 200–250 bp fragments by sonication (Covaris, Woburn), and whole-exome DNA fragments were then purified and captured using the SureSelect Human All Exon Kit (Agilent Technologies, Santa Clara) according to the manufacturer’s protocol. The products were sequenced on a HiSeq 1500 platform in 150 bp paired-end mode following the manufacturer’s instructions.

Variant detection

For family-based samples, the quality of raw data was assessed with FastQC,20 and low-quality reads and adaptors were trimmed using Trim Galore.21 Burrows-Wheeler Aligner22 was employed to align the clean reads against the human reference genome GRCh38 (hg38). Duplicate reads were marked, and records were sorted with the tools in the Genome Analysis Toolkit (GATK).23 The Base Quality Score Recalibration process was used to adjust the base quality scores of each sample. Germline SNPs and insertions/deletions (indels) were called following the Genomic Variant Call Format workflow. For the raw SNPs/indels in Variant Call Format, the classic GATK best practices pipeline was applied, and the Variant Quality Score Recalibration process was used to select variants. Low-quality and ambiguous reads were further filtered out via stringent quality control as described in reference 24. The variant tool set (vt)25 was used to decompose multiallelic variants into biallelic variants, which were left-normalised for further analysis. Duplicate variants were removed, and qualified variants were combined into a single file. For population-based samples, variant calling had been performed in reference 19, and the filtering standards were the same except for the following: (1) call rate <0.6 and (2) mapping quality <20.
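The decomposition and duplicate-removal steps can be illustrated with a minimal sketch. This is not vt's implementation: it works on site-level records only, does not rewrite genotype fields, and does not perform left-normalisation (which needs the reference sequence). The `Site` type, the `decompose` helper and the input records are hypothetical.

```python
from typing import NamedTuple


class Site(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str  # comma-separated when multiallelic, as in a VCF ALT column


def decompose(sites):
    """Split multiallelic sites into biallelic ones and drop exact duplicates."""
    seen = set()
    out = []
    for site in sites:
        for alt in site.alt.split(","):
            record = Site(site.chrom, site.pos, site.ref, alt)
            if record not in seen:  # keep only the first occurrence
                seen.add(record)
                out.append(record)
    return out


# Hypothetical input records, not taken from the study data.
sites = [
    Site("chr22", 1000, "G", "A,T"),  # multiallelic: becomes G>A and G>T
    Site("chr22", 1000, "G", "A"),    # duplicate of the first G>A record
    Site("chr22", 2000, "CT", "C"),   # already biallelic deletion
]
biallelic = decompose(sites)
```

Keeping the first occurrence of each record preserves input order, which matters if the output is later merged into a sorted file.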

Genetic linkage and association analysis

Pedigree files were created using PLINK26 for each of the seven families (IDs A–G in figure 2) and for all seven families as a whole. Individuals and variants with Mendel error rates larger than 5% and 10%, respectively, were excluded. Variants with p<0.001 in the Hardy-Weinberg equilibrium (HWE) test were removed. Combined linkage and association analyses were performed with Pseudomarker,27 using tests for linkage without linkage disequilibrium (Linkage|NoLD) and allowing for LD (Linkage|LD), for association without linkage (LD|NoLinkage) and allowing for linkage (LD|Linkage), and for both linkage and LD (LD+Linkage) under dominant and recessive modes, as well as affected-only (D1, R1), incomplete penetrance (D2, R2) and full penetrance (D3, R3) models. Morton’s test (1978) was applied to the linkage heterogeneity analysis.28 Fisher’s exact test in R was used to estimate the p value in the association analysis. The OR of an alternative allele was estimated with 95% CI. Furthermore, the Holm test (1979) and the Benjamini and Hochberg (BH) test (1995) were used for p value corrections.
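The two p value corrections named above can be sketched in a few lines. The following is a minimal pure-Python illustration of the Holm step-down and BH step-up adjustments, not the authors' code; the function names and the input p values are hypothetical.

```python
def holm(pvals):
    """Holm (1979) step-down adjusted p values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # ascending p
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        value = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, value)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg (1995) step-up adjusted p values, in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)  # descending p
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank of this p value in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p values
holm_adj = holm(pvals)
bh_adj = benjamini_hochberg(pvals)
```

Holm controls the family-wise error rate and is therefore the more conservative of the two; BH controls the false discovery rate, so a variant can pass BH while failing Holm.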

Variant effect prediction

Potential regulatory variants in LD were predicted using HaploReg29 against the following data sets: ChromHMM states from both the core 15-state model and the 25-state model with imputed marks, histone modification regions (H3K27ac, H3K9ac, H3K9me1 and H3K9me3) and DNase hypersensitivity data (DNase) from the Roadmap Epigenomics project. The Encyclopedia of DNA Elements (ENCODE) project provides data sets to assess the protein-binding site and regulatory motif affected by a variant in a non-coding region. Expression quantitative trait loci (eQTL)30 analysis of a variant for different tissues was queried using the Genotype-Tissue Expression portal.31

Clinical statistics

Demographic data for the cases and controls are presented as the ratio or median±SD. Differences between values representing clinical features in different groups were analysed by an independent t test, Mann-Whitney U test, one-way Analysis of Variance (ANOVA) and Scheffe post hoc comparison; p<0.05 (two-tailed) was considered statistically significant. The statistical analyses were performed using SPSS v25 statistics.

Results

Demographic and clinical features of familial NMOSD

We identified 10 different families with NMOSD, comprising 22 patients, in our study (figure 2). The familial occurrence of NMOSD in Chinese patients was 0.87% (22/2 520), based on a total of 2 520 consecutive patients recruited from four regional medical centres (700, 700, 300 and 820 patients, respectively). The separate familial occurrences were 1.71%, 0.57%, 1.00% and 0.37%, respectively, for patients from The Third Affiliated Hospital of Sun Yat-sen University (12/700, IDs. B, C, D, G, H and I), West China Hospital of Sichuan University (4/700, IDs. A and E), The Second Affiliated Hospital of Guangzhou Medical University (3/300, ID. F) and Huashan Hospital (3/820, ID. J). All occurrences were lower than the estimated values of 3% for polyethnic people (including Asian, Latino, White and African people)4 and 2.8% for Brazilians.5
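The occurrence figures above follow directly from the reported counts. A short sketch of the arithmetic, using only the numbers given in the text (the variable names are ours):

```python
# (familial patients, consecutive patients recruited) per centre, as reported.
centres = {
    "Third Affiliated Hospital of Sun Yat-sen University": (12, 700),
    "West China Hospital of Sichuan University": (4, 700),
    "Second Affiliated Hospital of Guangzhou Medical University": (3, 300),
    "Huashan Hospital": (3, 820),
}

# Familial occurrence per centre, in percent, rounded to two decimals.
rates = {name: round(100 * fam / total, 2) for name, (fam, total) in centres.items()}

total_familial = sum(fam for fam, _ in centres.values())
total_patients = sum(total for _, total in centres.values())
overall = round(100 * total_familial / total_patients, 2)
```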

As shown in figure 2, NMOSD across two generations was found in four families (IDs. A, B, C and H), whereas in the others only one generation was affected. The patients were related as either mother–child or siblings. Two families (IDs. F and J) had three patients each, and the remaining families had two each.

All patients with familial NMOSD were seropositive for AQP4-IgG. The important clinical features, including onset age, types of episodes, highest EDSS score (H-EDSS) in the course of disease, latest score (L-EDSS) at sample collection and ARR, are summarised in table 1. The difference in onset age within mother–child pairs (20.00±8.76 years) was significantly larger than that within sibling pairs (6.50±6.52 years, p=0.013). This was consistent with the findings that onset ages were different in mother–daughter pairs and similar in sister–sister pairs.9 The onset age of mothers (47.25±14.93 years) was older than that of their children (27.25±7.89 years), but the difference was not significant (p=0.056). Furthermore, the onset ages of mothers, children and siblings (47.25±14.93 years, 27.25±7.89 years and 34.00±13.73 years, respectively) did not differ significantly from those of the patients with sporadic NMOSD (36.82±14.70 years, p=0.16, 0.092 and 0.48, respectively).

NMOSD family members tended to have the same type of episodes. Intrafamilial concordance for the initial syndrome was found in six families (IDs. A, C, E, H, I and J). The concordant syndrome was ON in families A, C and I, acute myelitis (M) in families E and J, and area postrema (AP) syndrome in family H. Across all families, the initial syndrome was M in nine patients (40.91%), ON in eight (36.36%), AP in four (18.18%) and diencephalic syndrome (D) in one (4.55%). Therefore, ON and M were the more common initial syndromes in familial NMOSD.

Familial NMOSD similar to sporadic NMOSD in clinical features

The clinical features of patients with sporadic and familial NMOSD (459 vs 22) are compared in table 2. Familial NMOSD was indistinguishable from sporadic NMOSD in sex ratio, onset age, type of first episode, number of episodes, L-EDSS and ARR. A previous study also reported the similarity in clinical features between familial and sporadic NMOSD.4 For H-EDSS, however, the score was 6.23±2.50 for the familial patients, which was significantly higher than the 4.97±2.07 for the sporadic patients (p=0.03). This result indicated that familial patients may suffer more severe episodes than sporadic patients.
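As a rough plausibility check on the H-EDSS comparison, the reported summary statistics can be fed into a Welch (unequal-variance) t statistic. This is only a sketch under stated assumptions: the text does not say which test produced p=0.03, and the calculation treats the reported values as mean±SD. The `welch_t` helper is hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Familial (n=22) vs sporadic (n=459) H-EDSS, values as reported in the text.
t_stat, df = welch_t(6.23, 2.50, 22, 4.97, 2.07, 459)
```

Under these assumptions the statistic is about 2.33 on roughly 22 degrees of freedom, which lies between the two-sided 0.05 and 0.02 critical values for 22 degrees of freedom (2.074 and 2.508) and is therefore in line with the reported p=0.03.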

Table 2

Comparison of clinical characteristics between sporadic and familial NMOSD

USP18 genetic variants affected NMOSD

The genetic variants for the susceptibility to familial NMOSD were selected using WES data of 13 patients and 13 healthy members in seven families (figures 1 and 2). After a standard data preprocessing pipeline, we detected 5 689 974 raw variants by joint calling across all clean data; and 94 246 out of 5 021 322 qualified variants were normalised and combined with the pedigree information. We found 26 individuals and 74 262 variants suitable for the family-based statistical tests following Mendel error and HWE analysis.
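The HWE filter (variants with p<0.001 removed) can be illustrated with a one-degree-of-freedom chi-square approximation. This is a sketch only: the test actually used by the authors' tooling may be an exact test, and the genotype counts below are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square HWE test (1 df) from genotype counts; returns (chi2, p)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic site: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts: one site in equilibrium, one with a heterozygote deficit.
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)
chi2_bad, p_bad = hwe_chi_square(40, 20, 40)
keep_ok = p_ok >= 0.001    # retained by the p<0.001 filter
keep_bad = p_bad >= 0.001  # removed by the p<0.001 filter
```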

A total of 25 SNPs with the minimal p values and the corresponding maximal logarithm of the odds (LOD) scores of linkage and association tests under dominant and recessive modes, as well as affected-only, incomplete penetrance and full penetrance models, are listed in online supplemental table S2. Among them, one potential pathogenetic SNP, rs2252257 (G>A) in USP18 on chromosome 22, was identified to be related to familial NMOSD in the Linkage|LD test and R1 model (p=7.8E-05, LOD=3.1, table 3A). The frequency of the alternative allele of SNP rs2252257 was as high as 80.77% in familial NMOSD, compared with 66.45% in sporadic NMOSD (table 3B), and relatively lower frequencies, 57.69% and 52.05%, were found in healthy family members and healthy controls, respectively. In the East Asian populations of the 1000 Genomes Project and the Genome Aggregation Database (gnomAD), the alternative allele frequencies were 63.29% and 64.82%, respectively.
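The reported LOD score and p value can be related through the usual asymptotic conversion, in which 2 ln(10) × LOD is treated as a chi-square statistic on one degree of freedom and tested one-sidedly. Whether Pseudomarker derived its p value this way is an assumption here; the sketch only shows that the two reported numbers are mutually consistent under that convention.

```python
import math


def lod_to_p(lod):
    """One-sided p value for a LOD score via chi2 = 2 * ln(10) * LOD (1 df)."""
    chi2 = 2 * math.log(10) * lod
    # Half the two-sided chi-square (1 df) tail probability.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2))


lod = 3.1                # reported for rs2252257 (Linkage|LD, R1 model)
p_value = lod_to_p(lod)  # close to the reported 7.8E-05
odds = 10 ** lod         # likelihood ratio implied by the LOD score
```

A LOD of 3.1 corresponds to a likelihood ratio of roughly 1 259 to 1; the small gap between the computed p value and the reported one is what rounding the LOD to one decimal would be expected to produce.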

Table 3

Information on significant variants related to NMOSD

The variant SNP rs2252257 (G>A) showed a recessive inheritance pattern in families B, E, F and G, where only members with the A/A genotype developed NMOSD (figure 2), indicating an important contribution of the genetic variant to the disease. However, members carrying the G/G or G/A genotype developed NMOSD in families A and C, and members with the A/A genotype were healthy in families C and D; therefore, additional biological and environmental risk factors cannot be ignored.

The analysis above was based on the genetic data from all seven families. Considering the heterogeneous background of different families, we analysed the linkage of SNPs to only one or some families. The linkage tests for pathogenic variants on chromosome 22 under the R1 model were performed for an individual family, but the samples were too small to provide any meaningful results (online supplemental table S3). Morton’s heterogeneity test for all families showed no significant variants related to one or more particular families (online supplemental table S4).

To verify the homogeneous role of USP18 in the pathogenesis of sporadic NMOSD, we retrieved the variant data of the USP18 gene in 228 cases and 1 400 controls (figure 1). SNPs rs361553 and rs5746523 were associated with sporadic NMOSD with p=1.29E-10 and 2.01E-09, respectively, and were predicted to increase the risk with OR=2.49 (1.89–3.27, 95% CI) and 2.30 (1.75–3.00, 95% CI) in Fisher’s exact test (table 3B). The adjusted p values were still significant in the more conservative Holm test. SNP rs2252257 was associated with the disease in the BH test (p=5.93E-06), and eight other variants with BH p<0.05 are listed in online supplemental table S5. The results confirmed that SNP variants in USP18 contributed to the pathogenesis of NMOSD.
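For readers unfamiliar with the association statistics quoted above, the sketch below computes a two-sided Fisher exact p value and an odds ratio with a Wald (Woolf) 95% CI for a 2×2 allele-count table. It is an illustration, not the authors' R code: R's `fisher.test` reports a conditional maximum-likelihood OR, which differs slightly from the sample OR used here, and the counts are hypothetical because the underlying allele counts are not given in the text.

```python
import math


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Sum all tables at least as extreme (no more probable) than the observed one.
    tail = sum(w for w in (weight(x) for x in range(lo, hi + 1)) if w <= observed)
    return tail / math.comb(n, col1)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Sample odds ratio with a Wald CI on the log scale (all cells > 0)."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, math.exp(math.log(or_) - z * se), math.exp(math.log(or_) + z * se)


# Hypothetical counts: rows = cases/controls, columns = alternative/reference allele.
p_value = fisher_two_sided(3, 1, 1, 3)
or_, ci_low, ci_high = odds_ratio_ci(3, 1, 1, 3)
```

Integer binomial coefficients are compared directly so that the "at least as extreme" rule is not affected by floating-point rounding.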

Intronic variants impaired the regulation function in USP18

The possible effects of SNP variants on the USP18 genetic function were predicted. As shown in figure 3A, the USP18 gene consists of 11 exons and 10 introns. SNP rs2252257 is located in intron 1, only 25 bp upstream of exon 2; SNP rs361553 is also in intron 1, and SNP rs5746523 is in intron 2. Based on the predictions of regulatory variants against the ChromHMM states, histone modification regions and DNase hypersensitivity domains, SNP rs2252257 overlapped with the promoter and enhancer regions of the USP18 gene in brain and immune T cells (figure 3B), indicating that it played an important role in the regulatory function of the gene. It also occupied the binding sites of transcription factors (Gcm1 and STAT) and proteins (CFOS, GATA2, STAT3 and CTCF) according to ENCODE data sets. The activation of the Janus kinase/signal transducer and activator of transcription 3 (JAK/STAT3) inflammatory response was responsible for NMOSD pathogenesis in Chinese patients,32 which supported the negative effect of the variant on the genetic function in our study. SNPs rs361553 and rs5746523 were found to be in LD with rs2252257 (r2=0.89 and 0.94, respectively, online supplemental table S6) and, therefore, displayed analogous effects, although they did not overlap with the regulatory regions.

Figure 3

Structure and function of single-nucleotide polymorphism (SNP) variants in USP18. (A) Locations of SNPs rs361553, rs2252257 and rs5746523 in USP18; a reference allele is coloured in black and an alternative allele in red, adjacent to a functional exon with a red shadow. (B) Overlap of SNP rs2252257 with promoter and enhancer predicted against chromatin, histone and DNase, and transcription factor motif and protein binding site in brain and peripheral blood. (C) Tissue-specific expression quantitative trait loci (eQTL) analysis of SNP rs2252257 with P-M plot in brain and nerve tissues, where no effect, a possible effect and a significant effect on gene expression exist in regions of 0 ≤ m ≤ 0.1 (grey), 0.1 < m < 0.9 (light blue) and 0.9 ≤ m ≤ 1 (blue), respectively.

Tissue-specific eQTL analysis in figure 3C shows that SNP rs2252257 possibly changed the expression level of the USP18 gene in brain and nerve tissues, such as hypothalamus (p=0.02, m=0.645), hippocampus (p=0.20, m=0.546), amygdala (p=0.50, m=0.213) and tibial nerve (p=0.0017, m=0.546). Clinically, a lesion in the hypothalamus is considered to be one of the ‘core characteristics’ of NMOSD, to which a change in USP18 expression may have contributed. A truly significant effect is expected if more data are available. A similar effect of SNP rs361553 on the gene expression is displayed in online supplemental figure S1.
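The m-value bands given in the figure 3 caption can be applied to the four tissues quoted above. The sketch below uses only the thresholds and m values stated in the text; the `classify_m` helper is hypothetical.

```python
def classify_m(m):
    """Map an eQTL m value to the effect bands defined in the figure 3 caption."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("m value must lie in [0, 1]")
    if m <= 0.1:
        return "no effect"
    if m < 0.9:
        return "possible effect"
    return "significant effect"


# m values for rs2252257 as reported in the text.
tissues = {
    "hypothalamus": 0.645,
    "hippocampus": 0.546,
    "amygdala": 0.213,
    "tibial nerve": 0.546,
}
calls = {tissue: classify_m(m) for tissue, m in tissues.items()}
```

All four tissues fall in the 0.1 < m < 0.9 band, which is why the text describes the effect as possible rather than significant.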

In summary, the results in figure 3 implied that the non-rare variants in the promoter, enhancer and transcription factor binding site affected the USP18 gene expression and impaired its molecular function as a negative feedback regulator in the JAK/STAT signalling pathway, which resulted in the activation of pathogenic inflammatory factor in NMOSD.

Clinical phenotypes with different variant genotypes

We examined the correlations of variant genotypes with clinical phenotypes including onset age, ARR, H-EDSS and L-EDSS in familial NMOSD. The differences between the variant genotypes A/A, G/A and G/G of SNP rs2252257 were not significant (figure 4A), probably owing to the small sample size of familial NMOSD.

Figure 4

Evaluation of clinical status based on variant genotypes. (A) Different genotypes of SNP rs2252257 (A/A, G/A, G/G) for evaluations of onset age, annualised relapse rate (ARR), highest expanded disability status scale score (H-EDSS) in the course of disease and latest score (L-EDSS) at sample collection in familial NMOSD. (B) Different genotypes of rs2252257, rs361553 (T/T, C/T, C/C) and rs5746523 (G/G, A/G, A/A) for evaluations of onset age, ARR, H-EDSS and L-EDSS in sporadic NMOSD.

For sporadic NMOSD in figure 4B, we found that the T/T genotype of rs361553 was associated with a higher ARR (1.22±0.85) than the C/T or C/C genotype (0.69±0.57 and 0.81±0.65, p=0.003 and 0.001, respectively), which indicated that the T/T genotype of rs361553 was related to a high rate of relapse in NMOSD. The differences between the variant genotypes of SNPs rs2252257 and rs5746523 (G/G, A/G and A/A) are also shown, with no significant differences in clinical phenotypes.

Discussion

The familial occurrences were different between Chinese and polyethnic populations, including Asian, Latino, White and African, which indicated that ethnic origin played a role in NMOSD. The lower diagnosis rate in China was partially due to previously poor awareness of NMOSD. In our study, NMOSD patients represented a small fraction (14.67%) of family members; only one or two generations were affected, and only mother–child and sibling pairs were identified.

Genetic studies of familial NMOSD are limited.15 In this study, we found for the first time that SNP rs2252257 in USP18 contributed to familial NMOSD by transmitting the pathogenic alternative variant across generations. Possible explanations for the violation of a recessive inheritance pattern in families A, C and D were as follows: (1) male individuals with the A/A genotype tended not to develop NMOSD, (2) some individuals were too young to present with NMOSD and (3) the variant exhibited incomplete penetrance. For sporadic NMOSD, in addition to SNP rs2252257, two SNPs, rs361553 and rs5746523, were predicted to increase the risk of the disease.

In particular, rs2252257 affected the important regulatory functions of the intronic regions in USP18, and rs2252257 and rs361553 altered the expression level of the gene. The change in the USP18 gene was related to the activation of pathogenic signals. Patients with the rs361553 T/T genotype endured a higher ARR than those with the C/T or C/C genotype, which was essentially in agreement with the predictions.

In previous studies, SNP variants in the intronic and promoter regions of the USP18 gene were significantly associated with multiple sclerosis (MS), and the gene expression level was lower in patients characterised by higher relapse rates and neurological disability scores.33 34 Considering the close relationship between MS and NMOSD, the shared pathological gene supported our findings on the biological risk factors in the disease. Furthermore, in contrast to the previous studies focusing on minor alleles, all three risk variants in USP18 are non-rare variants (minor allele frequency>0.5%). For a non-Mendelian disease like NMOSD, a major allele can be deleterious on account of gene–gene or gene–environment interactions. Overall, the genetic background of NMOSD is complex, and deeper investigation is needed.

There are several limitations of our study. First, the sample size of familial NMOSD was rather small; considering the extremely low prevalence of the disease, larger multicentre studies are needed. Second, not all familial NMOSD patients and their first-degree family members underwent WES, and the power of the genetic analysis was, therefore, restricted. Third, the therapies received by the patients affected the disease courses and severities, which may ultimately have influenced the results.

In conclusion, familial clustering of NMOSD was present in the Chinese population with a very low occurrence. Familial and sporadic NMOSD were similar in most clinical manifestations; however, familial NMOSD patients may suffer more severe episodes. SNP variants in the USP18 gene contributed to the pathogenesis of NMOSD by impairing the intronic regulatory functions, and non-rare variants have been identified as causal factors for NMOSD.
