Genetic diversity and bioinformatic analysis in the L1 gene of HPV genotypes 31, 33, and 58 circulating in women with normal cervical cytology

The average ages of women infected with HPV-58, -31, and -33 were 33.6, 29.8, and 33 years, respectively. Two HPV-58 samples and one HPV-33 sample were excluded because of unsatisfactory sequencing results; the study therefore continued with 30 HPV-58 and 10 HPV-33 samples.

Phylogenetic trees

The phylogenetic trees were constructed using the maximum likelihood method in MEGA 10 (Fig. 1). Four lineages were distinguished in HPV-58: A (13.3%, n = 4), B (63.3%, n = 19), C (6.7%, n = 2), and D (16.7%, n = 5). Six sub-lineages were identified among the HPV-58 isolates: A1 (10.0%, n = 3), A3 (3.3%, n = 1), B1 (50.0%, n = 15), B2 (13.3%, n = 4), D1 (3.3%, n = 1), and D2 (13.3%, n = 4). HPV-31 was divided into three lineages: A (61.9%, n = 13), B (14.3%, n = 3), and C (23.8%, n = 5). Each lineage fell into two sub-lineages: A1 (57.1%, n = 12), A2 (4.8%, n = 1), B1 (4.8%, n = 1), B2 (9.5%, n = 2), C1 (9.5%, n = 2), and C3 (14.3%, n = 3). All HPV-33 isolates (100.0%, n = 10) belonged to lineage A (50.0% A1 and 50.0% A2).

Fig. 1

Phylogenetic trees of HPV genotypes 58 (A), 31 (B), and 33 (C) based on alignments of the L1 genes. The trees were constructed in MEGA 10 using the maximum likelihood method with the Kimura 2-parameter model and 1000 bootstrap replicates; bootstrap values greater than 70% are shown above the branches. Isolates from this study are marked with small black circles; the remaining accession numbers are HPV-31, -33, and -58 sequences available in GenBank, used as references. All sequences from this study are available in the NCBI database under GenBank accession numbers OQ412837-82, MT267729, MZ221065-73, and MZ221053-57
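The caption names the Kimura 2-parameter model, which treats transitions (purine to purine or pyrimidine to pyrimidine) and transversions as occurring at different rates. As a rough illustration of that idea only, the pairwise distance form of the model can be sketched as below. This is not the maximum likelihood tree search that MEGA performs; the function name `k2p_distance` and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q)
    where P is the proportion of transitions and Q the proportion of
    transversions among the compared sites.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (an assumption of this sketch)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


if __name__ == "__main__":
    # Toy sequences, not HPV data: one transition (A<->G) in four sites.
    print(round(k2p_distance("AAAA", "AAAG"), 4))
```

Counting the two kinds of substitution separately is the design choice that distinguishes this model from a single-rate correction.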

Amino acid and genetic variability

HPV-31

Nucleotide sequence analysis of the L1 gene of HPV-31 showed 34 changes (Additional file 1: Table S1). The most frequent mutation, seen in all isolates, was C1266A, followed by C186T in 61.9% (n = 13) of isolates. The third most frequent mutations were T370C, C821A, and G1245A, each found in 38.1% (n = 8) of samples. Eight mutations were found in all lineage C isolates: T30A, A468G, G777A, A799G, T1035G, G1245A, C534T, and C821A. Of these, A799G and C821A led to the missense mutations T267A and T274N, respectively (Table 2).

Table 2 Amino acid mutations in all HPV-58, -31, and -33 isolates

HPV-33

Compared with the reference gene, the L1 sequences of the HPV-33 isolates showed nine variations (Additional file 1: Table S2). Among these, C797A occurred in 80.0% (n = 8) of isolates, A1071G in 70.0% (n = 7), and C167A, G397A, and T885G each in 50.0% (n = 5). Table 2 lists all amino acid changes relative to the reference. The three replacements T266K, T56N, and G133S were the most frequent amino acid changes.

HPV-58

Forty-seven nucleotide changes and 19 nonsynonymous mutations were observed in the L1 gene of the HPV-58 isolates (Additional file 1: Table S3). The major substitutions were G430A, G1258A, C1263A, and A1264G, each present in 86.7% (n = 26) of isolates, and G840A, present in 83.3% (n = 25). Among these variations, G430A, G1258A, and A1264G led to the missense changes V144I, D420N, and N422D, respectively, which were the predominant amino acid changes. Other variations are listed in Table 2.
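The nucleotide and amino acid labels above are linked by simple arithmetic: a 1-based nucleotide position n in the coding sequence falls in codon ceil(n / 3). The sketch below checks that relationship for the three HPV-58 pairs stated in the text. It assumes that nucleotide positions are numbered from the first base of the L1 open reading frame; the helper names are hypothetical.

```python
import re


def codon_number(nt_position: int) -> int:
    """Return the 1-based codon that contains a 1-based nucleotide position."""
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    return (nt_position - 1) // 3 + 1


def parse_change(label: str) -> tuple[str, int, str]:
    """Split a label such as 'G430A' or 'V144I' into (from, position, to)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"unrecognised change label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def positions_agree(nt_change: str, aa_change: str) -> bool:
    """True if the nucleotide change lies in the codon named by the amino acid change."""
    _, nt_pos, _ = parse_change(nt_change)
    _, aa_pos, _ = parse_change(aa_change)
    return codon_number(nt_pos) == aa_pos


if __name__ == "__main__":
    # Pairs as reported for the HPV-58 isolates.
    for nt, aa in [("G430A", "V144I"), ("G1258A", "D420N"), ("A1264G", "N422D")]:
        print(nt, aa, positions_agree(nt, aa))
```

A check like this confirms only that the positions line up; whether the change is synonymous or missense still depends on the codon sequence itself.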

N-glycosylation analysis

N- and O-linked glycosylation are the two main types of glycosylation, in which glycans attach to the side chains of asparagine and, mainly, serine/threonine, respectively. N-glycosylation is a post-translational modification involved in many biological processes, such as protein folding, stability, and host cell membrane-ligand interactions [15]. Compared with the references, none of the mutations in the studied sequences changed the predicted N-glycosylation sites (Table 3). Although the N-glycosylated sites did not differ between the references and the HPV-58 and -31 isolates, in the HPV-33 reference the asparagine at position 54, which is predicted as an N-glycosylation site, is immediately followed by a proline (Table 3). Because, in most cases, a proline directly after asparagine may inhibit N-glycosylation by rendering the asparagine inaccessible, more experimental evidence is needed to confirm this N-linked glycosylation site. In HPV-33 isolates 99–1217, 99–1216, 99–1215, 99–1211, 99–1210, and 98–274, threonine was replaced by asparagine at position 56, which was the only difference between these isolates and the reference at the N-glycosylation site.
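The proline caveat above can be made concrete with a motif scan. The sketch below assumes the standard N-X-S/T sequon (asparagine, any residue, then serine or threonine) and shows how excluding proline at the X position changes which asparagines are reported. It is a simple pattern match, not the prediction method used in this study, and the function name and toy peptide are hypothetical.

```python
import re


def find_sequons(protein: str, allow_proline: bool = False) -> list[int]:
    """Return 1-based positions of asparagines in an N-X-[S/T] motif.

    By default X may be any residue except proline; set allow_proline=True
    to also report N-P-[S/T] motifs, which are usually considered
    unfavourable for N-glycosylation.
    """
    middle = "." if allow_proline else "[^P]"
    # The lookahead lets overlapping motifs be reported independently.
    pattern = re.compile(rf"N(?={middle}[ST])")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "ANPSANGTA"  # toy peptide, not an HPV L1 sequence
    print(find_sequons(toy))                      # strict sequon
    print(find_sequons(toy, allow_proline=True))  # also report N-P-S/T
```

Comparing the two result lists for a reference sequence is one way to flag sites, like the one at position 54, that deserve experimental confirmation.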

Table 3 Prediction of N-glycosylation sites

Selective analysis

During natural selection, some variants become established in a population while others are eliminated. In positive selection, variants that confer a fitness advantage are fixed in the population, whereas in negative selection, variants with a deleterious effect on fitness are gradually removed [16]. The Datamonkey server estimates the synonymous (alpha) and nonsynonymous (beta) substitution rates codon by codon. A nonsynonymous replacement with a negligible effect on fitness is classified as neutral, whereas a variation that increases fitness is considered to be under positive selection. A purifying substitution is interpreted as negative selection, meaning that the mutation may gradually be removed from the genome. Tables 4, 5, and 6 show the codon-by-codon results of the FEL analysis for HPV genotypes 31, 33, and 58. Among the 85 nucleotide variations under different selective pressures, 43.5% (n = 37) were under negative selection and 55.3% (n = 47) were neutral. Negative (purifying) selection was observed at 17, 17, and 3 codons in the HPV-58, -31, and -33 sequences, respectively. Only one site under positive (diversifying) selection was found, at position 150 in the HPV-58 sequences. All neutral variations for these HPV genotypes are reported in Additional file 1: Table S4a–c.
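The three categories described above follow from comparing the per-codon alpha and beta estimates and a significance test. A minimal sketch of that decision rule is given below. The p-value threshold of 0.1 is an assumption (the text does not state the cutoff used), and the `FelSite` records hold made-up values for illustration only, not results from Tables 4 to 6.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FelSite:
    codon: int      # 1-based codon position
    alpha: float    # synonymous substitution rate
    beta: float     # nonsynonymous substitution rate
    p_value: float  # test of alpha == beta at this codon


def classify_site(site: FelSite, p_threshold: float = 0.1) -> str:
    """Label a codon as 'positive', 'negative', or 'neutral'.

    p_threshold is an assumed cutoff, not a value taken from the study.
    """
    if site.p_value > p_threshold:
        return "neutral"   # rates not significantly different
    if site.beta > site.alpha:
        return "positive"  # diversifying selection
    if site.beta < site.alpha:
        return "negative"  # purifying selection
    return "neutral"


if __name__ == "__main__":
    # Hypothetical sites with invented rates.
    sites = [
        FelSite(codon=1, alpha=0.0, beta=2.5, p_value=0.04),
        FelSite(codon=2, alpha=3.1, beta=0.0, p_value=0.01),
        FelSite(codon=3, alpha=1.0, beta=1.2, p_value=0.60),
    ]
    print(Counter(classify_site(s) for s in sites))
```

Tallying the labels per genotype in this way gives counts of the kind reported above, such as the number of codons under purifying selection.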

Table 4 The codon-by-codon results from the FEL analysis for HPV-58 isolates

Table 5 The codon-by-codon results from the FEL analysis for HPV-33 isolates

Table 6 The codon-by-codon results from the FEL analysis for HPV-31 isolates

Homology analysis on L1 loops

Based on the HPV-16 major capsid protein, the pentameric L1 protein has five external surface loops, designated the BC, DE, EF, FG, and HI loops [17]. The structure of the HPV-16 L1 protein was compared with each of the HPV-31, HPV-33, and HPV-58 references using homology modeling to predict changes in the L1 loops (Fig. 2). Among the missense variations in the HPV-31 samples, T267A and T274N were in the FG loop and S67L was in the BC loop. Of the four nonsynonymous replacements in the HPV-33 isolates, three (T56N, G133S, and T266K) were located in the BC, DE, and FG loops, respectively. Among the nineteen changes in the HPV-58 amino acid sequences, eleven replacements were located in the DE, FG, and HI loops: V144I, L150F, S159G, and P163T in the DE loop; K292T, A296P, D299N, and V311A/G in the FG loop; and T375N, G378D, and D383N in the HI loop. The remaining mutations occurred outside the loops.

Fig. 2

The L1 gene in the reference genomes of HPV-58, -33, and -31 compared with that of HPV-16. Red = BC loop, pink = DE loop, blue = EF loop, green = FG loop, orange = HI loop. The mutations in the isolates under study are marked with brown boxes
