Detection of human noroviruses in sewage by next generation sequencing in Shandong Province, 2019–2021

Quantification of norovirus genomes in wastewater

All 36 sewage samples collected from January 2019 to December 2021 tested positive for both GI and GII NoV RNA by qPCR. Concentrations (genomic copies L⁻¹) of norovirus GI ranged from 7.37 × 10² in January 2019 to 1.29 × 10⁷ in August 2019, and those of GII ranged from 2.38 × 10⁵ in June 2021 to 1.83 × 10⁷ in May 2021 (Fig. 1A). The average viral concentrations of NoV GI and GII were compared using a t-test, which revealed a significant difference (F = 7.366, P < 0.01): the logarithm of the geometric mean of GI concentrations was significantly lower than that of GII (Fig. 1B). The peak concentrations of NoV GI and GII were both observed in August (1.06 × 10⁶ and 3.69 × 10⁶, respectively), and the minimum concentrations were both recorded in February (8.60 × 10³ and 3.05 × 10⁵, respectively). The concentrations of GI and GII did not differ significantly between the periods before and during the COVID-19 pandemic (GI: F = 4.12, P = 0.05; GII: F = 2.19, P = 0.148; t-test).
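The comparison above rests on log-transformed concentrations. As a minimal sketch (not the authors' analysis script), the snippet below illustrates why the logarithm of a geometric mean can be obtained as the arithmetic mean of the log10 values. The two inputs are simply the GI minimum and maximum quoted above, used only as example numbers; the variable names are illustrative.

```python
from math import log10
from statistics import geometric_mean, mean

# Example inputs only: the GI minimum and maximum quoted in the text
# (genomic copies per liter). A real analysis would use all 36 samples.
gi_copies_per_l = [7.37e2, 1.29e7]

# log10 of each concentration, as plotted on a log scale.
log_values = [log10(c) for c in gi_copies_per_l]

# Two routes to the same quantity.
log_of_geometric_mean = log10(geometric_mean(gi_copies_per_l))
mean_of_logs = mean(log_values)

# The two values agree up to floating-point error.
print(round(log_of_geometric_mean, 3), round(mean_of_logs, 3))
```

Working on the log scale keeps a few very high samples from dominating the average, which is presumably why log values are shown in Fig. 1.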

Fig. 1

Genome concentrations of norovirus GI and GII in sewage. Panel A displays the log values of genome concentrations of norovirus GI and GII in sewage for each month from 2019 to 2021. Panel B displays the log values of genome concentrations of GI and GII by month.

As an internal quality control, PMMoV tested positive in all samples, with concentrations ranging from 1.1 × 10⁷ to 1.0 × 10⁹ genomic copies L⁻¹. These relatively stable and high values suggest that the sewage samples were concentrated with high efficiency.

Partial VP1 gene amplification and NGS data

The VP1 nested RT-PCR assay yielded a 100.0% (36/36) positivity rate for GII and a 94.4% (34/36) positivity rate for GI. The two GI-negative results came from sewage samples collected in February and November 2020. The clean NGS data from the 70 amplicons were analyzed independently using CLC Genomics Workbench. The number of NGS reads and NoV reads produced from each amplicon is detailed in Supplementary Table 1. In total, 722,234,524 NGS reads were generated from the 34 GI amplicons, of which 721,931,942 (99.96%) were identified as NoV. Similarly, 733,146,126 NGS reads were generated from the 36 GII amplicons, of which 722,743,097 (98.58%) were classified as NoV. The total filtered reads per amplicon ranged from 5,884,219 to 31,363,014, with 78.30% to 100.00% of these reads mapped to NoV.

In total, three genogroups (GI, GII, and GIX) were detected across the 36 sewage samples. For the 34 amplicons generated using the GI primer set, 721,931,942 reads were mapped to NoV, of which 99.99% (721,919,682/721,931,942) were identified as GI and less than 0.01% (12,260/721,931,942) as GII. For the 36 amplicons generated using the GII primer set, 722,743,097 reads were mapped to NoV, of which 89.03% (643,486,469/722,743,097) were identified as GII, while 0.51% (3,694,068/722,743,097) and 10.46% (75,562,560/722,743,097) were identified as GI and GIX, respectively.
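The percentages in this paragraph follow directly from the quoted read counts. The short sketch below recomputes them for the GII primer set; the counts are copied from the text, while the helper name `percent` and the dictionary layout are illustrative choices rather than anything taken from the authors' pipeline.

```python
# NoV-mapped reads from the 36 GII-primer amplicons, as quoted in the text.
nov_reads_gii_primers = 722_743_097

# Reads assigned to each genogroup, as quoted in the text.
reads_by_genogroup = {
    "GII": 643_486_469,
    "GI": 3_694_068,
    "GIX": 75_562_560,
}


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


shares = {
    genogroup: percent(reads, nov_reads_gii_primers)
    for genogroup, reads in reads_by_genogroup.items()
}

# The three genogroup counts account for every NoV-mapped read.
assert sum(reads_by_genogroup.values()) == nov_reads_gii_primers

for genogroup, share in shares.items():
    print(f"{genogroup}: {share:.2f}%")
```

The same helper applied to the GI primer set (721,919,682 GI reads and 12,260 GII reads out of 721,931,942) confirms that the GII fraction is well below 0.01%.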

Genotype diversity

A total of 23 genotypes were identified in this study. Summed over all samples, the read counts per genotype ranged from 14,578 (GII.10) to 328,172,418 (GI.5). The five most abundant genotypes were GI.5, GII.2, GI.3, GII.4, and GII.17. The detection frequency of each genotype was also calculated; the five most frequently detected genotypes were GI.5 (86.11%), GII.2 (86.11%), GII.4 (63.89%), GII.17 (58.33%), and GII.13 (55.56%) (Table 1 and Fig. 2).
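Detection frequency here is most naturally read as the share of the 36 monthly samples in which a genotype was found. Under that assumption, the sketch below shows the calculation. The month counts (31, 23, 21, 20) are not stated in the text; they are back-calculated from the reported percentages and serve only to illustrate the arithmetic.

```python
TOTAL_SAMPLES = 36  # one sewage sample per month, 2019-2021

# ASSUMPTION: positive-sample counts back-calculated from the reported
# percentages; the text itself reports only the percentages.
positive_samples = {
    "GI.5": 31,
    "GII.2": 31,
    "GII.4": 23,
    "GII.17": 21,
    "GII.13": 20,
}


def detection_frequency(n_positive, n_total=TOTAL_SAMPLES):
    """Percentage of samples in which a genotype was detected."""
    return 100.0 * n_positive / n_total


frequencies = {
    genotype: round(detection_frequency(n), 2)
    for genotype, n in positive_samples.items()
}
print(frequencies)
```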

Table 1 Numbers of NGS reads, frequencies, and ranks for individual norovirus genotypes detected in sewage

Fig. 2

Norovirus genotypes detected from sewage samples in every month during 2019–2021. The color illustrates the logarithm value of the number of NGS reads belonging to different genotypes

The genotype proportion, termed G-proportion, was determined by dividing the read count of a specific genotype by the total read count of all genotypes within the same month. The G-proportion was then categorized into six groups: 0–<5%, 5–<10%, 10–<15%, 15–<20%, 20–<30%, and ≥30%. Based on this classification, the number of genotypes detected each month was enumerated (Table 2), with monthly detections ranging from 3 to 13 genotypes. The lowest number of genotypes was observed in June 2021, and the highest in September 2020. When only genotypes with a G-proportion exceeding 5% were considered, the monthly detections varied between 2 and 8 genotypes. The minimum number of such genotypes was detected in November 2020, whereas the maximum was observed in both August 2020 and April 2021. There were 23 months during which at least 5 genotypes with a G-proportion greater than 5% were detected. Among these, 7 months exhibited at least 7 genotypes with a G-proportion exceeding 5%: July 2019, September 2019, January 2020, April 2020, August 2020, September 2020, and April 2021. Annually, 20, 18, and 19 genotypes were detected in 2019, 2020, and 2021, respectively.
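The G-proportion definition and its six groups can be sketched as follows. This is a minimal illustration, not the authors' code: the read counts in `example_month` are invented, and the function names are hypothetical. Only the formula and the bin boundaries come from the text.

```python
from bisect import bisect_right
from collections import Counter

# Upper boundaries (percent) separating the six G-proportion groups.
BIN_EDGES = [5, 10, 15, 20, 30]
BIN_LABELS = ["0-<5%", "5-<10%", "10-<15%", "15-<20%", "20-<30%", ">=30%"]


def g_proportions(month_reads):
    """Map genotype -> percentage of that month's total genotype reads."""
    total = sum(month_reads.values())
    return {g: 100.0 * n / total for g, n in month_reads.items()}


def bin_label(percent):
    """Return the G-proportion group; a value on a boundary goes up."""
    return BIN_LABELS[bisect_right(BIN_EDGES, percent)]


# HYPOTHETICAL read counts for a single month (not data from the study).
example_month = {"GI.5": 600, "GII.2": 250, "GII.4": 100,
                 "GII.17": 40, "GII.13": 10}

props = g_proportions(example_month)
genotypes_per_bin = Counter(bin_label(p) for p in props.values())
n_above_5 = sum(1 for p in props.values() if p > 5)

print(props)
print(dict(genotypes_per_bin))
print(n_above_5)
```

Repeating this per month and tallying `genotypes_per_bin` would give a table with the same layout as Table 2.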

Table 2 The numbers of genotypes in different G-proportion groups, by year and by month*

To further investigate the predominant genotypes during the study period, we identified genotypes with a G-proportion exceeding 15% and recorded the number of months in which these genotypes were detected (Supplementary Table 2). The findings indicated that five genotypes—GI.3, GI.5, GII.2, GII.4, and GII.17—exhibited a G-proportion greater than 15% and were detected in more than five months. Notably, GI.3, GI.5, and GII.2 were detected in over ten months.

We analyzed the G-proportion of these five main genotypes across different months (Fig. 3). A long-term competitive relationship was observed between GI.5 and GII.2, with these two genotypes displaying opposite trends in more than 20 out of the 36 months, as their G-proportions alternated in dominance. During the period of 2019–2020, GII.17 emerged as one of the predominant genotypes. However, from June to August 2020, its detection rate experienced a marked decline, and since September 2020, GII.17 has been classified as a minor genotype. In contrast, the detection pattern of GI.3 exhibited a pulse-like trend from September 2019 to June 2021, indicating the occurrence of intermittent small-scale outbreaks during this timeframe. Although the G-proportions of GII.4 were lower than those of GII.2, GII.4 remained consistently present from 2019 to 2021.

Fig. 3

The G-proportion of 5 predominant genotypes in each month from 2019 to 2021. Panel A displays GI.3 and GI.5. Panel B displays GII.2, GII.4, and GII.17

Monthly distribution of common genotypes

We independently computed the read counts for each season during the years 2019, 2020, and 2021. We then determined the monthly proportion (denoted as M-proportion) of a specific genotype by dividing its read count for a given month by the total annual read count of that genotype within the same year. The M-proportion was categorized into six groups: 0–<5%, 5–<10%, 10–<15%, 15–<20%, 20–<30%, and ≥30%. Based on this classification, we quantified the number of genotypes detected each month. We focused on genotypes that were detectable in the majority of months and, for these genotypes, counted the number of months in which the M-proportion exceeded 15%. Fourteen genotypes (GI.5, GII.2, GI.3, GII.4, GII.17, GII.3, GIX.1, GII.13, GI.2, GI.1, GI.8, GI.4, GI.9, and GI.6) were consistently detected in most months with an M-proportion greater than 15%. Further analysis on a monthly basis revealed several instances of notably higher M-proportions: November 2019 for GI.3, July 2020 for GII.4, May 2021 for GI.3, June 2021 for GII.2, and October 2021 for GII.17. We then compared the M-proportions statistically across seasons for each genotype, but no significant differences were observed.
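Unlike the G-proportion, which normalizes within a month across genotypes, the M-proportion normalizes within a genotype across the months of one year. The sketch below illustrates that definition under stated assumptions: the read counts are invented, and the `(year, month)` key layout and function name are hypothetical.

```python
from collections import defaultdict


def m_proportions(monthly_reads):
    """Compute M-proportions for ONE genotype.

    monthly_reads maps (year, month) -> read count. Each month's count is
    divided by that genotype's total reads in the same year.
    """
    annual_totals = defaultdict(int)
    for (year, _month), reads in monthly_reads.items():
        annual_totals[year] += reads
    return {
        (year, month): 100.0 * reads / annual_totals[year]
        for (year, month), reads in monthly_reads.items()
        if annual_totals[year] > 0  # skip years with no reads at all
    }


# HYPOTHETICAL read counts for a single genotype (not data from the study).
example_reads = {
    (2019, 1): 100, (2019, 2): 400, (2019, 3): 500,
    (2020, 1): 50, (2020, 2): 150,
}

example_m = m_proportions(example_reads)

# Months in which this genotype's M-proportion exceeds the 15% threshold.
months_above_15 = sorted(k for k, p in example_m.items() if p > 15)
print(example_m)
print(months_above_15)
```

Because each year is normalized separately, the M-proportions of a genotype sum to 100% within every year in which it was detected.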

Homologous comparison and phylogenetic analysis

Homologous comparison was conducted on the partial VP1 sequences generated from resequencing analysis (n = 379). Nucleotide similarities and the coefficient of variation (CV) of these similarities were calculated. Reporting both the mean and the CV of similarity reflects the average similarity and its dispersion among indigenous viruses more precisely than traditional indices. Across GI and GII, GI.3 exhibited the highest CV (6.73%), with similarities ranging from 80.5 to 100.0%, while GII.3 had the lowest CV (1.46%), with similarities ranging from 93.7 to 100.0%. Within GI, GI.5 had the lowest CV (2.54%), with similarities ranging from 86.3 to 100.0%. Within GII, GII.2 had the highest CV (5.00%), with similarities ranging from 79.9 to 100.0%. The mean nucleotide similarity of each genotype ranged from 92.69% (GII.17) to 98.37% (GII.3).
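The mean and CV of pairwise similarity can be sketched as below. This is an illustration only: the text does not say how similarity was computed or which standard deviation was used, so the sketch assumes pre-aligned sequences of equal length, simple percent identity, and the sample standard deviation. The toy sequences and function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def percent_identity(a, b):
    """Percent of matching positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def similarity_summary(sequences):
    """Return (mean, CV in percent, min, max) over all sequence pairs.

    ASSUMPTION: CV = sample standard deviation / mean * 100.
    Needs at least three sequences so that there are two or more pairs.
    """
    sims = [percent_identity(a, b) for a, b in combinations(sequences, 2)]
    avg = mean(sims)
    cv = 100.0 * stdev(sims) / avg
    return avg, cv, min(sims), max(sims)


# HYPOTHETICAL toy alignment (not sequences from the study).
toy_sequences = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAA"]
avg_sim, cv_sim, min_sim, max_sim = similarity_summary(toy_sequences)
print(round(avg_sim, 2), round(cv_sim, 2), min_sim, max_sim)
```

A genotype whose sequences are nearly identical yields a high mean and a low CV, while a genotype containing divergent lineages yields a wider range and a higher CV, as reported for GI.3 and GII.2.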

To investigate the genetic relationships of NoV, phylogenetic analyses were performed on the partial VP1 region, using sequences from Shandong and 51 reference sequences from GenBank (Fig. 4). The phylogenetic trees did not reveal distinct temporal demarcations among the isolates, as sequences from different years of isolation were found within the same cluster. Multiple transmission chains were identified in the phylogenetic trees for genotypes GI.1, GI.2, GI.3, GI.5, GII.2, GII.4, and GII.17.

Fig. 4

Phylogenetic trees based on partial VP1 nucleotide sequences of GI (A) and GII (B) noroviruses. Each sequence is identified by a code consisting of the primer set (GI/GII), followed by the collection year (YY), the collection month, and the genotype.

Evolutionary rate of norovirus GI and GII

The molecular evolutionary rate analysis was carried out on genotypes with more than 10 sequences and effective sample size (ESS) values exceeding 200. GI.3 exhibited the highest evolutionary rate, at 9.81 × 10⁻³ substitutions/site/year (95% HPD: 1.71 × 10⁻³ to 3.38 × 10⁻³), while GII.3 displayed the lowest, at 3.07 × 10⁻⁴ substitutions/site/year (95% HPD: 1.34 × 10⁻⁵ to 1.75 × 10⁻³) (Supplementary Table 3).
