PE/PPE mutations in the transmission of Mycobacterium tuberculosis in China revealed by whole genome sequencing

Sample structure

Of the 3202 domestic strains, 2745 isolates (85.7%) belonged to lineage 2 (of which 94.4% belonged to sublineage 2.2.1), 443 isolates (13.8%) belonged to lineage 4, and only 14 isolates belonged to other lineages (lineage 1 and lineage 3). We constructed separate maximum likelihood phylogenetic trees for the lineage 2 and lineage 4 M. tuberculosis isolates (Fig. 1a and b). Strain clustering grouped 1462 lineage 2 strains into 446 transmission clusters of 2 to 107 isolates each. In lineage 4, 132 isolates fell into 52 clusters of 2 to 9 isolates each (Table 1). A considerably larger proportion of lineage 2 strains than lineage 4 strains belonged to transmission clusters (53.3% vs. 29.8%, P < 0.001).
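The lineage comparison above can be checked from the reported counts. The following is a minimal sketch that assumes a Pearson chi-square test without continuity correction on the 2×2 table of clustered versus non-clustered strains; the text does not state which test the authors used, and the function name is illustrative.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts reported in the text
l2_clustered, l2_total = 1462, 2745
l4_clustered, l4_total = 132, 443

stat, p = chi_square_2x2(
    l2_clustered, l2_total - l2_clustered,
    l4_clustered, l4_total - l4_clustered,
)
print(f"Lineage 2 clustered: {l2_clustered / l2_total:.1%}")
print(f"Lineage 4 clustered: {l4_clustered / l4_total:.1%}")
print(f"chi-square = {stat:.1f}, P = {p:.2e}")
```

Under this assumption the proportions match the reported 53.3% and 29.8%, and the P value falls well below 0.001.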

Fig. 1 (a) Phylogenetic tree of 2745 Chinese M. tuberculosis strains in lineage 2. (b) Phylogenetic tree of 443 Chinese M. tuberculosis strains in lineage 4

Table 1 Lineage and demographic factors associated with transmission clusters (≤ 10 SNP) of M. tuberculosis strains in China

The effect of PE/PPE gene mutations on transmission of L2 strains

We identified 1,141 homoplastic SNPs in lineage 2 strains (Supplementary Table 2). After excluding SNPs with a minor allele frequency (MAF) below 0.005, 140 homoplastic SNPs in the PE/PPE gene region were selected for further analysis. In the univariate regression analysis comparing clustered and non-clustered strains, 45 mutation sites were statistically significant (P < 0.05). To further investigate these associations, the 59 loci with P values below 0.2 in the univariate analysis were included in a multivariate logistic regression analysis. Nine sites were identified as influencing factors (P < 0.05), with PE4 (position 190,394; c.46G > A; OR, 2.183; 95% CI, 1.025–4.651), PE_PGRS10 (839,194; c.744A > G; OR, 1.668; 95% CI, 1.220–2.280), PE16 (1,607,005; c.620T > G; OR, 3.741; 95% CI, 2.039–6.864) and PE_PGRS44 (2,921,883; c.333C > A; OR, 12.664; 95% CI, 1.696–94.357) considered risk factors for strain clustering (Table 2).
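The confidence intervals above look asymmetric around the odds ratios because they are symmetric on the log-odds scale. The sketch below assumes the intervals are Wald intervals from the logistic model, which the text does not confirm. Under that assumption the OR is the geometric mean of the interval bounds, and the coefficient and its standard error can be recovered from the reported values. The function name is illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution

def wald_from_ci(lower, upper):
    """Recover the log-odds coefficient, its standard error and the OR from a 95% Wald CI."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return beta, se, math.exp(beta)

# (reported OR, CI lower bound, CI upper bound), copied from the text
reported = {
    "PE4 c.46G>A": (2.183, 1.025, 4.651),
    "PE_PGRS10 c.744A>G": (1.668, 1.220, 2.280),
    "PE16 c.620T>G": (3.741, 2.039, 6.864),
    "PE_PGRS44 c.333C>A": (12.664, 1.696, 94.357),
}

for name, (odds_ratio, lower, upper) in reported.items():
    beta, se, implied_or = wald_from_ci(lower, upper)
    print(f"{name}: reported OR {odds_ratio}, implied OR {implied_or:.3f}, SE {se:.3f}")
```

The implied and reported odds ratios agree to within rounding. The wide interval for PE_PGRS44 corresponds to a large standard error, so that estimate is far less precise than the others.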

Table 2 Analysis of the PE/PPE gene mutations in clustering and non-clustering of lineage 2

A total of 382 lineage 2 strains formed 77 cross-regional clusters, each spanning 2 to 6 geographic regions. Among the 7 geographic regions, Northern China (31.9%) and Central China (27.3%) had the highest proportions of these cross-regional clusters, followed by Southwest China (23.1%) and Northwest China (19.5%). In the univariate analysis, 57 SNPs differed significantly between cross-regional and regional clusters (P < 0.05). Subsequent multivariate logistic regression analysis identified 10 mutations as influencing factors (P < 0.05), with 4 mutation positions recognized as risk factors for cross-regional clusters: PE_PGRS10 (839,334; c.884A > G; OR, 2.706; 95% CI, 1.081–6.774), PE_PGRS11 (847,613; c.1455G > C; OR, 4.342; 95% CI, 1.636–11.525), PE_PGRS47 (3,054,724; c.811A > G; OR, 2.099; 95% CI, 1.211–3.637) and PPE66 (4,189,930; c.303G > C; OR, 6.511; 95% CI, 1.679–25.242) (Supplementary Table 3).

Correlation analysis between mutation sites and cluster size showed that 19 mutation positions were significantly associated with cluster size (P < 0.05). Of these, 13 were positively correlated with cluster size (rs > 0): PE_PGRS1 (132,417), PE_PGRS6 (623,472), PE_PGRS9 (836,658), PE16 (1,607,005), PPE26 (2,027,484), PPE34 (2,165,286), PPE35 (2,167,926), PPE44 (3,079,877), PPE54 (3,736,628), PPE56 (3,762,013), PE_PGRS58 (4,032,218), PE_PGRS58 (4,032,760) and PPE69 (4,375,628). Further details are shown in Fig. 2.
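The statistic rs above is read here as Spearman's rank correlation coefficient; that reading is an assumption, since this section does not name the method. The sketch below computes it with the standard library, assigning average ranks to ties, which matters because a mutation indicator takes only two values. The data are hypothetical toy values, not taken from the study.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rs, computed as the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical toy data: mutation presence (0/1) per strain and the size of its cluster
has_mutation = [0, 0, 0, 1, 1, 1, 0, 1]
cluster_size = [2, 2, 3, 5, 9, 4, 2, 12]

rs = spearman(has_mutation, cluster_size)
print(f"rs = {rs:.3f}")
```

A positive rs in this toy example means that strains carrying the mutation tend to sit in larger clusters, which is how the 13 positively correlated positions above would be interpreted.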

Fig. 2 Correlation analysis of PE/PPE gene mutation positions and cluster size

The effect of PE/PPE gene mutations on transmission of L4 strains

A total of 205 homoplastic SNPs were detected in lineage 4 strains (Supplementary Table 4). After excluding homoplastic SNPs with a MAF below 0.005, 74 SNPs in the PE/PPE gene region were selected for further analysis. In the univariate analysis, 6 PE/PPE gene mutation positions differed significantly between clustered and non-clustered strains (P < 0.05). To further investigate these associations, the 22 loci with P values below 0.2 in the univariate analysis were included in a multivariate logistic regression analysis. However, no mutations that appeared to facilitate transmission were identified in lineage 4 strains (Table 3).
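The MAF filter described above can be sketched as follows. This assumes MAF is computed per site as the frequency of the second most common allele among strains with a call, which is the usual definition but is not spelled out in the text. The data layout, site names and function names are hypothetical.

```python
from collections import Counter

MAF_THRESHOLD = 0.005  # cutoff stated in the text

def minor_allele_frequency(alleles):
    """Frequency of the second most common allele; 0.0 if the site is monomorphic."""
    counts = Counter(a for a in alleles if a is not None)  # None marks a missing call
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    return counts.most_common(2)[1][1] / total

def filter_by_maf(sites, threshold=MAF_THRESHOLD):
    """Keep only sites whose MAF is at or above the threshold."""
    return {
        site: calls
        for site, calls in sites.items()
        if minor_allele_frequency(calls) >= threshold
    }

# Hypothetical sites, each with allele calls for 1000 strains
sites = {
    "site_A": ["G"] * 997 + ["A"] * 3,    # MAF 0.003, excluded
    "site_B": ["C"] * 900 + ["T"] * 100,  # MAF 0.1, kept
}

kept = filter_by_maf(sites)
print(sorted(kept))
```

With a cutoff of 0.005, a variant in this toy set of 1000 strains must be carried by at least 5 of them to be retained.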

Table 3 Analysis of the PE/PPE gene mutations in clustering and non-clustering of lineage 4

Furthermore, 25 lineage 4 strains grouped into 9 cross-regional clusters, each spanning two geographic regions. After univariate analysis, 9 mutation sites were selected for multivariate regression analysis, which showed that 4 positions were significantly associated with cross-regional clusters (P < 0.05). PE_PGRS4 (338,100; c.974A > G; OR, 6.090; 95% CI, 1.702–21.793) and PPE13 (976,897; c.1307A > C; OR, 3.505; 95% CI, 1.103–11.132) were considered risk factors for cross-regional transmission of strains (Supplementary Table 5). Because of the low prevalence of lineage 4 strains in China and the relatively small sample size, we did not further analyze the transmission cluster size of lineage 4 strains.
