Large-scale genomic analysis of global Klebsiella pneumoniae plasmids reveals multiple simultaneous clusters of carbapenem-resistant hypervirulent strains

Isolates and genotypes

While all isolates (n=13,178) were Kp according to metadata, Kleborate screening identified them as Kp (n=11,820; 90%), K. quasipneumoniae (n=604; 5%), K. variicola (n=428; 3%), K. aerogenes (n=299; 2%) and other subspecies of Klebsiella (n=27; 0.2%), indicating that some isolates may have incorrect species data in the NCBI database (Additional file 1: Fig. S1). We did not restrict the analysis to K. pneumoniae sensu stricto, but we removed 710 isolates in which PlasmidFinder did not identify any replicons. These removed isolates covered 331 STs, with ST3910 (n=21; 3%) being the most frequent. They also had a high abundance of species other than Kp sensu stricto (K. quasipneumoniae, n=140, 19.7%; K. aerogenes, n=115, 16.2%; K. variicola, n=109, 15.4%). Notably, these isolates had limited AMR carriage, with only 9 isolates carrying carbapenemase encoding genes, among which blaOXA-48 was the most frequent (n=4). There were also few fluoroquinolone gyrA mutations (n=65/710) and aminoglycoside resistance enzymes (n=35/710). Only four isolates carried an aerobactin locus. However, 20 isolates had a truncated iro3 locus and 109 K. aerogenes samples had a chromosomally carried salmochelin locus.

The final isolate dataset (n=12,468) contained 1603 different STs (based on 7 chromosomal loci), with the most frequent being ST11 (15%), ST258 (10%), ST15 (4%), ST512 (4%), ST307 (3%) and ST147 (3%) (Additional file 1: Fig. S1). The number of hypervirulent strains was 1881 (15%). The two canonical hypervirulent strains ST23 and ST86 accounted for 1.8% and 0.7%, respectively. The study had global representation, with isolates from 103 countries, including China (20%), USA (14%), Italy (7%), UK (6%), Thailand (4%), and Germany (4%); however, only 2% of all isolates were from sub-Saharan Africa. While the convenience nature of the sampling may make the dataset unsuitable for estimating the prevalence of CRhvKp genotypes, the large sample size is likely to provide an accurate assessment of regional and temporal trends.

Replicons

A cornerstone of our analysis is the distinction between the replicon family or name (e.g. IncFII(K), IncFIB(K) and ColRNAI) and the underlying unique nucleotide sequences of the replicons belonging to a specific family. For example, across the 12,468 isolates, there were 3034 unique replicon nucleotide sequences, of which 1096 occurred in more than one isolate. The most frequent replicon nucleotide sequence was the PlasmidFinder reference version of IncFIB(K), which occurred in 3123 (24%) isolates. However, replicons from the IncFIB(K) family occurred 7052 times, with eight distinct sequences each occurring in more than 100 Kp isolates (Fig. 1). The family and nucleotide sequence diversity are summarised in Additional file 2: Data S2.
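The family-versus-sequence distinction can be illustrated with a minimal Python sketch. The input layout (one tuple of isolate, family and nucleotide sequence per replicon hit), the toy sequences and the function name `summarise_replicons` are assumptions made for illustration; they are not the authors' pipeline or PlasmidFinder's output format.

```python
from collections import Counter, defaultdict

# Hypothetical input: one (isolate_id, replicon_family, nucleotide_sequence)
# tuple per replicon hit. The sequences are toy strings, not real replicons.
hits = [
    ("iso1", "IncFIB(K)", "ATGCA"),
    ("iso2", "IncFIB(K)", "ATGCA"),
    ("iso3", "IncFIB(K)", "ATGTA"),
    ("iso3", "ColRNAI", "GGCTA"),
]

def summarise_replicons(hits):
    """Return (hits per family, number of isolates per unique sequence)."""
    # Family-level view: every hit counts, regardless of its exact sequence.
    family_hits = Counter(family for _, family, _ in hits)
    # Sequence-level view: group isolates by the exact nucleotide sequence.
    isolates_by_sequence = defaultdict(set)
    for isolate, family, seq in hits:
        isolates_by_sequence[(family, seq)].add(isolate)
    isolates_per_sequence = {k: len(v) for k, v in isolates_by_sequence.items()}
    return family_hits, isolates_per_sequence

family_hits, isolates_per_sequence = summarise_replicons(hits)
# Unique sequences that occur in more than one isolate.
shared = [key for key, n in isolates_per_sequence.items() if n > 1]
```

In this toy input the IncFIB(K) family has three hits but only two distinct sequences, one of which is shared by two isolates; this is the kind of difference the family-level and sequence-level counts above capture.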

Fig. 1

Frequency of the main nucleotide sequences from the IncFIB(K) replicon family. The eight nucleotide sequences are provided in Additional file 2: Data S2

The average number of replicons per isolate was 4.6 (range: 1 to 29). A total of 6783 isolates (55%) with identified replicons had carbapenemase encoding genes, and 1881 (15%) had either iuc or iro loci. A total of 1028 (8%) isolates with identified replicons had both carbapenemase encoding and hypervirulence (CRhvKp) loci (Additional file 1: Fig. S1). Therefore, the numbers of isolates determined to be CRhvKp, CRKp (non-hvKp) and hvKp (non-CRKp) were 1028 (8%), 5755 (46%), and 853 (7%), respectively (Additional file 1: Fig. S1). The majority of the CRhvKp isolates were from China (n=696), representing 27% of isolates from that country; the next most common sources were Russia (n=64), Thailand (n=44), Italy (n=39) and India (n=25).
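The genotype definitions and the derived counts can be checked with a short Python sketch. The function name and boolean inputs are hypothetical; the rule (a carbapenemase gene for CR, an iuc or iro locus for hv) and the three totals are taken from the text above.

```python
def classify_genotype(has_carbapenemase: bool, has_iuc: bool, has_iro: bool) -> str:
    """Assign a genotype label from gene presence, following the text's definitions."""
    hypervirulent = has_iuc or has_iro
    if has_carbapenemase and hypervirulent:
        return "CRhvKp"
    if has_carbapenemase:
        return "CRKp"
    if hypervirulent:
        return "hvKp"
    return "other"

# Counts reported in the text for isolates with identified replicons.
carbapenemase_total = 6783   # any carbapenemase encoding gene
hypervirulent_total = 1881   # iuc or iro locus
both = 1028                  # CRhvKp

crkp_only = carbapenemase_total - both   # CRKp (non-hvKp)
hvkp_only = hypervirulent_total - both   # hvKp (non-CRKp)
```

The two subtractions give 5755 and 853, matching the CRKp (non-hvKp) and hvKp (non-CRKp) counts stated above.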

To examine the geographical and temporal trends of carbapenem-resistant hypervirulent genotypes, we assessed the replicon-based clustering of CRKp, hvKp and CRhvKp isolates across 695 plasmid replicon nucleotide sequences that were present in multiple CRKp, hvKp and CRhvKp isolates (Additional file 1: Fig. S1). The UMAP-based analysis of population structure revealed that clustering was not driven solely by geography or ST (Fig. 2). This contrasts with chromosomal-based cluster analysis, which revealed clustering of isolates by ST (data not shown). We applied the HDBSCAN clustering algorithm to the two-dimensional UMAP projection of CRhvKp isolates (n=1028) and identified 9 groups (Clusters A–I; Table 1, Fig. 2), with 79 (8%) isolates not assigned to any group (see Additional file 2: Data S3 for all assignments). Of these 9 clusters, only two (B and E) did not have strong statistical support for their robustness.
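As an illustration of the input to such an analysis, the following Python sketch builds a binary isolate-by-replicon matrix restricted to replicon sequences found in multiple isolates. The binary presence/absence encoding, the function name and the toy identifiers are assumptions; the text does not specify how replicon content was encoded, and the UMAP and HDBSCAN steps themselves are not shown here.

```python
from collections import defaultdict

def presence_absence_matrix(isolate_replicons, min_isolates=2):
    """Build a 0/1 matrix of isolates (rows) by replicon sequences (columns).

    isolate_replicons: dict mapping isolate id -> set of replicon sequence ids.
    Only sequences carried by at least `min_isolates` isolates are kept.
    """
    carriers = defaultdict(set)
    for isolate, seqs in isolate_replicons.items():
        for seq_id in seqs:
            carriers[seq_id].add(isolate)
    features = sorted(s for s, iso in carriers.items() if len(iso) >= min_isolates)
    isolates = sorted(isolate_replicons)
    rows = [[1 if f in isolate_replicons[i] else 0 for f in features] for i in isolates]
    return isolates, features, rows

# Toy example with hypothetical sequence identifiers.
isolate_replicons = {
    "iso1": {"repB_v1", "IncR_v1"},
    "iso2": {"repB_v1", "IncR_v1", "ColKP3_v1"},
    "iso3": {"ColKP3_v1", "IncL_v1"},
}
isolates, features, rows = presence_absence_matrix(isolate_replicons)
```

A matrix of this shape could then be passed to a dimensionality reduction step and a density-based clustering step; sequences seen in a single isolate (here IncL_v1) are dropped because they carry no information about shared plasmid content.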

Fig. 2

Clustering of CRKp, hvKp, and CRhvKp Kp isolates based on replicon sequences coloured by A major sequence types (STs), B country, and C genotype. In D, non-CRhvKp isolates are hidden and only CRhvKp isolates are visible and coloured by plasmid cluster. Axes are dimensionless. Underlying data is presented in Additional file 2: Data S3

Table 1 List of CRhvKp clusters. In some assemblies carbapenemase genes (CR), iuc and iro (hypervirulence; HV) loci were located on the same assembly contig or chromosome. The underlying data is available in Additional file 2: Data S2, S3 and S4

Replicon clusters

Cluster A (n=560, Table 1) consisted mostly of isolates from China and accounted for the majority of CRhvKp isolates from that country (n=517/696). This cluster also included isolates with travel links to China [35]. The first isolate from Cluster A was collected in 2012 with the frequency increasing over time (Fig. 3). ST11 was the dominant sequence type in this cluster (n=499/560), which also contained the majority of ST11 CRhvKp isolates (n=499/582). Hypervirulence was driven by iuc1 in nearly all isolates (n=554/560) and 168 isolates additionally had iro1 (77 truncated or incomplete), following the nomenclature established previously [36]. The carbapenemase encoding genes were dominated by blaKPC-2 (n=516/560), with blaOXA-48 the next most common (n=16) (Table 1, Additional file 2: Data S3).

Fig. 3

Relative abundance and geographic diversity of carbapenem-resistant hypervirulent Kp (CRhvKp) isolates. Both the total number of all genotyped isolates in each year and percentage that are CRhvKp are presented at the top of the figure

The dominant replicons in Cluster A were ColRNAI (n=943), which occurred multiple times in most isolates, followed by repB (n=555), IncFII(pHN7A8) (n=501), IncHI1B(pNDM-MAR) (n=489) and IncR (n=486) (Additional file 2: Data S2). By examining assembly contigs for any co-presence of carbapenemase, virulence genes and replicons, we were able to link replicon and carbapenemase genes in 74 isolates and replicon and siderophore genes in 341 isolates (Table 1). No carbapenemase or hypervirulence-linked siderophore genes were found on chromosomes. In cases where siderophores were located on the same contig as replicons, the vast majority (n=290/343) had an IncHI1B(pNDM-MAR) replicon (Additional file 2: Data S4). Of these, 125 contigs had both repB and IncHI1B(pNDM-MAR) replicons. In addition, 42 contigs carried both a repB replicon and siderophores. The majority (n=52/76) of contigs that carried replicon and carbapenemase genes had an IncFII(pHN7A8) replicon, and 49 of these also had an IncR replicon. Two contigs had both siderophores and carbapenemase genes. The first had repB, IncFIB(pKPHS1) and IncHI1B(pNDM-MAR) replicons. The second had repB and IncHI1B(pNDM-MAR). Both contigs carried blaKPC-2 and iuc1 loci.
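The contig-level linking described above can be sketched in Python as follows. The annotation tuple layout, the `kind` labels and the function name are hypothetical; the sketch only shows the general idea of reporting contigs that carry a replicon together with a carbapenemase or siderophore locus.

```python
from collections import defaultdict

def link_features_by_contig(annotations):
    """Find contigs carrying a replicon plus a carbapenemase or siderophore locus.

    annotations: iterable of (isolate, contig, kind, name) tuples, where kind is
    one of "replicon", "carbapenemase" or "siderophore" (assumed labels).
    Returns a dict keyed by (isolate, contig) with the sorted names per kind.
    """
    by_contig = defaultdict(lambda: defaultdict(set))
    for isolate, contig, kind, name in annotations:
        by_contig[(isolate, contig)][kind].add(name)

    linked = {}
    for key, kinds in by_contig.items():
        has_replicon = "replicon" in kinds
        has_cargo = "carbapenemase" in kinds or "siderophore" in kinds
        if has_replicon and has_cargo:
            linked[key] = {k: sorted(v) for k, v in kinds.items()}
    return linked

# Toy annotations for illustration only.
annotations = [
    ("iso1", "c1", "replicon", "IncHI1B(pNDM-MAR)"),
    ("iso1", "c1", "siderophore", "iuc1"),
    ("iso1", "c2", "replicon", "IncFII(pHN7A8)"),
    ("iso1", "c2", "carbapenemase", "blaKPC-2"),
    ("iso1", "c3", "replicon", "ColRNAI"),
    ("iso2", "c1", "siderophore", "iuc1"),
]
linked = link_features_by_contig(annotations)
```

Contigs with only a replicon (iso1, c3) or only a siderophore (iso2, c1) are not reported, which mirrors why links could be established in only a subset of draft assemblies.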

Cluster B (n=83) consisted mainly of isolates from China (n=81/83) and the ST11 sequence type (n=74/83). This cluster did not have strong statistical support for its robustness, and based on visual examination of results (Fig. 2) it is related to, but distinct from, Cluster A isolates. In particular, the isolates in Cluster B lack repB and IncHI1B(pNDM-MAR) replicons, characteristic of Cluster A.

Cluster C (n=82) is geographically diverse, with 32 isolates sourced from Russia, 16 from China, 9 from Egypt, and 8 from Germany. There were 15 STs of which ST147 (n=32; 39%) and ST395 (n=24; 29%) were the most frequent. While ST147 had broad geographic distribution, the majority of ST395 isolates in this cluster were collected in Russia (n=20/24; 83%). The most frequent carbapenemases were blaOXA-48 (n=51) and blaNDM-1 (n=23), again without any strong geographic links. Five isolates had both genes. More broadly, 78 isolates had either a blaOXA or blaNDM gene, with blaKPC-2 (n=2) and blaVIM-1 (n=1) accounting for the rest (Additional file 2: Data S2). The dominant hypervirulence siderophore was iuc1 (n=78/82), while an unassigned salmochelin lineage was present in two isolates. Despite geographic and ST diversity, three replicons accounted for the majority of isolates: Col(pHAD28) (n=91), IncHI1B(pNDM-MAR) (n=80), and IncFIB(pNDM-Mar) (n=79) (Table 1). Out of 14 isolates in which a replicon and iuc1 locus were on the same contig, ten had both IncFIB(pNDM-Mar) and IncHI1B(pNDM-MAR) replicons, and a further two had IncHI1B(pNDM-MAR). Replicons linked to carbapenemase encoding genes (n=16) were diverse, and IncL linked to blaOXA-48 (n=4) was the most frequent association.

Cluster D (n=61) consisted of isolates mainly from South(east) Asia (Thailand, n=27; India, n=17; Pakistan, n=8). The first isolate was collected in Malaysia in 2013, while the most recent six isolates were collected in India in 2019. Nearly all isolates belong to ST231 (n=60/61), which shares only two MLST alleles with canonical hypervirulent ST23. The dominant carbapenemase was blaOXA-232 (n=59/61), with blaOXA-181 and blaOXA-48 in the other two isolates. Unusually, hypervirulence was driven by iuc5 (n=60/61), and this cluster accounts for nearly all iuc5-carrying CRhvKp isolates (n=60/75). Salmochelin (iro1) was only present in a single isolate. This cluster had near universal carriage of seven replicons: Col440I, IncFIB(pQil), IncFII(K), IncFIA, IncFII(pAMA1167-NDM-5), ColKP3, and Col(pHAD28) (Table 1). IncFII(pAMA1167-NDM-5) was unique to this cluster and IncFIA only occurred in nine further CRhvKp isolates. We were able to link replicons to a carbapenemase gene in 60 isolates, as well as to a siderophore in 8 isolates. Neither siderophore nor carbapenemase genes occurred on chromosomes. In all cases, blaOXA-232 was linked to the ColKP3 replicon, while iuc5 was co-located on a contig with IncFIA and IncFII(pAMA1167-NDM-5).

Cluster E (n=50) contained isolates mostly from Asia (China, n=16; Singapore, n=14; Thailand, n=13), but unlike Clusters A and B, this one also had five samples from three European countries (Latvia, n=2; Greece, n=2; France, n=1). The first isolates were collected in 2013 in Singapore (n=7) and the most recent was from China in 2020. Sixteen isolates were of the canonical hypervirulent strain ST86 and were collected in six countries. This cluster also had ST65 (n=14) and ST23 (n=4) isolates. The former shares only two alleles with ST86. The dominant siderophores in this cluster were iuc1 (n=44) and iro1 (n=40), with 40 isolates carrying both. While blaKPC-2 was the most frequent carbapenemase gene (n=32), some isolates (n=13) carried blaOXA-232 just like the isolates from Cluster B. All these blaOXA-232 carrying isolates were collected as part of the same Thai study in 2016; however, they belonged to 8 different STs [37]. Unlike Clusters A and B, this one had only two near universal replicons: repB (n=50) and IncHI1B(pNDM-MAR) (n=49). We were able to link replicons to carbapenemase genes in 19 isolates and to a siderophore in 32 isolates. Nearly all linked iuc loci were located on a contig with either IncHI1B(pNDM-MAR) (n=4), repB (n=1) or both (n=22) (Additional file 2: Data S4). The blaOXA-232 gene was co-located with a ColKP3 replicon in all blaOXA-232 carrying isolates, like Cluster B. The replicons linked to blaKPC-2 were much more diverse. Out of 6 contigs, three had an IncFII(K) replicon, two IncFII(pHN7A8), and one IncX6. None of the examined contigs carried both carbapenemase and hypervirulence genes, nor were any of these genes on chromosomes.

Cluster G consists of 32 isolates collected every year between 2015 and 2020, with the majority isolated in China (n=26) (Table 1). While this cluster includes two ST23 isolates and one ST11, it is dominated by ST15 (n=26/32). The main siderophore locus was iuc1 (n=32/32), with two isolates concurrently carrying iro1. The dominant carbapenemase was blaOXA-232 (n=27/32), with blaNDM-1 (n=3/32) the next most common. The most interesting aspect of this cluster was the diversity of frequent replicons: repB (n=31), ColRNAI (n=31), IncHI1B(pNDM-MAR) (n=30), IncFIB(pKPHS1) (n=28), IncFII(K) (n=28), ColKP3 (n=28), and Col(pHAD28) (n=28). We linked replicons to a carbapenemase in 32 isolates and to a siderophore in 7 isolates. The iuc1 hypervirulent locus was linked to IncHI1B(pNDM-MAR) (n=6) and blaOXA-232 was linked to ColKP3 (n=28) as in Cluster D, but that cluster has iuc5 linked to both IncFIA and IncFII(pAMA1167-NDM-5). Part of this cluster has been described previously [22], reinforcing the robustness of our approach.

Salmochelin carrying K. aerogenes

Surprisingly, nearly all K. aerogenes isolates (n=288/299) carried a salmochelin locus, but none had aerobactin. In all 41 complete K. aerogenes assemblies this locus was chromosomal (contig > 4,500,000nt). These salmochelin genes had nucleotide identity between 74% and 86% to Kp sensu stricto salmochelin genes. The nearest amino acid sequences outside K. aerogenes were iroB, iroC, iroB and iroN in Enterobacter oligotrophicus with 93%, 90% and 81% identity, respectively. A small portion of K. aerogenes isolates had both salmochelin and carbapenemase genes (n=40), among which carbapenemase genes blaKPC-2 (n=13) and blaOXA-48 (n=6) were the most common. Most of the K. aerogenes without carbapenemase genes (n=231/248) did not have any extended-spectrum beta-lactamase (ESBL) encoding loci, despite most of them (n=171/231) being collected after 2010, a period in which ESBL encoding genes were common in Kp. PlasmidFinder identified replicons in 174 salmochelin carrying K. aerogenes. Their replicons formed a cluster of samples consisting mainly of USA isolates (n=73), with a few from Germany (n=9), Lebanon (n=6) and other countries. Apart from one small study [38], we believe this is the first major report of the widespread presence of salmochelin in K. aerogenes.
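The contig-length criterion quoted above (contig > 4,500,000nt) amounts to a simple length check, sketched below in Python. The constant and function names are hypothetical, and the sketch assumes that length alone is used to call a contig chromosomal in a complete assembly.

```python
CHROMOSOME_MIN_LENGTH = 4_500_000  # nt; threshold quoted in the text

def is_chromosomal(contig_length: int, min_length: int = CHROMOSOME_MIN_LENGTH) -> bool:
    """Treat a contig as chromosomal when it is longer than the threshold."""
    return contig_length > min_length

def locus_location(contig_length: int) -> str:
    """Label the location of a locus from the length of the contig carrying it."""
    return "chromosome" if is_chromosomal(contig_length) else "non-chromosomal contig"
```

A check of this kind is only meaningful for complete assemblies; in fragmented draft assemblies a short contig may still be chromosomal, so the label "non-chromosomal contig" should not be read as "plasmid".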

Comparison to replicons of historic isolates

Plasmid sequences of 35 Kp isolates from the historic Murray collection were identified and compared to the 3034 unique replicons in our whole collection (n=12,468). Remarkably, there was a substantial overlap between these unique replicon sequences. For example, the same repB sequences occurred in 1124 of all isolates and in 19 Murray collection isolates. Three further replicon sequences [IncFIB(K), IncFII(pKP91) and IncHI1B(pNDM-MAR)] that occurred once each in the Murray collection occurred in more than 240 general isolates (Additional file 2: Data S5). The co-existence of multiple variants of identical replicons up to 90 years ago requires further investigation into the evolution of plasmid replicon sequences, including their mutation rates and any selective pressure.

IncHI1B(pNDM-MAR) replicon

The Cluster A version of the repB gene differs from PlasmidFinder’s IncHI1B(pNDM-MAR) by 4nt and from the closest sequence in our collection by 3nt (out of 570nt). The latter sequence occurs frequently (n=395) in our whole dataset (Additional file 2: Data S2). More importantly, the first 96nt of this replicon’s sequence (those preceding a repB sequence) are almost unique to Cluster A. These leading 96nt have no similarity using BLAST (word size 7) [39] to any other IncHI1B(pNDM-MAR) variant in our dataset. While this version of the replicon does occur outside Cluster A, it is rare among CRhvKp (n=3/1028) and CRKp (n=4/5763) isolates, but more common among hvKp (n=184/988). The Cluster A variant has a strong geographic and ST bias. It is found mostly in China (n=561/676), followed by Russia (n=22/676) and 18 other countries. KL64 (n=343/676) and KL1 (n=190/676) are the dominant serotypes, but this likely reflects ST specificity, as these serotypes are common in ST23 and ST11. Of all ST23 and ST11 isolates with any IncHI1B(pNDM-MAR) replicon, nearly all carried the Cluster A version of the replicon (ST23 197/216; ST11 461/525). In contrast, all isolates of the canonical hypervirulent ST86 (n=84/84) carried PlasmidFinder’s reference replicon variant.
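Nucleotide differences of the kind quoted above (for example 4nt out of 570nt) can be counted with a short Python sketch. It assumes the two variants are already aligned, have equal length and differ only by substitutions; the text does not state how differences were counted, and the function names and the default leader length parameter are illustrative.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)

def split_leader(seq: str, leader_length: int = 96):
    """Split a replicon sequence into its leading segment and the remainder."""
    return seq[:leader_length], seq[leader_length:]
```

For instance, `count_mismatches("ACGTACGT", "ACGAACGA")` returns 2. Splitting off the leading segment first allows the leader and the repB portion to be compared separately, which is how a variant can have a near-unique leader while differing by only a few nucleotides elsewhere.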

The IncHI1B(pNDM-MAR) replicon was nearly monophyletic in Cluster A, with 473 samples with identical replicon sequences and a further 7 replicon sequences among 18 samples. However, across the entire dataset, the IncHI1B(pNDM-MAR) replicon consisted of five main nucleotide sequences with only six mutation differences (Additional file 2: Data S2). Based on the context of Cluster A’s blaKPC-2 and IncHI1B(pNDM-MAR) replicon nucleotide sequence, there is little evidence that this cluster has generated an epidemic in countries outside of China that are well represented in our dataset. However, within China, Cluster A has been identified in at least 13 provinces since its first isolates were identified in 2013. Monitoring the core replicon signatures of such clusters can therefore assist with identifying the spread of CRhvKp forms.
