Pathogens, Vol. 12, Pages 44: Genomic Analysis Unveils the Pervasiveness and Diversity of Prophages Infecting Erwinia Species

1. Introduction

The Erwinia genus (family: Erwiniaceae, order: Enterobacterales, class: Gammaproteobacteria) comprises a heterogeneous group of bacteria recognized for their pathogenicity against a wide range of plants, including crops from Rosaceae, Myrtaceae, and Cucurbitaceae. Erwinia amylovora was the first bacterium identified as causing disease in plants, with bacterial colonization generally beginning in the flowers or shoot tips, followed by migration into plant tissues [1]. Currently, many Erwinia species are described as economically important plant pathogens, including E. pyrifoliae [2], E. tracheiphila [3], and E. psidii [4]. The management of Erwinia disease in plants includes the use of antibiotics, such as streptomycin, but the emergence of bacterial resistance [5] and policies restricting antibiotic use make it necessary to develop alternative control strategies, such as bacteriophage control [6,7].

Bacteriophages (phages) are ubiquitous in natural environments and are powerful drivers of bacterial evolution, for example by providing antibiotic-resistance genes, virulence-related genes, and resistance to lytic bacteriophages [8]. Besides shaping cell fitness when integrated into bacterial chromosomes, prophages may shift from a lysogenic to a lytic life cycle depending on the environmental conditions, causing cell lysis and a reduction in the bacterial population [9]. The cell lysis promoted by most phages is highly selective, making them appealing for the biocontrol of pathogenic bacteria since other microbes are virtually unaffected.

Despite the relevance of phages to bacterial ecology and evolution, few studies have used genomic data to investigate the diversity of (pro)phages of Erwinia [10,11]. A comprehensive description of prophage occurrence may help to characterize Erwinia pathogenicity in depth and to develop efficient control strategies, for example through a better understanding of bacterial susceptibility to lytic phages and of the evolution of antibiotic resistance genes.

Although biocontrol using phages is promising, bacteria can circumvent phage infection using a range of mechanisms, such as CRISPR-Cas [12], DISARM [13], and BREX [14] systems. The CRISPR-Cas system is an adaptive system based on nucleotide sequence complementarity between the CRISPR spacer and the target mobile genetic element (MGE), meaning that past contacts with phages and other MGEs are recorded to prevent future infections. DISARM and BREX systems are based on host DNA methylation followed by restriction–modification of a phage genome (DISARM) or blocking of phage replication/integration (BREX). Thus, the description of these systems could provide clues on bacterial immunity to phages.
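
The spacer-to-target complementarity described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the toy sequences, and the `max_mismatches` parameter are assumptions made for illustration only. The sketch scans both strands of a target sequence for windows that match a CRISPR spacer within a mismatch tolerance.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_protospacers(spacer: str, target: str, max_mismatches: int = 0):
    """List candidate protospacers for one spacer in one target sequence.

    Hypothetical helper: returns (strand, start, mismatches) tuples, where
    start is the 0-based position in the target as given and strand is "+"
    when the spacer itself matches or "-" when its reverse complement does.
    """
    spacer = spacer.upper()
    target = target.upper()
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for start in range(len(target) - len(query) + 1):
            window = target[start:start + len(query)]
            mismatches = sum(1 for a, b in zip(query, window) if a != b)
            if mismatches <= max_mismatches:
                hits.append((strand, start, mismatches))
    return hits


# Toy example (made-up sequences, not data from this study).
spacer = "GATTACAGG"
target = "TTT" + "GATTACAGG" + "AAA" + "CCTGTAATC" + "TT"
print(find_protospacers(spacer, target))
# [('+', 3, 0), ('-', 15, 0)]
```

Real spacer-target searches typically rely on alignment tools such as BLASTn and on longer spacers, but the principle is the same: a spacer marks a past encounter only if a sufficiently similar sequence can be found in a phage, plasmid, or other MGE.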

In this study, we aimed to broadly describe the distribution and diversity of Erwinia prophages, including the identification of novel sequences. We used PHASTER to scan publicly available Erwinia genomes for prophage-like sequences. Taxonomic assignments and sequence diversity analysis indicated that nearly all identified sequences were unrelated to previously described Erwinia (pro)phages. Based on prophage profiles, we show that prophage occurrence is highly dependent on the Erwinia species and, thus, mainly influenced by the species-specific genetic background. Furthermore, we searched for anti-phage defense systems in Erwinia genomes to acquire information on bacterial susceptibility to known Erwinia phages and to the putative prophages identified in this work.

The data generated here contribute to describing the interplay between phages and Erwinia and could serve as a starting point for future studies regarding Erwinia pathogenicity and environmental fitness.

4. Discussion

In this study, we analyzed the occurrence and diversity of prophage-like sequences in 221 publicly available Erwinia genome assemblies. Over the years, there has been an increasing number of high-quality Erwinia genome assemblies deposited in public databases, reflecting the growth of interest in the genomic characterization of this genus due to its relevance to crop production [35,36]. Many Erwinia species are associated with plant disease [37], to which prophages could contribute relevant genetic features such as virulence genes and superinfection exclusion [38], modulating pathogen aggressiveness. Thus, the extensive description of prophages presented here is the first step to uncovering their impact on Erwinia evolution and ecology.

A comprehensive genomic analysis showed that prophages are common in bacterial genomes, especially in pathogens from the Enterobacterales order [39]. Indeed, putative prophage sequences are pervasive in Erwinia spp., indicating that they could be relevant to bacterial evolution and environmental fitness.

All Erwinia spp. genomes had signatures of prophage-like sequences, indicating that every strain evaluated in this study was infected by lysogenic phages during its evolutionary history or vertically inherited prophage genes. Despite this, the number and completeness of the putative prophage sequences depended strongly on the Erwinia species. While E. amylovora and E. gerundensis had a higher proportion of defective sequences (incomplete and questionable), E. tracheiphila and E. rhapontici were enriched in putative functional prophages (intact sequences). It is important to emphasize that we used only in silico analyses to describe the occurrence of prophages in Erwinia spp., and such analyses are usually limited to detecting prophages whose genes and genomic structures resemble those of previously described phages. Therefore, experimental analyses of prophage induction may reveal new sequences not detected by computational tools.

An in vitro screening for Erwinia phages from the Podoviridae and Myoviridae families indicated that lysogeny is rare in E. amylovora, since no isolate showed evidence of prophages by qPCR and neither spontaneous nor induced release of prophages was observed [40]. Indeed, among the 146 E. amylovora assemblies analyzed, we found that only 12 genomes (8.2%) had intact sequences. The scarcity of putative inducible viral genomes in E. amylovora and E. gerundensis suggests that these species are efficient in inactivating prophages or preventing new integration events. The prophage profile in E. amylovora and E. gerundensis (Figure 2) suggests that their sequences followed an evolutionary trend similar to that observed for other enterobacterial prophages, in which many prophage genes are rapidly lost after genome integration, followed by a slower genetic decay of the remaining genes that could provide adaptive fitness to the cell. In such a scenario, many prophage genes in a given Erwinia species are possibly orthologous and derived from ancestral prophage integration events [41].

In contrast, E. tracheiphila, E. rhapontici, and E. persicina apparently had different dynamics regarding prophage acquisition and evolution. These species harbored a higher number of putative prophages and a higher proportion of intact sequences. Although rapid prophage decay appears to be common, as indicated by the pervasiveness of defective sequences in Erwinia genomes, the acquisition of new prophages seems recurrent in these species, possibly influenced by the absence of DISARM, BREX, and CRISPR-Cas anti-phage defense systems in their genomes. Thus, the domestication of new prophages tends to occur more frequently, and such horizontally acquired DNA may constitute an important evolutionary trait for these Erwinia species.

Furthermore, most of the isolates of a given Erwinia species were sampled in multiple locations. Thus, they are probably subjected to variable abiotic factors that impact phage stability and infectivity [42,43,44,45]. Therefore, the species-specific profiles of prophages observed here are likely an outcome of the genetic background and ecological niche of the Erwinia species and are less influenced by environmental conditions.

It is important to note that some of the Erwinia spp. genomic sequences were highly fragmented, mainly due to the technical limitations of genome sequencing and assembly and to the features of the sequences (e.g., GC content, large homopolymeric regions, and repetitive sequences) [46]. Some prophage-like sequences were located near contig edges and had small sizes, especially within fragmented bacterial assemblies. However, we did not exclude these sequences from the analysis since they were likely derived from prophages. For the purpose of describing prophages in Erwinia, more harm would be done by ignoring the clear signals of prophage occurrence in some bacterial genomes. We wonder whether future releases of prophage-finding tools could provide warnings when candidate prophages are small and lie near contig edges, which may hamper accurate prophage sequence delimitation and completeness estimation.

Contrasting the GC content of hosts and phages is common in phage characterization studies. A comprehensive genomic analysis indicated a linear relationship of GC content between phages and their hosts [30], although such a trend might be missed when analyzing a small subset of the data. We could not detect a direct correlation of GC content between the putative prophage sequences and their Erwinia host genomes. However, GC content did not vary significantly between the putative prophages and their hosts (Table S10), suggesting that GC-content adjustments are in progress during prophage domestication. Furthermore, such small variations in GC content are expected since bacterial genomes naturally have GC-skew and a heterogeneous distribution of nucleotides [47,48], implying that genome segments may have a GC content that is slightly different from the average value.

Sequence-based taxonomic assignment is important to group similar phages from an evolutionary perspective, which may have practical value concerning viruses’ origin, replication mechanism, and life cycle. Most classified putative Erwinia prophages belonged to the Myoviridae family, while fewer were from Siphoviridae and Podoviridae (Figure 4A,B). A similar trend was observed in previous in vitro screenings of phages infecting Erwinia [40,49], which detected only phages from the Caudoviricetes class.

Given that the cryptic filamentous viruses (order: Tubulavirales) are ubiquitous in prokaryotes, we expected to find filamentous prophages in Erwinia genome assemblies [33]. For this purpose, we employed the pipeline “Inovirus_detector” [33]. However, no inovirus-like sequences were retrieved. Furthermore, the highest scores of the Baltimore classification provided by VPF-class indicated that all candidate prophages identified by PHASTER had genomes composed of dsDNA, which was consistent with the absence of Inoviridae prophages within Erwinia assemblies according to the Inovirus_detector tool.

Overall, the set of prophage-like sequences showed a low degree of intergenomic similarity, as evidenced by the high number of species clusters with few sequences each. However, some species clusters contained a relatively high number of putative prophages of E. amylovora (clusters 2, 5, 7, 8, and 11; Table S6), such as cluster 5, which contained 98 defective sequences, each one in a specific E. amylovora isolate. These sequences are possibly orthologous and transferred vertically. Although most prophage-like sequences are considerably divergent, the gene-sharing network indicated that they share many genes (Figure 3), which could be due to horizontal gene transfer, including between prophages from different Erwinia species.

Most of the putative Erwinia prophages described here are not covered in databases, since only three sequences had more than 70% intergenomic similarity to known phage sequences according to VIRIDIC (Table S6). This shows that only a small fraction of Erwinia phages is already described in databases. Thus, this work expanded the sequence space of Erwinia-infecting viruses.

Bacteria possess diverse anti-phage defense systems, employing many mechanisms to reduce infection by lytic or temperate phages [50]. To better understand the profile of prophages observed in Erwinia, we searched for DISARM, BREX, and CRISPR-Cas defense systems. We could not detect putative DISARM systems in the analyzed Erwinia genomes, while the BREX system was detected in only ~2.7% of the genomes, mainly in E. pyrifoliae isolates. However, these Erwinia isolates possessed multiple prophage-like sequences, including intact ones. Since these BREX systems appear complete and are therefore likely functional, the temperate phages that infected E. pyrifoliae probably evaded the BREX mechanism.

Finally, we investigated the occurrence and diversity of CRISPR-Cas systems in Erwinia spp., given their role in bacterial immunity against phages and their status as the only adaptive and heritable defense system known to date [51]. Most Erwinia assemblies (76.9%) had putative CRISPR-Cas systems, a high frequency compared to many other enterobacteria [12]. However, this frequency is biased by the large proportion of E. amylovora isolates in the data set, all of which showed a type I-E CRISPR-Cas system. Many other Erwinia species, such as E. gerundensis, E. rhapontici, and E. tracheiphila, showed no CRISPR-Cas system according to the CrisprCasFinder algorithm. Although they vary greatly across the Erwinia genus, the number and type of CRISPR-Cas systems are well conserved within each Erwinia species (the only exception was Erwinia persicina), suggesting that they are under strong evolutionary constraints after speciation and that their biological relevance must be analyzed in a species-wise manner.

We could not identify most of the protospacers targeted by the Erwinia CRISPR spacers, indicating that a large number of mobile genetic elements (MGEs) remain undescribed and that further environmental microbiome characterization is needed. In addition, accumulated sequence mutations could hamper alignments between CRISPR spacers and their targets, hindering both spacer functionality and protospacer identification.

E. amylovora and E. pyrifoliae are closely related species and show similar disease symptoms [2]. Using sequences from public databases, we found that plasmid sequences were the most identifiable targets of the CRISPR spacers from these species, indicating that these MGEs may represent a more significant threat than phages in natural environments. The colonization niche of E. amylovora and E. pyrifoliae is the aerial parts of Rosaceae [2], in which they are in contact with other bacteria, possibly other Erwinia populations [52]. Indeed, the BLASTn best hits indicated that most of the plasmids targeted by CRISPR spacers in E. amylovora derived from isolates of this species (Table S11).

Interestingly, the Erwinia tasmaniensis phage phiEt88 sequence and the plasmid pET35 from E. tasmaniensis ET1/99 were prevalent among the targets of E. amylovora and E. pyrifoliae CRISPR spacers. E. tasmaniensis is a non-phytopathogenic species [53], which might act as an antagonist of plant pathogenic Erwinia. Likewise, it is known that non-phytopathogenic epiphytic Pantoea species can be infected by Erwinia phages and used as carriers to introduce phages into populations of plant-pathogenic Erwinia [54,55]. Owing to the prevalence of MGEs from E. tasmaniensis as putative targets of E. amylovora and E. pyrifoliae CRISPR spacers, we wonder whether similar events might occur between these Erwinia species, with an intimate association and frequent MGE exchange between bacterial populations.

Furthermore, the Erwinia CRISPR-Cas system targeted the candidate prophages identified by PHASTER, regardless of their completeness. The vast majority of the CRISPR spacers and their target prophage-like sequences occurred in different genome assemblies (2193/2197 = 99.8%), indicating that many putative prophages are active in the environment, possibly acting as temperate phages that infect or are repelled by Erwinia.
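
The cross-assembly proportion reported above can be reproduced from a table of spacer-target matches. The following Python sketch is an illustration under stated assumptions, not the analysis code of this study: the function name and the assembly labels are hypothetical, and the toy input only mimics the reported counts (2193 of 2197 matches between different assemblies).

```python
from typing import Iterable, Tuple


def cross_assembly_fraction(hits: Iterable[Tuple[str, str]]):
    """Count spacer-target matches whose spacer and target lie in different assemblies.

    Each hit is a (spacer_assembly, target_assembly) pair of identifiers.
    Returns (cross_assembly_hits, total_hits, fraction).
    """
    hits = list(hits)
    if not hits:
        raise ValueError("no spacer-target hits provided")
    cross = sum(1 for spacer_asm, target_asm in hits if spacer_asm != target_asm)
    return cross, len(hits), cross / len(hits)


# Placeholder input that only reproduces the counts quoted in the text;
# "asm_A" and "asm_B" are made-up assembly labels.
toy_hits = [("asm_A", "asm_B")] * 2193 + [("asm_A", "asm_A")] * 4
cross, total, fraction = cross_assembly_fraction(toy_hits)
print(f"{cross}/{total} = {fraction:.1%}")
# 2193/2197 = 99.8%
```

In practice, each pair would come from a spacer-versus-prophage alignment table, with the assembly identifier parsed from the spacer and prophage sequence names.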

Most of the E. amylovora isolates had CRISPR spacers directed to the prophage-like sequences (139/146 = 95.2%), correlating well with the low relative abundance of intact sequences in this species. Intact prophage sequences are regarded as “true” phages, generally possessing a complete or near-complete set of genes necessary to virus metabolism [56]. Since CRISPR-Cas systems prevent virus infection, new phages are unlikely to infect the cell. On the other hand, E. pyrifoliae did not show any spacer targeting the prophage-like sequences, even though all isolates had two different types of CRISPR-Cas systems with many CRISPR arrays. Since E. pyrifoliae assemblies harbored multiple putative prophages, we hypothesized that their CRISPR-Cas systems are somewhat inefficient in collecting spacers against these MGEs. Thus, a bias of protospacer acquisition from phages may exist depending on the Erwinia species, possibly influenced by the impact of prophages on cell fitness or by virus strategies to circumvent CRISPR-Cas defense, such as DNA modifications and protein inhibitors [57].

We cannot guarantee that the CRISPR spacers were acquired specifically from the prophage sequences, since viruses generally show high levels of genetic mosaicism, sharing genome segments even between distantly related phages [58]. However, owing to the restricted ecological niche of Erwinia and the probable lower richness of dsDNA viromes in aerial parts of plants compared to other environments, it is very likely that the CRISPR spacers are directed to the putative Erwinia prophages.
