Elucidating the picocyanobacteria salinity divide through ecogenomics of new freshwater isolates

A new picocyanobacterial dataset of freshwater, brackish, and marine strains

To expand the coverage of available freshwater picocyanobacterial genomes, we obtained cultures from various lakes and reservoirs via an isolation campaign spanning more than 5 years. The isolates came from several continents and regions (north Asia, west and central Europe, south-east Oceania, Central/North America, South America, and even Antarctica) that reasonably cover the entire globe (Fig. 1). They also covered a range of trophic regimes, from oligotrophic lakes like Lake Baikal (Russia), Lake Nahuel Huapi (North Patagonia, Argentina), Lake Maggiore (Italy), or Tous reservoir (Spain), through mesotrophic habitats such as Amadorio reservoir (Spain), to hypertrophic ones such as Lake Chascomús (Argentina), and encompassed cold and glacial lakes, temperate reservoirs, and tropical lakes. In total, we sequenced the genomes of 58 new isolates, resulting in 32 new species (based on >95% average nucleotide identity, ANI), which greatly expands the current number of complete or near-complete freshwater picocyanobacterial genomes (Additional file 1: Table S1).

Fig. 1

Distribution of the 58 new picocyanobacterial isolates obtained in this study. The number of sequenced genomes is shown in brackets for each location, which is color coded with a red star. Additionally, the names of each isolate, lake, and region of origin are included. Isolates are also coded with green (PC-rich) or pink stars (PE-rich) according to their pigment type composition

Given the phylogeny of the isolates obtained (see “Phylogeny of the new freshwater picocyanobacterial isolates” section below), this work only compares cluster 5 picocyanobacteria, specifically from the Synechococcus and Cyanobium genera, comprising isolates from different SCs (5.1, 5.2, 5.3) obtained from marine, brackish, and freshwater environments. A parallel work compares cluster 5 picocyanobacteria (α-cyanobacteria) with other unicellular Synechococcus-like strains (β-cyanobacteria) [47]. We used only complete or near-complete (draft) genomes derived from cultures to minimize the bias introduced by incomplete genomes derived from metagenomes (MAGs) and single cells (SAGs). A summary of the main genomic features, origin, and references for the 132 marine, brackish, and freshwater genomes used in this comparison is shown in Additional file 1: Table S1. To clarify the origin of the compared strains, marine, brackish, and freshwater designations were made according to two criteria: (i) their origin and (ii) their tested growth or selection in artificial media. Thus, a marine strain was defined as one isolated from a coastal marine/pelagic open ocean environment that can only grow in marine medium (i.e., >35 g/L NaCl), e.g., Synechococcus sp. WH8102 [15] or Synechococcus sp. RCC307 [16] from SCs 5.1 and 5.3, respectively. We define as euryhaline/halotolerant those strains that were isolated from brackish/estuarine/coastal systems with intermediate salinities lower than the ocean such as the western Black Sea (18–22 g/L), Pearl River estuary, China (14–35 g/L), or Chesapeake Bay, USA (25–30 g/L). We also define as halotolerant those strains that grow in an intermediate salinity medium (5–30 g/L) or across a wide range of salinities, for instance strains BSA11S/BSF8S [30, 31], Synechococcus sp. WH5701/RS9917 [16] or LTW-R [29].
Finally, freshwater strains are defined as those that were isolated exclusively using BG-11 medium and have a freshwater pelagic origin, e.g., S. lacustris and C. usitatum Tous [33], from SCs 5.3 and 5.2, respectively. These latter strains have also been grown at different salinities, showing optimal growth in BG-11 medium and dying within days at salt concentrations >3–5 g/L [33]. We acknowledge, though, that since not all 132 isolates have been tested for growth across a range of salinities, some isolates may be mis-categorized. However, the above definitions provide a working framework that should be followed until the precise salinity growth ranges of all picocyanobacteria are known.

Phylogeny of the new freshwater picocyanobacterial isolates

To assess the phylogeny of our newly isolated and sequenced freshwater strains, we constructed a 365-protein concatenated phylogenomic tree including all existing marine, brackish, and freshwater culture-derived cluster 5 picocyanobacterial isolates (Fig. 2). As expected, none of our strains fell inside marine SC 5.1, but rather they all affiliated within either SC 5.2 or 5.3. Our isolates include five new strains within the S. lacustris clade [33] from lakes and reservoirs other than Tous reservoir (Spain), where the original culture was retrieved. These new isolates came from Lake La Cruz (Spain), Loriguilla reservoir (Spain), and Lake Maggiore (Italy). We also identified a new, closely related species (Cruz CV12-2-Slac-r, albeit with only 76–77% ANI to S. lacustris genomes), thus delineating two distinct clades with a threshold at 90% ANI (Additional file 2: Fig. S1). All of these strains fall inside SC 5.3 and show very low ANIs compared to all SC 5.1 and 5.2 strains as well as the SC 5.3 Mediterranean Sea isolates MINOS11 and RCC307 (68–72% ANI). Our new SC 5.2 freshwater isolates span a wide diversity (70–95% ANI) and confirm that this SC comprises brackish/estuarine isolates as well as freshwater strains. Applying a 90% ANI threshold, we reveal 25 distinct clades inside SC 5.2, each forming a genetically differentiated species whose members share >90% identity. We also include new strains affiliated to previously studied picocyanobacteria such as Cyanobium usitatum [33], Cyanobium gracile [39], Vulcanococcus limneticus [34], or Synechococcus sp. WH5701 [16]. Indeed, isolates closely related to Cyanobium usitatum (>90% ANI) were obtained from multiple lakes ranging from cold to temperate as well as from (ultra) oligotrophic to meso-eutrophic conditions. These new isolates mainly came from the cold ultraoligotrophic lakes Baikal (Siberia, Russia) and Wakatipu (New Zealand), oligotrophic Lake Maggiore (Italy), and the meso-eutrophic Loriguilla and Amadorio reservoirs (Spain).
Hence, the cosmopolitan nature of both Cyanobium usitatum and Synechococcus lacustris [33] is reflected here by the globally distributed new isolates we obtained.
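
The ANI-based delineation of clades described in this section can be illustrated with a minimal Python sketch. It assumes that genomes are grouped by single linkage (two genomes fall in the same clade whenever they are connected by a chain of pairwise ANI values at or above the threshold); the text does not state the grouping procedure actually used, so this linkage rule, the function name `cluster_by_ani`, and the toy ANI values are all hypothetical.

```python
from itertools import combinations


def cluster_by_ani(genomes, ani, threshold=90.0):
    """Group genomes by single linkage on pairwise ANI (an assumed rule).

    genomes: list of genome names.
    ani: dict mapping (name_a, name_b) to ANI in percent, in either order.
    threshold: pairs with ANI >= threshold are placed in the same clade.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= threshold:
            parent[find(a)] = find(b)

    clades = {}
    for g in genomes:
        clades.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clades.values())


# Hypothetical toy values, not taken from the study.
toy_ani = {("A", "B"): 98.2, ("A", "C"): 76.5, ("B", "C"): 77.0}
clades_90 = cluster_by_ani(["A", "B", "C"], toy_ani, threshold=90.0)
print(clades_90)
```

Raising `threshold` to 95.0 would mimic the species-level cutoff mentioned earlier in the text; with real data the ANI dictionary would come from whatever pairwise ANI tool was used.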

Fig. 2

Phylogenetic analysis of all 58 new freshwater picocyanobacterial isolates as well as previously isolated brackish and marine isolates. The phylogeny was rooted at the S. elongatus and PCC clade. Prochlorococcus and Ca. Synechococcus spongiarum spp. were added to complete the phylogeny. In total, 365 core universal proteins were used to make the phylogeny (PhyloPhlAn3.0). Bootstrap values >50 are circle color coded. Marine, brackish/estuarine/halotolerant, or freshwater species are indicated in blue, green, and red, respectively. The size of the circle is used as a proxy of the genome size. %GC content is indicated by the filled circle portions

Main genomic features across habitats and sub-clusters

To complement the abovementioned phylogenetics and ANI results (Fig. 2 and Additional file 2: Fig. S1), we plotted genome size, %GC content, and median intergenic spacers from all 132 genomes analyzed (Fig. 3 and Additional file 1: Table S1). We also performed single-pair ANOVA tests between habitats (marine, brackish, freshwater) and SCs (5.1, 5.2, 5.3) to statistically assess differences in their genome size, %GC content, and median intergenic spacers (Additional file 3: Additional dataset 1). We observed a generally smaller genome size (2.53±0.23 Mb on average) and lower %GC content (58.58±3.19%) across all isolates from SC 5.1 (p-value <0.05) compared to their SC 5.2 and SC 5.3 counterparts. These mostly open ocean isolates from off-shore oligotrophic waters are the most widespread and possess the smallest Synechococcus genomes (<2.5 Mb) encountered so far [16, 19, 20, 21]. However, there are a few freshwater strains from SC 5.3 (S. lacustris group) that have the lowest %GC content (ca. 52–53%) and smallest median intergenic spacers (20–25 bp) of all the cluster 5 picocyanobacteria (p-value <0.05) and are also smaller than 2.6 Mb (Fig. 3, Additional file 1: Table S1), as previously noted [32, 33]. In particular, strains Cruz CV12-2 and Cruz CV12-2-Slac-r have the smallest genome sizes. A handful of freshwater strains from SC 5.2, such as HWJ4-Hawea/WAJ14-Wanaka (from New Zealand lakes) and representatives closely affiliated to the Cyanobium usitatum species in SC 5.2 [33], originating mostly from oligotrophic lakes, are also in this size range (<2.6 Mb).
All of these SC 5.2/5.3 strains are examples of cosmopolitan small genomes that have colonized a wide range of freshwater systems around the globe, from cold ultraoligotrophic to temperate mesotrophic habitats [33]. It is particularly relevant that these small-sized strains (comparable to marine isolates) abound in oligotrophic freshwater systems, which share a similar trophic status to that observed in the ocean.

Fig. 3

Genome size (Mb) versus %GC content, number of coding sequences (CDS), and median intergenic spacer (bp) plots between picocyanobacterial SCs. Each genome is color coded according to the habitat of origin and shape coded according to the SC to which it belongs

However, despite the abovementioned exceptions, we generally observed that freshwater and brackish genomes showed a higher %GC content (ca. 64% on average) and larger estimated genome size (average 2.69 Mb for brackish and 2.9 Mb for freshwater strains, with SD of 0.37 and 0.41 Mb, respectively), covering a range between 2 and 4 Mb, a considerably larger range than shown for marine isolates (2.2 to 3.5 Mb) (ANOVA, p-value <0.05). Thus, overall genome reduction appears to be much more prevalent in marine representatives.
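
The single-pair ANOVA comparisons reported in this section can be sketched as follows. This is a minimal pure-Python illustration of the one-way ANOVA F statistic for two groups; the function name `one_way_anova_f` and the genome sizes are hypothetical and are not the study's data. Converting F into a p-value requires the F distribution, which a statistics library would normally supply (for example `scipy.stats.f_oneway`), so it is left out here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical genome sizes in Mb, for illustration only.
marine_sizes = [2.2, 2.4, 2.5, 2.6]
freshwater_sizes = [2.6, 2.9, 3.1, 3.4]
f_stat, df_between, df_within = one_way_anova_f([marine_sizes, freshwater_sizes])
print(f_stat, df_between, df_within)
```

The same function accepts three or more groups, which would correspond to testing all habitats or all SCs at once rather than pair by pair.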

The shared and flexible genome of Synechococcus-Cyanobium picocyanobacteria

To determine how the shared and flexible genome differed between picocyanobacterial SCs and habitats (Fig. 4), we used complete or near-complete (draft) genomes derived from cultures to minimize any bias when detecting genes belonging to the strict core, soft core, shell, and cloud [48]. Note that this analysis compares a set of microbes belonging to three SCs that span 67–99% ANI, and hence, we are comparing pangenomes of genomically distant populations at the level of genus and family, which should not be confused with strain-level pangenomics. Comparing the meta-pangenome of marine, brackish, and freshwater isolates (Fig. 4) as a whole, we observed a higher percentage of strict core and soft core genes in marine (32% strict core and 41.5% soft core) and brackish (28.95% strict core and 38.8% soft core) strains compared to freshwater strains (14.3% strict core and 35.4% soft core), consistent with the aforementioned genome reduction in marine picocyanobacteria. Individual comparisons showed that marine strains possessed the highest number of strict core and soft core genes (1170 and 1517, respectively), followed closely by brackish representatives (971 and 1303, respectively), with freshwater strains far behind (504 and 1240 genes, respectively). These higher values of the persistent/shared genome were also observed when the Prochlorococcus and Synechococcus pangenome was compared [16]. Overall, our data suggest that freshwater picocyanobacteria have a greater diversity and gene pool compared to their salt-adapted counterparts.
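
The four pangenome compartments named above can be illustrated with a short Python sketch that assigns each gene cluster to a category from the number of genomes in which it occurs. The cutoffs used here (strict core: all genomes; soft core: at least 95% of genomes; cloud: at most two genomes; shell: everything else) are assumptions chosen for illustration and may not match the settings used in the analysis; the function name and cluster identifiers are hypothetical. The categories are treated as non-overlapping, in line with how the strict core and soft core percentages are added together in the text.

```python
def classify_clusters(occupancy, n_genomes, soft_core_percent=95, cloud_max=2):
    """Assign each gene cluster to a pangenome compartment.

    occupancy: dict mapping cluster id to the number of genomes containing it.
    n_genomes: total number of genomes compared.
    soft_core_percent, cloud_max: assumed cutoffs, not the study's settings.
    """
    categories = {}
    for cluster_id, n_present in occupancy.items():
        if n_present == n_genomes:
            categories[cluster_id] = "strict core"
        elif n_present * 100 >= soft_core_percent * n_genomes:
            # Integer arithmetic avoids floating-point edge cases at the cutoff.
            categories[cluster_id] = "soft core"
        elif n_present <= cloud_max:
            categories[cluster_id] = "cloud"
        else:
            categories[cluster_id] = "shell"
    return categories


# Hypothetical occupancy counts for four clusters across 20 genomes.
toy_occupancy = {"c1": 20, "c2": 19, "c3": 10, "c4": 1}
toy_categories = classify_clusters(toy_occupancy, n_genomes=20)
print(toy_categories)
```

Counting how many clusters land in each category and dividing by the total gives percentages of the kind reported for each habitat.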

Fig. 4

Meta-pangenome analysis of marine, brackish, and freshwater picocyanobacteria from different SCs conducted using the GET_HOMOLOGUES package. The shared genome content is divided into strict core and soft core while the flexible genome is divided into shell and cloud categories. Each category is color coded

We next repeated these calculations between SCs, this time regardless of salinity origin (Fig. 4). We found relatively high shared gene content (ca. 28 and 37% for strict and soft core, respectively) when comparing the meta-pangenomes of SCs 5.1 (isolates solely of marine/brackish origin) and 5.3 (including both marine and freshwater isolates). This is particularly interesting since SC 5.3 members, which comprise marine strains like RCC307 and freshwater strains like S. lacustris, are quite far apart in terms of ANI (67–72%) compared with SC 5.1 and 5.2 isolates. When we included SC 5.2 in the analysis (compared to SCs 5.1 and 5.3), the total number of shared genes was drastically reduced to 10–12% for strict core and 34–36% for soft core, which likely reflects the large genetic diversity present in SC 5.2, encompassing strains that span the salinity divide and have a wide range of genome sizes (Figs. 2 and 3 and Additional file 1: Table S1).

Analyzing genomes from all three SCs and habitats together (Fig. 4, Additional file 4: Fig. S2A and Additional file 5: Additional dataset 2), we obtained the smallest strict core (351 genes, 10.7% of the total) and soft core (1190 genes, 36.3% of the total) gene set, which represents 47% of the total genomic repertoire. These results led us to determine a strict picocyanobacterial core genome curve, which stabilized at ca. 350 genes, while the meta-pangenome curve comprised >35,000 genes and was far from reaching a plateau (Additional file 4: Fig. S2B). This trend was also observed in a pangenomic study of marine SC 5.1 Synechococcus [16]. As expected, >80% of all genes belonging to strict and soft core were all related to amino acid biosynthesis (ca. 7.5 %), protein metabolism (ca. 10–13%), carbohydrates (6.8%), cell division and cell wall biosynthesis (5%), photosynthesis (ca. 2.5%), DNA/RNA metabolism (7%), or fatty acid and lipid metabolism (ca. 3%) (Additional file 4: Fig. S2C). On the other hand, >80% of the genes associated with the shell and cloud categories were labelled as other categories based on SEED, which exemplifies the enormous number of hypothetical and unknown functions in the flexible compartment of these microbes. A list with SEED annotation for all these four pangenome categories is shown in Additional file 6: Additional dataset 3.

General features of the picocyanobacterial proteome

We next assessed the variation in isoelectric points (pI) in whole proteomes and constructed a principal coordinates analysis (PCO) based on a Bray-Curtis resemblance matrix for all 132 marine, brackish, and freshwater picocyanobacteria (Fig. 5A) building on a previous study which suggested that the changes at the level of protein amino acid composition and pI constitute a way to predict the preferred habitat of the different microorganisms [49]. The general phenomenon observed pointed towards a more acidic pI in all marine isolates compared to a more neutral and basic pI in freshwater isolates. Brackish, halotolerant, and estuarine picocyanobacteria showed either a pattern more related to marine isolates (e.g., WH5701, RS9917, RS9916, BS56D) or freshwater strains (e.g., NIES-98, NS01, or BSA11S/BSF8S).
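
The comparison of whole-proteome pI profiles can be sketched in Python as follows. The example bins per-protein pI values into 28 half-unit classes from 0 to 14 (the scheme given in the Fig. 5 legend), converts the counts to relative frequencies, and computes the Bray-Curtis dissimilarity between two profiles. It assumes the pI of each protein has already been computed by some other tool; the function names and the pI values are hypothetical.

```python
def pi_profile(pi_values, bin_width=0.5, max_pi=14.0):
    """Relative frequency of proteins per pI bin (28 bins by default)."""
    n_bins = int(max_pi / bin_width)
    counts = [0] * n_bins
    for pi in pi_values:
        # A pI of exactly 14 is placed in the last bin.
        index = min(int(pi / bin_width), n_bins - 1)
        counts[index] += 1
    total = len(pi_values)
    return [c / total for c in counts]


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two non-negative profiles."""
    numerator = sum(abs(a - b) for a, b in zip(p, q))
    denominator = sum(a + b for a, b in zip(p, q))
    return numerator / denominator


# Hypothetical per-protein pI values for two toy proteomes.
marine_like = pi_profile([4.5, 4.8, 5.1, 5.3, 9.6])
freshwater_like = pi_profile([5.2, 6.9, 7.4, 9.8, 10.1])
distance = bray_curtis(marine_like, freshwater_like)
print(distance)
```

Computing this dissimilarity for every pair of genomes yields the resemblance matrix on which an ordination such as a PCO would then be run.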

Fig. 5

A Upper panel: Whole-proteome isoelectric points (pI, x-axis) versus relative frequency (y-axis) among different picocyanobacteria. Habitats are color coded accordingly. Lower panel: PCO plot based on a Bray-Curtis dissimilarity resemblance matrix obtained from the relative frequencies of 28 pI values (increments of 0.5 from 0 to 14). Each habitat is symbol and color coded accordingly. The SC to which each isolate belongs is also represented. B Whole-proteome pI comparison between close-phylogenetic neighbors. We provide a small inset of the phylogeny, AAI and ANI values. Dotted lines show the freshwater representative. Straight lines show the marine/brackish representative. A small inset of their phylogenetic affiliation is shown to highlight these pairs are the closest salt-adapted versus freshwater picocyanobacteria sequenced so far. Freshwater (red), brackish (green), and marine (blue) isolates are color coded accordingly

These differences were also analyzed within sub-clades, where we compared close phylogenetic neighbors (with the highest ANI and AAI values whenever possible) from different salinity types (Fig. 5B). In so doing, we aimed to reduce any taxonomic signal to a minimum. We performed four different comparisons: (i) RCC307 (Mediterranean Sea, marine) with S. lacustris and CV12-2-Slac-r (Tous reservoir and Lake La Cruz, respectively, freshwater), all from SC 5.3; (ii) Tobar-12-5-g (Lake El Tobar, freshwater) and SynAce (Ace Lake, brackish), both SC 5.2 representatives; (iii) CB0101 (Chesapeake Bay, brackish/estuarine) compared to Hayes-HJ21 (Lake Hayes, freshwater), both SC 5.2 representatives; (iv) WH5701 (Long Island Sound, brackish) compared to 1G10 (Nahuel Huapi, freshwater), also both SC 5.2 representatives. All these comparisons confirmed more acidic pIs in the salt-adapted strains, while freshwater strains exhibited the highest neutral and basic pI peaks. Moreover, these comparisons also highlighted higher AAI values compared to ANI values in all cases, as previously noted [49].

Habitat and picocyanobacterial sub-cluster specific metabolism in terms of gene/protein presence/absence

To better understand what metabolic capacities differentiate salt-adapted and freshwater picocyanobacteria, we compared the presence/absence of various genes/proteins between habitats and SCs (Additional file 7: Table S2). Metabolic capacity was assessed using Cyanorak (CK) clusters [28], comparing 67 freshwater, 17 brackish, and 48 marine origin genomes sub-divided into 51 SC 5.1, 72 SC 5.2, and 9 SC 5.3 genomes. We verified CK annotations using KEGG, SEED, and EC numbers and assigned PSSMs based on CDD/SPARCLE. Based on all of the homology matches with the abovementioned CK database, we determined the presence/absence of each gene/protein variant. However, we must clarify that while a specific gene set may be absent in genomes obtained from one habitat type, this does not rule out that habitat type possessing a different gene set to do the same job, especially given the number of hypothetical proteins whose function remains unknown. Moreover, this work deals with a new set of draft genomes that are not closed into a single contig. Hence, there could be a few genes/proteins present at the edges of broken contigs that are not detected. The 14,062 genes not present in Cyanorak clusters, mostly from the novel freshwater strains described here and representing the accessory/flexible genome (shell and cloud categories), were annotated with the latest version of the NCBI nr database (Additional file 8: Additional dataset 4). A PCO and a clustering plot (Fig. 6) based on the presence/absence (Kulczynski index) of all genes derived from Additional file 7: Table S2 were also obtained. As depicted in Fig. 6, marine and freshwater picocyanobacteria grouped separately based on their gene presence/absence, with a clear separation between marine sub-clusters 5.1A and B as well as between freshwater strains. The latter comprised the majority of freshwater/brackish isolates from SC 5.2 that grouped separately from the abovementioned smaller genomes of the cosmopolitan S. lacustris (SC 5.3), Cyanobium usitatum (plus related Cyanobium spp. from SC 5.2), and New Zealand Hawea/Wanaka strains from SC 5.2. Subsequent habitat and sub-cluster-specific genes/proteins are shown in Additional files 9, 10, 11, 13, 14, 15, 16, 17, 19, and 21: Tables S2-S12, Fig. 7 and are discussed below for each type of metabolism where there were ecologically significant similarities and differences:

Fig. 6

A Clustering and B PCO plots obtained from a resemblance matrix based on Cyanorak (CK) gene presence/absence (Kulczynski index). Both plots comprise all 132 picocyanobacteria labelled according to their habitat of origin and SCs. Overlayed clusters from 75 to 95% of similarity are shown in the PCO plot, determining the % of shared features between genomes
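
The resemblance measure behind these plots can be illustrated with a small Python sketch. Several variants of the Kulczynski index exist; the version below is the form commonly applied to presence/absence data, namely the mean of the two fractions of shared genes relative to each genome's own gene count. Whether the software used for the analysis applies exactly this form is an assumption, and the function name and cluster identifiers are hypothetical.

```python
def kulczynski_similarity(genes_a, genes_b):
    """Kulczynski similarity for presence/absence data (assumed variant).

    genes_a, genes_b: iterables of gene cluster identifiers present in
    each genome. Returns a value between 0 and 1.
    """
    a, b = set(genes_a), set(genes_b)
    if not a or not b:
        return 0.0
    shared = len(a & b)
    # Average of the shared fraction seen from each genome's perspective.
    return 0.5 * (shared / len(a) + shared / len(b))


# Hypothetical cluster identifiers for two toy genomes.
genome_1 = {"CK_A", "CK_B", "CK_C", "CK_D"}
genome_2 = {"CK_A", "CK_B", "CK_E"}
similarity = kulczynski_similarity(genome_1, genome_2)
print(similarity)
```

Because each shared fraction is scaled by the genome's own gene count, this measure is less dominated by differences in genome size than a plain count of shared genes, which is relevant when comparing genomes ranging from about 2 to 4 Mb.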

Fig. 7

Picocyanobacterial habitat and sub-cluster (SC)-specific gene/protein presence/absence. Each habitat and SC are color coded accordingly. Presence/absence is based on total percentages of genomes that possess each gene/protein based on Cyanorak clusters (CK). We used 67 freshwater, 17 brackish, and 48 marine genomes sub-divided into SC 5.1 (51 genomes), SC 5.2 (72 genomes), and SC 5.3 (9 genomes)

Sulfur metabolism

Sulfur is one of the most abundant elements in seawater, not only in the form of sulfate but also in other forms like DMS and DMSP [50]. Conversely, it is generally much less abundant in freshwater systems [51], where it may be limiting for microbial life (e.g., in Lake Baikal [52]). Thus, we might expect a greater capacity for sulfur acquisition in freshwater isolates. Indeed, we found that the genomes of freshwater picocyanobacterial strains specifically harbored additional rhodaneses (CK_00007139), which catalyze the detoxification of cyanide through its conversion to thiocyanate, or (aryl) sulfatases such as sulfatase subfamily S1 (CK_00006730), involved in the transformation of phenol sulfate and water to phenol and sulfate, both of which were absent in marine strains (Fig. 7 and Additional file 9: Table S3). Also of particular relevance here is the CysWT sulfate transporter, which is required for optimal growth of the freshwater strain Synechococcus elongatus [53] and was initially detected in some of the first freshwater picocyanobacterial MAGs from SC 5.3 [32]. This transporter has been detected mostly in freshwater and terrestrial cyanobacteria [54] rather than in marine strains. Here, we detected the CysWT and CysPA sulfate transport system in over 50% (38/67) of the freshwater strains analyzed, mostly in members of SC 5.2 (albeit a few S. lacustris strains of SC 5.3 also possessed it), while it was completely absent from marine strains and present in only 1/17 brackish strains. Another sulfate ion transporter (CK_00009119) was present in 36/67 freshwater strains from SC 5.2 and 7/17 brackish strains, but interestingly, it was present in all strains from marine SC 5.3 and clades V, VIa/b from SC 5.1. On the other hand, sulfate permeases/transporters such as Sul1 (CK_00001149) or Sul3 (CK_00056721) were present in all marine strains and most brackish strains (15/17 and 9/17 for Sul1 and Sul3, respectively).
However, only 26/67 and 8/67 freshwater strains harbored Sul1 and Sul3, respectively. Conversely, genes for assimilatory sulfate reduction were present in all the picocyanobacterial genomes analyzed (Additional file 9: Table S3), including phosphoadenylyl-sulfate reductase [thioredoxin] (CK_00001149), adenylylsulfate kinase (CK_00000454), and sulfite reductase (CK_00000887).

Nitrogen metabolism

Various studies have shown that nitrogen (particularly fixed forms such as ammonia and nitrate) is, together with P, the main limiting nutrient for phytoplankton growth [55]. This is consistent with the presence of ureases (123/132 possess the entire ure cluster), nitrate/nitrite reductases (120/132 harbor nirA, 119/132 possess narM and 118/132 contain narB genes), and ammonia permeases (amt1 is present in all marine, 51/67 freshwater and 16/17 brackish strains; amt2 is present in 68/132 genomes, particularly in 54/67 freshwater strains) in most but not all marine [19], freshwater, and brackish picocyanobacteria (Table S4). However, they all possess the global nitrogen regulator NtcA (CK_00000468) as well as the PII protein (glnB - CK_00000186). Interestingly, various freshwater (12/67) and brackish (3/17) isolates contain a second PII copy (glnB2 - CK_00041583) (Additional file 10: Table S4). It seems possible that freshwater strains have evolved additional copies of this regulator together with additional glutamine synthetases (see amino acid section below) to cope with the variable nitrogen levels present in lakes of different trophic status.

On the other hand, while we found that the ability to degrade cyanate into ammonia and CO2 via cyanate hydratase was a common feature of marine and brackish representatives, as previously noted [19, 56], this enzyme was present in only about half (38/67) of the freshwater strains (Additional file 10: Table S4). It is possible that the prevalence of this bicarbonate-dependent enzyme in marine strains is correlated with the relative stability of ocean pH (generally ca. pH 8.2±0.3) [57], a feature that is much more variable in freshwater systems, e.g., from neutral to slightly alkaline in Lake Baikal [52] and Spanish reservoirs, meromictic Lake La Cruz (Spain) from which we have isolated different strains [58], or acidic as in some French reservoirs [59]. Moreover, among all the compared planktonic picocyanobacteria, the only strain harboring a nitrogenase was the freshwater isolate V. limneticus spp., which acquired the nif operon via HGT [34]. Apart from this one exception, none of the picocyanobacteria analyzed here showed the ability to fix nitrogen. Finally, there were specific nitrate/nitrite transporters for marine and brackish strains such as the nitrate transporter nrtP (CK_00001676) and the focA nitrite transporter (CK_00001669) [60], which were absent in all freshwater strains. Conversely, freshwater isolates harbored the nrt ABC transporter, a well-defined nitrate/nitrite transporter in S. elongatus [61]. The exact reason why marine and freshwater microbes harbor different transporters for the same nutrients (either S or N) is unknown.

Phosphorus metabolism

Phosphorus (P) is another potentially limiting nutrient for picocyanobacterial growth across both marine and freshwater systems [62,
