Fossil-calibrated molecular clock data enable reconstruction of steps leading to differentiated multicellularity and anisogamy in the Volvocine algae

Selection of orthologues in taxa across the Archaeplastida provides the basis for inferring how multicellularity and cellular differentiation evolved in the volvocine algae. We sampled a total of 164 taxa representing all major clades of Archaeplastida: Rhodophyta, Streptophyta, and Chlorophyta. Specifically, we sampled 12 rhodophytes across 3 red algal subclades, 45 streptophytes representing 13 major green algal and land plant subclades, and 107 chlorophytes, representing prasinophytes, Ulvophyceae, Trebouxiophyceae, and Chlorophyceae, including 68 unicellular and multicellular volvocine algae (Fig. 2, Additional File 1: Figure S1, Figure S2, and Figure S3). We chose the red algae for our outgroup, as multiple studies indicate that they are sister to the green algae + land plants (Embryophyta), and that the red and green algae share a common plastid ancestor [34, 35]. Although our chief aim was to discern, on a geological timescale, the evolution of traits leading to multicellularity and differentiation in the volvocine algae, we included other green algal genera such as Coleochaete and Tetraselmis to further illuminate the history of Viridiplantae. The former is generally considered to be relatively closely related to land plants [36, 37], while the latter is believed to be sister to the three major Chlorophyta clades [38]. Because all multicellular lineages necessarily evolved from unicellular ancestors, clarifying which lineages are sister to which multicellular clades will enable a deeper understanding of how this major transition occurred.

Fig. 2

Time-calibrated phylogeny of the Archaeplastida. Branching order of the tree was inferred under maximum-likelihood analysis from an aligned, concatenated amino acid dataset of 263 nuclear genes. Numbers on branches represent bootstrap and posterior probability values, respectively. Branch lengths, corresponding to time, were inferred under the CIR relaxed clock model using the 16 most clock-like genes as determined by Sortadate. Blue bars correspond to the inferred 95% HPD interval for each node. Red bubbles correspond to calibrated nodes (Table 1), and blue bubbles correspond to key divergences among the volvocine algae (Figure S4B). Members of the multicellular volvocine algae (Tetrabaenaceae, Goniaceae, and Volvocaceae) are denoted in orange, purple, and green. Taxa in black font are unicellular volvocine algae.

A single dataset consisting of 263 single-copy, protein-coding genes was compiled and analyzed under maximum-likelihood (ML), Bayesian inference (BI), and coalescent-based (CB) phylogenetic methods. These 263 genes were conserved across the three main clades of Archaeplastida. Our concatenated alignment for ML and BI analyses comprises 79,844 amino acids, equivalent to 239,532 nucleotide positions, with a total of 62,106 parsimony-informative sites. All raw reads used to complete our single-gene and concatenated alignments encompassing 164 total taxa were mined from previously published data located in public repositories (Additional file 2: Table S1) [11, 20–22, 24, 39–83].

Phylogenetic analyses of the volvocine algae indicate multiple independent origins of multicellularity and differentiation. Our ML, BI, and CB analyses all indicate two independent origins of multicellularity among the volvocine algae: one in the lineage leading to the Tetrabaenaceae and another in the lineage leading to the Goniaceae + Volvocaceae (Fig. 2, Additional file 1: Figure S1, Figure S2, and Figure S3). These results bolster the findings of Lindsey et al. [11] and Ma et al. [12]. As before, our ML and BI analyses have identical branching orders. The Tetrabaenaceae + Vitreochlamys ordinata are shown to be sister to several Chlamydomonas species + the Goniaceae and Volvocaceae in our ML and BI trees (maximum-likelihood bootstrap support (MLBS) = 100, Bayesian posterior probability (BPP) = 1.0) (Fig. 2, Additional file 1: Figure S1, and Figure S2), and this topology both corroborates and expands on those presented in several earlier studies [11, 12]. Our CB tree, however, indicates a slightly different branching order for the Tetrabaenaceae. The resulting coalescent-based tree shows the Tetrabaenaceae + V. ordinata forming a clade with Chlamydomonas reinhardtii and its relatives, and this clade is shown to be sister to the Goniaceae + Volvocaceae (coalescent posterior probability (CPP) = 1.0) (Additional file 1: Figure S3). Of note, the sister relationship between the Tetrabaenaceae + V. ordinata and the Chlamydomonas clade is poorly supported in our CB analysis (CPP = 0.34).

In accordance with the findings of Lindsey et al. [11], all three of our phylogenetic analyses indicate a minimum of four independent origins of somatic cellular differentiation and a minimum of three origins of anisogamy among the volvocine algae. Origins of somatic cell differentiation occur in the following lineages: (i) Astrephomene, (ii) section Volvox, (iii) Pleodorina thompsonii, and (iv) the Pleodorina japonica + Volvox carteri clade. Anisogamy evolved from isogamous ancestors at least three times, in the following lineages: (i) Astrephomene, (ii) section Volvox, and (iii) the Eudorina + Volvox + Pleodorina (EVP) clade.

In contrast to several recent volvocine studies [11, 23], all three of our phylogenetic analyses support the major conclusion that the Goniaceae are monophyletic, albeit with varying support values (MLBS = 95, BPP = 1.0, CPP = 0.67). This conclusion was reached by multiple earlier investigations [12, 18, 31, 32, 84–91]. Consistent with the results of Lindsey et al. [11] and Ma et al. [12], we find that Volvox section Volvox is sister to the Pandorina + Volvulina + Colemanosphaera (PVC) and EVP clades within the Volvocaceae. Our ML + BI and CB trees contain minor branching order differences in the EVP clade, but these have no bearing on our major findings.

Variation in divergence times inferred under different relaxed molecular clock models necessitates validation tests. All divergence time estimates were inferred by Phylobayes 4.1b [92] under a Bayesian approach for our inferred ML and CB species trees. For each topology, a total of 14 nodes were calibrated across Rhodophyta, Streptophyta, and Chlorophyta (Fig. 2 and Table 1), and four relaxed clock models were tested: the autocorrelated lognormal (LN) [93] and Cox-Ingersoll-Ross (CIR) [94] models, and the uncorrelated gamma (UGAM) and white-noise (WN) models [95]. Autocorrelated models (i.e., LN and CIR) allow the rate of evolution to vary across branches, with rates being more similar along branches of closely related species than along branches of distantly related taxa [93, 94]. Uncorrelated models (i.e., UGAM and WN) assume that each branch has its own rate of evolution, irrespective of the rates on branches of closely or distantly related species [94, 95]. Inferred divergence times established under all clock models are largely consistent across the three major red and green algal clades (Figure S4). However, there are nodes such as the root age, earliest rhodophyte divergence, and major divergences within the volvocine algae where one model infers a date markedly younger or older than the others (Fig. 3).
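The distinction between autocorrelated and uncorrelated clocks can be illustrated with a toy simulation. The sketch below is not the Phylobayes implementation: it assumes a single lineage of successive branches rather than a full tree, and the noise level (`sigma`) and gamma parameters (`shape`, `scale`) are arbitrary values chosen only for illustration. In the autocorrelated case each branch inherits its parent's log-rate plus a small perturbation; in the uncorrelated case each branch rate is drawn independently.

```python
import math
import random


def autocorrelated_log_rates(n_branches, sigma=0.1, start_log_rate=0.0, rng=None):
    """Log-rates along one lineage; each branch inherits its parent's
    log-rate plus Gaussian noise (a lognormal-style random walk)."""
    if rng is None:
        rng = random.Random()
    log_rates = []
    current = start_log_rate
    for _ in range(n_branches):
        current = rng.gauss(current, sigma)
        log_rates.append(current)
    return log_rates


def uncorrelated_log_rates(n_branches, shape=2.0, scale=0.5, rng=None):
    """Log-rates drawn independently for every branch from a gamma
    distribution (hypothetical shape and scale)."""
    if rng is None:
        rng = random.Random()
    return [math.log(rng.gammavariate(shape, scale)) for _ in range(n_branches)]


def parent_child_correlation(values):
    """Pearson correlation between each branch and the branch that follows it."""
    x, y = values[:-1], values[1:]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    rng = random.Random(42)
    auto = autocorrelated_log_rates(2000, rng=rng)
    unco = uncorrelated_log_rates(2000, rng=rng)
    print("autocorrelated parent-child correlation:", parent_child_correlation(auto))
    print("uncorrelated parent-child correlation:  ", parent_child_correlation(unco))
```

Under these assumptions, the parent-child correlation is expected to be close to 1 for the autocorrelated walk and close to 0 for the independent draws, which is the qualitative behaviour that separates the LN and CIR models from the UGAM and WN models.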

Table 1 Fossils used to calibrate nodes in Archaeplastida tree

Fig. 3

Estimated divergence times of the volvocine algae. Branching order of the tree was inferred under maximum-likelihood analysis from an aligned, concatenated amino acid dataset of 263 nuclear genes. Numbers on branches represent bootstrap and posterior probability values, respectively. Branch lengths, corresponding to time, were inferred under the CIR relaxed clock model using the 16 most clock-like genes as determined by Sortadate. Blue bars correspond to the inferred 95% HPD interval for each node. Green bubbles correspond to the gain of a developmental trait, and red bubbles correspond to the loss of a trait. The figure table lists the 12 developmental traits identified by Kirk in their original order. Blue bubbles indicate key divergences in the volvocine algae. Members of the multicellular volvocine algae (Tetrabaenaceae, Goniaceae, and Volvocaceae) are denoted in orange, purple, and green. Taxa in black font are unicellular volvocine algae.

The estimated mean root age varies by ~630 million years (MY) across the tested clock models, with the WN model estimating the earliest red algal divergence as ~2016 million years ago (MYA) and the CIR model estimating it as late as ~1385 MYA (Additional file 1: Figure S4B). Averaging the mean root ages of the LN, CIR, and UGAM models produces an average root age of ~1503 MYA, so that the difference in estimated mean root age between WN and the average of the other models is ~500 MY. Similarly, for the earliest rhodophyte divergence, the WN model inferred a mean date older than all other models by at least 300 MY, whereas the LN, CIR, and UGAM models estimated mean ages within <100 MY of each other. Large date discrepancies such as these may be solely due to differences in clock algorithms [104, 105].
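The root-age comparisons can be checked with simple arithmetic. The sketch below uses only the approximate mean ages quoted in the text (the individual LN and UGAM root ages are not given here, so they are not reconstructed); the variable names are illustrative.

```python
# Approximate mean root ages in MYA, as quoted in the text.
WN_ROOT = 2016           # white-noise model
CIR_ROOT = 1385          # Cox-Ingersoll-Ross model
MEAN_LN_CIR_UGAM = 1503  # reported average of the LN, CIR, and UGAM means


def age_gap(older, younger):
    """Difference between two mean ages, in MY."""
    return older - younger


full_range = age_gap(WN_ROOT, CIR_ROOT)             # reported as ~630 MY
gap_to_average = age_gap(WN_ROOT, MEAN_LN_CIR_UGAM)  # reported as ~500 MY

if __name__ == "__main__":
    print("WN vs CIR root age:", full_range, "MY")
    print("WN vs mean of LN, CIR, UGAM:", gap_to_average, "MY")
```

The two differences evaluate to 631 MY and 513 MY, consistent with the rounded values of ~630 MY and ~500 MY.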

For all volvocine algae divergences, the UGAM model estimated ages markedly younger than the three other models we tested (Additional file 1: Figure S4A, B (divergences 9–11)). It is noteworthy that the ages inferred by the UGAM model for the volvocine algae are very similar to the dates inferred by Ma et al. [12], who used a single relaxed clock model. When taking the average of the mean dates inferred by the LN, CIR, and WN models, there is a consistent ~130 MY difference between that average and the mean node date inferred by the UGAM model for this group. Significantly, the UGAM 95% highest posterior density (HPD) intervals for the volvocine algal divergences do not overlap with any of the 95% HPD intervals inferred by the other models; conversely, the 95% HPD intervals of the other three models overlap comfortably for the volvocine divergence estimates.
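For readers unfamiliar with HPD intervals, the following minimal sketch shows one standard way to estimate a 95% HPD interval from posterior samples (the shortest window containing 95% of the sorted samples) and to test whether two intervals overlap. It is not the procedure used by Phylobayes, and the sample values in the demonstration are synthetic numbers with hypothetical means, not results from this study.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing a fraction `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def intervals_overlap(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic posterior samples of a node age (MYA) under two hypothetical models.
    model_a = [rng.gauss(300.0, 15.0) for _ in range(5000)]
    model_b = [rng.gauss(450.0, 20.0) for _ in range(5000)]
    hpd_a = hpd_interval(model_a)
    hpd_b = hpd_interval(model_b)
    print("model A 95% HPD:", hpd_a)
    print("model B 95% HPD:", hpd_b)
    print("overlap:", intervals_overlap(hpd_a, hpd_b))
```

Non-overlapping 95% HPD intervals, as reported for UGAM versus the other three models, indicate that the models disagree by more than their individual posterior uncertainty.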

Given that the WN and UGAM models inferred outlier dates for certain key divergence events, we decided to exclude both clock models. Ages inferred by the CIR model in this study were never observed to be outliers for key divergence events, and the ages inferred by the CIR model for the volvocine algae are nearly identical to the WN model’s estimates for this group. Additionally, divergence estimates inferred under the CIR model for the volvocine algae were the most conservative among the LN, WN, and CIR results. Altogether, the foregoing considerations prompted us to report dates for major divergence events using data produced by the CIR relaxed clock model (Additional file 1: Figure S4A, B). Lepage et al. [94] performed model comparisons of the CIR, LN, UGAM, and WN relaxed clock models using real, as opposed to simulated, molecular datasets that varied in type (nuclear vs mitochondrial and nucleotide vs amino acid) and number of taxa. Their tests concluded that autocorrelated clock models performed best against the various datasets used in their study. Furthermore, LN and CIR models outperformed UGAM and WN models when taxonomic sampling was high, and the CIR model was recommended by the authors for dense taxonomic datasets. Fossil cross-validation tests were performed as described earlier for each relaxed clock model, with the objective of identifying a best-performing analysis. However, no relaxed clock model performed markedly better than any of the others (Additional file 2: Table S2).

Fossil selections and Archaeplastida phylogenetic results and divergence times. Since no reliable fossils exist for the volvocine algae, we selected 14 fossil taxa across the Archaeplastida where reliable fossils are abundant (Fig. 2 and Table 1). Each primary fossil calibration used in this study was constrained to a range rather than a fixed-point estimate, acknowledging the inherent uncertainty in fossil ages. With each date range, we specified a soft bound where 2.5% of the total probability mass is positioned outside of the specified lower and/or upper bounds. Detailed information regarding each fossil taxon and calibration point may be found in Additional file 3: Supplementary Methods [18, 26, 27, 75, 96, 97, 99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,
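The idea of a soft bound can be made concrete with a small sketch. The density below is one common construction and is an assumption on our part, not necessarily the exact form implemented in Phylobayes: the prior is flat between the lower and upper bounds (95% of the mass), with a power-law tail below the lower bound and an exponential tail above the upper bound, each holding 2.5% of the mass and joining the flat section continuously. Only the case with both bounds specified is covered, and the bounds used in the demonstration (100 and 200 MYA) are hypothetical values, not calibrations from Table 1.

```python
import math

TAIL_MASS = 0.025  # probability mass allowed outside each bound


def soft_bound_density(t, lower, upper, tail=TAIL_MASS):
    """Prior density for a node age t with soft lower and upper bounds."""
    if not 0 < lower < upper:
        raise ValueError("require 0 < lower < upper")
    if t <= 0:
        return 0.0
    height = (1.0 - 2.0 * tail) / (upper - lower)  # flat density inside the bounds
    if t < lower:
        # Power-law tail on (0, lower); integrates to `tail`, equals `height` at lower.
        theta = height * lower / tail
        return tail * (theta / lower) * (t / lower) ** (theta - 1.0)
    if t <= upper:
        return height
    # Exponential tail above upper; integrates to `tail`, equals `height` at upper.
    rate = height / tail
    return height * math.exp(-rate * (t - upper))


def integrate(f, a, b, steps=20000):
    """Midpoint-rule integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


if __name__ == "__main__":
    lower, upper = 100.0, 200.0  # hypothetical bounds in MYA

    def prior(t):
        return soft_bound_density(t, lower, upper)

    print("mass below lower bound:", integrate(prior, 0.0, lower))
    print("mass inside bounds:    ", integrate(prior, lower, upper))
    print("mass above upper bound:", integrate(prior, upper, 10 * upper))
```

With this construction, a node age outside the fossil-derived range is penalized but not forbidden, so strong molecular signal can still pull an estimate slightly beyond a bound.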
