Fruit sugar hub: gene regulatory network associated with soluble solids content (SSC) in Prunus persica

Selection of contrasting 'O × N' siblings for soluble solids content and sequencing summary metrics

Six peach individuals from the 'O × N' population were selected for transcriptomic analyses of the soluble solids content (SSC) phenotype. These individuals were classified into two phenotypic classes (LowSSC and HighSSC) that presented consistent SSC values across the three evaluation seasons (2015–2017). Detailed phenotyping information for the selected individuals in all evaluation seasons is shown in Additional file 2: Table S2, covering fruit pubescence (peach/nectarine), flesh color (white/yellow), maturity date, fruit size, firmness, and SSC. There were no significant differences in maturity date, fruit size, or firmness between phenotypic classes, and no correlation was observed between SSC and flesh color or maturity date. SSC differed significantly between phenotypic classes, with no significant differences between biological replicates or evaluation seasons (Additional file 2: Table S2) and average SSC values of 10.4°Brix (LowSSC) and 17.0°Brix (HighSSC).

Interestingly, SSC correlated with the glabrous trait in the selected 'O × N' individuals, with peaches showing lower soluble solids content than nectarines (Additional file 2: Table S2). For each selected individual, we constructed three RNA libraries from three independent RNA extractions of different fruits, which were sequenced as technical replicates. The sequencing results are detailed in Additional file 3: Table S3. On average, 40,331,623 reads were sequenced for each RNA library, and approximately 91.3% of the total reads passed quality filters. After filtering, an average of 89.7% of the total sequenced reads aligned correctly to the Prunus persica v2.1 reference genome.

Differential expression analysis between 'O × N' siblings with low and high SSCs

Differential expression analysis between LowSSC and HighSSC samples identified 7188 differentially expressed genes (DEGs) with an FDR < 0.05. These are represented in a green–yellow color scale heatmap (Fig. 1A) using the average expression value of the three technical replicates for each individual. The distributions of the biological and technical replicates are shown in Fig. 1B. The individual with the greatest dispersion among technical replicates was O×N-037; even so, both phenotypic classes could be differentiated, with the first component explaining 30% of the observed variability. To perform an SSC network analysis with the genes showing the largest expression differences between phenotypic classes, the 7188 DEGs were filtered with |log2FC| > 1.0, as shown in the volcano plot in Fig. 1C. Thus, 672 and 1011 candidate genes with higher expression in LowSSC and HighSSC individuals, respectively, were selected for the following analyses.
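The two-step selection described above (a significance cutoff followed by a fold-change cutoff, then a split by direction) can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the record type `DEResult`, the function `select_candidates`, the toy gene identifiers, and the sign convention (positive log2FC meaning higher expression in HighSSC) are all assumptions made for the example.

```python
from typing import NamedTuple


class DEResult(NamedTuple):
    gene_id: str
    log2fc: float  # assumed convention: log2(HighSSC / LowSSC)
    fdr: float     # adjusted p-value


def select_candidates(results, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with FDR < cutoff and |log2FC| > cutoff, split by direction."""
    low_ssc, high_ssc = [], []
    for r in results:
        if r.fdr >= fdr_cutoff or abs(r.log2fc) <= lfc_cutoff:
            continue  # fails the significance or the effect-size filter
        if r.log2fc > 0:
            high_ssc.append(r.gene_id)  # higher expression in HighSSC
        else:
            low_ssc.append(r.gene_id)   # higher expression in LowSSC
    return low_ssc, high_ssc


# Toy input with hypothetical gene identifiers and values.
toy = [
    DEResult("gene_A", 2.3, 0.001),   # passes, HighSSC
    DEResult("gene_B", -1.7, 0.020),  # passes, LowSSC
    DEResult("gene_C", 0.4, 0.001),   # significant but small fold change
    DEResult("gene_D", 3.1, 0.200),   # large fold change but not significant
]
low_ssc, high_ssc = select_candidates(toy)
print(low_ssc, high_ssc)
```

Applying the fold-change filter after the significance filter, as in the text, keeps the candidate list focused on genes that are both statistically supported and strongly shifted between classes.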

Fig. 1

Differential expression analysis between contrasting SSC samples from the 'O × N' population. A Differentially expressed genes between LowSSC and HighSSC samples are represented in a green-yellow color scale heatmap (FDR < 0.05). Data were scaled per gene using the z-score method, subtracting the mean value of each gene and dividing by its standard deviation. Each column represents the average expression of three replicates for each selected individual. B Partial least squares discriminant analysis (PLS-DA) using normalized counts of each sequenced library. Each symbol represents one of the three replicates of each selected individual. Yellow and green symbols represent LowSSC and HighSSC samples, respectively. C Volcano plot with differentially expressed genes comparing LowSSC and HighSSC samples. Candidate genes for each group were selected with a p-value < 0.01 and |log2FC| > 1.0. D Gene ontology term enrichment analysis with genes overexpressed in LowSSC and HighSSC samples. The color scale represents the adjusted p-value, and the point size represents the gene ratio.
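For readers unfamiliar with the per-gene scaling used in heatmaps such as panel A, a z-score as conventionally defined subtracts the gene's mean across samples and divides by its standard deviation. The sketch below assumes the sample standard deviation (n − 1 denominator); the source does not say which variant was used, and the function name `zscore_row` and the input values are illustrative only.

```python
from statistics import mean, stdev


def zscore_row(values):
    """Scale one gene's expression values across samples to z-scores."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (assumption)
    if s == 0:
        # A gene with constant expression carries no contrast.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Hypothetical expression values for one gene in three samples.
row = [2.0, 4.0, 6.0]
scaled = zscore_row(row)
print(scaled)
```

After this scaling every gene has mean 0 and unit variance across samples, so the heatmap colors reflect relative differences between individuals and not absolute expression levels.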

Gene Ontology (GO) analysis was carried out to identify enriched terms in both sets of candidate genes. As shown in Fig. 1D, the GO terms enriched in the group of 672 candidate genes for LowSSC were related to photosynthesis and to responses to light stimulus, environmental stimulus, decreased oxygen levels, wounding, fatty acids, cold, ethylene, and water deprivation. In the group of 1011 candidate genes for HighSSC, the enriched GO terms were related to the cell cycle, cell division, chromosome segregation, phenylpropanoid and flavonoid biosynthetic processes, regulation of hormone levels, auxin transport, and cellulose biosynthetic and metabolic processes.

Candidate regulatory gene identification comparing DEG physical positions with the localization of the described QTL for SSC in the 'O × N' population

According to the results published by Nuñez-Lillo et al. [21], the QTL for the SSC phenotype in the 'O × N' population is located between 12.1 and 18.3 Mbp on chromosome 5 of the peach genome, a region containing a total of 1211 genes. Considering the physical positions of the 1683 DEGs obtained in this research, only 91 colocalized with the QTL for SSC: 57 with greater expression values in HighSSC samples and 34 with greater expression values in LowSSC samples. These 91 genes could be considered candidate regulators of this phenotype (Additional file 4: Table S4). Within this list, two cellulose metabolism-related genes, one member of the MAP kinase family of proteins, and nine transcription factor genes stand out as potential regulatory genes of the SSC phenotype because of their functional annotations, as shown in Table 1.
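The colocalization step amounts to an interval check of each DEG's coordinates against the QTL region. A minimal sketch is given below, using the QTL boundaries stated above; the chromosome label `"chr5"`, the gene records, and the choice to count any overlap with the region (as opposed to requiring full containment) are assumptions for illustration, since the source does not specify these details.

```python
# QTL interval from the text: 12.1 to 18.3 Mbp on chromosome 5.
# The label "chr5" is a hypothetical naming convention.
QTL = ("chr5", 12_100_000, 18_300_000)


def in_qtl(chrom, start, end, qtl=QTL):
    """Return True if the gene interval overlaps the QTL interval."""
    q_chrom, q_start, q_end = qtl
    return chrom == q_chrom and start <= q_end and end >= q_start


# Hypothetical DEG records: (gene_id, chromosome, start, end).
degs = [
    ("gene_A", "chr5", 12_500_000, 12_503_000),  # inside the QTL
    ("gene_B", "chr5", 18_299_000, 18_301_000),  # straddles the upper boundary
    ("gene_C", "chr5", 5_000_000, 5_002_000),    # same chromosome, outside
    ("gene_D", "chr4", 13_000_000, 13_004_000),  # other chromosome
]
colocalized = [g for g, chrom, start, end in degs if in_qtl(chrom, start, end)]
print(colocalized)
```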

Table 1 Candidate regulatory genes differentially expressed between contrasting SSC individuals and colocated with the QTL for SSC in the 'O × N' population

Among these 12 candidate genes, nine had relatively high expression in the HighSSC samples, and only three had relatively high expression in the LowSSC samples. Interaction information for four transcription factors shown in Table 1 (AT5G13180, AT5G63260, AT1G18330, and AT5G50670) was obtained from the ConnecTF database. The only one with enough target genes represented in the differential expression analysis (p-value < 0.01) to be used in the SSC gene regulatory network construction for the 'O × N' population was the transcription factor AT5G63260, which is described in Arabidopsis thaliana as C3H67. In the peach genome, two genes are annotated with this transcription factor (zinc finger CH domain-containing protein 43), Prupe.5G158600 and Prupe.5G158700, with higher expression values in HighSSC samples.

Regulatory network analysis for the soluble solids content phenotype

In the group of candidate genes for LowSSC, a total of 14 transcription factors were identified in the transcription factor-target gene (TF-TG) interaction database, of which 8 had a p-value > 0.01 based on the ratio of total target genes to differentially expressed genes (query) and were therefore excluded from the network analysis. In the group of candidate genes for HighSSC, 16 transcription factors were identified, 13 of which were excluded from the network analysis because of a p-value > 0.01. Thus, six transcription factors associated with LowSSC individuals (PpCBF4, PpHB40, PpSTZ, PpESE3, PpERF4, and PpERF017) and three transcription factors associated with HighSSC individuals (PpRVE1, PpHB7, and PpC3H67) were selected for SSC network construction.
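The text does not state which statistical test produced these p-values. One common way to score whether a transcription factor's targets are over-represented among a query gene set is a one-sided hypergeometric test, sketched below purely as an assumption. The function name `hypergeom_sf` and all counts in the example are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric.

    N: genes in the background, K: known targets of the TF,
    n: query genes (e.g. DEGs), k: query genes that are targets.
    """
    upper = min(K, n)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Toy numbers: 20 background genes, 5 TF targets, 5 query genes.
p_strong = hypergeom_sf(4, 20, 5, 5)  # 4 of 5 query genes are targets
p_weak = hypergeom_sf(2, 20, 5, 5)    # 2 of 5 query genes are targets

# A TF would be kept for the network only if p < 0.01, as in the text.
for label, p in (("strong overlap", p_strong), ("weak overlap", p_weak)):
    print(label, round(p, 4), "kept" if p < 0.01 else "excluded")
```

Under this assumed test, a transcription factor with few targets among the DEGs relative to its total target set yields a large p-value and is dropped, which matches the exclusion rule described above.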

A regulatory network for the SSC phenotype was constructed with these nine transcription factors and the other 620 DEGs with TF-TG interactions in the ConnecTF database (Fig. 2). The HighSSC samples showed a high number of transcripts with functions related to the cell cycle (PpCYCA2.4, PpCYCA3.4, PpCYCD1.1, and PpCDKB1.2), flavonoid biosynthesis (PpPAL1, PpCHS, Pp4CL2, PpHCT, and PpCCR), and the regulation of brassinosteroids (PpBRX, PpBRH1, and PpBR6ox1). In contrast, the LowSSC samples showed higher expression of a large number of transcripts with functions related to photosynthesis (PpPSAO, PpPSAN, PpPSAL, PpPIL5, PpPIF4, PpNDF5, and several PpLHCs) and ethylene pathways (PpACO1, PpERF1, PpERF4, PpERF9, PpERF13, PpERF17, PpERF106, and PpRAP2.4).

Fig. 2

Soluble solids content network analysis. Representation of the most informative genes associated with the soluble solids content regulatory network. Each node represents a differentially expressed gene, and each edge represents a DAP-seq gene association. Node color (orange scale) corresponds to the absolute fold change in the LowSSC versus HighSSC comparison. Nodes with colored borders correspond to genes associated with metabolic pathways or signaling.

Furthermore, several metabolic pathways, including sugar accumulation, cell wall remodeling, and regulation of the abscisic acid, jasmonic acid, and auxin pathways, were represented in both the LowSSC and HighSSC samples. However, a greater number of genes related to sugar accumulation were detected in HighSSC samples (PpSWEET1, PpSWEET15, PpSTS, PpERD6L14, and PpSIP1) than in LowSSC samples (PpSWEET17, PpTPS7, and PpSTP3). Finally, the cell wall remodeling-related genes identified in the HighSSC and LowSSC samples differed markedly. In HighSSC individuals, a greater number of genes associated with cellulose biosynthesis (PpIRX1, PpIRX3, PpCSLA02, PpCSLB03, PpCSLB04, PpCSLD3, PpCSLG2, and PpCSLG3) and pectin modifications (PpPL, PpQRT3, several PpPGs, and several PMEis) were identified, along with two genes described as expansins (PpEXPA1 and PpEXPA8). In LowSSC individuals, three genes with β-glucosidase activity (PpBGLU4, PpBGLU11, and PpBGLU12) and two genes with xyloglucan endotransglucosylase/hydrolase activity (PpXTH15 and PpXTH16), associated with the hemicellulose disassembly process, were identified.

RT‒qPCR validation of five candidate genes in SSC-contrasting peach varieties

Three contrasting peach varieties for the SSC trait were selected to evaluate and validate candidate gene associations with SSC genetic control ('Rebus,' 'Summer Fire' and 'Venus'). As shown in Fig. 3A, the three selected varieties had significant differences in soluble solids content, with the highest being 'Rebus', with an average SSC value of 20.7°Brix, followed by 'Summer Fire', with 15.9°Brix, and finally 'Venus', with the lowest SSC value (11.1°Brix). As shown in Fig. 3B, all these peach varieties exhibited similar phenotypes for other fruit quality traits, such as fruit size, skin color, pubescence, and flesh color. Therefore, these three peach varieties were suitable for SSC candidate gene validation, considering that HighSSC and LowSSC individuals in the 'O × N' population have average SSC values of 17.0 and 10.4°Brix, respectively.

Fig. 3

Candidate gene validation by RT-qPCR in contrasting peach varieties for the SSC trait. A Characterization of the SSC phenotype in three peach varieties, 'Rebus' (R), 'Summer Fire' (SF), and 'Venus' (V). Statistical analysis was performed with a one-way ANOVA test; significant differences are represented by asterisks (****; p < 0.0001). B Fruit phenotype of the contrasting peach varieties for the SSC trait. Photographic record of the fruits' external (upper images) and internal (lower images) phenotypes in each selected variety. C Correlation analysis between each selected candidate gene's RT-qPCR expression values (left axis) and the SSC phenotype of the contrasting peach varieties (right axis).

To validate the results obtained in the 'O × N' population, five candidate genes from the SSC network analysis shown in Fig. 2 were analyzed by RT‒qPCR in the three contrasting SSC peach varieties described above: four with high expression in the HighSSC samples (PpRVE1, PpSWEET15, PpCSLG2, and PpPAL1) and one with high expression in the LowSSC samples (PpCBF4). As shown in Fig. 3C, candidate genes with high expression in HighSSC samples of the 'O × N' population also presented high expression in 'Rebus', the variety with the highest soluble solids content. Comparing the RT‒qPCR expression values of each selected transcript with the SSC (°Brix) of each variety gave Pearson correlations of 0.91 (PpRVE1), 0.96 (PpSWEET15), 0.90 (PpCSLG2), and 0.86 (PpPAL1). Similarly, the expression of PpCBF4, the only evaluated candidate gene with high expression in LowSSC samples, was greater in the 'Venus' variety according to RT‒qPCR, with a Pearson correlation of −0.80 with the SSC phenotype. These results validated the relationships of all these candidate genes with the observed differences in the SSC phenotype.
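The Pearson correlations reported above relate expression values to SSC across varieties. The sketch below shows how such a coefficient is computed. The SSC values are the variety averages given in the text (20.7, 15.9, and 11.1°Brix), whereas the expression values are hypothetical placeholders and do not reproduce the study's measurements. The study may also have correlated individual replicates and not variety means, which this example does not attempt to replicate.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Average SSC (degrees Brix) for 'Rebus', 'Summer Fire', 'Venus' (from the text).
ssc = [20.7, 15.9, 11.1]

# Hypothetical relative expression values, in the same variety order.
expr_high_like = [3.0, 1.5, 1.0]  # pattern expected for a HighSSC-type gene
expr_low_like = [1.0, 1.5, 3.0]   # pattern expected for a LowSSC-type gene

r_high = pearson(ssc, expr_high_like)
r_low = pearson(ssc, expr_low_like)
print(r_high > 0, r_low < 0)
```

A positive coefficient corresponds to the pattern reported for PpRVE1, PpSWEET15, PpCSLG2, and PpPAL1, and a negative one to the pattern reported for PpCBF4.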
