Assessing the capabilities of 2D fluorescence monitoring in microtiter plates with data-driven modeling for secondary substrate limitation experiments of Hansenula polymorpha

Magnesium limitation experiment

In the first cultivation experiment, the magnesium (Mg2+) supplementation of the modified SYN6-MES medium was varied between 295.8 mg/L and 0 mg/L. In addition to the initial cell dry weight (CDWt0) of 0.03 g/L, cultures with a CDWt0 of 0.06 g/L and 0.11 g/L were cultivated at 295.8 mg/L magnesium. Following the experimental layout described in Additional file 1: Fig. S1, the resulting online data of the OTR, scattered light and GFP fluorescence are shown in Fig. 1A-C. Exemplary recorded 2D spectra are shown in Additional file 1: Fig. S2. Offline sampling data for the cultures with 295.8 mg/L magnesium and a CDWt0 of 0.06 g/L (blue upward triangle) and 1.18 mg/L magnesium and a CDWt0 of 0.03 g/L (red downward triangle), taken at a 1.5 h interval, are shown as hollow symbols in Fig. 1D-F. The filled, linearly interpolated symbols describe a more realistic sampling interval of 6 h and are later used for PLS model calibration.

Fig. 1

Time-resolved (A-C) online monitoring signals and (D-F) offline sample measurements of H. polymorpha RB11 pC9-FMD (PFMD-GFP) cultivations at different initial cell dry weight (CDWt0) and magnesium (Mg2+) concentrations. (A) The mean oxygen transfer rate (OTR) of culture replicates (n = 2–3, Additional file 1: Fig. S3) was determined by a μRAMOS device [48]. The low standard deviations are shown as shaded areas and indicate good reproducibility. Hollow symbols indicate every sixth data point. (B) Scattered light (λex = λem = 600 nm) and (C) GFP fluorescence intensities (λex = 420 nm, λem = 530 nm) were extracted from 2D spectra of duplicates (solid and dotted lines). Hollow symbols indicate every fifth data point. Values of (D) glycerol, (E) cell dry weight (CDW), and (F) pH value are based on single offline measurements from a parallel cultivation, taken every 1.5 h (hollow and filled symbols). Filled, linearly interpolated symbols describe a sampling interval of 6 h. Cultivation conditions: 48-well microtiter plate with round geometry, modified SYN6-MES medium, liquid volume = 800 μL, shaking diameter = 3 mm, shaking frequency = 1000 rpm, temperature = 30 °C

The OTR signals in Fig. 1A are comparable to the results of Kottmeier et al. [47], although they were obtained in MTPs instead of shake flasks. The cultures with a magnesium supplementation of 295.8 mg/L and a CDWt0 of 0.03 g/L (light blue pentagons) showed exponential growth until 22.3 h, with a maximum OTR value of 34.5 mmol/L/h. Afterwards, the OTR dropped below 5 mmol/L/h. Both the scattered light and the GFP signals followed this exponential growth pattern. Shortly before the maximum OTR was reached, a sudden increase was observed for the GFP signal, which is connected to the partial derepression of the FMD promoter due to low glycerol concentrations [49, 50]. After the increase, the GFP signals flattened asymptotically. The cultures with an increased CDWt0 (purple circles, blue upward triangles) showed an earlier exponential increase, but were otherwise similar. With decreasing magnesium supplementation, the maximum OTR was reduced, and the final signal drop was successively replaced by a slight linear decrease. Accordingly, the scattered light signal was observed to transition from an exponential increase to a linear increase upon reaching the OTR maximum. For the GFP signal, a sudden stagnation was observed for the cultures with a magnesium supplementation of up to 4.14 mg/L (yellow leftward triangles). No considerable metabolic activity was measured for the cultures without magnesium (pink squares) and the wells containing only non-inoculated medium (black hexagons).

The offline data in Fig. 1D-F described an exponential increase of the CDW and a corresponding decrease of the glycerol concentration and the pH value during the unlimited growth phase. In total, around 4.1 g/L CDW was accumulated for the culture supplemented with 295.8 mg/L magnesium. The pH value decreased to a minimum of 5.48. For the culture supplemented with 1.18 mg/L magnesium, the transition from an exponential OTR increase to a linear decrease resulted in a slow linear decrease of the glycerol concentration and a residual glycerol concentration of 5.5 g/L at the end of the cultivation. Here, a total of 1.9 g/L CDW was accumulated, while the pH decreased to 5.75. The 6 h sampling interval described the exponential growth pattern inaccurately, as especially the time of glycerol depletion and the minimum pH value were not correctly reflected.
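The linear interpolation of the sparse samples can be illustrated with a minimal sketch. It assumes that each online spectrum receives a calibration target by interpolating between the two neighbouring 6 h offline samples; the function name `interpolate_linear` and all numbers are hypothetical and are not data from this study.

```python
def interpolate_linear(sample_times, sample_values, query_times):
    """Linearly interpolate sparse offline samples at the given query times."""
    result = []
    for t in query_times:
        if not sample_times[0] <= t <= sample_times[-1]:
            raise ValueError(f"time {t} lies outside the sampled range")
        # Find the sampling interval [t0, t1] that contains t.
        for i in range(len(sample_times) - 1):
            t0, t1 = sample_times[i], sample_times[i + 1]
            if t0 <= t <= t1:
                v0, v1 = sample_values[i], sample_values[i + 1]
                result.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
                break
    return result


# Hypothetical glycerol samples (g/L) taken every 6 h; not data from this study.
sparse_times = [0.0, 6.0, 12.0, 18.0, 24.0]
sparse_glycerol = [10.0, 9.5, 8.0, 4.0, 0.0]

# Hypothetical online measurement times (h), one per recorded 2D spectrum.
online_times = [0.0, 3.0, 15.0, 21.0, 24.0]
calibration_targets = interpolate_linear(sparse_times, sparse_glycerol, online_times)
print(calibration_targets)
```

Because the interpolated values follow straight lines between samples, any curvature within a 6 h interval, such as the exact time of glycerol depletion, is lost in the calibration targets.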

Similar to the study by Berg et al. [41], the PLS models were generated based on online spectral data of two replicates and the linearly interpolated offline values of the sparse, 6 h sampling interval of the cultures shown in Fig. 1D-F. The prediction dataset consisted of the spectral data of the remaining cultivation conditions shown in Fig. 1A-C. The OTR was not used for PLS modeling. The resulting root-mean-square errors (RMSEs) for different numbers of latent variables (LVs) are shown in Additional file 1: Fig. S4. The appropriate number of LVs was identified by a decreasing RMSECal, sparse, accompanied by an increase of the RMSECal, full. According to Berg et al. [41], at this point, the model starts to overfit the sparse data instead of representing the inherent biology-based spectral dynamics. Following this methodology, the PLS models were generated using two LVs for the glycerol concentration and the CDW and three LVs for the pH value. The calculated offline parameter progressions for the calibration and the prediction dataset are shown in Fig. 2.
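One possible reading of this selection criterion is sketched below. It assumes that the chosen model is the last one before the first step at which the RMSECal, sparse still decreases while the RMSECal, full increases; the function name `select_latent_variables` and the RMSE values are hypothetical, and the exact rule applied in [41] may differ.

```python
def select_latent_variables(rmse_cal_sparse, rmse_cal_full):
    """Return the number of LVs just before the first overfitting step.

    Index 0 of both lists corresponds to a model with one LV. A step from
    k to k + 1 LVs counts as overfitting when the sparse calibration RMSE
    still decreases while the full calibration RMSE increases. Returns None
    when no such step exists, i.e. when the criterion does not apply.
    """
    for k in range(1, len(rmse_cal_sparse)):
        sparse_decreases = rmse_cal_sparse[k] < rmse_cal_sparse[k - 1]
        full_increases = rmse_cal_full[k] > rmse_cal_full[k - 1]
        if sparse_decreases and full_increases:
            return k  # k LVs: the model before the overfitting step
    return None


# Hypothetical RMSE curves for one to four LVs; not values from this study.
chosen = select_latent_variables([0.9, 0.5, 0.4, 0.3], [1.0, 0.6, 0.7, 0.8])
print(chosen)

# Both curves decrease throughout, so the criterion does not apply.
print(select_latent_variables([0.9, 0.5, 0.4], [1.0, 0.6, 0.5]))
```

Returning `None` makes the case explicit in which both RMSE curves decrease continuously and the number of LVs has to be chosen by another rule.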

Fig. 2

(A-C) Calibration and (D-F) prediction of the PLS models for three offline parameters for H. polymorpha RB11 pC9-FMD (PFMD-GFP) cultivations at different initial cell dry weight (CDWt0) and magnesium (Mg2+) concentrations. This figure is based on the data in Fig. 1. The PLS models were calibrated using the linearly interpolated values of the filled symbols and a total of 248 2D spectra of the two cultivation conditions shown in A-C. The PLS models for (A, D) glycerol, (B, E) CDW, and (C, F) pH value were generated using two, three, and two LVs, respectively. Solid and dotted lines describe the calculated parameter progression of duplicates. Hollow symbols in A-F were used for model validation only. Cultivation conditions: 48-well microtiter plate with round geometry, modified SYN6-MES medium, liquid volume = 800 μL, shaking diameter = 3 mm, shaking frequency = 1000 rpm, temperature = 30 °C

In Fig. 2A, the glycerol values calculated for the cultures with 295.8 mg/L magnesium (blue upward triangles) agreed well with the exponential decrease described by the offline values of the short sampling interval. The lowest glycerol concentration was calculated at 0.15 g/L and coincided with the glycerol depletion according to the sparse sampling interval. For the cultures with 1.18 mg/L magnesium (red downward triangles), the initial non-linear glycerol decrease was calculated appropriately. However, according to the PLS model, glycerol consumption ceased after 19 h, which represented the time point of the onset of limitation and the stagnating GFP signal (Fig. 1C). This led to a final difference of 1.2 g/L between the calculated and the measured values. For the prediction dataset (Fig. 2D), good predictability of the glycerol concentration was observed during exponential growth, while under magnesium-limiting conditions, the glycerol consumption was underestimated. As a result, especially for the cultures supplemented with 2.37 mg/L (orange diamonds) and 4.14 mg/L magnesium (yellow leftward triangles), the calculated values considerably exceeded the measured values. In total, an RMSECal, full of 0.57 g/L was calculated, while for the prediction dataset, an RMSEPred, full of 0.94 g/L resulted. Relative to the measured glycerol range of 11.98 g/L, this accounts for 5.1% for the calibration dataset and 8.5% for the prediction dataset (Additional file 2: Table S1).
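The two error measures can be written down as a short sketch. It assumes that the relative RMSE is the absolute RMSE divided by the range (maximum minus minimum) of the measured offline values; the function names and the toy values are hypothetical and do not reproduce the numbers reported here.

```python
import math


def rmse(calculated, measured):
    """Root-mean-square error between calculated and measured values."""
    squared_errors = [(c - m) ** 2 for c, m in zip(calculated, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse(calculated, measured):
    """RMSE as a percentage of the measured value range (assumed definition)."""
    value_range = max(measured) - min(measured)
    return 100.0 * rmse(calculated, measured) / value_range


# Hypothetical glycerol values in g/L; not data from this study.
measured = [10.0, 6.0, 2.0, 0.0]
calculated = [9.0, 7.0, 2.0, 0.0]

absolute_error = rmse(calculated, measured)
relative_error = relative_rmse(calculated, measured)
print(f"RMSE = {absolute_error:.3f} g/L, relative RMSE = {relative_error:.1f}%")
```

Normalizing by the measured range makes the errors of glycerol, CDW, and pH value comparable despite their different units and magnitudes.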

The results for the PLS model of the CDW are shown in Fig. 2B and E. The CDW was accurately calculated for both cultivation conditions of the calibration dataset. For the prediction dataset, the calculated values agreed well with the measured values during the exponential growth. However, especially for the increasingly limited cultures, the PLS model described a premature growth stagnation. This stagnation led to the final CDW values being underestimated by as much as 0.5 g/L for the cultures with 4.14 mg/L magnesium. In conclusion, an RMSECal, full of 0.18 g/L (3.8%), and an RMSEPred, full of 0.3 g/L (6.5%) were determined (Additional file 2: Table S1).

The PLS model of the pH value is shown in Fig. 2C and F. The model accurately reflected the initial pH decrease and the subsequent increase for the calibration cultures with 295.8 mg/L magnesium. However, the minimum value measured at 5.46 was calculated to be 5.55, which represented the lowest value of the 6 h sampling dataset used for calibration. For the culture supplemented with 1.18 mg/L magnesium, the PLS model accurately described the pH value throughout the cultivation time. Transferring the model to the prediction dataset resulted in comparable observations. While the pH values of the strongly limited cultures were described accurately, deviations increased for the cultures with higher magnesium supplementation. This was especially observed for the time after the metabolic activity ended. A systematic overestimation of the pH values was observed for the cultures with a CDWt0 of 0.03 g/L and 295.8 mg/L magnesium (light blue pentagons). In summary, an RMSECal, full of 0.025 (4.9%) was obtained, while for the RMSEPred, full a value of 0.032 (6.0%) resulted.

In conclusion, despite the changes in the scattered light and fluorescence dynamics, the resulting PLS models provide a reasonable overall accuracy. Although the predictive performance revealed shortcomings of the models for individual cultivation conditions, the acceptance criterion for a transferable PLS model described by Yousefi-Darani et al. [51], a relative RMSE below 10%, is met.

Potassium limitation experiment

In the second experiment, the potassium (K+) supplementation of the modified SYN6-MES medium was varied. Because the RMSE calculated in the previous experiment included three very similar cultivation conditions, the variation of the CDWt0 was omitted. Thereby, the number of cultures grown under limited conditions could also be increased. In total, eight cultivation conditions with potassium concentrations between 2017.3 mg/L and 20.2 mg/L were investigated.

With decreasing potassium concentration, the online spectral data in Additional file 1: Fig. S5A and B described a transition into a decelerated linear increase for both the scattered light and the GFP intensity. This resulted in the maximum intensities being decreased by more than for the cultures with the lowest potassium concentration of 20.2 mg/L when compared to the fully supplemented cultures. Accordingly, also the offline data in Additional file 1: Fig. S5C-E described a reduced, linear growth and consumption during the limitation phase. Comparable to the previous experiment, the linear interpolation of the sparse sampling interval did not accurately describe the non-linear growth pattern and missed the exact time of glycerol depletion, as well as the correct minimum pH value.

Similar to the previous experiment, the PLS model was generated based on the spectral online and the sparse offline sampling data of the cultures with the second highest (100.9 mg/L potassium, blue upward triangles) and the lowest (20.2 mg/L potassium, pink squares) potassium supplementation. The prediction dataset consisted of duplicates of the remaining online monitored cultivation conditions, shown in Additional file 1: Fig. S5A-B. The RMSEs resulting from the variation of LVs are shown in Additional file 1: Fig. S6. The RMSECal, sparse and RMSECal, full were of the same magnitude, as observed for the magnesium limitation experiment. Similarly, a continuous decrease of the calculated values was observed for increasing LVs. An overall reduced RMSE was observed when excluding the cultures with the highest potassium concentration from the prediction dataset (RMSEPred, full, − 100%, blue diamonds). Following the previously described criteria, three LVs were used for the final PLS models of the CDW and the pH value, while four LVs were chosen for the glycerol concentration. The resulting trajectories, calculated by the PLS models, are shown in Fig. 3.

Fig. 3

(A-C) Calibration and (D-F) prediction of the PLS models for three offline parameters of H. polymorpha RB11 pC9-FMD (PFMD-GFP) cultivations at different potassium (K+) concentrations. This figure is based on the data from Additional file 1: Fig. S5. The PLS models were calibrated with the linearly interpolated values of the filled symbols and a total of 356 2D spectra of the two cultivation conditions shown in A-C. The PLS models for (A, D) glycerol, (B, E) CDW, and (C, F) pH value were generated using four, three, and three LVs, respectively. Solid and dotted lines describe the calculated parameter progression of duplicates. The hollow symbols were used for model validation only. The crossed symbols describe the value for the last offline sample taken after 54 h. Cultivation conditions: 48-well microtiter plate with round geometry, modified SYN6-MES medium, liquid volume = 800 μL, shaking diameter = 3 mm, shaking frequency = 1000 rpm, temperature = 30 °C

In Fig. 3A and D, the calculated glycerol values showed a tendency towards the values of the interpolated sparse sampling interval. Consequently, for both the calibration and the prediction dataset, the glycerol progressions were described more accurately for lower potassium concentrations. For higher potassium supplementation, the glycerol consumption rates were underestimated, causing the calculated glycerol depletion to occur later than determined by the measurements. For the culture supplemented with 2017.3 mg/L potassium (purple circles), a glycerol concentration of − 3.5 g/L was calculated after 21 h. Here, a close connection to the increased scattered light (Additional file 1: Fig. S5A) can be assumed. In total, the determined RMSECal, full was 0.51 g/L (4.7%), while for the prediction dataset, the RMSEPred, full was calculated to be 2.05 g/L (18.7%). Excluding the cultures with the highest potassium concentration resulted in a considerably reduced RMSEPred, full, − 100% of 0.57 g/L (5.2%).

As for the glycerol concentration, the PLS model of the CDW, shown in Fig. 3B and E, also exhibited higher inaccuracies for higher potassium concentrations, resulting from an underestimated growth rate. Only for the cultures with 2017.3 mg/L potassium were the calculated and the measured offline values again in very good alignment during the exponential growth. However, after the exponential growth was terminated (between 18 h and 42 h), non-plausible CDW fluctuations with deviations of up to 0.45 g/L were calculated. As a result, an RMSECal, full of 0.15 g/L (4.0%) was calculated, while the RMSEPred, full was determined to be 0.55 g/L (14.6%). The RMSEPred, full, − 100% was reduced to 0.29 g/L (8.4%) as the described fluctuations were omitted.

The calculated pH values of the calibration dataset in Fig. 3C show a tendency towards underestimation for the low potassium supplementation and overestimation for the high potassium supplementation. When applied to the prediction dataset (Fig. 3F), the PLS model accurately predicted the pH values for the cultures with potassium concentrations of up to 35.3 mg/L (orange diamonds). However, for higher potassium concentrations, the accuracy decreased. Considerable shortcomings were observed for the cultures with 2017.3 mg/L potassium, for which the measured minimal pH value was missed by 0.14. In total, the RMSECal, full of the PLS model was calculated to be 0.034 (4.3%), while the RMSEPred, full was calculated to be 0.054 (6.9%). When excluding the cultures with the highest potassium concentration, the RMSEPred, full, − 100% was reduced to 0.032 (6.5%). While this represents only a limited reduction of the percentage value compared to the RMSEPred, full, the absolute RMSEPred, full, − 100% is reduced by nearly half.

In conclusion, the potassium experiment showed lower PLS modeling performance than the magnesium variation experiment. This partially resulted from the chosen cultivation conditions and the reduced similarity between the calibration and prediction dataset.

Phosphate limitation experiment

In the third experiment, the supplementation of phosphate (PO43−) was varied between 0 mg/L and 697.9 mg/L. The resulting online and offline values are shown in Additional file 1: Fig. S7. The OTR signals indicated the onset of a limitation for phosphate concentrations of 139.6 mg/L (light blue pentagons) or less. The limitation led to a decreased maximum of the OTR, followed by a plateau of variable length. Concurrently, the scattered light transitioned to a linear increase, while the GFP signal stagnated. Upon the final decrease of the OTR, an additional final increase of the GFP intensity was observed, the extent of which decreased with decreasing phosphate supplementation. Notably, for all phosphate concentrations, the scattered light signals reached comparable maximum values. For the cultures without phosphate (pink squares) and the non-inoculated medium (black hexagons), no considerable signal increase was observed.

Similar to the scattered light, the offline data in Additional file 1: Fig. S7D-F indicated a comparable CDW of 3.2 g/L for both sampled conditions, despite a slower growth of the limited culture. In contrast, the higher phosphate supplementation led to a minimum pH value of 5.44, whereas for the lower supplementation a pH minimum of 5.8 was observed. Again, the linear interpolation of the sparse sampling interval introduced inaccuracies for the timing of the depletion of glycerol, the maximum CDW, and the minimum pH value for the cultures with the higher phosphate supplementation. For the culture with the lower phosphate supplementation, the trajectory was appropriately described.

The PLS models were generated based on the data of the first 34.5 h of cultivation. The calibration dataset included the cultivation conditions shown in Additional file 1: Fig. S7D-F, while the prediction dataset consisted of the remaining inoculated cultures. With increasing LVs, the RMSECal, full, and the RMSECal, sparse of the PLS models continuously decreased for all offline parameters as shown in Additional file 1: Fig. S9A-C. The two calibration RMSEs of the glycerol concentration decreased in parallel, while for the CDW, the RMSECal, full surpassed the RMSECal, sparse for five LVs. For the pH value, the two values were nearly identical for all eight LVs. For the prediction dataset, the glycerol and the CDW concentration showed a constant positive offset of the RMSEPred, full to the RMSEs of the calibration datasets. Only for the pH value, a valley-shaped progression with a minimum at three LVs was observed. For this parameter, excluding the cultures of the highest phosphate supplementation resulted in a minimum RMSEPred, full, − 100%, which was comparable to the RMSEs of the calibration datasets. In conclusion, the criterion for choosing the number of LVs [41] applied only to the CDW and led to five LVs. For the glycerol concentration and the pH value, however, the criterion did not apply. Thus, five and three LVs were chosen subjectively, as these represented the minimum values of the RMSEPred, full, − 100%. The resulting PLS models are shown in Fig. 4.

Fig. 4

(A-D) Calibration and (E-H) prediction of the PLS model for three offline parameters, based on 2D spectra (A-C, E-G) including and (D, H) excluding the scattered light of H. polymorpha RB11 pC9-FMD (PFMD-GFP) cultivations at eight different phosphate (PO43−) concentrations. This figure is based on the data in Additional file 1: Fig. S7. The PLS models were calibrated with the linearly interpolated values of the filled symbols and a total of 288 2D spectra from the two cultivation conditions shown in A-C. The models in Fig. 4D and H were generated including only the fluorescence intensities. The PLS models for (A, E) glycerol, (B, F) CDW, and (C, D, G, H) pH value were generated using five, five, and three latent variables, respectively. Solid and dotted lines describe the predicted parameter progression of duplicates. Crossed symbols describe the value for the last offline sample taken after 44 h. Cultivation conditions: 48-well microtiter plate with round geometry, modified SYN6-MES medium, liquid volume = 800 μL, shaking diameter = 3 mm, shaking frequency = 1000 rpm, temperature = 30 °C

Comparable to the previous experiment, the PLS model for glycerol in Fig. 4A and E showed a tendency towards linear glycerol consumption for both the calibration and the prediction dataset. Consequently, the cultures with lower phosphate concentrations were predicted more accurately, while the exponentially grown cultures with phosphate supplementation of at least 244.3 mg/L showed slight deviations from the measured values. As a result, the RMSECal, full was determined to be 0.58 g/L (5.2%). With an RMSEPred, full of 0.61 g/L (5.6%) and an RMSEPred, full, − 100% of 0.59 g/L (5.4%), very comparable values were achieved for the transfer to the prediction dataset.

Also, for the calculated CDW in Fig. 4B and F, a better predictability for the cultures with lower phosphate supplementation was observed. However, when transferring the model to the prediction dataset, the noise of the calculated values significantly increased, while the reproducibility of the replicates decreased. Additionally, notable deviations from the measured offline values were observable for the cultures with 104.7 mg/L (green rightward triangles) and 697.9 mg/L phosphate (purple circles) between 19 h and 21 h of cultivation. While for the lower phosphate supplementation, a single offline measurement error is conceivable, for the higher supplementation, the maximum calculated CDW of 3.0 g/L underestimated multiple offline measurements, ranging between 3.4 g/L and 3.9 g/L. Therefore, here, an inaccurate PLS model is more likely. In total, the RMSECal, full was calculated to be 0.12 g/L (3.0%), while the observed deviations of the prediction dataset resulted in an RMSEPred, full of 0.26 g/L (6.6%). Excluding the cultures with the highest phosphate supplementation resulted in a reduced RMSEPred, full, − 100% of 0.16 g/L (4.3%).

The PLS modeling results for the pH value, shown in Fig. 4C, are in accordance with the glycerol concentration and the CDW. Again, the PLS model described the limited cultures more adequately than the cultures with high phosphate supplementation, which were systematically overestimated between 12 h and 19.5 h of cultivation. Nevertheless, the model correctly reflected the minimum pH value of 5.44, which was not part of the calibration dataset. After the glycerol was consumed, the model eventually calculated a second, non-apparent decrease in the pH value for the high phosphate supplementation. Transferring the PLS model to the prediction dataset resulted in more systematic deviations, as shown in Fig. 4G. The PLS model appropriately reflected the decline of the pH value during the initial exponential growth phase. However, especially for the cultures with 87.2 mg/L (yellow leftward triangles) and 69.8 mg/L phosphate (orange diamonds), the pH values were calculated to further decrease until the end of the metabolic activity, despite being measured to be constant. After the metabolic activity had ceased, the calculated pH value increased to values approximating the measured values. For the cultures supplemented with 697.9 mg/L phosphate, the calculated progression of the pH value was qualitatively comparable to the cultures supplemented with 244.3 mg/L phosphate (blue upward triangles). However, the minimum calculated pH value was 5.4 instead of the measured minimum value of 5.19. In conclusion, the RMSECal, full was calculated to be 0.038 (3.8%), whereas the described inaccuracies of the prediction dataset resulted in an RMSEPred, full of 0.085 (8.5%) and a considerably reduced RMSEPred, full, − 100% of 0.04 (5.4%).

Exclusion of the scattered light

In the phosphate limitation experiment, while the calibration dataset was described accurately, systematic inaccuracies were observed when transferring the PLS models to the prediction dataset. Therefore, overfitting of the calibration data in general and the spectral data in particular can be suggested. One way to investigate this hypothesis is to modify the selection of included spectral data. To exemplarily demonstrate the impact of the spectral input data, additional PLS models were generated using only the fluorescence of the 2D spectra, as shown in Additional file 1: Fig. S10.

The RMSEs for a variable number of LVs are shown in Additional file 1: Fig. S9D-F. While for a low number of LVs, the RMSEs of the calibration datasets were higher than for the PLS models including the scattered light, with increasing LVs, the RMSEs of the two models were more comparable. The same holds true for the RMSEPred, full and RMSEPred, full, − 100% of the PLS models for the glycerol concentration and the CDW. Only for the pH value, excluding the scattered light resulted in an additional reduction of the RMSEPred, full, − 100%. The improved PLS modeling performance for three LVs is visualized in Fig. 4D and H.

For the cultures supplemented with 52.3 mg/L phosphate, a more constant trajectory was calculated, whereas for the culture with 244.3 mg/L phosphate, the previously observed second pH increase was no longer exhibited in the new model. For the prediction dataset in Fig. 4H, the new PLS model resulted in a considerably better alignment for the later phase of cultivation. Firstly, the systematic overestimation of the pH decrease for the strongly limited cultures was reduced. Further, the asymptotic behavior of the pH value observed for the cultures with 139.6 mg/L phosphate after metabolic activity was terminated was also described more correctly by the new model. Nevertheless, the new model also did not correctly predict the minimum value of 5.19 for the cultures with 697.9 mg/L phosphate. In total, the PLS model including only fluorescence intensities resulted in an RMSECal, full of 0.04 (4.0%), an RMSEPred, full of 0.106 (10.6%) and an RMSEPred, full, − 100% of 0.034 (4.6%).

As described before, this noticeable systematic improvement of the prediction resulted from a change in the dominating spectral dynamics due to the exclusion of the scattered light. By applying PCA this change can be visualized, as shown in Fig. 5.

Fig. 5

Scores of first to third principal component (PC1-PC3) over cultivation time, based on (A-C) 2D spectra including scattered light and fluorescence, as well as based on (D-F) only fluorescence intensities for the cultivation of H. polymorpha RB11 pC9-FMD (PFMD-GFP) at different phosphate (PO43−) concentrations. The explained variance for each PC is shown in brackets. For clarity, only data of one replicate per cultivation condition is shown. Only every 10th datapoint is indicated by a symbol. Asterisks in Fig. 5A and E indicate reversed Y-axis direction used for clarity. Spectroscopic measurement settings: excitation wavelength range = 280 nm – 700 nm (step size = 10 nm), emission wavelength range = 278 nm – 720 nm (step size = 0.45 nm), integration time = 30 ms. Cultivation conditions: 48-well microtiter plate with round geometry, modified SYN6-MES medium, liquid volume = 800 μL, shaking diameter = 3 mm, shaking frequency = 1000 rpm, temperature = 30 °C

The scores of the first principal component (PC1) for the dataset including the scattered light accounted for an explained variance of 99.64% and resembled the scattered light intensities shown in Additional file 1: Fig. S7B. An even higher explained variance of 99.82% was achieved for the PC1 scores of the dataset including only the fluorescence intensities, for which the progression was qualitatively comparable to the GFP fluorescence shown in Additional file 1: Fig. S7C. With increasing PCs, the remaining variance is successively described. However, the noise also increased considerably, which is in good accordance with the literature [52]. For the PC2 scores of the dataset excluding the scattered light, the observed plateau strongly resembled the progression of the pH value. As it was not observed in any of the scores of the dataset including the scattered light, a connection to the improved PLS model is conceivable. The reason why the predictive performance of the model including the full dataset is lower lies in the covariance-based algorithm of the PLS regression. Although the scattered light may not be optimal for describing the pH value, it comprises a very large amount of the spectral variance and is thus also included in the model. This effect may be even more pronounced for increasing LVs. However, by excluding the scattered light, these deteriorating online signals are no longer considered for modeling. Instead, signal dynamics for PLS model generation originate only from the inherently pH-correlating fluorescence intensities.
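The covariance argument can be made concrete with a small sketch. For a single response and mean-centered, unscaled data, the first PLS weight vector is proportional to X^T y, so each channel is weighted by its covariance with the response, i.e. by its correlation multiplied by its standard deviation. The two channels and the pH-like response below are hypothetical, and the preprocessing used in this study is not assumed to be identical.

```python
import math


def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def first_pls_weights(channels, response):
    """First PLS1 weight vector: w proportional to X^T y for centered X and y."""
    y = center(response)
    raw = [sum(x * yi for x, yi in zip(center(channel), y)) for channel in channels]
    norm = math.sqrt(sum(r * r for r in raw))
    return [r / norm for r in raw]


# Hypothetical response with a plateau, loosely resembling a pH-like signal.
response = [0.0, 1.0, 2.0, 2.0, 2.0]

# Hypothetical channels: the scatter channel has a large variance but only a
# partial correlation with the response; the fluorescence channel follows the
# response exactly but has a small variance.
scatter = [0.0, 100.0, 200.0, 300.0, 400.0]
fluorescence = [0.0, 1.0, 2.0, 2.0, 2.0]

weights = first_pls_weights([scatter, fluorescence], response)
print(weights)
```

In this toy case the scatter channel receives almost the entire weight although the fluorescence channel correlates perfectly with the response, which mirrors the effect described for the models including the scattered light.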

Comparison of PLS models

In both this and the previous study by Berg et al. [41], PLS regression models were generated using the same monitoring hardware and biological system. However, while the previous paper described a simple glycerol concentration variation study, in this study, more complex systems of secondary substrate limitations were investigated. As the workflow for generating the PLS models was identical, the results can be compared directly to estimate the robustness of the methodology. In Fig. 6, a summary of the relative RMSEPred, full (plain columns) and RMSEPred, full, − 100% (backward diagonal hatched columns) is given. Additionally, for the PLS models including only the fluorescence intensities (Fl), the RMSEPred, full (forward diagonal hatched columns) and the respective RMSEPred, full, − 100% (cross-hatched columns) are shown. An analogous visualisation of the absolute values is shown in Additional file 1: Fig. S11.

Fig. 6

Comparison of the relative RMSEPred, full for glycerol, CDW, and pH value for the PLS models generated in Berg et al. [41] and this study. Backward diagonal hatched columns describe RMSEPred, full, based on the complete prediction dataset, except the culture holding the initial concentration of the respective secondary substrate according to Jeude et al. [53] (− 100%). Forward-hatched columns describe the RMSEPred, full, calculated for spectral online datasets including only the fluorescence intensities (Fl). For the diagonal cross-hatched columns, additionally, the cultures with the highest initial concentration were excluded. The individual values are given in Additional file 2: Table S1

The glycerol-variation experiment of the previous study resulted in low relative RMSEPred, full, with values between 3.5 and 5.3%. In contrast, for all limitation experiments of this study, values above 5% were obtained. The highest relative RMSEPred, full of up to 18.7% was calculated for the glycerol concentration of the potassium limitation experiment (K+). A reduction by more than 10 percentage points was obtained when excluding the cultures with the highest potassium supplementation (K+(− 100%)). For the phosphate variation experiment, this exclusion procedure resulted in an RMSEPred, full, − 100% between 5.4 and 4.3%. The additional in-silico exclusion of the scattered light (PO43−(Fl)) further reduced the RMSEPred, full, − 100% for the pH value from 5.4 to 4.6%. However, no considerable change in the RMSE was observed for the CDW, while for the glycerol concentration, the RMSEPred, full, − 100% even increased.

In conclusion, depending on the offline parameter, the average relative RMSEPred, full for the PLS models generated from the full 2D spectra ranged between 7.1 and 11.0%. A reduction to values between 6.0 and 6.4% was achieved by excluding the cultures with the highest supplementation. Although these values still represented a considerable increase compared to the glycerol variation experiment of Berg et al. [41], with relative values below 10%, the PLS modeling results of the presented study can be considered a success. In fact, the obtained relative RMSEs are comparable to other studies using fluorescence spectroscopic PLS modeling in stirred tank batch reactors. For example, relative values between 3.8 and 9.1% have been reported for the prediction of singular carbon sources and between 3.0 and 6.8% for the CDW [16, 54,55,56]. However, due to the reduced experimental throughput in stirred tank reactors, in these studies, only a limited number of different cultivation conditions was used for external validation. Overall, this further supports the high potential of the 2D fluorescence online monitoring technology in MTPs.

In addition to the comparison with other studies, the PLS model performance can also be compared to the respective conventional determination method of each offline parameter (i.e., HPLC, gravimetry, pH-electrode). For the glycerol concentration, the determined relative RMSE of more than 5% is considerably higher than the standard deviation of the implemented HPLC method, which was below 0.2% (data not shown). In contrast, for the CDW measurements, the determined relative RMSE is in good agreement with the reported standard deviations between 0.9 and 7% [57]. Finally, for the pH value, the absolute RMSE of around 0.05 represents the higher end of the tolerance of a single pH measurement. However, it has to be stated that the financial and personnel efforts for generating a comparable amount of offline data by manual measurements are beyond any feasibility. From this perspective, the PLS models based on 2D fluorescence spectroscopy outperform any of the conventional measurement methods.
