Quantitative Modelling in Stem Cell Biology and Beyond: How to Make Best Use of It

When attempting to answer a biological question, experimental means and direct hypothesis testing may not be sufficient to yield a clear answer. The experimental data may contain the information relevant to answering the question, but it may not be readily extractable through direct statistical tools. An example is the search for cell fate choice patterns using genetic cell lineage tracing [20•, 21,22,23] via the Cre-Lox recombinase system [24,25,26]. The Cre-Lox system allows labelling of cells with fluorescent proteins in an inheritable way, i.e. all of a labelled cell's progeny carries that fluorescent marker, which allows individual clones to be traced (see Fig. 1A). When tissue is harvested and clones analysed, one can obtain the full statistical distribution of clone sizes and their composition of cell (sub-)types if appropriate markers are available. The issue is that, while this statistical distribution is the result of the cell fate choices and thus contains some information about them, it cannot directly reveal the cell fate choice patterns. The problem is twofold: (1) the data is a snapshot of clonal distributions at particular time points, while the biological question is about a dynamical process, i.e. the changes of cell type over time and upon cell division, and (2) the data is multicellular, about cell populations (a clone is a sub-population of cells), while the biological question is about the fate of single cells and their daughters upon division. Hence, experimental data and the biological question, that is, any associated hypothesis, do not match in terms of scale of time (static vs. dynamic) and cell number (single- vs. multicellular).

Fig. 1

Hypothesis testing via quantitative modelling, exemplified on clonal statistics. A Depiction of lineage tracing via Cre-Lox recombination and clonal statistics. Transgenic animals carry a GFP gene preceded by a Stop sequence that is flanked by Lox constructs. The Lox-flanked Stop sequence is removed by a Cre recombinase, which is expressed upon administration of tamoxifen, and thus GFP is expressed. The GFP label is inherited by the progeny of the initial cell, which constitutes a clone and which grows over time after Cre recombination (centre top, ©2014 SpringerNature. Reprinted, with permission, from [28]). The statistical distributions of clone sizes are recorded (centre bottom, data from P. H. Jones as published in [29]), yet they cannot directly distinguish between hypotheses of cell fate outcome (bottom). B Quantitative modelling can bridge the gap between hypotheses and data: each hypothesis represents the rules for a stochastic model of cell fate choice dynamics, which predicts the hypothesis' expected clonal statistics. The latter can then be directly compared with the experimental data and tested (bottom, data from P. H. Jones, as published in [29]). C, D Illustration of universality. C Two models of stem cell fate choice in homeostasis, which differ in some features yet predict the same clone size distribution and are thus indistinguishable through static cell lineage tracing data [30] (plot on bottom: ©2019 SpringerNature. Reprinted, with permission, from [31]). D ~ 800 randomly generated cell fate models can be categorised into only two universality classes, one predicting an exponential distribution in the long-term limit and the other a normal distribution if mean clone sizes are large (plots reprinted under a CC-BY license from [32••]). The two classes are distinguished by only one predictive relevant property, namely, whether the number of stem cells is strictly conserved or not

At this point, quantitative modelling can help. It can bridge the gap between biological question/hypotheses and the experimental data, as depicted in Fig. 1B. The key is that every candidate hypothesis can be interpreted as the rules for a mechanistic model, which can be formulated and evaluated mathematically or computationally, as a set of differential equations (deterministic dynamics) or as a stochastic process (which includes random noise), to generate "virtual data" as would be predicted by that hypothesis. That prediction can then be directly compared with the experimental data and tested against it.

As an example, let us again consider the search for cell fate choice patterns. Let us assume that we have measured clonal distributions experimentally; yet, we cannot directly see the cell fate choice rules from this data. We can, however, take the possible candidate hypotheses and translate each of them into the update rules of a stochastic process. Then, these models can be evaluated to produce predicted clonal (probability) distributions as output, and the parameters of the models can be fitted by some optimisation method. Those predicted clonal distributions can now be directly compared with the data. Hence, the mathematical modelling turns hypotheses that cannot be compared with the data into predicted clonal distributions that can be directly overlaid with the data (see Fig. 1B).
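As an illustration of this pipeline, the following sketch simulates one candidate hypothesis (resembling model 1 of Fig. 1C) as a stochastic process and collects the predicted clone size distribution, which could then be overlaid with measured clonal data. All rates and fate probabilities are illustrative placeholders, not values from the text, and committed cells are simplistically assumed to persist without dividing.

```python
import random
from collections import Counter

def simulate_clone(t_max, div_rate=1.0, p_ss=0.25, p_cc=0.25, rng=random):
    """One clone under a model-1-like hypothesis: each stem cell (S)
    divides at rate div_rate; the daughter pair is SS (prob p_ss),
    CC (prob p_cc), or SC (remainder).  Homeostasis requires p_ss == p_cc.
    Committed (C) cells stay in the clone but do not divide here."""
    s, c, t = 1, 0, 0.0
    while s > 0:
        t += rng.expovariate(s * div_rate)   # Gillespie step: next division
        if t > t_max:
            break
        u = rng.random()
        if u < p_ss:
            s += 1            # S -> S + S
        elif u < p_ss + p_cc:
            s -= 1            # S -> C + C
            c += 2
        else:
            c += 1            # S -> S + C
    return s + c              # total labelled cells in the clone

rng = random.Random(1)
sizes = [simulate_clone(t_max=10.0, rng=rng) for _ in range(5000)]
clone_size_dist = Counter(sizes)   # predicted clonal statistics, to be fitted to data
```

The predicted distribution `clone_size_dist` plays the role of the "virtual data" of Fig. 1B; fitting would repeat this evaluation while optimising the rates and probabilities.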

Now, if we find that a model's output cannot be fitted to the data, we can reasonably reject the corresponding hypothesis. However, we have to be careful in how we interpret a fitting model output. Can we confirm a hypothesis and settle the question about cell fate choice rules if a predicted clonal distribution fits the data? The answer is a clear "no"; a prediction fitting the data does not confirm a hypothesis, since other candidate hypotheses might predict the data equally well or better. Far from being the exception, this is the rule. On the one hand, over-fitting, as described before, can allow a wrong model to be fitted to the data if it is formulated with too many free parameters.Footnote 3 Hence, if one does not have knowledge of all parameters in a possibly very complex biological process, the only reasonable option is to simplify the model to the point that the number of parameters is low enough to avoid over-fitting. Thus, many aspects of an initially complex model may need to be neglected (for stochastic models, this simplification essentially means that neglected features are assumed to be random and unbiased, which is covered by a stochastic model's random noise). On the other hand, often more than one model can fit the data even if the number of parameters is sufficiently low, since some features that distinguish models may not affect the predictions at all. This phenomenon is often called "universality", which we discuss in the following section.
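The over-fitting pitfall can be made concrete with a deliberately generic curve-fitting toy example (unrelated to any specific cell fate model): a model with as many free parameters as data points reproduces the observations exactly, yet generalises worse than the simple "true" model. All numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 10)
y = 2.0 * x + rng.normal(0.0, 0.1, x.size)   # "true" process: a line plus noise

line = np.polyfit(x, y, deg=1)      # 2 free parameters
wiggle = np.polyfit(x, y, deg=9)    # 10 free parameters: one per data point

# the over-parameterised model reproduces the training data (almost) exactly ...
res_line = float(np.sum((np.polyval(line, x) - y) ** 2))
res_wiggle = float(np.sum((np.polyval(wiggle, x) - y) ** 2))

# ... yet on fresh data from the same process its error does not vanish
x_new = np.linspace(-0.9, 0.9, 10)
y_new = 2.0 * x_new + rng.normal(0.0, 0.1, x_new.size)
err_wiggle = float(np.sum((np.polyval(wiggle, x_new) - y_new) ** 2))
```

The flexible model "fits" perfectly only because it also fits the noise; a fitting output therefore cannot, by itself, confirm a hypothesis.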

Universality: Curse and Opportunity

Universality is the phenomenon that different models can sometimes generate the same predictions with respect to a certain type of data, if some quantities, like mean values or elapsed time, are sufficiently large [27]. Models which yield the same predictions share some common features, called "predictive relevant" features, but may differ substantially in others, called "predictive irrelevant" features (notably, predictive irrelevant features may still be biologically relevant). Models that differ only in predictive irrelevant features, i.e. that yield the same predictions, can be categorised into one "universality class", while those that differ in predictive relevant features belong to different universality classes. This has the unfortunate consequence that hypotheses corresponding to models of the same universality class will fit the data equally well and thus cannot be distinguished when the corresponding models are tested against that data. From this it also follows that a fitting model is not necessarily the "correct" model: if any of the predictive irrelevant features are biologically relevant for the posed biological question, any other model of the same universality class, which may differ in such biologically relevant yet predictive irrelevant features, could fit the data just as well.

Universality can have several origins:

Weak convergence [33]: for stochastic processes—which model some degree of randomness—the phenomenon of "weak convergence" means that they generate statistics that converge over time, or when mean numbers are large, to the same limiting distributions, provided the predictive relevant features are the same. The most common of these universal limiting distributions is the normal distribution. There is a vast number of random variables and stochastic processes which all produce a normal distribution and thus belong to the same universality class; only a few predictive relevant features must be fulfilled for this: (1) the final outcome of the process is a sum of individual steps/random quantities, and (2) the mean value and variance of each step are bounded [34]. Furthermore, the number of steps (interpreted as time steps in a stochastic process) must be large. Notably, any statistical features of individual step sizes, beyond the boundedness of mean and variance, are predictive irrelevant and do not affect outcomes if the number of steps is large.
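This weak-convergence argument can be checked numerically. The sketch below (with illustrative numbers) sums two very different step distributions that share the same per-step mean and variance; after many steps, both yield statistically indistinguishable, approximately normal outcomes.

```python
import random
import statistics

def walk(n_steps, step, rng):
    """Outcome as a sum of i.i.d. random steps -- the generic setting
    in which weak convergence to a normal distribution applies."""
    return sum(step(rng) for _ in range(n_steps))

# two very different step distributions sharing mean 0 and variance 1
coin = lambda rng: rng.choice((-1.0, 1.0))            # discrete +/-1 steps
unif = lambda rng: rng.uniform(-3 ** 0.5, 3 ** 0.5)   # continuous uniform steps

rng = random.Random(0)
n = 400
a = [walk(n, coin, rng) for _ in range(4000)]
b = [walk(n, unif, rng) for _ in range(4000)]

# both converge to Normal(0, n): the step-level details are predictive
# irrelevant; only per-step mean and variance (and large n) matter
var_a = statistics.pvariance(a) / n
var_b = statistics.pvariance(b) / n
```

Here `var_a` and `var_b` both settle near 1, i.e. both processes predict the same limiting distribution despite entirely different microscopic rules.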

Non-dimensionalisation: in both stochastic and deterministic models, quantities and parameters carry physical units, and these units can be chosen arbitrarily. For example, instead of using "seconds" as the time unit, one may choose the inverse of the cell division rate as the time unit, whereby the cell division rate trivially becomes "one division per time unit". This can be done with other parameters as well, which thus become predictive irrelevant. Through non-dimensionalisation,Footnote 4 several different models may actually map to the same non-dimensionalised model, with a common prediction, and these thus form the same universality class.
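As a minimal sketch (with generic symbols not taken from the text: a division rate $\lambda$, a differentiation rate $\gamma$, and cell numbers $S$ and $C$), rescaling time by the division rate removes one parameter, leaving only the ratio of the two rates:

```latex
\frac{\mathrm{d}C}{\mathrm{d}t} = \lambda S - \gamma C
\quad \xrightarrow{\;\tau \,=\, \lambda t\;} \quad
\frac{\mathrm{d}C}{\mathrm{d}\tau} = S - \frac{\gamma}{\lambda}\, C
```

Any two models with different $(\lambda, \gamma)$ but the same ratio $\gamma/\lambda$ map onto the same non-dimensionalised equation and hence make identical predictions in these units.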

Universality of critical phenomenaFootnote 5: complex systems with many interacting components may display critical phenomena, like phase transitions (e.g. liquid to gas, or liquid blood becoming a solid blood clot). Sufficiently close to the critical points, many models that differ in some features—i.e. the predictive irrelevant features—predict the same functional behaviour of the quantities describing the collective properties of those systems [35, 36]. The predictive relevant features are usually very few and often categorical, for example, which quantities are conserved, which symmetries prevail, and whether the configurations of the system are continuous or discrete (countable in integer numbers).

As an example, consider two cell fate models in homeostasis, as depicted in Fig. 1C. In model 1, a stem cell (S) divides, and upon this division, the daughter cells irreversibly choose their fate, to either remain a stem cell until the next division or to commit to differentiation (C). In model 2, cell divisions are constrained to always be asymmetric, with one cell remaining a stem cell (S) and the other being primed for differentiation (D), while the cell types may also change independently of cell division, in a reversible way, that is, an S-cell can become a D-cell and a D-cell can revert to become an S-cell again [30]. Despite these fundamental differences, both models predict the same clone size distribution (Fig. 1C, bottom). Why is this, and what are the predictive relevant features these models share? Answering these questions requires some mathematical analysis, on which we will elaborate later (see also a detailed analysis in Ref. [32••]).
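A glimpse of the shared universal prediction can be obtained from a deliberately reduced caricature of model 1 (a critical birth–death process in which committed cells are assumed to be lost quickly, so only stem cells are counted; all parameters are illustrative): in the long-time limit, surviving clone sizes approach an exponential distribution. That model 2 collapses onto the same curve is shown in Refs. [30, 32••].

```python
import random

def stem_count(t_max, lam=1.0, rng=random):
    """Critical birth-death caricature of model 1: each stem cell divides
    at rate lam; the division yields SS or CC with equal probability, and
    committed cells are assumed to be lost quickly (only S is counted)."""
    n, t = 1, 0.0
    while n > 0:
        t += rng.expovariate(n * lam)          # Gillespie step: next division
        if t > t_max:
            break
        n += 1 if rng.random() < 0.5 else -1   # SS (+1) or CC (-1)
    return n

rng = random.Random(2)
sizes = [stem_count(t_max=20.0, rng=rng) for _ in range(20000)]
surv = [n for n in sizes if n > 0]             # persisting clones only

mean = sum(surv) / len(surv)
frac_large = sum(n > 2 * mean for n in surv) / len(surv)
# for an exponential distribution, P(size > 2 * mean) = e^-2 ~ 0.135;
# the simulated fraction settles near this value in the long-time limit
```

The exponential tail depends only on the predictive relevant feature that stem cell numbers fluctuate without strict conservation; the microscopic division rules are predictive irrelevant.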

Besides structural features of models, the parameters of a model can be predictive relevant or irrelevant. (Predictive) irrelevant parameters are those which do not change the model predictions at all when varied under conditions where universality prevails; relevant parameters are those which do affect the model predictions. However, a minimal set of relevant parameters does not necessarily consist of the plain model parameters; often, predictive relevant parameters are products or ratios of plain model parameters rather than the parameters themselves. For example, in model 1 of Fig. 1C, the predicted distribution of C-cells depends only on the ratio of the division rate and the terminal differentiation rate, not explicitly on the individual parameters themselves [20•].Footnote 6 If one has found a best set of parameter values, then doubling both the division rate and the terminal differentiation rate leads to the same best fit, and thus the "true" best set of parameters is not identifiable [37].
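This non-identifiability can be demonstrated with a deliberately reduced caricature (not the full model of Fig. 1C): a single stem cell feeding a pool of C-cells, produced at an assumed division rate `lam` and removed at an assumed terminal differentiation rate `gam`. The stationary C-cell distribution is Poisson with mean `lam / gam`, so doubling both rates leaves the prediction unchanged.

```python
import random

def c_cells_stationary(lam, gam, t_max, rng):
    """Number of committed (C) cells fed by one stem cell: C produced at
    rate lam, removed at rate gam per cell.  The stationary distribution
    is Poisson with mean lam/gam -- only the ratio of the rates matters."""
    c, t = 0, 0.0
    while True:
        total = lam + gam * c                 # total event rate
        t += rng.expovariate(total)           # Gillespie step
        if t > t_max:
            return c
        if rng.random() < lam / total:
            c += 1                            # production event
        else:
            c -= 1                            # removal event

rng = random.Random(3)
a = [c_cells_stationary(1.0, 0.5, 50.0, rng) for _ in range(4000)]
b = [c_cells_stationary(2.0, 1.0, 50.0, rng) for _ in range(4000)]  # both rates doubled

mean_a = sum(a) / len(a)   # both means lie close to lam/gam = 2,
mean_b = sum(b) / len(b)   # so the two parameter sets are indistinguishable
```

Because the two parameter sets produce statistically identical outputs, any fit can only pin down the ratio, not the individual rates.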

But universality also provides opportunities and thus may be a desired property: if we only wish to distinguish predictive relevant features, and accept for now that we cannot distinguish the predictive irrelevant ones, we do not need to test all models of a universality class but can choose the simplest model—having the lowest number of parameters and being the easiest one to evaluate and analyse—as a representative of that class, and thus simplify the whole modelling campaign substantially. Since the predictive relevant features are often categorical, the number of universality classes is usually very small, and thus only a small number of models, one representative of each universality class, need to be tested. Furthermore, universality is to some extent essential for model testing: models always require some degree of simplification. Universality allows simplifications, that is, the neglect of predictive irrelevant features, without compromising the predictive accuracy of a hypothesis/model. Without universality, that is, if all features were predictive relevant, every simplification would lead a model to deviate in its predictions from the data, and even a reasonably "true" model—when subject to some technically necessary simplifications—would not fit the data. This is usually not desired, since simplifications, even if just for technical reasons, are often essential to evaluate models properly.

Could we overcome the limitations posed by universality? Universality emerges in view of the type of data and the circumstances under which it is collected; other types of data or a change of experimental settings may render certain predictive irrelevant features relevant and thus distinguishable. One could therefore try to obtain richer data with more features. For example, when assessing cell fate choices, one could try to observe them directly through intravital live imaging, to gain the time dimension as a feature of the data, with which further details of the cell fate choices could be distinguished. While such experiments are possible in some circumstances (for example, observing live cell fate choices in mouse epidermis [38]), they are more expensive in terms of money and effort, more invasive, or not possible in many tissues and situations. On the other hand, universality does not always emerge: properties are genuinely universal usually only in limiting cases, e.g. when experiments are run over longer time scales or when numbers (such as clone sizes) are large [33]. When data is collected from experiments after shorter time scales or when numbers are smaller, for example, in short-term cell lineage tracing after a few cell divisions [28, 39], the data, and the related model outputs, are not universal, and more features could, in principle, be distinguished. However, this may lead to a trade-off one wishes to avoid: while model details are easier to distinguish with short-term data, reasonable and necessary simplifications to the model may lead to undesired deviations.

To summarise, there is no one-size-fits-all solution, and considerable intuition is needed to balance the trade-offs between the opportunities and limitations of universality: on the one hand, one wishes to distinguish a sufficient number of features, i.e. to have them be predictive relevant; on the other hand, one wishes to simplify the models as much as possible by neglecting predictive irrelevant features. Ideally, the predictive relevant features are the same as the ones relevant to the biological question; this cannot be assured, but appropriate choices of experimental settings can adapt universal features to our purposes, at least to some extent. Unfortunately, the predictive (ir-)relevant features which define the universality classes are often not known beforehand. Then, we may need to travel down the rocky route and follow the classical scientific method, according to K. Popper: come up with a set of all plausible hypotheses and test the corresponding models for all of them; reject those hypotheses which cannot be brought into accordance with the data through fitting, while those that fit (possibly more than one model) may then constitute the universality class of the "true" model. Without prior knowledge about universality classes, however, the number of candidate models to test could be extremely large and arbitrarily complex. Hence, in order to optimise a modelling approach, it is essential to gather some a priori knowledge about the universality classes and their predictive relevant features. This can only be obtained by a mathematical analysis of candidate models' properties beforehand, as described in the following section.
