Diffusion models with time-dependent parameters: An analysis of computational effort and accuracy of different numerical methods

Mathematical models describing the time course of decision-making have become an increasingly valuable tool in psychology and neuroscience. One of the most successful variants of such models is the drift-diffusion model (DDM; Ratcliff, 1978), a class of mathematical models for binary decisions (for reviews, see Forstmann et al., 2016, Ratcliff et al., 2016, Schwarz, 2022). This section begins with a description of the standard DDM and then introduces two cases where the parameters of the DDM are time-dependent, that is, they vary with the progression of time in an experimental trial. Based on this, we then lay out the purpose of the present paper in more detail and sketch the outline of its remainder.

The general idea behind the DDM is that the cognitive system accumulates noisy evidence for one or the other of two response options over time (Fig. 1). A response is selected once the evidence exceeds one of the two thresholds, denoted here as b and −b for response options 1 and 2, respectively. Evidence accumulation starts at zero without a bias toward one response option, and is driven toward one threshold with a constant increase per time step, referred to as the drift rate μ. The accumulation process is noisy, and usually modeled as a Wiener process. Because of the noise, the diffusion process exceeds the correct threshold at a random point in time, and can also result in an erroneous decision when it exceeds the incorrect threshold. While these parameters cover the decision part of the DDM, perceptual and motor processes are captured by the residual (or non-decision) time, which is added to the decision time. The full model comprises several additional parameters. In particular, the starting point, the residual time, and the drift rate can vary from trial to trial, and the starting point can be biased toward one or the other threshold (for overviews of the full model, see Ratcliff, 1978, Voss et al., 2013, Voss et al., 2015). The thresholds (b,−b) and the drift rate μ are assumed to be constant within a trial (although, as mentioned, they can vary across trials). In other words, both parameters are time-independent (or stationary), because their value does not depend on the progression of time within a single trial.
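The accumulation process described above can be illustrated with a direct simulation. The following is a minimal sketch, not the implementation analyzed in this article: it uses a simple stochastic Euler scheme, and the function names, the step size `dt`, the unit noise scale `sigma`, the residual time `t0`, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_ddm_trial(mu, b, sigma=1.0, dt=0.001, t_max=5.0, rng=random):
    """Simulate one trial; return (decision_time, choice), or (None, 0) on timeout.

    choice is +1 for the upper threshold (+b) and -1 for the lower one (-b).
    """
    x, t = 0.0, 0.0  # unbiased start at zero
    sqrt_dt = math.sqrt(dt)
    while t < t_max:
        # Euler step: deterministic drift plus a Gaussian (Wiener) increment
        x += mu * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
        if x >= b:
            return t, 1
        if x <= -b:
            return t, -1
    return None, 0


def simulate_ddm(n_trials, mu, b, t0=0.3, seed=1, **kwargs):
    """Return a list of (response_time, choice); t0 is the residual time."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        decision_time, choice = simulate_ddm_trial(mu, b, rng=rng, **kwargs)
        if choice != 0:  # drop the rare trials that hit no threshold by t_max
            results.append((decision_time + t0, choice))
    return results


# Illustrative (hypothetical) parameter values
trials = simulate_ddm(1000, mu=1.0, b=1.0)
p_upper = sum(1 for _, c in trials if c == 1) / len(trials)

# For constant mu and b and an unbiased start, the probability of hitting
# the upper threshold is known in closed form: 1 / (1 + exp(-2*mu*b/sigma^2)).
p_exact = 1.0 / (1.0 + math.exp(-2.0 * 1.0 * 1.0))
print(f"simulated: {p_upper:.3f}   closed form: {p_exact:.3f}")
```

The simulated proportion only approximates the closed-form value, because both the finite number of trials and the finite step size introduce error.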

While this stationarity assumption simplifies mathematical tractability and is sufficient to model many psychological phenomena, variants of the DDM with time-dependent parameters are motivated by additional psychological phenomena. For example, Heath (1992) considered that the drift rate may diminish with increasing processing time. Moreover, Heath also showed how a time-dependent drift rate could arise within the cascade model by McClelland (1979). In the following, however, we will describe more recent modeling work that employed time-dependent drift rates and time-dependent thresholds.

One example of a time-dependent drift rate μ(t) is the Diffusion Model for Conflict (DMC) tasks (Ulrich et al., 2015). This model was developed specifically to account for positive and negative delta functions in typical conflict tasks. Consider, for example, an Eriksen flanker task (Eriksen & Eriksen, 1974). In this task, a centrally presented imperative stimulus requires a left or right manual key press, while it is surrounded by flankers that either demand the same response (in congruent trials) or the other response (in incongruent trials). Responses are faster and less error-prone in congruent than in incongruent trials, which is typically referred to as the congruency effect. This congruency effect usually becomes larger with longer response times (RTs), that is, a positive delta function is observed (see de Jong et al., 1994, Pratte et al., 2010, Schwarz and Miller, 2012).

The Simon task (Simon, 1969) is another conflict task where participants are to respond to a stimulus (e.g., the identity of a letter or a color), which is presented either in a left or right location. Here, participants respond faster when stimulus location and (correct) response location match (in congruent trials) than when they mismatch (in incongruent trials). However, in this case, the congruency effect becomes smaller with longer RTs, that is, a negative delta function is observed (e.g., Pratte et al., 2010).

Motivated by dual-process accounts for congruency effects, the DMC explains different delta functions by assuming that two diffusion processes are superimposed: (1) a linear drift function representing controlled response selection and (2) a pulse function toward the same response in congruent trials and to the alternative response in incongruent trials (see Fig. 1b for an illustration). The latter process represents the automatic activation induced by the task-irrelevant stimulus features in a conflict task (e.g., the flankers or the location of the stimuli).

The superimposed process tends to hit the upper threshold earlier when the automatic activation turns toward the positive direction (in congruent trials) compared with the case of an activation into the negative direction (in incongruent trials). This reflects the congruency effect (see Fig. 1b). In technical terms, the superimposition results in a time-dependent drift rate, because the drift rate changes with increasing time during a trial. Importantly, different delta functions can also result: The superimposition of the two diffusion processes results in a negative delta function, when the automatic activation reaches its maximum rather early, but it results in a positive delta function, when the maximum is reached later.
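The way the superimposition produces a time-dependent drift rate can be sketched in a few lines. The text above specifies only that the automatic process is a pulse function; the particular gamma-shaped pulse used below, the function names, and all parameter values (`mu_c`, `amp`, `tau`) are assumptions made here for illustration. The linear controlled process contributes a constant drift, and the automatic process contributes the time derivative of the pulse.

```python
import math


def automatic_activation(t, amp, tau):
    """Expected automatic activation: a pulse that peaks at t = tau with height amp.

    Assumed shape (gamma-like, shape parameter 2): amp * (t/tau) * exp(1 - t/tau).
    """
    return amp * (t / tau) * math.exp(1.0 - t / tau)


def automatic_drift(t, amp, tau):
    """Time derivative of automatic_activation."""
    return (amp / tau) * math.exp(1.0 - t / tau) * (1.0 - t / tau)


def dmc_drift(t, mu_c, amp, tau, congruent):
    """Superimposed drift: constant controlled drift plus signed automatic drift."""
    sign = 1.0 if congruent else -1.0
    return mu_c + sign * automatic_drift(t, amp, tau)


# Illustrative (hypothetical) values; time in seconds
mu_c, amp, tau = 0.5, 0.1, 0.05
for t in (0.01, 0.05, 0.2):
    con = dmc_drift(t, mu_c, amp, tau, congruent=True)
    inc = dmc_drift(t, mu_c, amp, tau, congruent=False)
    print(f"t = {t:.2f}  congruent: {con:+.3f}  incongruent: {inc:+.3f}")
```

Before the pulse reaches its maximum at `tau`, the automatic drift supports the correct response in congruent trials and opposes it in incongruent trials; after the maximum, the sign of its contribution reverses as the automatic activation decays. Shifting `tau` therefore changes when in the trial the two conditions diverge.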

Another example of time-dependent drift rates is the Shrinking Spotlight Model (White, Ratcliff, & Starns, 2011) that addresses conflict processing in the flanker task. This model also accounts for the observation that responses with short RTs are especially error-prone in incongruent trials. Such a result suggests that the flankers exhibit their influence especially during the early phases of processing. The model accounts for this by assuming that attention is distributed broadly at the onset of a trial and shrinks over time, thereby decreasing the influence of the flankers the more time has progressed. For a congruent trial, the resulting drift rate is time-independent. In contrast, for incongruent trials, the resulting drift rate is time-dependent and increases with time, as more and more information is processed and the influence of the flankers becomes smaller.
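The contrast between congruent and incongruent trials can be made concrete with a simplified sketch. Everything specific in it is an assumption for illustration and is not specified in the text above: the Gaussian spotlight centered on the target, the linear shrinkage of its standard deviation, the unit-width target region, the treatment of all remaining space as flanker region, and the names and values of the parameters.

```python
import math


def normal_cdf(x, sd):
    """CDF of a zero-mean normal distribution with standard deviation sd."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def spotlight_drift(t, p, sd0, shrink_rate, congruent, sd_min=0.001):
    """Drift rate under an attentional spotlight that narrows linearly over time.

    The target occupies [-0.5, 0.5]; everything outside that interval is
    treated as flanker region (a simplification made for this sketch).
    """
    sd = max(sd0 - shrink_rate * t, sd_min)
    a_target = normal_cdf(0.5, sd) - normal_cdf(-0.5, sd)  # attention on target
    a_flank = 1.0 - a_target                               # attention on flankers
    flanker_sign = 1.0 if congruent else -1.0
    # Each region contributes its perceptual input p, weighted by attention
    return p * a_target + flanker_sign * p * a_flank


# Illustrative (hypothetical) values; time in seconds
for t in (0.0, 0.1, 0.2, 0.4):
    con = spotlight_drift(t, p=0.4, sd0=1.5, shrink_rate=3.0, congruent=True)
    inc = spotlight_drift(t, p=0.4, sd0=1.5, shrink_rate=3.0, congruent=False)
    print(f"t = {t:.1f}  congruent: {con:+.3f}  incongruent: {inc:+.3f}")
```

In congruent trials the attention weights always sum to one, so the drift rate stays at `p`. In incongruent trials the drift rate equals `p * (2 * a_target - 1)`, which can be negative early in the trial and rises as the spotlight narrows.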

The second parameter that may change with time is the threshold b (Fig. 1c). Arguably, in most applications of the DDM, thresholds are assumed to be constant within a trial. However, models in which thresholds vary with time (most often they are assumed to collapse with increasing time) have been suggested to account for some phenomena (e.g., Churchland et al., 2008, Ditterich, 2006a, Evans et al., 2020, Hanks et al., 2011). Collapsing thresholds were even suggested to account for slow errors, thereby rendering between-trial variability of the drift rate unnecessary (e.g., Ditterich, 2006b, Palmer et al., 2005).

Further, a recent study suggested that different experimental methods to induce speed-accuracy tradeoffs affect different parameters of a DDM. More precisely, Katsimpokis, Hawkins, and van Maanen (2020) suggested that instructions affect the initial separation of thresholds, while response deadlines affect the rate of their collapse over time. A very similar idea, though for an accumulator model, was suggested to account for results obtained with free-choice tasks. These are tasks where the stimulus does not request one particular response, but where the actor should choose among a set of response options (e.g., Berlyne, 1957, Janczyk et al., 2020). Based on results from priming experiments, Mattler and Palmer (2012) suggested that the decision criterion relaxes in free-choice trials, an assumption similar to a collapsing threshold in a DDM.

It is, however, not clear whether time-dependent thresholds are psychologically plausible. Moreover, at least three studies (Evans et al., 2020, Hawkins et al., 2015, Voskuilen et al., 2016) concluded that collapsing thresholds add little if any improvement to model fit when compared to a DDM with time-independent thresholds. For example, Voskuilen et al. (2016) used a hyperbolic ratio function to model a decrease of the thresholds over time (see Fig. 1c). The best estimates for its parameters resulted in a shape resembling a time-independent threshold (see their Fig. 5).
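A hyperbolic ratio collapse can be sketched as follows. The parameterization is an assumption chosen for illustration and may differ in detail from the one used in the cited study: `b0` is the initial threshold, `kappa` the asymptotic proportion of collapse, and `t_half` the time at which half of that collapse has occurred.

```python
def collapsing_threshold(t, b0, kappa, t_half):
    """Upper threshold at time t; the lower threshold is its negative.

    kappa in [0, 1] is the asymptotic proportion of collapse;
    t_half is the time at which half of that collapse has occurred.
    """
    return b0 * (1.0 - kappa * t / (t + t_half))


# Illustrative (hypothetical) values; time in seconds
for kappa in (0.0, 0.05, 0.8):
    values = [collapsing_threshold(t, b0=1.0, kappa=kappa, t_half=0.5)
              for t in (0.0, 0.5, 2.0)]
    print(f"kappa = {kappa:.2f}: " + ", ".join(f"{v:.3f}" for v in values))
```

With `kappa` close to zero the function is nearly flat, which corresponds to an estimated shape that resembles a time-independent threshold.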

Yet, there may be tasks and situations where collapsing thresholds are more likely to improve model fit. One such situation arises in difficult tasks with a very small drift rate, which result in long RTs. While the DDM has mainly been applied to rather simple tasks with short RTs, it also appears viable for longer RTs (Lerche & Voss, 2019). In this case, the process will eventually hit a threshold, but this may take much too long with a time-independent threshold (e.g., Voss, Lerche, Mertens, & Voss, 2019). As an additional example, expanded judgment or deferred decision-making tasks are cases where decisions are made based on increasingly smaller amounts of evidence as time progresses (e.g., Busemeyer & Rapoport, 1988), and data from highly practiced participants may also be better accounted for with collapsing thresholds (Hawkins et al., 2015).

In summary, it is controversial whether time-dependent thresholds are a valuable addition to the DDM. Although this controversy is not yet settled, the present article nonetheless includes time-dependent thresholds in its analyses.

The present article reports a comparison of different methods for solving the underlying mathematics of diffusion models, particularly with time-dependent parameters. More precisely, we investigate, on the one hand, the direct simulation of the model using the stochastic Euler method. On the other hand, we analyze the integral equation method as well as different methods based on a reformulation of the model into partial differential equations (PDEs), such as the matrix method of Diederich and Busemeyer (2003). We focus on the computational efficiency of the different methods; thus, we present a theoretical analysis of how much more effort is required for a desired error reduction and investigate the computational times required for the different implementations. In doing so, we highlight the critical differences between these methods, their potential advantages and shortcomings, and present solutions to some known issues with particular methods.

Shinn, Lam, and Murray (2020) describe and compare different approximation methods for diffusion models, and the authors also introduce a Python software package that allows for highly efficient computations in a flexible environment. They further discuss the versatility of the approaches, for example, when applied to time-dependent parameters. While their paper focuses on the practical implementation of the methods, the emphasis of our work is to present the theoretical foundations and to explore the significance of the mathematical background for the application.

The remainder of this article is structured as follows. Section 2 formalizes the DDM and introduces stochastic simulations, random walks, and, with a particular focus, the Kolmogorov Forward Equation (KFE) and the Kolmogorov Backward Equation (KBE) as methods to implement the diffusion model. Section 3 introduces several discretizations of the aforementioned methods with emphasis on the quality of the approximations and on the computational cost of these numerical approaches. We also introduce a new approach that is based on the discretization of the KFE and a realization that is able to handle time-dependent drift rates and thresholds without a drop in efficiency (see in particular Section 3.3 and Appendix A.1). The practical dependence of computational costs (or efficiency) on the desired level of error reduction is investigated in Section 4. In Section 5, we fit our approach to empirical data gathered in the context of DMC, that is, a model with time-dependent drift rates (Ulrich et al., 2015). Section 6 summarizes the conclusions of the present work, including advice for researchers using DDMs with time-dependent parameters. The source code to reproduce all computations is available as a Zenodo repository (Richter et al., 2023).

Remark 1 Terminology: Model, Method, and Discretization

In the following, it is necessary to distinguish between model, method, and discretization. A model describes a cognitive process and is formulated as a stochastic differential equation (i.e., Equation (1)). From this model, we deduce the first-passage time distribution, and to do so, different methods are available, that is, stochastic simulations, the KFE (6), the KBE (7), the integral equation (3), and random walks (see Section 2.4). All of these methods can be used to compute the predicted probability density function (PDF) of the first-passage times, and if they were solved exactly, they would yield identical results.

This, however, is not possible in the general case as it would require infinite computational resources. Hence, for each method, different discretizations are possible. For example, the trapezoidal rule (20) is used in approximating the integral equation and finite difference discretizations are applied to the KBE and KFE. Depending on the method and on the specific situation (e.g., time-dependency of parameters), one approach will be superior to the other. A discussion of the subtle differences is the content of the following sections.
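The relation between computational effort and error for a given discretization can be illustrated with the composite trapezoidal rule. The sketch below is a toy example under stated assumptions: the integrand is a simple exponential chosen because its integral is known exactly, not the kernel of the integral equation, and the function name `trapezoid` is introduced here for illustration.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


exact = 1.0 - math.exp(-1.0)  # integral of exp(-t) over [0, 1]
errors = []
for n in (10, 20, 40, 80):
    err = abs(trapezoid(lambda t: math.exp(-t), 0.0, 1.0, n) - exact)
    errors.append(err)
    print(f"n = {n:3d}  error = {err:.3e}")

# Ratio of successive errors when the number of subintervals is doubled
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print("error ratios:", ", ".join(f"{r:.2f}" for r in ratios))
```

For a smooth integrand, doubling the number of subintervals doubles the work and reduces the error by roughly a factor of four, which reflects the second-order accuracy of the trapezoidal rule.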
