Computational modeling has helped cognitive scientists, psychologists, and neuroscientists quantitatively test theories by translating them into mathematical equations that yield precise predictions (Palminteri et al., 2017; Wilson & Collins, 2019). Cognitive modeling often requires computing how well a model fits experimental data. Measuring this fit – for example, in the form of model evidence (Kass & Raftery, 1995) – enables a quantitative comparison of alternative theories of behavior. Measuring model fit as a function of model parameters identifies the best-fitting parameters for the given data, via an optimization procedure over the fit measure (typically the negative log-likelihood) in the space of possible parameter values. When models are fitted as a function of experimental conditions, parameter estimation can help explain how task manipulations modify cognitive processes (Eckstein et al., 2022); when they are fitted at the individual level, estimated parameters can help account for individual differences in behavioral patterns (Lee & Webb, 2005). Moreover, recent work has applied cognitive models in the rapidly growing field of computational psychiatry to quantify the functional components of psychiatric disorders (Huys, Browning, Paulus, & Frank, 2021). Importantly, cognitive modeling is particularly useful for explaining choice behavior in decision-making tasks – it reveals links between subjects’ observable choices and putative latent internal variables such as objective or subjective value (Tversky & Kahneman, 1992), strength of evidence (Bitzer, Park, Blankenburg, & Kiebel, 2014), and history of past outcomes (Dayan & Niv, 2008). This link between internal latent variables and choices is made via a policy: the probability of making a choice among multiple options based on past and current information.
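As a brief illustration of this fitting procedure, the following is a minimal sketch of maximum likelihood estimation by negative log-likelihood minimization; `model_nll` is a hypothetical function (not from the cited work) that returns the negative log-likelihood of the observed data under a candidate parameter vector.

```python
import numpy as np
from scipy.optimize import minimize

def fit_parameters(model_nll, data, x0, bounds):
    """Minimize the negative log-likelihood over the parameter space."""
    result = minimize(model_nll, x0=x0, args=(data,), bounds=bounds)
    return result.x, result.fun  # best-fitting parameters, minimized NLL

# Hypothetical usage: fit a one-parameter model with a learning rate in [0, 1].
# params, nll = fit_parameters(model_nll, data, x0=np.array([0.5]), bounds=[(0.0, 1.0)])
```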
An important feature of choice behavior produced by biological agents is its inherent noise, which can be attributed to multiple sources, including inattention (Esterman & Rothlein, 2019; Warm et al., 2008), stochastic exploration (Wilson, Geana, White, Ludvig, & Cohen, 2014), and internal computation noise (Findling & Wyart, 2021). Choice randomization can be adaptive, as it encourages exploration, which is essential for learning (Sutton & Barto, 2018); when implemented correctly, exploration can come close to optimal performance (Chapelle & Li, 2011; Thompson, 1933; Wang & Wilson, 2018). However, the role of noise is often downplayed in computational cognitive models, which usually emphasize noiseless information processing over internal latent variables – for example, in reinforcement learning, how choice values are updated after each outcome (Daw & Tobler, 2014). A common approach to modeling noise in choice behavior is to incorporate simple parameterized noise into the model’s policy (Wilson & Collins, 2019). For example, a greedy policy, which chooses the best option deterministically, can be “softened” by a logistic or softmax function with an inverse temperature parameter, β, such that choices among more similar options are more stochastic than choices among more dissimilar ones. Another approach is an ϵ-greedy policy, where a noise level parameter, ϵ, weighs a mixture of a uniformly random policy and a greedy policy. This approach is motivated by a different intuition: that lapses in choice patterns can happen independently of the specific internal values used to make decisions. Multiple noise processes can be used jointly in a model when appropriate (Collins & Frank, 2012). Both static policies are illustrated in the sketch below.
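As a concrete reference, here is a minimal sketch of the two static policies described above (standard textbook definitions, not code from the present work), written in Python with NumPy:

```python
import numpy as np

def softmax_policy(values, beta):
    """Softmax over option values; beta is the inverse temperature."""
    values = np.asarray(values, dtype=float)
    logits = beta * values
    logits -= logits.max()              # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def epsilon_greedy_policy(values, epsilon):
    """Mixture of a uniform policy (weight epsilon) and a greedy policy."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    greedy = np.zeros(n)
    best = np.flatnonzero(values == values.max())
    greedy[best] = 1.0 / len(best)      # split ties uniformly
    return epsilon / n + (1 - epsilon) * greedy
```

Note that β and ϵ are fitted once per dataset under the static assumption: the same noise level applies to every trial.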
Failure to account for a noisy choice process in modeling can lead to under- or over-emphasis of certain data points, and thus to inappropriate conclusions (Nassar & Frank, 2016; Schaaf et al., 2019). However, commonly used policies with noisy decision processes share strong assumptions. In particular, they typically assume that the level of noise in the policy is fixed, or “static”, with regard to some variable of the learning process (e.g., the trial for ϵ-greedy and the value difference between choices for softmax) over the duration of the experiment, with some exceptions (reviewed by Schulz & Gershman, 2019, and Yechiam & Busemeyer, 2005) further described in the Discussion. This static assumption may hold for some sources of noise, such as computation noise and some exploration noise, but many other sources are not guaranteed to generate consistent noise levels. For instance, a subject might disengage during some periods of the experiment but not others. Therefore, existing models with static noise estimation may fail to fully capture the variance in noise levels, which can impact the quality of computational modeling.
To resolve this issue, we introduce a dynamic noise estimation method that estimates the probability of noise contamination in choice behavior trial by trial, allowing it to vary over time. Fig. 1A illustrates examples of static and dynamic noise estimation on human choice behavioral data (Eckstein et al., 2022; Master et al., 2020). The probabilities of noise inferred by models with static and dynamic noise estimation are shown in conjunction with choice accuracy. In this example, choice accuracy drops steeply to chance level (0.33) around Trial 350, indicating an increased probability of noise contamination. This change is captured by dynamic noise estimation but not by the static method.
Our dynamic noise estimation method makes specific but looser assumptions than static noise estimation, making it suitable for a broader range of problems (Fig. 1B). Specifically, a policy with dynamic noise estimation models the presence of random noise as the result of switching between two latent states – the Random state and the Engaged state – that correspond, respectively, to a uniformly random, noisy policy and some other decision policy assuming full task engagement (e.g., an attentive, softmax policy). We assume that a hidden Markov process governs transitions between the two latent states with two transition probability parameters, T_RE and T_ER, from the Random to the Engaged state and vice versa. Note that static noise estimation can be formulated under the same binary latent state assumption, with the additional constraint that the transition probabilities must sum to one (T_RE + T_ER = 1), making it a special case of dynamic noise estimation (see Materials and methods for proof; a brief sketch follows below). The hidden Markov model of dynamic noise estimation captures the observation that noise levels in decision-making tend to be temporally autocorrelated, which may reflect an evolved expectation of temporally autocorrelated environments (Group et al., 2014).
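A brief sketch of why the sum-to-one constraint recovers static noise (the full proof is in Materials and methods; p^E_t is our shorthand, not necessarily the paper's notation, for the probability of being Engaged on trial t):

```latex
% The Markov transition step gives
p^E_t = p^E_{t-1}\,(1 - T_{ER}) + (1 - p^E_{t-1})\,T_{RE}.
% Under the static constraint T_{RE} + T_{ER} = 1, we have 1 - T_{ER} = T_{RE}, so
p^E_t = p^E_{t-1}\,T_{RE} + (1 - p^E_{t-1})\,T_{RE} = T_{RE},
% independent of the previous state: each trial is Random with fixed
% probability \epsilon = T_{ER}, i.e., a static mixture policy.
```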
We show that noise levels can be inferred dynamically, trial by trial, in multi-trial decision-making tasks using a simple, step-by-step algorithm (Algorithm 2). On each trial, the model estimates the probability of the agent being in each latent state using observation, choice, and (if applicable) reward data. It computes the choice probability as a weighted average of the choice probabilities under the Random policy and the Engaged policy, which is then used to compute the likelihood. Therefore, dynamic noise estimation can be incorporated into any decision-making model with an analytical likelihood. Model parameters can be estimated using procedures that optimize a likelihood metric, including maximum likelihood estimation (Fisher, 1922) and hierarchical Bayesian methods (Piray, Dezfouli, Heskes, Frank, & Daw, 2019).
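To make the algorithm concrete, below is a minimal sketch of this trial-by-trial inference (a standard two-state forward algorithm consistent with the description above, not a verbatim transcription of Algorithm 2); the function and parameter names, and the prior P(Engaged) = 0.5 on the first trial, are our assumptions:

```python
import numpy as np

def dynamic_noise_loglik(engaged_probs, choices, n_actions, t_re, t_er,
                         p_engaged_init=0.5):
    """Trial-by-trial dynamic noise estimation (two-state forward algorithm).

    engaged_probs : (n_trials, n_actions) choice probabilities under the
        Engaged policy (e.g., a softmax over learned values), supplied by
        the base decision model.
    choices       : (n_trials,) indices of the actions actually chosen.
    t_re, t_er    : transition probabilities T_RE (Random -> Engaged) and
        T_ER (Engaged -> Random).
    """
    p_random = 1.0 / n_actions       # uniform policy in the Random state
    p_engaged = p_engaged_init       # prior P(Engaged) on the first trial
    loglik = 0.0
    for t, a in enumerate(choices):
        lik_e = engaged_probs[t, a]  # likelihood of the choice if Engaged
        lik_r = p_random             # likelihood of the choice if Random
        # Marginal choice probability: state-weighted mixture of policies.
        p_choice = p_engaged * lik_e + (1.0 - p_engaged) * lik_r
        loglik += np.log(p_choice)
        # Posterior over the latent state after observing the choice.
        post_e = p_engaged * lik_e / p_choice
        # Propagate through the Markov transitions to get the next prior.
        p_engaged = post_e * (1.0 - t_er) + (1.0 - post_e) * t_re
    return loglik
```

The returned log-likelihood can be passed directly to a maximum likelihood or hierarchical Bayesian fitting routine, with t_re and t_er estimated alongside the parameters of the Engaged policy.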