Bayesian stopping

Stopping rules in statistics are rules for determining when a scientist can or should stop collecting data and make an inference. These rules have mostly been discussed in the context of classical (frequentist) statistics, in particular in relation to null hypothesis significance testing (NHST). Recently, however, they have started to attract attention within Bayesian statistics as well. In Bayesian statistics, the purpose of data collection is often to help concentrate probability mass that, conditional on our current knowledge and in relation to our current purposes, is spread too thinly across the relevant parameter space. A common suggestion is that data collection can cease when sufficient probability mass has condensed within a small enough region of that space.
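To make the "common suggestion" concrete, here is a minimal sketch of a stopping rule of this kind. It is an illustration under stated assumptions, not the rule of any particular author: the Bernoulli model, the uniform prior, the grid approximation, the 95% mass level, the 0.2 width threshold, and all function names are hypothetical choices made for this example.

```python
import math
import random


def grid_posterior(successes, failures, grid_size=1001):
    """Posterior over a Bernoulli rate on an even grid (assumed uniform prior)."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    log_w = []
    for t in thetas:
        # A rate of 0 (or 1) is impossible once a success (or failure) is seen.
        if (t == 0.0 and successes > 0) or (t == 1.0 and failures > 0):
            log_w.append(-math.inf)
            continue
        lw = 0.0
        if successes > 0:
            lw += successes * math.log(t)
        if failures > 0:
            lw += failures * math.log(1.0 - t)
        log_w.append(lw)
    top = max(log_w)  # subtract the maximum to avoid numerical underflow
    weights = [math.exp(lw - top) for lw in log_w]
    total = sum(weights)
    return thetas, [w / total for w in weights]


def narrowest_interval(thetas, probs, mass=0.95):
    """Narrowest run of adjacent grid points holding at least `mass` probability."""
    best = (thetas[0], thetas[-1])
    lo = 0
    running = 0.0
    for hi in range(len(thetas)):
        running += probs[hi]
        # Shrink from the left while the window still holds enough mass.
        while running - probs[lo] >= mass:
            running -= probs[lo]
            lo += 1
        if running >= mass and thetas[hi] - thetas[lo] < best[1] - best[0]:
            best = (thetas[lo], thetas[hi])
    return best


def run_until_concentrated(true_rate, max_width=0.2, mass=0.95,
                           max_n=5000, seed=0):
    """Collect simulated Bernoulli data until the posterior is concentrated.

    Stops when the narrowest interval holding `mass` probability is no wider
    than `max_width` (both thresholds are illustrative assumptions).
    """
    rng = random.Random(seed)
    successes = failures = 0
    interval = (0.0, 1.0)
    for n in range(1, max_n + 1):
        if rng.random() < true_rate:
            successes += 1
        else:
            failures += 1
        thetas, probs = grid_posterior(successes, failures)
        interval = narrowest_interval(thetas, probs, mass)
        if interval[1] - interval[0] <= max_width:
            return n, interval
    return max_n, interval


if __name__ == "__main__":
    n, (lo, hi) = run_until_concentrated(true_rate=0.5)
    print(f"stopped after {n} observations; interval = [{lo:.3f}, {hi:.3f}]")
```

Note that this rule looks only at how much probability falls within an interval of a given width; it ignores the shape of the distribution inside that interval.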

The question of how concentrated probability should be before we can stop has, in the discussion so far, been understood in a purely quantitative way, focusing exclusively on the amount of probability to be found in a given region of parameter space. Taking our cue from the literature on acceptance rules in formal epistemology, we argue that qualitative aspects of a probability distribution may also carry important information as to whether stopping is warranted. As will be explained, a purely quantitative approach to stopping parallels a proposal in formal epistemology to the effect that a proposition is acceptable precisely if it is sufficiently probable. More recently, formal epistemologists have advanced acceptance rules that take a more structural look at probability assignments, by focusing on relations among probabilities rather than on the probability of a single proposition. We argue that, in Bayesian statistics, there can be benefits to similarly taking into account not simply how much probability is amassed in a region of interest but also how that probability is concentrated, in the sense of what the structure or shape of the distribution in the given region is. For instance, a sharp peak in a posterior distribution can sometimes provide a clear indication of the value of a parameter, even if, according to the quantitative proposal, it would be premature to stop. This paper works out the structural alternative and compares it with a purely quantitative stopping rule advanced in John Kruschke’s seminal book on Bayesian data analysis (Kruschke, 2015). In addition, we discuss the Bayes factor approach to stopping, which has gained popularity in recent years but is not the central focus of this paper.

Kruschke compares his proposal with other stopping rules along two dimensions, viz., speed and accuracy. Speed is important because we do not want to waste resources on experiments that are unlikely to have any further impact on our view, or to delay implementing a policy or recommending a treatment when the experimental results clearly indicate that the policy or treatment will be beneficial. Accuracy is important because we do not want to stop too early and end up with a mistaken view of the matter under study, and consequently perhaps implement policies or recommend treatments that do more harm than good. We use the same two dimensions in comparing our new proposal with the two alternatives mentioned above, namely Kruschke’s rule and the Bayes factor approach.

Section 2 describes present thinking on stopping in the Bayesian community, the main focus being on Kruschke’s contributions to this debate. Section 3 shows how applying to statistics some recent insights from the philosophical literature on rational acceptability suggests a structural Bayesian stopping rule that is interestingly different from the purely quantitative approach to stopping advocated by Kruschke. Using computer simulations, Section 4 compares this structural rule with Kruschke’s as well as with the Bayes factor approach in terms of speed and accuracy, finding that the three proposals make different speed–accuracy trade-offs. Because different contexts of use may call for different trade-offs, the conclusion of this paper will not be that one proposal can be declared the clear winner but rather that which rule is called for may depend on the context of use.
