On the Reality of the Base-Rate Fallacy: A Logical Reconstruction of the Debate

3.1 Two Objections: Premise Mismatch

Full rationality on the part of the experimental subjects could be rescued by arguing that there is a mismatch between the premises identified by the experimenters and those endorsed by the participants, i.e. by challenging the second auxiliary hypothesis of our target argument. In this section, we will take into consideration two objections to the diagnosis of irrationality, both of which can be construed as counterarguments that run along these lines.

According to Gigerenzer et al. (1988) and Gigerenzer (1991), participants would not be committed to the (implicit) assumption that the cab involved in the accident was randomly sampled from the population of cabs in the city [Footnote 2]. Therefore, their a priori confidence that the cab in question was blue would not be constrained by the proportion of blue cabs specified in the taxi cab problem. This follows from the principle of direct inference (PDI), which is a supplement to the central tenets of Bayesianism (i.e., probabilism and conditionalization). Indeed, according to this principle, base rates should constrain one’s prior probability that an object x has a property P if and only if (a) one has the information that x has been selected at random from a certain reference class of objects Xs and that some proportion of Xs share the property P, and (b) one lacks any other relevant information. If participants were not committed to the randomness assumption, then condition (a), which is crucial for PDI to be applicable, would not be fulfilled and consequently it would be rational for them not to apply Bayes’ theorem to the base rates specified in the problem.

Moreover, participants would not have enough information to infer that, for any distribution d of colours in the relevant subset of the population of cabs from which, they would suppose, the cab has not been randomly selected, d is more likely than other distributions. Therefore, it would be rational for them to assign a value of 0.50 to the prior probability that the cab was blue (and hence also to the prior probability that it was green), thereby applying one version of the so-called principle of insufficient reason (PIR). And if the inputs of Bayes’ theorem are 0.50 and 0.50 as far as prior probabilities are concerned, and 0.80 and 0.20 as far as likelihoods are concerned, then its output (that is, the posterior probability that the cab was blue given that the witness identified it as blue) is exactly equal to 0.80.

In short, if the natural language premises P accepted by participants are mapped onto the formal counterparts P** (alternative to P*), where

$$ P^{**} = \left\{ \begin{array}{l} p_{1}^{**}: p(h) = p(\sim h) = 0.50\\ \qquad \quad \text{Base-rate information does not equal prior probabilities.}\\ p_{2}^{**} = p_{2}^{*} \end{array}\right. $$

then

$$ p(h|e) = \frac{0.80 \times 0.50}{0.80 \times 0.50 + 0.20 \times 0.50} = 0.80 $$
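The arithmetic behind this result can be checked with a minimal sketch. The function name `posterior` is ours and purely illustrative. The second call treats the 15% base rate as the prior, which we assume here to be the experimenters' intended reading (P* itself is specified outside this section).

```python
def posterior(prior_h, lik_h, lik_not_h):
    """Posterior p(h|e) by Bayes' theorem for a binary hypothesis h."""
    joint_h = lik_h * prior_h                  # p(e|h) * p(h)
    joint_not_h = lik_not_h * (1.0 - prior_h)  # p(e|~h) * p(~h)
    return joint_h / (joint_h + joint_not_h)


# P**: equal priors (PIR), likelihoods 0.80 and 0.20 from the witness report.
equal_prior_answer = posterior(0.50, 0.80, 0.20)

# Assumed contrast case: the 15% base rate of Blue cabs is used as the prior.
base_rate_answer = posterior(0.15, 0.80, 0.20)

print(round(equal_prior_answer, 2))  # 0.8
print(round(base_rate_answer, 2))    # 0.41
```

With equal priors the posterior simply reproduces the witness's hit rate, which is why a response of 0.80 is compatible with a correct application of Bayes' theorem under P**.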

This objection raised against the diagnosis of irrationality can be put to empirical test as follows. It implies that if random sampling is made salient (i.e., verbally asserted, or better, visually observed and learned by randomly drawing from an urn of items representing cabs), then participants will apply Bayes’ theorem to the base rates specified in the taxi cab problem.

However, underweighting of base rates was demonstrated in several studies in which participants actually drew random samples from a target population (see Kahneman and Tversky 1996). Furthermore, in studies of this kind participants’ attention is manipulated with the intended effect of making them focus on the randomness assumption. But it is not entirely clear whether this is achieved or, as a result of such manipulation, they focus instead on base rates. Hence, it is not clear whether the objection at issue here is tested or not.

In addition, from the proposal put forward by Gigerenzer, Hell and Blank, it follows that participants will not apply Bayes’ theorem to the base rates specified in the cab problem if they are presented with a variant of the problem in which random sampling is not made salient and individuating information is absent. If so, then participants will exhibit base-rate neglect if they are presented with the following version of the problem, which is adapted from Lyon and Slovic (1976) and Tversky and Kahneman (1982):

A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data:

(a) 85% of the cabs in the city are Green and 15% are Blue.

What is the probability that the cab involved in the accident was Blue rather than Green?

However, the median and modal answer given by participants to this question is 0.15. Apparently, in this case base rates are taken seriously into account. One could stick to Gigerenzer, Hell and Blank’s proposal, and hence argue that participants would not have enough information to infer that the actual distribution of colours in the relevant subset is different from the distribution of colours in the population of cabs from which − they would suppose − the cab has not been randomly selected. As a consequence, it would be rational for them to assign a value of 0.15 to the probability that the cab was blue, thereby applying another version of PIR.

Nevertheless, it seems to be an ad hoc move in defence of full rationality to postulate a tendency on the part of the experimental subjects to employ different versions of the same principle in different contexts, without determining in advance when and why one version of PIR is preferred over another or, in other words, when and why base-rate information is equal to prior probabilities.

A second objection that hinges on a premise mismatch between experimenters and participants was raised by Macchi (1995), Macchi and Bagassi (2007), and Wolfe (1995). According to all of them, base-rate neglect would result from the so-called inverse fallacy [Footnote 3], i.e. confusion between the conditional probability p(e|h), which is the intended meaning of individuating information, and the inverse probability p(h|e), which is the requested posterior probability.

However, they disagree on the source of this confusion. In Macchi and Bagassi’s view, it would depend on the ambiguity of individuating information: “[...] the witness made correct identifications in 80% of the cases and erred in 20% of the cases”. Indeed, the word “cases” could refer to the cabs shown, the colour of which is correctly and incorrectly identified by the witness 80% and 20% of the time respectively, or to the testimonies given by the witness, which correctly and incorrectly report the colour of cabs 80% and 20% of the time respectively. Moreover, complementarity between the percentages specified in the text (80% and 20%) makes it easier to confuse p(e|h), which may not be complementary to p(e|\(\sim \)h), with p(h|e), which is complementary to p(\(\sim \)h|e). In Wolfe’s view, instead, it would depend on people’s intrinsic inability to understand the difference between the two conditional probabilities p(e|h) and p(h|e).

If participants were victims of this semantic confusion and interpreted individuating information as the requested posterior probability, either due to their cognitive limitations or to ambiguity in the text, then it would be rational for them to neglect base rates.

In short, if natural language premises P accepted by participants are mapped onto the formal counterparts P*** (alternative to P*), where

$$ P^{***} = \left\{ \begin{array}{l} p_{1}^{***} = p_{1}^{*}\\ p_{2}^{***}: p(h|e) = 0.80 \ne 0.20 = p(\sim h|e)\\ \qquad \qquad \text{Individuating information does not equal likelihoods.} \end{array}\right. $$

then

$$ p(h|e) = 0.80 $$

This objection to the diagnosis of irrationality can be put to empirical test as follows. Macchi and Bagassi’s version of this counterargument implies that if the ambiguous word “cases” is removed and the percentages given in the text are replaced by others that are not complementary, then participants will not exhibit base-rate neglect. Therefore, if individuating information is stated, for instance, like this: “[...] the witness recognized as Blue 80% of the Blue cabs and mistook 40% of Green cabs for Blue ones” (Macchi 1995), participants will not confuse p(e|h) with p(h|e) and will accordingly take into account base rates. However, current evidence is mixed: this prediction is confirmed by experimental results presented in Macchi (1995), but it is disconfirmed by results presented in Villejoubert and Mandel (2002). The latter results show that “even though the formulation of diagnostic [i.e. individuating] information was in line with Macchi’s (1995) recommendations for reducing the [confusion between p(e|h) and p(h|e)] [...], 51% of participants made almost all their judgements in accordance with the inverse fallacy [and consequently neglected base rates]” (Villejoubert and Mandel 2002: 177).

Similarly, Wolfe’s version of this counterargument implies that participants will not exhibit base-rate neglect if they are tutored on the meaning of individuating information, and hence are trained to distinguish p(e|h) from p(h|e). This prediction is partially confirmed by experimental results presented in Wolfe (1995), which show that “the mean absolute difference between [participants’] [...] answers and those produced by Bayes’ theorem [was significantly reduced immediately after receiving] [...] written and pictorial tutorials on the nature of the hit-rate [i.e., individuating information], [but participants] [...] did not demonstrate transfer of training [effects that remained stable over time]” (Wolfe 1995: 99-100).

Furthermore, from the suggestion made by Macchi and Bagassi, it follows that, provided that individuating information is disambiguated as shown above, if participants presented with the taxi cab problem are requested to estimate not only p(h|e) but also p(\(\sim \)h|e), they will assign complementary values to these probabilities, thereby complying with the so-called additivity principle (AP) of CPT.

A similar consequence follows from Wolfe’s suggestion: on condition that participants are suitably trained to distinguish p(e|h) from p(h|e), if they are asked to estimate p(h|e) and p(\(\sim \)h|e), they will assign complementary values to these probabilities and comply with AP, even when p(e|h) \(+\) p(e|\(\sim \)h) \(\ne \) 1. Nevertheless, neither of the above-mentioned predictions has been tested yet.
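To make the content of these two predictions concrete, the following minimal sketch contrasts Bayesian posteriors with the responses of a hypothetical participant who commits the inverse fallacy. It uses the non-complementary likelihoods from Macchi's (1995) wording (0.80 and 0.40) and assumes that the 15% base rate of Blue cabs is retained in that variant. The function name `posteriors` and the "inverse-fallacy responder" are illustrative constructions of ours, not data from any study.

```python
def posteriors(prior_h, lik_h, lik_not_h):
    """Return (p(h|e), p(~h|e)) for a binary hypothesis via Bayes' theorem."""
    joint_h = lik_h * prior_h                  # p(e|h) * p(h)
    joint_not_h = lik_not_h * (1.0 - prior_h)  # p(e|~h) * p(~h)
    evidence = joint_h + joint_not_h           # p(e)
    return joint_h / evidence, joint_not_h / evidence


prior_blue = 0.15       # assumed: base rate of Blue cabs kept from the original problem
p_e_given_blue = 0.80   # witness says "Blue" when the cab is Blue
p_e_given_green = 0.40  # witness says "Blue" when the cab is Green

p_blue, p_green = posteriors(prior_blue, p_e_given_blue, p_e_given_green)
bayesian_sum = p_blue + p_green

# Hypothetical responder who reports p(e|h) for p(h|e) and p(e|~h) for p(~h|e).
inverse_fallacy_sum = p_e_given_blue + p_e_given_green

print(round(p_blue, 2), round(p_green, 2))  # 0.26 0.74
print(round(bayesian_sum, 2))               # 1.0
print(round(inverse_fallacy_sum, 2))        # 1.2
```

On this sketch, compliance with AP separates the two response patterns only because the likelihoods do not sum to 1; with the original values 0.80 and 0.20 both patterns would yield complementary estimates.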

To sum up, it appears that the two objections to the diagnosis of irrationality formulated by Tversky and Kahneman (and in particular to their second auxiliary assumption) which have been discussed in this section are not well supported by the empirical evidence collected so far.

3.2 Four More Objections: Response Misunderstanding

Full rationality on the part of the experimental subjects could also be rescued by arguing that participants’ responses are misunderstood by the experimenters, i.e. by challenging the third auxiliary hypothesis of our target argument. In this section, we will take into account four objections to the diagnosis of irrationality, all of which can be construed as counterarguments that run along these lines.

According to Levi (1981, 1983, 1996) and Cohen (1981) [Footnote 4], participants would estimate the probability that the cab in question was blue conditional on the fact that the witness reported it as blue and on the fact that it was involved in an accident. If accident information was taken by participants to be relevant, then condition (b) of the principle of direct inference (PDI) presented in Section 3.1 would not be met. As a consequence, the base rates specified in the taxi cab problem should not constrain one’s prior probability that the cab involved in the accident was blue, and would be rationally neglected by participants.

Moreover, participants would not have enough information to infer that, for any distribution d of colours in the reference class of cabs in the city involved in accidents, d is more likely than other distributions. Therefore, it would be rational for them to assign a value of 0.50 to the prior probability that the cab in question was blue (and hence also to the prior probability that it was green), thereby applying one version of the principle of insufficient reason (PIR). The output of Bayes’ theorem (p(h|e \(\wedge \) b), namely the posterior probability that the cab was blue given that the witness identified it as blue and given that it was involved in an accident) is precisely equal to 0.80, if its inputs are 0.50 and 0.50 as far as the relevant prior probabilities p(h|b) and p(\(\sim \)h|b) are concerned, and 0.80 and 0.20 as far as the relevant likelihoods p(e|h \(\wedge \) b) and p(e|\(\sim \)h \(\wedge \) b) are concerned.

In short, the natural language response r given by participants is mapped onto the formal counterpart r** (alternative to r*), where

$$ r^{**}: p(h|e \wedge b) = \frac{p(e|h \wedge b)\, p(h|b)}{p(e|h \wedge b)\, p(h|b) + p(e|\sim h \wedge b)\, p(\sim h|b)} = \frac{0.80 \times 0.50}{0.80 \times 0.50 + 0.20 \times 0.50} = 0.80 $$

p(h|b) \(=\) 0.50 \(\ne \) 0.15 \(=\) p(h)

p(\(\sim \)h|b) \(=\) 0.50 \(\ne \) 0.85 \(=\) p(\(\sim \)h)

Hence, base-rate information does not equal relevant prior probabilities.

p(e|h \(\wedge \) b) \(=\) 0.80 \(=\) p(e|h)

p(e|\(\sim \)h \(\wedge \) b) \(=\) 0.20 \(=\) p(e|\(\sim \)h)

Hence, individuating information equals relevant likelihoods.

This objection raised against the diagnosis of irrationality can be put to empirical test as follows. It implies that participants will apply Bayes’ theorem to the base rates specified in the taxi cab problem if they are presented with a variant of the problem in which base-rate information pertaining to the distribution of colours in the set of cabs in the city involved in accidents is provided. If so, then participants will take into account base rates if they are presented with the following version of the problem, which is adapted from Bar-Hillel (1980, 1983) and Tversky and Kahneman (1982) [Footnote 5]:

A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data:

(a’) although the two companies are roughly equal in size, 85% of cab accidents in the city involve Green cabs and 15% involve Blue cabs;

(b) a witness identified the cab as Blue. The court tested the reliability of the witness under the same circumstances that existed on the night of the accident and concluded that the witness made correct identifications in 80% of the cases and erred in 20% of the cases.

What is the probability that the cab involved in the accident was Blue rather than Green?

The median (but not the modal) answer given by participants presented with this variant of the problem in which more specific base rates are provided is 0.60. Hence, this prediction is only partially confirmed by data.

In addition, from the proposal made by Levi and Cohen, it follows that participants will not apply Bayes’ theorem to the base rates specified in the cab problem if information about base rates in the more specific reference class of cabs involved in accidents is not provided and individuating information is absent. If so, then participants will neglect base rates if they are presented with the version of the problem adapted from Lyon and Slovic (1976) and Tversky and Kahneman (1982):

A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data:

(a) 85% of the cabs in the city are Green and 15% are Blue.

What is the probability that the cab involved in the accident was Blue rather than Green?

However, data suggest that the opposite is the case. Indeed, as shown in Section 3.1, the median and modal answer given by participants to this question is typically 0.15. One could stick to Levi and Cohen’s proposal, and argue that participants would not have enough information to infer that the actual distribution of colours in the reference class of cabs involved in accidents is different from the distribution of colours in the class of cabs. As a consequence, it would be rational for them to assign a value of 0.15 to the probability that the cab in question was blue, thereby applying another version of PIR.

Again, it seems to be an ad hoc move in defence of full rationality to postulate that experimental subjects tend to employ different versions of the same principle in different contexts, without determining in advance when and why one version of PIR is preferred over another or, in other words, when and why base-rate information is equal to relevant prior probabilities.

Importantly, it should be pointed out that if base-rate neglect originates from the fact that base rates specified in the text do not equal prior probabilities (as argued on different grounds by Gigerenzer, Hell and Blank, and by Levi and Cohen), then it will disappear in experimental conditions in which base rates are not provided in the text and prior probabilities are directly elicited from participants. However, this prediction is disconfirmed by results presented in Evans et al. (2002), Rottman (2017), and Pighin and Tentori (2021).

Similar considerations hold for individuating information. As shown above, Levi and Cohen did not call into question that individuating information is equal to relevant likelihoods. Nevertheless, this assumption is not indisputable. One could object that participants might think that the witness was less reliable in identifying colours of cabs involved in accidents (as she did on the night of the accident) than in identifying colours of cabs (as she did when tested by the court), for instance because fire accidents make colours less distinguishable. If so, then the inputs of Bayes’ theorem are not necessarily 0.80 and 0.20 as far as the relevant likelihoods are concerned. Therefore, even though the inputs of this theorem are set to be equal to 0.50 and 0.50 as far as the relevant prior probabilities are concerned, its output may be different from 0.80. It seems that one should also determine in advance when and why individuating information equals relevant likelihoods in order not to make any ad hoc move in defence of full rationality on the part of the experimental subjects.
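The dependence of the output on the assumed likelihoods can be illustrated with a minimal sketch. The reduced reliability values 0.70 and 0.60 are hypothetical and chosen only for illustration, and we assume (as in the original problem) that hit rate and error rate remain complementary. The function name `posterior` is likewise ours.

```python
def posterior(prior_h, lik_h, lik_not_h):
    """Posterior p(h|e ^ b) by Bayes' theorem, with all inputs already conditional on b."""
    joint_h = lik_h * prior_h
    joint_not_h = lik_not_h * (1.0 - prior_h)
    return joint_h / (joint_h + joint_not_h)


# Equal priors p(h|b) = p(~h|b) = 0.50, as in the Levi and Cohen reconstruction.
prior_blue_given_accident = 0.50

# 0.80 is the court-tested reliability; 0.70 and 0.60 are hypothetical lower
# reliabilities that a participant might attribute to the witness at the accident scene.
results = {}
for reliability in (0.80, 0.70, 0.60):
    error_rate = 1.0 - reliability
    results[reliability] = posterior(prior_blue_given_accident, reliability, error_rate)

for reliability, value in results.items():
    print(reliability, round(value, 2))
```

Under equal priors and complementary likelihoods the posterior tracks the assumed reliability, so a response of 0.80 is recovered only if participants take the court-tested figure at face value.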

A second objection that hinges on a misunderstanding of participants’ responses was raised (although not explicitly) by Eddy (1982) [Footnote 6]. According to him, base-rate neglect would result from the inverse fallacy, which he attributed to people’s intrinsic inability to understand the difference between the two conditional probabilities p(h|e) and p(e|h) [Footnote 7], just as Wolfe did. But unlike Wolfe, he suggested that participants presented with the taxi cab problem would not grasp the meaning of the question that they are supposed to answer [Footnote 8]. Hence, they would report the probability that the witness identified the cab in question as blue given that it was blue, which is conveyed by individuating information, instead of estimating the inverse probability that the cab was blue given that the witness identified it as blue, which is the requested posterior probability.

If participants were victims of this confusion due to their cognitive limitations and interpreted the requested posterior probability as individuating information, it would be rational for them to neglect base rates.

In short, the natural language response r given by participants is mapped onto the formal counterpart r*** (alternative to r*), where

r***: p(e|h) \(=\) 0.80

This objection to the diagnosis of irrationality can be put to empirical test as follows. It straightforwardly implies that participants will not exhibit base-rate neglect if they are tutored on the meaning of the requested posterior probability, and hence are trained to distinguish p(h|e) from p(e|h).

Furthermore, from the proposal put forward by Eddy, it follows that, provided that participants receive suitable training so as not to confuse these conditional probabilities, if they are requested to estimate not only p(h|e) but also p(\(\sim \)h|e), they will assign to p(\(\sim \)h|e) a value which is complementary to that assigned to p(h|e), thereby complying with AP, even when p(e|h) \(+\) p(e|\(\sim \)h) \(\ne \) 1. However, neither of the aforementioned predictions has been tested yet.

A third objection that relies on a misunderstanding of participants’ responses was raised by Cohen (1979, 1981). In his view, participants would not grasp the intended meaning of the word “probability”, which is left implicit in the question that they are requested to answer. They would interpret “probability” as Baconian probability, and not as Pascalian (i.e., classical) probability. These two concepts of probability, which would both be legitimate, are constrained by different principles. In particular, there is no analogue of Bayes’ theorem in the theory of Baconian probability. Indeed, Bayes’ rule is clearly not validated by Baconian probability, as it might be the case that “\(p_I\)(B|A) > 0 even when \(p_I\)(B) \(=\) 0” (Cohen 1979: 392).

Cohen (1979) provided a brief (and rough) characterization of this non-standard notion of probability, which might partially overlap with that of Bayesian confirmation (introduced by Carnap 1950/1962). However, a thorough comparison of Baconian probability, classical probability and confirmation would be well beyond the scope of the present paper. For our purposes, it suffices to notice that, according to Cohen’s objection to the diagnosis of irrationality, the most common response given by participants presented with the taxi cab problem would have been misclassified as being Pascalian and irrational, rather than as being Baconian and rational.

In short, the natural language response r given by participants is mapped onto the formal counterpart r**** (alternative to r*), where

r****: \(p_I\)(h|e) \(=\) 0.80

Here, \(p_I\) stands for a Baconian (or inductive) probability function.

According to Cohen, “we should interpret [participants’] [...] answers, where possible, in whichever of the two ways does not involve them in committing any [...] fallacy” (Cohen 1979: 397), without determining in advance when and why one concept of probability is preferred over another. Apparently, this is an ad hoc move in defence of full rationality on the part of the experimental subjects (see Kahneman 1981; Tversky 1981; Kahneman and Tversky 1983).

Sablé-Meyer et al. (2021) and Sablé-Meyer and Mascarenhas (2022) explicitly referred to the notion of Bayesian confirmation when arguing that it is the structure of the problem that causes participants to neglect base rates [Footnote 9]. Indeed, according to them, participants would be prompted by pragmatic pressures of the experimental setup to engage in a question-answer reasoning process. It follows that they would interpret the two options “Blue” and “Green” as competing hypotheses about the cab involved in the accident, and the witness report as evidence adduced by the experimenters to discriminate between the two.

To this end, participants would estimate the degree of confirmation or evidential support brought by the available evidence to the hypotheses. If so, given that c(h, e), namely the degree of support brought by the witness report to the hypothesis that the cab in question was blue, does not crucially depend on the prior probability that the cab was blue (unlike the posterior probability p(h|e)), it would be rational for them to exhibit base-rate neglect.

In short, the natural language response r given by participants is mapped onto the formal counterpart r***** (alternative to r*), where

r*****: c(h, e) \(=\) 0.80

Here, c stands for a suitable confirmation measure, for instance the likelihood ratio measure of confirmation whose linear transformation in the bounded interval \([0,+1]\) is defined as follows [Footnote 10]:

$$ L(h,e) = \frac{p(e|h)}{p(e|h) + p(e|\sim h)} = \frac{0.80}{0.80 + 0.20} = 0.80, \qquad L(h,e) \in [0,+1] $$
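The contrast between confirmation and posterior probability can be illustrated with a minimal sketch. We assume that the bounded measure takes the form p(e|h) / (p(e|h) + p(e|~h)), which reproduces the value 0.80 reported here; the function names and the list of priors are illustrative choices of ours.

```python
def bounded_likelihood_ratio(lik_h, lik_not_h):
    """Assumed [0, 1] rescaling of the likelihood ratio: p(e|h) / (p(e|h) + p(e|~h))."""
    return lik_h / (lik_h + lik_not_h)


def posterior(prior_h, lik_h, lik_not_h):
    """Posterior p(h|e) by Bayes' theorem for a binary hypothesis h."""
    joint_h = lik_h * prior_h
    joint_not_h = lik_not_h * (1.0 - prior_h)
    return joint_h / (joint_h + joint_not_h)


lik_blue, lik_green = 0.80, 0.20  # witness reliability from the taxi cab problem

support_by_prior = {}
posterior_by_prior = {}
for prior_blue in (0.15, 0.50, 0.85):  # illustrative priors for "the cab was Blue"
    support_by_prior[prior_blue] = bounded_likelihood_ratio(lik_blue, lik_green)
    posterior_by_prior[prior_blue] = posterior(prior_blue, lik_blue, lik_green)
    print(prior_blue,
          round(support_by_prior[prior_blue], 2),
          round(posterior_by_prior[prior_blue], 2))
```

The measure of support stays at 0.80 whatever prior is supplied, whereas the posterior moves with the prior (about 0.41, 0.80 and 0.96 respectively). A participant reporting evidential support would therefore give the same answer regardless of the base rates.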

This objection to the diagnosis of irrationality can be put to empirical test as follows. It implies that if participants are discouraged from thinking that individuating information is meant to be used to choose between the options “Blue” and “Green”, they will take into account base rates. This prediction is confirmed by the results of an experiment conducted by Schwarz et al. (1991), in which participants read information on a computer screen. Computers in 1991 were far less likely to be considered as endowed with any sort of intentional agency, as they did not conform to human standards of communication. Therefore, in this experimental condition individuating information was far less likely to be regarded as a cue deliberately given to participants in order for them to discriminate between the two options. As expected, base-rate neglect was significantly reduced.

In addition, from the proposal put forward by Sablé-Meyer, Guerrini and Mascarenhas, it follows that participants will take into account base rates, if the contrastive nature of the question: “What is the probability that the cab involved in the accident was Blue rather than Green?” is suitably made less salient. Nevertheless, current evidence is mixed: this prediction is confirmed by data reported in Sablé-Meyer et al. (2021), but it is disconfirmed by data reported in Mangiarulo et al. (2021). The latter data show that “participants’ judgements [are] [...] deeply affected by impact [i.e. confirmation] [and hence base rates are neglected]” (Mangiarulo et al. 2021), even though the contrastive nature of questions like: “What is the probability that this circle is Black rather than White?” is suitably de-emphasized.

To sum up, it appears that the four objections raised against Tversky and Kahneman’s diagnosis of irrationality (and in particular against their third auxiliary assumption) which have been discussed in this section are not well supported by the available empirical evidence.

At this point, it should be noted that, to the best of our knowledge, the objections considered so far have not yet been put to empirical test jointly, namely, as an aggregate hypothesis that different objections might pertain to different participants. In principle, this might explain away the majority of participants’ responses, and thus rescue full rationality on their part. Although Stengård et al. (2022) have recently modelled participants’ responses at an individual level, the aim of their experimental investigation was not to challenge the idea of base-rate neglect as a real fallacy, and hence the diagnosis of irrationality. The kind of techniques used in Stengård et al.’s (2022) study could still be fruitfully applied to test the aggregate hypothesis mentioned above and deliver further progress in the debate on the base-rate fallacy.

Table 1 This table summarizes the six objections to the diagnosis of irrationality considered in the present paper
