
This is to certify that the

thesis entitled

A REANALYSIS OF THE BASE RATE PROBLEM THROUGH

UNDERSTANDING SUBJECTS' JUDGMENTAL REASONINGS

presented by
Wing-Shing Chan

has been accepted towards fulfillment
of the requirements for

M.A. degree in Psychology

 


Major professor

Date 5-31-1990

MSU is an Affirmative Action/Equal Opportunity Institution

 

 

 

 


 

 

A REANALYSIS OF THE BASE RATE PROBLEM
THROUGH UNDERSTANDING SUBJECTS' JUDGMENTAL REASONINGS
By

Wing-Shing Chan

A THESIS

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

MASTER OF ARTS

Department of Psychology

1990

 

 

ABSTRACT

A REANALYSIS OF THE BASE RATE PROBLEM
THROUGH UNDERSTANDING SUBJECTS' JUDGMENTAL REASONINGS
BY
Wing-Shing Chan

This study was a response to Bruner's (1986) call for
describing the process of judgment itself, in lieu of
studying judgmental errors according to some mathematical or
logical norms. Since the multi-modal nature of the responses
to base rate problems rendered traditional analysis using
the median problematic, qualitative methods were applied to
analyze the verbal protocols of subjects' reasonings. The
intuitive and probabilistic modes of judgment were delineated
for classification purposes. It was discovered that the
distribution of judgments was more probabilistic under
problem contexts of low diagnosticity, extreme base rate and
physical mechanistic environment. The same conclusion was
not found for causality. Subjects' sex, culture, age, major
area and knowledge of Bayes' rule showed no effect.
Moreover, the probabilistic judges seemed to be less
susceptible to mode shifts than the intuitive judges. It was
suggested that cognitive complexity might be related to the
mode of judgment a subject used. Qualitative insights also
showed that subjects' apparent judgmental errors against
normative rules in fact derived from cognitive and
meta-cognitive skills which are vital to sound judgment in
the real world.


 

 

Copyright by
WING-SHING CHAN

1990

 

 

Dedicated to

Buddhas and Bodhisattvas

 

 

ACKNOWLEDGMENTS

I wish to thank Professor Ralph Levine for his
heartfelt support and mind-broadening intellectual
communication over the past few years. His expert advice
contributed significantly to the final scientific
presentation of the results and conclusions of this study.
Emeritus Professor Charles Wrigley, with whom I started my
thesis research, has offered me much guidance in my
academic, personal and financial affairs. His challenging
ideas helped me strengthen my own thoughts and arguments. His
sincere and useful support during my critical periods as a
naive foreign student in the U.S.A. deserves a great deal of
credit. I also wish to thank the department chair, Professor
Gordon Wood, who joined my thesis committee and offered
excellent advice. Without their support I could not have
pursued an intellectual search successfully. Their advice
also informed my critical thinking about this research.

I am indebted to the Chinese students who participated
in my experiment and gave me a chance to understand better
how people make judgments. Thanks also go to my present
advisor in the College of Education, Professor Stephen
Raudenbush, whose intellectual and financial support helped
me very much during the latter period of thesis writing. He
also joined my thesis examination committee. I wish to
acknowledge my thanks to Erik Kvan, who initiated my
critical thinking about truth in human science while I was
working as a tutor at the Chinese University of Hong Kong.

I also wish to thank Chai-Liang Huang and Li-Wen Liaw,
graduate students at MSU, for introducing me to Buddhism,
which helped me sail through many difficulties during my
study.

Last but not least, I wish to express my indebtedness
to my father, who died during my study in America, and to my
mother, brothers and sisters for enduring my absence. Of
course, the unceasing emotional support, patience and help
of every kind from my wife from Taiwan, Li-ming, are
unforgettable.


 

 

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGMENTS

INTRODUCTION
  The Base Rate Problem
  Normative Solution of the Base Rate Problem

THE DEVELOPMENT OF UNDERSTANDING ABOUT THE BASE RATE PROBLEM
  Representativeness
  Causality
  Diagnosticity
  Relevance
  Problem solving

CONCEPTUAL PROBLEMS INVOLVED IN ANALYSING SUBJECT'S RESPONSE
  Responding to Base Rate is Better than not Responding
  Proximity to the Bayes' Norm as being Bayesian
  Proposed Remedies
    Obtain Additional Measures
    Categorize the Sub-Distributions Qualitatively

METHODOLOGY
  Subjects
  Procedures
  Materials
  Specific Research Questions
  Analytic Method for the Verbal Protocols

RESULTS
  A Traditional Analysis of Numerical Data
  Qualitative Categorization of Judgment Using Verbal Protocols
    Reliability of coding
  Problem Contexts and Mode of Judgment
    Quantitative Analysis
    Causality and mode of judgment
    Diagnosticity and mode of judgment
    Extreme base rate and mode of judgment
    Problem context (social vs. physical) and mode of judgment
  Comparison of Consistency between Intuitive and
    Probabilistic Judgment
  Effects of Culture, Sex, Age, Major and Bayes' Knowledge
    on Judgment

DISCUSSIONS
  Qualitative Aspects of Intuitive Judgment
    Intuitive judges do not take the information for granted
    Intuitive judges look for more relevant information
    Intuitive subjects supply their own knowledge and assumptions
    Intuitive judges balance information by its relevance
  Implications for the Debate about Human Rationality
  Implications for the Debate about Clinical vs.
    Statistical Prediction
  Cognitive Complexity as a Determinant of Judgment:
    A Hypothesis
  Implications for Future Research
  Limitations

APPENDIX A: Questionnaire of the Base Rate Problems
APPENDIX B: Two Protocol Examples
APPENDIX C: Classification of the Two Protocols
APPENDIX D: Histograms of the Numerical Judgmental Responses

LIST OF REFERENCES

 

LIST OF TABLES

1. Quantitative Results based on Central Tendency
2. Indicators for Probabilistic Mode of Judgment
3. Indicators for Intuitive Mode of Judgment
4. Stability and Change under Various Problem Conditions
5. Effect of Causal Base Rate on Judgment
6. Effect of Diagnosticity on Judgment
7. Effect of Extreme Base Rate on Judgment
8. Effect of Physical Context on Judgment (I)
9. Effect of Physical Context on Judgment (II)
10. Relative Consistency of Judgmental Mode under Various
    Comparisons
11. Effects of Sex, Major and Knowledge about Bayes' rule
    on Judgment
12. Correlation between age and mode of judgment

 

 

 

LIST OF FIGURES

1. Engineer-lawyer problem
2. Distribution of responses to the cab problem
3. Histogram of the subjects' responses for problem F
D.1 Histogram of the subjects' responses for problem A
D.2 Histogram of the subjects' responses for problem B
D.3 Histogram of frequency of subject's response for problem C
D.4 Histogram of the subjects' responses for problem D
D.5 Histogram of the subjects' responses for problem E
D.6 Histogram of the subjects' responses for problem F

 

 

 

INTRODUCTION
The study of error or bias in human judgment and
thinking is not an invention of today's social scientists
(e.g. Evans, 1989; Kahneman & Tversky, 1982; Nisbett & Ross,
1980). In a discussion about the concept of thought, Bruner
(1986, pp. 106-107) wrote:

...It was no accident that the mathematician
George Boole entitled his famous work on algebra
The Laws of Thought. Thought, in this
dispensation, is a normative idea, a specification
of a criterion of right reason.

...it was certainly the hope of early logicians
and philosophers to find some way of sorting out
the chaff of unreason from the wheat of reason.
And this was to be accomplished by the provision
of finer and finer rules of right reason (that is,
laws of logic) rather than by closer and closer
description of the activity of thinking itself.

...It is curious how little psychological
curiosity there was about the sources of these
errors, and from the Sophists to Würzburg one can
find relatively little difference in the way they
were accounted for. They were "weaknesses" in our
logical processes, earlier couched in terms of
weaknesses for the undistributed middle, later as
"set effects" or "atmosphere effects". To put it
in a word, there was no psychology of thought,
only logic and a catalogue of logical errors.

...The same case holds for the history of
inference as for deduction, as with the "base rate
fallacy" I discussed in chapter 6. Departure from
Bayesian criteria is "fallacy", and departures as
before are attributed to weakness, some to
weakness induced by bias.

 

 

Bruner did not mean that the results of the studies on
human judgmental errors are wrong. He suggested instead
that by categorizing judgmental errors rather than describing
thinking itself, researchers have not given the psychology
of thought, or of judgment, a chance to develop.

This thesis represents a small step in response to
Bruner's advocacy for describing human judgment itself
instead of merely studying how judgmental errors occur. The
preconditions for fulfilling such a goal would at least
include, as necessitated by Bruner's argument, the following
two points:

1. Restraining our past tendency to study judgment
according to a normative criterion and to focus on
the causes of the errors.

2. Adopting a research methodology which could maximize
our chance of being able to describe the judgmental
process itself.

It is my belief that quantitative research methodology in
social science research provides a rigorous and sensitive
tool for detecting relations among constructs. However, it
is not very good at generating the most useful and
interesting questions or constructs. Qualitative
methodology, by contrast, is better at describing social
phenomena with detailed information, which often helps
generate insightful and useful questions and
constructs. The weaknesses of qualitative research are
mainly its poor generalizability and the difficulty of
subjecting its conclusions to falsification.

One possible research methodology that combines the
advantages of both quantitative and qualitative methods is
to let the latter do the job of generating ideas and
searching for meaning, and let the former rigorously build a
model of the resulting constructs. The present thesis adopts
such an approach.

The topical research area of this thesis is the
study of the base rate problem. Nowadays the base rate
problem is one of the most intensely studied topics of
inferential judgment, parallel in status to the study of
syllogistic logic in deductive reasoning. In social
psychological research, this area of study is often referred
to as the base rate fallacy (e.g. Borgida & Brekke, 1981).

The present research attempted to reanalyze the base
rate problem by collecting information about subjects'
reasonings. Forty Chinese students were tested individually
on 7 base rate problems of theoretical interest. Verbal
reasonings together with the numerical responses were used
to categorize the judgment involved. The qualitative
categories of judgment were used instead of the numerical
responses to analyze the effects of problem contexts such as
causal base rate, non-diagnostic information, extreme base
rate and physical mechanistic environment. The effects of
culture, age, sex, major area and knowledge about Bayes'
rule on the mode of judgment used were also studied. The
stability of the two modes of judgment against experimental
treatments was also investigated using appropriate
statistical tests. Qualitative aspects of intuitive
judgment were described and the implications for the debate
about human rationality were also discussed.

The Base Rate Problem

 

The base rate problem normally used in psychological
experiments is in fact a mathematical Bayes' problem with two
outcomes. The following is an example taken from Tversky
and Kahneman (1980), commonly referred to as the "cab"
problem:

A car was involved in a hit and run accident
at night. Two cab companies, the Green and the
Blue, operate in the city. You are given the
following data:

(a) 85% of the cabs in the city are Green and
15% are Blue.

(b) a witness identified the cab as Blue. The
Court tested the reliability of the witness under
the same circumstances that existed on the night
of the accident and concluded that the witness
correctly identified each one of the two colors
80% of the time and failed 20% of the time.

 

 

What is the probability that the cab involved
in the accident was Blue rather than Green?

Analytically, this problem is constructed according to
the mathematical Bayes' formula with two mutually exclusive
outcomes. In this case the outcome is either a blue cab or a
green cab. The subject is asked to determine the
probability of a given outcome.

Another common element of base rate problems is the
base rate information. It is the statistical probability
of an outcome given no further information about an event.
In this example the base rate information is 85% for a green
cab and 15% for a blue cab. Since the outcomes are mutually
exclusive and exhaustive, their probabilities must sum to
unity.

The third important element in these problems is the
"diagnostic information", so termed for convenience when
discussing research on the base rate problem. It gives
specific information about the occurrence of an event in
addition to the base rate information. In this example the
diagnostic information is the reliability of the witness.
Using Bayes' formula, a normative solution can be computed.

 

 

Normative Solution of the Base Rate Problem

 

Let us explain how a Bayesian optimum can be computed
for the base rate problem when both the base rate
information and the diagnostic information are expressed in
numbers.

Let P(C) be the probability of occurrence for the
outcome category C. P(C~) is the probability of the
mutually exclusive alternative to C, called C~. Following the
fundamental axiom of mathematical probability theory, P(C)
+ P(C~) = 1. In our case, the base rate information is given
by P(C) and P(C~). These probabilities are sometimes
referred to as the prior probabilities.

The probability that category C has occurred given the
diagnostic information D takes into consideration the
diagnostic information as well as the base rate
information. This probability is written in mathematics as
P(C/D), read as the probability of C given D.
P(C/D) and P(C~/D) are sometimes referred to as the posterior
probabilities. Bayes' formula for this probability is as
follows:

P(C/D) = P(D/C)P(C) / ( P(D/C)P(C) + P(D/C~)P(C~) )

For the previous cab problem, the probability that a
blue cab was involved in the accident, given that the
witness has identified the cab as blue with a reliability of
80%, can be computed from the above equation.

The required probability is P(blue cab/identified as
blue). The base rate information is P(blue cab)=0.15 and
P(green cab)=0.85. P(identified as blue/blue cab) is the
witness's identification ability and is therefore 0.80.
P(identified as blue/green cab) is the witness's error rate
in identification and is thus 0.20, assuming
that the error rate is the same when identifying a blue
cab or a green cab. Accordingly,

P(Blue cab/identified as blue)

= 0.80*0.15 / (0.80*0.15 + 0.20*0.85)
= 0.414

The required answer is thus 41.4%
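
The arithmetic above can be checked with a short script. The
following is only an illustrative sketch in Python and is not
part of the original study; the function name posterior and
its parameter names are assumptions introduced here.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """Two-outcome Bayes' formula: P(C/D).

    prior            -- P(C), the base rate of the target outcome
    hit_rate         -- P(D/C), e.g. witness says "blue" when the cab is blue
    false_alarm_rate -- P(D/C~), e.g. witness says "blue" when the cab is green
    (All three names are hypothetical labels chosen for this sketch.)
    """
    numerator = hit_rate * prior
    denominator = numerator + false_alarm_rate * (1.0 - prior)
    return numerator / denominator


# Cab problem: 15% of cabs are Blue; the witness is right 80% of the time.
cab_posterior = posterior(prior=0.15, hit_rate=0.80, false_alarm_rate=0.20)
print(round(cab_posterior, 3))  # 0.414
```

The same function can be reused for any two-outcome problem
in which the base rate and both conditional probabilities
are stated numerically.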

Generally, the base rate problem is used to
investigate the use of base rate in judgment or decision
making, as well as to compare people's judgment with the
optimal judgment prescribed by Bayes' theorem. Researchers
are also interested in the various factors which
make people more prone to making a Bayesian optimal
judgment and the factors which affect how subjects
combine the base rate information and the diagnostic
information while making a judgment.

 

 

THE DEVELOPMENT OF UNDERSTANDING ABOUT THE BASE

RATE PROBLEM

 

The attempts to investigate and explain people's
non-Bayesian behavior began when Kahneman and Tversky (1973)
called our attention to the pitfalls of human judgment
against the mathematical norm. Explanations were sought
for why people commit errors in judgment. Later on, as
further research (cf. Borgida and Brekke, 1981) found that
people do use the base rates and/or give answers close to
the Bayesian optimum under certain experimental
manipulations, explanations were refined to cover both
Bayesian and non-Bayesian behaviors and to state the
conditions under which each behavior occurs.

It is beneficial to review the literature
chronologically to understand why certain forms of
explanation have come and gone. The following review
attempts to highlight the major developments in explanations
of the base rate problem. It does not pretend to cover every
base rate study nor every technical subtlety affecting
the use of base rate. For a longer review of other details,
see Borgida and Brekke (1981).

Representativeness

In the now classic engineer-lawyer problem (listed as
problem C in Appendix A) designed by Kahneman and Tversky
(1973), it was discovered that people seemed to judge a
randomly selected personality description according to how
well the description represented a typical engineer's
characteristics, irrespective of the ratio of
engineers' to lawyers' descriptions in the sample.

Each subject was given five personality
descriptions consecutively and was told that each
description was randomly selected from 100 descriptions of
a group of people consisting only of engineers and lawyers.
The subjects were also divided into two groups. One group
was told that the ratio of engineers to lawyers was 70:30
and the other group that it was
30:70. The subjects were requested to judge the probability
that each of the five personality descriptions
belonged to an engineer.

If the subjects were sensitive to the prior
probabilities, the estimated probabilities from the two
groups with different priors should differ. However,
when the estimates of one group are plotted against those of
the other on a Cartesian plane, they lie almost on the
45 degree line, away from the normative
Bayesian curve (see Figure 1). The results showed that the
two groups' answers were almost identical, not sensitive to
the difference in prior probabilities and of course not
consistent with the predictions from Bayes' theorem.
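
The normative relation between the two groups' answers can
be derived from Bayes' rule in odds form: for the same
description the likelihood ratio is identical in both
groups, so the posterior odds differ only by the ratio of
the prior odds. The Python sketch below illustrates this
under the 70:30 and 30:70 priors stated above; the function
name and its arguments are hypothetical and are not taken
from the original papers.

```python
def high_prior_posterior(p_low, prior_high=0.70, prior_low=0.30):
    """Bayesian posterior in the high-prior group, given the posterior
    p_low that the same description would receive in the low-prior group.

    Posterior odds = prior odds * likelihood ratio, and the likelihood
    ratio of a description does not depend on the group, so the two
    posterior odds differ by the ratio of the prior odds.
    Requires 0 <= p_low < 1.
    """
    odds_low = p_low / (1.0 - p_low)
    prior_odds_ratio = (prior_high / (1.0 - prior_high)) / (
        prior_low / (1.0 - prior_low)
    )
    odds_high = odds_low * prior_odds_ratio
    return odds_high / (1.0 + odds_high)


# Points on the normative curve; identical answers in both groups
# would instead fall on the 45 degree line (p_high == p_low).
for p_low in (0.10, 0.30, 0.50, 0.70, 0.90):
    print(f"low-prior {p_low:.2f} -> high-prior {high_prior_posterior(p_low):.2f}")
```

For a description carrying no information, the low-prior
posterior equals the prior of 0.30 and the function returns
0.70, the prior of the other group, as Bayes' rule requires.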

 

 

 

 

 

[Figure 1 plot not reproduced. Axis label: Probability (Engineer).]

Figure 1: Engineer-lawyer problem. Median judged
probability (engineer) for five descriptions and for
the null description (square symbol) under high and
low prior probabilities. (The curved line displays the
correct relation according to Bayes' rule.) (Source:
Kahneman, Slovic and Tversky, 1982, p.55.)

 

 

 

When the personality description looked like a typical
engineer's characteristics, subjects responded with a high
median probability, i.e. 90 - 100%. But when the description
offered no specific information, subjects responded with
about 50% certainty. Subjects' degree of certainty seemed to
vary with the extent to which the description looked like a
typical engineer.
Therefore Kahneman and Tversky (1982) concluded:
Given specific evidence ..., the outcomes under
consideration ... can be ordered by the degree to
which they are representative of that evidence.
The thesis of this paper is that people predict by
representativeness, that is, they select or order

outcomes by the degree to which the outcomes
represent the essential features of the evidence.

(p.48)

Basically, Kahneman and Tversky (1973) tried to show
that people's judgments were not affected by base rates but
only followed the representativeness of the diagnostic
information. The only case in which Kahneman and
Tversky's (1973) subjects followed base rates was when no
diagnostic information of any kind was given. However, we
cannot thereby say that their subjects showed some signs of
using base rates, because a problem without
diagnostic information should not be considered a base
rate problem (see our definition of the base rate problem in
Chapter 1). In short, the early research done by Kahneman and
Tversky (1973) and others (e.g. Hammerton, 1973; Lyon and
Slovic, 1976) marked a period in which findings emphasized
non-Bayesian behavior, nonuse of base rates and misjudgment.

 

 

 


Causality

Later research with varying experimental manipulations
began to show that people do use base rate under certain
conditions. Ajzen (1977) was the first to point out:

In contrast to previous research, it was found
that people's predictions were strongly influenced
by base rate information but only to the extent
that the base rates had causal implications for
the criterion. When the base rates did not have
such causal implications, they were largely
neglected in favor of diagnostic information.
(p.303)

For example, in one of Ajzen's (1977) factorial
experiments, subjects were requested to judge, from a
personality outline of a fictitious person, the probability
that the person would pass a final examination. The study
included two types of base rate conditions. In the causal
base rate condition, subjects were given the information
that 75% (or 25%) of the students had passed an examination
of the same course two years ago. In the noncausal base rate
condition, subjects were told that a certain educational
psychologist had interviewed some of the students who had
taken the exam two years before, and that 75% (or 25%) of
his sample had passed the exam.

A post-hoc test was used to confirm that the exam with the
25% passing rate was perceived by subjects as significantly
more difficult than the one with the 75% rate. Ajzen (1977,
p.304) therefore reasoned that the inferred difficulty level
of the exam "does have a causal effect on a given student's
success or failure".

The results showed a significant interaction between
the base rate of success (75% vs. 25%) and the type of base
rate (causal vs. non-causal), as well as a main effect of the
base rate of success. The causal base rate had a
stronger effect on the prediction of exam success than the
noncausal base rate.

All the results in Ajzen's experiment taken together
indicated that different base rates did have different
effects on people's judgment. The effect of base rate was
largest when the base rate had a causal implication for the
diagnostic evidence. When the base rate was noncausal, the
effect of base rate was minimal and people judged
mainly by means of the diagnostic information.

Ajzen's (1977) results gave us a more precise
picture of people's use of base rate than the early
study of Kahneman and Tversky (1973). Contrary to the latter
authors' claim, Ajzen discovered that people do use base
rate in their judgment, at least when the base rate
information has a causal implication. However, Ajzen's
results were consistent with the representativeness proposal
when the base rate was a noncausal one.

 

 

 


Diagnosticity

Since the discovery of the base rate fallacy, researchers
have investigated the effect of different types of base
rates by varying the quality of the base rate, such as
causal vs. noncausal. Ajzen (1977) discovered
that people are sensitive to a base rate with a causal
implication for the diagnostic information.
Naturally, the next step was to vary the
diagnostic information in order to
study its effect on people's use of base rate. This task was
taken up by Ginosar and Trope (1980). They pointed out that
causality alone does not determine the use of base rate; the
validity of the diagnostic information also affects it.

Ginosar and Trope (1980) restudied the engineer-lawyer
problem by adding a diagnostic condition with inconsistent
information. This condition contained information with
implications for both engineer and lawyer. Subjects' median
responses then varied in direct proportion to the variation
in base rates. Parallel results were also demonstrated with
their 'field-of-study' problem.

Results showed that, in addition to causality, the
validity or diagnosticity of the diagnostic information also
plays an important part in determining the use of base rate.
Similar results were obtained in other studies by varying
explicitly the degree of diagnosticity or accuracy of
individuating information (Fischhoff & Bar-Hillel, 1984;
Hinsz, Tindale, Nagao, Davis & Robertson, 1986). In addition,
when some unrelated information was added to diagnostic
information, the diagnosticity was diluted rapidly
(Nisbett, Zukier & Lemley, 1981).

All the studies cited in this section demonstrate that
the diagnosticity of information can influence the way people
make judgments. People tend to rely on diagnostic information
when the diagnosticity is high, and to rely on the base rate
when the diagnosticity is made minimal. These findings are
consistent with Kahneman and Tversky's early claim of
representativeness, except that there are proven conditions
under which people consistently make more use of base rate.
The diagnosticity explanation can join with the causality
explanation to co-determine probability judgment and the use
of base rate. A later study by Hinsz et al. (1981) indeed
demonstrated that although the causal nature of the base rate
factors had a significant effect on subjects' probability
judgment, it was relatively minor in comparison with the
impact of the accuracy of the source information, or
diagnosticity.

 

 

 


Relevance

Turning to the side of the base rate, Bar-Hillel
(1980) argued that relevance, not causality per se,
determines the use of base rate. Using a modified cab
problem, Bar-Hillel showed that subjects' responses came
closer to the Bayesian optimum when the base rates of cabs
in the region closer to the neighborhood of the accident were
additionally stated. Bar-Hillel argued that such a sub-group
base rate becomes more relevant and will be integrated by
subjects together with other information.

In short, we can see that in determining the use
of base rate or probability judgment, relevance is important
on the side of the base rate, and diagnosticity is vital
on the side of the diagnostic information.

Problem solving

Through a series of experimental manipulations, Ginosar
and Trope (1987) demonstrated that people's
judgment under uncertainty varies under a number of new
conditions. These researchers argued that judgment under
uncertainty can be explained parsimoniously by the "problem
solving" approach.

For example, probability judgment was found to depend
on the diagnosticity of prior problems. When prior problems
had non-diagnostic conditions, the judgment of the problem
that followed exhibited a higher use of base rate than when
it was preceded by a problem with a diagnostic condition.
Ginosar and Trope explained this as 'prior activation of
inferential rules'. In a second experiment, the original
engineer-lawyer problem was listed in a sentence by sentence
format. The change in mean probability judgment was
explained as 'concurrent activation of inferential rules'.
In the third experiment, probability judgment was found to
vary according to whether the correct category (in a
Bayesian sense) was initially given to the subject or not.
Ginosar and Trope again related this effect to the
goal-directedness described in problem solving theories. When
the source reliability of the diagnostic information in the
engineer-lawyer problem was decreased, and when the cab
problem was converted to resemble drawing marbles,
significant decreases in probability judgment were observed.
These phenomena were explained as restrictions on the
application of the representativeness rule and enhancement
of the applicability of the sampling rule respectively.

Ginosar and Trope were the first researchers who
attempted to uphold a coherent and consistent theoretical
framework (i.e. the problem solving approach) to explain the
various experimental results in base rate research. Their
effort still represents the most encompassing
theoretical work in this field to date.

In conclusion, our review of the major literature
confirms Bruner's claim that almost no research effort is
directed to the study of thinking or judgment in themselves.
All the research cited in this chapter is concerned only
with whether subjects are making correct judgments according
to the normative Bayes' theorem.

 

 

 


CONCEPTUAL PROBLEMS INVOLVED IN ANALYSING

SUBJECT'S RESPONSE

 

Researchers have generally computed and presented
the mean or the median judgmental responses (e.g.
Bar-Hillel, 1980; Ajzen, 1977; Ginosar & Trope, 1987). The
central tendency is either compared to the Bayesian optimal
value or compared across treatments to establish a causal
relationship between treatments and the mean judgmental
responses.

However, these common data analytic procedures and
interpretations contain, I think, at least two unjustified
beliefs, intermixed with conceptual and technical
difficulties. They are listed as follows:

Responding to Base Rate is Better than not Responding

 

One of the designs illustrating this belief is set up
via an ANOVA (e.g. Ajzen, 1977). A base rate problem is given
to two groups of subjects, with the base rate differing
between the two groups. When the mean responses of
probability judgment differ significantly between the two
groups, researchers obtain evidence that subjects are
sensitive to the magnitude of the base rate. Since a Bayesian
scenario involves the mathematical weighing of the base rate
information and the diagnostic information, the demonstrated
sensitivity to base rate level is generally construed by
researchers as better judgment, at least better than
that of subjects who seem to be concerned with the diagnostic
information only (see Kahneman, Slovic & Tversky, 1982).
However, a priori speaking, using the base rate does not
guarantee that the final answer is Bayesian equivalent,
because, in actual practice, there are numerous ways of
using the base rate. The outcome is a priori unpredictable
with respect to the Bayesian optimum.

Proximity to the Bayes' Norm as being Bayesian

A mean or median response close to the Bayesian optimal
value is regarded as the better Bayesian judgment. There
are two problems with this belief. First, it is a priori
possible for some non-Bayesian behaviors to yield answers
close to the Bayes' optimal value. Second, the distributions
of responses in base rate research are often bimodal or
multi-modal (e.g. see Figure 2), so the mean or median is
not a fair summary of what subjects are doing.

 

 

 


 

 

 

 

 

 

[Histogram omitted. N = 52; horizontal axis: response (0 to
100); vertical axis: frequency; the median and the mode are
marked.]

Figure 2: Distribution of responses to the cab
problem. The arrow indicates the correct Bayesian
estimate. (Source: Bar-Hillel, 1980)

 

 

 

In fact, the underlying generative mechanisms within
each sub-distribution might be different. Some might be
close to Bayesian behavior while others might not. A central
tendency measure such as the mean or median therefore
misrepresents the underlying mechanisms with regard to the
Bayesian optimum. As a consequence, the distance between the
central tendency measure and the Bayesian optimal value may
become uninterpretable.
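
The point can be illustrated with a minimal sketch. The responses below are hypothetical (they are not data from any study), and the value of 41 is used only as an assumed optimum. The median of a bimodal set of responses can coincide with that optimum even though almost no individual response lies near it.

```python
from statistics import median

# Hypothetical responses in percent; NOT data from any study.
# One cluster answers with a low value, another with a high value.
low_cluster = [10, 10, 10, 12, 15]
high_cluster = [70, 72, 72, 75, 80]
lone_middle = [41]  # a single response at an assumed optimum of 41

responses = low_cluster + high_cluster + lone_middle
central = median(responses)

# Collect the responses lying within 5 points of the median.
near_median = [r for r in responses if abs(r - central) <= 5]

print(f"median = {central}")
print(f"responses within 5 points of the median: "
      f"{len(near_median)} of {len(responses)}")
```

In this illustration the median equals the assumed optimum, yet only one of the eleven responses is close to it, so the distance between the median and the optimum says little about what most respondents did.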

 

 

 


Proposed Remedies

 

In relation to the conceptual problems in analysing
subjects' judgmental response, the following two remedies

are proposed:

Obtain Additional Measures. A particular defect in

 

former base rate studies is that, by focusing on subjects'
final responses, one loses sight of the underlying
generative processes or mechanisms of the subjects'
solutions. Accordingly, one loses the fundamental grounds
for deciding whether a subject's behavior is Bayesian or
non-Bayesian. It is suggested that one can use
post-experimental interviews or thinking-aloud procedures
(for trained subjects) to investigate the explanations,
reasonings and conscious processes of thinking (Newell and
Simon, 1972). By contrasting the final judgmental responses
with the reasoning of subjects, a better measure of whether
the subjects' behavior is Bayesian or non-Bayesian can be
obtained. Even if the behavior turns out to be non-Bayesian,
the results would still open a new horizon of research into
the structure of people's reasoning, in addition to
providing their numerical probability judgments.

Categorize the Sub-Distributions Qualitatively. A
simple way to solve the multi-modal distribution problem
might be to assign the separate sub-distributions integer
values and use chi-square to test the change in the sizes of
the sub-distributions under different experimental
conditions with theoretically important implications.
However, there might be borderline cases which are difficult
to assign to a sub-distribution by the numerical magnitude
of the response alone, or for which it is unclear whether
they should be considered a separate meaningful subgroup.
Independently obtained qualitative categorizations of the
verbal protocols collected from subjects, as recommended in
the last section, might be useful in solving this
classification problem. Some useful qualitative classifying
techniques can be adopted from 'phenomenography' (Marton,
1981) or from 'grounded theory' (Strauss, 1987).
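
As a rough sketch of the chi-square remedy, the snippet below computes the Pearson chi-square statistic for a table of counts in which each row is an experimental condition and each column is a sub-distribution. The counts are hypothetical and the function name `chi_square` is ours. The sketch also assumes that the conditions are given to independent groups of subjects; when the same subjects answer both problems, a test for paired data is needed instead.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom
    for an r x c table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts, NOT data from this study.
# Rows: two experimental conditions; columns: two sub-distributions.
counts = [[20, 10],
          [10, 20]]
stat, df = chi_square(counts)
print(f"chi-square = {stat:.2f}, df = {df}")
```

The statistic is then compared with the critical value of the chi-square distribution for the given degrees of freedom.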

The common feature of these classifying techniques is the
careful coding of individual verbal protocols. The protocols
are brought together into groups on the basis of similarity,
and the groups are compared with each other. The
higher-order meanings which emerge are combined to form the
categories of description. The distinctive feature of this
method is that 'the analysis is dialectical in the sense
that bringing the quotes together develops the meaning of
the category, while at the same time the evolving meaning
determined which of the categories are included or omitted'
(Marton & Saljo, 1984, p.55). A more detailed description of
this technique is given in the "Methodology" chapter of this
thesis.

Readers might question the subjectivity of the coding
process involved. This can be answered by noting that
qualitative categories are constructed from data in order to
understand the phenomena. They are by no means final and are
subject to modification or synthesis when more data become
available or when the research focus shifts (see Strauss,
1987).

As long as the schemes of classification and coding are
carefully and explicitly laid out, the categorization
process can be repeated by an independent judge to check the
reliability of coding under these schemes. Repeated research
can provide information to validate or reject the usefulness
of the coding in relation to specific research purposes
(Strauss, 1987).

 

 

 


METHODOLOGY

The methodology adopted in this work is a response to
Bruner's call for a study of judgment itself as well as a
response to the conceptual and technical problems in base
rate research as discussed in the last section.

Our attempt is to reanalyze the base rate problem
through understanding subjects' judgmental reasoning.
Whether subjects' responses comply with the normative rule
is not the primary interest. The investigation of subjects'
judgmental reasonings is achieved by collecting verbal data
in addition to numerical responses. Subjects were asked to
think aloud in judging several base rate problems of
theoretical interest. They were then interviewed by the
experimenter with regard to their judgmental reasonings.

Qualitative techniques were applied to the analysis of
the verbal data. Useful and meaningful constructs of the
judging process were delineated for subsequent quantitative
analysis. Qualitative insights are also reported.

 

 


Subjects

The subjects were 40 volunteers, 20 males and 20 females,
recruited from Chinese students on the Michigan State
University campus. Within each gender, half of the subjects
had backgrounds in arts or social sciences and the remaining
half majored in science or engineering.

Procedure

Each subject was tested individually with the
experimenter by his side. The testing session generally
lasted 40 minutes to 1 hour. The questionnaire was typed in
English, while spoken exchanges were mainly in Chinese.
Mandarin was used for students who came from Mainland China
or Taiwan. For Hong Kong students, the Cantonese dialect was
used.

First, the subject was allowed to read the first page
of the questionnaire, which contained the instructions. (The
whole questionnaire is listed in the appendix of the
thesis.) Then the interviewer gave the following statement:

    "This research is to study how ordinary people
    make judgment on certain everyday affairs. You
    would have seven problems to do. In all of these
    problems, there are no absolute answers of any
    kind. You don't have to worry whether your answer
    is right or wrong. Therefore you can use your own
    methods to make what appears to you the best
    judgment. In the beginning you would use a method
    called thinking aloud method. That means you try
    to say what you are thinking about when you are
    thinking over the problem. Just tell us what comes
    to your mind and we would tape-record it. After
    you have used this method to finish all of the
    problems, I would interview with you and ask you
    what your reasoning is and how you come to the
    answer. The questionnaire generally takes 20
    minutes to complete and the interview session
    would last for a further 20 minutes. There is no
    time limit in completing the questionnaire. You
    can do it at your own pace. If you have any
    questions about the meaning of the wordings in the
    questionnaire, feel free to ask me. Do you have
    any questions? ... If not, you can begin."

While the subject was thinking aloud, the interviewer
wrote down the main arguments or reasoning processes spoken
by the subject. After the thinking-aloud procedure, the
interviewer proceeded to interview the subject on each
problem about the subject's reasoning processes.

To be economical while still remaining accurate, the
verbal protocols from the think-aloud and research interview
sessions were written down during the experiment itself.
The protocols were written in sufficient detail to capture
the main reasonings of the subject. Examples of the
protocols can be found in the appendix. Tape recordings
were referred to only when there were problems in
understanding a subject's protocol.

A subject might change his answer and / or method of
approaching the problem at any time in the course of the
experimental period. In this study, the final answer and /
or process that the subject agreed to be his best judgment

was taken as the data input point, no matter how many times

the subject had changed his mind during the experiment.

 

 

 


Materials

 

The questionnaire used in this study consisted of seven
problems. The first problem was the cab problem researched
by Kahneman and Tversky (1973). The second was the same
problem except that the base rate was modified to become a
causal one (Tversky and Kahneman, 1980). The third and the
fourth were the engineer-lawyer problems with the descriptions
about Jack and about Dick. The fifth problem was the same as
the third but with an extreme base rate of 1:99. The sixth
problem was a base rate problem about the performance of a
machine. The last problem was a mathematical Bayes' problem
stated in linguistic form. Unfortunately, due to some
unnoticed typing errors, this question was discarded from
the final analysis. A copy of the questionnaire can be

found in the appendix.

Specific Research Questions

 

The main purpose of this research is to restudy the
base rate problem using the methods of thinking aloud as
well as the research interview. It is hoped that these
methods can reveal further the reasoning or thinking
processes of the subjects in order to better understand how

people judge.

 

 

 


We wanted to categorize people's thinking methods into
several distinct and meaningful types by studying the verbal
protocols of the subjects. Instead of merely checking
whether or not subjects make use of the base rate, we can
then see how subjects shift from one category of thinking to
another across different problem contexts. Using these
descriptive categories, we can attempt to answer
statistically (using chi-square) the following specific
research questions:

1. Does the causality of the base rate affect people's
   mode of judgment? (by comparing problems A and B)
2. Does the diagnosticity of information affect people's
   mode of judgment? (by comparing problems C and D)
3. Do people judge differently when the base rate is highly
   extreme (i.e. 99:1)? (by comparing problems C and E)
4. Do people judge differently between problems of a
   social context and those of a physical context? (by
   comparing problem F to A or B)
5. Do males judge differently from females?
6. Does age affect the way people make judgments?
7. Do students' major areas affect how they judge?
8. Do people who have learned Bayes' rule judge differently
   from those who have not?

 


By comparing the numerical probability responses of our
Chinese subjects with those of the American or Israeli
subjects in past research, we can also get a rough idea of
whether culture makes a difference in the response to base
rate problems.

Analytic Method for the Verbal Protocols

Think-aloud data have been most useful in tracing the
sequencing of information processing in problem solving. In
our study, however, there are very few critical pieces of
information, namely the base rate and the diagnostic
information. Our pre-analysis found that the think-aloud
data are not particularly illuminating in our case because
the sequencing of information is not very important (in
contrast to, for example, playing chess or performing
operations to control a factory boiler). Moreover, most of
our subjects were not able to verbalize their thinking very
well while solving the base rate problems. It was decided
that our study would depend mainly on the verbal data from
the research interview, while the data from the
thinking-aloud session would only supply supplementary
concurrent information about what was happening while the
subjects were solving the problems.

The analytic method for the verbal protocols in search
of meaningful constructs is borrowed from what Marton (1981)
termed "phenomenography". He described it as "research which
aims at description, analysis, and understanding of
experiences; that is, research which is directed towards
experiential description" (Marton, 1981, p.180). Such
research is possible because reality has repeatedly been
shown to be experienced in a limited number of qualitatively
different ways (see Gibbs, Morgan and Taylor, 1980, for an
overview).

Marton also called this kind of research second-order
research, as distinguished from first-order research, which
tries to describe various aspects of the world. He gave an
example: first-order research asks a question like "Why do
some children succeed better than others in school?", while
second-order research asks "What do people think about why
some children succeed better than others in school?".

Applied to our study, the first-order perspective is
like asking "Why do people make wrong probability
judgments?". Our research perspective is similar to a
second-order one, asking "What are the grounds for your
best-made judgment?".

Two kinds of results are expected from this kind of
research: the categories of description and the
distribution of subjects over the categories. The categories
of description can be considered abstract instruments for
the analysis of concrete cases in the future. Alternatively,
we can study a historical fact, such as individual X
exhibiting conception Y under circumstance Z.

In practice, the phenomenographic method applied in
our case was as follows:

1. The protocols of the individuals were first read, and
   all comments relevant to the enquiry were marked and
   identified.

2. The pools of comments thus obtained were then read
   for each problem across individuals.

3. Extracts were brought together into groups on the
   basis of similarity, and the groups were delimited from
   each other on the basis of their differences. The
   higher-order meanings which thus emerged were combined to
   form the categories of description. The distinctive
   feature of this method is that "our analysis is
   dialectical in the sense that bringing the quotes
   together develops the meaning of the category, while
   at the same time the evolving meaning determined which
   of the categories are included or omitted" (Marton &
   Saljo, 1984, p.55).

 


RESULTS

Our results are presented in several ways. First, the
numerical probability judgments are analyzed in the
traditional way using the median as the measure of central
tendency. This allows us to compare our results
quantitatively with those of other researchers. Second, the
constructs or strategies established by qualitative methods
are presented, and the rules used to form the constructs are
shown. Third, these strategies of judging are used to make
statistical comparisons between problems, so that the effect
of problem context on the distribution of people's
strategies of judging can be observed. Fourth, the effects
of culture, sex, and major area on the preference for
different modes of judgment are presented.

A Traditional Analysis of Numerical Data

The numerical judgmental responses were submitted to a
traditional quantitative analysis using the median as the
measure of central tendency. While this is not our main
purpose, it allows the results to be compared with other
research based on these methods, and the quality of our data
collection to be checked.

 


 

Table 1
Quantitative Results Based on Central Tendency

Mean, median and the corresponding Bayes' optimum for
the judgment problems studied

Problem   Mean     Median   Bayes' Optimum   Base Rate   N
A         55.7%    80%      41%              15%         39
B         54.2%    70%      41%              15%         39
C         65.9%    71%      n.a.             30%         40
D         32.4%    30%      n.a.             30%         38
E         39.6%    15%      n.a.             1%          40
F         34.5%    21%      21%              10%         39
G         38.6%    40%      14%              10%         14*

Note *: Valid subject size for problem G decreases due
to defects in some questionnaires.

 

 

 

As we can see from Table 1, the median responses to the
cab problems (A and B) are not close to the Bayes' estimate.
The medians for problems A and B are 80% and 70%
respectively, while the corresponding Bayes' estimates are
both 41%. There are no standard Bayes' estimates for the
engineer-lawyer problems (C, D and E) because the diagnostic
information for these problems is not given in explicit
quantitative terms. The median for problem F lies exactly on
the corresponding Bayes' optimum of 21%. For problem G, we
collected only 14 valid cases due to some typing errors in
the questionnaires for the earlier 25 subjects. In this
problem, the median answer is 40%, far from the 14% Bayesian
optimum.
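
The Bayes' optima quoted for the cab problems and for problem F can be reproduced with a short calculation. The sketch below rests on assumptions about inputs that are not restated in this chapter. For the cab problems it assumes the usual witness accuracy of 80% (with a 20% error rate) together with the 15% base rate. For problem F it assumes a 70% hit rate and a 30% false-alarm rate with the 10% base rate, following the expression 0.1*0.7/(0.1*0.7+0.9*0.3) listed in Table 2. The function name `posterior` is ours.

```python
def posterior(base_rate, hit_rate, false_alarm_rate):
    """P(hypothesis | evidence) by Bayes' rule for a two-hypothesis problem."""
    true_positive = base_rate * hit_rate
    false_positive = (1 - base_rate) * false_alarm_rate
    return true_positive / (true_positive + false_positive)

# Assumed inputs (see the note above); percentages written as proportions.
cab = posterior(0.15, 0.80, 0.20)        # cab problems A and B
problem_f = posterior(0.10, 0.70, 0.30)  # problem F

print(f"cab problems: {cab:.3f}")
print(f"problem F:    {problem_f:.3f}")
```

Under these assumptions the two values round to 41% and 21%, in agreement with the Bayes' optima in Table 1.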

Results from problems A and B seem to replicate the
findings of Tversky and Kahneman (1980). Our subjects'
median answers, 80% and 70%, are almost the same as Tversky
and Kahneman's 80% and 60%. Since our subjects' answers are
far from the Bayes' optimum of 41%, they exhibit the usual
fallacy. But the causal base rate in problem B seems to help
shift the median response closer to the optimum. This shift
is similar to Tversky and Kahneman's result but a little
smaller in magnitude.

Again, our results for the engineer-lawyer problems
(C, D) are almost the same as those of Ginosar and Trope
(1980, p.235). Our median answers are 71% and 30%
respectively, while their results are 69% and 30%. This
seems to confirm their finding that "base rates will be
utilized to the extent that the usefulness of the
individuating information for diagnosing category membership
is diminished" (Ginosar & Trope, 1980, p.228).

Problem E has a very extreme base rate (1:99) compared
to problem C's (30:70). Our median response is 15%, much
lower than problem C's 71% (Table 1). Because problems C and
E are the same except for the base rate, the results seem to
show that under an extreme base rate, people tend to use the
base rate more.

Problem F is interesting because the median response of
our subjects is exactly equal to the Bayes' optimum (i.e.
21%). The result seems to imply that when handling judgments
of the purely physical realm, our subjects are on average
perfect Bayesians! However, when we examine the distribution
of responses carefully (see Figure 3), most of the responses
lie in two modal regions at about 10% and 72% (compare the
base rate and the diagnostic information, 10% and 70%). Very
few subjects' responses are close to 21%. The use of the
median response to summarize the results here is not
justified.

Our last problem is a Bayes' problem in mathematical
terms. Most subjects showed signs of difficulty in
understanding or solving this problem. Unfortunately, due to
a typing error in some questionnaires, the valid sample size
was reduced to only 14. It was decided to abandon this
problem. Nevertheless, the median answer of 40% is quite far
from the optimal 14%.

In sum, quantitative analysis using the median seems to
replicate major findings of other researchers. This gives us
some confidence that our data collection procedure is quite
reliable. Results from problem F also highlight the fact
that the median response might not be justified as a
summary of the responses. They also show that using the
median response alone can easily neglect other distinct
judgmental modes at work. For example, the distribution of
responses in problem F was obviously bimodal.

[Histogram omitted. Vertical axis: response midpoint
(percent); horizontal axis: frequency.]

Figure 3: Histogram of the subjects' responses for
problem F. Bayes' estimate denoted by the arrow sign
<--.

 

 


Qualitative Categorization of Judgment Using Verbal
Protocols

From our initial phenomenographic analysis of the
verbal protocols from the interviews, some qualitatively
distinct categories of judgment seemed to stand out as
probable research categories in our data. In the later phase
of the data collection, fewer exploratory questions were
asked and the data collection concentrated on finding out
which particular way of judging the subject used. If the
subject's judgment conformed with the categories discovered
earlier, the time spent asking related questions was
shortened because the discovered categories acted as a
schema for understanding the current subject's way of
judging. This shifting focus of data collection at various
stages of research is what Strauss (1987) called
"theoretical sampling". Strauss explained this technique as
one "whereby the analyst decides on analytic grounds what
data to collect next and where to find them, ... so this
process of data collection is controlled by the emerging
theory ... When done well, this analytic operation pays very
high dividends because it moves the theory along quickly and
efficiently" (Strauss, 1987, p.38-39).

Two core categories were finally decided upon for our
set of data: the probabilistic mode and the intuitive mode.
In fact, hardly any single pair of terms could fully
represent the ways people judge. Other terms like
scientific, axiomatic, mechanical or abstract may fulfill
some of the descriptive functions not covered by the term
"probabilistic". For the "intuitive" mode, words like lay,
analytic, experimental or pragmatic might express some of
the meanings not captured by the word "intuitive". Our
terms were chosen because they seem to be more inclusive,
less misleading, and distinguishable from terminologies
already in use by other researchers.

The term "probabilistic" does not automatically mean
correct, and "intuitive" does not mean incorrect. The
probabilistic mode of judgment refers to judgment which
mainly utilizes the calculus of chance, e.g. the urn model,
the axiomatic additive and multiplicative rules, or
conditional probability. The intuitive mode of judgment
refers to common everyday judgment which does not rely
solely on axiomatic probability theories. No ready scheme or
formula is used for judging. Instead, people using the
intuitive mode of judgment consider the quality and
relevance of the evidence, put higher weight on the
particular case and on the present moment, and might employ
an IF-THEN criterion or use narratives to fill in gaps of
information. Conceptual and empirical indicators for the
probabilistic mode and the intuitive mode of judgment are
shown in Table 2 and Table 3 below.

 

Table 2
Indicators for Probabilistic Mode of Judgment

Concept indicators                  Examples of empirical indicators

Subject to some calculus            "It's a pure math or
of chance                           probability problem"

Use the "urn" model,                Use only the original ratio
judge solely by sample              (i.e. base rate)
proportion

Employ some axiomatic               0.9*0.3 + 0.1*0.7
probability theory,
e.g. the multiplicative             80%*15%
rule, conditional
probability                         0.1*0.7/(0.1*0.7+0.9*0.3)

 

 

Table 3
Indicators for Intuitive Mode of Judgment

Concept indicators

    Does not rely on abstract axiomatic theory
    Consider the relevance and quality of evidence
    Generally put higher weight on evidence which is present
      or is about the individual case concerned
    Employ IF-THEN criterion
    Use narratives to fill in gaps of information

Examples of empirical indicators

    "The description does not give us any information"
    "The number of cabs in city is irrelevant"
    "The witness is more important, statistical data are not"
    "Past and present, no necessary relationship"
    "If no interest in social issues, hardly a lawyer"
    "Lawyer works in group, engineer works alone"

Using the above criteria for the two modes of judgment,
we can treat the protocol of each problem answered by each
individual as one data point and classify it as belonging to
either the probabilistic or the intuitive mode. In the few
cases where the verbal protocol was not clear enough for
categorization, the experimenter decided whether the
judgment was predominantly probabilistic or intuitive. When
all cases are settled, we can study quantitatively how the
distribution of the modes of judgment varies as a function
of problem context. Two examples showing the verbal
protocols and the reasons for classifying them can be found
in the appendix.

Reliability of coding. An independent judge was called
upon to code the interview protocols according to the
criteria in Tables 2 and 3. The resulting coding was
checked against the original coding by the experimenter.
Extremely high reliability was observed. For problems A, B
and F, only 1 out of 40 protocols was coded differently. Two
protocols were coded differently for problem C. Interrater
correlation coefficients for problems A, B, F and C were
0.945, 0.947, 0.947 and 0.892 respectively. An exact match
in coding was found for problems D and E. Accordingly, the
coding criteria for the two categories of judgment should be
very reliable. Upon review of the discrepancies between the
independent judge's coding and the experimenter's, the
experimenter decided that the judge's coding was better, and
the data for the final analysis were changed accordingly.

Problem Contexts and Mode of Judgment: A Quantitative
Analysis

If we compare the distribution of the number of
subjects exhibiting the two modes of judgment between
problems, we can test how the problem type or context
affects the distribution of modes of judgment. This
comparison is similar to a pre- and post-test design. As
outlined in our paragraph on research questions, we would
like to test whether a "causal" base rate, low diagnostic
information, an extreme base rate, and problem context
(social vs. physical) affect the resulting distribution of
modes of judgment. Since all we have are frequency data,
contingency tables are used. The McNemar test is best suited
to testing the null hypothesis that there has been no change
in the proportion of all subjects who used intuitive
judgment (or equivalently, probabilistic judgment) (Conover,
1980, p.132). From this test we can also learn whether the
intuitive or the probabilistic mode of judgment has a higher
proportion of changers between problems.
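
As an illustration of the computation, the sketch below applies the uncorrected McNemar formula, (b - c)^2 / (b + c), to the discordant cells of Table 5 (problem A vs. B) and Table 6 (problem C vs. D). The function name `mcnemar` is ours, and the p-value is obtained from the chi-square distribution with one degree of freedom. That the uncorrected formula was the one used is an assumption, made because it reproduces the reported statistics.

```python
import math

def mcnemar(b, c):
    """McNemar's chi-square (no continuity correction).

    b: subjects who changed from intuitive to probabilistic
    c: subjects who changed from probabilistic to intuitive
    Returns the statistic and its p-value on 1 degree of freedom.
    """
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of chi-square with 1 df, via the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Discordant cells read from Table 5 (A vs. B) and Table 6 (C vs. D).
causality = mcnemar(2, 1)
diagnosticity = mcnemar(11, 0)

print(f"A vs. B: chi-square = {causality[0]:.2f}, p = {causality[1]:.3f}")
print(f"C vs. D: chi-square = {diagnosticity[0]:.1f}, p = {diagnosticity[1]:.4f}")
```

Only the subjects who changed mode enter the statistic; the subjects who stayed in the same mode for both problems do not affect it.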

Before we use the categories of judgment to analyze the
conditions related to shifts of judgment, we should examine
whether the categories are relatively stable across
problems. As shown in Table 4, the phi coefficient indicates
the correlation of the modes of judgment between problems. A
higher correlation indicates greater stability. Among the
contrasts of theoretical interest, the phi coefficients
ranged from 0.282 to 0.837 and were all statistically
significant. The mean phi was 0.550. These fairly high phi
coefficients indicate that the qualitatively coded modes of
judgment are stable enough as a construct.
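
For reference, the phi coefficient of a 2 x 2 table can be computed directly from its four cells. The sketch below uses the standard formula with the cell counts of Table 5 (problem A vs. B) and Table 6 (problem C vs. D); the function name `phi_coefficient` is ours.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2 x 2 table [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Cell counts read from Table 5 (A vs. B) and Table 6 (C vs. D).
# Rows: mode in the first problem; columns: mode in the second problem,
# each in the order intuitive, probabilistic.
phi_ab = phi_coefficient(23, 2, 1, 13)
phi_cd = phi_coefficient(17, 11, 0, 11)

print(f"A vs. B: phi = {phi_ab:.3f}")
print(f"C vs. D: phi = {phi_cd:.3f}")
```

These values agree with the first two phi coefficients reported in Table 4.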

 

 

 

 

 

Table 4
Stability and Change under Various Problem Conditions

Stability and change were tested by the phi
coefficient and the McNemar's chi square respectively.

Problems compared     Phi Coeff.          McNemar's Chi Square

(A) vs. (B)           0.837 (p<.001)      0.33 (p>.05)
(C) vs. (D)           0.551 (p<.001)      11.0 (p<.001)
(C) vs. (E)           0.724 (p<.001)      6.0  (p<.02)
(A) vs. (F)           0.357 (p=.014)      8.07 (p<.01)
(B) vs. (F)           0.282 (p=.043)      6.25 (p<.02)

[Cell frequencies of the intuitive/probabilistic 2 x 2
tables omitted.]

 

 

 

Causality and mode of judgment. Causality is supposed
to be what makes problem B differ from problem A, because
in problem B the base rate is given as accident rates
instead of the relative sizes of the two cab companies.
Past research showed that there was a drop in the median
response from about 80% to 60%, closer to the Bayesian
optimum of 41%, when the base rate in the cab problem was
given as accident rates (Tversky and Kahneman, 1980). The
drop was attributed to the causality of the base rate, which
"readily elicits the inference that the drivers of the Green
cabs are more reckless and/or less competent than the
drivers of the Blue cabs" (Kahneman et al., 1982, p.157).

In fact only 12 (30.8%) out of 39 subjects in our study
gave the "causal" problem a different numerical answer from
that of problem A. For the majority of subjects (69.2%), the
"causal" base rate did not affect the way they made their
judgment.

With regard to the change in mode of judgment, only 3
subjects shifted their modes between problem A and problem
B. The change was not statistically significant (McNemar's
chi-square=0.33, p>0.05).

The reasons given by the intuitive subjects who did not
shift their judgment are summarized below. Subject number 2
for problem B, for example, is denoted by S2B. Where more
than five subjects gave the same reason, they are not listed
individually; only the number of subjects is given.

1. Past and present are independent, they have no
   (necessary) causal effect. (S2B, S9B, S40B, S24B, S29B)

2. Witness is more important, more reasonable. (Reasoning
   similar to that for problem A). (N=12)

Five subjects who remained intuitive but were nevertheless
affected by the causal base rate lowered their numerical
estimates. They reasoned as follows:

1. Accident rate has some influence. Lower the witness's
   reliability. (S33B, S26B, S1B, S28B, S38B)

For the subjects who shifted their judgment, the reasonings
were:

1. We should consider the accident rate. I used
   mathematical calculation. (S27B)
2. This concerned the occurrence of accidents, we have to
   consider this information. (S25B)
3. Accident rate is just background information, witness
   is more important. (S10B)

Subjects using the probabilistic mode of reasoning usually
gave no specific reasons for using the same strategy. They
considered mathematical calculation the appropriate method
for answering these problems.

 

 

Table 5
Effect of Causal Base Rate on Judgment

                                 (B)
                       Intuitive    Probabilistic
(A)  Intuitive            23              2
     Probabilistic         1             13

 

 

Diagnosticity and mode of judgment. The diagnostic
information in problems C and D differs in diagnosticity.
The description of Dan was intended to convey little
diagnostic information. Past research (Ginosar & Trope,
1980) showed that the median response to the engineer-lawyer
problem dropped from about 70% to 30% when the low
diagnostic description of Dan replaced the high diagnostic
personality description of Jack.

Our study showed that 16 subjects (40%) resorted to the
base rate of engineers (i.e. 30%) to answer problem D,
because they thought the description was vague or contained
little information. For the same reason, 5 subjects (12.5%)
gave an either-or answer (i.e. 50%). Together, about half of
the subjects (N=21) regarded Dan's description as providing
no information for deciding between the careers of engineer
and lawyer. Interestingly, 6 subjects (15%) thought that
Dan's description now seemed more like that of a lawyer; the
percentages they gave ranged from 5% to 20%. These subjects
explained the choice by saying that since Dan is of high
ability and high motivation, he sounds like an achieving
young lawyer. In addition, two of them believed that an
engineer works alone, so the description "He is well liked
by his colleagues" does not fit an engineer.

With regard to the shift of judgmental mode, it was found
that out of 28 subjects using the intuitive mode for problem
C, 11 (39.3%) shifted to the probabilistic mode. They chose
the sample ratio (30%) as their answer, noting that the
description was too vague. All of those using probabilistic
judgment in problem C remained probabilistic for problem D.
Eight (72.7%) of these subjects stated that the vague
description confirmed their use of the sample ratio as the
answer. The shift of mode here was statistically significant
(McNemar's chi-square=11.0, p<.001).

 


 

Table 6

Effect of Diagnosticity on Judgment

                            (D)
                    Intuitive   Probabilistic
(C) Intuitive          17            11
    Probabilistic       0            11

 

 

 

Extreme base rate and mode of judgment. Studies
adopting an extreme base rate (e.g. 90%/10% or more) have
produced equivocal results (see the review by Borgida and
Brekke, 1981). Our study found that 19 (67.9%) of the 28
intuitive judges in problem C were affected by the extreme
base rate of 99%:1% in problem E. Seven (25.0%) of those
intuitive judges decreased their confidence that the
description belonged to an engineer, although the
probability they gave was still 50% or higher,
indicating that they still took the description to be that
of an engineer. They claimed that the extreme base rate had
an effect, but that the description was still like an
engineer. Five (17.9%) subjects used the 1% as ground level
but gave an answer slightly higher than 1% (i.e. values
ranged from 5% to 20%) to indicate their belief that the
description looked like an engineer, so the probability
should not be as low as 1%. Another 7 intuitive subjects
(24.1%) shifted their mode of judgment to probabilistic; 6
of them adopted the sample ratio (1%) as their answer while
the remaining one made some calculation using probability
theory. Nine intuitive subjects (33.3%) in problem C gave
the same answer to problem E, unaffected by the extreme base
rate. Their main reasonings are listed below:

1. I focus on the character. (S7E)

2. Just like the taxi problem, ratio does not have much
meaning. (S40E)

3. Aged 45, a lawyer with no interest in social and
political affairs, not likely. (S29E)

4. It has relation with the character, not number. (S22E)

5. I base my judgment on the description, statistical
data has no relation. (S31E)

6. The information is so strong. (S6E)

7. The description can hardly be a successful lawyer.
(S9E)

8. No social and political interest, like math puzzle, it
is an engineer. (S37E, S35E)

For the probabilistic judges of problem C, all of them
remained probabilistic for problem E. The overall shift of
judgment between subjects in problem C and problem E is
statistically significant, McNemar's chi-square = 6.0, p<.02.

 

Table 7

Effect of Extreme Base Rate on Judgment

                            (E)
                    Intuitive   Probabilistic
(C) Intuitive          22             6
    Probabilistic       0            12

 

 

 

 

Problem context (social vs. physical) and mode of
judgment. The base rate fallacy is often documented with
life-like problems, e.g. judging a witness's reliability
in court or inferring a person's career from his
character. One might wonder how subjects, usually college
students, can err on base rate problems and yet be able to
study advanced mathematics courses. One hypothesis is that
students with some background in introductory probability
theory are more prone to intuitive judgment for life-like
problems and to probabilistic judgment for more
mechanical, physical problems. Our sample of subjects is
particularly appropriate for testing this hypothesis, as all
of them have learned at least some elementary probability
theory in high school. Perhaps the social and physical
worlds, as perceived by people, differ enough that people
use different modes of judgment for the two worlds.

Problem F is designed to be a physical, mechanical
problem that is concerned with the accuracy of a machine
with computer vision on a test document which contains
ellipses and circles in different proportions. This problem
is highly comparable to problem A or B because the latter
problems are concerned with the accuracy or reliability of
the witness, while the cabs of different colors have
different proportions or different prior accident rates.
The crucial difference between problem A or B and problem F
is the problem context: the former are life-like and in
the social world, while the latter is mechanical and within
the domain of the physical world.

Our results confirm our prediction: the shift of
judgmental mode, mainly from intuitive to probabilistic, is
statistically significant. McNemar's chi-square was 8.07,
p<.01 between problem A and F and 6.25, p<.02 between
problem B and F.
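
As an illustrative check that is not part of the original analysis, the McNemar statistics reported for the mode shifts can be recomputed from the off-diagonal cells of Tables 6 to 9. The sketch below assumes the uncorrected form of the statistic, (b - c)^2 / (b + c), where b and c are the numbers of subjects who changed mode in each direction. That form is inferred from the reported values rather than stated in the text, and the function name is hypothetical.

```python
def mcnemar_chi_square(b: int, c: int) -> float:
    """Uncorrected McNemar chi-square from the two discordant cell counts.

    b: subjects who moved from intuitive to probabilistic
    c: subjects who moved from probabilistic to intuitive
    (Assumed form: no continuity correction.)
    """
    if b + c == 0:
        raise ValueError("McNemar's test is undefined without discordant pairs")
    return (b - c) ** 2 / (b + c)


# Off-diagonal counts copied from Tables 6 to 9.
shifts = {
    "C vs D (Table 6)": (11, 0),
    "C vs E (Table 7)": (6, 0),
    "A vs F (Table 8)": (13, 2),
    "B vs F (Table 9)": (13, 3),
}

results = {name: round(mcnemar_chi_square(b, c), 2) for name, (b, c) in shifts.items()}

for name, value in results.items():
    print(f"{name}: chi-square = {value}")
```

With these counts the sketch gives 11.0, 6.0, 8.07 and 6.25, which agree with the values reported in the text.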

 

 


In problem F, many subjects who turned to the probabilistic
mode did not give specific reasons for doing so.
Nevertheless, the significance of the difference between a
human affair and a machine in affecting judgment, as
perceived by our subjects, can be traced from the following
clues. The first five reasonings came from intuitive
subjects.

1. It is a machine, more mechanical, therefore its
probability should be 70%. (S20F)

2. It is a machine, not a man. It does repeated actions,
its error rate should be the same. (S22F)

3. Computer is rather 'dead' thing. When the computer has
made an answer, the original document's ratio does not
reflect the error. (S23F)

4. Because it is a machine, I would trust more about its
reliability. (S27F)

5. It is mechanical, more mathematical, it is different
from the previous personality problem. (S39F)

6. This is a machine, not a human being. Therefore it is
a pure math problem. (S9F, a probabilistic subject)

 

 


 

Table 8

Effect of Physical Context on Judgment (I)

                            (F)
                    Intuitive   Probabilistic
(A) Intuitive          12            13
    Probabilistic       2            11

 

 

Table 9

Effect of Physical Context on Judgment (II)

                            (F)
                    Intuitive   Probabilistic
(B) Intuitive          11            13
    Probabilistic       3            11

 

Comparison of Consistency between Intuitive and
Probabilistic Judgment

In this section, we ask the question: Which mode of
judgment is more susceptible to change under different
problem contexts? This question will be answered globally
across all the problems and specifically for each
theoretically meaningful pair of problems.

To answer the question globally, the subjects were
divided into intuitive and probabilistic subgroups
according to the mode of judgment they used in
problem A. Then, for each subgroup, Cochran's test for
related observations (Conover, 1980, p.199) was applied to
test for the omnibus treatment effect of problem context
on judgment for the remaining five problems. If the
treatment effect is found to be significant, it means that
the particular subgroup of subjects has significantly
changed its mode of judgment among the other five problems,
i.e. that these subjects change their strategy of judgment
under the influence of some problem contexts. For the
probabilistic subjects, operationally defined, the treatment
effect was just marginally significant with a chi-square of
9.49, df=4, p=0.050. However, the corresponding chi-square
for the intuitive subgroup was 18.7, df=4, p=0.0009. The p
values also act as a measure of the strength of the treatment
effect here. The results indicated that the treatment effect
of the problem contexts seemed to be stronger for the
intuitive subgroup than for the probabilistic subgroup, and
therefore the intuitive subgroup changed more. That is to
say, probabilistic subjects tended to apply the same strategy
across all problem contexts, while the intuitive subjects
varied their strategies when facing the different problem
contexts.

The above global difference between the strategies of
the intuitive and probabilistic people can also be examined
specifically for those theoretically meaningful pairwise
comparisons which had a significant treatment effect. The
distribution of subjects for those pairwise comparisons with
a significant effect on mode shifts is organized in
Table 10. The usual chi-square test for no
association was applied. A significant chi-square would mean
that the proportion of subjects who changed from one problem
to another was dependent on the subjects' initial mode of
judgment, i.e. intuitive or probabilistic. In
other words, the proportion of changers was different for
the two modes of judgment (Bishop, Fienberg and Holland,
1988).

Three out of the four comparisons yielded a
significant chi-square value, which rejected an equal
consistency pattern for the two judgmental modes. This seems
to reveal that when there was a change of judgment between
two problems, the proportion of intuitive people who changed
judgment was higher than that of the probabilistic
people. The specific results were consistent with the global
results.

Table 10

Relative Consistency of Judgmental Mode under Various
Comparisons

Comparison    Initial mode     Same   Different   Chi-square
(C) to (D)    Intuitive         17       11       6.02 (p<.02)*
              Probabilistic     11        0
(C) to (E)    Intuitive         22        6       3.03 (p>.05)
              Probabilistic     12        0
(A) to (F)    Intuitive         12       13       4.8  (p<.05)
              Probabilistic     11        2
(B) to (F)    Intuitive         11       13       3.89 (p<.05)
              Probabilistic     11        3

* Individual p values might change slightly due to
the total number of comparisons made.
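
As a further illustrative check that is not part of the original analysis, the chi-square values in Table 10 can be recomputed from the "Same" and "Different" counts of each 2x2 comparison. The sketch below assumes the ordinary Pearson chi-square for a 2x2 table without continuity correction, which is consistent with the tabulated values. The function name is hypothetical.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the 2x2 table

            Same  Different
    row 1     a       b
    row 2     c       d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# (intuitive same, intuitive different, probabilistic same, probabilistic different)
# Counts copied from Table 10.
comparisons = {
    "C to D": (17, 11, 11, 0),
    "C to E": (22, 6, 12, 0),
    "A to F": (12, 13, 11, 2),
    "B to F": (11, 13, 11, 3),
}

chi_squares = {name: round(chi_square_2x2(*cells), 2) for name, cells in comparisons.items()}

for name, value in chi_squares.items():
    print(f"{name}: chi-square = {value}")
```

With one degree of freedom the 0.05 critical value is 3.84, so the comparisons C to D, A to F and B to F exceed it, while C to E (3.03) does not.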

 

 

 

 


Effects of Culture, Sex, Age, Major and Bayes' Knowledge on
Judgment

From our results in the chapter on the traditional
analysis of subjects' numerical responses, we found that the
results replicated to a high degree those of research
formerly done on American and Israeli subjects. There is no
reason to suspect that the Chinese subjects' responses are
highly different from the pattern of responses in the West.

Our subjects were coded as belonging to either arts and
social science or natural science and engineering. They were
also asked whether they had learned Bayes' rule before.
Crosstabulations between major area and mode of judgment
yielded no significant chi-square for the test of
independence for any problem used in this study. Thus major
area of study does not seem to affect whether people use the
intuitive or probabilistic mode of judgment. Chi-square tests
for the effect of knowledge about Bayes' rule also yielded no
significant results. In our sample of subjects, all of whom
knew at least some simple probability theory, knowing
Bayes' rule is probably an indicator of better statistical
knowledge. Our results showed that judgmental mode was not
affected by better statistical knowledge at all.

Crosstabulation between sex and mode of judgment
yielded a significant chi-square for problem F only;
chi-square was 4.51, p=0.0337. Our female subjects seemed to
view the problem about machine vision from a more
intuitive perspective than the males. However, in general
the sex effect is not dominant.

 

Table 11

Effects of Sex, Major and Knowledge about Bayes' Rule
on Judgment

Chi-square statistics with corresponding probability
values shown in parentheses.

Problem   N    Sex              Major            Knowledge of
                                                 Bayes' Rule
A         39   0.300  (.584)    0.014  (.905)    0.551  (.458)
B         39   0.0410 (.839)    0.742  (.389)    2.839  (.092)
C         40   1.91   (.168)    0.476  (.490)    0.0770 (.781)
D         39   0.0332 (.855)    2.17   (.140)    0.300  (.584)
E         40   1.62   (.204)    0.404  (.525)    0.331  (.565)
F         39   4.51*  (.0337)   0.0144 (.905)    0.365  (.546)

Note * : Significant at 0.05 level

 

 

Correlations between the subject's age and his mode of
judgment across all problems were also computed. None of
them were significant. The correlation coefficients are

tabulated below:

 

 

 

Table 12

Correlation between Age and Mode of Judgment

Problem   Correlation    N     p
A          -0.0446       37    0.794
B          -0.1949       37    0.248
C           0.0219       38    0.896
D           0.2058       37    0.222
E          -0.0520       38    0.756
F          -0.0466       37    0.784

 

 

 

 


 

DISCUSSIONS

Histograms of our subjects' responses confirmed the
bi-modal or multi-modal nature of the responses to the base
rate problems. The usual analysis based on the mean or
median is called into question. Our study demonstrated that
intuitive and probabilistic modes of judgment can be
successfully delineated in the base rate problem.
The probabilistic mode of judgment conforms to the calculus
of chance, or the "urn" model, and involves explicit
application of axiomatic probability theory, such as the
multiplicative rule or the conditional probability rule. The
intuitive mode of judgment does not rely on abstract
axiomatic theory. Relevance, importance and weights of the
evidence are also considered. Logical deduction and
narratives are also used to fill in the gaps of the given
information.

There does not seem to be any difference in numerical
responses between our Chinese subjects and the American or
Israeli subjects of some earlier research. Whether someone
is in arts or science does not seem to affect what mode of
judgment is used. Sex has a small effect. In 2 out of 6
problems, more female subjects appear to be judging
intuitively than males.

 

 


The context of the base rate problem does seem to affect
the distribution of the mode of judgment in our sample of
subjects. The distribution of subjects turns towards being
more probabilistic under the following problem contexts:

1. Diagnostic information with low diagnosticity for
judgment.

2. Extremely low or high base rate.

3. A purely physical or mechanical context.

While the third result is a new discovery, results 1 and 2
parallel former research about factors affecting the
use of base rate in social judgment.

There is no significant difference between the
distributions of judgment for the cab problem with a causal
base rate and that with a non-causal base rate. Former
research (Tversky and Kahneman, 1980) claimed that
more subjects used the base rate under the causal condition
because the median response under the causal condition (60%)
was closer to the base rate (15%) than that under the
non-causal condition (80%). From our data, it appears that
although 5 out of 25 subjects did lower their response under
the causal condition, which decreased the median response to
70%, they were still using intuitive judgment. Tested from
the standpoint of judgmental mode, only 3 subjects changed
mode and the result was not statistically significant.

 

 

 


Chi-square tests of independence and Cochran's test
for related observations revealed that probabilistic judges
are less susceptible to judgmental mode shift than the
intuitive subjects. It appeared that probabilistic judges
just plug in the numbers by some probability rule, even if
an incorrect one, and remain relatively unchanged by the
experimental manipulations. Intuitive subjects would
consider the experimental information and shift to the
probabilistic mode when they regarded it as necessary.

Qualitative Aspects of Intuitive Judgment

The probabilistic mode of judgment, which complies with
the calculus of probability, is not very interesting per se,
at least not as interesting as the functioning mechanism of
the intuitive mode of judgment. The reason is that a high
proportion of subjects, ranging from 35% to 70%,
exhibited the intuitive mode of judgment in the sample of
problems we tested. Many interesting questions arise,
such as whether people who make intuitive judgments are
irrational, or whether it is just a matter of education. How
do politicians, physicians and bankers conduct their
business, presumably through intuitive judgment? Can we trust
our judges and juries if they are intuitive thinkers?
Alternatively, besides knowing its "errors", can we learn
anything worthwhile from intuitive judgment? Could
artificial intelligence learn something from intuitive
judgment? This list can go on indefinitely. Having gone
through the experience of interviewing 40 talented and
educated people about how they make judgments on those base
rate problems, I try to present my opinions, if only partial
answers, to the above important questions.

A theoretical world is like a base rate problem. The
problem consists of two and only two pieces of important
information: the base rate and the diagnostic information.
There is a theoretical optimum, obtained by applying Bayes'
formula to these two pieces of information. The numerical
answer is exact, or accurate to the number of decimal places
we desire. The theoretical world abstracts the real world and
is therefore not the real world. Depending on the quality of
such abstraction, the theoretical world represents the real
world with varying fidelity.
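
To make the "theoretical optimum" concrete, the following minimal sketch applies Bayes' formula to the two pieces of information in a base rate problem. It assumes the cab problem in its commonly cited form, in which 15% of the cabs are blue and the witness identifies colors correctly 80% of the time; the function and parameter names are hypothetical and serve only for illustration.

```python
def posterior(base_rate: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Posterior probability of a hypothesis H given one piece of evidence.

    base_rate:        P(H), the prior probability
    hit_rate:         P(evidence | H)
    false_alarm_rate: P(evidence | not H)
    """
    numerator = base_rate * hit_rate
    denominator = numerator + (1.0 - base_rate) * false_alarm_rate
    return numerator / denominator


# Assumed cab problem figures: 15% of cabs are blue, and the witness
# says "blue" correctly 80% of the time and wrongly 20% of the time.
cab_posterior = posterior(base_rate=0.15, hit_rate=0.80, false_alarm_rate=0.20)
print(f"P(cab is blue | witness says blue) = {cab_posterior:.3f}")  # 0.414
```

Under these assumed figures the normative answer is about 41%, well below the 80% reliability of the witness, which is why responses near 80% are conventionally read as neglect of the base rate.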

A real world is like the world we live in. We can
doubt. We can ask questions and are given some answers. We
can challenge authority. We can find out more about
something if we are not sure. There are established rules for
doing certain things. We perceive quality; some things are
well done and some are not. We know that people tell lies; we
do not believe everything.

 

 


Both the intuitive and the probabilistic modes of judgment
belong to the real world. The intuitive mode is with us all
the time, and the probabilistic mode exists only when we are
making abstractions or theorizing.

The base rate problem with its Bayes' solution is like
a faultless world. There is absolutely no doubt or problem
about anything, except that you are supposed to make a
probabilistic estimate of the diagnostic information (e.g.
given the described characteristics of the person, what
is the probability that this person is an engineer?). Then
this probability estimate together with the base rate is
supposed to be entered into Bayes' formula to obtain the
optimal solution to the problem. The intuitive subjects do
not work like this. They do not take the information for
granted. They appear to function in a complex realm, as if
in the real world. The following discussion, supported by
the subjects' reasonings, highlights the
special cognitive and meta-cognitive aspects of the
intuitive mode of judgment. The conditions under which
the intuitive mode of judgment would err are also discussed.

 

Intuitive judges do not take the information for
granted. People in an intuitive mode of judgment do not
take things for granted. You cannot easily get them to obey
by saying: "Forget about everything else, just give me an
answer by looking at the two given key pieces of
information," because that is not the usual way judgment
works in the real world. These people challenge the logic
implied by the question, for example by saying: "More cars
don't mean they must crash more." We all know the simple fact
that a student who studies longer hours might not do better
than a brighter student who studies less. A person is
generally considered less intelligent if he or she can only
follow what he or she is told and questions nothing. Relevant
excerpts from subjects' reasonings are listed below. (In the
codes, S stands for subject and the final letter for the
problem; the numeral in between is the subject number.)

1. Many (more) cars don't mean they must crash (more).
(S31A)

2. A car was involved ... Two cab companies ... It
doesn't mean that the car involved in the accident is
a cab! (S8A)

3. When the accident happens, maybe green cars are not
around the scene. (S23A)

4. Maybe blue cab's business is better. (S29A)

Intuitive judges look for more relevant information.
There is evidence that the intuitive subject tries to obtain
all the important information relevant to the problem he
wants to judge, just as he would in a similar
situation in the real world. Of course, he cannot do so in
an experiment. He must supply his own assumptions, drawn
from his experience. In the real world there is a dialogue
between a person and his world. He can always ask for more
information or search for it on his own. No such thing exists
in an experiment. Experimental results might well differ in
an imagined experiment which supplied any additional
information the subject wanted. Without the information
considered crucial in judging a given case, the subject can
only make his own assumptions, or quit if he is permitted to
do so. (Note: Some subjects asked me for additional
information which I did not have, and certain subjects
told me that it was difficult for them to judge
without knowing more about the case.) Evans (1989) missed
this point by considering that the additional information
subjects made up led them towards wrong judgment (against
the norm in the theoretical world). Some examples are as
follows:
1. Other relevant factors: driver, car's machinery. Maybe
blue higher (better) than green. (S22A)

2. Other similar data (about the witness) might be
proposed (needed). (S21A)

Intuitive subjects supply their own knowledge and
assumptions. When the subject has to supply his own
assumptions to fill in gaps of the problem in order to
judge, he draws his assumptions from his repertoire of
knowledge and beliefs useful to the given situation. For
example, our subjects think about what a car accident or a
court investigation is like, or about the personality of some
engineers they know or have heard of. This information is
life-like and thus usually comprises many factors and
dimensions. As a result, these assumptions will in general
exceed or contradict the assumptions that the problem
intends. Nevertheless, these assumptions of knowledge and
belief might be wrong or inaccurate in themselves, or they
might be wrongly applied to the given situation. This is an
example of judgmental error in the real world. Some
examples from our subjects' reasonings are as follows:

1. Lawyer is not liked by the colleagues. (S7D)

2. Engineer works alone. (S40D)

3. Engineer work independently. (S34D)

4. Engineer work in team. (S14D)

5. Lawyer is very competitive, unlikely be liked by
colleagues. (S20D)

6. Lawyer like this cannot be successful. (S9C)

7. An engineer is freer to discover and is allowed to
have mistakes. (S23D)

8. Aged 45, a lawyer and have no interest in social and
political affairs is unlikely. (S29E)

9. Since there are fewer blue cars (in the city), witness
will pay more attention if it (the involved car) is
blue. Percentage (the witness's ability in recognizing
colors) should be higher than 80% (the given). (S33A)

In the real world, the resources of time and material
available to any given person for a certain purpose are
limited. One cannot obtain all the information one wants. One
must plan to obtain the most important information relevant
to the situation in the most economical way. Of course, one
cannot always do this optimally. One is then stuck with
information of second-class value, or misses the chance of
getting the information at all. This may be why people err in
the real world.

Intuitive judges balance information by its relevance.
People in the real world consider the importance, value and
relevance of things. Our subjects look at pieces of
information as if weights were attached to them. They decide
subjectively what is relevant and what is important. They can
compare the importance of any piece of information in the
given case that is relevant to finding the solution, not just
the two pieces of information the experimenter intends.


A person functioning in the real world suffers from
limited cognitive ability. Our brain cannot recall or
compute like a computer. Our mind works mostly in discrete
levels, seldom on a continuum. For example, our study shows
that intuitive people can become probabilistic when the base
rate is made very extreme, as when it changes from 30% to 1%.
Bayes' rule can give an answer no matter how slight the
change in the base rate is. But a person can only respond
when the change is subjectively detectable and is felt to be
of a significant magnitude. Moreover, when the situation is
complex and has numerous dimensions, a person might not be
able to summarize all the information to a level that he can
manage cognitively. He is then bound to make errors in
judgment.

Excerpts which demonstrate the importance of relevance
are:

1. I focus on the character. (S7E)

2. Just like the taxi problem, ratio does not have much
meaning. (S40E)

3. It has relation with the character, not number. (S22E)

4. I based my judgment on the description, statistical
data has no relation. (S6E, S31E)

5. The experimental evidence is primary, the frequency
data is only secondary. (S7A,B)

6. It's a single event, witness is more important. (S24A)

7. I trust repeated experiment. (S39A)

8. I trust witness, the rate is irrelevant. (N=11,A)

9. The past and the present are independent. (N=17,B)

10. I judge according to the error percentage, ratio has
no influence. (N=14,F)

11. The description is vague, it can be either an engineer
or a lawyer. (S6,19,22,21,28,D)

Excerpts which illustrate the intuitive subject's use of
balancing information are:

1. Lower witness's rate, since the two rates are very
different now. (S27A, S30A)

2. It has a higher accident rate, we should lower the
witness's reliability. (S33,26,1,28,30,B)

3. The majority is circle. Lower the error percentage.
(S27F)

4. The extreme base rate has effect, but it is like an
engineer. It should be higher than 1%.
(S19,21,34,14,27E)

5. It is an engineer. Since there is only one engineer,
we should lower the probability.
(S24,39,30,1,28,33,20E)

6. We are not given any (useful) information, we have to
depend on the earlier data. (S9D)

 


Implications for the Debate about Human Rationality

Research in human judgment over the last decade has been
a debating ground about human rationality
(e.g. Cohen, 1981). The proposal that humans are not as
rational as they might seem came from experimenters who
discovered that people fail to meet the norms in
many deductive and inferential tasks (e.g. Evans, 1989;
Kahneman & Tversky, 1982; Nisbett & Ross, 1980). The defenses
of human rationality were usually of a theoretical nature.
For example, Cohen (1981) presented the views that some
subjects might be functioning in some other equally valid
concepts of probability, one of which belongs to the
Baconian (non-Pascalian) probability elaborated in Cohen (1977). But the
question is: if the other type of probability is equally
valid, why don't subjects reach answers close to the
mathematical or statistical norm? White (1984) suggested
that practical judgment is concrete, everyday and
unstructured, and that the objective of judgment is not to
produce an outcome that is right in a normative sense but an
outcome that satisfies practical concerns. Still, this does
not explain very well why people fail in experiments about
judgment and why mathematical and statistical norms can be
applicable. There are data which support the experimenters,
while the theorists usually present no empirical support.
For this reason, the debate has not been settled. I
would present a different view of the debate, based on the
qualitative and quantitative study that I conducted.

The majority of people cannot reach the exact answer to
a mathematical or statistical problem. That is for sure. It
takes years of education to learn higher-level
mathematics. In my experiment on 40 Chinese students, mostly
graduate students, only 22 (55%) claimed that they had
learned about Bayes' theorem before. However, none of them
remembered the formula offhand. Only three subjects in my
sample used conditional probability to reach the same answer
required by Bayes' rule, in five instances. If most
people cannot reach the mathematical norm, doesn't that mean
they are not rational enough to function in the modern
world full of uncertainty?

With the insight from my experiment, I would answer
that if people always automatically and mechanically applied
Bayes' theorem in their judgment, that would show only that
these people lack the normal cognitive and meta-cognitive
intelligence for judgment in the real world. As I have
discussed before, real world phenomena as presented to
us are often incomplete, untrue or lacking in relevant
information unless we search for it. Problems in the real
world are always multi-dimensional. Information has levels of
quality. The real world is not the faultless world of the
Bayes' problem on which researchers intend to test their
subjects. Our study documented that subjects do not take
"evidence" for granted. They question authoritative judgment,
ask for more information, or search for
additional information themselves if they consider it
necessary and possible. The subjects have past knowledge and
beliefs that they bring to the situation to make the best
judgment. They weigh the evidence by considering its
relevance, credibility and importance. Although limited in
cognitive capacity, they summarize information and balance
the weights of the information. They also use logic to
examine the quality, weight and credibility of the evidence.

These characteristics of intuitive judgment,
empirically supported by our data, are qualities that are
unmatched by a simple-minded and mechanical application of a
mathematical or statistical formula. The reason is again
that real world problems are generally complex, incomplete
and non-routine. For example, our subjects had reasons to
question whether, besides the quantities of cabs, other
things were equal. They questioned whether the business,
machinery and driver training of the two cab companies
were the same. These are important factors which could
affect the decision making. Without this information, the
intuitive subjects refused to accept the sizes of the cab
companies as the appropriate signal for the likelihood of
having accidents. A mechanical application of a mathematical
rule would not take care of other possibly important but
unconsidered factors. Without the crucial information, a
sensible person would either search for additional
information or supply his own assumptions and beliefs.
However, a simple application of some mathematical norm could
work only on the given limited data, which might be
irrelevant or wrong. An intuitive subject can decide whether
a piece of information is relevant or not, and he or she can
weigh the evidence by its relevance. Obviously, plugging
numbers into some pre-determined formula simple-mindedly
would not consider the weights of information in a sensible
way. The reason that subjects fail to meet the mathematical
norm in experiments is that these subjects are employing
intuitive judgment, just as they would do in the real world.
They employ cognitive and meta-cognitive skills that go
beyond the information given, the boundary that the
experimenters want to impose. One cannot really determine
whether the subject's intuitive judgment or the
experimenter's normative judgment is right. Looking from the
perspective of the real world, the experimenter who expects
subjects to plug in a formula without questioning the details
of the given information is simple-minded. Alternatively,
from the perspective of the theoretical world, the subjects
are just wrong, doing "unnecessary" things and bringing in
"redundant" information within an experiment.

Well, does it mean that people's intuitive judgment
never errs? By no means. From our discussion in the last
section, we suggested that people could make mistakes in
most of their cognitive and meta-cognitive thinking. For
example, people can be over-suspicious, head in the wrong
direction for information, or put off decisions. The
beliefs and knowledge that they bring to the situation can
come from a poor and distorted memory, or from a wrong
interpretation. Cognitive capacity can also limit the
ability to detect minute changes in the environment. When the
situation becomes more complex, people might not be able to
summarize information properly and might fail to balance the
weights or relevance of the information correctly.

What, then, is the role of mathematical and statistical
norms for judgment in the real world? Mathematical and
statistical norms can be used to enrich our knowledge and
belief bases. However, these norms must not be applied
mechanically to the real world without using the
above-mentioned cognitive and meta-cognitive skills of
intuitive judgment. Whether a mathematical or statistical
formula adequately represents the real world for real-world
purposes must first be assessed using intuitive judgment.

Finally, does this mean that the intuitive mind always
functions better than a computer? A computer that can only
apply mathematical norms mechanically would fail to compete
with the intuitive mind in many real-world issues. However,
the speed and storage capacity of a computer usually surpass
those of a person. To design an artificial intelligence that
can make decisions in the complex real world, we must model
the machine after the cognitive and meta-cognitive abilities
of human intuitive judgment described above. That is to say,
the artificial intelligence should include a search for
additional relevant information, going beyond the given
situation and assumptions. It must be able to supply its own
assumptions and knowledge when the given problem does not
comply with the usual framework of analysis. It must also
decide the relevance of information with respect to the
problem being solved. It should be able to balance a large
number of pieces of information according to the relevance
of each. Then the advantages of speed, storage and precision
of the computer can make up for the relatively limited
capacity of the ordinary person. A computer can also be
designed to help make judgments when the situation becomes
so complex that a person cannot summarize, weigh or balance
the overloaded information properly. Motivational problems
in human judgment such as excessive emotion, prejudice,
vested interest or fatigue also give a rationale for machine
decision making. In any case, however, the cognitive and
meta-cognitive skills of intuitive judgment have to be given
a dominant position in the design of machine decision
making.

Implications for the Debate about Clinical vs. Statistical
Prediction

The clinical vs. statistical prediction problem has
been an important debate in psychology for the last several
decades (Meehl, 1954, 1986; Holt, 1958, 1986). Because of
the apparent similarities between clinical judgment and the
intuitive mode of judgment, our descriptions of the
characteristics of intuitive judgment can have some
implications for this debate. According to the last
section's discussion, clinical prediction can benefit from
exercising the advantages of intuitive judgment in real life
clinical judgment. In theory and possibly in practice,
clinical judgment could take into account the unique
characteristic pattern of information about an individual
and the clinician can obtain additional relevant information
if necessary. Statistical prediction is less flexible in
this regard because usually the predictor variables are
determined in advance and are the same for every individual.
Clinical judgment can make use of the clinician's experience
and knowledge in understanding an individual when some
crucial information is lacking. A clinician can also decide
the weights of the evidence. In a common statistical
prediction framework, neither additional knowledge nor
weights of the information can be utilized.

Hence, despite the large amount of evidence showing the
superiority of statistical judgment (Kleinmuntz, 1990),
clinical judgment still has some important advantages over
statistical prediction. People who prefer statistical
prediction can argue that artificial intelligence can
replace all the above advantages of the intuitive elements
in clinical judgment, yet no AI program that can replace all
the judgmental functions of a physician or a clinical
psychologist can be developed in the foreseeable future.
Clinical judgment should also incorporate the knowledge of
statistical prediction as part of the evidence for judgment.
In this way, the advantages of both clinical and statistical
judgment can be utilized. Research that makes clinical
judgment more explicit can help improve the mechanisms as
well as the accuracy of clinical judgment, and can also
reduce its errors.

 

 

 

 

 

 

 


Cognitive Complexity as a Determinant of Judgment: A
Hypothesis

The present study has revealed several situational
factors that affect the mode of judgment people use, namely
diagnosticity, extreme base rate and physical mechanistic
contexts. However, none of the personal variables, including
sex, age, major area and prior knowledge of Bayes' rule,
showed any significant relationship with the mode of
judgment used. Demographically, there seems to be no
effective way to predict whether a person is predominantly
probabilistic or intuitive in judgment.

Nevertheless, some of our results offered a hint. The
probabilistic subjects seemed to adopt a single strategy of
judgment, while the intuitive subjects changed strategies
more often. Qualitative analysis also showed that intuitive
subjects attended to more differentiated aspects of the
problems. The probabilistic subjects largely applied some
over-simplified probabilistic rule to a limited amount of
given information across different problems. Hence one
possible personality determinant of judgment could be the
cognitive complexity-simplicity of a person. Bieri et al.
(1966, p.185) described: 'Cognitive complexity may be
defined as the capacity to construe social behavior in a
multidimensional way. A more cognitive complex person has
available a more differentiated system of dimensions for
perceiving others' behavior than does a less cognitive
complex individual'. As elaborated in the section on the
qualitative dimensions of intuitive judgment, intuitive
judgment seems to involve more complex evaluation and to
attend to a larger number of differentiated dimensions than
probabilistic judgment does. The similarity between
intuitive judgment and cognitive complexity in the
differentiation of dimensions of judgment suggests that
cognitively complex people might be more intuitive in their
judgment under uncertainty than cognitively less complex
people. Future research measuring the association of
cognitive complexity with the mode of judgment might
establish some useful personality determinants of the
judgmental mode.

Implications for Future Research

 

Continuing research on intuitive judgment can help in
understanding human decision making as well as in designing
machine intelligence. Qualitative studies on the decision
making of experts in scientific and social affairs are
recommended. Attention can be paid to how people make errors
in using the cognitive and meta-cognitive skills of
intuitive judgment, rather than to how people make errors
against the mathematical and logical norms within a
theoretical world.

Decision situations other than the base rate problem,
such as employment, marriage and business decision making,
can be studied by creating scenarios for people to judge and
by recording their reasoning as they make judgments.

The approach of establishing constructs through
qualitative methods and subsequently applying quantitative
analysis is recommended for future research in the
psychology of judgment, as well as in other fields of the
social sciences.

As hypothesized, cognitive complexity might be related
to the mode of judgment as a personality determinant. Future
research should test this hypothesis.

Since the subjects in this study were a special group
of overseas Chinese graduate students, the results cannot
immediately be generalized to the whole Chinese or American
population. Similar research using local Chinese subjects as
well as American subjects should be conducted and compared
with the present results.

 

 

Limitations

The present sample of subjects consisted mainly of
overseas Chinese graduate students at Michigan State
University. Although the numerical responses of our sample
and those of past research using Western subjects are
highly similar, hasty generalization to the comparable
American population is not advisable until it is confirmed
by collecting similar verbal reasoning data from American
subjects.

Despite the high reliability of the coding scheme, the
usefulness of the qualitative categories of the two modes of
judgment needs to be ascertained by future research in this
area.

The present experimental design, as demonstrated, could
reveal properties of intuitive judgment that make a case
for the defense of intuitive rationality against the attacks
of earlier researchers. However, whether the intuitive
judges had rightfully used the base rate information has to
be investigated within the framework of intuitive judgment,
not from within a faultless and mechanical normative
environment. Therefore, whether there is truly a defect in
intuitive judgment in making use of the base rate cannot be
empirically tested until the mode of intuitive judgment is
adequately described by further research along lines similar
to those of the present study.

 

Appendix A

Questionnaire of the Base Rate Problems

Instruction

"This study is to investigate the thinking process of people's
judgement of some uncertain everyday affairs. Please literally
read the questions and try to think aloud. I am not primarily
interested in your final solution, still less in your reaction
time, but in your thinking behavior, in all your attempts, in
whatever comes into your mind, to recount exactly what unfolds
in your consciousness, your hesitations, doubts, the ideas which
come into your mind, etc. Be bold and speak them out."

"I can assure you that the study has nothing to do with the
study of your IQ or personality. Unlike many psychological
experiments, there is no deception of any kind. There is no
so-called correct answer to the questions. Your response would
not be judged as right or wrong."

"While you are thinking aloud or answering during the
interview, the content will be tape-recorded for detailed
analysis. If you have any questions or any words you don't
understand, feel free to ask me. Do you have any questions
now? If not, shall we begin?"

 

 

 

 

(A)

"A car was involved in a hit and run accident at night. Two cab
companies, the Green and the Blue, operate in the city. You are given the
following data:

(a) 85% of the cabs in the city are Green and 15% are Blue.

(b) a witness identified the cab as Blue. The court tested the reliability
of the witness under the same circumstances that existed on the night of
the accident and concluded that the witness correctly identified each one
of the two colors 80% of the time and failed 20% of the time.

What is the probability that the cab involved in the accident was Blue
rather than Green?"

Answer ____ %

 

 

 

(B)

This problem is the same as the last one in all aspects
except that sentence (a) has been modified.

"A car was involved in a hit and run accident at night. Two cab
companies, the Green and the Blue, operate in the city. You are given the
following data:

(a') Although the two companies are roughly equal in size, 85% of cab
accidents in the city involve Green cabs and 15% involve Blue cabs.

(b) a witness identified the cab as Blue. The court tested the reliability
of the witness under the same circumstances that existed on the night of
the accident and concluded that the witness correctly identified each one
of the two colors 80% of the time and failed 20% of the time.

What is the probability that the cab involved in the accident was Blue
rather than Green?"

Answer ____ %

 

 

(C)

A panel of psychologists have interviewed and administered
personality tests to 30 engineers and 70 lawyers, all successful in their
respective fields. On the basis of this information, thumbnail descriptions
of the 30 engineers and 70 lawyers have been written. You will find below
a description, chosen at random from the 100 available descriptions.

"Jack is a 45-year-old man. He is married and has four children. He is
generally conservative, careful, and ambitious. He shows no interest in
political and social issues and spends most of his free time on his many
hobbies which included home carpentry, sailing, and mathematical
puzzles."

The probability that Jack is one of the 30 engineers in the sample of
100 is:

Answer ____ %

 

 

(D)

Everything being the same as the last problem, please
consider another description drawn from the same group of
people:

"Dick is a 30-year-old man. He is married with no children. A man of
high ability and high motivation, he promises to be quite successful in his
field. He is well liked by his colleagues."

The probability that Dick is one of the 30 engineers in the sample of
100 is:

Answer ____ %

 

 

(E)

Please consider the problem (C) again, supposing that instead
of 30, there is only one engineer in the group, all others being
lawyers.

The probability that Jack is that only engineer in the sample of 100
is:

Answer ____ %

(F)

A still developing machine of computer vision will commit error
randomly about 30% of any time. Suppose the machine declares a certain
figure to be an ellipse while given a trial document containing 90 circles
and 10 ellipses, estimate the probability that the figure is really an
ellipse.

Answer ____ %

 

 

(G)

"Given the information that (i.e. knowing that) the event B has occurred,
the probability for the occurrence of the event A is 6/10. Knowing that B
does not occur, the probability for the occurrence of the event A is 0.4. The
natural probability of occurrence for the event B is 1/10.

Now given that (i.e. knowing that) the event A has occurred, what is the
probability for the occurrence of event B?"

Answer ____

 


APPENDIX B

Two Protocol Examples

INTERVIEW PROTOCOL
Subject No: 38 Sex: M

A) There are two cases involved: green and failure; blue
and success. The probability is 85%*20%+15%*80% = 29%.

B) Method same as in (A).

C) The only clue is that he shows no interest in political
and social issues and ... like mathematical puzzles. He
seems more like an engineer. It implies that the probability
is 30/100. And since he is chosen from 100 people. The
probability is 1%*30%=3%.

D) The clue about him is even fewer, it is hard to judge.
It is hard to use calculations. Thinking with numbers here
has little use. The best way is not to calculate. Given
information is too few. <No answer, the subject does not
want to give an answer to this problem>

E) 1/100 * 1/100 = 0.01%. Picking one from 100, its own
probability is 1/100. (reasoning is similar to that in
problem C) Therefore the probability is 0.01%.

F) 70/100 * 10/100 = 7%.

G) P(AB)=0.6, P(AB')=0.4, P(B)=1/10. P(B/A)=P(AB)/P(A)=
0.4*0.01/(0.4*0.1+0.6*0.9)=6.8%. I haven't used Bayes'
theorem for several years already.

Statistical knowledge: One course of statistics at 2nd year
at undergraduate college.

Knowledge about Bayes' Theorem: learned before, remember
somewhat.

Major: Computer Science.
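
For comparison with this subject's attempt at problem G, Bayes' theorem can be applied directly. The derivation below is only a sketch and is not part of the original protocol. It takes the givens as the subject recorded them and assumes that his P(AB) and P(AB') stand for the conditional probabilities P(A|B) = 0.6 and P(A|B') = 0.4, with P(B) = 1/10.

```latex
\begin{align*}
P(B \mid A)
  &= \frac{P(A \mid B)\,P(B)}
          {P(A \mid B)\,P(B) + P(A \mid B')\,P(B')} \\
  &= \frac{0.6 \times 0.1}{0.6 \times 0.1 + 0.4 \times 0.9}
   = \frac{0.06}{0.42} \approx 14.3\%
\end{align*}
```

Under this reading, the subject's expression appears to interchange the two conditional probabilities, which would account for his lower figure.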

 


INTERVIEW PROTOCOL

Subject No: 40 Sex: F

A) I believe in the witness, believe in what he said. The
degree of belief is 80%. The occurrence rate of 85:15 has
little influence. We should believe in the witness. < 80% >

B) Here I am more certain. Accident rate is similar to the
past crime record. It is not right to suspect him (the one
with crime record) when we have a crime incident. Still 80%.

C) Shows no interest in political and social issues. A
Lawyer should care about the society, care about politics.
Originally it should be 50:50. Since he shows no interest,
it becomes 25:75; he likes mathematical puzzles, therefore
it becomes 12:87. (taking a further half from 25) The
probability is 87%.

D) He works well with his colleagues. Lawyers always work
together with colleagues, share and exchange opinions.
Engineers work more by himself. It is possible to be a
lawyer. The answer is 13% (using a similar method as problem
C, 100%-87%=13%). It only means the probability is very
small, the numbers doesn't mean very much. It only means it
is very small.

E) This is the same as doing the accident rate of cabs. It
(the ratio) doesn't have any meaning. The probability is
87%. I trust my judgment.

F) I have little concepts about numbers. Since the error
rate is 30%, the hit rate should be 70%. The method of
reasoning is the same as before (same as problem A, B).
< 70% >

G) 40%. B occurs and A occurs is 60%. Therefore A occurs and
B occurs is 40%.

Statistical knowledge: afraid of mathematics, learned
nothing about statistics.

Knowledge about Bayes' Theorem: Never heard about it.

Major: Musicology.

 


APPENDIX C

Classification of the Two Protocols

Here is an interview protocol from subject #38, a male

graduate student in computer science:

A) There are two cases involved: green and failure; blue

and success. The probability is 85%*20%+15%*80% = 29%.
Classification: probabilistic

Reason: subject explicitly employs the additive and

multiplicative rules of axiomatic probability
B) Method same as in (A).

Classification: probabilistic

Reason: same as above

C) The only clue is that he shows no interest in political
and social issues and ... like mathematical puzzles. He
seems more like an engineer. It implies that the probability
is 30/100. And since he is chosen from 100 people. The
probability is 1%*30%=3%.

Classification: probabilistic

Reason: subject actively employs the urn model (i.e.
30/100) for determining the probability of selecting an
engineer in a group of 100 people, 30 of which are
engineers. Besides, there is heavy reliance on the

multiplicative rule of probability theory.

D) The clue about him is even fewer, it is hard to judge.
It is hard to use calculations. Thinking with numbers here
has little use. The best way is not to calculate. Given
information is too few. <No answer, the subject does not

want to give an answer to this problem>

Classification: missing

Reason: subject decides that it is not appropriate for him
to give any answer to this problem.

E) 1/100 * 1/100 = 0.01%. Picking one from 100, its own
probability is 1/100. (reasoning is similar to that in

problem C) Therefore the probability is 0.01%.

Classification: probabilistic

Reason: subject's pattern of reasoning is the same as

in problem C above.

F) 70/100 * 10/100 = 7%.

 

 


Classification: probabilistic

Reason: subject solely relies on axiomatic probability

theories.
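
Subject 38's answer to problem F multiplies two of the given figures. The short Python sketch below, which is illustrative and not part of the original analysis, shows how that product relates to a Bayes' estimate. It assumes (the problem does not say so explicitly) that the machine's 30% error means a circle is declared an ellipse or an ellipse a circle; all variable names are introduced here for illustration.

```python
# Figures from problem F; the two-way error model is an assumption.
p_ellipse = 10 / 100      # share of ellipses in the trial document
p_circle = 90 / 100       # share of circles in the trial document
error_rate = 0.30         # machine errs about 30% of the time
hit_rate = 1 - error_rate

# Subject 38's product: P(figure is an ellipse AND machine says "ellipse")
joint = hit_rate * p_ellipse

# P(machine says "ellipse"), from correct and mistaken declarations
p_declared = hit_rate * p_ellipse + error_rate * p_circle

# Bayes' rule: P(figure is an ellipse | machine says "ellipse")
posterior = joint / p_declared

print(f"joint = {joint:.1%}, posterior = {posterior:.1%}")
```

Under these assumptions the product is 7%, the subject's answer, while the posterior is about 20.6%: the subject's figure is the numerator of Bayes' rule without the normalizing denominator.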

 

 

And the following protocol comes from a female student

(subject # 40) with a major in music:

A) I believe in the witness, believe in what he said. The
degree of belief is 80%. The occurrence rate of 85:15 has

little influence. We should believe in the witness. < 80% >
Classification: intuitive

Reason: subject does not employ any probability theory,
considers the base rate as having little or no influence,
and trusts the individual case (witness) more than the
statistical data.

B) Here I am more certain. Accident rate is similar to the
past crime record. It is not right to suspect him (the one

with crime record) when we have a crime incident. Still 80%.
Classification: intuitive

Reason: subject again does not rely on probability
theory, considers the past as not determining the present
case, and believes in the individual witness information.

C) Shows no interest in political and social issues. A
Lawyer should care about the society, care about politics.
Originally it should be 50:50. Since he shows no interest,
it becomes 25:75; he likes mathematical puzzles, therefore
it becomes 12:87. (taking a further half from 25) The
probability is 87%.
Classification: intuitive

Reason: subject does not rely on any formal probability
theory. Although subject does manipulate with some numbers,
she does it in a self-made way so as to express her degree
of belief with respect to the evidence. She also uses the
IF—THEN criterion : if one is a lawyer, one should care
about politics, to determine the career from the given

information.
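
The self-made numerical procedure in this protocol can be written out explicitly. The Python sketch below is an interpretation rather than the subject's own formulation: it assumes she starts from an even 50:50 split and halves the lawyer share once for each cue that points away from a lawyer. The function name `halve_for_each_cue` is hypothetical.

```python
def halve_for_each_cue(start_share, n_cues):
    """Halve a percentage share once per cue and return what remains."""
    share = start_share
    for _ in range(n_cues):
        share /= 2
    return share

# Two cues: no interest in politics, and a liking for mathematical puzzles
lawyer_share = halve_for_each_cue(50.0, 2)   # 50 -> 25 -> 12.5
engineer_share = 100.0 - lawyer_share        # 87.5

print(lawyer_share, engineer_share)
```

The result, 12.5 to 87.5, corresponds to the subject's 12:87 once the halves are dropped. The base rate of 30 engineers in 100 never enters the procedure.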

D) He works well with his colleagues. Lawyers always work
together with colleagues, share and exchange opinions.
Engineers work more by himself. It is possible to be a
lawyer. The answer is 13% (using a similar method as problem
C, 100%-87%=13%). It only means the probability is very
small, the numbers doesn't mean very much. It only means it

is very small.
Classification: intuitive

Reason: creating narratives of a normal working
atmosphere of a lawyer and engineer to fill in the gaps of
information in this case. Subject does not predominantly

rely on some probability theories.

 

 


E) This is the same as doing the accident rate of cabs. It
(the ratio) doesn't have any meaning. The probability is

87%. I trust my judgment.
Classification: intuitive

Reason: reasoning is analogous to that for problems A and B
above.

F) I have little concepts about numbers. Since the error
rate is 30%, the hit rate should be 70%. The method of
reasoning is the same as before (same as problem A, B). <

70% >
Classification: intuitive

Reason: reasoning similar to that in problems A and B.
Subject relies heavily on information about the individual
case concerned, and does not rely on probability theory.
Surely the subject has to know at least that hit rate =
100% - error rate in order to comprehend the question. The
possession of such knowledge does not mean that the judgment
is predominantly in a probabilistic mode.

APPENDIX D

Histograms of the Numerical Judgmental Responses
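
Where a problem supplies both a base rate and a reliability figure, the Bayes' estimate referred to in the figure captions of this appendix can be computed directly. The Python sketch below is illustrative and not part of the original analysis; the function name `bayes_posterior` and its parameter names are assumptions introduced here. It uses the figures of problem A (15% Blue cabs, a witness who is right 80% of the time and wrong 20% of the time); problem B uses the same numbers.

```python
def bayes_posterior(prior, hit_rate, false_alarm_rate):
    """P(H | report of H) for a binary hypothesis H, by Bayes' rule."""
    true_report = prior * hit_rate                     # H holds and is reported
    false_report = (1.0 - prior) * false_alarm_rate    # H fails but is reported
    return true_report / (true_report + false_report)

# Problem A: the cab is Blue (prior 15%), the witness says "Blue"
cab_posterior = bayes_posterior(prior=0.15, hit_rate=0.80, false_alarm_rate=0.20)

print(f"P(Blue | witness says Blue) = {cab_posterior:.1%}")
```

With these inputs the estimate is about 41.4%, that is, 0.12 / (0.12 + 0.17).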

 

 

[Histogram not reproduced: response midpoint (percent) on the
vertical axis, frequency on the horizontal axis.]

Figure D.1: Histogram of the subjects' responses for
problem A. Bayes' estimate denoted by the arrow sign
<--.

 

 

 

[Histogram not reproduced: response midpoint (percent) on the
vertical axis, frequency on the horizontal axis.]

Figure D.2: Histogram of the subjects' responses for
problem B. Bayes' estimate denoted by the arrow sign
<--.

 

 

[Histogram not reproduced: response midpoint (percent) on the
vertical axis, frequency on the horizontal axis.]

Figure D.3: Histogram of frequency of subjects'
responses for problem C. Bayes' estimate not available
for this problem.

 

 

[Histogram not reproduced: response midpoint (percent) on the
vertical axis, frequency on the horizontal axis.]

Figure D.4: Histogram of the subjects' responses for
problem D. Bayes' estimate not available for this
problem.

 

 

[Histogram not reproduced: response midpoint (percent) on the
vertical axis, frequency on the horizontal axis.]

Figure D.5: Histogram of the subjects' responses for
problem E. Bayes' estimate not available for this
problem.

 

 

[Histogram not reproduced: response midpoint (percent) on the
vertical axis, frequency on the horizontal axis.]

Figure D.6: Histogram of the subjects' responses for
problem F. Bayes' estimate denoted by the arrow sign
<--.

 

 


LIST OF REFERENCES

Ajzen, I. (1977) Intuitive theories of events and the
effects of base rate information on prediction. Journal
of Personality and Social Psychology, 35, 304-314.

Bar-Hillel, M. (1980) The base rate fallacy in probability
judgment. Acta Psychologica, 44, 211—33.

Bieri, J.; Atkins, A.L.; Briar, S.; Leaman, R.L.; Miller, H.
& Tripodi, T. (1966) Clinical and Social Judgment: The
discrimination of Behavioral Information. N.Y. : John
Wiley & Sons, Inc.

Bishop, Y.M.M.; Fienberg, S.E. & Holland, P.W. (1975)
Discrete Multivariate Analysis. Cambridge: The MIT
Press.

Borgida, E. & Brekke, N. (1981) The base rate fallacy in
attribution and prediction. In J. H. Harvey, W. J. Ickes
& R. F. Kidd (Eds.) New Directions in Attribution
Research (Vol. 3). Hillsdale, N.J.: Erlbaum.

Bruner, J. (1986) Actual Minds, Possible Worlds. Cambridge:
Harvard University Press.

Conover, W.J. (1980) Practical Nonparametric Statistics.

Cohen, L.J. (1981) Can human irrationality be experimentally
demonstrated? The Behavioral and Brain Sciences, 4,
317-70.

Ericsson, K.A. and Simon, H. (1980) Verbal reports as data.
Psychological Review, 87, 3, 215—250.

Evans, J.St.B.T. (1989) Bias in Human Reasoning. UK:
Lawrence Erlbaum Associates Ltd.

Fischhoff, B., & Bar-Hillel, M. (1984) Diagnosticity and the
base rate effect. Memory and Cognition, 12, 402-10.

Gibbs, G., Morgan, A. & Taylor, L. (1980) A review of the
research of Ference Marton and the Goteborg Group.
Institute of Educational Technology, The Open University,
Study Methods Group, report no. 2.

Ginosar, Z. & Trope, Y. (1980) The effects of base rates and
individuating information on judgments about another
person. Journal of Experimental Social Psychology, 16,
228-42.

Ginosar, Z. & Trope, Y. (1987) Problem solving in judgment
under uncertainty. Journal of Personality and Social

Hammerton, M. (1973) A case of radical probability
estimation. Journal of Experimental Psychology, 101,
252-54.

Hinsz, V.B., Tindale, R.S., Nagao, D.H., Davis, J.H., &
Robertson, B.A. (1988) The influence of the accuracy of
individuating information on the use of base rate
information in probability judgment. Journal of
Experimental Social Psychology, 24, 127-45.

Holt, R.R. (1958) Clinical and statistical prediction: A
reformulation and some new data. Journal of Abnormal and
Social Psychology, 56, 1-12.

Holt, R.R. (1986) Clinical and statistical prediction: A
retrospective and would-be integrative perspective.
Journal of Personality Assessment, 50, 376-86.

Kahneman, D., Slovic, P. & Tversky, A. (1982) Judgment under
Uncertainty: Heuristics and biases. N.Y.: Cambridge
University Press.

Kahneman, D. & Tversky, A. (1973) On the psychology of
prediction. Psychological Review, 80, 237-51.

Kleinmuntz, B. (1990) Why we still use our heads instead of
formulas: Toward an integrative approach. Psychological
Bulletin, 107, 296—310.

Lyon, D., & Slovic, P. (1976) Dominance of accuracy
information and neglect of base rates in probability
estimation. Acta Psychologica, 40, 287-98.

Marton, F. (1981) Phenomenography - describing conceptions of
the world around us. Instructional Science, 10, 177-200.

Marton, F. & Saljo, R. (1984) Approaches to learning. In F.
Marton, D. Hounsell and N. Entwistle (Eds) The Experience
of Learning. Edinburgh: Scottish Academic Press.

Meehl, P.E. (1954) Clinical versus statistical prediction:
A theoretical analysis and a review of the evidence.
Minneapolis: University of Minnesota Press.

Meehl, P.E. (1986) Causes and effects of my disturbing
little book. Journal of Personality Assessment, 50,
370-75.

Newell, A. & Simon, H.A. (1972) Human Problem Solving.
Englewood Cliffs, N.J.: Prentice Hall.

Nisbett, R.E. & Ross, L. (1980) Human Inference: Strategies
and shortcomings of social judgment. Englewood Cliffs:
Prentice Hall.

Nisbett, R.E., Zukier, H., & Lemley, R.E. (1981) The
dilution effect: Nondiagnostic information weakens the
implications of diagnostic information. Cognitive
Psychology, 13, 248-77.

Strauss, A. (1987) Qualitative Analysis for Social
Scientists. Cambridge: Cambridge University Press.

Tversky, A. & Kahneman, D. (1980) Causal schema in judgment
under uncertainty. In M. Fishbein (ed.) Progress in
Social Psychology, 49-72. Hillsdale, N.J.: Erlbaum.

White, P. (1984) A model of the layperson as pragmatist.
Personality and Social Psychology Bulletin, 10, 333-48.

 

 

 
