This is to certify that the thesis entitled Using Decision-Analytic Techniques to Constrain the Parameters of and to Evaluate Models of Human Rational Decision Processes: With Application to Diagnosis and Treatment Decisions presented by Thomas Henry Whalen has been accepted towards fulfillment of the requirements for Ph. D. degree in Systems Science

USING DECISION-ANALYTIC TECHNIQUES TO CONSTRAIN THE PARAMETERS OF AND TO EVALUATE MODELS OF HUMAN RATIONAL DECISION PROCESSES: WITH APPLICATION TO DIAGNOSIS AND TREATMENT DECISIONS

By Thomas Henry Whalen

A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Electrical Engineering and Systems Science

1979

ABSTRACT

A new methodology for the study of human decision making processes is introduced. The method, called the "cartographic paradigm," observes a human decision maker's choices among costly sources of imperfect (probabilistic) information in order to make deductions about the internal cognitive processes by which a final decision is made. It does so by comparing the human's choices with those predicted by a formal theoretical model for various regions in the model's parameter space.
The dimensions of the parameter space are psychologically meaningful variables characterizing individual differences in decision-making approach, and the comparison between the model and actual human behavior is facilitated by a map of parameter space demarcating the regions which lead to different behavior patterns in response to a given decision-making task -- hence the name "cartographic paradigm".

The paradigm, which is intended for use with a particular class of theoretical models, is demonstrated in a small pilot study using one such model, the Myopic Conservative Bayesian Decision Maker. This model was found to be inadequate as an explanation of behavior at the levels of motivation and expertise studied, especially when data purchased later in the decision process tend to contradict the hypothesis favored by data purchased earlier.

As an additional check on the assumptions of the Myopic Conservative Bayesian Decision Maker, the pilot study also collected data on self-reported subjective probability according to a paradigm developed by Edwards (1968). Analysis of these data revealed subjective probability generally in excess of the optimal Bayesian estimate in the light of data selected and paid for by the subject; earlier studies in which the subject had no control over the amount or type of data showed the opposite effect.

Concepts such as differential processing of consistent and inconsistent data, batching of information purchases, and premature termination of information gathering are briefly discussed as candidates for inclusion in improved theoretical models. Opportunities for future research involving the paradigm include, in addition to laboratory studies, a continuing research project in structural analysis of problem complexity as related to human problem solving and an opportunity for applied research in the decision support system component of a management gaming system.
A successful program of research on human decision processes carried out according to the cartographic paradigm could benefit professional education in a number of fields by providing a multidimensional means of measuring the differences between students and expert decision makers in any given specialty. Research of this nature could also benefit designers of organizational and automated decision support systems by identifying feasible enhancements to a given decision maker's information processing capabilities, changing the parameters of the decision making process so that they lie in a region which improves expected performance without introducing elements alien to the thought processes of the human decision maker who must retain responsibility.

ACKNOWLEDGEMENTS

Thanks are due a number of persons and institutions for assistance and support during the preparation of this dissertation. First of all, I would like to thank my advisor, Dr. Sui-Nah Chan of the College of Human Medicine (Office of Medical Education Research And Development and the Dean's Office for Educational Programs) for his guidance, support and encouragement over the past two years, and Dr. Gerald Park of the Department of Electrical Engineering and Systems Science for his work as co-chairman of the guidance committee. Thanks also go to the other members of the committee: to Dr. Robert Schlueter, who provided a point of continuity to my entire graduate program by also serving on the guidance committees for my M.S. in Systems Science and my M.A. in Sociology; to Dr. John Kreer; and to Dr. Thomas Manetsch.

Informal seminars with fellow graduate students Allen Knapp, Douglas Franco and Suzanne Jennings served as important challenges to develop and organize my thinking.
Computer access was provided through the Medical Education Resources Sharing Network (CDC 6500 and Amdahl 470), the Department of Electrical Engineering and Systems Science (CDC 6500) and the Computer Institute for Social Science Research (Hewlett-Packard 2000). The text of the dissertation was prepared using the word-processing capability of the HP 2000.

Patricia Simon assisted in the entry of text and code into the computers, and conventional typing help was provided by Sharon Schwab and Marguerite Savage. Merald Clark drew figures 1 and 2 and inked in special characters and Greek letters.

The Michigan State Housing Development Authority facilitated my studies during the early part of the program by re-arrangements and reductions of my hours of work to fit my course schedule; later, support was provided by graduate assistantships within the Michigan State University College of Human Medicine.

Last, but not least, I am sincerely grateful to my wife, Evelyn, for her encouragement and understanding during the years of graduate study, and especially for reading and listening to innumerable preliminary fragments of this dissertation.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
1. INTRODUCTION
   1.1 OVERVIEW
   1.2 MYOPIC CONSERVATIVE BAYESIAN DECISION MAKER
   1.3 METHOD OF STUDY
   1.4 REVIEW OF RELATED RESEARCH
       1.4.1 Protocols
       1.4.2 Subjective Probability
       1.4.3 Weighting Coefficients
       1.4.4 Choice of Information
       1.4.5 Comparison With the Present Research
2. SYSTEM DEFINITION
   2.1 OVERVIEW
   2.2 DECISION-MAKING TASK
   2.3 DECISION MAKER
   2.4 THEORETICAL MODEL
   2.5 OBSERVER
   2.6 MYOPIC CONSERVATIVE BAYESIAN DECISION MAKER
       2.6.1 Conservative Update Module
       2.6.2 Myopic Selection Module
3. PILOT STUDY
   3.1 CREATING THE TASK ENVIRONMENT
   3.2 MAPPING THE PARAMETER SPACE
   3.3 OBSERVING HUMAN DECISION MAKERS
       3.3.1 Preliminary Experiments
       3.3.2 Refinements to the Task Environment
       3.3.3 Analysis of Economic Behavior
       3.3.4 Analysis of Subjective Probabilities
4. PRACTICAL AND SCIENTIFIC IMPLICATIONS
   4.1 IMPLICATIONS OF THE PILOT STUDY
       4.1.1 Substantive Implications: Evaluation of the Model
       4.1.2 Methodological Implications: Evaluation of the Paradigm
   4.2 ALTERNATIVE THEORETICAL MODELS
       4.2.1 Update Models
       4.2.2 Choice Models
   4.3 LONGTERM RESEARCH PROGRAM
       4.3.1 Laboratory Experiments
       4.3.2 Problem-Solving Research In a Medical Environment
       4.3.3 Applied Research Using Management Gaming
   4.4 IMPLICATIONS FOR PROFESSIONAL EDUCATION
   4.5 IMPLICATIONS FOR DECISION SUPPORT SYSTEMS
5. SUMMARY AND CONCLUSION
APPENDIX 1: Derivation of E_S,Q(π_t, r, Ψ_1)
APPENDIX 2: Pascal Program MYOPIC
APPENDIX 3: Materials for the Pilot Experiment
REFERENCES

LIST OF TABLES

TABLE 1: COSTS, PAYOFFS AND PROBABILITIES (TASK ENVIRONMENT FOR THE PILOT STUDY)
TABLE 2: RESULTS OF EXPERIMENT 1
TABLE 3: RESULTS OF EXPERIMENT 2
TABLE 4: RESULTS OF EXPERIMENT 3
TABLE 5: COMPARISON BETWEEN BEHAVIOR PATTERNS OF SUBJECTS IN EXPERIMENT 4 AND THE MYOPIC CONSERVATIVE BAYESIAN DECISION MAKER
TABLE 6: MATCHES, NEAR-MATCHES, AND NON-MATCHES OF SUBJECTS WITH THE MODEL GIVEN CONSISTENT AND INCONSISTENT DATA

LIST OF FIGURES

FIGURE 1: BLOCK DIAGRAM
FIGURE 2: MYOPIC CONSERVATIVE BAYESIAN DECISION MAKER
FIGURE 3: MAP OF PARAMETER SPACE
FIGURE 4: COMPARISON OF SUBJECTIVE AND BAYESIAN PROBABILITIES

LIST OF SYMBOLS

(Definitions in order of introduction; a symbol is shown only where it survives legibly in the source.)

S: Set of states of the world
s: Number of states in S
One particular state in S
X: Set of alternative final choices
Number of choices in X
x_j: One particular final choice in X
Payoff matrix
Payoff for x_j when state σ_i holds
s-vector of possible payoffs for x_j
Q: Set of available tests
Number of available tests in Q
q_k: One particular test in Q
Cost of test q_k
R_k: Set of possible results of q_k
Number of possible results in R_k
One particular result of test q_k
Representativeness of result r to state σ_i
Discrete time counter
The most favored x_j at a given time
s-element vector of state saliences
Salience of σ_i at time t
Time when a final choice is made
J_m(x_j, π_m): Expected payoff for act x_j at time m
Model's action (q or x) at time t
Test purchased by model at time t
Model's estimate of [symbol illegible in source]
Model's estimate of [symbol illegible in source]
Model's set of parameters
Space defined by Ψ
List of the model's behavior
Point in Ψ_M giving parameter values
Model viewed as a B-valued function
A region in Ψ_M
Ψ_1: Conservatism
E_S,Q(π_t, r, Ψ_1): Conservative update function
Ψ_2: Myopia
V(k, π_t+1, Ψ_1, Ψ_2): Value of test k at time t+1
q_k!: The most favored q_k at a given time
1. INTRODUCTION

1.1 OVERVIEW

The goal of this dissertation is to develop a new methodology for studying the ways human beings use imperfect information in making economic decisions under conditions of uncertainty. Previous research has presented subjects with a fixed quantity of information and used statements of (subjective) probability as the response variable (Edwards, 1972); in contrast, the method to be presented here will use a task environment in which the stimulus consists of opportunities to purchase probabilistic information prior to a final choice with an uncertain payoff, and the response consists of the sequence of information purchases and the final choice selected.

The method is fundamentally an empirical one, but it demands a highly-developed theoretical model to serve as the hypothesis to be evaluated. The result of analyzing a decision maker's behavior in terms of a given theoretical model may be either the classification of the decision maker within a region of the model's parameter space, or the statement that the model is inadequate to explain the observed behavior.

The purchases of costly information can be considered as a behavioral 'window' into the decision maker's cognitive processes, since they allow a clearer view of what information he considers so valuable in arriving at a decision that he is willing to pay for it. In addition, the study of choice among costly sources of probabilistic information is directly relevant to such practical problems as market research studies or medical diagnostic tests, which require careful balancing of the costs and benefits of information gathering. Improved understanding of this process can benefit professional education by analyzing the differences between students and experts; it can also benefit decision support systems by identifying natural extensions to human information processing which do not dilute the understanding needed for responsible control.
The paradigm is based on a system composed of a decision-making task, a human decision maker, a candidate theoretical model of human decision-making, and an observer who compares the behavior of the model with that of the decision maker. In Chapter 2, the decision-making task will be formally defined in terms of: (1) a set of possible states of the world; (2) a set of alternative final decisions; (3) a matrix giving the payoff for each final decision under each state of the world; and (4) a collection of opportunities to purchase costly probabilistic information about states of the world. The decision maker will be described by a set of axioms characterizing his goals, resources, and constraints. A block diagram will show the interface between the general system and any particular theoretical model. A key part of any model in this paradigm is the definition of a parameter space within which the characteristics of the individual's approach to the problem, as inferred from his behavior, may be placed. An additional set of axioms corresponds to the observer's resources and constraints, and to his goal, which is to find the region of the model's parameter space within which the representation of the decision maker's characteristics must lie if the model is to explain the decision maker's behavior (or to determine that no such region exists). Because of the central role played by a map of the abstract parameter space, the methodology will be referred to herein as the "cartographic paradigm."

The first prerequisite of the cartographic paradigm is a well-defined hypothesis about the decision-making process. This hypothesis must be expressed in a form capable of making predictions about an individual's sequence of information purchases and final choice given a particular problem.
In its present form, the methodology is only applicable to deterministic models; however, stochastic models can be accommodated by straightforward generalizations using either conventional statistics or fuzzy set theory. Since different individuals (or different populations in a stochastic model) behave differently when solving the same decision problem, the theoretical model will have one or more parameters corresponding to psychologically significant individual differences presumed to underlie the differences in behavior. The overt actions of information purchases and final decision must be treated as discrete selections out of finite choice sets, but the parameters of the model may lie on whatever interval, ordinal, or nominal scales are theoretically appropriate.

1.2 MYOPIC CONSERVATIVE BAYESIAN DECISION MAKER

The central focus of this research is the cartographic paradigm as a general methodology for evaluating theoretical models of human information processing and decision making in a context of costly imperfect information; the Myopic Conservative Bayesian Decision Maker introduced in this section is intended primarily as a simple example of a theoretical model which satisfies the formal requirements of the paradigm. While the model is based on well-established empirical and analytic research (Raiffa, 1968; Edwards, 1968, 1972; Schum and Martin, 1968; Gorry et al., 1973), it is not proposed as a complete substantive statement about human behavior, only as a plausible first step which incorporates two concepts, conservatism and myopia, from among the many that must be examined in the construction of a full-scale model. A few of the most important of these omitted concepts are discussed in Chapter 4.
The overall structure of the Myopic Conservative Bayesian Decision Maker is derived from the decision analysis algorithm (Raiffa, 1968), in which Bayes' theorem and dynamic programming are used to evaluate each possible sequence of information purchases and final decision so as to maximize expected payoff net of the cost of information. Because of this, the model, like the decision analysis algorithm, assumes a well-structured problem in which all the costs and payoffs are measured on a common scale of utility and all the prior and conditional probabilities are known. This restriction imposes a limitation on the experimental task environments that can be used with this model; in fact, it excludes those ill-structured problems with no provably optimal solution for which understanding the strategies of expert and would-be expert human beings is of the most practical importance. However, if a well-structured problem is sufficiently complex relative to the time and other resources available for its solution, or if the cost of information processing is not negligible compared to the costs and payoffs of the task environment, then the exhaustive analysis called for by the decision analysis algorithm is no longer optimal; instead, the problem must be solved using the same kind of heuristic approaches that are used for ill-structured problems. Since different individuals will solve the same problem in different ways in such a case, the Myopic Conservative Bayesian Decision Maker is provided with two parameters, conservatism and myopia, in an attempt to capture some of this variability. It is conjectured that the similarities and differences among individuals measured by research in this environment can later be extended to similarities and differences in behavior on ill-structured problems, so that research findings using well-structured experiments can be relevant to the wider range of problems in which the "recognized expert" is the only standard of quasi-optimality.
The concept of conservatism is derived from the work of Ward Edwards (1968, 1972). In Edwards' experiments, subjects were presented with (costless) data from one of two initially equally-likely binomial probability distributions, and asked to estimate the a posteriori probability that the data were derived from a given distribution. When the two distributions differed strongly, subjects changed their opinion less than the data warranted, but when the distributions were nearly the same, subjects tended to overvalue the data by changing their opinions more than was warranted. Edwards found that these results could be summarized by using a modification of the odds form of Bayes' law in which the posterior odds equal the prior odds times some power "c" (for "conservatism") of the odds ratio for the data observed. When c = 1, this is simply the optimal, Bayesian formulation; c = 0 would imply ignoring the data totally and sticking to the original belief that the two distributions are equally probable, while c = infinity would mean jumping to complete certainty on the basis of any data whatever. The exponent c can thus be used as a measure of conservatism, with lower numeric values corresponding to more conservative behavior.

Schum and Martin (1968) found that Edwards' model applied equally well to a problem in which six multinomial distributions took the place of the two binomial distributions used by Edwards. Thus, to accommodate this finding about human information processing, the model uses Edwards' general equation, of which Bayes' law is a special case. It should be noted that, even if the a priori subjective probabilities add up to one, as new data are used to update these subjective probabilities, the total of the resulting a posteriori probabilities will vary.
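As a rough illustration of the update rule just described, the following Python sketch raises each state's likelihood ratio to the power c before applying it to that state's prior odds. The function name, the way the likelihood of "not this state" is pooled over the remaining states, and all of the numbers are assumptions made for illustration only; the dissertation's own update function is derived in its Appendix 1 and may differ in detail.

```python
def conservative_update(priors, likelihoods, c):
    """Return updated subjective probabilities, one per state.

    priors      -- current subjective probabilities (assumed here to sum to 1)
    likelihoods -- P(observed result | state), one entry per state
    c           -- conservatism exponent; c = 1 reproduces Bayes' rule
    """
    posteriors = []
    for i, (p, like) in enumerate(zip(priors, likelihoods)):
        # Probability of the result if state i does NOT hold (assumed pooling).
        rest = sum(pj * lj
                   for j, (pj, lj) in enumerate(zip(priors, likelihoods))
                   if j != i)
        like_not = rest / (1.0 - p)
        prior_odds = p / (1.0 - p)
        # Odds form with the likelihood ratio raised to the power c.
        post_odds = prior_odds * (like / like_not) ** c
        posteriors.append(post_odds / (1.0 + post_odds))
    return posteriors


# Hypothetical numbers: two equally likely states, one observed result.
bayes = conservative_update([0.5, 0.5], [0.7, 0.3], c=1.0)
cautious = conservative_update([0.5, 0.5], [0.7, 0.3], c=0.5)
ignoring = conservative_update([0.5, 0.5], [0.7, 0.3], c=0.0)

# Hypothetical three-state case, where the total need not stay at one.
three = conservative_update([0.5, 0.3, 0.2], [0.8, 0.4, 0.1], c=0.5)
print(bayes, cautious, ignoring, three, sum(three))
```

Under these assumptions the two-state case reduces to Edwards' two-hypothesis form and its results still sum to one, whereas with three or more states and c different from 1 the updated values generally do not.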
Edwards (1967) gives a mathematical basis for theories of subjective probability distributions that need not add up to one; in fact, Edwards casts considerable doubt on the possibility that any consistent theory of subjective probabilities satisfying reasonable assumptions could require that the subjective probabilities of a set of exclusive, exhaustive events add up to any pre-specified constant.

While the model thus addresses an issue first raised by Edwards, the methodological technique differs sharply from his. Prior research has focused primarily on measuring the importance of information to human decision makers in terms of its after-the-fact effect on subjective probabilities, either by asking subjects to give introspective statements of their confidence in a hypothesis or by offering choices among simple bets based on the hypotheses. The cartographic paradigm, on the other hand, draws the majority of its data from economically-motivated choices among data sources; the impact of information will be measured more by the circumstances under which a subject is willing to pay for that information than by backwards inference from subsequent behavior.

Analysis of economically-motivated purchase of information leads one immediately to the literature of information value theory, which in principle can provide the optimal choice of information purchase or final decision at all times, given either Bayes' or Edwards' model of information processing. However, as Gorry et al. (1973) point out, the computational burden of this algorithm quickly becomes insupportable for problems of even moderate complexity. In order to implement their computer program for medical diagnosis and treatment selection, they evaluate tests as if it were necessary to diagnose immediately after getting the results of the test under consideration, neglecting the option of continuing to test.
(Once the test has been selected and the results have been received, the process is repeated and another test may, in fact, be purchased if the value of so doing exceeds that of an immediate diagnosis; thus, the expected value of a test by Gorry's method is a lower bound for the "true" expected value given by the full decision analysis algorithm.)

The "myopia" parameter of the Myopic Conservative Bayesian Decision Maker is derived from a generalization of Gorry's variant of decision analysis. Gorry's model, which looks only one move ahead, corresponds to the present model with a myopia of 1. When myopia = 2, each possible test is evaluated as if at most one additional test could be purchased after getting the results of the test under consideration. The original decision analysis algorithm corresponds to the present model with a myopia parameter equal to infinity, or at least very large with respect to any observed sequence of actions; in this case, there is no arbitrary limit on the length of testing sequences that may be considered.

This completes the overview of the Myopic Conservative Bayesian Decision Maker. This model is an example of the class of models to which the cartographic paradigm is applicable; any such model can be viewed abstractly as a function which, in the context of a given decision-making task, maps a point in an abstract parameter space (such as the space defined by conservatism and myopia) to a particular sequence of observable behaviors.

1.3 METHOD OF STUDY

The goal of the cartographic paradigm is to connect the theoretical insight expressed in such a formal model with a system of observations, in order to learn about the similarities and differences among decision makers. Sequences of information purchases and final choice are replaced by equivalent points or regions in an abstract space of psychological parameters hypothesized to underlie behavior on the decision-making task.
For a one-to-one function from parameter space to behavior sequence, this conversion can be accomplished simply by inverting the function. However, a one-to-one model has a serious drawback; if two decision makers emit the same behavior sequence in response to one particular decision-making task, they cannot differ from each other on any decision-making task to which the model applies. A more flexible model, such as the Myopic Conservative Bayesian Decision Maker, will produce the same behavior sequence within small regions of its parameter space for a given decision-making task, but different behaviors for a sufficiently distant point in parameter space. A different task will define a different partition of the same parameter space, thus allowing for two points to be in the same region for one decision-making task but two different regions for another.

The cartographic paradigm developed for dealing with such models begins with the precise specification of the model -- a computer simulation will often be the most efficient form, although simulation is not a sine qua non of the method. In addition to the model, the task environment must also be selected. The task environment must be rich enough to elicit a significant variety of behavior, yet simple enough to be understood equally well by all the subjects and to keep the analysis of the results tractable. In a laboratory environment, the characteristics of the task may be "tuned" by simulation runs using various combinations of task and parameter values, or analytically using model equations.

Once the model and the task have been specified, the behavior of the model in response to the task must be studied. The result of this stage in the research procedure is a collection of all the behavior patterns of which the model is capable, together with the regions in parameter space corresponding to each pattern.
Following this, human decision makers are presented with the same problem, and classified into the regions of parameter space corresponding to their observed behavior patterns. Any subject whose behavior does not match the behavior of the model for any parameter values is not describable by the model; while such an occurrence need not totally destroy the usefulness of the model as a description of some decision makers, it would at the least mean a theoretical incompleteness in the model, if not grounds for its rejection.

A possible practical application of this type of analysis, should it prove successful, lies in the training of student decision makers, such as medical students. If a model is discovered such that skilled physicians cluster in an identifiable region of that model's parameter space, the ways in which the behavior of a student differs could be analyzed in terms of that parameter space. This would provide a clearer language with which to give feedback to the student, as well as the possible basis for a new educational technology. Another benefit could be in the design of organizational and automated decision support systems to extend human information processing capabilities without sacrificing the comprehension needed for responsible human control.

1.4 REVIEW OF RELATED RESEARCH

This section of the dissertation will review some representative examples of each of four general methodologies for the study of human information processing and decision-making: Protocols, Subjective Probabilities, Weighting Coefficients, and Information Choice. Each study, in turn, will be discussed in terms of the kind of information provided to the decision maker, the nature of the decision to be made, the implicit or explicit assumptions about human information processing in general, and the role of individual differences. In the final part of the section (1.4.5), the contributions of each of the four schools
of research to the present study will be analyzed.

1.4.1 Protocols

The most direct way to begin to learn about how a person goes about making a decision is to ask him. The best-known example of this approach is the work of Newell and Simon (1972). In this research, subjects were presented with a complex and highly-structured task environment such as a chess position. All the data to be used in making a decision (the next move in the chess problem) were there at the start, and the subject was instructed to "think aloud" in the course of deciding what his best course of action might be. The recordings of these sessions, called "protocols," were then interpreted by the researchers in terms of a general theory of problem solving. The result of this interpretation was a computer program that simulated the behavior of one individual subject on the given problem; both the final choice and the general sequence of processing steps (corresponding to the verbal reports of thought processes) were simulated. In effect, Newell and Simon's assumptions about human information processing are mostly embedded in the structure of their programming language (IPL-V), while separate programs are written to conform to the idiosyncrasies of individual human subjects.

Kleinmuntz (1968) presents two protocol-based studies of medical diagnostic reasoning. The first study, in which a single expert psychologist thought aloud while classifying a number of MMPI profiles on a scale from maladjusted to well-adjusted, was similar in outlook to Newell and Simon's work. The second of Kleinmuntz' studies, however, adds the dimension of information-seeking. In this experiment, neurologists at varying levels of training and experience were presented with a brief description of a hypothetical patient. The subject then proceeded to ask yes-or-no questions about the patient until he reached a diagnosis.
The same collection of cases was diagnosed twice, with a lapse of several weeks between sessions; the second time, subjects were asked to state their reasons for each question they asked.

Critics of protocol research have pointed out the danger that the introspections called for may alter the mental processes and so invalidate the findings; Kleinmuntz' method allowed a partial measure of this effect through comparison of the questions asked by a given subject with and without thinking aloud -- a test-retest rank order correlation coefficient of .92 was found, which lends credibility to the assumption that the thought processes with and without verbalization were comparable. Based on this, Kleinmuntz was able to state several hypotheses about the ways that neurologists in general go about selecting what question to ask next, and how students, residents, and experienced clinicians differ. These hypotheses are assessed in a general way in terms of the data, but no specific predictions of behavior like those of Newell and Simon or Kleinmuntz' MMPI study are attempted in this less-structured task environment.

1.4.2 Subjective Probability

Most of the research on subjective probability in the last fifteen years has been influenced to a greater or lesser extent by the work of Ward Edwards. Edwards (1968, 1972) obtained human estimates of the probability of different "states of the world" in situations where the prior probabilities of the states are known and data with known conditional probability given each state are to be used in obtaining an estimate of the posterior probability; he then compared these human estimates with optimal estimates derived using Bayes' rule. (The concept and mathematical operationalization of "conservatism" used in the Myopic Conservative Bayesian Decision Maker derives from the results of Edwards' research in this area.)
Edwards used many different experimental situations, but the simplest of them will suffice to show his paradigm. In this experiment, the subject was told that there existed two bookbags, one with p red poker chips and 100-p white poker chips, and the other with p white poker chips and 100-p red ones. One of the bags had been chosen by the flip of a coin, and a number of chips were drawn with replacement from the selected bag. The subject was then shown the results of this sample, and asked to state how likely he felt it was that the chips were drawn from the mostly red bag. Many variations and elaborations of this scenario have been used, including choice among bets or bidding for bets instead of directly stating numeric probability estimates, the use of unequal prior probabilities, et cetera -- for details, see Slovic and Lichtenstein (1971). Most of the work by Edwards and others on subjective probability that was published in time to be reviewed by Slovic and Lichtenstein was concerned with comparing human performance with optimal Bayesian performance on this well-defined task; the consistent finding was that humans change their opinions about probabilities too little when the data are strongly diagnostic, and too much when the data are weak.

More recent studies have moved away from reliance on Bayes' rule to search for a formula or formulas which can more accurately reflect human information processing. Wallsten (1976), an outstanding example of this trend, uses four different algebraic "composition rules" as alternative models for how subjects use probabilistic information to alter their a priori subjective probabilities, and tests these models for goodness of fit using sophisticated rank-order based mathematical techniques which avoid earlier researchers' reliance on the numeric qualities of self-reports of subjective probability.
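For the bookbag task described earlier in this section, the optimal Bayesian estimate against which subjects' stated probabilities were compared can be sketched in a few lines of Python. This is only an illustration: the function name, argument names, and the default prior of one half (the coin flip) are assumptions introduced here, not part of Edwards' published procedure.

```python
def bookbag_posterior(p, reds, whites, prior_red=0.5):
    """Posterior probability that the sample came from the mostly red bag.

    p        -- number of red chips (out of 100) in the mostly red bag
    reds     -- number of red chips observed in the sample
    whites   -- number of white chips observed in the sample
    prior_red -- prior probability of the mostly red bag (coin flip = 0.5)
    """
    q = p / 100.0
    # Chips are drawn with replacement, so the binomial coefficient is the
    # same under both hypotheses and cancels out of Bayes' rule.
    like_red_bag = q ** reds * (1.0 - q) ** whites
    like_white_bag = (1.0 - q) ** reds * q ** whites
    numerator = prior_red * like_red_bag
    return numerator / (numerator + (1.0 - prior_red) * like_white_bag)


if __name__ == "__main__":
    # With p = 70 and a sample of 8 red and 4 white chips, the Bayesian
    # posterior for the mostly red bag is about 0.97.
    print(round(bookbag_posterior(70, 8, 4), 2))
```

A conservative subject, in the sense used in this literature, would report a value closer to the prior of .5 than the figure this function returns.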
The models are analyzed for goodness of fit using two different measures, on a subject-by-subject basis as well as overall; the principal model was found to fit all subjects, while one of the three alternative models was capable of fitting all but two, and the other two alternatives showed substantial lack of fit with respect to every subject.

1.4.3 Weighting Coefficients

Research aimed at finding numeric weights to represent the relative importance a given human judge attaches to various aspects of a person or thing he is evaluating dates back to Wallace's 1923 paper in the Journal of the American Society of Agronomy. In that study, experienced corn judges in Iowa rated 500 ears of seed corn in terms of likely yield when planted; correlation coefficients and multivariate path coefficients between the estimated yields and various significant characteristics of the corn itself (e.g. length of ear, weight of kernel) were then computed "to make out the score card which really existed in the judges' minds" (Wallace, 1923).

Slovic and Lichtenstein (1971) review a large number of studies performed in the 1950's and 1960's in which subjects made a number of judgements (usually of students or of psychiatric patients) on a numeric or dichotomous scale on the basis of a set of characteristics of the persons being judged. These subjective judgements were then compared with the results of a multiple regression analysis with the same characteristics as predictors, and either the judgements or an objective criterion taken as the dependent variable. The studies, taken as a group, seemed to firmly establish the hypothesis that human judgement can be well represented, at least "paramorphically," by a simple weighted sum of the input data, and furthermore that, when human judgement did differ from the linear prediction of the judgement itself, the linear prediction was more likely to match the criterion!
(A "paramorphic representation" is simply a black-box model which matches the system modeled in an input-output sense; the concept was used to separate the empirical demonstrations of the efficacy of the regression equation from the theoretical or introspective question of whether the regression beta weights really capture what is "in the judge's mind.")

These results were the center of a long and heated "clinical versus statistical judgement" controversy, summarized by Slovic and Lichtenstein (1971). At one extreme were those who saw the linear models as providing a major advance in man's ability to improve his own consistency and accuracy of judgement and to convey the judgemental processes of many professions to the next generation of students, by using precisely determined weights for evidence that capture the optimal policy. These views were opposed by others who decried the loss of nonlinear intuitive or gestalt judgements based on human interaction with both the "hard" data processed by the equations for all cases, and the "soft" data that is unique to each particular case, especially when (as was usual in this research) it was human beings that were being evaluated.

However, more recent work by Dawes and Corrigan (1974) has cast the question in an entirely new light; they found, in five different judgement tasks, that equations with random coefficients constrained only as to sign are, on the average, at least as accurate as equations modelling the judgements of human experts or equations derived by standard multiple regression techniques to optimally predict the criterion, given the sample size generally used.
Dawes and Corrigan analyzed this surprising result mathematically; in summary, they found it to be due to the extreme generality of the linear approximation technique, combined with the very low sensitivity of the regression equation to random changes in the relative weights of the coefficients when the predictor variables are intercorrelated. Thus, while finding the relative importance of various sources of evidence in an individual's decision-making may remain a desirable goal, it now appears highly questionable whether any method of fitting a linear equation to observed (or optimal) judgements can be of much help.

1.4.4 Choice of Information

The systematic study of subjects' choices of information from a defined set of alternatives begins with Glaser et al (1954). In that study, electronics trainees were presented with a description and diagram of a malfunctioning electronic device and a set of possible tests they could perform to troubleshoot. Next to each test was a paper tab that could be ripped off to reveal the result of the test printed underneath. The purpose of this was to evaluate the proficiency of each trainee on the basis of the appropriateness of the tests selected, as judged by experts in electronics. This goal of evaluating individual competence in terms of conformity to some standard has characterized the majority of research of this type.

A number of papers in the applied area of medical diagnosis have focused on the questions (medical history, physical examination, or laboratory) asked by human physicians and medical students about real or simulated patients. Rimoldi (1964) evaluated questions by measuring how frequently they were asked by members of a population of physicians and medical students, under the assumption that the questions that were asked the most were the ones with the highest utility for the diagnostic task.
He also proposed using the degree to which the cumulative utility of the best observed sequence of questions differs from that of the worst observed sequence as a metric for the difficulty of the task. Sprosty (1964) compared the sequences of questions asked by students who ended up with the correct diagnosis with the sequences of questions asked by students who diagnosed incorrectly. Finally, Barnett (1972) used a sequential Bayesian analysis to determine the increase in certainty provided by the answer to each of a set of possible questions, using this value to evaluate the effectiveness of students' and clinicians' question sequences. None of these studies, however, has really attempted to model the process by which a person actually decides when to ask what question and when to stop asking and make a decision.

The work of Fried and Peterson (1969) is particularly relevant to the present research because they used explicit monetary costs for information and payoffs for correct decisions, whereas in the other studies cited above the only cost of information was the risk that the question might not conform to the standard. Fried and Peterson offered subjects only one kind of information (samples from one of two binomial distributions), and compared the amounts of information purchased under conditions of fixed stopping (where the subject must decide on the sample size before he sees any of the data) and optional stopping (where the decision whether to continue sampling is made after each binomial event). Subjects conformed well to the optimal Bayesian sample sizes under fixed stopping, but undersampled under optional stopping; the authors discuss possible interpretations of this result in general terms, but do not offer a predictive model.
1.4.5 Comparison With the Present Research

The cartographic paradigm introduced in this dissertation is related to the protocol studies (section 1.4.1) in the use of a formal, algorithmic model of human behavior to predict the steps a person goes through in making a decision; it differs, however, in avoiding introspection in favor of information purchases as the means of detecting these intermediate stages. Another difference is that the cartographic paradigm implicitly calls for more highly structured theories of human information-processing, as compared with research such as that of Newell and Simon (1972) where most of the general theory of human information processing is embedded in the syntax and semantics of the language used to write the simulation programs. (For discussion of this issue, see Scandura, 1977.)

The present research resembles the research in subjective probability (section 1.4.2) in its concern with the effects of new information on a person's prior opinions. In addition, the particular theoretical model used in the pilot experiment, the "Myopic Conservative Bayesian Decision Maker," uses experimental stimuli and a system state variable ($\pi$) that adhere very closely to precedents established in subjective probability research. In future research using different theoretical models, this latter similarity of probabilistic stimulus and quasi-probabilistic state variables will not necessarily continue to hold, although the effect of information on opinion will remain a central concern of the overall paradigm.

The research on weighting coefficients (section 1.4.3) resembles the new paradigm in the effort to find a common scale of measurement for the impact of qualitatively different kinds of information.
In prior research, this scale has been a statistical one such as regression coefficients, while the method being presented in this dissertation uses an economic measure to assess the relative impact of different kinds of information.

The data collection procedures of the cartographic paradigm are a direct development of those of the "choice of information" research discussed above (section 1.4.4). Several important elements of difference do exist, however. Whereas earlier research generally relied on presumed analogies to practical problems and role-playing by the subjects to establish the costs of information and benefits of correct decisions, the present paradigm quantifies these costs and benefits with cash payments to the subject. Earlier research used a single dimension of efficiency or appropriateness of search to compare the behavior of different individuals; the present paradigm recognizes the multidimensional variations in ill-structured problem solving, and is specifically designed to use and evaluate multi-parameter models which embody hypotheses about the relations between different kinds of problem-solving heuristics. Past research of the "choice of information" variety has contrasted neophyte and expert decision makers in terms of the surface characteristics of the sequences of questions they ask about one or a few particular problems; the present paradigm aims at discovering stable underlying characteristics of individuals as revealed by problem-solving behavior, and identifying the clusters of characteristics associated with expert performance.

2. SYSTEM DEFINITION

2.1 OVERVIEW

The purpose of this chapter is to define the theoretical and experimental environment within which research is to be conducted, both in the pilot study reported on in this dissertation and in later larger-scale substantive research which will be carried out using the cartographic paradigm.
This environment is analyzed in terms of a system composed of the following elements: a decision-making task with costly, imperfect data (discussed in section 2.2); a human decision-maker who performs the task and experiences the associated costs and payoffs (section 2.3); a candidate theoretical model of human decision-making which attempts to explain the similarities and differences between the behavior of different persons in like tasks (section 2.4); and an observer who compares the behavior of the model with that of the human decision maker (section 2.5). Section 2.6 gives the formal statement of the particular candidate theoretical model used in the pilot study (the Myopic Conservative Bayesian Decision Maker), together with the consequences for the model-specific aspects of other system elements.

2.2 DECISION-MAKING TASK

The following set of formal elements define a class of decision-making tasks. This class of tasks is of interest in part because of the practical importance of related problems such as cost-effective medical diagnostic testing, but also because of the opportunity to use a little-studied behavior, information purchases, to make and test inferences about aspects of human information processing that apply in other situations as well. In this research, mathematical techniques developed in the normative theory of decision-making with costly imperfect data, "decision analysis," will be generalized in an attempt to model human behavior; while the performance of skilled human decision makers does not equal that of the normative model on simple problems subject to exhaustive analysis, human beings seem to maintain an acceptable performance level on complex and ill-structured problems for which the normative model fails to reach any decision within feasible time constraints.
Any individual decision-making task in this class of tasks is identified by the 4-tuple (S,X,Y,Q), where:

(T1) S is a set of s possible states of the world $\sigma_i$, exactly one of which is assumed to hold.
     $S = \{\sigma_1, \sigma_2, \ldots, \sigma_i, \ldots, \sigma_s\}$

(T2) X is a set of c alternative final choices $x_j$, of which the decision maker must choose exactly one.
     $X = \{x_1, x_2, \ldots, x_j, \ldots, x_c\}$

(T3) Y is a c-by-s payoff matrix whose ji'th element $y_{ji}$ is the payoff that the decision maker would receive after selecting final choice $x_j$ when the state of the world is actually $\sigma_i$. The j'th row of Y, written $Y_j$, is thus an s-element row vector which gives all the possible payoffs for final choice $x_j$ (depending upon which state of the world holds).

(T4) Q is a set of D costly tests available to the decision maker,
     $Q = \{q_1, q_2, \ldots, q_k, \ldots, q_D\}$
     Any test $q_k$ consists in turn of:
     (a) A cost $C_k$ that the decision maker must pay to learn the result of test $q_k$;
     (b) A set $R_k$ of $n_k$ possible results $r_k^{\ell}$ of test $q_k$,
         $R_k = \{r_k^1, r_k^2, \ldots, r_k^{\ell}, \ldots, r_k^{n_k}\}$;
     (c) A function $F_k(r_k^{\ell}, \sigma_i)$ with range [0,1] which defines the relative degree to which the result $r_k^{\ell}$ to test $q_k$ is representative of state $\sigma_i$. Without loss of generality, the function is normalized so that $\sum_{\ell=1}^{n_k} F_k(r_k^{\ell}, \sigma_i) = 1$ for all k and i. (In Bayesian and other statistical models, $F_k(r_k^{\ell}, \sigma_i) = P(q_k \Rightarrow r_k^{\ell} \mid \sigma_i)$, the conditional probability of result $r_k^{\ell}$ to test $q_k$ given state $\sigma_i$; but see e.g. Tversky and Kahneman (1974) for an alternative formulation of representativeness in human information processing.)

2.3 DECISION MAKER

The decision maker is an individual human being about whom the following axioms are assumed to hold:

(D0) The decision maker encounters the task at time t=0, and performs actions at times t = 1, 2, ..., m, where the m'th action is the selection of a final choice $x_{j^*}$ and each of the other actions is the selection of a test.

(D1) At all times t = 0, 1, 2, ..., m, the decision maker retains complete knowledge of:
     (a) the set of possible states of the world S (but not which one holds)
     (b) the set of alternative final choices X
     (c) the payoff matrix Y
     (d) the set of possible tests Q
     (e) the results of all tests purchased at any time prior to t.
     (Note that the decision maker need not attend to all of this information; what information the decision maker actually uses in making his decision is a matter for particular theoretical models. Axiom D1 can be satisfied by written records or other similar aids.)

(D2) Just prior to the decision maker's action at time t, the decision maker's opinion about the relevance of each of the s possible states in S may be represented by a salience vector
     $\pi_t \triangleq [\pi_t^1, \pi_t^2, \ldots, \pi_t^i, \ldots, \pi_t^s]^T$; $\pi_t^i \in [0,1]$
     (Salience may be interpreted as subjective probability in models such as the Myopic Conservative Bayesian Decision Maker which define subjective probability.)

(D3) $\pi_1 = [1/s, 1/s, \ldots, 1/s]^T$. Initially, saliences are defined to be equal and may, without loss of generality, be scaled to add up to 1.

(D4) After purchasing a test $q_k$ at time t for cost $C_k$ and learning that the result is $r_k^{\ell}$, the decision maker revises his opinion to one represented by $\pi_{t+1}$, and selects his next action.

(D5) The decision maker's final choice $x_{j^*} \in X$ at time t = m is the choice which maximizes the subjective expected payoff function
     $J_m(x_j, \pi_m) = \sum_{i=1}^{s} \pi_m^i \, y_{ji}$

This definition of the decision maker is intended as a general framework only; necessary details such as the interaction between costs and payoffs or the stopping rule which determines m are left to particular theoretical models such as the Myopic Conservative Bayesian Decision Maker.

2.4 THEORETICAL MODEL

The block diagram in Figure 1 shows the input-output requirements for a theoretical model to be used in this research process.
The lines labeled S, X, Y, and Q indicate that the model's behavior is dependent on the specification of the task known to both the decision maker and the observer: S is the set of possible states, X is the set of alternative final choices, Y is the payoff matrix, and Q is the set of available tests.

$\hat{\alpha}_{t+1}$ is the action selected by the model for time t+1, either purchasing test $q_{k_{t+1}}$ or selecting final choice $x_{j^*}$. If a final choice is selected, t+1 = m and the model terminates. Otherwise, the "next" test $q_{k_{t+1}}$ becomes the "last" test $q_{k_t}$ in the next time period. (This is the function of the box labeled "DELAY".)

$r_{k_t}^{\ell}$ is the result of the last test $q_{k_t}$, defined for t = 1, 2, ..., m-1. It represents the feedback of information from the environment that the decision maker receives in return for incurring the cost of diagnostic tests.

$\hat{\pi}_t$ is an s-element vector which serves as an estimate of the vector $\pi_t$ representing the decision maker's opinion at time t. (See assumption D2.)
     $\hat{\pi}_1 = [1/s, 1/s, \ldots, 1/s]^T$; $\hat{\pi}_t = [\hat{\pi}_t^1, \hat{\pi}_t^2, \ldots, \hat{\pi}_t^s]^T$

[FIGURE 1: BLOCK DIAGRAM]

$\hat{\pi}_{t+1}$ is the estimate of the vector representative of the decision maker's new opinion at time t+1, after taking into account the information $r_{k_t}^{\ell}$. $\hat{\pi}_{t+1}$ becomes $\hat{\pi}_t$ for the next time period, as indicated by the DELAY box.

$\Psi_M$ is the set of parameters by which the model M represents individual differences that affect behavior on the class of decision-making tasks under consideration. The number of parameters and their interpretation will vary from one theoretical model to another, but each parameter should correspond to a psychologically meaningful concept, such as the concepts of conservatism and myopia in the two-parameter Myopic Conservative Bayesian Decision Maker.
For a given theoretical model, the values of that model's parameters will vary from one decision maker to another. The parameter space $\boldsymbol{\Psi}_M$ for a particular theoretical model M is defined as the abstract space whose dimensions correspond to the elements of $\Psi_M$, the set of parameters of the model M.

$\hat{B}$, the principal output of the model, is the accumulated, ordered list of inferred actions $\hat{\alpha}_t$; that is, the model's sequence of information purchases and its final choice.

Since the model inputs $\hat{\pi}_t$ and $r_{k_t}^{\ell}$ are determined by the constant features of the task environment interacting with the model itself, the behavior of a given theoretical model M on a given task (S,X,Y,Q) depends only on the values of the elements of $\Psi_M$. This can be expressed by representing the model M by a family of generalized functions, indexed by task characteristics, from $\boldsymbol{\Psi}_M$ to the space of possible behavior lists:
     $M_{(S,X,Y,Q)}(\psi) = \hat{B}$, where $\psi \in \boldsymbol{\Psi}_M$.
(That is, $\psi$ is the specification of all the elements of $\Psi_M$, characterizing a particular real or hypothetical individual decision maker.)

2.5 OBSERVER

The observer is an educator, cognitive scientist, or other person interested in objectively studying the behavior of the decision maker. The role of the observer is given by the following axioms:

(O1) The observer has the same knowledge of the task (S,X,Y,Q) as that assumed for the decision maker in axiom D1.

(O2) The observer knows what test $q_{k_t}$ the decision maker purchases at each time t = 1, 2, ..., m-1 and the final choice $x_{j^*}$ the decision maker selects at time t=m. By analogy with the list $\hat{B}$ of the model's behavior, the ordered list of the decision maker's observable behavior, known to the observer, will be referred to as B.

(O3) The observer does not know the decision maker's opinion nor its true representation $\pi_t$, for any time t > 1.

(O4) The observer conjectures that model M (from the class of theoretical models defined above) is an adequate representation of the decision maker's behavior; the observer's goal is to find the region $Z \subset \boldsymbol{\Psi}_M$ such that
     if $\psi \in Z$, then $M_{(S,X,Y,Q)}(\psi) = \hat{B} = B$.

If the observer succeeds in finding a region Z as in (O4), he will map its boundaries within $\boldsymbol{\Psi}_M$ and state, for a given decision maker, that if the model adequately represents that decision maker's behavior process on the given task, then the values of that model's parameters characterizing the decision maker lie somewhere in region Z. If no such region is found, the observer must conclude that the theoretical model M is not adequate to represent that decision maker's behavior on that task.

[FIGURE 2: MYOPIC CONSERVATIVE BAYESIAN DECISION MAKER]

2.6 MYOPIC CONSERVATIVE BAYESIAN DECISION MAKER

The purpose of this section is to give the mathematical specification of the Myopic Conservative Bayesian Decision Maker, the theoretical model introduced in Section 1.2; this model will be used in the pilot experiment described in Chapter 3. The model can be conveniently divided into a "conservative" update module which changes the value of the state vector $\hat{\pi}_t$ representing the decision maker's salience vector (here interpreted as subjective probability) in the light of new information, and a "myopic" selection module which determines the next action to be taken (information purchase or final choice).
2.6.1 Conservative Update Module

The update module of the Myopic Conservative Bayesian Decision Maker is based on the findings of Schum and Martin (1968), who showed that Edwards' concept of conservatism in human information processing applies to subjective probability distributions with multiple alternatives, not just the two-alternative case investigated by Edwards. What these investigators found was that human beings' stated judgements of a posteriori probabilities, given equal a priori probabilities and data with known conditional probabilities, can be modeled consistently by a variant of Bayes' rule in which the impact of evidence upon opinion about relative odds is raised to a power (called c by Edwards and by Schum and Martin, but $\Psi_1$ in this model). Bayes' rule is the special case where $\Psi_1 = 1.0$, but the research cited has shown that $\Psi_1$ in general is not equal to 1.0 in human information processing. Thus, the value of the exponent $\Psi_1$ is taken as the measure of "conservatism," with lower values corresponding to more conservative behavior -- that is, behavior in which the impact of new information on previously-held opinion is low.

The update module uses the following algorithm to determine $\hat{\pi}_{t+1}$ from $\hat{\pi}_t$ and $\hat{\alpha}_t$:

If the latest act $\hat{\alpha}_t$ was the selection of a final choice $x_{j^*}$, the model terminates. (t = m in this case.) Otherwise, the estimated probability vector $\hat{\pi}_{t+1}$ is defined by the following equations:

(1) $\hat{\pi}_1 = [1/s, 1/s, \ldots, 1/s]^T$.

(2) For any t ∈ {1, 2, ..., m-1}, let k be the index of the test $q_{k_t}$ purchased at time t, and let $\ell$ be the index of its result, $r_k^{\ell}$. Define a vector-valued "Conservative Bayesian function"

     $E_{S,Q}(\hat{\pi}_t, r_k^{\ell}, \Psi_1) = \left[ \frac{w_{t+1}^1}{1+w_{t+1}^1}, \frac{w_{t+1}^2}{1+w_{t+1}^2}, \ldots, \frac{w_{t+1}^s}{1+w_{t+1}^s} \right]^T$

     where

     $w_{t+1}^i \triangleq \frac{\hat{\pi}_t^i}{1-\hat{\pi}_t^i} \left( \frac{F_k(r_k^{\ell}, \sigma_i)\,(1-\hat{\pi}_t^i)}{\sum_{h=1}^{s} F_k(r_k^{\ell}, \sigma_h)\,\hat{\pi}_t^h - F_k(r_k^{\ell}, \sigma_i)\,\hat{\pi}_t^i} \right)^{\Psi_1}$

     is the updated conservative odds for state of the world $\sigma_i$, which reduces to the updated Bayesian odds when $\Psi_1$ is equal to 1.0.
(3) $\hat{\pi}_{t+1} \triangleq E_{S,Q}(\hat{\pi}_t, r_k^{\ell}, \Psi_1)$ for t ∈ {1, 2, ..., m-1}.

(See Appendix 1 for the derivation of this formula.)

2.6.2 Myopic Selection Module

The role of the myopic selection module of the Myopic Conservative Bayesian Decision Maker is to connect the internal opinion states, generated by the conservative update module, to the observable actions (test purchases and selection of final alternative) which are the focus of the research paradigm. The module, derived from a variant of decision analysis used by Gorry et al (1973) for computerized medical diagnosis and treatment recommendations, measures myopia by the parameter $\Psi_2$; the selection module is "myopic" in the sense that, at each time t, each diagnostic test in the set Q is evaluated as if it were necessary to cease testing and choose a final treatment from the set X at or before time $t+1+\Psi_2$. (Time, in the model, is measured by the number of actions that have occurred, where an action is a test purchase or the selection of a final choice.) Thus, after learning the results of the t'th test, the model considers at most $\Psi_2$ additional tests. (Gorry et al consider only the case $\Psi_2 = 1$.)

The selection module determines the act $\hat{\alpha}_{t+1}$ using the updated estimated subjective probability vector $\hat{\pi}_{t+1}$ by the following myopic decision analysis algorithm, with integer-valued myopia parameter $\Psi_2$:

(A) Compute the subjective expected value to the simulated decision maker of selecting final alternative $x_j$ at time t+1, for each j ∈ {1, 2, ..., c}. This value is found by the sum of the payoffs conditional on each state of the world weighted by the subjective probabilities of the states:
     $\sum_{i=1}^{s} y_{ji}\,\hat{\pi}_{t+1}^i \triangleq Y_j \hat{\pi}_{t+1}$.

(B) Determine the best final alternative $x_{j^*}$ such that
     $Y_{j^*} \hat{\pi}_{t+1} \geq Y_j \hat{\pi}_{t+1}$ for all j ∈ {1, 2, ..., c}.

(C) Compute the net value of the simulated decision maker's expectation assuming that, rather than selecting a final alternative, the decision maker purchases test $q_k$ at time t+1, for each k ∈ {1, 2, ..., D}.
This value, denoted $V(k, \hat{\pi}_{t+1}, \Psi_1, \Psi_2)$, is found by the following recursive set of equations:

     (1) $V(k, \hat{\pi}_{t+1}, \Psi_1, 0) = 0$

     (2) $V(k, \hat{\pi}_{t+1}, \Psi_1, \Psi_2) = \sum_{\ell=1}^{n_k} U(\Psi_1, \Psi_2, E_{S,Q}(\hat{\pi}_{t+1}, r_k^{\ell}, \Psi_1)) \sum_{i=1}^{s} \hat{\pi}_{t+1}^i F_k(r_k^{\ell}, \sigma_i) - C_k$, if $\Psi_2 > 0$

     (3) $U(\Psi_1, \Psi_2, E_{S,Q}(\hat{\pi}_{t+1}, r_k^{\ell}, \Psi_1)) = \max \left[ \max_{k'} V(k', E_{S,Q}(\hat{\pi}_{t+1}, r_k^{\ell}, \Psi_1), \Psi_1, \Psi_2 - 1),\; \max_{j} Y_j E_{S,Q}(\hat{\pi}_{t+1}, r_k^{\ell}, \Psi_1) \right]$

In Equations 2 and 3, the function U is the value of the simulated decision maker's expectation if his opinion is $E_{S,Q}(\hat{\pi}_{t+1}, r_k^{\ell}, \Psi_1)$ and he considers $\Psi_2 - 1$ additional tests. $E_{S,Q}(\hat{\pi}_{t+1}, r_k^{\ell}, \Psi_1)$, henceforth written E, is the opinion that the decision maker would hold at time t+2 if $\hat{\alpha}_{t+1}$ were the purchase of test $q_k$, and this test gave result $r_k^{\ell}$. $U(\Psi_1, \Psi_2, E)$ is either the value of the best treatment given opinion E, $\max_j Y_j E$, or the value of the best test at time t+2 given opinion E and myopia $\Psi_2 - 1$ (since the test would be the second of the $\Psi_2$ tests in the longest possible plan), $\max_{k'} V(k', E, \Psi_1, \Psi_2 - 1)$.

The subjective probability that the purchase of test $q_k$ at time t+1 will in fact yield the result $r_k^{\ell}$ is found by multiplying the conditional probability of result $r_k^{\ell}$ to test $q_k$ given that the true state of the world is $\sigma_i$, $F_k(r_k^{\ell}, \sigma_i)$, by the subjective probability (at time t+1) of state $\sigma_i$, $\hat{\pi}_{t+1}^i$, for each possible state $\sigma_i$, and summing over i.

The total expected value of purchasing test $q_k$ at time t+1 is found in Equation 2 by multiplying the value of the decision maker's expectation at time t+2 if test $q_k$ at time t+1 gives result $r_k^{\ell}$, $U(\Psi_1, \Psi_2, E_{S,Q}(\hat{\pi}_{t+1}, r_k^{\ell}, \Psi_1))$, by the subjective probability of result $r_k^{\ell}$ for each $\ell$, summing over $\ell$, and subtracting $C_k$, the cost of the test itself.

Note the recursive definition of $V(k, \hat{\pi}_{t+1}, \Psi_1, \Psi_2)$ in terms of $U(\Psi_1, \Psi_2, E_{S,Q}(\hat{\pi}_{t+1}, r_k^{\ell}, \Psi_1))$ and of the latter in terms of $V(k', E_{S,Q}(\hat{\pi}_{t+1}, r_k^{\ell}, \Psi_1), \Psi_1, \Psi_2 - 1)$.
The value of a test at time t+1 depends on the values and probabilities of the possible states of knowledge at time t+2 that could result from the test, and the values of these possible states of knowledge depend on the values of the various possible actions at time t+2, evaluated at a reduced myopia. The recursion is terminated after $\Psi_2$ levels by invoking Equation 1.

(D) Determine the best test $q_{k^*}$ such that
     $V(k^*, \hat{\pi}_{t+1}, \Psi_1, \Psi_2) \geq V(k, \hat{\pi}_{t+1}, \Psi_1, \Psi_2)$ for all k ∈ {1, 2, ..., D}.

(E) Determine $\hat{\alpha}_{t+1}$ as follows:
     $\hat{\alpha}_{t+1}$ = Select final choice $x_{j^*}$ and put m = t+1 if $Y_{j^*} \hat{\pi}_{t+1} > V(k^*, \hat{\pi}_{t+1}, \Psi_1, \Psi_2)$.
     $\hat{\alpha}_{t+1}$ = Purchase test $q_{k^*}$ if $Y_{j^*} \hat{\pi}_{t+1} \leq V(k^*, \hat{\pi}_{t+1}, \Psi_1, \Psi_2)$.

When $\Psi_1 = 1$ and $\Psi_2 = \infty$, this model is the standard decision analysis algorithm (Raiffa, 1968). When $\Psi_1 = 1$ and $\Psi_2 = 1$, the model is the same as that reported on by Gorry et al (1973).

3. PILOT STUDY

The primary goal of the pilot research study of the Myopic Conservative Bayesian Decision Maker model is to demonstrate and gain experience with the use of the paradigm presented above in evaluating theoretical models of human decision making. A secondary goal is to evaluate the Myopic Conservative Bayesian Decision Maker model itself as an explanatory theoretical model for observed patterns of behavior. The study was carried out in three phases: (1) Creating the task environment, (2) Mapping the parameter space, and (3) Observing human decision makers.

3.1 CREATING THE TASK ENVIRONMENT

The following one-player simulation game was used as the basic framework for the task environment:

The decision maker takes the role of a veterinarian in an agricultural cooperative which is raising a large number of a new variety of poultry. The co-op is currently experiencing losses due to an unknown disease among the birds; the identity of the disease has been narrowed down to four possibilities, each of which is considered equally likely.
Because the poultry is a new variety, the decision maker has no relevant personal experience; however, he has access to a number of diagnostic tests with published findings (conditional probabilities) resulting from the application of the respective tests to birds of the variety in question known to have each of the four considered diseases. Each test has a cost which is several percent of the maximum expected profit from the venture. (All the inexpensive tests may be considered to have already been used in narrowing the search to four diseases.) Each disease has a specific treatment which results in a complete cure and a large profit when applied correctly, but in a total loss when applied in the presence of a different disease. In addition, there is a fifth, "broad spectrum" treatment which results in partial control of any of the diseases, and thus in an intermediate profit.

The decision maker will receive a bonus proportional to the profit made on the venture, net of the cost of any diagnostic tests used.

The principal equipment of the simulation game consists of (1) introductory material giving a background explanation plus the structure of costs and payoffs, and (2) a set of cards, each of which summarizes a particular test (in terms of its cost, possible outcomes, and conditional probabilities) or treatment (in terms of its payoff given each possible disease state). These materials, with the actual values used in the experiment, are exhibited in Appendix 3.

In terms of the formal model of a decision-making task, S is the set of four diseases, X is the set of five treatments, Y is the profit arising from each disease-treatment pair, and Q is the set of five cards giving information about each test ($q_k$), with its conditional probabilities ($F_k(r_k^{\ell}, \sigma_i)$).
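The correspondence between the game and the formal task can be made concrete with a small Python sketch of the payoff matrix Y and of the subjective expected payoff of each treatment. The payoff figures (1000 for a correct specific treatment, 0 for an incorrect one, 300 for the broad spectrum treatment) follow Table 1; the label "fourth disease" and the assumption that the fourth disease column follows the same pattern as the other three are introduced here for illustration only, as are the names DISEASES, PAYOFFS, expected_payoffs, and best_treatment.

```python
# Columns of every payoff row follow this order of disease states.
DISEASES = ["virus A", "virus B", "bacterium C", "fourth disease"]

# Payoff matrix Y: one row per treatment, one column per disease.
# Assumption: treatment D pays 1000 only against the fourth disease,
# mirroring the pattern of treatments A, B, and C.
PAYOFFS = {
    "A": [1000, 0, 0, 0],
    "B": [0, 1000, 0, 0],
    "C": [0, 0, 1000, 0],
    "D": [0, 0, 0, 1000],
    "X": [300, 300, 300, 300],  # broad spectrum treatment
}


def expected_payoffs(opinion):
    """Subjective expected payoff of each treatment for a given opinion.

    opinion[i] is the subjective probability of DISEASES[i].
    """
    return {
        name: sum(y * p for y, p in zip(row, opinion))
        for name, row in PAYOFFS.items()
    }


def best_treatment(opinion):
    """Treatment with the highest subjective expected payoff."""
    values = expected_payoffs(opinion)
    return max(values, key=values.get)


if __name__ == "__main__":
    prior = [0.25, 0.25, 0.25, 0.25]
    print(expected_payoffs(prior))
    print(best_treatment(prior))
```

Under these assumed values, each specific treatment is worth 250 at the equal prior while the broad spectrum treatment is worth 300, so a decision maker who purchased no tests at all would be expected to choose treatment X.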
This framework was selected because it is parallel to the important problem of cost-effective use of diagnostic procedures in human medicine, and yet is simple enough to be tractable in a small-scale pilot study of human decision processes. In particular, the agricultural setting was chosen in order to substitute simple monetary costs and benefits for the highly ambiguous value tradeoffs necessary in the treatment of human illness.

Once this general framework was established, the next step was to specify the values of the probabilities, costs, and payoffs involved. The goal of this phase of the study was to create a decision-making task that was sensitive to variations in the parameters of conservatism and myopia, and yet was simple enough that the human decision makers could safely be assumed to comprehend it. In addition, the task could not be so over-controlled that the only behavior possible was behavior predicted by the model, since such a situation would prevent any real test of the model's accuracy.

The last of these concerns was addressed first by creating a first approximation to the task with a very high degree of symmetry, so that only a few numbers must be learned by the subjects. The Myopic Conservative Bayesian Decision Maker was then implemented as a computer simulation program written in PASCAL on MSU's CDC 6500 computer (Appendix 2), and simulation experiments were performed using the simple task and various values of conservatism and myopia. The variation in the model's behavior was found to be too small, so changes were made to the costs, probabilities and payoffs and the simulation was repeated, until a task environment that appeared to meet all these requirements was found.
(This iterative simulation method of creating a task environment was chosen in preference to an analytic design of an optimal environment for the exercise of the Myopic Conservative Bayesian Decision Maker's equations in part to preserve maximum generality of the paradigm across models, since the model is treated as a "black box"; the process whereby the task environment becomes tuned to the model depends only on the model's behavior, not on its structure.) The final values of the costs, payoffs and probabilities are shown in Table 1.

Once the numeric values were established, the next step was to express them in a form suitable for presentation to human subjects. The manner of presentation which was developed is presented in Appendix 3. The two most important difficulties in creating a task environment for human subjects, comprehension and motivation, will be discussed below.

TABLE 1: COSTS, PAYOFFS AND PROBABILITIES
(TASK ENVIRONMENT FOR THE PILOT STUDY)

TEST A: COST = 100
                     IF ACTUAL DISEASE IS:
PROBABILITY OF:      VIRUS A : VIRUS B : BACTERIUM C : BACTERIUM D
POSITIVE RESULT:       .90   :   .20   :     .20     :     .20
NEGATIVE RESULT:       .10   :   .80   :     .80     :     .80

TEST B: COST = 100
                     IF ACTUAL DISEASE IS:
PROBABILITY OF:      VIRUS A : VIRUS B : BACTERIUM C : BACTERIUM D
POSITIVE RESULT:       .20   :   .90   :     .20     :     .20
NEGATIVE RESULT:       .80   :   .10   :     .80     :     .80

TEST C: COST = 100
                     IF ACTUAL DISEASE IS:
PROBABILITY OF:      VIRUS A : VIRUS B : BACTERIUM C : BACTERIUM D
POSITIVE RESULT:       .20   :   .20   :     .90     :     .20
NEGATIVE RESULT:       .80   :   .80   :     .10     :     .80

TEST D: COST = 100
                     IF ACTUAL DISEASE IS:
PROBABILITY OF:      VIRUS A : VIRUS B : BACTERIUM C : BACTERIUM D
POSITIVE RESULT:       .20   :   .20   :     .20     :     .90
NEGATIVE RESULT:       .80   :   .80   :     .80     :     .10

TEST V: COST = 100
                     IF ACTUAL DISEASE IS:
PROBABILITY OF:      VIRUS A : VIRUS B : BACTERIUM C : BACTERIUM D
POSITIVE RESULT:       .80   :   .80   :     .20     :     .20
NEGATIVE RESULT:       .20   :   .20   :     .80     :     .80

PAYOFFS FOR DISEASE/TREATMENT PAIRS:
                     IF ACTUAL DISEASE IS:
                     VIRUS A : VIRUS B : BACTERIUM C : BACTERIUM D
TREATMENT A :         1000   :    0    :      0      :      0
TREATMENT B :            0   :  1000   :      0      :      0
TREATMENT C :            0   :    0    :    1000     :      0
TREATMENT D :            0   :    0    :      0      :    1000
TREATMENT X :          300   :   300   :     300     :     300

Comprehension: The substantive focus of the cartographic paradigm is the study of how different people approach the resolution of the same decision-making task. The task has to be complex enough that the optimal solution path is not obvious, or all the subjects would respond in the same way; the demand effect of the task would have overcome their individual differences. On the other hand, if the task itself is so complex that the subjects do not fully understand it, they will only be able to respond to their own partial models of the task, and there is no reason to expect these partial models to be the same from subject to subject; in this sense, we are no longer observing different people's response to the same task. Thus, the task must be understood fully and equally by all subjects, while the solution must be so complex that they vary in their approaches to it. These concerns were in part addressed by using a highly symmetrical set of numbers and experimenting with the simulation program to make sure behavior is sensitive to differences in conservatism and myopia. The subject instructions and information cards in Appendix 3 are intended as a clear and redundant presentation of the task; in particular, the information cards provide a continuing reminder to reduce the danger of forgetting and misremembering.

Motivation: Just as classical decision analysis computes the objective value of information in terms of costs, payoffs, and Bayesian probabilities, the Myopic Conservative Bayesian Decision Maker judges the subjective value of information in terms of utilities and subjective probabilities. Thus, the accuracy of an experimental study of the model depends heavily on the accuracy with which the utilities of payoffs and disutilities of costs are determined.
In the preliminary experiments, a linear approximation to the utility of money is used. Since most theoretical and empirical utility curves are most nonlinear near zero (see Baumol, 1977), the variable payoff to the subject is restricted to the range from -$1.00 to +$1.00, and combined with a fixed payment of $5.00. In other words, the subject's total payoff is in the range $4.00 to $6.00, in order to reduce the distortion introduced by the linear approximation to the utility of money. In Experiment 4 and subsequent research using the paradigm, more sophisticated models of utility, such as that of Von Neumann and Morgenstern, will be substituted to increase the precision of the measurements.

3.2 MAPPING THE PARAMETER SPACE

In order to address the observer's goal of developing a system for classifying overt patterns of behavior in terms of the parameters of the theoretical model, the next step in the pilot study was the preparation of a map of the parameter space defined by conservatism and myopia. The map was prepared by running the simulation program with the task environment selected for experimentation (including the "rigged" algorithm for generating test results) and roughly fifty different conservatism/myopia pairs. These pairs were restricted to myopia less than five, since the amount of calculation required with myopia of four was so great that a longer planning horizon, at the full breadth of search assumed by the model, can be considered highly unlikely in unassisted human beings at the levels of motivation used. Conservatism was also initially restricted to be no more than 1.25, as no conservatism values approaching this limit have been reported in the literature.

Within these bounds, mapping was done separately for each of the four levels of myopia considered.
For a given value of myopia, widely separated conservatism values were first evaluated; then, for pairs of conservatism values leading to different sequences of overt behavior, intermediate values were examined to find the values at which behavior shifts from one pattern to the other. If a third pattern existed at conservatism values intermediate between the first two, the interpolation procedure automatically uncovered this fact, then found transition points delimiting all such behavior regions.

[FIGURE: map of the parameter space, with conservatism on one axis and myopia on the other and a legend giving the sequence of tests and the treatment for each behavior region. The figure body and caption are not recoverable from the extraction.]

[...] conservatism > .94 or myopia = 3 and conservatism > .95), the only behavior pattern that emerges under the conditions of Experiment 3 is: Test 5, Test 1, Test 2, Treatment B. This pattern matches the behavior of subject 21.

Two of the subjects in Experiment 3 whose behavior was not matched by the Myopic Conservative Bayesian Decision Maker, subjects 22 and 25, used a series of tests which differs from the pattern produced by subject 21 (and by the model when myopia = 2 and conservatism > .94 or myopia = 3 and conservatism > .95) only in the purchase by subjects 22 and 25 of an extra repetition of Test 2 before selecting Treatment B.
While no variation of conservatism and myopia alone could account for this behavior, a modified version of the model was able to match the pattern. In the modified version, a third parameter was introduced by varying the utility of a correct diagnosis. For simplicity, conservatism was held constant at 1.0. When myopia equaled three and the utility of a correct diagnosis and specific treatment was raised from its original value of 1,000 to 6,000, the revised model produced behavior matching subjects 22 and 25; this pattern was stable for utilities from 6,000 at least to 30,000 for this value of conservatism, while utilities from 3,000 to 5,000 led to behavior matching subject 21 and the original model (utility = 1,000, proportional to cash profit).

The behavior of the remaining subject, 23, cannot be accounted for by either the original model or the extended version; while the second purchase of Test 5 might or might not be indicated for some 3-tuple of conservatism, myopia and utility, the second purchase of Test 1 on iteration 4 occurs at a time when the evidence indicates that, for any positive value of conservatism, Virus B is more likely than Virus A. As a result, no matter what positive values are chosen for conservatism, myopia and utility, the Myopic Conservative Bayesian Decision Maker could never prefer Test 1 to Test 2 on the fourth iteration of the sequence of tests purchased by subject 23. In other words, whatever heuristics this subject used (such as "do all tests twice in succession to improve reliability"), these heuristics are located outside the realm of Bayesian strategy regardless of any modifications of conservatism, myopia or utility.

3.3.2 Refinements to the Task Environment

The results of the three preliminary experiments raised several questions about the conduct of the experiment and about the Myopic Conservative Bayesian Decision Maker.
The prevalence of behavior patterns difficult to justify under any principle of quasi-optimality implies a lack of comprehension among some of the subjects. The analysis of different levels of utility carried out using the data from Experiment 3 indicates that motivation may have been inadequately controlled. On the other hand, the deviation between the behavior of the subjects and that of the model may indicate that the conservative Bayesian information processing pattern, found by Edwards and his co-workers when subjects were given costless, experimenter-selected data, does not apply to the present situation where subjects select and pay for information.

Experiment 4 was designed to shed further light on each of these three issues. The subjects were twenty students from an undergraduate course in statistical methods in psychology; the experiment was conducted near the end of the term, so the subjects had recent and continuing exposure to elementary statistical concepts. To assure comprehension, the experimental session consisted of ten repetitions of a replication of Fried and Peterson's (1969) experimental task, which is similar in concept to but simpler than the present one, followed by ten repetitions of the game used in Experiments 1, 2, and 3. The first five of these repetitions of the experimental game were "just for practice," to allow the subject to gain familiarity with the consequences of various ways of responding without affecting his outcome, while the last five were played for lottery chances (see below). All ten repetitions of the Fried and Peterson replication, all five of the "practice games," and the first three out of the five games for lottery chances were played honestly according to the stated probabilities, but the last two games were rigged, unknown to the subject, so that test and treatment outcomes in the next-to-last game matched those in Experiments 1 and 2, and outcomes in the last game matched those in Experiment 3.
(A programming error resulted in one subject, #13, receiving erroneous results for the last two tests on the last game; however, this subject had already departed from the model's behavior before the first erroneous result.) Data were analyzed for the last two games only, to maximize subjects' comprehension of and familiarity with the experimental task environment.

In order to obtain better control of motivation, the payment schedule was also changed. Subjects received a flat cash payment of $4.00 for participation in the experiment, plus chances in two lotteries, each with a prize of $20.00. Each subject's chances in Lottery I were determined by his performance on the ten repetitions of the Fried and Peterson replication, and his chances in Lottery II were determined by his performance on the last five repetitions of the diagnosis and treatment game. The reason for the lotteries is that, while the utility of money varies in a nonlinear manner which differs from individual to individual and from time to time, Von Neumann and Morgenstern (1947; see also Raiffa, 1968) have shown that, given certain reasonable assumptions, the utility to a given individual of a chance in a given lottery, measured relative to the utilities of winning and of losing the lottery, is a linear function of the probability of winning. Thus, in the experiment, a subject received one chance in Lottery II for each $100 of net profit in the last five games, which means that, under the Von Neumann-Morgenstern assumptions, the subject's utility for "money" in the simulation game (net profit of the imaginary turkey farm) should be linear.

Questions concerning the applicability of the conservative Bayesian model itself to an information selection and purchase situation are indirectly addressed by the above-mentioned improvements in experimental control.
A more direct comparison between information processing in the present task environment and in the literature of conservatism (Edwards, 1968, 1972; Schum and Martin, 1968; Slovic and Lichtenstein, 1973) was made by asking the subject to mark a line calibrated from "completely impossible" to "absolutely certain" to show his subjective probability of the disease $\theta_j$ he considers most likely. This was done once per game, after a treatment had been selected but before treatment results were given to the subject. In this manner, a direct measure of introspected subjective probability was obtained by a replication of Edwards' general methodology; the principal difference between this subjective probability and those obtained by Edwards, Schum and Martin and others is that, in the present experiment, the subjects themselves have selected and paid for the data, whereas in the Edwards-type experiment the experimenter determined the amount and type of the data to be given to the subject, whose only task was to react to these data.

The experiment was conducted using two BASIC programs, one for the Fried and Peterson replication and one for the diagnosis and treatment game. The programs ran on Michigan State University's HP2000 ACCESS system, and the subjects interacted with the programs via a TeleType model 33 terminal.

TABLE 5
Comparison Between Behavior Patterns of Subjects in Experiment 4 and the Myopic Conservative Bayesian Decision Maker

                           Behavior on Game 10
                      O     P     Q     W   Nomatch   Total
Behavior       O:     1     1     0     0      0        2
on             P:     0     0     0     0      2        2
Game 9         Q:     0     0     2     3      1        6
               W:     0     0     0     0      0        0
         Nomatch:     0     1     0     2      4        7
           Total:     1     2     2     5      7       17

(Model behavior patterns R, S, T, U and V did not occur among the human subjects.)
TABLE 6
Matches, Near-Matches and Non-Matches of Subjects With the Model Given Consistent and Inconsistent Data

                   Consistent   Inconsistent
                      Data          Data        Total
Matches                20             0           20
Near-Matches          N/A             4            4
Non-Matches             7             3           10
Total                  27             7           34

3.3.3 Analysis of Economic Behavior

Table 5 shows the degree to which the behavior of 17 subjects in Experiment 4 conformed to that of the Myopic Conservative Bayesian Decision Maker. The remaining three subjects were excluded from analysis, one due to noncompliance and the other two due to obvious lack of comprehension of the task.

Each row of Table 5 corresponds to one class of behavior on Game 9, and each column corresponds to one class of behavior on Game 10. Behavior classes O, P, and Q are as defined in Figure 3: O means prescribing the broad-spectrum treatment with no testing; P means purchasing a specific test, receiving a positive result, and prescribing the corresponding specific treatment; and Q means purchasing test V, receiving a positive result, purchasing a specific-virus test A or B, receiving a negative result, and prescribing the specific treatment for the virus not tested for.

Because of the way that results are determined, pattern W can only occur on Game 10; it is the behavior sequence found in the third preliminary experiment and in the associated simulation runs with consistent data. The sequence of pattern W is as follows: purchase test V, receive a positive result, purchase a specific-virus test, receive a negative result, purchase the specific test for the remaining virus, receive a positive result, and prescribe the specific treatment for the latter virus. The remaining class, "Nomatch," refers to any pattern of observed human behavior which no choice of parameters for the Myopic Conservative Bayesian Decision Maker can produce.
One of the most striking things about these results is the fact that no subject who received inconsistent data matched any of the ways the model deals with inconsistent data (patterns R, S, T, U and V). Table 6 gives the number of matches, near-matches and non-matches between subjects and the model. A near-match is defined as a behavior pattern which diverges from a pattern generated by the model only after inconsistent data is received. For example, subject eleven on Game 9 begins with test 5, gets a positive result, purchases test 1, gets a negative result, and purchases test 2. Up to this point, he matches patterns R and S of the Myopic Conservative Bayesian Decision Maker. However, the result of test 2 under the conditions of Game 9 is negative, which implies to the subject that one of the three tests is in error. The model continues with test 3 in both patterns R and S, but subject eleven departs from the model by purchasing a repetition of test 5.

Note that, while any subject who purchased at least two tests on Game 9 received a test result which was in fact an "error," that is, the result that was less likely given the actual disease, only six such subjects purchased a collection of tests whose results were internally inconsistent. The seventh case of inconsistent data occurred on Game 10 as a result of a programming error.

In general terms, the results displayed in Table 6 indicate that, while the assumptions of the Myopic Conservative Bayesian Decision Maker may apply to some human decision makers at the levels of training and motivation used in the experiment when the test results are consistent, no evidence in either Experiment 4 or the three preliminary experiments supports the application of the model in the case of inconsistent test results.
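The three-way classification used in Table 6 can be pictured with a short Python sketch. This is only an illustration of the definitions given above, not the procedure actually used in the study: the function name, the way actions are encoded, and the truncated stand-in for patterns R and S are all assumptions made for the example.

```python
def classify_game(subject_actions, model_patterns, inconsistent_at=None):
    """Classify one game as 'match', 'near-match' or 'non-match'.

    subject_actions: the subject's purchases and final treatment, in order.
    model_patterns:  action sequences the model can generate.
    inconsistent_at: 0-based index of the action whose result first made the
                     subject's data internally inconsistent, or None if the
                     data stayed consistent throughout the game.
    """
    subject = tuple(subject_actions)
    patterns = [tuple(p) for p in model_patterns]
    if subject in patterns:
        return "match"
    if inconsistent_at is not None:
        # Every action up to and including the one that produced the
        # inconsistent result was chosen before the inconsistency was
        # visible, so all of them must agree with some model pattern.
        prefix = subject[: inconsistent_at + 1]
        if any(p[: len(prefix)] == prefix for p in patterns):
            return "near-match"
    return "non-match"


# Hypothetical, truncated stand-in for the shared start of patterns R and S.
pattern_r_s_start = ("test 5", "test 1", "test 2", "test 3")

# Subject eleven on Game 9, as far as the text describes his choices.
subject_eleven = ("test 5", "test 1", "test 2", "test 5")

# The negative result on test 2 (index 2) is what exposes the inconsistency.
print(classify_game(subject_eleven, [pattern_r_s_start], inconsistent_at=2))
# prints: near-match
```

Without the inconsistency index, the same sequence would count as a plain non-match, which is the distinction Table 6 draws between its second and third rows.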
To be of practical or theoretical use, however, a model of this type must not only be able to model an individual's behavior on isolated instances of decision-making; it must also capture the stable underlying parameters of the information-processing strategy the individual is using. In the case of the Myopic Conservative Bayesian Decision Maker, these parameters are assumed to be conservatism and myopia. While examination of the subjects' choices in the games and impressions of their comments imply that major learning is still occurring after 8 or 9 games and thus fully stabilized strategies are not to be expected, it is nevertheless instructive to examine the degree of consistency from Game 9 to Game 10 shown in the present experiment.

In Table 5 it can be seen that one of the two subjects who used behavior pattern O on Game 9 continued using pattern O on Game 10, neither of the two subjects who used behavior pattern P on Game 9 continued using pattern P on Game 10, and two of the six persons who used behavior pattern Q on Game 9 continued using pattern Q on Game 10. Examination of the raw data revealed that both of the two subjects who failed to match the model on Game 9 but matched Pattern W on Game 10 were "near matches" on Game 9 who diverged from pattern W only after receiving inconsistent data; thus, these two subjects can be considered "stable" in the same sense as the three subjects whose overt behavior was identical on the two games.

Of the 17 subjects in the analysis, 7 were not even near-matches on at least one of the two games, and thus the question of stability is undefined. 5 subjects used behavior patterns corresponding to different regions of parameter space on the two games; thus, while their behavior on each individual game fit the model, no single (conservatism, myopia) pair could account for both their behavior on Game 9 and their behavior on Game 10.
For the remaining 5 "stable" subjects, however, their behavior on Game 9 (or at least their behavior up to the receipt of inconsistent test results) comes from the same region of the model's parameter space as their behavior on Game 10.

[FIGURE 3: COMPARISON OF SUBJECTIVE AND BAYESIAN PROBABILITIES. Scatter plot of subjective probability (vertical axis, 0.0 to 1.0) against Bayesian probability (horizontal axis, 0.0 to 1.0); + indicates coincident points. The plotted points are not recoverable from the extraction.]

3.3.4 Analysis of Subjective Probabilities

The subjective probability of the subjectively most likely disease was elicited in each game by asking the subject to mark a line calibrated from "totally impossible" to "totally certain" after the subject had selected a treatment but before he learned the results of that treatment. In Figure 3, this probability is compared with the objective Bayesian probability of the same disease given the test results received in the course of that game. Plus signs (+) indicate two occurrences of the same subjective probability -- Bayesian probability pair.

The most important thing that is shown by the graph is that less than a third of the points lie below the diagonal line which represents subjective probability equal to Bayesian probability. This contradicts the finding of Schum and Martin (1968) that subjects given probabilistic data regarding the likelihood of several possible states tended to give subjective probabilities for the most likely state that were lower than the Bayesian probability of that state; the present result also contrasts with the findings of the many reports of research in the two-alternative case reviewed by Slovic and Lichtenstein (1973).
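The objective Bayesian probabilities used in this comparison follow from the Table 1 values by repeated application of Bayes' rule, starting from the equal prior over the four diseases. The Python sketch below shows the calculation for behavior pattern Q; it is an illustration only, not the BASIC or PASCAL code used in the study, and the function and variable names are invented for the example.

```python
# Probability of a POSITIVE result for each test, transcribed from Table 1.
# Disease order: Virus A, Virus B, Bacterium C, Bacterium D.
DISEASES = ["Virus A", "Virus B", "Bacterium C", "Bacterium D"]
P_POSITIVE = {
    "A": [0.90, 0.20, 0.20, 0.20],
    "B": [0.20, 0.90, 0.20, 0.20],
    "C": [0.20, 0.20, 0.90, 0.20],
    "D": [0.20, 0.20, 0.20, 0.90],
    "V": [0.80, 0.80, 0.20, 0.20],
}


def bayes_update(prior, test, positive):
    """Return the posterior over the four diseases after one test result."""
    likelihood = [p if positive else 1.0 - p for p in P_POSITIVE[test]]
    unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def posterior_after(results):
    """Process (test, positive) pairs in order, starting from the equal
    prior stated in the task description."""
    belief = [0.25] * 4
    for test, positive in results:
        belief = bayes_update(belief, test, positive)
    return belief


# Pattern Q: test V comes back positive, then test A comes back negative.
pattern_q = posterior_after([("V", True), ("A", False)])
most_likely = max(range(4), key=lambda i: pattern_q[i])
print(DISEASES[most_likely], round(pattern_q[most_likely], 3))
# prints: Virus B 0.615
```

The same function gives 0.6 for pattern P (a single positive specific test) and about 0.878 for pattern W, which are the horizontal-axis values against which a subject's marked line would be compared.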
The next question to be considered is whether the subjective probability the subject expressed by marking the line accurately measures the actual salience of the corresponding state for decision-making, given the assumptions of the theoretical model being evaluated. This question cannot be investigated for the 14 cases for which the Myopic Conservative Bayesian Decision Maker failed; also, the question is trivial for the three cases in which subjects prescribed the broad-spectrum treatment without purchasing any tests. The remaining 17 cases are divided among the three behavior patterns P, Q, and W. (Each point in parameter space corresponds to a value for the subjective probability after the results of the tests purchased by the model have been received and processed one at a time using the model's Conservative Update Module.) The range of parameter values for which the pattern occurs defines a range of subjective probabilities the pattern is capable of generating; numeric searching within that range can provide a point estimate for the conservatism which, given the actual test results received, would produce a particular subjective probability within that range.

The analytical methods used by Schum and Martin and by others to estimate conservatism rely on the fact that data are presented all at once as a single compound event, which simplifies the mathematics considerably, compared with the complex order and interaction effects which occur when data are presented one at a time and must be processed before the next data-generating event can occur. In the Myopic Conservative Bayesian Decision Maker, one-datum-at-a-time processing tends to result in subjective probabilities which deviate from the Bayesian in the same direction as and to an equal or greater degree than processing data as a single compound event. In the Bayesian special case where conservatism equals 1, the two methods yield the same result.
In the task environment of the present study, Pattern P can result in any subjective probability of .55 or greater; all four of the subjective probabilities reported in games in which Pattern P occurred were within this range. Pattern Q can result in a subjective probability between .45 and .59; for conservatism of 2.68 or more, well beyond the range originally mapped, pattern Q can also occur and produce subjective probabilities greater than .96, but none of the eight games in which subjects' behavior matched Pattern Q resulted in reported subjective probabilities in either of these ranges. Finally, Pattern W can lead to subjective probabilities between .60 and .96, and the reported subjective probabilities in four out of the five games matching Pattern W fell within this range. Overall, seven out of seventeen games had a reported subjective probability consistent with the parameter region of the Myopic Conservative Bayesian Decision Maker which gives rise to a behavior pattern matching the subject's behavior. These results imply that, whether or not the Myopic Conservative Bayesian Decision Maker is an adequate model of that part of a subject's cognitive processes which leads to his choices of tests and treatments, it does not appear promising in its present form as a model of the process which generates his introspective report of subjective probability.

4. PRACTICAL AND SCIENTIFIC IMPLICATIONS

The primary purpose of this chapter is to evaluate the potential of the cartographic research paradigm presented in this dissertation as one means for improving our knowledge of the similarities and differences between human decision makers, especially between experts and neophytes. A secondary purpose is to evaluate the particular theoretical model used as an example in this study, the Myopic Conservative Bayesian Decision Maker.
Section 4.1 will analyze the results of the pilot study as an evaluation of the Myopic Conservative Bayesian Decision Maker and as an example of the use of the cartographic paradigm. Section 4.2 presents a sampling of alternative theoretical models suitable for use with the paradigm in future research projects. Section 4.3 outlines some promising avenues for future research using the cartographic paradigm or extensions of it together with theoretical models based on those presented in Section 4.2. Finally, Sections 4.4 and 4.5 discuss the potential benefits to professional education and to decision support systems design that would arise from a long-term research effort built on the foundations established in this dissertation.

4.1 IMPLICATIONS OF THE PILOT STUDY

4.1.1 Substantive Implications: Evaluation of the Model

The empirical findings of the pilot study reveal three important shortcomings of the Myopic Conservative Bayesian Decision Maker at the low-to-moderate levels of experience and motivation examined. These shortcomings are: a total lack of fit between model and behavior when inconsistent test results are received; a low rate of fit between model and behavior when consistent test results are received; and self-reports of subjective probability consistently greater than predicted by the theory underlying the model.

As discussed in the previous chapter, the subjects who received inconsistent test results can be divided into two classes, depending on whether their behavior diverged from the model's before or after enough tests had been purchased to make the inconsistency apparent (i.e., a positive and a negative on repetitions of the same test, or a positive on Test V and negatives on both specific-virus tests).
The implications for the Myopic Conservative Bayesian Decision Maker of the subjects who diverged before receiving inconsistent results must logically be analyzed together with those who diverged without inconsistent results, as neither group's divergence can be attributed to any effect specific to inconsistent test results. The remainder, who diverged only after receiving an inconsistent set of test results, consists of eight subjects in the first two preliminary experiments, and four subjects in Game 9 of Experiment 4. While their behaviors show a variety of patterns, the data strongly suggest that when a person receives information which contradicts his expectations, this information is processed in a very different way from other information. For example, some subjects appear to be reacting in these experiments by "backtracking," returning to an earlier stage by discounting one or more tests to reduce the cognitive inconsistency. This issue will be further discussed in Section 4.2 as it affects the choice of components for new candidate theoretical models to replace the Myopic Conservative Bayesian Decision Maker in future research.

Looking only at behavior sequences without inconsistent test results, the data indicate some learning effect. In the preliminary experiments, eight subjects departed from the model without having received inconsistent test results and only four subjects made choices that the model could match. The more experienced subjects in Experiment 4 departed from the model ten times before receiving inconsistent test results and matched the model twenty times. In other words, after the cases of divergence following inconsistent test results discussed in the previous paragraph are removed, two thirds of the subjects with no prior experience on the game diverged from the model while only one third of the games played by the experienced subjects in Experiment 4 diverged.
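Written out from the counts just given, the two proportions are as follows; the first counts subjects in the preliminary experiments and the second counts games in Experiment 4.

```latex
\[
\frac{8}{8 + 4} = \frac{2}{3}
\qquad\text{and}\qquad
\frac{10}{10 + 20} = \frac{1}{3}.
\]
```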
These results do not preclude the existence of higher levels of experience at which virtually all games would match one or another of the behavior patterns the model can generate under the important restriction of no inconsistent test results; however, the Myopic Conservative Bayesian Decision Maker does seem clearly inadequate for its original purpose of analyzing the differences between expert and neophyte decision makers.

The reason for this failure may be partially elucidated by examining the data on subjective probabilities. The Conservative Update Module of the Myopic Conservative Bayesian Decision Maker is based on the assumption that the same kind of information processing is used in evaluating sources of future information and in evaluating information already received. Furthermore, it assumes that information processing in the game used in this research is of the same kind as was elicited in the research paradigm developed by Edwards (1968, 1972) and used by Schum and Martin and others, in which the subjects have no choice in the amount and nature of the data they receive. As the results in the previous chapter show, the self-reported subjective probabilities are not in general within the ranges predicted by the model. More significantly, the central finding of previous work on subjective probability upon which the model is based, conservatism, is strongly contradicted by the present data. The only element of the present experimental task which distinguishes it from previous studies in this area (Slovic and Lichtenstein, 1971) is the fact that subjects select and pay for data rather than having it selected and given to them by the experimenter. An important question for future research, as discussed in Section 4.3.1 below, is the separation of the effects of paying for data from the effects of the fact that the data sources were selected by the subject.
4.1.2 Methodological Implications: Evaluation of the Paradigm

The empirical research reported on in this dissertation is an example of the use of the cartographic paradigm to evaluate one particular candidate theoretical model, the Myopic Conservative Bayesian Decision Maker. This two-parameter model was derived from well-established prior research, namely Edwards' and Schum and Martin's findings on conservatism and Gorry's myopic variant of decision analysis. The model was formalized in a simulation program; the map of parameter space from which the "cartographic paradigm" takes its name was then prepared by a series of simulation runs which found the critical parameter values that mark the boundaries between regions leading to different behavior patterns in the task environment used in the study. The behavior patterns of human subjects confronted with the same task environment were then examined to determine which region in parameter space, if any, corresponded to each human behavior pattern.

This research, which followed the original design of the study according to the cartographic paradigm, revealed several important facts about the model: (1) the model is inadequate to account for behavior subsequent to the receipt of an internally inconsistent set of test results; (2) in the absence of inconsistent test results, the model performs substantially better with experienced than with inexperienced decision makers, but it still fails to match a third of the experienced decision makers' patterns even when inconsistent test results are excluded; and (3) 16 out of the 29 cases of behavior that matched the model, and thus could be categorized as to conservatism and myopia, correspond to regions in parameter space with myopia = 2 and conservatism 0.65 or greater (although 6 of these 16 cases showed Pattern h, which also occurs for myopia = 3 when conservatism is greater than 0.96).
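The map-building step described above, in which simulation runs locate the critical parameter values separating regions of different behavior, can be sketched in Python as follows. This is a sketch under stated assumptions, not the procedure actually used: `pattern_for` is a hypothetical stand-in for one run of the simulation program, the toy thresholds are invented for illustration, and the use of a coarse grid followed by bisection is only one plausible way of finding the boundaries.

```python
def find_critical_values(pattern_for, myopia, lo=0.0, hi=1.0, grid=100, tol=1e-6):
    """Locate conservatism values at which the behavior pattern changes.

    pattern_for(conservatism, myopia) must return a hashable label for the
    sequence of information purchases and final choice (hypothetical API).
    Returns a list of (critical_value, pattern_below, pattern_above).
    A coarse grid can miss regions narrower than one grid step.
    """
    boundaries = []
    step = (hi - lo) / grid
    left, left_pattern = lo, pattern_for(lo, myopia)
    for k in range(1, grid + 1):
        right = lo + k * step
        right_pattern = pattern_for(right, myopia)
        if right_pattern != left_pattern:
            # Bisect inside the grid cell to refine the boundary.
            a, b = left, right
            while b - a > tol:
                mid = (a + b) / 2
                if pattern_for(mid, myopia) == left_pattern:
                    a = mid
                else:
                    b = mid
            boundaries.append(((a + b) / 2, left_pattern, right_pattern))
        left, left_pattern = right, right_pattern
    return boundaries


# Invented thresholds, purely for illustration; they are not study results.
TOY_THRESHOLDS = {1: [0.25, 0.7], 2: [0.5]}


def toy_pattern_for(conservatism, myopia):
    """Toy stand-in for a simulation run: label the region by index."""
    count = sum(conservatism >= t for t in TOY_THRESHOLDS[myopia])
    return "P%d" % count


if __name__ == "__main__":
    for m in sorted(TOY_THRESHOLDS):
        print(m, find_critical_values(toy_pattern_for, m))
```

Each pair of adjacent critical values at a given level of myopia then delimits one region of the map, and an observed human behavior pattern can be compared against the label attached to each region.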
These three findings are sufficient to show the paradigm, as opposed to the model, to be a success; the paradigm has exposed the flaws in an otherwise plausible model, and has identified the parameter region which warrants closest study if the model, or another model using a parameter space including conservatism and myopia among its dimensions, is used in future research.

Experiment 4 also included an additional measure, self-reported subjective probability. This measure was not called for by the cartographic paradigm, nor does it have any part in the analysis based on that paradigm. However, self-reported subjective probability served as a useful supplement to the study in examining the reasons for the model's failure; the lack of conservatism in the data may well be the most important substantive finding of the pilot study. In future studies, it should be possible to incorporate more sophisticated direct measures of subjective probability into the economic behavior considered by the cartographic paradigm. Edwards (1967) gives an example of direct inferences of subjective probability from a laboratory exercise of economic choice.

4.2 ALTERNATIVE THEORETICAL MODELS

4.2.1 Update Models

If a theoretical model seeks to predict or explain the process whereby a person selects costly information upon which to base a decision, the model must consider the person's state of belief about the state of the world, and how that state of belief changes as new information about the world is received. This state of belief is formalized in Chapter 2 as the salience vector in axiom (D2); the generalized concept is operationalized in the Myopic Conservative Bayesian Decision Maker as a subjective probability vector, which is updated by the Conservative Update Module.
DOCS, the Doctor Simulation System (Chan, 1979), satisfies most of the requirements placed upon a theoretical model by the cartographic paradigm; it is a simpler model than the Myopic Conservative Bayesian Decision Maker, although it is embedded in a much richer task environment. In DOCS, the salience of a disease is equal to the number of a patient's known symptoms which are also symptoms of that disease, divided by the total number of the patient's known symptoms. Provision is made for weighting the importance of symptoms, but this is not used in the current version. The generality of the weighting mechanism may make DOCS a useful testbed for alternative update models in future research, as discussed in Section 4.3.2 below.

Most other non-Bayesian formal models of opinion change, such as that of Wallsten (1976) or the regression model reviewed by Slovic and Lichtenstein (1971), address only the relative attractiveness of two hypotheses. It is frequently asserted (e.g. Wallsten, page 196) that the two-alternative case can be readily generalized to several alternatives; however, even the highly simplified task environment used in the present research demonstrates that this is not necessarily the case when the availability of tests such as Test V organizes the possible states of the world into hierarchies and/or subsets. Thus, published two-alternative models require analytic extension and empirical testing in the multiple-alternative case before they can be used as a component of a model of choice of information. Nevertheless, this is an important avenue for future research; models like Wallsten's, developed from the start as models of actual human behavior, promise greater accuracy than models such as the Myopic Conservative Bayesian Decision Maker, which are derived from a formula such as Bayes' rule that is optimal only when all of its restrictive assumptions are met and information processing is essentially unlimited.
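The unweighted DOCS salience rule stated above (shared known symptoms divided by all known symptoms) is simple enough to illustrate directly. The following Python sketch is an illustration of that ratio only; the function name, the data layout, and the symptom and disease labels are hypothetical and are not taken from DOCS itself.

```python
def docs_salience(known_symptoms, disease_symptoms):
    """Compute an unweighted salience for each disease.

    known_symptoms   -- iterable of the patient's known symptoms
    disease_symptoms -- dict mapping disease name to its set of symptoms
    Salience = (known symptoms shared with the disease) / (all known symptoms).
    """
    known = set(known_symptoms)
    if not known:
        # With no known symptoms the ratio is undefined; report zero salience.
        return {disease: 0.0 for disease in disease_symptoms}
    return {
        disease: len(known & set(symptoms)) / len(known)
        for disease, symptoms in disease_symptoms.items()
    }


if __name__ == "__main__":
    # Hypothetical labels, used only to exercise the function.
    diseases = {
        "disease_1": {"s1", "s2", "s3"},
        "disease_2": {"s2", "s4"},
    }
    print(docs_salience({"s1", "s2"}, diseases))
```

Note that, unlike a probability vector, these saliences need not sum to one across diseases, since a single symptom may count toward several diseases at once.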
Another drawback to the use of Bayesian formulas, conservative or otherwise, to model human behavior when information is received sequentially has to do with information which contradicts the currently favored hypothesis. Bayes' rule and its variants handle confirming and disconfirming information in the same way, algebraically combining the odds ratio or likelihood ratio of the new data with the prior odds or probability. The results of the pilot experiment, as well as much published research (e.g. Abelson et al., 1968), indicate that humans, whether naive or expert, can react in quite different ways to confirming versus disconfirming data. One promising line of research is backtracking, where a mutually inconsistent set of test results is discounted or disregarded to return the subject's opinions to an earlier, consistent state. The literature of artificial intelligence (e.g. Sussmann, 1977) and cognitive consistency theory (Abelson et al., 1968) will be useful sources for future models incorporating this phenomenon.

4.2.2 Choice Models

The choice module of any theoretical model within the cartographic paradigm is what converts the opinion state maintained by the update module into the observable choices of information sources and final alternative which form the observer's data. The choice module must handle three kinds of decision: whether to continue seeking information or make a final choice; what kind of information to purchase next; and what final alternative to select.
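The three decisions just listed can be made concrete with a small Python sketch of a choice module that looks only one test ahead. This is a hedged illustration, not the choice module of any model discussed in this dissertation: the function names, the data layout, the use of a plain Bayesian update, and the treatment of payoffs as directly additive with test costs are all assumptions made for the example.

```python
def expected_value(belief, payoff_row):
    """Expected payoff of one alternative under the current belief."""
    return sum(p * u for p, u in zip(belief, payoff_row))


def best_alternative(belief, payoffs):
    """Return (alternative, expected payoff) for the best final choice."""
    best = max(payoffs, key=lambda a: expected_value(belief, payoffs[a]))
    return best, expected_value(belief, payoffs[best])


def choose_next_action(belief, payoffs, tests):
    """Decide whether to stop or which single test to buy next.

    belief  -- probability of each state of the world
    payoffs -- dict: alternative -> payoff in each state
    tests   -- dict: test name -> (cost, {result: [P(result | state), ...]})
    Returns (("stop", alternative) or ("test", name), expected value).
    """
    final_choice, stop_value = best_alternative(belief, payoffs)
    action, value = ("stop", final_choice), stop_value
    for name, (cost, outcomes) in tests.items():
        test_value = -cost
        for likelihoods in outcomes.values():
            joint = [p * l for p, l in zip(belief, likelihoods)]
            p_result = sum(joint)
            if p_result == 0:
                continue  # this result cannot occur under the current belief
            posterior = [j / p_result for j in joint]
            # Value of the best final choice if this result were observed.
            test_value += p_result * best_alternative(posterior, payoffs)[1]
        if test_value > value:
            action, value = ("test", name), test_value
    return action, value


if __name__ == "__main__":
    payoffs = {"treat_1": [100, 0], "treat_2": [0, 100]}
    tests = {"T": (10, {"pos": [0.9, 0.1], "neg": [0.1, 0.9]})}
    print(choose_next_action([0.5, 0.5], payoffs, tests))
```

In this sketch the stopping decision, the choice of test, and the final alternative all fall out of a single comparison of expected values; a deeper search would repeat the same comparison recursively on each posterior.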
Essentially all the published work in the area that uses explicitly priced information is based on the decision analysis algorithm: Raiffa (1968) and other authors advocate the full decision analysis algorithm as a normative model; Gorry et al. (1973) use a myopic version of the algorithm for a project in computer-assisted medical decision-making; and Fried and Peterson (1967) compare human behavior with the optimum in the "optional stopping" task environment, in which the only real decision is when to cease purchasing repeated samples of the same binary test and make a final choice between two alternatives.

When information is not explicitly priced, the decision to cease testing and select a final alternative may be made by a threshold salience for the most salient state, or by defining some test results or combinations thereof as leading to "certainty;" in the medical literature, this is referred to as a "pathognomonic" set of signs and symptoms. In most research on choices of information, it is assumed that the final choice of alternative is relatively straightforward once the set of "decision premises" has been determined: "Given a complete set of value and factual premises, there is only one decision consistent with rationality" (Simon, 1957). Thus, the key to the final choice of alternative lies almost entirely in the choices of information sources.

One characteristic of the decision analysis algorithm is that the choice of each test is made based on the outcomes of every previous test; this feature is accentuated when myopia is introduced, because the n'th choice in the sequence is not even hypothetically analyzed at its maximum depth of search (myopia) until after the results of the (n-1)'th test have been received. However, a promising heuristic for reducing the decision-making burden is to schedule tests in batches, rescheduling only if the batch is exhausted without yielding sufficient premises for a final choice of alternative.
This heuristic is used in DOCS as well as in other published normative and descriptive models in medical decision making, especially with regard to laboratory tests (Pau, 1974). The need for research on the use of this heuristic in future studies involving the cartographic paradigm is underscored by the tendency of some subjects in the preliminary experiments to make requests for information such as "Test A and Test B" -- as this heuristic was not the focus of the pilot study, these subjects were simply told "one at a time, please," but a future task environment could easily be constructed which would allow batching. (The structure of Experiment 4 automatically precluded such requests.)

Another aspect of human behavior, which Fried and Peterson demonstrated in a simple task environment and which may well have played a role in the behavior of the subjects in the present experiment, is premature termination. Neglecting fixed-budget effects not objectively present in the task environment, optimal strategies call for decisions based only on the present state of knowledge and the future costs, payoffs and probabilities. However, humans tend to resist "throwing good money after bad;" thus, a series of tests which objectively cancel each other out and should return the subject to his starting point tends instead to make him cease testing, even if this means applying the broad-spectrum treatment. Any complete theoretical model of human behavior must take into account this human trait, which is probably useful more often than not outside the laboratory, where budget restraints do exist and the value of the information sources themselves (i.e. the conditional probabilities) is only imperfectly known.

4.3 LONG-TERM RESEARCH PROGRAM

The principal goal of this dissertation has been to develop and to demonstrate the use of a new methodology for the study of human decision-making processes, the "cartographic paradigm."
This section deals with three areas of future research which will build directly upon this work: laboratory experiments to clarify some issues raised by the pilot study and to develop new theoretical models; a large ongoing research program in problem-solving in an ill-structured clinical environment in the MSU College of Human Medicine; and a program of applied research in management information and decision in association with the management gaming program of the Georgia State University College of Business Administration.

4.3.1 Laboratory Experiments

The pilot study raises some important issues in the area of subjective probability, which can be studied using some relatively simple laboratory exercises. As discussed in Chapter 3, subjects in Experiment 4 indicated (by marking a calibrated line) higher levels of confidence than were justified by Bayes' rule. In a large number of earlier studies reported by Edwards and others, confidence levels below the Bayesian were found. The only elements which differentiate the present experimental task from those of earlier studies are the fact that the data presented to the subject were of the subject's own choosing (that is, the subject decided which test to call for; obviously, he did not choose whether the test was to have a positive or negative result), and the fact that the subject was required to pay for the data. Clearly, research is needed in which classic experiments in the literature of conservatism are replicated while varying these two factors of choice and payment.

The finding may also have some implications for the theory of organizational behavior; it is well known that an individual who participates in a decision-making process, especially in a small group setting, tends to be more committed to the ensuing decision (Vroom, 1961).
Without denying the obvious sociological elements in this facilitation effect, the findings in the present study imply that a person who has invested time and effort in determining some of the premises upon which the group decision was based may have higher confidence in the correctness of the decision by virtue of that fact. This conjecture may be tested by designing and carrying out a series of experiments contrasting

   (a) group decisions based on group-determined premises;
   (b) group decisions based on individually-determined premises;
   (c) group decisions based on externally-determined premises;
   (d) individual decisions based on group-determined premises;
   (e) individual decisions based on individually-determined premises; and
   (f) individual decisions based on externally-determined premises.

Another avenue for laboratory research building upon this dissertation is the development of new theoretical models based on the concepts outlined in Section 4.2. None of these concepts are as fully developed as the conservative Bayesian update formula and the myopic decision analysis algorithm, which is why the latter concepts were used in the pilot study; thus, the first step is to explore the extensions that published models need for use in a richer task environment with explicit costs and benefits. For example, research is currently underway in collaboration with Dr. Chan involving a replication of Fried and Peterson's (1969) optimal stopping experiment augmented with self-report of subjective probabilities (Whalen and Chan, 1979). Future research along these lines will include extending Wallsten's (1976) algebraic models to handle sequential receipt of information, and modifying DOCS (Chan, 1971) to operate with less user intervention. This research is expected to lead to a new generation of theoretical models suitable for use in the two large research projects described in Sections 4.3.2 and 4.3.3 below.
4.3.2 Problem-Solving Research in a Medical Environment

The cartographic paradigm will be one of many elements of the continuing program of research into structural and behavioral aspects of ill-structured problem solving being carried out within the Michigan State University College of Human Medicine (Chan, 1971a,b; 1978; Chan and Whalen, 1979; Whalen and Chan, 1979). In the next phase of this research, practicing physicians and medical students will be the subjects for experiments using a very complex task environment of medical problem-solving.

The data to be collected in these experiments consist of "decision traces," which correspond roughly to the "behavior patterns" described in this dissertation, augmented with two kinds of self-reported confidence level. The decision traces will be recorded from the interactions between individual physicians or medical students and a computerized patient case record system, with a special process tracer acting as front end to collect the data.

One of the key phases in the analysis of these data is a cluster analysis of decision traces, to find a collection of distinct canonical strategies which characterize fundamental ways of approaching a problem. Before such a cluster analysis can be carried out, however, a metric for the distance between pairs of decision traces is needed. This metric will be defined primarily in terms of a "deep metric" over an underlying parameter space, which will have dimensions comparable to conservatism and myopia plus additional dimensions to be determined by an interdisciplinary team of information scientists and medical educators. Computer-assisted role playing will then be used, in a manner analogous to the simulation runs in this dissertation, to find the decision traces which arise from the important "landmarks" in the parameter space.
Finally, a "surface metric" will be used to locate observed decision traces relative to these landmarks by identifying trivial order effects or other unimportant differences. The reason for the reliance on the "deep metric" and the minimization of the role of the "surface metric" lies in the occasions when a slight shift in strategy, or an arbitrary choice among alternatives to which a single strategy is indifferent, leads to a different choice early in the decision trace, which in turn leads to sharply divergent surface behavior. In such a case, the two traces would seem far from each other according to the surface metric, but they would be close to respective members of a pair of landmarks known, via the deep metric, to be close to each other.

4.3.3 Applied Research Using Management Gaming

The management gaming programs of Georgia State University (Nichols and Schott, 1973; Thompson, 1978) have a strong component of decision support systems which students may optionally choose to use to aid them in making decisions in a simulated competitive business environment. At present, no record is kept of the use of these decision support systems, but preliminary conversations with the persons involved in maintaining them indicate that it is feasible to modify the programs to collect some very useful data about the information sources and statistical transformations used by students in management courses at the graduate and undergraduate levels. Knowledge obtained from comparing these data with data on the choices and performance of the students in the games themselves can then lead to changes in the game structure in the direction of greater realism, which would be of benefit both to research in decision-making and to the education of the students using the games.
4.4 IMPLICATIONS FOR PROFESSIONAL EDUCATION

One of the more difficult aspects of professional education in many fields has to do with helping the student acquire competent judgement in the use of costly, imperfect sources of data. Data sources may have a clear monetary cost, such as diagnostic tests in medicine, destructive testing in engineering and market survey research in business, and there may be risks involved in obtaining the data, such as in exploratory surgery or test-piloting experimental aircraft. A more insidious cost of information is the time and effort required to digest the information once it is on hand, which often outweighs the benefits even if the information is obtained at zero marginal cost.

In principle, the decision analysis algorithm is the optimal solution to the problem of costly imperfect data; however, in many if not most practical cases the cost of applying the algorithm itself, including the data acquisition and interpretation cost of obtaining the high-quality numeric data required by the algorithm, exceeds the potential benefits, if the algorithm is feasible at all. Professional decision makers, on the other hand, successfully solve such problems on a near-routine basis. The difficulty in training their successors or expanding their number arises because the ability to solve problems does not imply the ability to explain how one solves them in a manner that can readily be put into practice by a student.

Learning by example and by supervised practice can never be superseded entirely, nor should they be in a responsible profession. However, an improvement in the language with which we talk about this judgement process can greatly improve the students' preparedness to benefit from example and practice.
The potential of a program of research carried out under the cartographic paradigm introduced in this dissertation is to develop and validate theoretical models and to localize expert decision processes to regions within the psychologically meaningful parameter spaces of those models, using the actual behavior of expert decision makers in a standardized task environment instead of their less reliable introspections about why they chose that behavior. Students' strategies could then be localized elsewhere in the parameter space, and the ways that the students should change if they want to become more like the experts could be explained in terms of the parametric dimensions of the space.

4.5 IMPLICATIONS FOR DECISION SUPPORT SYSTEMS

The kind of information about human decision-making processes that would be obtained from a program of research using the cartographic paradigm would be of benefit to designers of organizational and automated decision support systems in two ways. First, any methodology which improves our ability to predict what information a decision maker will find useful can be used to design systems that make that information more readily available -- and, equally important, that separate the useful information from the mass of data which is not useful in the particular decision process used by the decision maker.

The second potential contribution is more specifically linked to the cartographic paradigm per se. If the region of parameter space in which a decision maker habitually operates is near, but outside, a region which leads to appreciably better results on the average, then it may be possible to supply artificial aids which are tailored to the decision maker's own characteristics and designed to help him attain higher performance without interfering with his judgement and responsibility. An example of this philosophy can be found in Edwards' work on the Probabilistic Information Processor (PIP), summarized in (Edwards, 1968).
In PIP, humans interpreted incoming data by stating the conditional probability of each event that occurred (rather than each possible event, an immensely larger undertaking) given each of several hypotheses. A computer then used these conditional probabilities to update the probabilities of the respective hypotheses according to Bayes' law, avoiding the conservatism shown by a control group of subjects who updated the probabilities themselves. However, the experimental task environment Edwards used points up the need for extreme caution whenever human judgement is supplemented or supplanted by a formal algorithm. In the experiment, conducted in the nineteen-sixties, subjects interpreted events in an imaginary world crisis set in 1975 with respect to the likelihood of conventional or nuclear war, and Edwards defined improved performance in most cases as a higher (less conservative) inferred probability of imminent war after a few events pointing in that direction. The advantage of an aid such as PIP over one which is constructed without reference to human characteristics is that PIP's results can be easily critiqued because the process is close enough to human thinking to be understood; thus, one can say "I think we NEED to be more conservative in this situation." On the other hand, when a decision maker's staff or computer makes a recommendation on the basis of a process foreign to the decision maker's own thought processes, the decision maker can only "take it or leave it."

5. SUMMARY AND CONCLUSION

The cartographic paradigm for research in human decision processes, introduced and demonstrated in this dissertation, draws upon earlier research in the areas of problem-solving protocols, subjective probabilities, weighting coefficients, and choice of information. These intellectual debts are acknowledged in Chapter 1, and the present approach is contrasted with each of its antecedents.
Chapter 2 contains a formal definition of an experimental or observational system composed of a decision-making task with costly imperfect data, a decision maker who seeks to maximize his subjective expected utility, a candidate theoretical model of the decision-making process, and an observer who seeks to explain the decision maker's overt behavior (information purchases and final choice of alternative) in terms of the theoretical model. A theoretical model within this system consists of a parameter space, each of whose dimensions is a psychologically meaningful characteristic of a human decision maker, plus an algorithm for determining the sequence of test purchases and final choice of alternative that would arise when a decision maker characterized by a particular point in parameter space confronts a particular decision-making task. The requirements for a theoretical model are further clarified in Chapter 2 by the presentation of a simple example, the two-parameter Myopic Conservative Bayesian Decision Maker.

Chapter 3 reports on a pilot study which demonstrates the use of the cartographic paradigm; in the pilot study, the Myopic Conservative Bayesian Decision Maker is evaluated in the context of a simulation game in veterinary medicine which was played by four groups of subjects under different conditions of expertise, experience, and motivation. The first phase of the pilot study was to specify the details of the decision-making task in such a way that a sufficient range of behavior patterns resulted from reasonable variations in the model's parameters of conservatism and myopia, and to design the initial set of experimental materials with which to convey this task to the subjects.

The second phase consisted of roughly fifty simulation runs using a PASCAL implementation of the Myopic Conservative Bayesian Decision Maker to produce the map of parameter space from which the cartographic paradigm takes its name.
This map was constructed in four segments, for myopia equal to 1, 2, 3, and 4; within each segment, the critical values of conservatism at which the pattern of overt behavior changed were found. At such a critical value, the simulated decision maker is indifferent between two substantively different strategies, while all values of conservatism between two critical values at a given level of myopia lead to identical information choices and final choice of alternative in the particular task environment used, and thus define a region within the parameter space.

The third phase of the pilot study was to carry out three preliminary experiments in which the task environment was further refined and selection criteria for appropriate subjects were developed. The final phase of the pilot study was Experiment 4, in which twenty subjects played the simulation game ten times each; the last two games each subject played had standardized (rigged) test results and were used for analysis under the cartographic paradigm and also in terms of a self-report of subjective probability similar to that used by Edwards (1968).

The results of these experiments show that the Myopic Conservative Bayesian Decision Maker is not adequate as a model of human decision processes at the levels of motivation and experience studied, especially when any of the tests purchased by the decision maker give results contradicting the hypothesis supported by the previous tests. Analysis of the 21 games in which subjects did use behavior patterns explainable by the model did show clustering within a particular region of parameter space, however; this information may prove useful in conjunction with any future model which includes conservatism and myopia among its parameters.

The supplementary analysis based on self-reported subjective probability produced an important unanticipated result.
Edwards (1968, 1972) and other researchers found human estimates of a posteriori probability to be consistently less than the optimal Bayesian estimate in a wide variety of circumstances in which subjects are given probabilistic information; the present study indicates the opposite effect when subjects choose and pay for the information they receive.

Chapter 4 begins with a discussion of the substantive and methodological implications of the pilot study itself. Following this, some promising concepts around which new theoretical models can be built are presented, and some specific opportunities for future research involving the cartographic paradigm are examined. The last two sections of Chapter 4 are concerned with the potential benefits of a successful program of research under the cartographic paradigm to professional education and to decision support systems respectively.

APPENDICES

APPENDIX: Derivation of Es Q( 9

Given a set of exclusive, exhaustive hypotheses $\sigma_i$, $i = 1, 2, \ldots, s$, with prior probability $P(\sigma_i)$, and given datum $D$ with conditional probability $P(D \mid \sigma_i)$ for each $i$.

By Bayes' theorem,

$$P(\sigma_i \mid D) = \frac{P(\sigma_i)\,P(D \mid \sigma_i)}{\sum_{m} P(\sigma_m)\,P(D \mid \sigma_m)} .$$

In odds form, the equivalent formula is

$$\frac{P(\sigma_i \mid D)}{1 - P(\sigma_i \mid D)} = \frac{P(\sigma_i)\,P(D \mid \sigma_i) \Big/ \sum_{m} P(\sigma_m)\,P(D \mid \sigma_m)}{\Big(\sum_{m} P(\sigma_m)\,P(D \mid \sigma_m) - P(\sigma_i)\,P(D \mid \sigma_i)\Big) \Big/ \sum_{m} P(\sigma_m)\,P(D \mid \sigma_m)} = \frac{P(\sigma_i)\,P(D \mid \sigma_i)}{\sum_{m} P(\sigma_m)\,P(D \mid \sigma_m) - P(\sigma_i)\,P(D \mid \sigma_i)}$$

_ _P(r) P(DIcr)£l - P(