TRAINING OF MEDICAL STUDENTS IN A PROBLEM-SOLVING SKILL: THE GENERATION OF DIAGNOSTIC PROBLEM FORMULATIONS

Dissertation for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
LINDA KATHLEEN ALLAL
1973

ABSTRACT

TRAINING OF MEDICAL STUDENTS IN A PROBLEM-SOLVING SKILL: THE GENERATION OF DIAGNOSTIC PROBLEM FORMULATIONS

By

Linda Kathleen Allal

The purpose of this research was to develop and test experimentally a procedure for training second-year medical students in one aspect of medical problem solving: namely, the generation of diagnostic problem formulations on the basis of cues obtained during the initial minutes of the doctor-patient encounter. Recent investigations of medical problem solving (e.g., Elstein, et al., 1972) have found that the early generation of problem formulations (or hypotheses) is a major feature of the experienced physician's cognitive activity in conducting a clinical workup. Training which explicitly focuses on this process may be expected to improve the medical student's ability to generate appropriate early problem formulations, and, thereby, aid him in making the transition from classroom to clinical practice.

The training model developed and tested in this study included two major components: (1) having the student practice the task of generating initial problem formulations under conditions which simulate the early part of the clinical encounter, and (2) providing the student with feedback based on the performance of this task by a sample of experienced physicians. Color films which present a "physician's eye view" of the first four to six minutes in a doctor-patient encounter were used to simulate the conditions of the early part of the workup.
Two types of feedback materials were utilized: (1) "outcome feedback," consisting of written materials summarizing the outcomes (i.e., problem formulations generated and cues associated with each) characteristic of the experienced physicians who viewed each film, and (2) "process feedback," consisting of written materials and a second set of films designed to portray the processes by which the experienced physicians arrived at their problem formulation outcomes. The process feedback films consisted of a second version of each of the original films in which "think aloud" recordings, interposed at various points in the dialogue, were used to portray the processes typically going on inside the physician's head as he observes and interviews the patient.

A training experiment was designed to test the following hypotheses: (1) that the training model would significantly improve second-year medical students' skill in generating diagnostic problem formulations, and (2) that the model would be significantly more effective when it provides both outcome and process feedback than when it provides outcome feedback only.

In preparation for the training experiment eight color films were produced and data on problem formulation outcomes and processes were collected from a sample of eight experienced physicians. On the basis of these data, the feedback materials and posttest scoring keys were prepared.

The training experiment included three conditions: two treatment conditions and a posttest-only control condition, with 16 second-year medical students randomly assigned to each condition. Both treatment conditions involved the application of the two-component training model, but they differed with respect to the type of feedback that was provided: under one condition, the subject received outcome feedback only; under the second condition, the subject received both outcome and process feedback.
Training consisted of three weekly sessions, followed by a posttest session on the fourth week. The subject's posttest performance was evaluated by means of four dependent variables: (1) a problem formulation score, (2) a cue utilization score, (3) a classification of cues with respect to problem formulations score, and (4) a relationships among problem formulations score.

Analysis of covariance on the posttest data yielded the following results. Hypothesis 1 was supported: a significant difference was found between the trained groups and the control group on the dependent variable of major interest (problem formulation score), although not on the other variables. Moreover, supplemental analyses indicated that the trained groups' problem formulation performance was superior on both quantitative and qualitative dimensions. Hypothesis 2 was not supported: there were no significant differences between the two trained groups on any of the dependent variables. Several explanations for the ineffectiveness of the process feedback were given consideration, in particular the possibility that the subjects in the group which received outcome feedback only were able, on the basis of this material, to generate their own process feedback.

In addition to the analysis of the results of the training experiment, the data collected from the sample of experienced physicians were analyzed in detail. On the basis of this analysis several tentative conclusions were drawn. First, with respect to problem formulation outcomes, it was found that the result of the physician's information-processing activity during the early part of the workup is not a unidimensional list of problem formulations, but a structured set of problem formulations which may be described in terms of four features: (1) hierarchical organization, (2) competing formulations, (3) multiple subspaces and (4) functional relationships.
Secondly, with respect to problem formulation processes, it was found that direct associative retrieval, rather than strategy-guided search, appears to be the primary cognitive mechanism involved.

Discussion of the implications of the results of the training experiment (for future research and instructional development) dealt with the following topics: (1) extensions of the training model to other types of medical problem-solving skills; (2) use of other types of media (e.g., slide-tape units) to simulate the early part of the workup; (3) questions pertaining to the feedback component of the model; (4) instructional applications of the training materials. The findings from the analysis of the physician data were also discussed with respect to their implications for future research on medical problem solving.

References

Elstein, A. S.; Kagan, N.; Shulman, L. S.; Jason, H.; and Loupe, M. J. Methods and theory in the study of medical inquiry. Journal of Medical Education, 1972, 47, 85-92.

TRAINING OF MEDICAL STUDENTS IN A PROBLEM-SOLVING SKILL: THE GENERATION OF DIAGNOSTIC PROBLEM FORMULATIONS

By

Linda Kathleen Allal

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Personnel Services and Educational Psychology

1973

ACKNOWLEDGMENTS

Appreciation is extended to the members of my dissertation committee, Professor Arthur S. Elstein, Professor Andrew C. Porter, Professor Rose Zacks, and, in particular, to my committee chairman, Professor Lee S. Shulman, for the able guidance, the encouragement and the perceptive criticism which they offered throughout the course of this research.

A special debt of gratitude is owed to Donald Gragg, M.D., David N. Ostrow, M.D., and Michael Spooner, M.D., for their extensive aid in the preparation of the research materials.
I am also indebted to Gregory Loftus for his assistance in the administration of the training sessions, and to Mary Fedewa for her typing of the research materials and the many drafts of the dissertation.

This research was supported in part by National Institute of Health Grants PM-00041 and 71-2208.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
DEFINITION OF TERMS

Chapter

I. INTRODUCTION
    Overview of the Study
        Rationale
        The Training Model
        Design of the Study
        Research Hypotheses

II. REVIEW OF THEORETICAL ISSUES AND RESEARCH LITERATURE
    Medical Problem Solving
    Initial Problem Formulations in Medical Problem Solving
    Development of the Training Model
        The Simulation Component of the Model
        The Feedback Component of the Model

III. METHOD
    Production of the Films
    Collection of Physician Data
        Sample
        Materials
        Procedure
        Analysis
    The Training Experiment with Second-Year Medical Students
        Experimental Procedure
        The Problem Formulation Task
        The Instructional Sequence
        Outcome Feedback
        Process Feedback
        Materials
        Posttest Tasks
        Subjects
        Pilot Testing
        The Covariate
        Dependent Variables
        Reliability and Validity
        Additional Dependent Measures
        Hypotheses
        Analysis

IV. ANALYSIS OF THE PHYSICIAN DATA
    Generation of the First Problem Formulation
    The Structure of a Set of Initial Problem Formulations
        Structural Features
        Size and Organization
        Conclusions
    Processes Involved in Generating Initial Problem Formulations
        Conclusions

V. RESULTS OF THE TRAINING EXPERIMENT
    Tests of Experimental Hypotheses
        Reliability Estimates
        Results of Hypothesis Tests
        Relationships among Dependent Variables
    Supplemental Analyses
        Results of Additional Posttest Tasks
        Treatment-Control Differences in Problem Formulations
        Comparison of the Treatment Conditions

VI. CONCLUSIONS AND IMPLICATIONS
    Problem Formulation Outcomes and Processes in the Experienced Physician
    Training of Medical Students in the Generation of Initial Problem Formulations
        Conclusions
        Implications for Future Research
        Instructional Applications

REFERENCES

APPENDICES
    Case Outline for Film 1: "A 21-year-old college senior"
    Process Checklist
    Training Materials
    Additional Posttest Tasks
    Questionnaire
    Posttest Scoring Keys and Scoring Instructions
    Analysis of the Structure of a Set of Problem Formulations
    Additional Data
    Homogeneity of Regression
    Modifications of the Outcome Feedback Version of the Training Materials

LIST OF TABLES

Table
1. The Eight Films
2. Characteristics of the Physician Sample
3. The Experimental Procedure
4. Properties of the Feedback Presented Under the Two Treatment Conditions
5. Selected Characteristics of the Student Sample
6. Results of the Pilot Test
7. Features Characteristic of Individual Sets of Problem Formulations, by Film and by Subject
8. Number of Problem Formulations, and Number of Subspaces: Average and Range by Film, and by Subject
9. The Relative Frequency with which each Process Checklist Item was Checked
10. Classification of Checklist Items on Two Dimensions: (A) Degree of Subject Stability, and (B) Degree of Task Stability
11. Inter-scorer Reliability Coefficients on the Variables CUE, PF, CUE-PF and R-PF
12. Generalizability Coefficients, and Within-Group Correlation Coefficients (Between Tasks) on the Dependent Variables CUE, PF, CUE-PF and R-PF
13. Means and Standard Deviations on the Dependent Variables and Covariate, by Experimental Condition
14. Adjusted Means on the Dependent Variables, by Experimental Condition
15. Multivariate Analysis of Covariance, on CUE, PF, CUE-PF, R-PF
16. Univariate Analyses of Covariance, on CUE, PF, CUE-PF, R-PF
17. Scheffé Post Hoc Comparisons on PF and CUE-PF
18. Relationships Among Dependent Variables
19. Results of the Recognition of Cues Task, by Experimental Condition
20. Results of the Additions to Response Sheets Task, by Experimental Condition
21. Analysis of the Structure of the Students' Sets of Problem Formulations, by Experimental Condition
22. Number of Subjects Generating at Least One Problem Formulation in Various Categories, by Experimental Condition
23. Means and Standard Deviations of Treatment Group PF Scores on the Training Films (1-6)
24. Means and Standard Deviations of Treatment Group Responses to Questionnaire Items (Section 1)
25. Means and Standard Deviations of Treatment Group Scores on Questionnaire (Section 1)
26. Analyses of Variance on the Questionnaire Scores: EV FILM, EV FB, EV GEN
27. Responses to Sections Two and Four of the Questionnaire
28. Comparison of the Participants with the Refusal/No Contact Group on Focal Problems Exam Scores
29. Adjusted Means on Number of Subspaces, and Number of Problem Formulations, by Experimental Condition
30. Multivariate Analysis of Covariance on Number of Subspaces and Number of Problem Formulations
31. Scheffé Post Hoc Comparisons on Number of Subspaces and Number of Problem Formulations
32. Regression and Correlation Coefficients for the Covariate with each Dependent Variable, by Experimental Condition
33. Tests of Homogeneity of Regression for CUE, PF, CUE-PF, R-PF
34. Regression and Correlation Coefficients for Treatment II with One Deviant Subject Eliminated

LIST OF FIGURES

Figure
1. A Sample of the Outcome Feedback for Film 1
2. Relationship Between Cognitive Outcomes and the Dependent Variable Scores CUE, PF and CUE-PF
3. Structural Diagram of the Composite Set of Problem Formulations Generated by the Physician Sample for Film 1
4. Bivariate Distribution of Treatment II Scores on the Covariate and the Dependent Variable CUE

DEFINITION OF TERMS

CUE--A cue is an element of data pertaining to the patient's physical and/or psycho-social condition which the physician utilizes to generate diagnostic problem formulations.
The cues of particular concern in this study are those obtained during the early part of the clinical workup: namely, (1) symptoms of a physical and/or psycho-social nature reported by the patient, and (2) nonverbal cues, including physical signs of illness and general psycho-social indices, observed by the physician.

PROBLEM FORMULATION--A problem formulation is a label, having potential diagnostic and/or management implications, which the physician generates on the basis of cues obtained from the workup. It represents a tentative "working diagnosis," or hypothesis, which can account for some portion of the cues obtained. A problem formulation may range from the highly general (e.g., "organic disorder," "psychological problem") to the highly specific (e.g., "myocardial infarction," "glomerulonephritis") depending on the adequacy of the cues at hand.

WORKUP--This term designates a clinical encounter between a physician and a patient. A workup typically includes: (1) an interview of the patient to elicit medical, personal and family history, (2) physical examination of the patient, and (3) ancillary diagnostic procedures (e.g., laboratory tests). A workup may take place under a variety of conditions, with respect to setting (e.g., private office, hospital outpatient or inpatient facilities), number of previous encounters between the patient and doctor, and nature of the patient's medical problem (e.g., routine check-up, present illness, medical emergency). The type of workup with which the present study is concerned is the office visit in which the physician encounters a patient for the first time and is presented with one or more complaints pertaining to a present illness.

CHAPTER I

INTRODUCTION

The past two decades have seen considerable ferment and innovation in the field of medical education.
The traditional medical school curriculum, organized in terms of discipline-centered coursework in the basic sciences during the first half of the program, followed by supervised clinical work (clerkships) during the second half of the program, has come under substantial criticism as a method of preparing students for their future responsibilities as clinicians. Although criticism has covered a wide range of issues, much of it has focused on two concerns. The first is that the medical student be introduced to his future role as a clinician at a much earlier point in his studies than has traditionally been the case. This concern has led to the inclusion of courses dealing with clinical skills during the first half of the medical school program, and to attempts to integrate the teaching of the basic science disciplines with orientation to the clinician's role. A second concern has been that the student's introduction to his role as a clinician focus on patients and their problems, rather than on abstract principles of medical science, as has been characteristic of the traditional curriculum. One of the earliest responses to this concern was the organization of clinical clerkships around the principle of "comprehensive medical care," with a new emphasis on the psychological and sociological dimensions of medical problems as manifested in the individual patient (Hammond and Kern, 1959; Reader and Goss, 1967). More recently, this concern has led to efforts to devise new instructional methods: (a) to develop the student's interpersonal skills, needed for effective and humane interaction with the patient, and (b) to develop the student's information-processing skills, needed for effective medical diagnosis and decision-making. Methods to achieve the first goal have included the creation of courses on "doctor-patient relations" and training in interpersonal techniques.
With respect to the second goal, efforts have been directed toward the development of what may be termed "problem-solving" approaches to the teaching of clinical skills. These efforts have included the introduction, early in the program, of coursework dealing with "focal problems" (Ways et al., 1973), the abandonment of the traditional "case presentation" method of clinical teaching in favor of an approach emphasizing the generation and testing of diagnostic hypotheses (Engel, 1971), and the instruction of students in the use of "problem-oriented" methods of medical record keeping (Weed, 1969; Ways et al., 1972).

"Problem-solving" approaches in medical education, like their counterparts in the curriculum reforms of the 1950s and 60s in primary and secondary education, exhibit considerable diversity of objectives and methods. There are two premises, however, which are common to all such approaches. The first premise is that the practice of clinical medicine is in essence a problem-solving activity (Miller, 1962; Elstein, et al., 1972; Barrows and Bennett, 1972; Ways, et al., 1973). Thus, development of the information-processing skills necessary for medical problem solving is considered to be a primary curriculum objective. The second premise is that the teaching of these skills can best be achieved by means of a problem-solving mode of instruction: (a) by providing the student with opportunities early in his program to practice medical problem solving under conditions which closely approximate the conditions of clinical practice (Ways, et al., 1973), and (b) by designing clerkship experiences which will encourage the student to approach the clinical workup as a problem-solving task (Miller, 1962; Engel, 1971).
Both of these premises rest, implicitly, on psychological and educational principles of long standing and respected parentage: most notably the conception of human cognitive activity and of the educational process advocated by Dewey (1963, 1938) and Bruner (1960, 1966). However, their application in the context of medical education still represents a major challenge at the present date.

Overview of the Study

The physician's activity in a clinical setting is exceedingly complex. It involves both cognitive information-processing skills and affective interpersonal skills, and, in terms of outcomes, requires both diagnostic judgments and management decisions. The present study, like the larger research project¹ of which it is a part, has limited the scope of its investigation to: (a) the physician's cognitive skills, and (b) the way in which he uses these skills to arrive at a diagnosis. Moreover, the present study focuses on only a very small, but highly important, portion of the diagnostic process: namely, the physician's information-processing activities during the first five minutes of his encounter with a patient.

¹This study is one of several conducted by the Medical Inquiry Project, directed by Arthur Elstein and Lee Shulman, Office of Medical Education, Research and Development, Michigan State University. Reports on other studies conducted by the project are found in Elstein, et al. (1972), Gordon (1973) and Sprafka (1973). A comprehensive final report on the activities of the project is forthcoming in January 1974.

In recent years there have been a number of studies (Elstein, et al., 1972; Barrows and Bennett, 1972) designed to investigate in depth the cognitive activities of the experienced physician in carrying out a clinical workup. The outstanding feature of these studies, as compared to earlier research efforts (e.g., Kleinmuntz, 1968; Rimoldi, 1963), is their use of simulated patients in order to study the physician's activity in a naturalistic setting, closely resembling the conditions of actual clinical practice, and yet achieve an adequate degree of experimental control. Probably the most salient feature of medical problem solving to emerge from these investigations is the critical role of hypotheses, generated by the physician during the earliest minutes of his encounter with a patient, and subsequently tested by the collection of data during the remainder of the workup. Other investigators (Wortman, 1972; Schwartz and Simon, 1970) have also identified hypothesis generation as a fundamental step in the diagnostic process.

The notion that medical problem solving begins with the generation of diagnostic hypotheses is hardly novel in one sense: theories of problem solving, both classical (Dewey, 1938) and current (Shulman et al., 1968; Newell and Simon, 1972) have posited that the generation of some form of conceptual framework (whether in the form of hypotheses, problem formulations or a problem space) is a major early step in attempting to solve complex problems. However, the notion is quite novel within the context of clinical medicine, where the diagnostic process has long been considered and taught, in the classical empiricist tradition, as an essentially inductive procedure in which thoroughness and objectivity of data collection should precede consideration of diagnostic possibilities (e.g., Harvey and Bordley, 1970). Thus, a major import of recent research is the finding that experienced physicians, despite their training in an inductive, "reserve judgment" approach to diagnosis, almost universally employ an approach characterized by early hypothesis generation and subsequent hypothesis testing.
The purpose of the present study, broadly stated, is: (a) to develop, and (b) to test experimentally a procedure for training medical students in the generation of diagnostic problem formulations based on the data obtained during the earliest minutes of a clinical workup. In this study the term "problem formulation" (rather than "hypothesis") has been used to refer to the outcomes of the physician's information-processing activity during the early part of the workup. This term has been chosen because it is one with which the particular medical student population participating in the experiment was familiar through their coursework on the use of problem-oriented methods of medical record keeping. Conceptually, however, it has the same meaning as the term "hypothesis," as employed in the research by Elstein, et al., Barrows and others.

In focusing on the first five minutes of the clinical encounter, this study is concerned with two types of information-processing skill: (1) the detection and encoding of cues (i.e., elements of data having diagnostic relevance) presented by the patient during the early minutes of the workup, and (2) the use of these cues to generate an initial set of diagnostic problem formulations. Obviously, the process of testing these initial formulations by means of further data collection is of crucial importance in the resolution of diagnostic problems. The present study, however, limits its investigation to the training of students in the process of generating these initial formulations.

Before presenting an overview of the study, one more preliminary comment is in order. It should be recognized that the adoption of a "problem-solving" approach to training medical students does not necessarily imply training in the early generation of diagnostic problem formulations. Some current methods of clinical training do emphasize the generation of hypotheses early in the workup (e.g., Engel, 1971). But others do not.
For example, the steps in medical problem solving, as defined by Ways, et al. (1973, p. 566) include: "(1) sensing that a problem exists; (2) collecting the data (from history, physical examination, and ancillary diagnostic methods) . . . ; (3) rationally defining and formulating the problem(s); (4) deciding upon a course of action. . . ." In this sequence, it will be noted, the act of generating problem formulations (step 3) comes at the end of the workup (step 2), which is where the traditional "reserve judgment" view of the diagnostic process has always placed consideration of diagnostic possibilities. Thus, a major feature of the present experimental study is that it attempts to train medical students in the generation of diagnostic problem formulations at a very early point in the workup, a point at which only a very small portion of potentially relevant data is available. As far as this particular goal is concerned, there have been very few instructional precedents, and virtually no research precedents.

Rationale

The rationale for training medical students in the early generation of diagnostic problem formulations rests on two arguments:

1. A major conclusion to be drawn from much of the psychological literature on problem solving and inquiry (Dewey, 1938; Shulman et al., 1968; Newell and Simon, 1972) is that, when confronted with any complex problematic situation, the human information-processor inevitably generates some sort of conceptual framework as an early step in his search for a solution. The Elstein, et al. (1972) and Barrows and Bennett (1972) investigations provide evidence that this conclusion applies to medical problem solving as well. Thus, training of medical students in the early generation of diagnostic problem formulations is consistent with what is known about the cognitive processes of human problem-solvers in general, and of experienced physicians in particular.

2. Since physicians, despite their training in the traditional "reserve judgment" approach to diagnosis, are found to generate problem formulations very early in the workup, it is to be anticipated that medical students, even without special training, would tend to approach medical problems in a similar manner, and that this tendency would be reinforced by increasing clinical experience. However, it is believed that training which explicitly focuses on this process will improve the medical student's ability to generate appropriate early problem formulations, and, thereby, aid him in making the transition from classroom to clinical practice.

The Training Model

The training model developed and tested in the present study includes two major components: (1) having the student practice the task of generating initial problem formulations under conditions which simulate the early part of the clinical encounter, and (2) providing the student with feedback based on the performance of this task by experienced physicians.

The first component in the model, the use of simulation exercises, is based on the educational principle that problem-solving skills can best be taught by providing the student with opportunities to encounter and attempt to solve a range of problems which closely approximate, in breadth and complexity, the problems which he will encounter in the real world (Dewey, 1963; Bruner, 1966; Gagné, 1971). In the present case the student's encounter with a series of patients, having various medical complaints and diverse demographic characteristics, is simulated by means of color films which present a "physician's eye view" of the first five minutes of a clinical workup.
The second component in the training model, the provision of feedback on the performance of the task by experienced physicians, is designed to enable the student to evaluate his own performance on a given exercise, and, across the full set of exercises, to increase his skill in attaining problem formulation outcomes similar to those of the experienced physician. Two types of feedback are employed in this experiment: (1) feedback on the outcomes of physicians' problem formulation activity during the earliest part of the workup; and (2) feedback on the processes by which physicians arrive at these outcomes. Both types of feedback are based on data obtained from a sample of experienced physicians who viewed the training films. The first type of feedback presents, in written form, the problem formulations (and cues associated with each) generated by the physicians, and is designed to indicate both the commonalities and the range of diversity that were found in their outcomes. The second type of feedback includes a special version of each of the training films in which "think aloud" recordings are interposed at various points in the dialogue. The purpose of these films is to provide the student with a simulated portrayal of the problem formulation processes typically going on inside the physician's head as he observes and interviews the patient. In addition, written materials are used to summarize the similarities and differences among physicians with respect to processes of generating a set of initial problem formulations.

The feedback employed in this study has several rather special features which distinguish it from most traditional types of feedback. First, the notion of utilizing feedback to provide the learner with "process models" is obviously quite foreign to the behaviorist tradition of feedback.
Second, the type of outcome feedback provided in this study is closer to what has been termed "cognitive" feedback (Hammond and Summers, 1972) than to the classical types of outcome feedback used in learning experiments or programmed instruction. A major feature of the feedback is that it does not provide the student with a single "correct" model of either outcomes or processes. Rather it indicates both the convergent and the divergent aspects of the performance of experienced physicians: i.e., the commonalities characteristic of nearly all physicians, as well as the ways in which they differ, with respect to the generation of a set of initial problem formulations. Thus, in utilizing the feedback to evaluate his own performance the student must engage in a series of relatively complex cognitive activities: he must examine, synthesize and draw inferences from a sample of the performances of experienced practitioners in his field.

Design of the Study

The study involved both a developmental phase and an experimental phase. The developmental phase included: (1) production of a set of films of the first five minutes in eight doctor-patient encounters, and (2) collection of data on problem formulation outcomes and processes from a sample of eight experienced physicians. These data were then utilized as the basis for development of the training and evaluation materials employed in the experimental phase of the study.

The second phase of the study consisted of a training experiment involving 48 second-year medical students, randomly assigned to three conditions:

Treatment I: Training with Outcome Feedback; Posttest
Treatment II: Training with Outcome and Process Feedback; Posttest
Control: Posttest only.

Both treatment conditions involved application of the "simulation exercises, plus feedback" training model. The conditions differed, however, with respect to the type of feedback provided.
Under one condition (Treatment I) the subject was provided with outcome feedback only, while under the other condition (Treatment II) the subject received both outcome and process feedback.1 The third condition consisted of a posttest-only control group.

1 It should be noted that a "process feedback only" condition is not possible: feedback pertaining to the processes of problem formulation requires that the outcomes of these processes be mentioned.

In designing the experiment it was recognized that there are a number of researchable questions of potential interest with respect to the proposed training model, not all of which could be dealt with within the context of the present study. The decision as to which questions would be asked, and which would be deferred for subsequent investigation, was largely dictated by the author's conception of the research strategy that is most fruitful for the educational psychologist to follow in developing a new method of training. This strategy suggests the following research priorities: first, to determine the effectiveness of the best training "package" one can devise, as compared to the results already being attained by means of existing procedures; second, to investigate those manipulations of the package that are likely to have the greatest educational relevance; third, providing that some variation of the total package produces the desired results, to determine the effects of the separate components in the package. A major reason for the use of such a strategy is that the components of a training model may, in combination, produce the desired effect even though any single component would have an insignificant effect. A second reason for the use
of such a strategy is that the application of experimental findings to on-going programs would indeed proceed at a snail's pace if it were to be predicated on the construction of new training models out of components whose separate effects had each been experimentally demonstrated. In terms of the present study, this strategy has been translated into an experiment that is designed: (1) to determine the effectiveness of the "simulation exercise, plus feedback" training model as a whole, compared to existing conditions (i.e., a posttest-only control), and (2) to determine the relative effectiveness of the model when its feedback component is manipulated so as to provide, in the one case, outcome feedback only, and, in the second case, outcome and process feedback. Other questions not dealt with in this study, but which should be investigated, include: (1) how cost-effective are films, as compared to either higher-fidelity (e.g., simulated patients) or lower-fidelity (e.g., slide-tape combinations) means of simulating the early part of the doctor-patient encounter? (2) how effective are the simulation exercises without feedback of any type? (3) what variations of the model are most effective for students at different stages in the medical school curriculum?

Research Hypotheses

The major purpose of the study was to test experimentally the following hypotheses:

Hypothesis 1: That a training model consisting of: (1) problem-solving exercises in which films are used to simulate the conditions of the early part of the clinical workup, and (2) feedback based on data from a sample of experienced physicians, will significantly improve second-year medical students' skill in the generation of an initial set of diagnostic problem formulations.

Hypothesis 2: That the training model will be significantly more effective when it provides both outcome and process feedback, than when it provides outcome feedback only.
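The design described above (48 students randomly assigned in equal numbers to two treatment conditions and a posttest-only control) can be illustrated with a minimal sketch. This is not the procedure actually used in the study: the function name, the seed, the subject numbering and the `COMPARISONS` table are all assumptions introduced for illustration, and the mapping of hypotheses to group comparisons is only one plausible reading of the two hypotheses.

```python
import random

# Condition labels follow the study design; the identifiers below are
# illustrative only and do not come from the dissertation.
CONDITIONS = ("Treatment I", "Treatment II", "Control")


def assign_subjects(subject_ids, conditions=CONDITIONS, seed=None):
    """Randomly split subjects into equal-sized groups, one per condition."""
    subjects = list(subject_ids)
    if len(subjects) % len(conditions) != 0:
        raise ValueError("subjects cannot be divided evenly among conditions")
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    rng.shuffle(subjects)
    size = len(subjects) // len(conditions)
    return {
        condition: sorted(subjects[i * size:(i + 1) * size])
        for i, condition in enumerate(conditions)
    }


# One reading of the comparisons implied by the hypotheses (an assumption):
# hypothesis -> (groups expected to score higher, reference groups).
COMPARISONS = {
    "Hypothesis 1": (("Treatment I", "Treatment II"), ("Control",)),
    "Hypothesis 2": (("Treatment II",), ("Treatment I",)),
}

# Hypothetical subject numbers 1..48; the seed value is arbitrary.
groups = assign_subjects(range(1, 49), seed=1973)
for condition, members in groups.items():
    print(condition, len(members))
```

Shuffling the whole list once and slicing it keeps the groups equal in size, which independent per-subject random draws would not guarantee.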
A secondary purpose of the study was to specify, in greater detail than has been forthcoming from previous investigations, the nature of the problem formulation component of medical problem solving. This purpose was accomplished by an analysis of the physician data, designed to address the following questions:

1. How early in the clinical workup does the physician begin to generate problem formulations?
2. What is the structure of a set of initial problem formulations?
3. What cognitive processes are involved in the generation of initial problem formulations?

CHAPTER II

REVIEW OF THEORETICAL ISSUES AND RESEARCH LITERATURE

The purpose of this chapter is to examine theoretical issues and research literature of relevance to three topics: (1) medical problem solving, (2) initial problem formulations in medical problem solving, (3) the development of a model for training medical students in the generation of initial problem formulations.

Medical Problem Solving

In attempting to define the nature of medical problem solving a logical starting point is to examine the viewpoint of the medical profession itself. A description of the approach to diagnosis--which is held to constitute good clinical practice and taught to medical students--can be derived from a review of several well-known, authoritative volumes in the field. In Harvey and Bordley's Differential Diagnosis (1970, p. 7) and in The Principles and Practice of Medicine (18th edition), edited by Harvey, et al. (1972, p. 39), the process by which the physician arrives at a diagnosis is described in terms of the following sequence of steps:

Steps in Diagnosis
1. Collecting the Facts
a. Clinical history.
b. Physical examination.
c. Ancillary examinations.
d. Observation of the course of the illness.
2. Analyzing the Facts
a. Critically evaluate the collected data.
b. List reliable findings in order of apparent importance.
c. Select one or preferably two or three central features.
d.
List diseases in which these central features are encountered.
e. Reach final diagnosis by selecting from the listed diseases either: (1) the single disease which best explains all the facts, or, if this is not possible, (2) the several diseases each of which best explains some of the facts.
f. Review all the evidence--both positive and negative--with the final diagnosis in mind.

Harrison's Principles of Internal Medicine (6th edition, Wintrobe, et al., 1970) suggests a similar sequence, proceeding from collection of clinical data to evaluation of data with respect to diagnostic possibilities.

In each of these volumes it is noted that "working diagnoses" (or hypotheses) may emerge at various points in the workup, and that to some degree the physician engages in data analysis concurrently with data collection. However, neither of these activities is considered to be a central feature of the diagnostic process. In discussing the physician's data collection activity over the course of the workup major emphasis is placed on thoroughness and objectivity, particularly with respect to history taking and physical examination (e.g., "the patient must be literally scrutinized from top to bottom in an objective search for abnormalities," Harrison's Principles, p. 5). In discussing the process of data analysis, it is noted that a preliminary evaluation of history and physical data enables the physician to make a judicious selection of ancillary (laboratory) examinations, but, in largest part, the evaluation of data with respect to diagnostic possibilities is to take place at the end of the workup, after the physician has completed his search for the facts of the case. In sum, the physician is "to begin, as in all scientific research, by marshalling all the facts, then proceeding with an unprejudiced analysis of the facts, and ending with the logical conclusion" (Harvey and Bordley, 1970, p. 3).
This view of the diagnostic process is based on a conception of scientific inquiry that: (1) is primarily inductive in nature, i.e., conclusions emerge out of an objective analysis of "all the facts," and (2) makes the classical empiricist assumption that autonomous facts (observables which exist independently of the observer and are objectively verifiable) constitute the ultimate arbiter of scientific claims of knowledge (theories, hypotheses or conclusions). This conception of scientific inquiry has been challenged by an impressive number of scholars of the philosophy of science. Kessel (1969), in his review of these scholars' viewpoints, argues that it is necessary to reject the empiricist notion of the autonomy and objectivity of facts. He asserts that "the scientist's premises and presuppositions can and do play a significant role at all levels of his endeavor" (p. 1004). Moreover, he suggests, the safeguard of disciplined scientific inquiry does not lie in the scientist's attempting to banish these premises and presuppositions (as, it may be noted, the physician is admonished to do1); rather, it lies: (1) in the scientist's making explicit (to himself and others) the premises which he is entertaining, and (2) in his "actively seeking to invent alternatives" to his current premises in order to insure that the success of his endeavor does not rest on the spurious ground that potentially refuting facts were never sought, or were misinterpreted (p. 1002). This latter recommendation calls to mind Chamberlin's classic article (1890; reprinted in Science, 1965) in which he argued that only by employing a method of "Multiple Working Hypotheses" can the scientist avoid the pitfall of biased data collection and interpretation.

A major challenge to the classical empiricist conception may be found in John Dewey's Logic: The Theory of Inquiry (1938).
In this work Dewey combines both philosophical and psychological considerations in his development of a model of inquiry which coincides quite closely with the viewpoints summarized in Kessel (1969). First of all, Dewey asserts that the problem-solver's initial conceptualization of the problem, as well as the "ideas" (or hypotheses) that emerge from this conceptualization, play a crucial, "operational" role in the selection and interpretation of facts, and in the organization of facts into a coherent whole. Secondly, Dewey argues in favor of a dialectical view of inquiry in which ideas are as much arbiters of facts as facts are of ideas.

The orders of fact, which present themselves in consequence of the experimental observations the ideas call out and direct, are trial facts. They are provisional. They are 'facts' if they are observed by sound organs and techniques. But they are not on that account the facts of the case. They are tested or 'proved' with respect to their evidential function just as much as ideas (hypotheses) are tested with reference to their power to exercise the function of resolution (1938, p. 114).

1 In Harrison's Principles (1970, p. 6) it is stated that "the physician does not start with an open mind. . . , but with one prejudiced from knowledge of recent cases;" consequently, "he must struggle constantly to avoid the bias" that would interfere with the objective conduct of the clinical inquiry.

A further challenge to the classical empiricist conception of the inquiry process is provided by contemporary research within the framework of "cognitive" or "information-processing" theories of psychology. There is a very sizable body of research findings, comprehensively reviewed in Neisser (1967), which suggest that a person's expectations (presuppositions, ideas or hypotheses) have a significant impact on even the simpler forms of cognitive activity, e.g., visual and auditory perception, visual and auditory memory.
Research into higher-order cognitive processes, such as problem solving (Newell and Simon, 1972), indicates that the internal representation of the task (which the person generates) initiates, organizes and directs his subsequent information-processing activities.

Although certain types of problem solving have been quite intensively investigated during the past several decades, the activities of the physician in attempting to solve diagnostic problems have been studied for a relatively brief number of years (Rimoldi, 1963; Kleinmuntz, 1968; Elstein, et al., 1972; Barrows and Bennett, 1972; Wortman, 1972; Schwartz and Simon, 1970). The theoretical orientation of most of this research derives from the cognitive or information-processing conceptions of human psychology which emerged in the late 1950s, as embodied in the work of Bruner, Goodnow and Austin (1956), Miller, Galanter and Pribram (1960), and Newell, Shaw and Simon (1958). Although investigations of medical problem solving have employed diverse types of diagnostic tasks, e.g., card sorting (Rimoldi, 1963), a variant of the game Twenty-Questions (Kleinmuntz, 1968), simulated clinical encounters involving actors trained to play the role of patients, i.e., "simulated patients" (Elstein, et al., 1972; Barrows and Bennett, 1972), most have utilized these tasks in the context of the methodological approach developed by Newell, Shaw and Simon (1958). A protocol of the physician's behavior--e.g., the questions he asks, maneuvers he performs--in conducting the diagnostic task is an important source of data for describing some parameters of his activity, but description of the cognitive processes underlying his behavior relies primarily on introspective data obtained by having the subject "think aloud" as he attempts to solve the problem, and, additionally, in the case of the Elstein, et al. (1972) investigation, by a "stimulated recall" technique.
Research on medical problem solving has, thus far, resulted: (1) in the identification of several of the major features of the diagnostic process, and (2) in the specification--at a fairly general, descriptive level--of the sequence of events involved in this process. Present research efforts are still quite far, however, from achieving one of the major goals of the information-processing theorist: namely, the specification of cognitive mechanisms in terms of a computer program which simulates the physician's problem-solving activity.1

1 An initial attempt at computer simulation of diagnostic problem solving has been undertaken by Wortman (1972).

The findings which have been fairly well established by this research may be summarized as follows. The major characteristics of the physician's information-processing activity in conducting a workup are: (1) the early generation of multiple diagnostic hypotheses, and (2) the testing (revision and refinement) of these hypotheses by means of subsequent data collection. There may, depending on the difficulty of the case, be several iterations of the hypothesis generation/hypothesis testing sequence. There are individual differences in how early a physician begins to generate hypotheses, but nearly all physicians do so at a very early point (at least within the first five minutes of the workup).1 Although the physician sometimes uses standard formats of data collection (e.g., asks routine history questions, performs the physical examination in a systematic head-to-toe manner), his mode of cognitive processing (i.e., of organizing, sorting, interpreting and synthesizing these data) is structured by the hypotheses he is entertaining.

Newell and Simon (1972) have proposed that the essence of an information-processing theory of human problem solving can be summarized in four propositions:

1.
A few, and only a few, gross characteristics of the human IPS (information-processing system) are invariant over task and problem solver.
2. These characteristics are sufficient to determine that a task environment is represented (in the IPS) as a problem space, and that problem solving takes place in a problem space.
3. The structure of the task environment determines the possible structures of the problem space.
4. The structure of the problem space determines the possible programs that can be used for problem solving. (p. 788)

1 Analysis (by the author) of a portion of the data from the Elstein, et al. (1972) investigation indicated that the percentage of subjects who began generating hypotheses within the first five minutes of the workup was 95.2%, 81.8% and 100% for three simulated clinical encounters. The number of questions asked by the physician prior to generating his first hypothesis was, on the average, 13.9, 20.8 and 7.4 for the three simulations.

The remainder of this section will be devoted to consideration of the following question: What are the features of the task environment in clinical medicine that are responsible for the hypothesis-guided nature of medical problem solving?

Although theories of problem solving--both classical (Dewey, 1938) and current (Shulman, et al., 1968; Newell and Simon, 1972)--have posited that the generation of some sort of internal representation of the task is a major early step in attempting to solve complex problems, this representation does not take the form of multiple hypotheses regarding potential end states in the types of problem solving that have been most intensively investigated in the recent psychological literature (e.g., cryptarithmetic problems, logic problems, chess playing). On the other hand, the generation of multiple hypotheses is characteristic of most forms of scientific inquiry, including medical inquiry.
Borrowing a distinction made by Bartlett (1958) between reasoning in "open" versus "closed" systems, it may be useful to consider various types of problem solving as lying at different points along a continuum, with science (including clinical medicine) lying toward the "open system" pole of the continuum, and logic and mathematics toward the "closed system" pole. (Chess playing would probably be classified at some intermediate point.) Given this conceptual framework, we will now consider two characteristics of the task environment of open systems, in particular clinical medicine, which may account for the hypothesis-guided nature of problem solving in such systems.

A first characteristic of the task environment in open systems is the indeterminacy of the end state. In closed system problems, such as cryptarithmetic and logic problems, the end state to be reached is fully specified in advance (e.g., show that DONALD + GERALD = ROBERT; prove that L1 is equivalent to L2). In an open system, such as clinical medicine, not only is the end state (i.e., the diagnosis) unspecified, its general configuration may take many forms (e.g., the patient may have a single disease, several separate diseases, several related diseases). The generation of hypotheses enables the physician to define a set of potential end states toward which his problem-solving activity may be directed, and thus tentatively "closes" the system in which he is operating.

A second distinction between problem solving in open versus closed systems pertains to the nature of solution criteria. For problems in closed systems, such as cryptarithmetic and logic, there are well-defined a priori criteria--known to the problem solver--which he may apply in order to determine that an appropriate solution has been reached. In an open system such as clinical medicine there are no such criteria.
The criteria that do exist are a posteriori (i.e., the patient responds to the treatment, or his condition worsens), and, in general, are ambiguous. Occasionally there are pathognomonic findings that conclusively substantiate the correctness of a diagnosis, but most often the relationship between the physician's state of knowledge (the data he has obtained) and the solution to the problem (the patient's disease) is a probabilistic one. However, the use of multiple working hypotheses provides a pragmatic criterion for judging that an appropriate solution has been reached: i.e., the physician concludes that "Having generated and tested hypotheses X, Y and Z, the data collected tend to rule out hypotheses X and Y, but tend to support hypothesis Z. Therefore, I will consider Z as the most tenable diagnosis on which to base treatment."

Initial Problem Formulations in Medical Problem Solving

This section will examine in somewhat greater detail the diagnostic problem formulations (or hypotheses) which the physician generates in the early minutes of the clinical workup.1 First, the findings of recent research regarding initial problem formulations will be reviewed. Second, both the functional advantages and potential risks involved in the early generation of problem formulations will be considered.

1 For reasons discussed in Chapter I (p. 6), the term "problem formulation" (rather than "hypothesis") is generally employed in this research to refer to the diagnostic labels which the physician generates on the basis of cues obtained during the workup.

A number of recent investigations have found that physicians begin to generate problem formulations at a very early point in the workup. Data from the Elstein, et al. (1972) investigation indicate that this process nearly always occurs within the first five minutes of the physician's encounter with the patient (see footnote 1, p. 23). Barrows and Bennett (1972, p.
275) have observed that hypotheses literally "pop" into the head of the clinician "almost before the interview begins." Schwartz and Simon (1970) have also observed that from the moment of patient contact and presentation of chief complaint, the physician begins to generate hypotheses.

Although the earliness with which the physician generates problem formulations appears to be fairly well established, current research findings are as yet rather sketchy as to the characteristics of these formulations, and as to the types of cognitive mechanisms involved in generating them. One characteristic that has been investigated is the number of problem formulations a physician generates. As yet unpublished data from the Elstein, et al. (1972) investigation indicate that, for three simulated workups, the number of hypotheses generated was, on the average, 6.7, 7.0 and 4.2. Barrows and Bennett (1972) report that the neurologists in their study always generated at least three hypotheses, and sometimes as many as five hypotheses. Wortman (1972) found that students solving diagnostic type problems (involving concrete objects having various properties), as well as a neurologist whose diagnostic behavior was studied, tended to select cues that were associated with three to seven categories (objects/diseases). In sum, the research to date tends to suggest that the number of hypotheses a physician generates ranges from three to seven, which, it may be noted, coincides with Mandler's (1967) proposition that human information processors organize and store information in terms of 5 ± 2 categories.

A second characteristic of problem formulations which has received some attention in the research literature is level of specificity.
Kleinmuntz (1968), in a study of neurologists which involved a medical variant of "Twenty-Questions," found that his subjects tended to follow a general-to-specific strategy of hypothesis generation: beginning with very general formulations and moving toward increasingly specific formulations. In a more recent paper (Wortman and Kleinmuntz, undated) it is suggested that although the physician follows a general-to-specific search strategy he tends to begin at the most specific level that is possible--given the adequacy of available cues--in the hierarchy of diagnostic categories stored in long-term memory. Barrows and Bennett (1972), in a study of neurologists carrying out simulated workups, reached a conclusion similar to Kleinmuntz's: namely, that the physician's initial hypotheses are quite general, but are progressively shaped into more specific entities by means of a "coning down" strategy. Elstein, et al. (1972), on the other hand, found in a preliminary analysis of the problem-solving behavior of internists (working up simulated patients) that initial problem formulations were frequently quite specific. Subsequent analysis of the Elstein, et al. data, in which the author assisted, has indicated that a physician's early problem formulations are often heterogeneous with respect to level of specificity: he may simultaneously entertain a general hypothesis (e.g., "psychogenic paralysis"; "viral infection") and a highly specific hypothesis that can be either a competitor to, or a subset of, the general hypothesis (e.g., "multiple sclerosis"; "infectious mononucleosis"). In sum, it is probable that the physician's initial problem formulations cannot be characterized as a uniform list of either very general, or very specific, diagnostic considerations.
Although the observations of most investigators indicate that the generation of problem formulations on the basis of cues obtained during the earliest minutes of the workup occurs in a very rapid and virtually automatic manner, very little is known as yet regarding the types of cognitive mechanisms that are involved in this process. Different researchers have made various suggestions, including associative retrieval mechanisms (Elstein, et al., 1972), pattern recognition mechanisms (Lusted and Stahl, 1963), strategy-based search mechanisms (Kleinmuntz, 1968; Kleinmuntz and Wortman, undated; Barrows and Bennett, 1972). To date, however, there has not been enough research focused on the physician's information-processing activities during the earliest part of the workup in order to specify in detail which of these mechanisms (or what combination of mechanisms) is involved in the generation of initial problem formulations.

The remaining portion of this section will consider the functional advantages and potential risks that are likely to be involved in the early generation of diagnostic problem formulations. It is probable that the set of initial problem formulations which the physician generates early in the workup serves a dual cognitive function: (1) it provides an organizational framework for the storage of data obtained during the workup; and, (2) it provides a conceptual framework that guides the physician's data processing activity during the workup. The "storage function" of problem formulations relates to the role of organization in human memory. In medical problem solving, as in nearly any relatively complex task, the amount of information which must be stored greatly exceeds the capacity of short-term memory, i.e., 7 ± 2 symbols (Miller, 1956).
Moreover, information in short-term memory rapidly decays, or is deleted by the incoming flow of new information, unless it is processed (rehearsed, recoded) and transferred to long-term memory, which has working storage capacity that is potentially limitless (Broadbent, 1958). A sizable body of research literature, notably Miller (1956), Mandler (1967), Tulving and Donaldson (1972), indicates that the organization of elementary units of information into "chunks," or "categories," increases the holding capacity of short-term memory, and facilitates the processes of transfer to, and retrieval from, long-term memory. In medical problem solving, it is probable that the problem formulations which the physician generates constitute the organizational framework which permits storage of the very large amount of information that is obtained over the course of the workup. Although the storage function of diagnostic problem formulations has not yet been investigated in depth, the finding that physicians failed to recall data that were not associated with one of the hypotheses they were entertaining (reported by Kleinmuntz, 1968, and Barrows and Bennett, 1972) would lend some support to this assertion.

The second cognitive function of initial problem formulations--which may be termed the "guidance function"--relates to the concept of the "problem space" proposed by Newell and Simon (1972). These theorists assert that the task environment is represented internally as a problem space, and that the structure of this problem space determines the programs (i.e., information-processing activities) to be used in the search for a solution. Any problem space, as defined by Newell and Simon, includes two types of components: (1) "elements," which are symbolic structures representing states of knowledge about the task, and (2) "operators," which are processes (procedures, methods) for producing new states of knowledge from existing ones.
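The two components of a problem space just described can be made concrete with a minimal sketch. This is only an illustration under simplifying assumptions, not Newell and Simon's formalism: a state of knowledge is reduced to a set of findings, and the names `State`, `Operator` and `obtain`, as well as the example findings, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

# An "element": a state of knowledge, modeled here as the set of findings
# known so far (an illustrative simplification).
State = FrozenSet[str]


@dataclass(frozen=True)
class Operator:
    """A process that produces a new state of knowledge from an existing one."""
    name: str
    apply: Callable[[State], State]


def obtain(finding: str) -> Operator:
    """Build a hypothetical operator that adds one finding to the state."""
    return Operator(name=f"obtain: {finding}",
                    apply=lambda state: state | {finding})


start: State = frozenset({"chief complaint"})
ask_history = obtain("history item")       # hypothetical interview question
examine = obtain("examination finding")    # hypothetical examination maneuver

# Each operator yields a new state; the earlier state is left unchanged.
after_history = ask_history.apply(start)
after_exam = examine.apply(after_history)
print(sorted(after_exam))
```

Because states are immutable sets, every operator application produces a distinct new element, which mirrors the idea that search moves from one state of knowledge to another.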
In clinical medicine, as in other domains of problem solving, the potential size of the problem space is enormous: there are a vast number of elements (states of knowledge about the patient) that could be obtained, and an exceedingly large number of potential operators (interview questions, physical examination maneuvers, laboratory tests) for obtaining them. The early generation of diagnostic problem formulations would appear to be a major strategy that is used by the physician to determine the regions of the potential problem space which are most likely to yield a solution. In sum, a set of problem formulations can be considered to constitute the framework of the functional problem space in which the physician conducts his search for a diagnosis.

In addition to the cognitive functions served by early problem formulation, there are two practical considerations which would tend to foster this activity. One is the time constraints under which the practicing physician has to work. Selective data collection, guided by a set of problem formulations, may be the only feasible way to complete a workup within the amount of time typically allotted for each patient. Secondly, there are occasions when the physical or psychological well-being of the patient requires that a management decision be taken right away, before completion of the data collection involved in a thorough workup. To make such decisions the physician would have to be operating in an "early problem formulation," rather than a "reserve judgment," mode of reasoning.

Although the early generation of problem formulations presents a number of advantages, both cognitive and practical, which justify its use, it also entails potential risks.
The primary risk is that of premature closure: i.e., the possibility that the physician may fail to seek data that could disconfirm his initial problem formulations, or fail to revise these formulations in the face of contradictory evidence, and thus arrive at an incorrect diagnosis. The literature on problem solving indicates that the initial conceptualization of the problem may create a mental set (Einstellung) that prevents the subject from shifting to more appropriate formulations as his search for a solution progresses (Luchins and Luchins, 1950). In addition, there is evidence that the human problem solver shows a marked reluctance to eliminate or revise an initial hypothesis as long as there is some confirmatory evidence in its favor (Wason, 1968). This may manifest itself as a failure to seek evidence that could disconfirm an hypothesis, or as a failure to interpret properly negative evidence that is obtained. The phenomena of set and psychological bias toward confirmatory evidence pose particular risks for medical problem solving. The wide range of possible data collection options in a medical workup, as well as the probabilistic character of clinical findings (leaving room for considerable latitude for interpretation on the part of the physician), make it possible that even an experienced clinician may become inadvertently wedded to an incorrect initial formulation due to biased data collection and/or biased data interpretation procedures.

There appear to be two means which the physician may use to avoid these problem-solving pitfalls. One pertains to the process of generating problem formulations. Premature closure becomes much less likely if the physician generates multiple problem formulations, all of which are supported in some degree by the initially available data. In this case it becomes necessary to seek out and critically interpret additional data that will permit determination of a differential diagnosis.
Moreover, if care is taken to generate problem formulations at a level of specificity that is appropriate to the data at hand, this would tend to facilitate subsequent revision of initial formulations (i.e., broadening or narrowing of the problem space) in the light of new data.

A second means of overcoming the risks of early problem formulation pertains to the process of data acquisition. As Elstein, et al. (1972) have proposed, a major justification for the use of routine procedures of history taking and physical examination is to reduce the likelihood of biased data collection. Such procedures would help to insure that an adequate test of initial formulations will be made, and that there will be sufficient opportunity for facts to emerge which could lead to revision of initial formulations and/or generation of additional formulations. A second possible rationale for routine procedures is that their application would require a minimal investment of cognitive effort on the part of the experienced physician, and would thus enable him to focus a greater part of his attention (and working memory space) on the processing (interpretation, organization, synthesis) of data, rather than on the activity of collecting data.

Development of the Training Model

This section will examine considerations related to the development of a model for training medical students in the generation of initial problem formulations.

The Simulation Component of the Model

One component of the training model consists of having the student practice the problem-solving skill to be attained--i.e., the generation of initial problem formulations--under conditions which simulate the early part of the clinical encounter. As Shulman (1970) has pointed out, the choice of a mode of instruction is not necessarily dictated by the nature of the outcome behaviors to be acquired.
In principle, for example, it would be possible to teach skills in clinical problem solving by an expository (or didactic) approach, or to teach basic concepts in medical science by a problem-solving (or discovery) approach. For some educational outcomes, e.g., the learning of concepts, principles or rules, there is considerable debate among educational psychologists as to the most appropriate mode of instruction, with Bruner (1966) advocating use of a discovery approach for the teaching of these skills, and Gagné (1970) arguing in favor of carefully programmed expository methods. An expository approach has been used with some success in training physician's assistants to employ clinical algorithms (step-by-step instructions) for dealing with a very limited set of medical problems, i.e., eleven acute illnesses (Sox, et al., 1973). However, this type of (algorithmic) rule learning is obviously not appropriate for training future physicians who will be called upon to solve a wide range of often complex medical problems, a task which would require, as it does in other domains (e.g., cryptarithmetic, logic, chess), the application of higher-order problem-solving skills such as cognitive strategies or heuristics.

With respect to these higher-order skills, there is considerably more consensus among educational psychologists regarding instructional method. For these skills we find that Gagné (1971) would advocate--in practice, although not necessarily in principle--an approach very similar to Bruner's.

    Although the possibilities of controlling the development of cognitive strategies seem definitely promising, the means may not be fully available . . . practically, then if a teacher wants a student to become a good thinker . . . he must provide many opportunities, throughout the course of his instruction, for him to encounter, formulate, and solve problems of many varieties in his chosen field. Such encounters may be expected to lead to the progressive refinement of the cognitive strategies of thinking (p. 522).

The adoption of a problem-solving mode of instruction in this study rests on more than a Gagnéan brand of pragmatism, however. The goal of the present training model is not to teach the medical student the various concepts and principles--of physiology, anatomy, pathology--that are employed in generating problem formulations. Rather it is to increase his capacity to integrate concepts and principles previously learned and to apply these integrated skills to problems of the type encountered in clinical practice. Although much of the research on "discovery versus expository" modes of instruction is equivocal (Shulman and Keislar, 1966), there is at least some research evidence (Egan and Greeno, 1973; Craig, 1969) to suggest that a discovery method may be best suited to achieving cognitive integration (of component skills into problem-solving skills) and lateral transfer (generalization of these new skills to problems not previously encountered). Thus, there is reason to believe that a problem-solving mode of instruction would be the method of choice in the present case, even if expository means were available.

In adopting a problem-solving mode of instruction, one seeks to simulate for the learner problem situations which, in breadth and complexity, closely approximate the real-life situations to which the training is to be generalized. Twelker (1971) has suggested that simulations may be classified into two categories: (1) "interpersonal-ascendent" (e.g., role playing, simulation games), and (2) "media-ascendent" (e.g., simulator equipment, motion pictures, computer simulations). The present training model involves simulation of the second type: namely, the use of motion picture films to simulate the conditions of the early part of the clinical workup. Each film presents a "physician's eye view" of an encounter with a patient.
The full set of films presents eight patients having various medical complaints (within the domain of internal medicine) and diverse demographic characteristics. Films of this type have been used previously as stimulus materials for assessing medical students' clinical skills, but the response mode was a multiple-choice test (Hammond and Kern, 1959). In the present case, the response mode also involves simulation (i.e., the student responds by generating a set of problem formulations as if he were the physician interviewing the patient). Research by Schalock (reviewed in Twelker, 1971) indicates that predictive validity (from a test situation to a complex real-life situation) is greater when simulation is involved on both the stimulus and the response side, and it is reasonable to assume that generalization, or lateral transfer, of skills would also be enhanced by such conditions.

The essence of simulation is a simplification of reality which maintains the aspects of the real-life setting that are necessary to the learning of a task, but omits the aspects that are in some degree unnecessary: in Twelker's (1971, p. 133) words: "Simulation = (real-life) - (task-irrelevant elements)." The use of films to simulate the medical student's encounter with patients relieves the student from having to carry out several activities in which the physician must normally engage: (1) the formulation and asking of questions; and (2) the establishment of rapport and handling of interpersonal relations. Viewing the film does, however, provide the student with the opportunity to carry out the two types of information-processing activity that are of concern in the present study: namely, the detection and encoding of cues on the basis of naturalistic observation (of the patient's physical appearance, manner and speech), and the use of these cues to generate diagnostic problem formulations.
In a real-life clinical encounter there is undoubtedly considerable interplay between the formulation and asking of questions and the formulation of diagnostic hypotheses, particularly as the workup progresses and the physician looks for data to test his hypotheses. However, during the first five minutes of the workup, the physician generally follows a fairly standard questioning format, i.e., elicitation of a description of the patient's current complaint(s); thus, the omission of the task of question-asking should constitute merely a simplification and not a distortion of the conditions under which initial problem formulations are generated.

The Feedback Component of the Model

Medical problem solving contrasts with many other types of problem solving in that there are no a priori or logical criteria that may be applied in evaluating clinical outcomes. The criteria that must be used, in evaluating final outcomes (i.e., diagnoses) or initial outcomes (i.e., problem formulations), are empirical and a posteriori. For final outcomes there are sometimes specific objective criteria that may be applied, e.g., a laboratory finding that unambiguously substantiates the correctness of a given diagnosis. For initial outcomes, however, appropriateness can only be evaluated in terms of the judgments of experienced practitioners in the field. Thus, the feedback component of the training model is based on data collected from a sample of experienced physicians who carried out the task of generating problem formulations with respect to each of the filmed cases. Data on the physicians' problem formulation outcomes were used to construct "outcome feedback" for each training exercise, and data on the physicians' problem formulation processes to construct "process feedback" for each exercise.
Hammond and his colleagues have conducted a number of experiments (Hammond and Summers, 1972; Hammond, Summers and Deane, 1972) in which it was found that the classical type of outcome feedback (i.e., informing the subject of the correct response) was an impediment to improvement of performance on complex tasks (multiple-cue probability learning, clinical judgment) in which the objective is not so much new learning as the application of skills already acquired, or, in Hammond's words, the acquisition of "cognitive control." On the other hand, various forms of "cognitive feedback" were effective in enabling the subject to develop cognitive control. Cognitive feedback is defined as material which "will enable the subjects to perceive not only that their judgment was in error, but why it was in error" (Hammond and Summers, 1972, p. 64).

The outcome feedback utilized in this study includes not only the problem formulations generated by the physician sample, but also a list of the cues that they considered to be relevant to each formulation. The first objective of this feedback is to enable the subject to evaluate the appropriateness of his outcomes (i.e., the formulations he generated as compared to those of the physicians). The second objective of this feedback (and the reason that a cue list is provided for each problem formulation) is to permit the subject to discover some of the reasons his outcomes deviate from those of the physicians. By examining the feedback material the subject could discover, for example, that he failed to detect certain cues, that he encoded some cues incorrectly, that he failed to "cluster" related cues, or that he failed to consider that cues may be relevant to multiple formulations. Thus, the outcome feedback used in this study is much closer to cognitive feedback, as defined by Hammond, than to the classical type of outcome feedback used in learning experiments and programmed instruction.
The process feedback utilized in this study is intended to further assist the subject in determining why his outcomes deviate from those of the physicians, and thus is also a form of cognitive feedback. These materials attempt to portray the types of information-processing activity that go on inside the physician's head during the early part of the clinical workup, and, thereby, to enable the subject to discover inadequacies in his own information-processing activities. It is not likely that the medical student will learn to "think like" an experienced physician as a result of receiving process feedback. But it is possible that he will be able to use information on how physicians think to improve his skill in attaining outcomes similar to those of the physician.

To summarize: The literature reviewed in this chapter suggests that probably the most salient characteristic of medical problem solving is the early generation, and subsequent testing, of diagnostic problem formulations. Although there is as yet little research evidence regarding the characteristics of the initial problem formulations a physician generates, or the cognitive mechanisms involved in their generation, an attempt was made to define, in general terms, (a) two possible cognitive functions that initial problem formulations may have in the conduct of a clinical workup, and (b) some of the potential risks that could be entailed by the early generation of problem formulations. Finally, consideration was given to issues related to the development of the proposed model for training medical students in the generation of initial problem formulations, including: (1) the choice of a problem-solving (rather than an expository) mode of instruction, (2) the use of films as a means of simulating the conditions of the early part of the clinical encounter, and (3) the use of data from experienced physicians to provide the student with cognitive feedback of an outcome and a process nature.
CHAPTER III

METHOD

This chapter describes the methodology of the study. It includes three major sections: (1) production of the films, (2) collection of the physician data, and (3) design of the training experiment conducted with second-year medical students.

Production of the Films

Eight 16-millimeter films in color and with sound were produced. Each film presents the first 4-6 minutes in a physician's encounter with a new patient.[1] The setting of the interview is a doctor's office; thus, the patient is ambulatory, and his problem is of a non-emergency nature.

[1] The average length of the films is 5 minutes, 18 seconds; the films range in length from 4 minutes, 20 seconds to 6 minutes, 10 seconds.

Since the purpose of the films is to provide the viewer with a realistic simulation of participation in a clinical encounter, they were produced so as to present a "physician's eye view" of the encounter. Throughout the film the camera remains on the patient; the physician's voice is heard but he is never seen. After an initial 30-second segment in which the patient walks in and sits down, he is shown seated throughout the rest of the film, with close-up shots of his face occurring at several points. In sum, by focusing on the patient, the films were designed to facilitate the viewer's task of adopting the role of physician in a simulated clinical encounter.

The cases for the eight films were selected to represent a cross-section of problems in internal medicine, as well as a variety of patient demographic characteristics (age, sex, occupation). Four of the films are based on cases that were used in the Elstein, et al. (1972) investigation of physician reasoning. The other four cases were developed specifically for this study.

Table 1 lists the eight films, titled according to the demographic characteristics of the patient, and numbered in the order they were presented during the training experiment.
The table also indicates the presenting complaint(s) of the patient in each film.

TABLE 1.--The Eight Films.

Number  Title                                        Presenting Complaint
1       A 21-year-old college senior                 Fatigue and weakness
2       A 43-year-old landlady of a boarding house   Substernal chest pain
3       A 30-year-old taxi driver                    Urinary distress
4       A 40-year-old carpenter                      Left chest pain
5       A 19-year-old college sophomore              Headache and sleepiness
6       A 29-year-old lawyer                         Low back pain
7       A 57-year-old executive                      Cough and fever
8       A 19-year-old student nurse                  Abdominal pain and vomiting

For each film, a case outline was prepared consisting of the following information: (1) information to appear on a written sheet, including the patient's major demographic characteristics (age, sex, occupation) and his temperature; this sheet was presented to subjects viewing the films as having been "filled out by the nurse" just prior to the interview; (2) information to be included in the doctor-patient dialogue, including a list of the patient's complaints, a brief description of the salient attributes of each complaint (e.g., onset, duration, location, severity, etc.), and other data of relevance; (3) an indication of significant nonverbal aspects of the case, e.g., the patient's physical appearance, psychological state, etc. An example of a case outline appears in Appendix A.

Production of each film involved the following steps:

1. A case outline was prepared.
2. An experienced amateur actor was selected to play the role of the patient.
3. After the actor had familiarized himself with the case outline, a warm-up session was held in which the actor was coached with respect to his role, i.e., both the verbal presentation of his complaints, and the nonverbal aspects of his role (e.g., gait, posture, gestures, facial expressions, etc.).
4. A trial-run of the interview was videotaped, with immediate replay permitting discussion and critique.
5. The interview was filmed.
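As an illustrative aside, the three-part case outline described above (nurse's sheet, dialogue content, nonverbal aspects) can be pictured as a simple record. The sketch below is only one possible representation, written in Python; the class and field names are hypothetical and do not come from the study, and the only values filled in are those given for Film 1 in Table 1. The patient's sex and temperature, and the attributes of each complaint, are left unspecified because they are not reported in this chapter.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Complaint:
    """One complaint and its salient attributes (hypothetical field names)."""
    name: str
    onset: Optional[str] = None
    duration: Optional[str] = None
    location: Optional[str] = None
    severity: Optional[str] = None


@dataclass
class NurseSheet:
    """Part (1): the written sheet 'filled out by the nurse'."""
    age: int
    occupation: str
    sex: Optional[str] = None            # not stated in this chapter
    temperature: Optional[float] = None  # not stated in this chapter


@dataclass
class CaseOutline:
    """Parts (1)-(3) of a case outline for a single film."""
    film_number: int
    nurse_sheet: NurseSheet
    complaints: List[Complaint]                                  # part (2)
    other_relevant_data: List[str] = field(default_factory=list)  # part (2)
    nonverbal_aspects: List[str] = field(default_factory=list)    # part (3)


# Film 1 as listed in Table 1; everything not in the table is left empty.
film_1 = CaseOutline(
    film_number=1,
    nurse_sheet=NurseSheet(age=21, occupation="college senior"),
    complaints=[Complaint("fatigue"), Complaint("weakness")],
)
```

The actual content and layout of a case outline are those shown in Appendix A; this record merely mirrors the three categories of information named in the text.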
Four physicians assisted the author with preparation of the case outlines and coaching of the patient-actors. Two of them played the role of the physician in the films (four films each). Given the constraint that each of the major topics in the case outline be covered, the physician-actor was free to conduct the interview in accordance with his usual practice. He was, however, instructed to utilize a relatively standard, unobtrusive questioning technique, and, in particular, to avoid any questions that would obviously imply a particular problem formulation. During the warm-up session, a general sequence of events was worked out, but, in order to preserve naturalness of dialogue, any tendency to establish a fixed script was avoided.

Each film begins by showing the patient walking into the doctor's office and sitting down to await the arrival of the physician. During this initial segment (30 seconds or less), relevant nonverbal attributes of the patient are presented, e.g., the patient coughs; he is holding his abdomen; he slumps in the chair, etc. When the physician enters, it is assumed that he has with him a sheet filled out by the nurse indicating the patient's name, his major demographic characteristics, and his temperature. The interview begins with the patient presenting the complaint (or complaints) that have brought him to see the doctor. Questioning by the physician elicits information about the complaint(s), e.g., onset, duration, severity, location, amelioration, etc. As the interview progresses, additional complaints and their attributes, as well as certain other items of relevance, are either presented by the patient or elicited by physician questions. The degree to which information is presented (by the patient) or elicited (by physician questions), as well as the manner in which the patient presents information or responds to questions, vary in each film depending on the type of patient.
For example, in Film 1 the patient is a 21-year-old college student who is very articulate in his presentation of information and who responds precisely and in detail to the physician's questions, while in Film 3 the patient is a 30-year-old taxi driver who has a good deal of difficulty in describing his complaints and who responds imprecisely and with minimal detail to the physician's questions.

To summarize: each film presents both verbal and nonverbal information. Information presented in the verbal mode, i.e., the dialogue between the doctor and the patient, consists primarily of a brief review of each of the patient's current complaints (history of present illness), but includes a few items of personal, past medical or family medical history that have particular relevance to the present illness. The types of nonverbal information presented include: (1) the physical appearance of the patient (e.g., posture, build, dress); (2) the psychological state of the patient (e.g., gestures, manner of speech, facial expression); and (3) nonverbal cues of particular relevance to the patient's current medical problem (e.g., cough, photophobia, clutching of abdomen).

The objectives of the training experiment influenced the design of the films in several ways. First, each film is intended to provide a brief introduction to a case, on the basis of which the subject is to generate a set of initial problem formulations which he would wish to investigate more thoroughly during the remainder of the workup. The 4-6 minute interview is structured so as to incorporate a limited amount of data on each of the patient's current complaints, but is not intended as a complete history of present illness. This fact is reinforced by the physician's closing statement in the films, e.g., "Now, let us go back and review several of your problems."
Second, given the fact that the training experiment deals with cognitive information-processing skills, the films are not designed to focus on the affective, interpersonal aspects of the doctor-patient encounter. In producing each film, the physician attempted to use an interview style that represents good medical practice with respect to the establishment of rapport with a patient. The films do not, however, attempt to provide models of interpersonal interaction.

Collection of Physician Data

The purpose of this phase of the study was to obtain data on physician performance to be used in designing the training experiment involving medical students. The eight films described in the preceding section were shown to a sample of experienced physicians. For each film, two types of data were collected: (1) data on the outcome of the physician's information processing (principally, the set of problem formulations he generated and the cues associated with each), and (2) data on the processes by which the physician generated his set of problem formulations.

Sample

In utilizing data from a sample of practitioners in a field as the basis for developing materials to train and evaluate students in that field, it would be desirable to obtain a sample of practitioners of proven expertise. In the present study, the ideal physician sample would be a group of clinicians in the field of internal medicine who are known to have outstanding diagnostic skills. Unfortunately, there appears to be no currently available means for identifying expert practitioners in clinical medicine. One method that has been attempted--the use of peer nominations to select "criterial" diagnosticians (Elstein, et al., 1972)--did not prove to be successful (i.e., the criterial and noncriterial samples were not found to differ on a wide range of problem-solving measures). Thus, in the present study representativeness rather than criterial expertise was the basis for sample selection.
In selecting the sample the objective was to obtain a group of subjects whose academic backgrounds and clinical experience would be representative of the population of physicians who would generally deal with the type of medical cases presented in the films (i.e., office visits pertaining to problems in internal medicine). It was therefore decided to select a sample of eight physicians that included four specialists in internal medicine (with M.D. degrees) and four family medicine physicians (two with M.D. degrees and two with D.O. degrees). The eight subjects were selected from among the physicians associated with the Michigan State University Colleges of Human and Osteopathic Medicine. Table 2 summarizes the characteristics of the physician sample, including the type of degrees they hold, their areas of specialization, and the number of years of experience they have had as practicing clinicians, and as medical educators.

TABLE 2.--Characteristics of the Physician Sample.

                                 Average No. Years Experience
n   Degree   Specialization      As Practicing Clinician   As Medical Educator
4   M.D.     Internal Medicine   11.0                      11.7
2   M.D.     Family Practice     8.0                       2.5
2   D.O.     Family Practice     8.5                       1.5

Given the lengthy data collection procedure for each film, and the limited amount of time certain subjects could make available, it was not possible to obtain data on the performance of all eight subjects for each of the films. For each film, data were obtained from a minimum of three (out of the four) internists and three (out of the four) family medicine physicians. For one film (number 6), data were obtained from a total of seven subjects; for the other seven films, data were collected from a total of six subjects.
In order to construct the feedback materials and the dependent variable scoring keys, it was necessary that the physician sample (per film) be of sufficient size (1) to permit identification of commonalities in problem formulation outcomes and processes across the range of academic and clinical backgrounds represented in the sample, and (2) to provide an indication of the range of diversity that would be characteristic of experienced practitioners having such backgrounds. It is believed that the present sample was of sufficient size to provide adequate data on both of these points. The composite set of problem formulations generated for each film always included: (1) a substantial subset of formulations (approximately 30-50% of the composite) which were common to all (or all but one) of the physicians, and thus could provide a basis for determining the outcomes which the student should necessarily attain, and (2) a second superordinate subset of formulations (approximately 80-90% of the composite) which were generated by at least two physicians, and thus could provide a basis for defining the range of outcomes that it would be acceptable for the student to attain.

Materials

Two types of materials were used in collecting the physician data: (1) response sheets, and (2) a Process Checklist.

The response sheets were used by the subject to record the problem formulations he had generated while viewing a film. The sheets had a very simple format: a line across the top of the sheet for a problem formulation title, and space underneath for listing the cues of relevance to that title.

In the Elstein, et al. (1972) investigation of physician reasoning, it was found that subjects often had some degree of difficulty in providing an introspective description of the mental processes by which they generated problem formulations.
It was therefore decided that, after the subject viewed each film, in addition to tape recording his introspective reconstruction of his thinking process while viewing the film, he would be given a checklist to fill out. The use of a checklist for the assessment of problem-solving processes was suggested by a study by Marshall (1971).

The Process Checklist devised for this study consisted of a series of 25 statements that pertain to four aspects of the act of generating problem formulations:

1. modes of mental representation,
2. strategies of problem formulation, including
   a. initial routines,
   b. general strategies,
3. associative processes of problem formulation,
4. cue utilization.

The classification of checklist items according to the above categories, and a brief description of each item appear in Chapter IV (Table 9). A copy of the Process Checklist is contained in Appendix B. In administering the checklist, the subject was instructed to check those items which "characterize your thinking while viewing this film."

The checklist items were derived from data obtained from the physicians who participated in the Elstein, et al. (1972) simulations. The "think aloud" and recall protocols of these physicians were reviewed; statements pertaining to problem formulation processes were picked out, and, with some modification of wording, included in the checklist. In addition, some items were devised to describe processes which were not explicitly stated by the physicians, but which could be inferred from a review of their protocols.

Procedure

The physician data were collected in individual sessions lasting about three hours during which three or four films were viewed. The session began with the administration of a set of general instructions. For each film, the following steps were involved in the data collection.

Collection of the Outcome Data:

1. The subject was first shown the initial segment of the film in which the patient walks into the doctor's office and sits down to await the arrival of the physician. The film was stopped at this point and the subject was asked to comment on his impression of the patient and on any ideas that came to mind as to what problems the patient might have. The subject's comments were tape recorded.
2. The subject was given the written "nurse's sheet" pertaining to the patient. His comments regarding impressions of the patient or ideas about the patient's problems were tape recorded.
3. The subject was then shown the rest of the film. At the end of the film, he was asked to fill out a response sheet for each problem formulation that he had generated while viewing the film.
4. The subject was asked to provide his tentative assessment of the case. He was asked to indicate: (1) how well substantiated he considered each of his problem formulations to be on the basis of the data obtained; (2) whether he anticipated that the patient has a single illness or multiple disorders; (3) whether he considered there to be any functional relationships among his problem formulations (e.g., some formulation could be considered to be secondary to, superimposed on, or contributing to, etc., some other formulations). All comments were tape recorded.

Collection of the Process Data:

1. The subject was asked to attempt "to reconstruct your thinking while viewing this film," including such things as the point in the film when each problem formulation came to mind, the cues that were significant in generating each problem formulation, and any revision of initial formulations as the interview progressed. These comments were tape recorded. The physician was offered the opportunity to view the film again as he reconstructed his thinking, but only occasionally did a subject elect to do so.
2. The Process Checklist was administered.
The experimenter used the items checked by the subject as a basis for asking additional questions pertaining to processes of problem formulation. The subject's responses were tape recorded.

Although the checklist was administered after each film, it is believed that the length of the list (25 items) and the number of activities intervening between each administration of the list were sufficient to minimize any effect that exposure to the checklist might have had on subsequent problem formulation activity.

Analysis

The primary purpose of the analysis of the physician data was to obtain a basis for designing two components of the training experiment: (1) the outcome and process feedback materials for each of the six training films; and (2) the dependent variable scoring keys used to evaluate student performance on the posttest. The development of the feedback materials and scoring keys is described in the next section of this chapter.

Analysis of the physician data was also conducted for a secondary purpose: namely, further specification of the nature of the problem formulation component of medical problem solving, beyond that which has been forthcoming from previous investigations. This analysis was designed to address three questions:

1. How early in the clinical workup does the physician begin to generate problem formulations?
2. What is the structure of a set of initial problem formulations?
3. What cognitive processes are involved in the generation of initial problem formulations?

A description of the methods of analysis, as well as a discussion of the results pertaining to each question, are presented in Chapter IV.

The Training Experiment with Second-Year Medical Students

The training experiment employed a posttest-only control group design, with subjects assigned at random to one of three experimental conditions:

1. Treatment I: Training with Outcome Feedback; Posttest
2. Treatment II: Training with Outcome and Process Feedback; Posttest
3. Control: Posttest only

Since both treatment conditions were based on the general training model described in Chapter I, they had a number of features in common.

1. Both conditions consisted of three training sessions (with two films presented at each session), and a posttest session (at which two films were presented). The general format of each session was the same under both conditions.
2. Under both conditions, the subject carried out the same basic task with respect to each of the films: i.e., having viewed the film, he filled out a set of response sheets indicating the problem formulations he had generated, and he wrote a brief tentative assessment.
3. Under both conditions, the subject was provided with feedback materials based on the physician performance data.

The two training conditions differed, however, with respect to the type of feedback provided. Under one condition (Treatment I), the feedback materials provided the student with "outcome models," i.e., examples of the problem formulations and tentative assessments generated by the physicians for each of the training films. Under the other condition (Treatment II), the feedback materials provided the subject with "outcome models," as defined above, and "process models," i.e., materials which portrayed the processes by which the physicians arrived at their problem formulations. The outcome feedback, provided under both treatments, was presented in written booklet form. The process feedback provided under Treatment II included audio supplements to each of the training films, and written materials.

The control condition involved two sessions: (1) an initial orientation session whose purpose will be described subsequently; and (2) a posttest session equivalent to the posttest session under the two training conditions.

Experimental Procedure

Under both treatment conditions, training was conducted in three sessions, with two films presented at each session.
The order in which the films were presented was the same under both conditions: namely, random order, with the restriction that films involving similar medical complaints and/or similar patient demographic characteristics were not presented consecutively (see Table 1). Under both treatment conditions, the instructional sequence followed for each film remained the same across the three training sessions. The posttest session involved the presentation of two films, with the same procedure followed under all three experimental conditions.

All experimental manipulations were administered to the subjects by means of individual booklets in self-instructional format. At the beginning of the first training session, this booklet provided the subject with a set of orientation materials designed to acquaint him with the problem formulation task. At the beginning of sessions two, three and four, the subject was given review materials. The instructional sequence (or posttest task) for each of two films was then administered by means of the self-instructional booklet. Thus, the role of the experimenter was limited to a small number of preliminary verbal instructions.

It was decided that a single session for the subjects under the control condition would be undesirable for two reasons. (1) Because a single control group session would require administration of the orientation materials prior to the presentation of the two posttest films, it would be longer than the treatment group posttest session, and, of course, not fully equivalent in content. (2) It is possible that on a subject's first exposure to the task his performance might be depressed, or affected in some other unknown way, by the novelty of viewing a filmed interview and then having to fill out a set of relatively unfamiliar response sheets. It was believed that any "novelty effect" would be eliminated after a subject's first exposure to one of the films, and to the task of filling out the response sheets.
In order to make the posttest session conditions as similar as possible across all three groups, and in order to control for the possibility of a "novelty effect," it was decided to conduct two sessions for control group subjects. During the first session, the subject was provided with orientation materials (similar to those administered at the first session under the treatment conditions), and he carried out a task designed to control for a possible novelty effect during the subsequent posttest session: namely, the sixth training film was presented, and the subject recorded his problem formulations and tentative assessment. (Feedback was, of course, not provided.) The second control group session involved the same procedure as the treatment group posttest session: review of the orientation materials, followed by presentation of the two posttest films. The format of the experimental procedure is outlined in Table 3.

TABLE 3.--The Experimental Procedure.

              Experimental Conditions
Week   Treatments I and II     Control
1      Training session--1
         Orientation
         Film 1
         Film 2
2      Training session--2
         Review
         Film 3
         Film 4
3      Training session--3     Orientation session
         Review                  Orientation
         Film 5                  Film 6
         Film 6
4      Posttest session        Posttest session
         Review                  Review
         Film 7                  Film 7
         Film 8                  Film 8

The heavy academic work-load of the second-year medical student population placed several constraints on the scheduling of the experimental sessions. Because of the students' very full class schedule during the day, it was necessary to conduct the sessions in the evenings. Moreover, because of the limited number of time slots available for scheduling sessions (i.e., four weekday evenings), it was necessary to conduct group sessions involving all 16 of the subjects assigned to an experimental condition.
Although individual administration of the training and posttest was not feasible due to practical constraints, it was not considered to be necessary (in order to treat the subject as the unit of analysis) since all experimental manipulations were carried out by means of a self-instructional booklet, and there was no interaction among subjects during the sessions.

The three experimental conditions were randomly assigned to three evening time slots. The treatment group sessions were held on four consecutive weeks. The control group sessions were held on two consecutive weeks (during the third and fourth weeks of the treatment group sessions). Although this procedure had the undesirable feature of confounding each experimental condition with one weekly time slot, it was not considered that this factor would pose a serious threat to the internal validity of the experiment.

Under the treatment conditions, the first session lasted approximately two and one-half hours, and the other sessions approximately two hours each. Under the control condition, the first session lasted one and one-half hours, and the second two hours. A 10-minute break was taken about half-way through each session. An assistant aided the experimenter with the administration of each session.

Due to absences at the scheduled group sessions, it was necessary to conduct make-up sessions on an individual or small-group basis. A total of 10 subjects (4 from Treatment I, 3 from Treatment II, and 3 from the control condition) participated in one make-up session; one Treatment I subject participated in two such sessions. Each make-up session was held within two days of the scheduled group session it was replacing.

The Problem Formulation Task

For each of the filmed training and posttest cases, the subject was confronted with the same basic task.
Before viewing the film, he was given the following instructions:

    While viewing the film, you should generate a set of initial problem formulations which you would want to investigate more thoroughly if you were to continue the workup beyond the first 4-6 minutes presented in the film.

After the subject read the "nurse's sheet" and viewed the film, he recorded the problem formulations he had generated on response sheets. The response sheets were set up so that a group of hierarchically related problem formulations would be recorded on a single response sheet, with the most general formulation in the hierarchy listed on the front of the response sheet, and the more specific formulations listed on the back. If a subject generated a formulation that was not hierarchically related to another formulation, it was listed on the front of the response sheet. An example of a response sheet appears on page 65.¹

¹In the training materials, the term "problem formulation" was used to designate the response recorded on the front of the response sheet, and the term "more specific diagnostic possibility" to designate responses (if any) recorded on the back of the sheet. This distinction was made in order that the terminology used in training would be consistent with that to which the students had become accustomed in learning to fill out a Problem-oriented Record. However, unless specifically indicated, a single term--"problem formulation"--has been employed throughout this report to refer to any diagnostic hypothesis, whether listed on the front or back of a response sheet.

[Response sheet, front. Fields: Film; Problem formulation title; CUE LIST. Printed note: More specific diagnostic possibilities (if any) are to be listed on the back of the sheet.]

[Response sheet, back. Heading: More specific diagnostic possibilities which you have considered for this problem formulation (if any). Columns: Title; Cues of particular relevance.]

After the subject had filled out the problem formulation response sheets, he wrote a brief paragraph giving his tentative assessment of the case. He was instructed that his tentative assessment should indicate:

--how well substantiated you consider each of your problem formulations to be on the basis of the data obtained thus far;
--whether you anticipate that the patient has a single illness that will account for his various problems, or that he has multiple disorders;
--whether you consider there to be any relationships among your problem formulations. For example, you may consider one problem formulation to be secondary to, superimposed on, or contributing to, etc. some other formulation.

The Instructional Sequence

Under both treatment conditions, the same instructional sequence was followed for each of the six filmed training cases. This sequence involved five steps which are summarized below.

STEP 1: The subject read the "nurse's sheet" for the patient in the film.
STEP 2: The subject viewed the film of the 4-6 minute interview with the patient.
STEP 3: The subject recorded the problem formulations he had generated, and wrote a brief tentative assessment.
STEP 4: The subject was provided with feedback on the performance of the experienced physicians.
    a. "Feedback Sheet 1" was presented.
    b. Treatment I: The film of the interview was presented a second time.
       Treatment II: The process feedback version of the film was presented.
    c. "Feedback Sheet 2" was presented.
STEP 5: The subject filled out a self-evaluation checklist.

The first three steps in the sequence constituted the basic experimental task. Step 4, which embodied the experimental manipulation of feedback, was the only step in the instructional sequence which differed for the two treatment conditions.
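The five-step sequence can be restated as a small data structure, which makes it easy to see where the two treatments diverge. This is only an illustrative sketch: the function names and step labels are assumptions introduced here, and the wording of each step paraphrases the text (the difference in the "Summary" section of Feedback Sheet 2 is taken from the description of the feedback that follows).

```python
def instructional_sequence(treatment):
    """Return the ordered (step, activity) pairs for one filmed training case.

    `treatment` is "I" (outcome feedback only) or "II" (outcome and
    process feedback). Names and labels are illustrative assumptions.
    """
    if treatment not in ("I", "II"):
        raise ValueError("treatment must be 'I' or 'II'")

    if treatment == "I":
        film = "standard film of the interview, second presentation"
        summary = "summary of the physicians' tentative assessments"
    else:
        film = "process feedback version of the film"
        summary = "summary of problem formulation processes and tentative assessments"

    return [
        ("1", "read the nurse's sheet"),
        ("2", "view the film of the 4-6 minute interview"),
        ("3", "record problem formulations; write a tentative assessment"),
        ("4a", "Feedback Sheet 1: major problem formulations"),
        ("4b", film),
        ("4c", "Feedback Sheet 2: additional problem formulations; " + summary),
        ("5", "fill out the self-evaluation checklist"),
    ]


def differing_steps():
    """List the step labels whose content differs between Treatments I and II."""
    pairs = zip(instructional_sequence("I"), instructional_sequence("II"))
    return [a[0] for a, b in pairs if a != b]
```

Calling `differing_steps()` on this sketch returns only sub-steps of Step 4, which mirrors the statement that Step 4 was the sole point of difference between the two treatment conditions.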
The feedback was presented to the subjects in three parts. The first part (Feedback Sheet 1, entitled "Major Problem Formulations") was designed to provide the subject with feedback on the problem formulation outcomes that were common to the responses of all, or nearly all, of the physicians. This sheet enabled the subject to determine whether he had generated those formulations which the physician data indicated were of major importance for the case under consideration.

The second part of the feedback consisted of a film presentation: in the case of Treatment I, a re-presentation of the standard film of the interview; in the case of Treatment II, a presentation of the process feedback version of the film. Under both treatment conditions, a second exposure to the interview provided the subject with implicit feedback on the adequacy of his detection and recall of cues. The conditions differed, however, with respect to the provision of process feedback. The film presented under the Treatment II condition provided explicit feedback, via the physician's "think aloud" comments, on the processes by which the physician sample generated each of the problem formulations listed on Feedback Sheet 1. Under the Treatment I condition, on the other hand, the second presentation of the standard film provided the student with the opportunity to attempt to reconstruct, on his own, the processes by which the problem formulations on Feedback Sheet 1 were generated.

The third part of the feedback (Feedback Sheet 2) contained one section that was the same for subjects under both conditions. This section (entitled "Additional Problem Formulations") was designed to provide the subject with feedback on the range of diversity in the physicians' problem formulation outcomes. The second section of Feedback Sheet 2 (entitled "Summary") differed for the two treatment groups. Under Treatment I, it summarized the comments included in the physicians' tentative assessments.
Under Treatment II, it consisted of a reconstruction of the physicians' reasoning about the case, including a description of the processes by which the problem formulations listed on both feedback sheets were generated, as well as a summary of the comments in the physicians' tentative assessments. Feedback Sheet 2 pointed out both the commonalities and the range of diversity that were characteristic of the physicians' problem formulation processes, and their tentative assessments. Table 4 outlines the properties of the feedback presented under the two treatment conditions.

TABLE 4.--Properties of the Feedback Presented under the Two Treatment Conditions.

Feedback Materials        Treatment I          Treatment II
A. Feedback Sheet 1       PF Outcomes (C)      PF Outcomes (C)
B. Film Presentation      Standard Film        Supplemented Film: PF Processes (C)
C. Feedback Sheet 2:
     Section 1            PF Outcomes (D)      PF Outcomes (D)
     Section 2            TA Outcomes (C,D)    PF Processes (D) and TA Outcomes (C,D)

NOTE: PF = problem formulation; TA = tentative assessment; C = feedback indicating the commonalities found in the performance of all or nearly all physicians; D = feedback indicating the range of diversity found in the performance of the physicians.

The fifth and final step in the instructional sequence consisted of filling out a self-evaluation checklist. This checklist was designed to serve two functions: (1) to ensure that the subject carried out the process of comparing his own performance to that of the experienced physicians, and (2) to provide the subject with a sense of closure at the completion of the instructional sequence for a case. The first part of the checklist listed the titles of the problem formulations generated by the physicians. The second part of the checklist listed the statements regarding functional relationships between problem formulations that were found in the physicians' tentative assessments.
The subject was instructed to check each item in the list that corresponded to one of his own responses.

Outcome Feedback

These feedback materials were designed to provide the subject with models of appropriate outcomes for each of the filmed training cases: namely, the problem formulations and tentative assessments generated by the experienced physicians.

Feedback Sheet 1 and the first section of Feedback Sheet 2 were based on a tabulation of the physician problem formulation data. Feedback Sheet 1 presented those problem formulations generated by at least five of the physicians who viewed the film. The first section of Feedback Sheet 2 presented the problem formulations generated by two to four of the physicians. Responses generated by only one physician were not included in the feedback.

Each problem formulation presented in the feedback sheet included two components: (1) a problem formulation title, and (2) a list of cues of relevance to that title. Problem formulations were organized on the feedback sheets so as to indicate hierarchical relationships (if any) among subsets of formulations. Cues of relevance to all formulations in a hierarchy were listed under the problem formulation title at the head of the hierarchy; only those cues of particular relevance to each more specific formulation in the hierarchy were listed under the subordinate title(s) in the hierarchy (see Figure 1).

The second section of Feedback Sheet 2 was based on a review of the transcriptions of the physicians' tape-recorded tentative assessment comments. It indicated both the commonalities and the range of diversity in the physicians' assessments with respect to the three topics listed on p. 67.

Process Feedback

These feedback materials were designed to provide the subject with models of the processes by which the physicians arrived at their problem formulation outcomes. For each case, a process feedback version of the training film was produced.
Each of these films included three or four "think aloud" segments interposed at appropriate points in the standard film of the interview. For each such segment the dialogue between the doctor and patient stopped, an image of the patient was frozen on the screen, and the physician was heard "thinking aloud" about such matters as:

--the problem formulations he had generated up to that point;
--the cues which had led to the generation of these formulations;
--alternative interpretations being considered for certain ambiguous cues;
--any strategies that were guiding his thinking;
--his impressions of the patient;
--his revisions of previous formulations in the light of new data.

In sum, the "think aloud" segments attempted to provide the viewer with simulated access to the processes going on in the physician's head as he conducted the interview.

GI DISORDER: ULCERATIVE COLITIS, or REGIONAL ENTERITIS/ILEITIS

GI disorder
    pain in right lower quadrant of abdomen, for 4 months
    occurs in evening, lasting several hours
    not relieved by aspirin or Darvon
    not related to foods
    diarrhea: increase in number of stools, from 1 to 4-5/day, over 3 month period
    mucus in stools
    blood in stools
    pieces of food in stools
    weight loss, 25 lbs. in 1-2 months
    good appetite, eating more than usual
    extreme fatigue and weakness, for 2 months
    no vomiting

Ulcerative colitis, or regional enteritis/ileitis
    diarrhea
    blood and mucus in stools
    weight loss of 25 lbs., with good appetite
    age 21
    college senior: under academic stress
    concerned about keeping up with studies

Figure 1. A Sample of the Outcome Feedback for Film 1.

The process feedback films were produced as follows. For each film, the physicians' retrospective recall protocols and their responses to the Process Checklist were summarized in outline form.
On the basis of these data, three or four appropriate stopping points in the film dialogue were selected, and a script was prepared for the "think aloud" segment to take place at each point. Since the process feedback films were intended to portray the cognitive activities characteristic of experienced physicians in general, the scripts for the "think aloud" segments included only those processes typically involved in the generation of the "major" problem formulations listed on Feedback Sheet 1.

The recording of the "think aloud" segments was carried out under conditions similar to those of the filming of the interviews. (1) In order to preserve naturalness of speech in the "think aloud" segments, the physician was provided with a script outlining the topics to be covered in each segment, but was otherwise free to improvise the comments he would make if he were in fact "thinking aloud." (2) After the physician had reviewed the script, a trial run was carried out. (3) The final tape recording of the "think aloud" segments was undertaken. In order to facilitate the physician's task in simulating spontaneous "thinking aloud," he was shown the film of the interview up to each stopping point just prior to carrying out each commentary. Finally, a film laboratory undertook the production of the freeze frames of the patient and the insertion of the "think aloud" audio tracks into the original films of the doctor-patient interview.

Further process feedback was provided to the subject in written form by means of the "Summary" on Feedback Sheet 2. This summary attempted to indicate both the commonalities and the range of diversity of the physicians' problem formulation processes.
It included: (1) a recapitulation of the main points included in the "think aloud" segments of the process feedback film (relevant to the "major" problem formulations on Feedback Sheet 1), and (2) a presentation of the processes underlying the generation of the "additional" problem formulations (found on Feedback Sheet 2). The "Summary" also included discussion of the physicians' tentative assessments. Thus, the final portion of the feedback materials provided to subjects under the Treatment II condition attempted to integrate all feedback information of relevance to the case.

Materials

All instructions and written training materials were provided to the subject by means of a booklet in self-instructional format. The first section of the booklet (entitled "Introduction") provided the subject with the orientation materials. For the treatment groups, these materials included: (1) a statement of the purpose of the instructional package; (2) a description of the role of initial problem formulations in medical problem solving; (3) a definition of the components of an initial problem formulation; (4) a definition of a tentative assessment; (5) a description of the materials to be used; (6) a summary of the five steps in the instructional sequence to be followed for each filmed training case; (7) a set of guidelines to follow in filling out the problem formulation response sheets; (8) examples of two problem formulation sheets and a tentative assessment sheet filled out for a sample case. The control group received the same orientation materials, except that item 6 was omitted and the other items were modified as needed to omit reference to the feedback materials.
The section of the booklet for each of the six training films had an identical format: instructions were presented for each of the five steps in the instructional sequence, and the written training materials (nurse's sheets, and feedback sheets) were presented at the appropriate points in the sequence. The sections of the booklet for the two posttest films (and for the control group orientation film) presented instructions equivalent to those for steps one, two and three in the instructional sequence.

At the beginning of each new session, the subject received guidelines for review of the materials presented at the preceding session(s). At the beginning of session four, the experimental subjects also received a summary feedback sheet entitled "Common errors observed in your responses to the previous cases." This sheet was prepared by the experimenter after having reviewed all subjects' responses to the six training films. The sheet discussed the following types of errors:

1. errors in the generation of problem formulation titles due to failures to organize information in an appropriate manner (items 1 and 2).
2. errors in the listing of cues under problem formulation titles due to insufficient consideration of cue relevance (items 3, 4 and 5).
3. errors in writing a tentative assessment due to insufficient consideration of relationships among problem formulations (item 6).

Along with a description of each type of error, a guideline for avoiding the error was presented.

In addition to the Instructional Booklet, the subject received a Response Booklet. This booklet contained the response sheets to be used in recording the problem formulations and tentative assessment generated for each case. It also included the self-evaluation checklists for each of the six training cases. Examples of the experimental materials are contained in Appendix C.
Posttest Tasks

The Basic Posttest Task.--At the posttest session the subject carried out the basic problem formulation task with respect to each of two films. Selection of the films to be used for the posttest was based on two considerations. First, in order to obtain as broad a sample as possible of the content domain (i.e., internal medicine), films that presented dissimilar cases, with respect to type of medical complaints and patient demographic characteristics, were selected. Second, in order to avoid a possible ceiling effect, the films were selected from among those which were found, upon examination of the physician data, to provide the basis for the generation of a relatively large number of problem formulations.

The Additional Posttest Tasks.--After the subject had completed the basic posttest tasks, he was administered two additional tasks pertaining to the second posttest film. The purpose of these tasks was to determine the extent to which perceptual and memory factors (i.e., factors in the processes of detecting, encoding and retrieving cues rather than in the process of generating problem formulations per se) may have affected the subject's performance on the basic posttest task. Adequate detection, encoding and retrieval of cues is obviously a necessary prerequisite for generating problem formulations. A high level of performance on the basic posttest tasks would imply that these prerequisites were met. However, a low level of performance would be open to three interpretations: (1) failure in the process of generating problem formulations; (2) failure in the perceptual and memory processes that are prerequisites for generating problem formulations; (3) failure in both domains.

In order to more precisely assess between-group differences on the basic posttest task, two additional tasks were devised. Both tasks pertained to the second posttest film.
Although it would have been of interest to administer these tasks for the first posttest film as well, this possibility was rejected because of the risk that interpolation of the additional tasks might influence subjects' performance on the second basic posttest task.

The first additional task (Recognition of Cues) required the subject to indicate on a checklist those cues he recalled being presented in film 8. The checklist contained 64 randomly ordered items, 32 of which were cues presented in the film, and 32 of which were distractors. The cues included in the checklist were items which had been listed by at least one member of the physician sample as relevant to some problem formulation he generated while viewing the film. The distractors were devised by the experimenter in collaboration with a physician consultant, and were designed to represent three plausible but incorrect types of data:

1. consistent distractors: 16 items which were not presented in the film but which would be consistent with the cues that were presented.
2. contradictory distractors: 8 items which contradicted a cue that was presented in the film.
3. inconsistent distractors: 8 items which were not presented in the film and which would be inconsistent with the cues that were presented.

After completion of the above task, the subject carried out a second additional task (Additions to Response Sheets). For this task he was provided with a list of the 32 cues that had been presented in film 8. After reading the list, he was instructed to make any additions he believed appropriate to his problem formulation response sheets, including: (1) addition of cues to the problem formulations he had previously recorded; and (2) addition of new problem formulations which he thought of after reading the cue list.¹

¹These additions were made in ink so as to be readily distinguished from the subject's initial responses in pencil.
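The composition of the Recognition of Cues checklist (32 cues and 32 distractors, the latter split 16/8/8) can be sketched as follows. This is a hypothetical illustration only: the item contents are placeholders, the function names are assumptions, and the tally by item type is one plausible way of summarizing a subject's checks rather than the scoring procedure actually used in the study.

```python
import random

N_CUES = 32
# Number of distractors of each type, as stated in the text.
DISTRACTORS = {"consistent": 16, "contradictory": 8, "inconsistent": 8}


def build_checklist(seed=0):
    """Return a randomly ordered list of (kind, index) placeholder items."""
    items = [("cue", i) for i in range(N_CUES)]
    for kind, count in DISTRACTORS.items():
        items.extend((kind, i) for i in range(count))
    rng = random.Random(seed)  # fixed seed so the order is reproducible
    rng.shuffle(items)
    return items


def tally_checked(checklist, checked_positions):
    """Count the items a subject checked, broken down by item kind.

    Checked cues are correct recognitions; checked distractors are
    false recognitions of the corresponding type. (Assumed summary.)
    """
    counts = {"cue": 0}
    counts.update({kind: 0 for kind in DISTRACTORS})
    for position in checked_positions:
        kind, _ = checklist[position]
        counts[kind] += 1
    return counts
```

For example, `tally_checked(build_checklist(), range(64))` describes a subject who checked every item, and therefore reports 32 cues together with all 32 distractors.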
In sum, the first additional task was designed to determine whether failure to detect, encode and recall cues placed constraints on the subject's performance of the basic posttest task, while the second additional task was designed to ascertain whether the removal of potential perceptual-memory constraints would permit the subject to improve his problem formulation performance. Copies of the materials for the two additional posttest tasks are found in Appendix D.

Several comments are in order regarding the interpretation of performance on the additional tasks. First, subjects were allowed to take notes while viewing the films, and nearly all did so. Thus, it is primarily failures in the detection and/or encoding of cues, and only secondarily failures in the retrieval process, that are at issue here. Secondly, performance on the second additional task does not, in itself, provide a pure measure of what the subject's performance would have been in the absence of perceptual-memory constraints. A subject may be able to generate additional problem formulations on this task not because he failed to detect and/or encode relevant cues in viewing the film, but simply because the additional task provides a second exposure to the cues. Thus, in interpreting performance on the second additional task, recourse must be made to other sources of data (e.g., items checked on the recognition task) in order to infer that it was failure to detect and encode cues that inhibited a subject's initial problem formulation performance.

The Questionnaire.--(Treatment groups only) After the subject had carried out the two additional tasks described above, he was given a questionnaire to fill out. The questionnaire contained four sections. Section one included statements to which the subject responded on a five-point scale (strongly agree, agree, no opinion, disagree, strongly disagree).
These statements were designed to elicit the subject's opinion of, or attitude toward, various aspects of the training procedure, including the instructional booklet, the films of the interviews, and the feedback materials. The questionnaire also sought the subject's opinion concerning the appropriateness of the materials for second-year students, the degree to which the materials had been effective in improving his problem formulation skills, and several possible ways of integrating the materials into the current curriculum. The second section consisted of a checklist designed to determine the degree to which the subject pursued an interest in the training cases outside of the experimental sessions, either through discussion with students and/or faculty, or by looking up reference materials. The third section of the questionnaire was an open-ended request for comments and suggestions. The fourth section requested the subject to report any clinical experience he had had, prior to participating in the experiment, that involved contact with patients. He was asked to indicate both the type of experience (e.g., as an intern, physician's assistant, medic, nurse) and the extent of the experience (hours/week, and number of weeks). These latter data were collected for the purpose of sample description, and to ascertain the degree to which prior experience may have affected the experimental outcomes. A copy of the questionnaire is contained in Appendix E.

Subjects

A sample of 48 students in the third term of their second year of medical school participated in the experiment. The decision to sample second-year students was based on consideration of three criteria. First, it was believed that training in the generation of problem formulations utilizing filmed case presentations would be most appropriate prior to the student's participation in clinical clerkships (i.e., before summer term of his second year).
Second, in order for the training to be effective it was necessary that the student have acquired sufficient medical science background to be able to deal with the range of cases in internal medicine presented by the films. Third, it was necessary that the student had not already attained a level of skill in the experimental task that would preclude improvement in this skill via participation in the training experiment.

Since the experiment was to be conducted spring quarter, there were two potential populations which met the first sampling criterion: students enrolled in the third term of either their first or second year in the College of Human Medicine, Michigan State University. It was believed that the first-year student would not have mastered sufficient medical content to meet the second sampling criterion. Second-year students, on the other hand, clearly met this criterion. During the two and one-half terms of their Focal Problems course they had dealt with written case materials of relevance to each of the eight experimental films.¹ Thus, a primary concern became to determine whether the second-year student population would meet the third sampling criterion. The results of the pilot testing indicated that there did not appear to be a substantial risk of a ceiling effect due to the second-year students' prior problem-solving experience (see Table 6). Thus, second-year medical students were chosen as the target population for the experiment.

¹The following "focal problems" involved medical content of relevance to the experimental films: polyuria (films 2,3,8), headache (film 5), painful joints (film 6), diarrhea (film 1), shortness of breath (film 6), fever (films 3,5,7), anemia (films 1,4), chest pain (films 2,4), hematuria (film 3), cough (film 7), abdominal pain (films 1,3,8).

The list of students enrolled in the third term of their second year in the College of Human Medicine included 81 persons.
Four students who had begun their primary clerkship that term were eliminated from the list prior to sampling; thus, the target population included 77 persons.¹ Of these, five persons participated in the pilot testing, resulting in a population of 72 students from which the experimental sample was selected. Students were randomly selected from the list, and randomly assigned to one of the three experimental conditions. Each person selected was contacted by telephone to determine whether he could participate in the experiment on the dates scheduled for the experimental condition to which he had been assigned. In four instances, a subject was willing to participate but not available on the dates in question. He was therefore randomly assigned to one of the two other conditions.

¹It was found in reviewing the subjects' responses to section four of the questionnaire that one subject who had begun his primary clerkship was inadvertently included in the sample (Treatment I condition). Since this subject's scores on the dependent variables did not substantially differ from those of the rest of the sample, he was not excluded from the analysis.

In order to obtain a sample of 48 participants, a total of 68 students were sampled. The students sampled who were unable to participate included 17 refusals, and 3 students who could not be contacted by telephone. Thus, a participation rate of 70.6% was obtained. When contacted, each person was told that he would be paid approximately $5.00/hour for participation in the experiment. It is believed that without payment the refusal rate would have been considerably higher, and attrition would have occurred across the training sessions. Ten of the students who refused to participate in the experiment gave the reason of being "too busy" with their studies. The remaining seven students gave a variety of reasons, including prior commitments or lack of interest.
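The sample accounting reported above can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the function and variable names are assumptions introduced for illustration.

```python
def sample_accounting():
    """Recompute the population and participation figures stated in the text."""
    enrolled = 81            # students on the third-term, second-year list
    began_clerkship = 4      # eliminated from the list prior to sampling
    pilot_subjects = 5       # took part in the pilot testing

    target_population = enrolled - began_clerkship        # 77
    sampling_frame = target_population - pilot_subjects   # 72

    participants = 48
    refusals = 17
    not_contacted = 3
    total_sampled = participants + refusals + not_contacted  # 68

    participation_rate = 100.0 * participants / total_sampled
    per_condition = participants // 3  # three experimental conditions

    return {
        "target_population": target_population,
        "sampling_frame": sampling_frame,
        "total_sampled": total_sampled,
        "participation_rate": round(participation_rate, 1),
        "per_condition": per_condition,
    }
```

Under these counts the function reproduces the figures given in the text: a target population of 77, a sampling frame of 72, 68 students sampled, a participation rate of 70.6%, and 16 subjects per condition.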
In order to determine whether those who refused to participate (or who could not be contacted) differed systematically from the sample of participants, it was possible to examine the students' final exam scores for the Focal Problems course in which they were enrolled the term preceding the experiment.¹ These scores were selected for examination since they would provide the best available measure of the students' pre-experimental level of achievement in areas that could be expected to correlate substantially with the experimental posttest variables. Inspection of these data revealed that the four students with exam scores that were considerably lower than all other students (i.e., scores in the 40's, whereas the range of all other students' scores was 61-87) were among the refusals. Thus, it would appear that the weakest students in the population may have systematically eliminated themselves from participation in the experiment. However, with these four students excluded, the mean exam scores for the participant and refusal/no contact groups were virtually identical (72.06 and 72.67, respectively). A 90% confidence interval calculated on the difference between these means indicated: (1) that zero was contained well within the bounds of the interval, and (2) that the range of values spanned by the interval was very narrow (see Table 28, Appendix H). This finding lends support to the argument that, with the exception of the students in the bottom 6% of the class, the results of the experiment may be generalized to the target population of second-year medical students.

¹ The Focal Problems exam is described in more detail on p. 92.

The target population of real interest, however, extends beyond the 77 second-year medical students enrolled at Michigan State.
One would wish, utilizing the arguments of Cornfield and Tukey (1956), to generalize the results of the experiment to a hypothetical target population that includes all second-year medical students that are similar to those actually sampled. In so doing, it is of course necessary to take into consideration the characteristics of the sample at both the individual and the institutional level.

Table 5 presents selected characteristics of the individuals included in the sample. Since amount of prior clinical experience (e.g., as an extern, physician's assistant, nurse, medic) may affect the degree to which students are able to profit from the simulated clinical exercises, this variable is reported, along with sex, age and Medical College Admission Test (MCAT) scores.

At the institutional level it is necessary to consider the ways in which the first two years of the curriculum at Michigan State may differ from that found elsewhere. In addition to courses in the basic sciences which are probably quite similar to those at other institutions, the Michigan State students take, during their first two years, several courses which focus on the development of clinical skills:

1. The Doctor-Patient Relationship course (1 quarter, 3 hours per week), which emphasizes the development of interpersonal skills;

2. The Clinical Sciences course (3 quarters, 1 afternoon per week), which involves supervised contact with patients, including history taking and physical examination;

3. The Focal Problems course (3 quarters, 2-3 hours per week), which focuses on a problem-oriented approach to clinical medicine, including the development of problem-solving skills and use of the Problem-oriented Record.

Thus, all participants in the experiment had a substantial degree of prior coursework having a clinical orientation.
In their Focal Problems course, in particular, they had become familiar with the concepts of utilizing cues elicited during a clinical workup in order to generate a list of problem formulations as entries in a Problem-oriented Record. However, their experience in this course differed from that provided by the training experiment in several ways: (1) the course exercises were based on written case summaries; (2) the course exercises did not focus on the generation of initial problem formulations based on cues obtained during the earliest part of the workup; (3) feedback consisted of the course instructor's evaluation of their performance.

TABLE 5.--Selected Characteristics of the Student Sample.a

Sex                          Male: n = 42
                             Female: n = 6

Age                          Mean: 25.5
                             Standard deviation: 2.5

MCAT Scores                  Mean      Standard deviation
  Verbal Ability             523.1     96.5
  Quantitative Ability       552.6     80.8
  General Information        545.5     72.2
  Science                    537.9     84.3

Prior Clinical Experience    None: n = 8
(40 hour weeks)              1-12 wks.: n = 15
                             13-52 wks.: n = 3
                             52 or more wks.: n = 6

a MCAT scores were available for only 42 subjects. Data on prior clinical experience were obtained from the questionnaire administered to the treatment groups (n = 32).

Pilot Testing

The experimental procedure and materials were pilot tested with five subjects randomly selected from the population of second-year medical students who were to participate in the experiment. An initial version of the Treatment I materials for two films was administered to two subjects. Subsequent to this initial pilot test, all experimental materials were prepared, and a second pilot test was conducted with four subjects (including one person who participated in the initial pilot test, plus three additional subjects). Two subjects were administered the orientation materials and the materials for two films under the Treatment I condition, and two subjects were administered the posttest materials.

The pilot testing served two purposes.
First, on the basis of the subjects' comments and criticisms, certain revisions were made in the materials. Second, on the basis of the subjects' performance, it was ascertained that the training would be appropriate for a second-year medical student population. The subjects' Problem Formulation scores (reported in Table 6) indicated that they easily met the second sampling criterion (i.e., prior mastery of prerequisite medical content), and that on the posttest films in particular they adequately met the third sampling criterion (i.e., a level of skill in carrying out the experimental task that was low enough to permit detection of a training effect).

TABLE 6.--Results of the Pilot Test.

Subject    Film    PF Score(a)    Maximum Possible Score
1          2       38             60
           4       30             60
2          2       34             60
           4       36             60
3          7       14             54
           8       12             78
4          7       22             54
           8       31             78

a The PF (problem formulation) score is defined on pp. 96-98.

The Covariate

In a number of studies of medical problem solving (e.g., Elstein, et al., 1973; Gordon, 1973), a high degree of variability on the dependent measures has been found. Thus, it was considered quite important that measures on an appropriate covariable be obtained in order to increase the precision of the statistical analysis. Probably the best measure would have been a pretest in which the subject carried out the same task as on the posttest. However, this possibility was rejected for the following reasons. First, it was believed that in order to obtain reliable pretest measures on the dependent variables at least two filmed cases would have to be employed. This would have reduced the number of films available for training from six to four. It was believed that insufficient training would be even more likely than lack of precision to result in a nonsignificant treatment effect. Moreover, a nonsignificant outcome due to failure to carry out an adequate test of the treatment would be a more serious experimental failure than the occurrence of a Type II error due to lack of precision.
It was therefore decided to attempt to obtain covariate measures from some source other than a pretest.

The source decided upon was the final exam in the Focal Problems course in which all second-year medical students had been enrolled the term prior to the administration of the experiment. This measure was selected for several reasons. First, since it was a 100-item exam of multiple-choice and true-false questions, it could be expected to provide relatively reliable and objective data. Second, more than any other available measure, it could be expected to correlate with the dependent variables in the present study. Although the majority of items in the test were designed to measure the student's knowledge of medical science content that had been covered in the course, a substantial number of items attempted to assess his ability to use data regarding a patient to make a differential diagnosis, or to select diagnostic or therapeutic options. Thus, it was anticipated that use of this exam as a covariate would be effective in reducing within-group variability primarily on the dimension "pre-experimental knowledge of medical science content" and secondarily on the dimension "pre-experimental ability in solving medical problems."

Dependent Variables

A subject's performance on the basic posttest tasks was evaluated in terms of four dependent measures:

1. a problem formulation score (PF)

2. a cue utilization score (CUE)

3. a classification of cues with respect to problem formulations score (CUE-PF)

4. a relationships among problem formulations score (R-PF)

For each variable, the adequacy of the subject's performance was measured by means of a scoring key derived from the physician performance data. The subject's score on each variable was calculated separately for each of the two posttest tasks, but, for purposes of statistical analysis, his scores were summed across tasks, thus yielding four dependent measures per subject.
Each scoring key was designed to measure the degree to which the student's performance on a given variable approximated that of the experienced physicians. Each key contained a list of various potential responses, with points assigned to each. The number of points assigned to a response was weighted to reflect the relative frequency with which the response occurred in the experienced physician data. Certain additions to the keys, as well as validation of certain components in the keys, were carried out by means of independent consultations with two additional physicians who were not part of the sample of eight.

Three of the dependent variables (PF, CUE, CUE-PF) were based on the information the subject recorded on his problem formulation response sheets. Each of these variables pertained to one component of the interrelated cognitive outcomes which resulted from the subject's simulated encounter with a patient. As depicted in Figure 2, the CUE score pertained to the functional data base (i.e., set of cues) which the subject extracted from the film and utilized to generate problem formulations; the PF score pertained to the set of problem formulations he generated; the CUE-PF score pertained to the way in which he classified the cues he obtained with respect to the problem formulations he generated.

The fourth dependent variable (R-PF) was based on information the subject recorded in his tentative assessment, and pertained to functional relationships which he hypothesized to exist between problem formulations he had generated.

The remainder of this section will be devoted to a discussion of the properties of each score and of the general principles underlying the construction of the scoring keys. For more detail the reader is referred to the copies of each key and of the scoring instructions, found in Appendix F.
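The general scoring principle described above (a key of potential responses, with points weighted by how often the physicians gave each response, and scores summed across the two posttest tasks) can be sketched as follows. This is a minimal illustration, not the scoring procedure of Appendix F: the point tiers are those given for the CUE score later in this section, whereas the cue labels, the physician counts, and the function names are hypothetical.

```python
def cue_points(n_physicians: int) -> int:
    """Point tiers from the CUE scoring table: 5-6 -> 3, 2-4 -> 2, 1 -> 1.

    Counts above six are treated like 5-6 (an assumption, since one film
    was viewed by seven physicians).
    """
    if n_physicians >= 5:
        return 3
    if n_physicians >= 2:
        return 2
    if n_physicians == 1:
        return 1
    return 0


def build_key(physician_counts: dict) -> dict:
    """Turn 'how many physicians used each cue' into a points key."""
    return {cue: cue_points(n) for cue, n in physician_counts.items() if n > 0}


def score(responses: list, key: dict) -> int:
    """Sum the points for each distinct keyed response the subject listed."""
    return sum(key.get(response, 0) for response in set(responses))


# Hypothetical counts for two posttest tasks (not data from the study).
key_task_a = build_key({"cough": 6, "elevated temperature": 3, "weight loss": 1})
key_task_b = build_key({"polyuria": 5, "abdominal pain": 2})

score_a = score(["cough", "weight loss", "hair color"], key_task_a)  # 3 + 1 + 0
score_b = score(["polyuria", "polyuria"], key_task_b)                # counted once

# Scores are summed across the two posttest tasks for analysis.
total = score_a + score_b
print(score_a, score_b, total)
```

Responses that do not appear in the key simply earn no points in this sketch; the penalties used for the CUE-PF score are a separate mechanism described later.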
Figure 2. Relationships between Cognitive Outcomes and the Dependent Variable Scores CUE, PF and CUE-PF. [Diagram: the functional data base (i.e., cues utilized) is measured by the CUE score; the problem formulations generated are measured by the PF score; the classification of cues with respect to problem formulations is measured by the CUE-PF score.]

CUE score.--This score was designed to measure the adequacy of the functional data base which the subject extracted from the film and utilized in the generation of problem formulations. It was based on the cues which the subject listed on his response sheets, irrespective of the problem formulation title(s) under which he listed them. The key consisted of: (1) a list of all cues that were utilized (i.e., listed under any problem formulation title) by the physicians, and (2) for each cue, the number of points to be obtained if the subject utilized the cue. Points were allotted to cues as follows:

No. of physicians utilizing the cue    Points
n = 5-6                                3
n = 2-4                                2
n = 1                                  1

A subject's CUE score consisted of the sum of points obtained for each cue he utilized. The CUE scoring key for film 7 included 20 items, and yielded a maximum possible score of 42. The CUE key for film 8 included 32 items and yielded a maximum possible score of 61. Thus, the range of scores on the CUE variable (summed over both posttest tasks) was 0-103.

PF score.--This score was designed to measure the appropriateness and thoroughness of the subject's set of problem formulations. The scoring key included a list of all problem formulation titles generated by the physician sample, and, in addition, a list of "other" titles (i.e., student responses not occurring in the physician data) that were judged to be acceptable by both of the physician consultants. Points were allotted to each title as follows:

No. of physicians who listed the title                             Points
n = 5-6                                                            6
n = 2-4                                                            3
n = 1 (or, judged acceptable by the two physician consultants)     1

A subject's PF score consisted of the sum of points obtained for each problem formulation title he recorded.¹ The PF key for film 7 included 13 titles (plus a list of 12 "Other Acceptable Responses"), and yielded a maximum possible score of 54 points. The PF key for film 8 included 20 titles (plus a list of 18 "Other Acceptable Responses"), and yielded a maximum possible score of 78 points. Thus, the range of scores on the PF variable (summed over posttest tasks) was 0-132.

¹ The term "problem formulation title" refers to both the title listed on the front of each problem formulation response sheet and the titles of more specific diagnostic possibilities (if any) listed on the back of each response sheet.

The performance dimension of thoroughness was taken into consideration both in the construction of the key and in the definition of the PF score as a cumulative summation. The scoring key was constructed to incorporate as wide a range of potential formulations as possible: all of the physicians' responses were included, and "other acceptable responses" (as judged by the physician consultants) were added so that restrictions due to sampling error in the original physician data would be minimized. By cumulatively summing points across the problem formulation titles a subject generated, the thoroughness of his performance (i.e., the number of problem formulations generated) would be reflected in the resultant score.

A second performance dimension, appropriateness, was incorporated in the key in several ways. First, the points assigned to each title in the key were weighted to reflect their relative frequency of occurrence in the physician sample. Second, a set of codes was included in the key indicating the permissible ways in which hierarchically related titles could be recorded.
If a subject recorded a title in a way that was not permissible (e.g., he recorded "pneumonia" as a diagnostic possibility on the back of a problem formulation sheet with the title "cancer"), he received no points for this title. Third, in order to control for inflation of the PF score due to a tendency on the part of a subject to "catalogue" every diagnostic possibility he could think of, two features were included in the key: (1) a title was not scored if there were no cues listed with it; (2) the number of points that could be obtained for "other acceptable responses" was restricted to a maximum of six (i.e., six one-point responses).

CUE-PF score.--The CUE-PF score was designed to measure the subject's skill in classifying cues with respect to the problem formulation categories of major importance for the case. In constructing the key for the CUE-PF score, it was necessary to restrict the scope of the key to those problem formulations for which there was sufficient physician data available in order to specify, in a reliable manner, the points to be allotted for the classification of a given cue under a given problem formulation title. Thus, it was decided to include in the key, as scoring categories, those problem formulations (or groups of similar problem formulations) generated by all six physicians. In some instances, a scoring category consisted of a single problem formulation (e.g., in film 8, diabetes mellitus). In other instances, several formulations generated by the physicians were grouped to form a single category in the key. A category of this latter type was constructed on the basis of three criteria: (1) that the formulations pertain to the same diagnostic "subspace" (e.g., organ system and/or disease mechanism); (2) that the cues listed by the physicians under each formulation be highly similar; (3) that all of the physicians generated at least one formulation belonging to the category.
For example, in the film 7 key, several problem formulations (e.g., chronic obstructive lung disease, chronic bronchitis) were grouped to form the scoring category "chronic respiratory problem."

The CUE-PF scoring key consisted of a grid, with cues as one dimension and problem formulation categories as the other dimension. The entry in each cell of the grid was the number of positive or negative points which the subject would obtain for listing the pth cue under a problem formulation in the qth category. The rationale for the negative points was that if only positive points were awarded, a subject could easily attain a perfect score simply by listing every cue under each of his problem formulation titles.

Determination of the sign of the points assigned to each cue X category cell was based on two sources of data: (1) the responses of the physician sample, and (2) independent ratings of the cues (by categories) carried out by the two physician consultants. The latter data were collected in order to reduce the effect of sampling error (in the original physician sample) on the classification of cues as relevant or irrelevant to a problem formulation category. The primary concern was that negative points be assigned to a cell only if the cue was clearly irrelevant to a problem formulation category, and not simply because the members of the physician sample had omitted some potentially relevant cue(s) in recording their problem formulations.
The following criteria were used to determine the entries in each cue X category cell of the scoring grid:

Cell entry        Criteria
+ (CUE points)    cue listed (or rated) as relevant to titles in the category by at least two (sample and/or consulting) physicians
- (CUE points)    cue not listed (or rated) as relevant to any titles in the category by any of the (sample or consulting) physicians
0 points          cue listed (or rated) as relevant to titles in the category by only one (sample or consulting) physician

NOTE: CUE points = the number of points allotted to the cue in the CUE scoring key.

To summarize: the scoring key was designed: (1) to reward the subject for listing relevant cues under a problem formulation title; (2) to penalize him for listing clearly irrelevant cues; and (3) to neither reward nor penalize him for listing cues whose relevance was indeterminate. In addition, rules were incorporated into the key to penalize the subject for various errors (e.g., listing "weight gain" if the patient said he had lost weight; listing a relevant cue but failing to indicate that it was a relevant disconfirmatory cue, as was required by the instructions).

The subject's CUE-PF score consisted of the sum of points he obtained for each cue he listed under any title included in any of the problem formulation scoring categories. The scoring key for film 7 consisted of a 20 cues X 4 categories grid. Summation of points across the grid yielded a score range of -22 to +68. The scoring key for film 8 consisted of a 33 cues X 9 categories grid, and yielded a score range of -122 to +140. Thus, the range of possible scores on the CUE-PF variable (summed over posttest tasks) was -144 to +208. It should be noted, however, that a score in the negative part of the range would be most unlikely to occur.

The CUE-PF score differs from the PF and CUE scores in several ways.
First, each of the latter two scores measures a single aspect of the subject's performance: his problem formulation titles (irrespective of the cues listed under them) or his cue utilization (irrespective of the titles under which they are listed). The CUE-PF score, on the other hand, measures the way in which the subject classified cues with respect to problem formulation titles. The rationale for this score is the hypothesis that the capacity to categorize cues under multiple problem formulation titles is not a simple linear function of the skills measured by the PF and CUE scores. Second, while the PF and CUE scores are designed to measure both the thoroughness and appropriateness of the subject's performance in each area, the CUE-PF score focuses primarily on the dimension of appropriateness. Unlike the other keys, which each include a fairly exhaustive list of potential responses, the CUE-PF key permits scoring only of those responses that fall within a set of problem formulation categories which were found to characterize the performance of all of the experienced physicians. Thus, if a subject generated a problem formulation that was not included in the scoring categories, the cues associated with this title were not scored.

R-PF score.--This score was designed to measure the degree to which the subject had included, in his tentative assessment, appropriate statements of probable functional relationships between problem formulations. The key for this score included a list of the statements made by the physician sample, as well as a list of "Other Acceptable Responses" devised in collaboration with the two physician consultants. Points were allotted to statements as follows:

No. of physicians who made the statement                          Points
n = 5-6                                                           6
n = 2-4                                                           3
n = 1 (or judged acceptable by the two physician consultants)     1

The subject's R-PF score consisted of the sum of points obtained for each statement of functional relationships he made, with the number of points obtainable for "Other Acceptable Responses" restricted to a maximum of three (i.e., three one-point statements). The key for film 7 included two statements (plus a list of 9 "Other Acceptable Responses"); the key for film 8 included three statements (plus a list of 4 "Other Acceptable Responses"). Each key yielded a maximum possible score of 12 points; thus, the range of the R-PF variable was 0-24.

Reliability and Validity

Two aspects of reliability were of concern in the present study: (1) the stability of the posttest scores obtained by independent scorings of the subjects' responses (i.e., inter-scorer reliability); and (2) the generalizability of the posttest scores beyond the sample of two cases included in the posttest to the domain of potential cases represented by office practice in internal medicine.

Estimation of inter-scorer reliability was based on the sets of scores obtained by having the experimenter and a second professional in medical education independently score the responses of a sample of six subjects. Three subjects were randomly selected from each experimental condition. The response booklets of three subjects (one from each condition) were used to train the second scorer in the use of the scoring keys and instructions. The responses of the remaining six subjects were then scored by the experimenter and by the second scorer. The second scorer was not aware of the experimental condition to which each subject belonged. Inter-scorer reliability on the variables CUE, PF, CUE-PF and R-PF was computed by means of the Ebel (1951) intraclass correlation formula.
A high degree of restriction in range on the R-PF variable resulted in what was judged to be a spuriously low estimate of inter-scorer agreement by means of the intraclass correlation formula. Thus, for this variable an index proposed by Holsti (1969) for use in content analysis was also calculated. The reliability estimates obtained on each variable are presented in Chapter V.

Let us now consider the question of generalizability. On the whole, the issues of reliability and validity have received insufficient attention from researchers in the field of problem solving. In most studies some consideration is given to devising a problem-solving task which meets certain criteria of "face validity," i.e., a task which presents the subject with a nontrivial problematic situation, and which requires some degree of productive thinking in order to arrive at a solution. But to a large extent the more substantive psychometric issues pertaining to reliability and validity have been ignored: e.g., how reliable are problem-solving scores, given the very small number of tasks typically employed; how valid are inferences about general parameters of human problem solving, given the restricted range of tasks typically employed?

An answer to these questions is not easily forthcoming. Problem-solving research poses some particularly difficult psychometric dilemmas. To begin with, it is difficult to define the unit of behavior that is equivalent to the classical test item. It is generally impossible to decompose a problem solution into a set of independently scorable events that would be comparable across subjects. The entire problem may be considered as one "item," but this poses other difficulties. In order to meet even the most informal criteria of face validity the investigator must devise tasks of considerable complexity. This, in turn, severely limits the number of tasks that can be included in any single evaluation instrument.
At the theoretical level, probably the best approach to determining the psychometric properties of a problem-solving test is offered by Cronbach's theory of generalizability (Cronbach, et al., 1963). Under Cronbach's theory, the classical considerations of reliability and validity coalesce into the consideration of generalizability. The question of central interest in generalizability analysis is to determine the dimensions of the domain to which one can generalize based on a set of empirical observations.

The posttest devised for this experiment was similar in a number of respects to the instruments that have been used in most studies of problem solving, i.e., the subject was given a reasonably complex, problematic task to carry out, but the number of "items" in the posttest was very small. Nevertheless, an attempt was made to give all consideration possible, within the limits of the design of the study, to the issue of generalizability. First of all, in selecting the posttest films consideration was given to choosing two cases that would constitute a reasonably representative sample of the types of problems encountered in the domain of internal medicine. Second, in order to estimate the degree to which it would be possible to generalize from the subjects' scores on the two tasks to the domain of interest, the Cronbach (1951) coefficient alpha formula was used, but modified so as to eliminate between-group differences induced by the experimental manipulation. The formula used, as well as the resulting estimates for each dependent variable, are presented in Chapter V.

Additional Dependent Measures

In addition to the four major dependent variables defined above, a variety of other measures were calculated. These measures included:

1. The PF scores of the experimental subjects on the six training tasks;

2. The number of items of each type checked on the "Recognition of Cues" task, for subjects under all three conditions;

3. PF, CUE and CUE-PF scores based on the subject's total responses after carrying out the "Additions to Response Sheets" task, for subjects under all three conditions;

4. The experimental subjects' responses to the questionnaire, by item, and summarized in terms of three scores: evaluation of the films (EV FILM), evaluation of the feedback materials (EV FB), and evaluation of the general effectiveness of the training program (EV GEN);

5. Several measures (to be defined) pertaining to the structure of the subject's set of problem formulations on each posttest task.

The purpose of these measures was to aid interpretation of the experimental outcomes of primary interest: namely, the hypotheses regarding between-group differences as assessed by the PF, CUE, CUE-PF, and R-PF scores on the basic posttest tasks. Thus, the above measures should be regarded merely as supplementary sources of data, of interest primarily as they contribute to an understanding of the experimental outcomes on the four major dependent variables.

Hypotheses

The two experimental hypotheses, presented in general terms in Chapter I, will now be operationally defined as follows:

Hypothesis 1: The average performance of second-year medical students who have received problem formulation training (Treatment I and Treatment II) will be superior to that of students who have not received training (control group), as measured by four dependent variables: (1) CUE score, (2) PF score, (3) CUE-PF score, and (4) R-PF score.

Hypothesis 2: The average performance of second-year medical students who have received problem formulation training involving outcome and process feedback (Treatment II) will be superior to that of students who have received problem formulation training involving only outcome feedback (Treatment I), as measured by the four dependent variables: (1) CUE score, (2) PF score, (3) CUE-PF score, and (4) R-PF score.
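One way to read the two hypotheses is as a pair of planned comparisons among the three group means on a given dependent variable: the two treatment groups pooled against the control group, and Treatment II against Treatment I. The sketch below illustrates that reading only. The contrast coefficients are an assumption (the text does not state them), the group means are invented placeholders rather than results of the study, and the actual tests were carried out by analysis of covariance as reported in Chapter V.

```python
GROUPS = ("treatment_1", "treatment_2", "control")

# Assumed contrast coefficients; not stated in the source.
# Hypothesis 1: average of the two trained groups minus the control group.
H1 = {"treatment_1": 0.5, "treatment_2": 0.5, "control": -1.0}
# Hypothesis 2: Treatment II (outcome + process) minus Treatment I (outcome only).
H2 = {"treatment_1": -1.0, "treatment_2": 1.0, "control": 0.0}


def contrast(means: dict, coeffs: dict) -> float:
    """Weighted sum of group means; a positive value favors the hypothesis."""
    return sum(coeffs[g] * means[g] for g in GROUPS)


def orthogonal(a: dict, b: dict) -> bool:
    """With equal group sizes, contrasts are orthogonal if the products sum to zero."""
    return abs(sum(a[g] * b[g] for g in GROUPS)) < 1e-12


# Invented placeholder means for one dependent variable (NOT study results).
hypothetical_means = {"treatment_1": 50.0, "treatment_2": 56.0, "control": 44.0}

h1_estimate = contrast(hypothetical_means, H1)
h2_estimate = contrast(hypothetical_means, H2)
print(h1_estimate, h2_estimate, orthogonal(H1, H2))
```

Because each condition has the same number of subjects, the two comparisons in this sketch are orthogonal, so they address separate questions about the group means.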
Analysis

The two experimental hypotheses were tested by means of multivariate and univariate analyses of covariance. In addition, a number of supplemental analyses were conducted in order to address questions that have been raised at various points in these chapters, or that were suggested as a result of the outcomes of the hypothesis tests. Chapter V reports the method and results of each analysis undertaken.

CHAPTER IV

ANALYSIS OF THE PHYSICIAN DATA

This chapter presents an analysis and discussion of the data that were collected from the sample of eight experienced physicians. The chapter consists of three sections, each dealing with one of the research questions listed in Chapter I:

1. How early in the clinical workup does the physician begin to generate problem formulations?

2. What is the structure of a set of initial problem formulations?

3. What cognitive processes are involved in the generation of initial problem formulations?

The primary reason for collecting the physician data was to obtain a basis for the development of the materials to be used in the training experiment. However, an analysis of these data is of interest in itself insofar as it may contribute to the elaboration of a theory of medical problem solving. Given the small size of the physician sample, the findings reported in this chapter must be regarded as highly tentative. But, because the procedure used in this study permits a more in-depth appraisal of initial problem formulation outcomes and processes than has been forthcoming from previous investigations, the findings may be of value in suggesting hypotheses and questions to be explored in future research.

For each of the eight filmed cases, data were obtained from six (or in one case seven) of the eight physicians. Thus, each analysis reported in this chapter is based on a total of 49 responses.
Because of the limited size of the sample, only descriptive statistical analyses were conducted.

Generation of the First Problem Formulation

It will be recalled from Chapter III (p. 55) that the physician was asked to report what thoughts had come to mind: (a) after the initial 30-second view of the patient, and (b) after having read the nurse's sheet. When problem formulations were not reported at either of these points, the physician was asked, at the end of the film, to attempt to recall the point in the interview at which he first generated a problem formulation. On the basis of a frequency distribution of these data, it was ascertained that:

1. In 17 instances out of 49 (34.7%) the physician generated his first problem formulation after the initial 30-second view of the patient. The formulations generated at this point were of two types: (1) a formulation of "psychological problem" based on the patient's general appearance and manner, and (2) formulations pertaining to organic disorders, e.g., "respiratory problem," based on specific nonverbal cues, e.g., "the patient's cough."

2. In 4 instances out of 49 (8.2%) the physician generated his first problem formulation after reading the nurse's sheet. In all four instances a formulation of "infection" was generated on the basis of the cue of elevated temperature.

3. In 26 instances out of 49 (53.1%) the physician generated his first problem formulation on the basis of the patient's presenting complaint. These formulations varied across cases depending on the nature of the complaint, but there was a high degree of consistency within each case with respect to the type of formulation generated.

4. In 2 instances out of 49 (4.1%) the physician generated his first problem formulation after the patient had presented several complaints. Both of these instances occurred on film 1, in which the patient's presenting complaint (fatigue and weakness) was highly general. It was not until the patient mentioned that he also had abdominal pain that, in these two instances, a problem formulation was generated.

Two factors must be borne in mind in attempting to generalize on the basis of these data regarding the earliness with which problem formulations are generated in actual clinical practice. First, the demand characteristics of the experimental task, as well as the fact that the subject did not have to devote part of his attention to the task of data elicitation (i.e., he obtained data by viewing a film rather than by actively engaging in an interview), may have led the physicians to generate problem formulations somewhat earlier than they would do in actual practice. Second, except in the instances where problem formulations were reported after the initial view of the patient or after the presentation of the nurse's sheet, some degree of retrospective distortion may have affected the physician's report of the point at which he first generated a problem formulation. The findings of this study probably overestimate the earliness with which problem formulation typically occurs in clinical practice. Nevertheless, the fact that in 47 instances out of 49 (95.9%) a first problem formulation was generated no later than one minute into the interview provides evidence that physicians are able to generate problem formulations very early, on the basis of very minimal data, and in actual practice most probably do generate problem formulations relatively early, at least within the first five minutes of the workup.

The Structure of a Set of Initial Problem Formulations

In Chapter II (p. 32) it was proposed that a set of initial problem formulations defines the dimensions of the functional problem space within which the physician's search for a diagnosis is conducted. The purpose of this section is to describe the way in which a set of initial problem formulations is structured.
Two topics will be dealt with: (1) the features characteristic of a set of problem formulations; (2) the size and organization of a set of problem formulations.

Structural Features

An examination of the physician data indicated that what results from the physician's information-processing activity during the early part of the workup is not a unidimensional list of problem formulations. Rather, it is a structured set of formulations which may be described in terms of four features: (1) hierarchical organization; (2) competing formulations; (3) multiple subspaces; and (4) functional relationships.

1. Hierarchical organization. A set of problem formulations may include formulations that are organized into a general-to-specific hierarchy that pertains to a single diagnostic category (e.g., an organ system, a disease mechanism). For example, a physician may generate a problem formulation, such as "GI disorder," and, as subcategories under this formulation, one or several more specific formulations, such as "inflammatory bowel disease" and/or "intestinal malignancy."

2. Competing formulations. A set of problem formulations may include formulations that provide alternative explanations for some group of symptoms. For example, a physician may generate "inflammatory bowel disease" and "intestinal malignancy" as competing problem formulations.

3. Multiple subspaces. A set of problem formulations may include subsets of formulations that pertain to different types of diagnostic categories, e.g., different organ systems and/or different disease mechanisms. Each such category may be considered to designate a "subspace" within the functional problem space in which the physician is operating. For example, a physician may generate a set of formulations that consists of four subspaces: (1) "GI disorder," (2) "diabetes mellitus," (3) "anemia," and (4) "cardiovascular problem."

4. Functional relationships. A set of initial problem formulations may include functional relationships which the physician hypothesizes to exist between certain problem formulations. For example, a physician may consider "anemia" to be secondary to "GI disorder."

In order to illustrate the way in which the four features characterize the structure of a set of initial problem formulations, a structural diagram of the composite set of problem formulations generated by the physician sample for one film has been prepared (Figure 3). Although the diagram pictures a more extensive set of problem formulations than would normally be generated by any single individual, it may serve as a useful illustration for the following commentary on the four features.

[Figure 3. Structural diagram of the composite set of problem formulations generated by the physician sample for film 1. Key: a. Hierarchical organization: indicated by vertical lines joining boxes. b. Competing formulations: indicated by horizontal lines joining boxes. c. Subspaces: (1) GI disorder, (2) diabetes mellitus, (3) anemia, (4) cardiovascular problem, (5) other. d. Functional relationships: indicated by dashed lines joining boxes.]

The number of subspaces indicates the scope (or range) of diagnostic categories included in the problem space. It is the superordinate horizontal dimension of a set of formulations. Subspaces may be competitors (e.g., "GI disorder" versus "diabetes mellitus"); they may be compatible but unrelated (e.g., "diabetes" and "anemia"); they may be functionally related (e.g., "anemia" secondary to "GI disorder").
Some subspaces may consist of a hierarchy of formulations (e.g., the "GI disorder" hierarchy) while others may consist of a single formulation that is highly general (e.g., "cardiovascular problem") or very specific (e.g., "diabetes mellitus").

Hierarchical organization of formulations indicates the degree to which the problem space is elaborated on a vertical dimension.

Competing formulations may exist within subspaces (e.g., "inflammatory bowel disease" versus "intestinal malignancy"), or between subspaces (e.g., "anemia" versus "cardiovascular problem").

Functional relationships may be hypothesized at the level of subspaces (e.g., "anemia" secondary to "GI disorder"), but can also be hypothesized at the lower levels of subspace hierarchies (e.g., a physician could hypothesize "blood loss anemia" secondary to "ulcerative colitis").

The set of problem formulations generated by each physician for each film was coded in terms of the features it exhibited (see Appendix G). The results of this analysis are summarized in Table 7, by film (percent of subjects whose sets of formulations exhibited each feature) and by subject (percent of films for which a subject's set of formulations exhibited each feature).

TABLE 7.--Features Characteristic of Individual Sets of Problem Formulations, by Film and by Subject.

                                   Feature
           Hierarchical    Competing       Multiple     Functional
           Organization    Formulations    Subspaces    Relationships

Film       % of subjects whose sets of formulations exhibited each feature
  1             67             100             83            67
  2            100             100            100           100
  3            100             100             33             0
  4             67             100            100           100
  5             83             100             67            33
  6             57             100            100             0
  7             33             100            100           100
  8             83             100            100            50

Subject    % of films for which a subject's set of formulations exhibited each feature
  A             38             100             75            38
  B             57             100            100            71
  C             75             100            100           100
  D             75             100             75            50
  E             86             100             71            43
  F            100             100             71            43
  G             75             100            100            63
  H            100             100            100            50

The following discussion of each feature will consider: (1) the consistency of its occurrence across films and across subjects, and (2) the types of factors which may influence its occurrence.

Of the four features, competing formulations is the only one which is consistently characteristic of all physicians' sets of formulations for all eight films. Thus, the present data suggest that competing formulations is an essential feature of any experienced physician's set of initial problem formulations. It is not surprising that this feature stands out as the most salient characteristic of a set of initial problem formulations. As noted in Chapter II, the entertaining of multiple competing hypotheses is a primary means by which the scientific thinker seeks to avoid the pitfall of becoming prematurely wedded to a favored, but possibly incorrect, hypothesis.

The feature hierarchical organization is present a high proportion of the time for most films and for most subjects. But it also shows a good deal of variability across subjects and across films. Thus, in contrast to competing formulations, the occurrence of this feature appears to be influenced by both task environment and individual difference variables.

Comments in several of the physicians' recall protocols indicated that, having generated a hierarchy of problem formulations, they would evaluate data subsequently collected with respect to formulations at each level in the hierarchy.
It may be hypothesized that a hierarchy of formulations serves a dual purpose. On the one hand, the early generation of specific formulations would help to guarantee that those cues of particular relevance to the establishment of a differential diagnosis are elicited and interpreted. On the other hand, by continuing to entertain a more general problem formulation category (that subsumes the specific formulations), the physician is more likely to avoid the "blind alley" pitfall that could result if data collection and interpretation were narrowly focused on specific diagnostic hypotheses and these hypotheses were disconfirmed.

A further rationale for the feature of hierarchical organization is suggested by the research literature on the role of organization in memory (e.g., Mandler, 1967; Collins and Quillian, 1969). A hierarchy of problem formulations permits more parsimonious storage of cues. Consider this example.

Example:

-A physician obtains 8 cues of relevance to X0 (where X0 is a relatively general problem formulation, such as "GI disorder"). Of the 8 cues, three cues (#2, #3, #7) are of particular relevance to X1, two cues (#2, #6) are of particular relevance to X2, and two cues (#2, #8) are of particular relevance to X3 (where X1, X2, and X3 are more specific problem formulations, such as "ulcerative colitis," "intestinal malignancy," and "psychogenic diarrhea").

-If the physician generates a two-level hierarchy of problem formulations, the cues may be stored as follows:

     X0: 1, 2, 3, 4, 5, 6, 7, 8
         X1: 2, 3, 7
         X2: 2, 6
         X3: 2, 8

-If, on the other hand, the physician generates a single-level list of specific formulations, the cues must be stored as follows:

     X1: 1, 2, 3, 4, 5, 6, 7, 8
     X2: 1, 2, 3, 4, 5, 6, 7, 8
     X3: 1, 2, 3, 4, 5, 6, 7, 8

As the above example illustrates, hierarchical organization increases the number of categories to be stored (from 3 to 4), but greatly reduces the amount of information to be stored within categories (from a total of 24 units to a total of 15 units).
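The arithmetic behind the example can be checked with a short sketch. This is only an illustration: the cue numbers and formulation labels come from the example itself, while the dictionary layout and the function names are assumptions introduced here.

```python
# Cue numbers follow the example; the dictionary layout is an illustrative assumption.
GENERAL = {"X0": {1, 2, 3, 4, 5, 6, 7, 8}}
SPECIFIC = {"X1": {2, 3, 7}, "X2": {2, 6}, "X3": {2, 8}}


def hierarchical_storage(general, specific):
    """Two-level hierarchy: the general formulation holds every cue once,
    and each specific formulation holds only its particularly relevant cues."""
    categories = len(general) + len(specific)
    units = sum(len(cues) for cues in general.values())
    units += sum(len(cues) for cues in specific.values())
    return categories, units


def flat_storage(general, specific):
    """Single-level list: with no general category to hold the shared cues,
    every specific formulation must carry the full set of cues."""
    all_cues = set().union(*general.values())
    categories = len(specific)
    units = categories * len(all_cues)
    return categories, units


print(hierarchical_storage(GENERAL, SPECIFIC))  # (4, 15)
print(flat_storage(GENERAL, SPECIFIC))          # (3, 24)
```

The hierarchy adds one category but stores the eight shared cues only once, which is where the reduction from 24 to 15 units comes from.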
The feature multiple subspaces is present a high proportion of the time. With respect to subjects, this feature shows a relatively restricted range of variability (i.e., it is present for at least 70% of the films viewed by each subject). Thus, individual difference variables appear to have less of an effect on the occurrence of this feature than on the occurrence of hierarchical organization. With respect to films, the range of variability is fairly broad, but the distribution is highly skewed (i.e., for five films all subjects generated formulations in more than one subspace). This would appear to suggest that for many medical cases (e.g., films 2, 4, 6, 7, 8) task environment variables are of primary importance (and thus elicit consistent occurrence of this feature across all subjects), while for other medical cases (e.g., films 1, 3, 5) task environment variables are less powerful and individual difference variables may play more of a role.

There are two task-related factors which probably contribute to the high frequency with which this feature typically occurs. One is the fact that a patient may often have more than one medical disorder. Thus, resolution of such cases would require that multiple disease mechanisms and/or disorders in multiple organ systems be considered. A second factor is the ambiguity of the cues obtained during the early part of the workup: many cues are inherently nonspecific (e.g., weakness and fatigue), and even relatively specific cues (e.g., substernal pain) may be compatible with multiple disease mechanisms and/or multiple organ system involvements.

The feature functional relationships is more likely to be absent from a set of initial problem formulations than any of the other three features. There is a considerable degree of variability across subjects with respect to this feature, with some subjects showing a much greater tendency to hypothesize functional relationships than others.
With respect to films, the data appear to suggest that for many cases (e.g., films 2, 3, 4, 6, 7) task environment variables are powerful enough to elicit consistent outcomes across all physicians (i.e., either all subjects hypothesized functional relationships or none did), while for other cases (e.g., films 1, 5, 8) the occurrence of this feature is likely to vary and may be largely a function of individual difference factors. One task environment variable which appears to influence the occurrence of this feature is the age of the patient: several physicians noted that when the patient is older than forty (e.g., films 2, 4, 7) he is more likely to have multiple disorders that are functionally related.

Size and Organization

The size of a set of initial problem formulations may be measured in terms of two variables: (1) the number of problem formulations it contains, and (2) the number of subspaces it contains (see Appendix G). Table 8 presents the mean and range on these variables, by film (across subjects) and by subject (across films). For films, the average number of problem formulations ranged from 3.5 to 8.8, and the average number of subspaces from 1.3 to 5.0. For subjects, the average number of problem formulations ranged from 3.6 to 7.8, and the average number of subspaces from 2.6 to 4.3.

TABLE 8.--Number of Problem Formulations, and Number of Subspaces: Average and Range by Film, and by Subject.

                Number of
           Problem Formulations     Number of Subspaces
            Average     Range        Average     Range

Film
  1           4.8        3-7           3.2        1-5
  2           8.2        5-14          5.0        3-6
  3           3.5        3-6           1.3        1-3
  4           5.0        3-7           3.7        3-6
  5           4.3        3-7           2.0        1-3
  6           6.9        4-12          3.4        2-3
  7           4.7        3-7           3.8        3-5
  8           8.8        4-11          4.3        3-5

Subject
  A           3.6        3-5           2.6        1-4
  B           6.0        4-11          4.3        2-6
  C           6.5        3-9           3.8        3-5
  D           5.3        3-7           2.8        1-4
  E           5.3        3-7           2.6        1-5
  F           5.7        3-11          2.9        1-5
  G           7.8        3-14          4.3        2-6
  H           7.0        3-10          3.5        1-6

The two measures of the size of a set of problem formulations are highly correlated: a product-moment coefficient of .70 with film as the unit, and a coefficient of .78 with subject as the unit. These correlations indicate that the two measures have a very sizable proportion of variance in common. Thus, the question arises as to the rationale for considering that the two measures pertain to distinct psychological entities. The rationale for this distinction derives from an evaluation of the data in Table 8 in terms of the research literature on the role of organization in memory.

Research by Mandler (1967) has indicated that a subject typically organizes and stores items in terms of (5±2) categories. Moreover, there is some evidence (Wortman, 1972; Wortman and Kleinmuntz, undated) that Mandler's (5±2) parameter applies to the information-processing behavior of physicians. An examination of Table 8 reveals that the number of initial problem formulations a physician generates may in some instances considerably exceed the storage capacity of working memory (e.g., films 2, 6, 8; subjects B, F, G, H). However, the number of subspaces generated for a given case, or by a given subject, never exceeds six. Thus, it would appear that however many problem formulations a physician generates, the maximum number of subspaces into which these formulations are grouped is consistent with the upper bound of the parameter that has been found to govern the storage of information in working memory. This finding would seem to attest to the psychological reality of the subspace as the superordinate unit in a set of problem formulations.
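The comparison of Table 8 with the (5±2) parameter can be sketched as follows. The maxima are the upper ends of the ranges as they appear in Table 8; the names and the cutoff of 10 used to operationalize "considerably exceed" are assumptions made for illustration and are not stated in the text.

```python
MANDLER_UPPER = 5 + 2  # upper bound of Mandler's (5±2) parameter
CONSIDERABLE = 10      # hypothetical cutoff for "considerably exceeds"; not from the text

# Upper ends of the ranges in Table 8, as transcribed there.
MAX_FORMULATIONS = {
    "film 1": 7, "film 2": 14, "film 3": 6, "film 4": 7,
    "film 5": 7, "film 6": 12, "film 7": 7, "film 8": 11,
    "subject A": 5, "subject B": 11, "subject C": 9, "subject D": 7,
    "subject E": 7, "subject F": 11, "subject G": 14, "subject H": 10,
}
MAX_SUBSPACES = {
    "film 1": 5, "film 2": 6, "film 3": 3, "film 4": 6,
    "film 5": 3, "film 6": 3, "film 7": 5, "film 8": 5,
    "subject A": 4, "subject B": 6, "subject C": 5, "subject D": 4,
    "subject E": 5, "subject F": 5, "subject G": 6, "subject H": 6,
}


def at_or_above(maxima, cutoff):
    """Return the films or subjects whose maximum reaches the cutoff."""
    return [name for name, value in maxima.items() if value >= cutoff]


# Formulation counts that go well beyond the upper bound of 7.
print(at_or_above(MAX_FORMULATIONS, CONSIDERABLE))

# Subspace counts stay within the upper bound in every row.
largest_subspace_count = max(MAX_SUBSPACES.values())
print(largest_subspace_count, largest_subspace_count <= MANDLER_UPPER)
```

Under this cutoff the first call lists films 2, 6, 8 and subjects B, F, G, H, the same rows cited in the text, while the largest subspace count in any row is six.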
How many subspaces a physician generates (within the limit imposed by memory capacity) is probably a function of both individual difference variables (e.g., his knowledge of relevant medical content) and task environment variables (e.g., the complexity of the case). It was possible to identify one task environment variable which appears to have influenced the performance of this physician sample: namely, the number of different organ systems to which the patient's complaints pertained. When this variable is correlated with the minimum number of subspaces generated for each film, a product-moment coefficient of .72 is obtained. Thus, it would appear that while memory capacity imposes a limit on the maximum number of subspaces a physician generates, the number of organ systems to which the patient's complaints pertain is one task environment variable that governs the minimum number of subspaces generated.

Let us now consider the way in which problem formulations are organized into subspaces. Examination of the physician data reveals that problem formulations are never evenly distributed across subspaces. In the typical case, e.g., a problem space containing two to five subspaces, there are usually several subspaces containing only one problem formulation, and several other subspaces containing from two to at most four problem formulations at the same level of specificity. The subspaces that are hierarchically elaborated typically include only two (or at most three) levels of specificity. Thus, the number of units included in a subspace never exceeds, but in many instances does fall considerably below, the (5±2) parameter proposed by Mandler. There are several factors which may account for this finding. (1) In some instances, the subspace category may be at a level of specificity which does not admit further hierarchical elaboration of diagnostic relevance (e.g., the subspace "diabetes mellitus" in Figure 3). (2) In other instances, it would be possible to generate a hierarchy of subordinate formulations, but the current data base is so limited with respect to that subspace that further hierarchical elaboration would be fruitless (e.g., the subspace "cardiovascular problem" in Figure 3).

Conclusions

On the basis of the preceding analyses, several tentative conclusions may be proposed regarding the structure of a set of initial problem formulations:

1. The subspace is the superordinate unit in a set of problem formulations. Typically, there are two to five such units.

2. In the typical case, some subspaces contain 2-4 hierarchically organized formulations, while other subspaces contain only a single formulation.

3. In virtually every case, there are competing formulations at the level of subspace categories and/or at the level of specific formulations within subspaces.

4. In some cases, there may also be functional relationships linking subspaces and/or specific problem formulations.

Processes Involved in Generating Initial Problem Formulations

As described in Chapter III (p. 56), two types of data relevant to problem formulation processes were collected: (1) retrospective recall data, (2) process checklist data. This section will present findings that were derived from an analysis of each type of data.

For each film the subjects' recall protocols were collated by means of a process summary sheet. This sheet summarized, in outline form, the time sequence of mental events that were reported by the physicians, including a notation of the number of subjects reporting each event. A review of the process summary sheets for all eight films yielded several observations regarding the processes underlying the generation of initial problem formulations. The discussion of these observations will be organized so as to indicate how problem formulation processes are related to each of the structural features described in the first section of this chapter.
Generation of a hierarchy of problem formulations.--Kleinmuntz (1968; Wortman and Kleinmuntz, undated) has proposed that the diagnostic process is characterized by hierarchical search which proceeds from general problem formulation categories to increasingly specific diagnostic formulations. The data from the present study indicate that a physician's initial problem formulations cannot be characterized as either highly general or highly specific. In fact, a set of initial problem formulations typically includes hierarchies of formulations at various levels of specificity. Moreover, the data from the physicians' recall protocols indicate that the elaboration of a problem formulation hierarchy may proceed in three ways: (1) from general to specific; (2) from specific to general; (3) generation of general and specific formulations virtually simultaneously. Each of these processes may be illustrated by examples from the recall protocols.

Example of process 1: general to specific

In viewing film 1, nearly all physicians generated the general formulation of 'GI disorder' on the basis of the patient's complaint of abdominal pain. Subsequently, when the patient mentioned having diarrhea (with blood and mucus in his stools), they generated the more specific formulations of 'ulcerative colitis' and/or 'regional enteritis.'

Example of process 2: specific to general

In viewing film 6, nearly all physicians generated the specific formulations of 'rheumatoid arthritis' and/or 'ankylosing spondylitis' relatively early on the basis of certain cues (i.e., stiff back in a.m., but loosens up during the day; back pain plus dyspnea). Subsequently, when it was learned that the patient also has knee and ankle inflammation, they generated the more general formulation of 'polyarthritis.'

Example of process 3: general and specific simultaneously

In viewing film 5, nearly all physicians generated the general formulation of 'acute infectious illness' very early on the basis of the cues fever, headache and sleepiness of three days duration. They noted that almost simultaneously they thought of several specific types of infectious illnesses (e.g., 'viral flu,' 'infectious mononucleosis,' 'infectious hepatitis') that are highly prevalent in a college student population.

To summarize: it is necessary to distinguish between the processes of generating a problem formulation hierarchy, and the product of these processes. While the product may be represented as a general-to-specific hierarchy of formulations, the process of generating the hierarchy may take one of three forms. For a given case, however, there was a substantial degree of consistency (across subjects) in the process reported. Thus, it would appear that task variables, rather than individual difference variables, were the major determinants of which process was employed.

Generation of competing formulations.--The recall protocols provided evidence of two types of processes underlying the generation of competing formulations: (1) generation of competitors at a single point in time on the basis of the same set of cues; (2) generation of competitors over several points in time on the basis of different cues. Examples from the recall protocols will serve to illustrate each of these processes.

Example of process 1:

In viewing film 1, nearly all physicians generated a list of competing formulations (e.g., 'ulcerative colitis,' 'regional enteritis,' 'intestinal malignancy') at almost the same point in time, on the basis of the patient's report of diarrhea with blood and mucus.
Example of process 2:

In viewing film 8, all physicians generated the formulation of 'GI disorder' early in the interview on the basis of the cues abdominal pain and vomiting. Much later, they generated the formulation of 'diabetes,' as a competitor to 'GI disorder,' on the basis of the patient's report of increased appetite, thirst and urination.

It is probable that the associative mechanisms underlying the above processes are quite different. In the case of the first process, we may hypothesize two types of underlying associative mechanisms: (1) association from cue(s) to a list of competing formulations; (2) association from cue(s) to one formulation, and from this formulation to another competing formulation, and so on. In the case of the second process, we may hypothesize an associative mechanism of the following sort: association from one set of cue(s) to a formulation; association from another set of cue(s) to another formulation; associative link-up of the two formulations as competitors. As was the case for hierarchical problem formulations, diverse associative processes may result in the same product, i.e., a set of competing formulations to be stored in memory.

Generation of multiple subspaces.--Evidence from the recall protocols indicated that there are two types of processes underlying the generation of multiple subspaces: (1) generation of multiple subspaces at a single point in time on the basis of the same set of cues, and (2) generation of multiple subspaces at several points in time on the basis of different cues. Examination of the recall data revealed that there are several task variables which appear to govern the generation of multiple subspaces. The first process listed above generally occurred under two circumstances: (a) when the patient's complaint was of a general or multisystem nature, and (b) when the location of a specific complaint indicated that several organ systems could be involved.
The second process listed above generally occurred when complaints, reported at different points in the interview, pertained to different organ systems or implied different disease mechanisms. These generalizations may be illustrated by the following examples.

Example of process 1a:

In film 1 the patient's presenting complaint of fatigue and weakness is highly general and could be compatible with diverse organ system involvements and/or disease processes. Thus, some physicians generated formulations belonging to two distinct subspaces: (1) 'anemia,' and (2) 'cardiovascular problem.'

Example of process 1b:

In film 2 the patient indicates a substernal location of her chest pain. On the basis of this cue, all physicians generated formulations belonging to two subspaces: (1) 'cardiac problem,' and (2) 'upper GI problem.'

Example of process 2:

In film 7 the patient's presenting complaints of two days duration led all physicians to generate formulations belonging to the subspace 'acute respiratory infection.' Subsequently, when the patient reported symptoms of several years duration, they generated formulations in two additional subspaces: (1) 'chronic respiratory problem,' and (2) 'cancer.'

When multiple subspaces were generated by processes 1a and 1b they were usually competitors, whereas multiple subspaces generated by process 2 most often pertained to disorders which were not mutually exclusive (i.e., some sort of functional relationship between subspaces was hypothesized, or the subspaces pertained to concomitant but unrelated disorders).

Generation of functional relationships.--The data from the recall protocols indicate that, as each new problem formulation is generated, the physician generally considers how it might be functionally related to the formulations he has previously generated.
Thus, functional relationships between problem formulations are usually not hypothesized until the physician has generated at least two noncompeting formulations. However, this is not always the case. In viewing film 2, for example, several physicians noted that because of their initial impression of the patient (i.e., a middle-aged, obese, anxious-appearing woman), they expected from the outset that she might have multiple, interrelated problems of both an organic and psychological nature.

Having presented and discussed the findings from the analysis of the recall protocol data, let us now consider the findings from the analysis of the process checklist data. As indicated in Chapter III (p. 54), the items in the process checklist pertained to the following aspects of the act of generating initial problem formulations:

1. modes of mental representation;

2. strategies of problem formulation, including,
   a. initial routines,
   b. general strategies;

3. associative processes of problem formulation;

4. cue utilization.

The classification of items according to the above topics is presented in Table 9 (p. 135).

The analysis of the checklist data was designed to determine, for each item: (a) its overall importance as a characteristic of the act of generating initial problem formulations, (b) its stability with respect to subjects (across tasks), (c) its stability with respect to tasks (across subjects).

The first step in the analysis was to construct a subject X task (i.e., film) data matrix for each item. In the 49 matrix cells for which data were available, a 1 was entered to indicate that the nth subject checked the item on the tth task.

The overall importance of an item as a characteristic of the act of generating problem formulations was measured by the relative frequency with which it was checked: i.e., (the number of cells in the item matrix with an entry of 1)/49. The results of these calculations are presented in Table 9.
TABLE 9.--The Relative Frequency with which Each Process Checklist Item was Checked.

Category                      Item No. and Description                 Rel. Freq.

I. Modes of Mental             1. mental list                             .84
   Representation              7. mental image--general                   .49
                              21. mental image--anatomical                .45
                              23. mental image--previous patient          .29

II. Strategies of Problem
    Formulation
    A. Initial Routines       16. organic vs. psychogenic                 .22
                              10. assume organic                          .63
                              18. acute vs. chronic                       .47
                              11. localize organ system                   .59
    B. General Strategies     12. incidence                               .73
                              19. incidence, plus complaints              .71
                               2. seriousness                             .45
                               4. pathophysiological processes            .35
                              17. convergence                             .24
                               9. divergence--(1)                         .35
                              24. divergence--(2)                         .69
                               3. quick "rule outs"                       .43

III. Associative Processes     8. association--salient cue                .84
     of Problem Formulation   15. association--combination of cues        .82

IV. Cue Utilization           25. focus on verbal cues                    .61
                              20. focus on nonverbal cues                 .39
                               6. impression of patient                   .71
                              13. presenting complaint--more weight       .43
                              14. selective focus on cues                 .24
                              22. interrelate cues progressively          .88
                               5. store cues, interrelate later           .65

The subject stability and task stability of each item were measured in terms of the following criteria:

1. subject stability: an item was considered to be a stable characteristic of a subject's performance if the item was checked for all but one of the films he viewed;

2. task stability: an item was considered to be a stable characteristic of performance on a given task if it was checked by all but one of the subjects who had viewed the film.

A 1 was entered in the margin(s) of the item matrix for each subject, or task, which met the stability criteria defined above.

In order to determine the proportion of cell entries which could be accounted for by using the stability criteria defined above, the following formula was employed:

$$\frac{N_t - N_e}{N_t}$$

where $N_t$ = the total number of cells in the item matrix, i.e., 49

$N_e$ = the number of cells in the item matrix whose entries deviated from those that would be predicted on the basis of the entries in either matrix margin (i.e., a cell entry of 1, but no entry in either the subject or task margin; or, conversely, no cell entry, but a 1 in either the subject or task margin).

The coefficients for each item calculated according to this formula ranged from 65.3% to 89.8%, with an average of 81.8% across all 25 items. Thus, in general, the criteria adopted for measuring subject and task stability accounted for a very large percentage of the observed responses.

In order to summarize the data on item stability with respect to subjects and tasks, each item was classified along two crossed dimensions: (1) degree of subject stability: i.e., the number of subjects for whom the item was a stable characteristic of performance across tasks, and (2) degree of task stability: i.e., the number of tasks for which the item was a stable characteristic of performance across subjects. Table 10 presents the results of this classification.

The results of the analysis of the checklist data, presented in Tables 9 and 10, will now be discussed with respect to each of the topics listed on page 133.

TABLE 10.--Classification of Checklist Items on Two Dimensions: (a) Degree of Subject Stability, and (b) Degree of Task Stability.

Degree of Task Stability(a):   8   7   6   5   4   3   2   1   0

Degree of Subject
Stability(b)                   Items
  8                            8
  7                            1   15
  6                            22
  5                            6
  4                            10   12,19   5   24
  3                            11,25
  2                            2,13   3,7   9,18
  1                            21   14,20   23   4,16
  0                            17

(a) Number of tasks on which the item was a stable characteristic of performance across subjects.
(b) Number of subjects for whom the item was a stable characteristic of performance across tasks.
NOTE: Entries in the cells are the item numbers (see Table 9).
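The relative frequency measure and the stability coefficient defined above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's procedure: the function names and the small example matrix are hypothetical, cells with no data are marked `None`, and "all but one" is read here as "all, or all but one, of the observed cells" in a row or column containing at least two observations.

```python
def _stable(cells):
    """True if the item was checked in all, or all but one, of the observed cells.
    (Assumed reading of the stability criterion; None marks a cell with no data.)"""
    seen = [c for c in cells if c is not None]
    return len(seen) >= 2 and sum(seen) >= len(seen) - 1


def item_statistics(matrix):
    """matrix[s][t] is 1 (checked), 0 (not checked) or None (film not viewed).
    Returns (relative frequency, stability coefficient (Nt - Ne) / Nt)."""
    subject_margin = [_stable(row) for row in matrix]
    task_margin = [_stable(col) for col in zip(*matrix)]
    n_t = n_e = n_checked = 0
    for s, row in enumerate(matrix):
        for t, cell in enumerate(row):
            if cell is None:
                continue  # only cells with data count toward Nt
            n_t += 1
            n_checked += cell
            # A 1 is predicted wherever either margin carries a stability entry.
            predicted = 1 if (subject_margin[s] or task_margin[t]) else 0
            if cell != predicted:
                n_e += 1
    return n_checked / n_t, (n_t - n_e) / n_t


# Hypothetical 3-subject x 3-film item matrix, for illustration only.
example = [
    [1, 1, 1],
    [1, 0, None],
    [0, 0, 1],
]
rel_freq, coefficient = item_statistics(example)
print(rel_freq, coefficient)
```

In the study each item matrix had 49 observed cells, so the same function applied to a real item matrix would divide by 49 in both measures.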
Modes of mental representation.--This topic was concerned with two modes of mental representation: (1) the verbal, and (2) the figural. Four checklist items (numbers 1, 7, 21 and 23) were of relevance to this topic. The data in Tables 9 and 10 suggest the following conclusions regarding the relative importance of verbal versus figural modes of mental representation in generating problem formulations.

The generation of mental lists of problem formulations (item 1) occurred a very high proportion of the time (.84), and was found to be a stable characteristic of nearly every subject's performance on nearly every task. The generation of mental images occurred considerably less frequently (.49, .45 and .29, for items 7, 21 and 23, respectively). Moreover, the occurrence of mental images showed a very low degree of stability with respect to subjects or tasks: there were only two subjects who consistently reported mental images, and only one task for which mental images were consistently reported. Examination of the relative frequency scores for the three mental image items revealed that when images are generated, they generally pertain to the anatomical location of the patient's problem (item 21), and, less frequently, may consist of an evocation of a previous patient (item 23). In sum, we may conclude that the generation of problem formulations is typically carried out in a verbal mode of mental representation, but that for a few individuals, or occasionally for all individuals on certain tasks, mental imagery accompanies the predominantly verbal train of thought.

Strategies of problem formulation: initial routines.--This topic was concerned with the occurrence of what may be termed "initial routine strategies" for the generation of problem formulations.
The items of relevance to this topic were designed to determine whether one of the physician's first steps in the problem formulation process was: (1) to consider the patient's complaint(s) in terms of an organic versus psychogenic distinction (items 10 and 16); (2) to consider the patient's complaint(s) in terms of an acute versus chronic distinction (item 18); (3) to localize the patient's complaint(s) in terms of an organ system (item 11). All three of these strategies pertain to highly general principles of problem formulation. Thus, this topic is also concerned with whether the physician begins the process of generating problem formulations at a high level of generality. The data in Tables 9 and 10 suggest the following conclusions regarding initial routine strategies.

Examination of the relative frequency scores reveals that the physician does not generally begin by trying to make an organic versus psychogenic distinction (item 16, .22); most often he tends to assume that the patient's problem is organic (item 10, .63). Item 16 was not a stable characteristic of any subject's performance or of performance on any task. On the other hand, item 10 was consistently checked by four subjects (across tasks), and on five tasks (across subjects). The relative frequency scores of items 18 (.47) and 11 (.59) indicate that these two routines occurred more frequently than the first routine (item 16), but, like the first routine, showed little or no subject or task stability. In sum, we may conclude that initial routines involving highly general distinctions are not typically the first step(s) in the process of generating problem formulations. Only a few individuals consistently follow such routines, and there are few tasks which consistently prompt their use.
General strategies of problem formulation.--In contrast to the previous topic, which was concerned with the physician's initial strategies, this topic is concerned with the strategies that the physician may use throughout the entire 4-6 minute encounter. Four items included under this topic sought to determine whether the physician follows a strategy based upon: (1) consideration of disease incidence (items 12 and 19); (2) consideration of disease seriousness, i.e., its life-threatening implications (item 2); (3) consideration of pathophysiological mechanisms (item 4). Three of the items sought to determine whether the physician follows a convergent strategy, i.e., attempts to come up with one problem formulation that will account for all the data (item 17), and/or either of two divergent strategies: (1) a "brainstorming" strategy of attempting to think of as many formulations as possible that fit the cues (item 9), or (2) a more modest divergent strategy of attempting, as each formulation is generated, to think of other possible formulations (item 24). There was also one item (number 3) designed to determine whether the physician had already, within the 4-6 minute interview, performed some quick "rule outs" of certain problem formulations. The data relevant to these items suggest the following conclusions.

Consideration of disease incidence is relatively more important than consideration of disease seriousness in determining the type of problem formulations a physician generates (relative frequency scores of .71 and .73 for the incidence items versus .45 for the seriousness item). Consideration of incidence was consistently checked by four physicians, and on four tasks, while consideration of seriousness was consistently checked by two physicians, and on only one task.
Thus, we may conclude that although the physician may not give consideration to either of these factors, he is relatively more likely to direct his search toward diseases of high incidence (which are usually not very serious) than toward diseases of great seriousness (which usually have low incidence).

The item pertaining to consideration of pathophysiological processes (item 4) was checked a relatively small proportion of the time (.35). Since knowledge of pathophysiological processes is considered to be one of the foundations of clinical medicine, this result is somewhat surprising. It may be that in the experienced physician the utilization of such knowledge is so well established (routinized) that he is no longer consciously aware of its use in generating problem formulations. On the other hand, it is also possible that the generation of problem formulations is essentially a cue-to-disease associative mechanism which does not require consideration of the pathophysiology underlying disease processes. This latter hypothesis receives some support from data to be discussed under the topic "associative processes of problem formulation."

Examination of the data for the items on convergent versus divergent strategies of problem formulation reveals that the convergent item was checked quite infrequently (item 17, .24), while one divergent item was checked relatively frequently (item 24, .69) and the other was not (item 9, .35). The convergent item was not consistently checked by any subjects, or on any task. The data for the divergent items indicate that item 24 was consistently checked by four subjects, and on four tasks, while item 9 was checked consistently by only two subjects, and on none of the tasks. It is not surprising that experienced physicians rarely follow a convergent strategy of problem formulation during the initial 4-6 minutes of the workup.
To do so would entail the risk of premature closure: i.e., acceptance of a formulation which may be intellectually appealing (because it can account so parsimoniously for the available data), but possibly incorrect. Divergent strategies of problem formulation would of course help to counteract any tendency toward premature closure. Of the two divergent strategies that were considered, one (the strategy of attempting, each time a formulation is generated, to think of other formulations) was consistently employed by a sizable number of physicians, and on a sizable number of tasks, while the other (a "brainstorming" strategy of attempting to think of as many causes of the patient's symptoms as possible) was not. The less frequent use of the second divergent strategy may be due to the risks which brainstorming could entail: e.g., information overload taxing the capacity of working memory; inefficient data collection to test numerous potential but implausible hypotheses.

The item on quick "rule outs" was checked with a relative frequency of .43. However, it was checked consistently by only two subjects, and was not consistently checked for any tasks. Thus, we may conclude that while a few physicians consistently rule out some problem formulations generated during the first 4-6 minutes of a workup, most physicians retain all formulations they generate as components of their initial problem space.

Associative processes of problem formulation.--The items under the two previous topics were designed to determine whether the physician attempts to follow various strategies of problem formulation. The items under this topic were designed to determine whether the act of generating problem formulations entails associative processes, i.e., rapid cue-to-problem formulation retrieval, essentially outside the realm of conscious search. There were two items of relevance to this topic.
They sought to determine whether problem formulations were immediately brought to mind: (1) by some "particularly salient cue" (item 8), and/or (2) by a combination of cues (item 15). Both of these items were checked a very high proportion of the time (.84 and .82, respectively). One of the items (8) was checked consistently by all eight subjects, and on seven of the tasks, thus showing a higher degree of subject and task stability than any other item on the checklist. The other item (15) was checked consistently by seven subjects, and on five tasks; thus, it was among the top four items with respect to degree of subject and task stability.

When the data for these items are compared with the data on problem formulation strategies, we are led to conclude that the generation of problem formulations is largely an associative process. Search strategies are employed, more or less frequently depending on the individual and the case, but appear to be adjuncts to the primary process of associative retrieval. Moreover, the finding that generation of problem formulations is more consistently based on single salient cues than on combinations of cues would appear to indicate that the physician's long-term storage of potential problem formulation categories (i.e., disease processes) may be indexed in terms of a very small number of pathognomonic cues for each category, rather than in terms of a complex system of multiple-entry, cross-referenced cues. In sum, the checklist data tend to support the notion that the generation of diagnostic problem formulations consists primarily of rapid cue-to-problem category associative retrieval. As Barrows and Bennett (1972) noted in their study of neurologists, hypotheses seem to literally "pop" into the head of the clinician.
Cue utilization.--The items under this topic were designed to measure several aspects of the physician's behavior with respect to detecting, interpreting, and utilizing cues: (1) whether he focuses on verbal cues (item 25) or nonverbal cues (item 20); (2) whether he gives more weight to the patient's presenting complaint than to subsequent cues (item 13); (3) what role his initial impression of the patient plays in interpretation of the reliability of cues (item 6); (4) whether he focuses his attention on certain cues and pays less attention to others (item 14); (5) whether he attempts to look for relationships among cues progressively as each cue is presented (item 22), or stores cues and attempts to interrelate them after obtaining data on each major complaint (item 5). The data for these items suggest the following conclusions regarding cue utilization.

First, although in many instances the physician does not selectively focus on either verbal or nonverbal cues, there is a greater tendency to focus on verbal than on nonverbal cues (relative frequency scores of .61 and .39, respectively). Focusing on nonverbal cues (item 20) was a stable characteristic of only one subject (and no tasks), while focusing on verbal cues (item 25) was a stable characteristic of three subjects (and two tasks). However, the data for item 6 indicated that physicians generally do make use of nonverbal data to form an early impression of the patient (his personality, intelligence, background, etc.), which is used as a basis for judging the accuracy and objectivity of the symptoms the patient reports. This item had a relative frequency score of .71, and was consistently checked by five subjects, and on three tasks. Thus, it appears that some physicians make greater use of their initial impression of the patient than others, and that some patients provide more of a basis for doing so (e.g., have more salient nonverbal characteristics) than others.
The items dealing with giving more weight to the patient's presenting complaint than to other cues (item 13), and with giving more attention to certain cues than to others (item 14), had moderate to low relative frequency scores (.43 and .24, respectively) and were consistently checked by only one or two subjects, and on only one (or none) of the tasks. Several physicians' comments in their recall protocols suggest a reason why the physician does not typically give more weight to the patient's presenting complaint: they noted that with some patients there may be a "hidden agenda" of medical problems which must be uncovered by careful questioning, and which may prove to be more important than the presenting complaint.

The two items dealing with the manner in which the physician attempts to interrelate cues were both checked relatively often (item 22, .88; item 5, .65). In constructing the checklist, these two items were considered to describe two contrasting strategies for dealing with cues. However, the data reveal that four subjects consistently checked both of them. Several subjects' comments in discussing the checklist items with the experimenter provided an indication as to why this occurred. They noted that although they do not consciously adhere to either of the strategies described by the two items, both items do refer to aspects of their processing of cues and were therefore checked. They noted that it is not a matter, as the items imply, of "trying" (or "not trying") to relate each new cue to previous cues; rather, they suggested, relationships among cues simply "come to mind," usually in a progressive manner as each new cue is obtained, but sometimes at a later point in time.

Conclusions

Several tentative conclusions may be drawn from the analysis of the recall protocol and checklist data regarding the processes involved in the act of generating initial problem formulations.

1. The mode of mental representation involved in the generation of problem formulations is, for all physicians, predominantly verbal. For most individuals mental imagery occasionally occurs, and for a few individuals such imagery consistently occurs. But, for all individuals, mental imagery appears to be an adjunct to the primary verbal mode of representation.

2. Although physicians show some tendency to focus on verbal cues, they do make use of nonverbal cues. One major use of such cues, consistently reported by over half of the physicians, is to form a general impression of the patient that will aid the physician in judging the accuracy and objectivity of the cues reported verbally by the patient.

3. The generation of problem formulations appears to be primarily a process of direct associative retrieval, rather than one of strategy-guided search. The checklist data indicated that only two strategies were consistently employed by at least half of the physicians: (1) focusing on diseases of high incidence for the patient's demographic group, and (2) attempting to think of alternatives (or competitors) to each formulation generated. On the basis of the recall protocol data, it was possible to identify various processes underlying the generation of each of the four structural features of a set of initial formulations. In general these processes appeared to be governed by the effect of various task variables on associative retrieval, rather than by consistent use of strategies on the part of the subject. In particular, it may be noted, the data failed to support Kleinmuntz's (1968) notion that the physician follows a "general-to-specific" strategy, at least so far as the generation of initial problem formulations is concerned.
To summarize: both the recall protocol and checklist data suggest that the physician's information-processing activity during the early part of the workup consists primarily of associative retrieval of problem formulation labels on the basis of cues, and that this process is mediated to only a limited extent by search strategies.

CHAPTER V

RESULTS OF THE TRAINING EXPERIMENT

This chapter presents the results of the training experiment conducted with second-year medical students. It includes two major sections: (1) results of the analyses conducted to test the experimental hypotheses; and (2) results of several supplemental analyses conducted to aid in interpreting the outcomes of the hypothesis tests.

Tests of Experimental Hypotheses

Reliability Estimates

Two types of reliability estimates were calculated on the basis of the posttest data: (1) estimates of inter-scorer reliability in employing the scoring keys for the CUE, PF, CUE-PF and R-PF variables; and (2) coefficients of generalizability for the CUE, PF, CUE-PF and R-PF variables.

Inter-scorer reliability.--A random sample of six subjects' posttest responses was scored independently by the experimenter and a second person. Since the experimenter would ultimately score all subjects' posttest responses, this procedure was not undertaken in order to estimate the effect of inter-scorer variability (as a source of error) on the final set of scores. Rather, it was carried out as a means of determining the objectivity of the scoring keys that had been devised for each variable. A basic judgmental operation was required in utilizing each key: namely, a judgment as to whether a given component of the subject's response was equivalent to some item in the scoring key. Once such a judgment had been made, the other operations in utilizing the key were largely mechanical. Thus, it was the consistency of these judgments that was of primary concern in estimating inter-scorer reliability.
The intraclass correlation coefficient (Ebel, 1951) was used to estimate the inter-scorer reliability of six subjects' scores on the variables CUE, PF, CUE-PF and R-PF. These coefficients are presented in Table 11. As shown in the table, the coefficients for the variables PF, CUE and CUE-PF are nearly 1.0. The coefficient for the R-PF variable, however, was quite low. Inspection of the data revealed that this was due to a high degree of restriction in range on this variable, rather than a substantial degree of divergence between scorers. It was therefore decided to employ a formula proposed by Holsti (1969, p. 140) for estimating percent of agreement in conducting a content analysis. Since the determination of a subject's R-PF score was based on an analysis of the text of his tentative assessment, a content analysis index was deemed to be appropriate for estimating inter-scorer agreement on this variable. As indicated in Table 11, the index of inter-scorer agreement on the R-PF variable was very high. Thus, it may be concluded that on all four of the dependent variables the scorer would not constitute a source of unreliability in the data.

TABLE 11.--Inter-Scorer Reliability Coefficients on the Variables CUE, PF, CUE-PF and R-PF.

Variable     Intraclass Correlation     Index of Agreement

CUE                  .97                       --
PF                   .99                       --
CUE-PF               .97                       --
R-PF                 .09                       .83

Generalizability coefficients.--In order to determine the degree to which it would be possible to generalize from subjects' performance on the two posttest tasks to a hypothetical population of randomly equivalent tasks in the domain of internal medicine, generalizability coefficients were calculated for each dependent variable on the basis of all 48 subjects' posttest scores. These coefficients were calculated by means of Hoyt's (1941) analysis of variance technique (which is equivalent to Cronbach's (1951) coefficient alpha). However, because the coefficients were being calculated on experimental posttest data, the formula was modified so as to exclude between-group variation. This was necessary in order that estimation of the generalizability of the instrument would not be contaminated by treatment effect. The formula used was as follows:

\[ \frac{MS_{S:G} - MS_{TS:G}}{MS_{S:G}} \]

where

$MS_{S:G}$ = mean square for subjects within groups

$MS_{TS:G}$ = mean square for subjects × task interaction within groups

The obtained coefficients are presented in Table 12. This table also presents the within-group correlation coefficients between scores on the two posttest tasks. As shown in the table, the generalizability coefficients ranged from .48 to .73. Given the fact that the posttest included only two tasks, and that the tasks were selected so as to represent very different cases (with respect to type of medical complaints and patient demographic characteristics), the magnitude of the generalizability coefficients is quite substantial. In fact, as compared to coefficients obtained by other investigators in the field, the coefficients are very high. Lewy and McGuire (1966), for example, obtained coefficient alphas of .35, .27, .21 and .10 for proficiency scores on tests including two tasks (i.e., Patient Management Problems) in each of four domains.

TABLE 12.--Generalizability Coefficients, and Within-Group Correlation Coefficients (Between Tasks) on the Dependent Variables CUE, PF, CUE-PF and R-PF.

Variable     Generalizability Coefficient     Correlation Between Tasks

CUE                    .73                              .67
PF                     .66                              .50
CUE-PF                 .55                              .41
R-PF                   .48                              .31

In sum, given the very high degree of inter-scorer reliability, and the substantial magnitude of the generalizability coefficients, it was concluded that the posttest instrument had sufficiently strong psychometric properties to permit detection of treatment effects by inferential hypothesis tests.

Results of Hypothesis Tests

The experimental hypotheses were tested in the following manner.
First, a multivariate analysis of covariance (MANCOVA) was conducted in order to determine whether there was a significant main effect for treatment as measured by the set of four dependent variables. Second, stepdown F ratios obtained from the MANCOVA, as well as univariate F ratios obtained from an analysis of covariance (ANCOVA) on each dependent variable, were tested in order to identify the dependent variable(s) on which a significant treatment effect occurred. Third, for each variable having a significant univariate F ratio, the Scheffé post hoc confidence interval procedure was used in order to test for significant differences between each pair of experimental conditions.

Although the experimental hypotheses were stated in directional form, an omnibus nondirectional test of the null hypothesis for treatment was conducted in order that significant differences in either the predicted direction or the opposite direction would be detected through post hoc comparisons.

Both the MANCOVA and the ANCOVAs were conducted using the Finn (1970) program on the Michigan State CDC 6500 Computer. The Scheffé post hoc comparisons were computed by the experimenter using the formulas presented in Glendening (1973).

For one subject a score on the covariate (Focal Problems Exam) was not available. Thus, prior to the analysis, this subject's score was estimated by means of the following regression equation:

\[ \hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + b_4 X_4 \]

where

$\hat{Y}$ = estimated Focal Problems Exam score
$X_1$ = PF score
$X_2$ = CUE score
$X_3$ = CUE-PF score
$X_4$ = R-PF score

The estimated exam score for this subject was then used in the analysis.

The observed means and standard deviations on the four dependent variables and on the covariate for each experimental condition are presented in Table 13. The means on the dependent variables adjusted for their relationship to the covariate are presented in Table 14.
Examination of the means in Table 14 reveals that on the variables CUE, PF and CUE-PF the Treatment I means are consistently higher than the Treatment II means, and the latter are consistently higher than the control means. On the variable R-PF, the means for Treatment II and control are very close, while the mean for Treatment I is higher than the other two. Examination of the standard deviations in Table 13 indicates a higher degree of variability under the control condition than under the treatment conditions, especially on the variable CUE-PF.

TABLE 13.--Means and Standard Deviations on the Dependent Variables and Covariate, by Experimental Condition.

                       Experimental Condition
Variable     Treatment I        Treatment II       Control
             Mean    (SD)       Mean    (SD)       Mean    (SD)

CUE          79.88  ( 7.91)     76.44  ( 9.38)     72.31  (11.45)
PF           60.19  (12.59)     50.56  ( 8.07)     39.38  (13.77)
CUE-PF       84.50  (13.84)     75.25  (18.11)     57.13  (24.24)
R-PF          9.44  ( 3.63)      6.31  ( 3.38)      7.00  ( 4.65)
Exam         73.63  ( 7.82)     71.88  ( 6.14)     70.25  ( 7.32)

TABLE 14.--Adjusted Means on the Dependent Variables, by Experimental Condition.

                       Experimental Condition
Variable     Treatment I        Treatment II       Control

CUE          79.52              76.45              72.66
PF           59.46              50.58              40.09
CUE-PF       83.49              75.27              58.12
R-PF          9.10               6.32               7.33

The analysis of covariance fixed effects model is based on the following assumptions: (1) independence of observations on the dependent variable; (2) normality of the conditional population distributions; (3) equality of the conditional population variances; (4) equality of the population regression slopes. Empirical studies (reviewed in Glass, et al., 1972) have demonstrated that fixed effects ANCOVA F tests are robust with respect to violation of assumptions (2) and (3), providing that sample size is moderately large (10 or more subjects per cell), and there is an equal number of observations per cell. Since both of these conditions were met in the present study, possible violation of assumptions (2) and (3) does not pose a threat to interpretation of the F statistics.
Glass, et al., point out that little is presently known about the effect of violation of assumption (4) when the covariate is a random variable. However, they suggest that unless the departure from homogeneity of regression slopes is extreme, the effect on the F statistic is probably minimal. In order to determine whether this assumption was met, a test for homogeneity of regression was conducted for each dependent variable. The results of these tests, reported in Appendix I, indicated that for the variables PF, CUE-PF and R-PF, there was no evidence of departure from homogeneity of regression. Although the test for the variable CUE did indicate a lack of homogeneity, it was believed, upon further analysis, that this result could be attributed to a spurious negative regression weight for one group (Treatment II), and thus constituted a Type I error (see Appendix I for a more detailed discussion of the data relevant to this point).

With respect to the first, and most crucial, of the ANCOVA model assumptions, it is believed that the experimental procedure included sufficient precautions to guarantee that this assumption was met. Although practical constraints required that subjects be assembled in groups for training and testing, all instructions and materials were administered by means of individual booklets in self-instructional format. Thus, it is believed that the subject may be legitimately considered as the unit of analysis.

The multivariate analysis of covariance included one fixed independent factor (experimental condition) having three levels, with 16 subjects nested within each level, one covariate (Focal Problems Final Exam), and four dependent variables (CUE, PF, CUE-PF, R-PF). The ordering of the dependent variables for the conditional stepdown F tests was based on the following considerations.
(1) Since performance on CUE (i.e., the detection and utilization of cues) is a prerequisite for the generation of problem formulation titles (PF performance), the variable CUE was ordered first and the variable PF second. Thus, for CUE the stepdown F test was the same as a univariate F test, while for PF the stepdown F provided a test of treatment effect on the generation of problem formulation titles with between-group differences on CUE partialled out. (2) Since the classification of cues with respect to problem formulation titles (CUE-PF performance) is a function of both cues obtained and problem formulation titles generated, CUE-PF was ordered third. Thus, the stepdown F ratio for CUE-PF provided a test of treatment effect on this variable with between-group differences on both CUE and PF partialled out. (3) Since statements of functional relationships between problem formulations (R-PF performance) are dependent on the subject's prior processing of cues and generation of problem formulations (i.e., performance on CUE, PF and CUE-PF), this variable was ordered last. Thus, the fourth stepdown F ratio tests for treatment effect on R-PF with between-group differences on the other three variables partialled out.

The results of the MANCOVA are presented in Table 15. As shown in the table, the multivariate F test of equality of the vectors of adjusted group means was significant at p < .0052. Thus, for the set of four dependent variables taken together, there was a significant main effect for treatment.

TABLE 15.--Multivariate Analysis of Covariance on CUE, PF, CUE-PF, R-PF.

F tests                     df          F          p

Multivariate F test        8, 82      3.0149     .0052

Stepdown F tests
   on CUE                  2, 44      1.9376     .1562
   on PF                   2, 44      8.2725     .0010
   on CUE-PF               2, 44       .4261     .6559
   on R-PF                 2, 44      1.8334     .1728
The stepdown F tests yielded the following results: (1) no significant treatment effect on the variable CUE; (2) a significant treatment effect (p < .001) on the variable PF, conditioned on the variable CUE; (3) no significant treatment effect on the variable CUE-PF, conditioned on the variables CUE and PF; (4) no significant treatment effect on the variable R-PF, conditioned on the variables CUE, PF and CUE-PF.

In order to determine whether the nonsignificant stepdown F ratios occurred (1) due to nonsignificant differences between group means, or (2) due to the fact that significant differences existed but had been partialled out in the calculation of the conditional stepdown F ratios, univariate F ratios were calculated on each dependent variable. The univariate ANCOVAs involved the same design model as the MANCOVA: namely, one fixed independent factor (experimental condition with subjects nested within levels of condition) and one covariate (Focal Problems Final Exam). The results of the ANCOVA on each dependent variable are presented in Table 16. As shown in the table, there was a significant treatment effect on the variables PF and CUE-PF (p < .0002 and p < .0020, respectively), but no significant treatment effect on the variables CUE and R-PF.

TABLE 16.--Univariate Analyses of Covariance on CUE, PF, CUE-PF, R-PF.

Dependent     Sources of
Variable      Variation          df        MS          F           p

CUE           Group               2      181.57      1.9376      .1562
              Subjects: Group    44       93.71
              Total              46

PF            Group               2     1446.76     11.0134      .0002
              Subjects: Group    44      131.36
              Total              46

CUE-PF        Group               2     2583.62      7.1966      .0020
              Subjects: Group    44      359.00
              Total              46

R-PF          Group               2       31.05      2.2698      .1154
              Subjects: Group    44       13.68
              Total              46

The results of the stepdown and univariate F tests indicate the following conclusions regarding each dependent variable.

1. The differences among adjusted group means on the variable CUE are nonsignificant, as tested by a univariate F ratio.

2. The differences among adjusted group means on the variable PF are significant, as tested by a univariate F ratio or by a stepdown F ratio with PF conditioned on CUE. Thus, a significant treatment effect on PF is found not only when this variable is tested singly (by a univariate F ratio), but also when between-group variance on CUE is partialled out (by a stepdown F ratio).

3. The differences among adjusted group means on the variable CUE-PF are significant, as tested by a univariate F ratio. However, when CUE-PF is conditioned on CUE and PF (via a stepdown F ratio), differences among groups are not significant. Since significant between-group differences were found on PF but not on CUE, and since the within-group correlation of PF and CUE-PF was .75, the nonsignificant stepdown F for CUE-PF can be attributed to the partialling out of between-group differences on PF.

4. The differences among adjusted group means on R-PF were nonsignificant, whether measured by a univariate F ratio or a stepdown F ratio.

The average within-group correlations between the covariate and the dependent variables were .15 for CUE, .26 for PF, .22 for CUE-PF, and .36 for R-PF. Of the four coefficients, only the one for R-PF was found to be significantly different from zero (F = 6.6840, df = 1, 44, p < .01). Given the nonsignificance of the three coefficients, and the low magnitude of the R-PF coefficient, we may conclude that the covariate was not effective in increasing the precision of the F tests. For the variables PF and CUE-PF significant univariate F tests were found in spite of the ineffectiveness of the covariate. For the variables CUE and R-PF, it is possible that significant F tests would have been obtained if a more powerful covariate had been available.
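As an arithmetic check on the figures reported above, the following sketch recomputes each univariate F ratio in Table 16 as the ratio of the group mean square to the subjects-within-groups mean square, and approximates the F statistic for the covariate correlation on R-PF. The function names are hypothetical, and the formula F = r²·df / (1 − r²) is the standard test of a single correlation with 1 and df degrees of freedom; it is assumed here, not taken from the original analysis, and because the reported correlation (.36) is rounded, the result (approximately 6.55) only approximates the reported F of 6.6840.

```python
# (MS for groups, MS for subjects within groups, reported F), from Table 16.
TABLE_16 = {
    "CUE": (181.57, 93.71, 1.9376),
    "PF": (1446.76, 131.36, 11.0134),
    "CUE-PF": (2583.62, 359.00, 7.1966),
    "R-PF": (31.05, 13.68, 2.2698),
}


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Univariate F ratio: effect mean square over error mean square."""
    return ms_effect / ms_error


def f_from_correlation(r: float, df_error: int) -> float:
    """F (1, df_error) for testing whether a correlation differs from zero.

    Assumption: standard single-predictor test, F = r^2 * df / (1 - r^2).
    """
    return (r * r) * df_error / (1.0 - r * r)


recomputed = {name: f_ratio(ms_g, ms_e) for name, (ms_g, ms_e, _) in TABLE_16.items()}
for name, (_, _, reported) in TABLE_16.items():
    print(f"{name}: recomputed F = {recomputed[name]:.4f}, reported F = {reported:.4f}")

# Covariate correlation with R-PF (.36, rounded) on 44 error degrees of freedom.
f_rpf_covariate = f_from_correlation(0.36, 44)
print(f"R-PF covariate: F = {f_rpf_covariate:.2f}")
```

The recomputed ratios agree with the tabled F values to within rounding of the mean squares.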
Having found a significant univariate treatment effect on the variables PF and CUE-PF, the Scheffé post hoc confidence interval procedure was used in order to test for significant differences on these variables between each pair of experimental conditions. Although the more powerful Tukey post hoc procedure would generally be used when simple pair-wise comparisons are desired and there is the same number of subjects per level of the independent variable, this procedure is not considered applicable following analysis of covariance because the estimates of the regression intercepts do not in general meet the requirement of equal variances and covariances (Scheffé, 1959, cited in Glendening, 1973).

The results of the Scheffé post hoc procedure are presented in Table 17. Similar results were found for both dependent variables: (1) the difference between the two treatment groups is nonsignificant; (2) the difference between Treatment I and the control group is significant at the .001 or .005 level; (3) the difference between Treatment II and the control group is significant at the .05 level.

TABLE 17.--Scheffé post hoc Comparisons on PF and CUE-PF.

Variable    Comparison    Confidence Interval     (1-α)%

PF          T1-T2          8.8767  ± 10.3227      95%
            T1-C          19.3699* ± 16.6644      99.9%
            T2-C          10.4932* ± 10.3187      95%

CUE-PF      T1-T2          8.2107  ± 17.0648      95%
            T1-C          25.3706* ± 23.6577      99.5%
            T2-C          17.1599* ± 17.0583      95%

* Significant group differences at the .05 level or better.
NOTE: Comparisons are made on adjusted group means. T1 = Treatment I; T2 = Treatment II; C = Control.

The results of the preceding analysis will now be discussed with respect to each of the experimental hypotheses.

Hypothesis 1: The average performance of second-year medical students who have received problem formulation training (Treatment I and Treatment II) will be superior to that of students who have not received training (control group), as measured by four dependent variables: (1) CUE score, (2) PF score, (3) CUE-PF score, and (4) R-PF score.

The results of the analysis supported this hypothesis with respect to the variables PF and CUE-PF, but not with respect to the other two variables.

On the variable CUE there was no significant difference between the treatment group means and the control group mean. If the means on CUE are expressed as a percentage of the maximum possible score on this variable, it is found that the average performance under all three conditions was high (77.2% for Treatment I, 74.2% for Treatment II, 70.5% for the control group). Thus, we may conclude that the subjects had already attained, prior to the experiment, a high level of ability with respect to detecting cues, and utilizing them to generate at least one problem formulation.

On the variable PF the treatment group means each differed significantly from the control group mean. Since the treatment and control subjects did not differ on the variable CUE, the significant differences on PF cannot be attributed to a failure on the part of the control subjects to acquire sufficient cues to generate problem formulations. Thus, we may conclude that the effect of the training was to improve the subject's skill in making use of the cues he obtained in order to generate a thorough and appropriate set of initial problem formulations.

On the variable CUE-PF the treatment group means each differed significantly from the control group mean. Thus, the training was also effective in improving subjects' performance on the task of classifying cues with respect to the problem formulation categories of major importance for the case. However, the results of the stepdown F test on CUE-PF indicate that between-group differences on this variable can be attributed to the between-group differences that occurred on PF.
Thus, we may conclude that although training significantly improved the subject's performance on CUE-PF, this effect was a function of improvement in the thoroughness and appropriateness of the problem formulations he generated.

On the variable R-PF there was no significant difference between treatment group means and the control group mean. Thus, we may conclude that the training had no effect on the subject's ability to hypothesize possible functional relationships among the problem formulations he generated.

Hypothesis 2: The average performance of second-year medical students who have received problem formulation training involving outcome and process feedback (Treatment II) will be superior to that of students who have received problem formulation training involving only outcome feedback (Treatment I), as measured by the four dependent variables: (1) CUE score, (2) PF score, (3) CUE-PF score, and (4) R-PF score.

There were no significant differences between the two treatment groups on any of the variables. Thus, the second experimental hypothesis was not supported. Moreover, the observed differences ran in the direction opposite to that hypothesized: namely, the means for the "outcome feedback only" condition were consistently higher than the means of the "outcome plus process feedback" condition. Except for CUE, the Treatment I group means did not closely approach the maximum possible score on each variable (the PF mean was 50.8% of the maximum, the CUE-PF mean 40.1% of the maximum, and the R-PF mean 37.9% of the maximum). Thus, the lack of significant differences between treatment groups cannot be attributed to a ceiling effect. As indicated in Table 14, differences between the treatment conditions were quite sizable on the variables PF and CUE-PF. It is possible that these differences would have proved to be significant if the covariate had been more powerful.
Thus, we may conclude that while neither treatment was found to be more effective than the other, the results suggest a possible superiority of Treatment I over Treatment II. Further interpretation of the results of the hypothesis tests will be undertaken in the second section of this chapter.

Relationships Among Dependent Variables

As described in Chapter III (p. 94), each dependent variable scoring key was designed to measure a distinct component of the subject's performance on the posttest task. It was assumed that the measures would show moderate positive intercorrelations, but that no correlation would be so high as to indicate that performance on one variable can be fully predicted by performance on any other(s). In particular, it was argued that the ability to classify cues with respect to problem formulation categories (CUE-PF performance) would not be a simple linear function of performance on the two single-dimension variables (CUE and PF). The keys were constructed so that even though two subjects had identical CUE and PF scores they could differ in performance on CUE-PF (e.g., one could show greater flexibility in associating appropriate cues with the multiple problem formulations to which these cues were potentially relevant).

The results of the stepdown F tests indicated that, at least so far as between-group differences are concerned, performance on CUE-PF could be predicted by performance on PF. In order to determine if this were also true at the within-group level, a within-group multiple linear regression of CUE-PF on PF and CUE was carried out. The intercorrelations among the dependent variables, as well as the results of the multiple regression analysis, are presented in Table 18. All correlations, except for that between PF and CUE-PF, were as anticipated: positive and moderate. The results of the regression analysis indicated that a very sizable portion of the variance on CUE-PF could be accounted for by PF and CUE (R² = .62).
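
The reported R² can be approximately reproduced from the average within-group correlations given in Table 18 by means of the standard two-predictor formula, R² = (r_y1² + r_y2² - 2 r_y1 r_y2 r_12) / (1 - r_12²). The Python sketch below is offered only as an illustrative check: the function name `multiple_r2` and the variable names are assumptions introduced here, and because the inputs are correlations rounded to two decimals, agreement with the reported figures should be expected only to about two decimals.

```python
import math

def multiple_r2(r_y1: float, r_y2: float, r_12: float) -> float:
    """Squared multiple correlation of y on two predictors,
    computed from the three pairwise correlations (hypothetical helper)."""
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12
    return numerator / (1.0 - r_12 ** 2)

# Average within-group correlations taken from Table 18.
r_pf_cuepf = 0.75    # PF with CUE-PF
r_cue_cuepf = 0.48   # CUE with CUE-PF
r_cue_pf = 0.34      # CUE with PF

r2_full = multiple_r2(r_pf_cuepf, r_cue_cuepf, r_cue_pf)
r2_pf_only = r_pf_cuepf ** 2           # variance accounted for by PF alone
increment_cue = r2_full - r2_pf_only   # added by entering CUE after PF

print(f"Multiple R^2         = {r2_full:.2f}")
print(f"Multiple R           = {math.sqrt(r2_full):.2f}")
print(f"PF alone             = {r2_pf_only:.2f}")
print(f"Increment due to CUE = {increment_cue:.2f}")
```

Run on the Table 18 correlations, the sketch gives values that round to .62 for R², .79 for R, .56 for PF alone, and .06 for the increment due to CUE.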
The estimates of variance accounted for by step-wise addition of PF and then CUE to the equation indicated that PF alone accounted for 56% of the variance, while the addition of CUE accounted for only an additional 6% of the variance. Since the multiple correlation coefficient (R = .79) is probably as high as could be attained given the reliability of the measures, we are led to conclude that within-group performance on CUE-PF is a linear function of performance on PF and CUE.

TABLE 18.--Relationships Among Dependent Variables.

Average Within-Group Correlations Among Dependent Variables

             CUE      PF     CUE-PF    R-PF

CUE         1.00
PF           .34     1.00
CUE-PF       .48      .75     1.00
R-PF         .27      .41      .39     1.00

Multiple Regression of CUE-PF on CUE and PF (Within-group)

             Regression     Partial
             Weight         Correlation

PF             1.09           .67
CUE             .50           .25

Multiple R     Multiple R²
   .79            .62

% of Variance Accounted for by Step-wise Addition

PF             56%
CUE             6%
Total          62%

In sum, it appears that once a subject has obtained the cues presented during the workup, and generated a set of problem formulation titles, the task of determining which cues are of relevance to each title does not pose any further difficulty.

Supplemental Analyses

This section will present a number of supplemental analyses that were conducted to aid in interpretation of the experimental outcomes described in the preceding section.

Results of the Additional Posttest Tasks

As explained in Chapter III (p. 143), two additional tasks were administered at the posttest session in order to determine whether failure in the processes of cue detection, encoding and retrieval may have inhibited performance on the basic posttest task as measured by the four dependent variables. For reasons discussed previously, the additional tasks pertained only to film 8.

It will be recalled from Chapter III that the CUE score was a weighted sum of points obtained for each cue a subject listed under at least one problem formulation title.
Thus, high performance on CUE in itself provides evidence that failure in cue acquisition was not responsible for low performance on the PF variable. The additional posttest tasks were administered in order to aid in interpreting the experimental outcomes in the event that there were significant between-group differences on CUE, and that these differences were predictive of differences on PF. Since performance on CUE was uniformly high across all three experimental conditions, the data from the additional posttest tasks merely corroborate the conclusions reached in section one of this chapter.

The subject's performance on the Recognition of Cues task was summarized in terms of the number of each type of item he checked: (1) number of cues (out of 32); (2) number of consistent distractors (out of 16); (3) number of contradictory distractors (out of 8); (4) number of inconsistent distractors (out of 8). In some instances a subject failed to check a cue on the recognition task even though he had listed it under one of his problem formulations in carrying out the basic posttest task. Since the primary purpose of this analysis was to determine the number of cues the subject had obtained from the film and could have potentially used in generating problem formulations, an additional variable was also calculated: number of cues obtained (i.e., number of cues checked on recognition task + number of cues used on basic posttest task but not checked on recognition task).

Group results on each of these measures are presented in Table 19. As shown in this table an average of 30 (out of 32) cues was obtained under each experimental condition. Moreover, no subject obtained fewer than 27 cues. Clearly, detection, encoding and retrieval of cues presented no obstacle to carrying out the basic posttest task, and in no way contributed to between-group differences on PF.

TABLE 19.--Results of the Recognition of Cues Task, by Experimental Condition.

Variable                      Treatment I    Treatment II    Control

No. cues obtained                30.1            30.1          29.4
                                (28-31)         (28-31)       (27-31)

No. cues checked                 29.5            29.9          28.5
                                (27-31)         (28-31)       (25-31)

No. distractors checked
  consistent                      1.1             2.0           1.3
                                 (1-3)           (0-6)         (0-3)
  contradictory                   0.1             0.2           0.3
                                 (0-1)           (0-1)         (0-2)
  inconsistent                    0               0             0

A tabulation of the cues that were not obtained (i.e., a total of 97 omissions across all 48 subjects' responses) revealed that in 80 instances (82.5%) the cue had a weight of one in the CUE scoring key. Twelve of the 97 omissions (12.3%) involved cues with a weight of two; and only 5 omissions (5.2%) involved cues with a weight of three. Thus, only very rarely did a subject fail to obtain cues that were of major importance for the generation of appropriate problem formulations.

Examination of the cues that had been listed by the subject on the basic posttest task, but not checked on the Recognition of Cues task (i.e., cues obtained - cues checked), indicated that in 23 instances (out of a total of 25 such instances across all 48 subjects), the cue forgotten had a weight of one and/or was very closely related to other cues that were checked. Of the 20 subjects who forgot cues, only 4 forgot more than one cue. Thus, not only was the forgetting of cues between the two tasks very minimal, but when it did occur it usually involved single items that were of minor importance and/or highly redundant with other cues that were recalled.

Examination of the distractors checked on the recognition task indicated that errors of commission were as infrequent as errors of omission. The extremely low frequency with which the contradictory distractors were checked indicated that if a subject detected a cue while viewing the film, he nearly always encoded it correctly. The results for the other two types of distractors indicated that if a subject "recalled" pieces of data that were not in fact presented in the film, these were always data which were consistent with the cues that were presented.
Thus, to a very limited degree subjects showed a tendency to "supplement" the nominal stimulus by generating, as "cues," items that presumably are closely associated with the actual cues in long-term memory.

In the second additional posttest task, the subject was provided with a list of the cues presented in the film and asked to make any additions he wished to his original response sheets. The purpose of this task was to determine if performance on the basic posttest task would have been higher if the process of generating problem formulations had not been dependent on the subject's detection, encoding and retrieval of cues. For each subject a new set of PF, CUE and CUE-PF scores was calculated on the basis of his initial responses plus his additions to his response sheets.

As discussed in Chapter III, additions to response sheets could occur for two reasons: (1) because the list of cues provided the subject with data which he had failed to obtain while viewing the film, and (2) because the list provided the subject with a second exposure to cues he had originally obtained, but failed to utilize in generating problem formulations. Since the present analysis was concerned only with the first factor listed above, the following criteria were used in determining which of a subject's additions would be included in the calculation of his new dependent variable scores. (1) Problem formulation titles were counted as additions provided that at least one of the cues listed under this title had not been previously obtained (i.e., had not been included in the subject's basic posttest responses, and/or had not been checked on the recognition checklist). (2) Cues were counted as additions only if not previously obtained (as indicated by the subject's basic posttest and/or checklist responses).

The results of the additions task are presented in Table 20.

TABLE 20.--Results of the Additions to Response Sheets Task, by Experimental Condition.

                              Treatment I    Treatment II    Control

Group means
  CUE                            45.88           43.88         40.25
  CUE (A)                        47.32           44.69         42.00
  PF                             34.44           29.88         23.19
  PF (A)                         34.94           30.82         23.82
  CUE-PF                         50.44           42.44         33.19
  CUE-PF (A)                     51.25           43.88         35.07

Mean increment for subjects
whose scores changed
  CUE                             3.29            2.17          2.80
    (n)                           (7)             (6)          (10)
  PF                              2.67            7.50          5.00
    (n)                           (3)             (2)           (2)
  CUE-PF                          2.60            4.60          6.00
    (n)                           (5)             (5)           (5)

NOTE: Group means are reported on each variable for the subjects' original scores on the basic posttest task, and for the subjects' composite scores (including additions), designated by (A) following the variable label.

There was very little change in the group means on PF, due to the fact that only two or three subjects in each group generated any additional problem formulations. A sizable number of subjects in each group improved their CUE and CUE-PF scores, but not by a very large amount on the average. On all variables the increment in the group mean was fairly constant across experimental conditions. Thus, the between-group experimental outcomes found on the basic posttest task were not altered by providing the subject with the cues he initially failed to obtain. This finding corroborates the conclusion drawn from the hypothesis tests reported earlier: namely, treatment-control differences in the generation of problem formulations cannot be attributed to differences in cue acquisition.

Treatment-Control Differences in Problem Formulations

Several supplemental analyses were undertaken in order to determine more precisely the nature of the significant treatment-control differences that were found on the variable PF. The first analysis was concerned with the structural properties of the "problem spaces" generated by subjects under each experimental condition.
It will be recalled from Chapter IV that four features were found to be characteristic of the sets of problem formulations generated by the experienced physicians: (1) hierarchical organization, (2) competing formulations, (3) multiple subspaces, and (4) functional relationships. In addition, it was found that the size of the physician's set of formulations could be measured in terms of (1) the number of problem formulations it contained, and (2) the number of subspaces it contained. Analysis of the student data in terms of these six variables yielded the results presented in Table 21.

TABLE 21.--Analysis of the Structure of the Students' Sets of Problem Formulations, by Experimental Condition.

Variable                         Treatment I    Treatment II    Control

Structural Features [a]

Film 7
  Hierarchical organization          12              15            9
  Competing formulations             16              16           11
  Multiple subspaces                 16              16           15
  Functional relationships           15              13           11

Film 8
  Hierarchical organization          16              14            9
  Competing formulations             15              12            8
  Multiple subspaces                 16              16           16
  Functional relationships           14               8           10

Problem Space Size [b]

Film 7
  Number of problem                  6.9             5.5          4.2
  formulations                      (4-11)          (3-7)        (1-8)
  Number of subspaces                4.0             3.7          3.1
                                    (3-6)           (3-5)        (1-5)

Film 8
  Number of problem                  8.4             6.8          5.1
  formulations                      (5-13)          (3-10)       (3-9)
  Number of subspaces                4.4             4.2          3.3
                                    (3-5)           (3-5)        (2-5)

[a] Table entries are the number of subjects whose set of problem formulations exhibited each feature.
[b] Table entries are the mean and range on each variable.

An examination of the data on structural features indicated that the control group tended to differ from the treatment groups with respect to two features: (1) hierarchical organization, and (2) competing formulations. The fact that fewer control subjects generated hierarchically organized sets of problem formulations can in part be attributed to the fact that they generated fewer problem formulations, and therefore had less need to use hierarchical organization as a means of increasing working memory storage capacity.
The relative infrequency of competing formulations among control subjects is a more critical matter. As indicated in Chapter IV, competing formulations was the one feature that was found to characterize all physicians' performance on every film. Thus, a major difference between the control subjects' responses and those of the trained subjects (especially under Treatment I) was that many of the former (approximately one-half to one-third, depending on the case) failed to generate formulations having the feature that was uniformly characteristic of the experienced physicians, while nearly all of the latter did so. This finding would suggest that one effect of the training procedure, at least under the "outcome feedback only" condition, was to improve the subject's skill in generating competing formulations.

In order to determine whether the observed treatment-control differences on the measures of problem space size were statistically significant, a multivariate analysis of covariance was conducted. For the purpose of this analysis, the subject's scores on the two variables were summed across the two tasks. The results of the analysis, reported in Table 30, Appendix H, revealed: (1) a significant multivariate main effect (F = 5.0314, p < .0011), (2) a significant main effect on number of subspaces (univariate F = 6.2399, p < .0042), (3) a significant main effect on number of problem formulations conditioned on number of subspaces (stepdown F = 4.0060, p < .0254). Scheffé post hoc comparisons were then carried out on the adjusted group means for each pair of experimental conditions.
The results of these comparisons, reported in Table 31, Appendix H, were as follows: (1) the Treatment I mean was significantly higher than the control mean on both number of problem formulations and number of subspaces (p < .001 and p < .05, respectively); (2) the difference between the Treatment II and the control group was not significant on either variable; (3) the difference between the treatment groups was not significant on either variable.

On the basis of this analysis we may conclude that the treatment-control difference on the variable PF can be attributed in part to the fact that the trained subjects (at least under Treatment I) generated a larger number of both problem formulations and subspaces than the control subjects. The larger number of subspaces generated by the Treatment I subjects indicates that one effect of the "outcome feedback" training was to increase the scope (or horizontal dimension) of the student's problem spaces. Moreover, the significant stepdown F ratio for number of problem formulations conditioned on number of subspaces, coupled with the results of the post hoc comparisons (indicating that the Treatment I-control difference in number of problem formulations cannot be accounted for by the between-group difference in number of subspaces), suggests that the "outcome feedback" training also led to an increase of problem space size on the vertical dimension of hierarchical elaboration within subspaces.

It is also of interest to consider the data on subspaces with respect to the parameter of memory organization that has been proposed by Mandler (1967). The range of subspaces (per film) generated under both treatment conditions coincides very closely with Mandler's proposition that human information-processors typically organize and store items in terms of (5 ± 2) categories.
Under the control condition, the number of subspaces generated never exceeded the upper limit of Mandler's parameter, but, in the case of five subjects on each task, did fall below the lower limit.

The data on number of subspaces provide a quantitative measure of the scope of subjects' problem spaces, but do not indicate whether the (5 ± 2) subspaces typically generated were the most appropriate ones for the case. In order to address this question, a second type of analysis was undertaken. In this analysis, the subjects' problem formulation responses were tabulated in terms of eight major problem formulation categories (plus two specific formulations) that were found to characterize the performance of all of the experienced physicians on the two posttest films [1]. Each of the eight categories (listed in Table 22) represents a different problem subspace. The table indicates the number of subjects, under each condition, who generated at least one problem formulation within each of the eight categories. It also indicates the number of subjects who generated two specific formulations of particular importance for film 8, and the number of subjects generating at least one formulation within various combinations of categories.

[1] These categories, it may be noted, were also the categories used in the CUE-PF scoring key.

TABLE 22.--Number of Subjects Generating at Least One Problem Formulation in Various Categories, by Experimental Condition.

Problem Formulation Categories
of Major Importance                Treatment I    Treatment II    Control

Film 7
  1.  Acute respiratory infection      16              16           16
  2.  Chronic respiratory problem      15              14           12
  3.  Cancer                           16              14            8
  Categories 1 plus 2 or 3             16              16           14
  All three categories                 15              12            6

Film 8
  1.  Pregnancy                        16              16           16
  1a. Ectopic pregnancy                 8               8            5
  2.  Psychological problem            14              16           16
  3.  Gastrointestinal problem         13              13            8
  3a. Appendicitis                      6               7            4
  4.  Diabetes mellitus                15              13            9
  5.  Genito-urinary infection         12               7            3
  At least four categories             15              12            7
  All five categories                   7               5            3

For film 7, the table indicates that all subjects generated problem formulations pertaining to the patient's recent (acute) symptoms of respiratory infection. In this case, however, the problematic aspect is to recognize that the patient's acute problem is superimposed on a more serious and long-term problem (chronic respiratory disease, or cancer). All but two control subjects generated formulations in one of these latter categories, thereby indicating their recognition of the dual-level nature of the case. However, only six control subjects, as compared to 15 subjects under Treatment I and 12 under Treatment II, generated formulations in both of these categories. Thus, the major difference between the treatment and control groups was the latter's failure to generate multiple competing formulations pertaining to the patient's underlying problem.

For film 8, there was a considerably larger number of major problem formulation categories to be considered. Since the patient is a single student nurse who thinks she may be pregnant, and has broken up with her boyfriend, it is not surprising that nearly all subjects generated the formulations of pregnancy and psychological problem. The main difference between the treatment and control groups was their skill in generating multiple formulations to account for the patient's other complaints (severe abdominal pain and vomiting of one day's duration; increased appetite, thirst and urination for several weeks; a recurrent vaginal itch). Although all of these symptoms might be attributed to pregnancy and/or a psychological problem, they strongly suggested (to all experienced physicians) that three other categories should be considered: namely, gastrointestinal problem, diabetes mellitus, and genito-urinary tract infection.
Only seven control subjects generated problem formulations pertaining to at least four out of the five categories, whereas 15 of the Treatment I subjects and 12 of the Treatment II subjects did so.

The patient in film 8 could easily have multiple, concurrent problems (e.g., pregnancy, psychological problems, a G.U. infection and diabetes), but it is also necessary to consider competing explanations for some of her symptoms (e.g., diabetes versus G.I. disorder versus ectopic pregnancy). Thus, the film 8 data suggest that the control subjects' performance was inferior to that of the trained subjects (at least under Treatment I) both with respect to the scope of appropriate subspaces considered, and with respect to the number of competing formulations generated.

It may be noted that one inadequacy of the students' performance under all three conditions was the relatively small number of subjects (4-8) who generated the two specific formulations of ectopic pregnancy and appendicitis. Nearly all physicians indicated that these formulations should be considered early in the workup of the film 8 case due to their immediate life-threatening implications.

To summarize: the results of the supplemental analyses corroborate the conclusions drawn from the hypothesis tests on the variable PF, namely, that the trained subjects (especially under Treatment I) generated more thorough and appropriate sets of problem formulations. Treatment I group performance was found to be superior not only on a quantitative dimension (i.e., number of problem formulations and subspaces generated), but also on a qualitative dimension (i.e., number of subspaces of major importance in which at least one formulation was generated).

Comparison of the Treatment Conditions

The hypothesis tests reported in the first section of this chapter indicated that there were no significant differences between the two treatment groups.
However, it was noted that on every variable examined, in both the major and supplemental analyses, the direction of the difference between the groups was in favor of Treatment I. It is possible that if a more powerful covariate had been employed the observed differences on many of these variables would have been statistically significant. Alternatively, if one were to accept a higher probability of a Type I error (namely, α = .10), most of the differences between treatment groups would be found significant. Thus, there was some evidence to suggest that, if either training procedure is to be preferred, it is the "outcome feedback only" procedure rather than the "outcome plus process feedback" procedure. This final section of Chapter V will first present the results of some supplemental analyses regarding the treatment groups, and second address the question as to why the process feedback was so ineffective.

PF scoring keys were prepared for each of the six training films, and all subjects' responses were scored by the experimenter. Table 23 presents the treatment group means and standard deviations on PF for each of the films. As indicated in the table, the group means are almost identical on the first film, but begin to diverge on the second film, and on the last film are significantly different at the .01 level. Although there is no clear trend across the training tasks, it is possible that with a longer period of training the cumulative effects of the treatments would have led to a significant contrast between the groups on the posttest, and, thus, have provided clear evidence of the superiority of Treatment I on the variable PF.

TABLE 23.--Means and Standard Deviations of Treatment Group PF Scores on the Training Films (1-6).

                        Treatment
Training Film        I            II

1                  20.25        20.44
                   (4.24)       (5.29)
2                  36.44        33.31
                   (7.16)       (7.13)
3                  18.06        17.19
                   (3.36)       (3.25)
4                  33.56        29.50
                   (5.81)       (8.34)
5                  21.50        19.56
                   (6.20)       (5.66)
6 [a]              30.25        25.19
                   (6.29)       (4.26)

[a] The treatment group means on Film 6 are significantly different at p < .01, F = 7.10732, with 1 and 30 degrees of freedom.

A second supplemental analysis was based on the responses to the questionnaire, administered at the end of the posttest session in order to determine the students' opinions of the training procedures and materials. The subject's response to each item in the questionnaire (section 1) was scored on a five-point scale: +2 = strongly agree; +1 = agree; 0 = no opinion; -1 = disagree; -2 = strongly disagree. Table 24 reports the group means and standard deviations for each item. In addition, each subject's score on three summary variables was calculated. These variables were as follows:

1. EV FILM: the subject's evaluation of the six filmed interviews (i.e., the mean of his responses to items 3-9, with the sign reversed for item 5).

2. EV FB: the subject's evaluation of the feedback materials (i.e., the mean of his responses to items 10, 11, 12 and 13, with the sign reversed for item 11).

3. EV GEN: the subject's evaluation of the overall effectiveness of the training materials and procedures (i.e., the mean of his responses to items 12, 17 and 20).

The group means and standard deviations on these variables are reported in Table 25. In order to test the significance of the differences in the group means, a one-way fixed effects analysis of variance was performed on each variable. The results of these analyses are found in Table 26.

TABLE 24.--Mean and Standard Deviation of Treatment Group Responses to Questionnaire Items (Section 1). [a]

                                                      Treatment
Item                                                I          II

 1. The instructions were generally clear         1.438      1.125
    and easy to follow.                           (.512)     (.957)

 2. The instructional sessions were too           -.375      -.188
    long.                                        (1.088)    (1.109)

 3. The actors who played the role of the         1.438      1.438
    patients in the films were very               (.629)     (.512)
    convincing.

 4. The films provided a realistic                1.438      1.188
    simulation of the early part of               (.814)     (.544)
    the clinical workup.

 5. The dialogue in the films was some-          -1.125     -1.063
    times difficult to follow.                    (.342)     (.680)

 6. The physicians in the films did a              .563       .125
    good job of interviewing the patients.        (.727)    (1.088)

 7. I enjoyed watching the films.                 1.438      1.250
                                                  (.629)     (.577)

 8. As I watched the films, I was able to          .875       .438
    put myself into the role of the doctor.       (.885)    (1.094)

 9. The films presented a good selection          1.250       .688
    of medical cases.                             (.775)     (.704)

10. The feedback materials were well              1.375       .875
    organized and easy to follow.                 (.500)     (.806)

11. The feedback materials were sometimes          .313       .563
    overly redundant.                             (.873)     (.892)

12. The opportunity to compare my problem         1.000       .938
    formulations to those of experienced          (.365)    (1.063)
    physicians helped to improve my skill
    in generating initial problem formu-
    lations.

13. I found the feedback materials                1.438      1.000
    interesting.                                  (.512)     (.632)

14. Treatment I: The second viewing of            -.500
    the film helped me to consolidate            (1.155)
    my understanding of the case.

TABLE 24.--Continued.

Item

    Treatment II: The second version of the films, which portrays the physician "thinking aloud," provided me with an understanding of the process by which experienced physicians generate initial problem formulations.

15. Treatment I: The second viewing of the film was not worthwhile.
    Treatment II: The "think aloud" segments in the second version of the films tended to disrupt my own thinking process.

16. The self-evaluation checklists helped me to evaluate my own performance as compared to that of the experienced physician.

17. My ability to generate a set of initial problem formulations has improved as a result of utilizing this instructional package.
18. This instructional package is not appropriate for second-year medical students. 18a. It would be more appropriate for first—year students. 18b. It would be more appropriate for third-year students. 19. For some of the cases, I didn't have sufficient medical knowledge to be able to generate appropriate problem formulation. 20. If a library of films like these, with accompanying feedback materials, was available to medical students, I would make use of it. 21. It would be more interesting to use the films and feedback materials in a group discussion setting (e.g., focal problems class) than in an individual self—instructional format. Treatment I II .625 (1.025) .625 (1.147) .750 ( .931) 1.000 .750 (1.095) (1.125) 1.188 .563 ( .655) ( .814) -1.438 -1.500 ( .512) ( .516) -1.000 - .813 ( .731) ( .911) - .438 .875 ( .814) ( .885) .125 .125 (1.147) (1.025) 1.125 1.000 ( .619) ( .632) .375 .625 (1.088) (1.147) aSs responded to each item on a five-point scale (+2= strongly agree; +l=agree; strongly disagree). 0=no Opinion; -1=disagree; -2: 191 TABLE 25.--Means and Standard Deviations of Treatment Group Scores on Questionnaire (Section 1). Treatment Scorea I II EV FILM 1.1607 .884 ( .397) ( .428) EV FB I 1.031 .563 ( .352) ( .452) EV GEN 1.104 .833 ( .359) ( .632) aRange of scale is from +2 (highly positive) to -2 (highly negative). TABLE 26.--Analyses of Variance on the Questionnaire Scores: EV FILM, EV FB, EV GEN. Score Sources of Variation df MS F p EV FILM Between groups 1 .6129 3.6011 .067 Within groups 30 .1702 31 EV FB Between groups 1 1.7578 10.7143 .003 Within groups- 3g .1641 31 EV GEN Between groups 1 .5868 2.2179 .147 Within groups 30 .2646 31 192 A review of the data in these tables suggests the following conclusions regarding the students' opinions of the training materials and procedure. 
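As an aside on the arithmetic behind Tables 23 to 26, the reported values can be approximately reproduced from the summary statistics alone. The following Python sketch is illustrative only and is not the procedure used in the study: the function names are hypothetical, the group size of 16 is assumed from the design (16 students per condition, consistent with 1 and 30 degrees of freedom), and the tabled standard deviations are assumed to be sample (n - 1) values. Because the inputs are rounded table entries, agreement with the printed figures is only approximate.

```python
from statistics import mean


def summary_score(item_means, reversed_items=()):
    """Mean of questionnaire item scores, flipping the sign of reversed items."""
    return mean(-v if k in reversed_items else v for k, v in item_means.items())


def f_two_groups(mean_1, sd_1, mean_2, sd_2, n):
    """F ratio (1 and 2n - 2 df) for two equal-sized groups, from summary statistics."""
    ms_between = n * (mean_1 - mean_2) ** 2 / 2  # between-groups sum of squares, 1 df
    ms_within = (sd_1 ** 2 + sd_2 ** 2) / 2      # pooled variance when group sizes are equal
    return ms_between / ms_within


N_PER_GROUP = 16  # assumption taken from the design: 16 students per condition

# EV FILM for Treatment I from the Table 24 item means (items 3-9, item 5 reversed).
ev_film_1 = summary_score(
    {3: 1.438, 4: 1.438, 5: -1.125, 6: 0.563, 7: 1.438, 8: 0.875, 9: 1.250},
    reversed_items={5},
)

# F ratios rebuilt from the Table 25 means and standard deviations (compare Table 26).
f_ev_film = f_two_groups(1.1607, 0.397, 0.884, 0.428, N_PER_GROUP)
f_ev_fb = f_two_groups(1.031, 0.352, 0.563, 0.452, N_PER_GROUP)
f_ev_gen = f_two_groups(1.104, 0.359, 0.833, 0.632, N_PER_GROUP)

# F ratio for Film 6 from the Table 23 entries (the table footnote reports 7.10732).
f_film_6 = f_two_groups(30.25, 6.29, 25.19, 4.26, N_PER_GROUP)

if __name__ == "__main__":
    print(f"EV FILM, Treatment I (from item means): {ev_film_1:.3f}")
    print(f"F, EV FILM: {f_ev_film:.2f}")
    print(f"F, EV FB:   {f_ev_fb:.2f}")
    print(f"F, EV GEN:  {f_ev_gen:.2f}")
    print(f"F, Film 6:  {f_film_6:.2f}")
```

The group mean of a summary score equals the (sign-adjusted) mean of the group's item means, which is why EV FILM can be recovered from Table 24 without the individual responses. With two groups of equal size, the between-groups mean square reduces to n(M1 - M2)²/2 and the within-groups mean square to the average of the two group variances, so each F ratio in Table 26 follows directly from the corresponding row of Table 25.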
First of all, it may be noted that, with one exception, both groups evaluated the films, feedback materials and training procedure as a whole in a positive manner (as indicated by mean scores close to 1.0 on each variable in Table 25). The one exception was the Treatment II mean on EV FB, which was half-way between the positive and neutral points on the scale. Thus, we may conclude that the students in both groups reported a generally positive opinion of the materials and procedure.

There were, however, some differences between the groups with respect to their opinions. As indicated in Table 26, the Treatment I mean on EV FB was significantly higher than the Treatment II mean (p < .003). Thus, on the factor which differentiated the two groups--type of feedback--the Treatment I group's opinion was more favorable than that of the Treatment II group. Although the groups did not differ significantly on the other two variables (EV FILM, EV GEN), it should be noted that Treatment I had slightly higher means, and slightly lower standard deviations, on these variables than Treatment II, indicating that at the level of the individual subject there were more persons reporting relatively unfavorable opinions under Treatment II. This observation is also borne out by an examination of the means and standard deviations for single items (see Table 24).

The responses to the final six items in section 1 of the questionnaire provide data on two further points. First, although some subjects in both groups felt that they didn't have sufficient medical knowledge to generate appropriate problem formulations for some of the cases (item 19), nearly all subjects in both groups indicated that they found the training package to be appropriate for second-year students (item 18).
Secondly, the students in both groups indicated generally favorable attitudes toward eventual application of the training materials in a self-instructional and/or group discussion setting (items 20 and 21).

To summarize: Analysis of the responses to section 1 of the questionnaire yielded results that closely paralleled those obtained from the analysis of the posttest data. Differences between treatment groups were largely nonsignificant, but there was some evidence to suggest that the "outcome feedback only" training elicited more favorable student opinions.

We will now address the question of why providing the subject with process feedback, in addition to outcome feedback, did not lead to superior performance on the part of the Treatment II subjects, as was originally anticipated. First of all, it is necessary to consider the possibility that "outcome plus process feedback" is superior to "outcome feedback only," but that its effectiveness was not detected due to some failure in the experimental procedure. Two potential sources of internal invalidity will be considered.

1. Failure of random assignment to yield equivalent groups. Although one can never be sure, in any single experiment, that the treatment groups are actually (as opposed to randomly) equivalent, there is little basis for hypothesizing this factor as an explanation of the ineffectiveness of process feedback. The groups had highly similar scores on the covariate. Moreover, on a second variable of potential relevance to the dependent measures, namely, amount of clinical experience prior to participating in the experiment, any difference that did exist was in favor of the Treatment II group (see Table 27).

2. Extra-session or intra-session "history" as a confounding variable. "History" is the term applied by Campbell and Stanley (1969) to events which occur concurrently with, and are confounded with, the administration of treatment.
In the present study, there are two ways in which the experimental design could have failed to control for the confounding effects of history. The first is extra-session history: principally, the possibility that subjects in one treatment group pursued an interest in the training cases outside of the experimental sessions to a greater degree than subjects in the other group. The second section of the questionnaire was designed to ascertain if this occurred. The subject was asked to indicate for each case: (a) whether he discussed it with other students in his group; (b) whether he discussed it with faculty members; (c) whether he looked up reference materials of relevance to the case. The students' responses to these queries are reported in Table 27. As indicated in the table, there was no evidence of between-group differences with respect to extra-session pursuit of interest in the training cases. It is always possible, of course, that the groups' extra-session history systematically differed in some other way, but this is implausible given the homogeneity of the students' curricular activities.

TABLE 27.--Responses to Sections Two and Four of the Questionnaire.

Variable                                        Treatment I    Treatment II

Pursuit of interest in training cases
outside the sessions (no. of cases):
  (a) discussed with students      Mean            3.8            3.8
                                   Range          (0-6)          (0-6)
  (b) discussed with faculty       Mean            0.4            0.1
                                   Range          (0-3)          (0-2)
  (c) looked up references         Mean            1.4            1.4
                                   Range          (0-4)          (0-4)
  (d) any of (a), (b), (c)         Mean            4.1            4.4
                                   Range          (0-6)          (0-6)

Clinical experience prior to participation
in experiment (40-hour weeks):
  0-12 weeks                                      n = 13         n = 10
  13-52 weeks                                     n = 2          n = 1
  53 or more weeks                                n = 1          n = 5
                                   Median          5.0            6.5
                                   Range         (0-182)        (0-136)

A second possibility is that of between-group differences in intra-session history, due to the fact that training was administered in group sessions. This weakness in the experimental design was recognized at the outset, but could not be avoided because of practical constraints.
It is believed that the use of individual self-instructional booklets to administer the training provided sufficient control against this source of internal invalidity. No events occurred during the sessions to suggest that there were systematic between-group differences in intra-session history. Nevertheless, this possibility cannot be ruled out of consideration.

¹It may be noted that the data in Table 27 also bear on the issue, previously raised, of the students' attitude toward the training materials, with the level of the group means tending to indicate a high level of interest in the training cases on the part of students in both groups.

We will now consider an alternative interpretation for the outcome of the experiment: namely, that the process feedback was in fact ineffective. Assuming that failures in experimental method did not occur, the results indicate that providing the subject with process feedback, in addition to outcome feedback, clearly did not have a positive effect on the development of his problem formulation skills, and may even have had a negative effect, as compared to outcome feedback only. The experimenter's interpretation of this phenomenon rests on two hypotheses. The first is that, having been given the outcome feedback, the Treatment I subjects had no difficulty in inferring what the physicians' reasoning process must have been in order to generate the formulations listed on the outcome feedback sheets. Thus, they were able to provide themselves with self-generated process feedback, and thereby received, in essence, the same "treatment" as the other group. This factor could explain the equivalence of posttest performance by the two groups. However, a second hypothesis is needed in order to account for the evidence suggesting a possible superiority of outcome feedback alone. It would appear that the Treatment II condition may have provided the subject with too much feedback, and thereby led to a diminishing of interest in the task.
Two observations support this hypothesis. First, the experimenter noted that during the presentation of the process feedback films some of the Treatment II subjects did not appear to be actively attending to the film. Second, the questionnaire item on which the largest difference was found between the groups was number 11 (see Table 24), on which the majority of Treatment II subjects agreed that "the feedback materials were sometimes overly redundant" while the majority of Treatment I subjects disagreed. In response to another item (14), the majority of Treatment II subjects indicated that the supplemented films did convey "an understanding of the process by which experienced physicians generate initial problem formulations." However, the real issue seems to be whether the films were necessary to this effect, or whether self-generated process feedback, as apparently occurred under the Treatment I condition, is not more effective from both a cognitive and a motivational standpoint.

A third interpretation to be considered is that process feedback could potentially be effective but was not in this experiment due to inadequacies in the data that were obtained from the experienced physicians. In this study, as in other recent investigations of medical problem solving, introspection was the technique used to obtain data on cognitive processes. As reported in Chapter IV, the strategies for generating problem formulations that were found to be employed by a sizable number of physicians (and thus were included, along with other process-related commentary, in the "think aloud" segments of the films) were neither large in number nor very complex, a factor which may help to account for the Treatment I subjects' apparent ability to generate their own process feedback.
It is always possible, however, that the physician does employ a variety of complex strategies to generate initial problem formulations, but that these strategies have become so routinized as to be inaccessible by means of introspective self-report. If this were the case, and if it were possible to identify these strategies (by using, for example, an approach of the Bruner, Goodnow and Austin (1956) type¹), then it might be found that the provision of process feedback would significantly increase the effectiveness of the training model.

¹The Bruner, Goodnow and Austin (1956) approach attempts "to externalize for observation as many of the decisions as could possibly be brought into the open in the hope that regularities in these decisions might provide the basis for making inferences about the processes involved . . ." (p. 54) (my italics).

CHAPTER VI

CONCLUSIONS AND IMPLICATIONS

This chapter will summarize the major conclusions of this research, and will indicate some implications of these conclusions for future research and instructional development.

Problem Formulation Outcomes and Processes in the Experienced Physician

Conclusions drawn from the analysis of the physician data are necessarily tentative due to the small size of the sample. Nevertheless, since this analysis examined the physician's initial problem formulations in greater detail than has been done in previous research, it may provide some valuable indications as to directions future research in medical problem solving could take.

Analysis of the physician outcome data revealed that what results from the physician's information-processing activity during the early part of the workup is not a unidimensional list of problem formulations, but a structured set of problem formulations which may be described in terms of four features: (1) hierarchical organization, (2) competing formulations, (3) multiple subspaces, and (4) functional relationships.
Of these features, only the second was found to be always present. Thus, it would appear that irrespective of the properties of the case he encounters, the experienced physician always seeks competing explanations for the data obtained during the early part of the workup. Which and how many of the other three features are present, on the other hand, is probably a function: (a) of the properties of the case, and/or (b) of the characteristics of the physician.

One goal of future research should be to determine the way in which the structure of a medical case affects the structure of the physician's initial set of problem formulations, or, in Newell and Simon's (1972) terms, what properties of the task environment determine the structure of the problem space. To answer this question it will not be sufficient to design cases, such as those used in this study and in other investigations, which are simply a representative sample of the cases encountered by the physician in a given domain. Rather it will be necessary to carefully construct cases along a series of structural dimensions, holding some dimensions constant while varying others. A second, and complementary, goal of future research should be to determine what type of individual difference variables, if any, affect the physician's problem formulation outcomes. There is a wide range of variables that could potentially be investigated: e.g., amount and type of clinical experience, area of specialization, cognitive style variables, personality traits.

With respect to the processes involved in generating a set of initial problem formulations, the findings of this study suggest: (a) that the major mechanism involved is associative retrieval, and (b) that the mode of representation is primarily verbal.
Although the early generation of problem formulations is itself a strategy for dealing with the diagnostic task as a whole, the use of strategies to generate these formulations was not found to be a major characteristic of the physician's information-processing activity during the early part of the workup. Only two strategies were consistently employed by at least half of the physician sample and on at least half of the cases: (1) focusing on diseases of high incidence for the patient's demographic group, and (2) attempting to think of competitors to each formulation generated.

Two implications for future research may be drawn from these findings. If the generation of initial problem formulations is largely a process of direct associative retrieval, rather than strategy-guided search, then a major focus of future research should be the investigation of: (a) the organization, or structure, of the physician's store of diagnostic categories in long-term memory, and (b) the properties of the retrieval system (i.e., the indexing of diagnostic categories in terms of cues) which permits access to this store. On the other hand, it is possible that strategies do play a major role in the generation of problem formulations, but that reliance on introspective data, as was the case in this and other recent studies, is not the means for identifying them. Thus, a second goal of future research should be to attempt to devise tasks which require the subject to externalize the steps in his thinking, and which, because of the properties built into the task by the experimenter, permit inferences about the use of strategies from observations of behavior, i.e., a Bruner, Goodnow and Austin (1956), rather than a Newell, Shaw and Simon (1958), approach to the study of cognitive processes. It will, however, be considerably more difficult to devise an appropriate task for the study of medical problem solving than it was for the Bruner, et al. investigation of concept attainment.
The outcomes of these lines of research will have important implications for the training of medical students. If the generation of problem formulations is found to be primarily a process of associative retrieval, this would imply that a major determinant of the student's acquisition of this skill is the way in which his store of medical knowledge is structured. Thus, future instructional research would need to focus not only on devising specific training in the generation of problem formulations (as was done in this study), but also on determining how the medical school basic science curriculum should be organized and taught so as to impart to the student a store of medical knowledge that is structured similarly to that of the experienced physician.

Training of Medical Students in the Generation of Initial Problem Formulations

Conclusions

The results of the training experiment support two major conclusions:

1. That a training model consisting of--(1) problem-solving exercises in which films are used to simulate the conditions of the early part of the clinical workup; (2) feedback based on data from a sample of experienced physicians--is an effective means of improving the second-year medical student's skill in generating a set of initial problem formulations.

2. That the training model is just as effective, if not more effective, for second-year students when it provides outcome feedback only, rather than outcome and process feedback.

The analysis of student posttest performance in terms of four variables (CUE, PF, CUE-PF, and R-PF scores) suggests the following conclusions regarding the nature of the training effect:

1. The ability to detect, encode and make at least limited use of the cues presented during the early part of the workup is already well established in the second-year medical student, and thus shows no improvement as a result of the training.

2. The major effect of the training, therefore, lies in the improvement of the student's ability to make use of cues, once obtained, in order to generate a thorough and appropriate set of initial problem formulations.

3. The ability of the second-year medical student to classify cues obtained with respect to problem formulations generated (in a flexible and appropriate manner) does not in itself improve as a result of training. But his performance on this variable does increase because of the improvement in the scope and appropriateness of his problem formulations.

4. There is no change in the student's ability to identify possible functional relationships among the problem formulations he generates.

Implications for Future Research

Given the effectiveness of the "simulation exercises, plus feedback" model as a means of training medical students in the generation of initial problem formulations, one line of future research would be to consider ways of applying the model with respect to other problem-solving skills involved in the remainder of a clinical workup: namely, the testing of initial problem formulations by further data collection, the revision of initial formulations and the generation of new formulations in light of additional data obtained as the workup progresses, and, ultimately, the making of diagnostic decisions at the close of the workup. Since an essential feature of the physician's activity, after the earliest part of the workup, is the selection of clinical procedures that will provide data to test his problem formulations, it would be necessary to use some other medium than films to simulate the conditions of the remainder of the workup. Booklets with "rub out" answer sheets, of the type employed in the Patient Management Problems developed by McGuire (McGuire and Solomon, 1971), provide an effective (yet relatively low cost) means of simulating sequential decision-making regarding the selection of clinical procedures. Modification of the McGuire format to provide feedback, based on physician performance data, at various key decision points in the workup (e.g., between history and physical, between physical and lab) would be one means to test the effectiveness of the training model with respect to skills other than the generation of initial problem formulations.

For example, if one were interested in training which focused on the testing of problem formulations, the training exercise could begin by providing the student with a set of initial formulations to be tested. After he has completed a data selection sequence (e.g., the history) and recorded his interpretation of the data with respect to the formulations he was given, he could be given feedback: (a) on the data that experienced physicians selected, and (b) on the way in which experienced physicians interpreted the data. If, on the other hand, one were interested in providing the student with training in both the generation and testing of problem formulations, it would be possible to construct a more comprehensive simulation exercise, using films with feedback for the early part of the workup, and McGuire type materials with feedback for the remainder of the workup.

A second line of research would be to determine if there are not less expensive means of simulating the early part of the workup than motion picture films of the type used in this experiment. It seems probable, in retrospect, that a set of color slides of the patient, plus a tape recording of the doctor-patient dialogue, would be just as effective as a motion picture film, and much less costly. A slide-tape combination would maintain the realism of the audio aspect of the simulated encounter with the patient, but would, of course, reduce the realism of the visual aspect.
The question, therefore, is how much does a moving picture of the patient, as compared to a series of still images, contribute to the development of the student's skill in generating problem formulations. Certain types of visual cues--such as the patient's gait as he enters, the way he sits down, his changes of position in the chair, the movements of his head and arms, his changes of expression--could not be adequately conveyed by slides. However, examination of both the physicians' and the students' responses reveals that no cues of this type were used to generate problem formulations. The visual cues that the physicians and students did use to generate problem formulations were either essentially static in nature (e.g., the patient's physical build, or his dress) and thus could be easily conveyed by means of slides, or were movements and gestures whose cue properties could be fairly effectively captured in slides (e.g., the patient points to the location of his pain, clutches his abdomen or splints his chest wall with his arm; the patient rests his head on his hand, squints his eyes or exhibits an expression of pain). Thus, for the particular cases used in this study, as well as for a wide range of other possible cases, a moving picture may not play an essential role in conveying visual cues to the learner.

Slides would simplify the learner's task by providing still images of the visual cues of relevance to the generation of problem formulations, whereas a film requires the student to detect such cues from the on-going flow of images of the patient. However, the results of the experiment indicate that, at least in the case of second-year medical students, the ability to detect relevant cues on the basis of naturalistic observation of the patient in motion was already well established; thus, the use of motion picture films was not needed to develop the students' skill in this domain.
For first-year medical students, however, a motion picture film might provide needed practice in cue detection, and, therefore, substantially contribute to the training's effectiveness.

A third line of future research that may be proposed pertains to the feedback component of the training model. First, it would be of interest to determine the degree to which feedback contributes to the effectiveness of the model by comparing students' performance under two experimental conditions: (1) simulation exercises with feedback; (2) simulation exercises without feedback. A second question of interest is whether there may be an ordinal interaction between the type of feedback provided ("outcome" versus "outcome and process") and the level of medical knowledge and skill of the student. The results of this experiment indicate that, for second-year medical students, provision of process feedback, in addition to outcome feedback, clearly had no positive effect on the development of problem formulation skills, and may even have had a negative effect, as compared to outcome feedback only. In Chapter V the following explanation was advanced to account for these results. First, it was argued that, having been given outcome feedback, the student had no difficulty in reconstructing what the physicians' reasoning processes must have been, and thus was able to provide himself with self-generated process feedback. Second, it was proposed that, by providing the student with process feedback which he could generate for himself, the feedback became overly redundant; thus, the student's level of cognitive involvement in the task, as well as his motivation to carry out the task, diminished. It is possible, however, that process feedback would be effective with students at an earlier point in the medical school curriculum.
Unlike the second-year student, the first-year student may not have acquired a sufficient level of medical knowledge and skill to be able to reconstruct for himself the processes by which the experienced physician arrives at a given set of problem formulation outcomes. Thus, process feedback materials could be effective in enabling him to understand and assimilate the outcome feedback materials. Replication of the experiment with student entry level as a factor in the design (e.g., third term first-year students and third term second-year students) would be useful to determine whether different types of feedback are appropriate for students having different levels of medical knowledge and skill. The expense of producing one of the training exercises increases considerably with the addition of process feedback: (a) because of the sizable technical cost involved in adding the "think aloud" segments to the standard training film, and (b) because the collection of the process data from a sample of physicians requires lengthy individual sessions, whereas outcome data alone can be easily obtained in group sessions. Thus, unless it can be demonstrated that the addition of process feedback makes the training exercises considerably more effective for some medical-student populations, further development of such materials would not be warranted.

Whatever direction is taken by future research on problem-solving instruction (in medicine, or in other domains), two general methodological recommendations may be offered. It is believed that the positive outcome of this experiment (at least with respect to the treatment versus control hypothesis) was due in part to the relatively sizable period of training which the students received.
A series of one to two-hour training sessions over a period of several weeks (rather than the all too frequent single session, 60-minute treatment) would seem to be required in order for a training model (in the area of problem solving) to be properly tested. Secondly, it is believed that the use of multiple dependent measures, each focused on one component of the overall complex of skills that could potentially be affected by the training (as was the case in this study), is likely to be necessary for training experiments to yield meaningful results. Since some of the skills that are involved in a problem-solving task may improve as a result of training, while others do not, a battery of quite specific dependent measures would seem to be required in order to determine not only whether the training had an effect but also the precise nature of its effect.

Instructional Applications

In conclusion, we would like to suggest that, even without further research and development, the materials produced and tested in this study may have a number of useful applications within the current medical school curriculum.

1. Self-instruction. Each of the simulation exercises could be easily packaged as a self-contained unit (i.e., film cassette, plus instructional booklet, plus response booklet), and the set of such units made available to students for use on an individual basis. The students' responses to the questionnaire indicated that if a "library" of such units were available, most students would make use of it.

2. Group instruction. The materials could also be used in group settings, such as a "focal problems" class. In such a setting it would probably be quite effective to have a group discussion (in which the students compare and criticize the various outcomes they arrived at) prior to presentation of feedback on the physician outcomes.
In responding to the questionnaire, many students indicated a preference for using the materials in a group discussion setting, rather than on an individual self-instructional basis.

3. Evaluation. The films produced in this study could be used in designing more effective evaluation instruments for clinically-oriented coursework or clerkships. Although the films provide a simulation of only the early part of the workup, they could be combined with booklets of additional clinical findings (i.e., further history, plus physical and lab data) in order to evaluate a wide range of clinical competencies.

The results of this study indicate that either version of the training materials could be an effective instructional tool for improving second-year medical students' skill in generating initial problem formulations. Although neither version of the materials was found to be conclusively more effective than the other, there was some evidence to suggest that, at least for second-year students, the outcome feedback version (with certain modifications recommended in Appendix J) is likely to be most appropriate for instructional use. It is hoped, moreover, that medical educators will find ways of adapting, expanding and improving upon the materials that were developed in this study.

APPENDICES

APPENDIX A

CASE OUTLINE FOR FILM 1: "A 21-year-old College Senior"

Written sheet:
  sex: Male
  age: 21
  occupation: college student (senior)
  temperature: 99°

Verbal dialogue:

Complaints (with attributes)

1. weakness and exhaustion
--feeling increasingly weak and exhausted for some time (about 2 months)
--so bad during last couple weeks that he can hardly walk across campus
--lives in 3rd floor apartment, and by the time he climbs to the top of the stairs he is completely exhausted and has to flop down on the couch
--after exertion his legs feel wobbly and weak
--no shortness of breath

2. abdominal pain
--began about 4-5 months ago
--below beltline, on right side only (show)
--heavy, sharp pains, like a big heavy rock
--pain comes on in the evening, about 8-9 o'clock and usually lasts until about midnight when he falls asleep; when he wakes up the pain is gone
--he hasn't noticed anything (e.g., certain foods or activities) that makes the pain better or worse
--he sometimes takes aspirin but this does not seem to reduce the pain
--in the beginning, the pain was irregular (it would come for a few nights and then go away for a week); but during the last couple months it has gotten worse (more frequent and more intense)

3. weight loss
--he's lost about 25 pounds over the past 3-4 months
--general loss of weight
--he's been eating normally; in fact, he's been eating more than usual (regular meals, plus snacks between meals)

4. diarrhea
--began about 3 months ago (about a month after he first noticed the pain)
--at first, 2 stools a day, more recently 3-4 stools a day (in the past, he normally had 1 stool a day)
--stools are soft, loose, but not watery
--has noticed quite a lot of mucus in stools, and sometimes some blood, and pieces of food

Other

5. No vomiting.

6. Nothing special has happened during the past few months (no school or social problems). He's always worked very hard, plans his study habits and is very organized about his work. He gets good grades, but has to work hard for them. The exhaustion and pain are getting in his way, making it difficult for him to work, and he's worried about being able to finish the term and graduate in June.

Nonverbal cues:
  thin, appears tired
  intelligent, well-dressed
  answers questions carefully and conscientiously
  appears tense, but not really nervous

NOTE: The patient initially presents the complaint of exhaustion. When asked if he has any other problems, he mentions the pain. Further questions by the doctor elicit items 3-6.

APPENDIX B

PROCESS CHECKLIST

Subject ________ Film ________

Check as many items as apply. Do not check items that you consider to be praiseworthy, or which describe your clinical approach in general. Check only those items which characterize your thinking while viewing this film.

1. As the patient described his complaints, this information elicited a sort of "mental list" of possible problem formulations.

2. In attempting to come up with problem formulations, I made an effort to think of the most serious (life-threatening) types of diseases that the patient might have.

3. On the basis of the information presented in the film, I made some quick "rule-outs" of several problem formulations.

4. In attempting to come up with problem formulations, I focused on pathophysiological processes (i.e., what aspect of physiology is disturbed, and what could cause this disturbance?)
5. I waited until I had some data on each of the patient's major complaints before attempting to look for interrelationships among the symptoms.

6. I tried early on to form a general impression of the patient (his personality, intelligence, background, etc.) that would help me to judge whether he was giving me accurate, objective history information.

7. In attempting to arrive at problem formulations, one or more sorts of "mental images" came to mind.

8. One particularly salient piece of data immediately brought to mind one or more problem formulations.

9. Given the patient's major complaints, I tried to think of as many different organic causes as possible.

10. I assumed that the patient's problem was organic, unless confronted with evidence to the contrary.

11. One of the first things I tried to do was to localize the patient's problem in terms of some organ system.

12. In attempting to come up with problem formulations, I tried to think of those illnesses that have a high statistical incidence for persons of the patient's sex, age group, occupational group, etc.

13. In attempting to come up with problem formulations, I gave a great deal of weight to the first complaint mentioned by the patient.

14. In attempting to arrive at problem formulations, I focused on a couple pieces of data that appeared to be most critical, and paid less attention to the other pieces of data.

15. It was the combination of the patient's major complaints that led me to think of one or more problem formulations.

16. One of the first things I tried to do was to determine whether the patient's problem was organic or psychogenic.

17. I tried to come up with at least one problem formulation that would account for all of the data presented.

18. One of the first things I tried to do was to determine whether the patient's problem was acute or chronic.
19. In attempting to come up with problem formulations, I focused primarily on those illnesses that are most common, given the patient's chief complaints and demographic characteristics.

20. In thinking of problem formulations, I paid a great deal of attention to nonverbal characteristics of the patient, e.g., his build, posture, facial expression, gestures, emotional state, etc.

21. In attempting to arrive at problem formulations, it helped me to try to visualize (i.e., to form some sort of "mental image" of) the anatomical location of the problem.

22. As each piece of data was presented I tried to think of how it might be related to the other pieces of data.

23. As I observed the patient and listened to his complaints, I recalled another patient (or patients) that I had seen before.

24. As soon as a problem formulation came to mind, I made an effort to think of other problem formulations that need to be considered.

25. In attempting to come up with problem formulations, I paid attention primarily to what the patient was saying.

APPENDIX C

TRAINING MATERIALS

This Appendix contains the following materials:

1. The "Introduction" section of the Instructional Booklet (Treatment I).
2. The "Film 1" section of the Instructional Booklet (Treatment I).
3. The "Self-Evaluation Checklist" for Film 1 (both treatment groups).

INTRODUCTION

This instructional package focuses on one aspect of the physician's activity in conducting a clinical workup: namely, the generation of a set of initial problem formulations during the first minutes of an encounter with a patient. The materials have been designed to provide you with the opportunity to practice generating initial problem formulations for a variety of medical cases.
For each case, an instructional sequence consisting of three basic components will be followed: (1) you will view a film of the first 4-6 minutes in a doctor-patient encounter; (2) having viewed the film, you will record the problem formulations you have generated and write a tentative assessment; (3) you will be provided with "feedback materials" which describe the problem formulations and tentative assessments generated by a group of experienced physicians who have viewed each film.

WHAT IS "AN INITIAL PROBLEM FORMULATION"?

During the first minutes of a clinical workup, as the patient's presenting complaint(s) are elicited, the experienced physician usually generates a set of initial problem formulations. These initial formulations serve two functions:

(1) They are a set of mental categories under which the physician organizes and classifies the data obtained during the first minutes of the workup.

(2) They are a set of tentative hypotheses which he will subsequently test by collecting further data during the remainder of the workup.

The specificity of an initial problem formulation will depend on the adequacy of the data obtained during the first minutes of a workup.

--An initial problem formulation may be highly general, e.g., "organic disorder," "psychological problem,"
--It may refer to an organ system, e.g., "renal disorder,"
--It may refer to a disease mechanism, e.g., "infection,"
--It may refer to both an organ system and a disease mechanism, e.g., "renal infection,"
--It may be a highly specific diagnostic label, e.g., "glomerulonephritis."

The number of initial problem formulations a physician generates will depend on the number of distinct complaints the patient presents and the degree to which these complaints are consistent with a single or multiple formulations.

In actual practice, the physician does not record his initial problem formulations.
Rather, he mentally stores them, evaluates them with respect to the data he collects, reformulates them or generates new formulations as needed, and, at the end of the workup, records those problem formulations he has retained as entries on a Problem-Oriented Record. In using this instructional package, however, your task will end with the recording of the initial problem formulations you have generated after viewing a 4-6 minute film of an encounter with a patient.

COMPONENTS OF AN INITIAL PROBLEM FORMULATION

Each of your initial problem formulations should include two components, and, in some cases, may include a third component:

1. a problem formulation title,
2. a list of cues,
3. (optional) a list indicating more specific diagnostic possibilities under consideration, and the cues of particular relevance to each.

A problem formulation title is a label, having potential diagnostic and/or management implications, under which you are able to group elements of data (or cues) presented in the filmed interview. It should be stated at a level of specificity that is appropriate to the available data.

A cue list should include all elements of data that are relevant to the problem formulation title under which they are listed. The list may include both information reported verbally by the patient and nonverbal cues which you observe.

For some problem formulations, there may be particular cues which suggest one or several more specific diagnostic possibilities. For example, among the cues which have led you to generate the problem formulation of "renal disorder" there may be several cues which point to "glomerulonephritis" as one diagnostic possibility to be considered. In this case, you would record "glomerulonephritis" (plus the cue(s) of particular relevance to it) as a diagnostic possibility under the problem formulation "renal disorder."
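The three components described above (title, cue list, and optional diagnostic possibilities) can be pictured as a small record structure. The Python sketch below is only an illustration and is not part of the original instructional booklet: the class names `Cue`, `DiagnosticPossibility`, and `ProblemFormulation`, the `nonverbal` flag, and the placeholder cue strings are all assumptions introduced here. Only the titles "renal disorder" and "glomerulonephritis" come from the example in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cue:
    """One element of data from the interview (hypothetical representation)."""
    text: str
    nonverbal: bool = False  # True for cues observed rather than reported verbally


@dataclass
class DiagnosticPossibility:
    """Optional third component: a more specific possibility and its cues."""
    title: str
    cues: List[Cue] = field(default_factory=list)


@dataclass
class ProblemFormulation:
    """Title, cue list, and (optionally) more specific diagnostic possibilities."""
    title: str
    cues: List[Cue] = field(default_factory=list)
    possibilities: List[DiagnosticPossibility] = field(default_factory=list)

    def render(self) -> str:
        # Lay the record out roughly as a response sheet would be filled in.
        lines = [f"Problem formulation title: {self.title}", "Cue list:"]
        for cue in self.cues:
            suffix = " (nonverbal)" if cue.nonverbal else ""
            lines.append(f"  {cue.text}{suffix}")
        for possibility in self.possibilities:
            lines.append(f"Diagnostic possibility: {possibility.title}")
            lines.extend(f"  {cue.text}" for cue in possibility.cues)
        return "\n".join(lines)


def example_formulation() -> ProblemFormulation:
    # Placeholder cues; the real cues would come from the filmed interview.
    reported = Cue("placeholder cue reported by the patient")
    observed = Cue("placeholder cue observed by the viewer", nonverbal=True)
    return ProblemFormulation(
        title="renal disorder",
        cues=[reported, observed],
        possibilities=[DiagnosticPossibility("glomerulonephritis", [reported])],
    )


if __name__ == "__main__":
    print(example_formulation().render())
```

In this sketch a cue listed under a diagnostic possibility is also kept in the formulation's own cue list, which mirrors the statement that the more specific possibility is recorded under the broader problem formulation.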
WRITING A TENTATIVE ASSESSMENT

After you have recorded your initial problem formulations for a case, you will be asked to write a brief paragraph giving your tentative assessment of these formulations.

Your tentative assessment for each case will discuss the set of initial problem formulations you have generated after a 4-6 minute encounter with a patient. It should indicate:

--how well substantiated you consider each of your problem formulations to be on the basis of the data obtained thus far;
--whether you anticipate that the patient has a single illness that will account for his various problems, or that he has multiple disorders;
--whether you consider there to be any relationships among your problem formulations. For example, you may consider one problem formulation to be secondary to, superimposed on, or contributing to, etc. some other formulation.

You will note that in using this instructional package the format to be followed in recording problem formulations and tentative assessments differs somewhat from the usual Problem-Oriented Record format.

In the usual Problem-Oriented Record: an assessment is written for each problem formulation, and discussion of various diagnostic possibilities is included in each assessment.

In this instructional package: you are asked to list specific diagnostic possibilities (if any) under each problem formulation, and to write a single tentative assessment summarizing all of the problem formulations you have generated for a case.

A reminder....

As you view the films and attempt to generate a set of initial problem formulations, keep in mind that these initial formulations are tentative hypotheses which you would want to investigate more thoroughly if you were to continue the workup beyond the first 4-6 minutes presented in the film.

THE INSTRUCTIONAL MATERIALS

The Films

Color films will be used to simulate your encounter with eight patients.
Each of these films presents a "physician's eye view" of the first 4-6 minutes in an office visit with a new patient. Throughout the interview the film focuses on the patient; the physician's voice is heard but he is never seen. While viewing the film, you should attempt to put yourself in the role of the physician. Pretend that the patient is sitting in front of you and talking to you.

The Response Booklet

The Response Booklet is divided into eight sections. There is one section for each of the eight films you will view. Each section contains the following materials: (1) a set of response sheets on which you will record the problem formulations you have generated; (2) a sheet on which you will write your tentative assessment; and (3) a self-evaluation checklist to be filled out at the end of the instructional sequence.

The Feedback Materials

The feedback materials summarize the initial problem formulations and tentative assessments generated by a group of eight experienced physicians who have viewed the films. The purpose of these materials is to provide you with a means of comparing your own performance on each case to that of experienced physicians.

THE INSTRUCTIONAL SEQUENCE

For each of the filmed interviews, the same instructional sequence will be followed. The steps in the sequence are summarized below. This summary is intended to provide you with an overview of the instructional sequence. Complete instructions for each step will be repeated, as appropriate, throughout the INSTRUCTIONAL BOOKLET.

STEP 1. You will read the "nurse's sheet" for the patient in the film. This sheet indicates the patient's name, sex, age, occupation, and temperature (taken orally by the nurse).

STEP 2. You will view the film of the 4-6 minute interview of the patient. As you view the film, you should generate a set of initial problem formulations.

STEP 3. You will record the problem formulations you have generated, and write a brief tentative assessment.

STEP 4.
You will be provided with feedback materials describing the performance of the group of experienced physicians.

a. You will be provided with "Feedback Sheet 1." This sheet presents the major problem formulations generated by the physicians; i.e., those formulations generated by all or nearly all physicians who viewed the film.

b. You will view the film a second time. This viewing of the film will provide you with the opportunity to repeat your encounter with the patient. As you view the film, attempt to reconstruct in your own mind the reasoning process which led the physicians to generate the problem formulation(s) listed on Feedback Sheet 1.

c. You will be provided with "Feedback Sheet 2." This sheet has two sections. The first section presents additional problem formulations generated by some of the physicians who viewed the film. The second section (entitled "Summary") describes the physicians' tentative assessments.

STEP 5. You will fill out a self-evaluation checklist designed to aid you in comparing your performance to that of the experienced physicians.

GUIDELINES FOR COMPLETION OF THE PROBLEM FORMULATION RESPONSE SHEETS

In reading these guidelines, you should refer to the two sample problem formulation sheets on pages 13-16. Both of these sheets apply to the same patient.

1. At the top of each response sheet, list the title of one problem formulation you have generated. Underneath, in the space provided, list the cues (i.e., all pieces of relevant data) for this formulation.

2. In listing the cues, try to record, as closely as possible, the words used by the patient, or, in the case of nonverbal cues, your actual observation. If you make any interpretations or inferences based on a cue, put these in parentheses. For example: "swollen fingers (edema?)"

3. List both "positive" cues (i.e., cues that tend to confirm a problem formulation) and "negative" cues (i.e., cues that tend to disconfirm a problem formulation).
If you consider a cue to be "negative" for a problem formulation, indicate this by writing "(neg.)" in front of the cue. (See the examples on page 15.)

4. A cue may be listed under more than one problem formulation. A cue which is listed as "positive" for one problem formulation may be listed as "negative" for some other problem formulation.

5. For some of your problem formulations, there may be certain cues which suggest one or several more specific diagnostic possibilities. If this occurs, then on the back of the problem formulation response sheet, you are to list: (a) the title of each diagnostic possibility under consideration for that problem formulation; (b) next to each possibility, in the space provided, the cues that are of particular relevance to it. (See the example on pages 13-14.)

6. Write legibly and avoid abbreviations.

7. If you want to take notes while viewing a film, you may do so. Use a sheet in the Response Booklet for note-taking, and write "NOTES" at the top of the sheet.

The following pages present response sheets for a sample patient, including (a) two examples of initial problem formulations, and (b) an example of a tentative assessment based on these formulations. Take several minutes to look over these sheets and to review the preceding "Guidelines." You can also look back over any of the materials presented up to this point in this booklet.

[Sample response sheet, handwritten in the original and only partly legible in this copy. Film: Example. Problem formulation title: renal disorder. The cue list appears to include swollen fingers and ankles (edema?), loss of appetite, vomiting, darker urine (hematuria?), weakness and fatigue, headache, and dizziness.]

More specific diagnostic possibilities (if any) are to be listed on the back of the sheet.
[Sample response sheets, continued; handwritten in the original and only partly legible in this copy. Back of the first sheet, headed "More specific diagnostic possibilities which you have considered for this problem formulation (if any)": one possibility, apparently a form of nephritis, with cues of particular relevance that appear to include swollen fingers and ankles (edema?), darker urine, and a sore throat 2 weeks ago. Second sheet: Film: Example; problem formulation title: congestive heart failure; the cue list repeats several cues from the first sheet and adds cues marked "(neg.)"; the back of this sheet appears to list no more specific diagnostic possibilities. TENTATIVE ASSESSMENT (Film: Example): the legible portion states that the patient's symptoms point strongly to some form of renal disorder and mentions the possibility of glomerulonephritis; the remainder is illegible.]
FILM 1

STEP 1. Here is the nurse's sheet for the patient in film 1.

Name: [illegible]
Sex: Male
Age: 21
Occupation: college student (senior)
Temperature: 99° F

STEP 2. The film of the 4-6 minute interview with the patient will now be presented. While viewing the film, you should generate a set of initial problem formulations which you would want to investigate more thoroughly if you were to continue the workup beyond the first 4-6 minutes presented in the film.

(PRESENTATION OF THE FILM)

STEP 3. Turn to the section of the Response Booklet for film 1. Record the problem formulations you have generated. Fill out one response sheet for each problem formulation. You may refer back to the "Guidelines" for completion of these sheets, on pages 10-11, if you wish.

(RECORD PROBLEM FORMULATIONS)

After you have recorded your problem formulations, write a brief paragraph giving your tentative assessment of the case. Your assessment should indicate:

--how well substantiated you consider each of your problem formulations to be on the basis of the data obtained thus far;
--whether you anticipate that the patient has a single illness that will account for his various problems, or that he has multiple disorders;
--whether you consider there to be any relationships among your problem formulations. For example, you may consider one problem formulation to be secondary to, superimposed on, or contributing to, etc. some other formulation.

(WRITE TENTATIVE ASSESSMENT)

After you have written your tentative assessment, go on to the next page.

STEP 4. You will now be provided with feedback on the performance of the group of experienced physicians who viewed this film. Turn to the next page and read Feedback Sheet 1. Check your response sheets to see if they include the major problem formulation(s), listed on Feedback Sheet 1, which were generated by all or nearly all of the physicians.
Film 1
Feedback Sheet 1

Major Problem Formulations

All physicians who viewed this film generated the following problem formulation: GASTROINTESTINAL DISORDER.

Under this formulation, all physicians listed the possibility of inflammatory bowel disease, such as: ULCERATIVE COLITIS, or REGIONAL ENTERITIS/ILEITIS.

The following table presents the cues listed as relevant to GI DISORDER in general, and to ULCERATIVE COLITIS or REGIONAL ENTERITIS/ILEITIS in particular.

GI DISORDER: ULCERATIVE COLITIS, or REGIONAL ENTERITIS/ILEITIS

GI disorder
    pain in right lower quadrant of abdomen, for 4 months
        occurs in evening, lasting several hours
        not relieved by aspirin or Darvon
        not related to foods
    diarrhea: increase in number of stools, from 1 to 4-5/day, over 3 month period
    mucus in stools
    blood in stools
    pieces of food in stools
    weight loss, 25 lbs. in 1-2 months
    good appetite, eating more than usual
    extreme fatigue and weakness, for 2 months
    no vomiting

Ulcerative colitis, or regional enteritis/ileitis
    diarrhea
    blood and mucus in stools
    weight loss of 25 lbs., with good appetite
    age 21
    college senior: under academic stress
    concerned about keeping up with studies

FILM 1

STEP 4. Feedback (continued)

The film will now be presented for a second time. As you view the film, attempt to reconstruct in your own mind the reasoning process which led the physicians to generate the problem formulation(s) listed on Feedback Sheet 1.

(PRESENTATION OF THE FILM)

Now turn to the next page and read Feedback Sheet 2. Compare your problem formulations to those listed on the feedback sheets. Compare your tentative assessment to the physicians' assessments described in the "Summary" on Feedback Sheet 2.
Film 1
Feedback Sheet 2

Additional Problem Formulations

Under the formulation of GI DISORDER, most physicians listed, in addition to inflammatory processes (ulcerative colitis, or regional enteritis/ileitis), two other possibilities: INTESTINAL MALIGNANCY, PSYCHOGENIC PROBLEM. The cues listed as particularly relevant to each of these possibilities are presented below.

GI DISORDER: INTESTINAL MALIGNANCY; PSYCHOGENIC PROBLEM

GI disorder
    (See cues listed on Feedback Sheet 1.)

Intestinal malignancy
    blood in stools
    weight loss of 25 lbs.
    extreme fatigue
    (neg.) age 21

Psychogenic problem
    initial impression: patient appeared tense and anxious
    college senior: under academic stress
    concerned about keeping up with studies
    (neg.) blood in stools
    (neg.) weight loss of 25 lbs. with good appetite
    (neg.) patient appeared to be frank and objective in reporting symptoms
    (neg.) reported no major difficulties with school or personal life

Film 1
Feedback Sheet 2 (Continued)

In addition to the formulation of GI DISORDER, some physicians generated the following problem formulations: ANEMIA, DUE TO GI BLOOD LOSS; CARDIOVASCULAR PROBLEM; DIABETES MELLITUS.

ANEMIA, DUE TO GI BLOOD LOSS
    extreme fatigue, for 2 months
        especially on exertion
    weakness in leg muscles
    blood in stools
    onset of diarrhea (with blood?) preceded onset of fatigue by 1-2 months

CARDIOVASCULAR PROBLEM
    extreme fatigue, for 2 months
        especially on exertion
    weakness in leg muscles
    (neg.) no shortness of breath

DIABETES MELLITUS
    extreme fatigue and weakness, for 2 months
    weight loss, 25 lbs. in 1-2 months
    good appetite, eating more than usual

Film 1
Feedback Sheet 2 (Continued)

Summary (of the physicians' tentative assessments)

All physicians stated that they anticipate a single illness: most probably some type of gastrointestinal disorder.
All indicated that, given the patient's symptoms and his age, inflammatory bowel disease, such as ulcerative colitis or regional enteritis/ileitis, is the most likely type of GI disorder for him to have. An alternative hypothesis, intestinal malignancy, was considered by most physicians, but was judged to be less likely than the inflammatory processes under consideration.

Most physicians also gave consideration to the possibility of a GI disorder of psychogenic origin. However, they tended to conclude that the patient probably has an organic GI disorder, such as ulcerative colitis, which may be aggravated by psychological factors (e.g., academic stress), rather than a problem that is primarily psychogenic in nature.

Due to the patient's extreme fatigue and weakness, some physicians generated the formulation of anemia, due to blood loss from the GI tract, and several generated the formulation of cardiovascular problem. They tended to conclude that a cardiovascular problem (unrelated to his GI difficulties) is quite unlikely, whereas anemia (secondary to intestinal inflammation or malignancy) is very possible.

Several physicians indicated that diabetes mellitus is another possibility to explore, although on the basis of the available data it does not appear to be very likely.

FILM 1

STEP 5. Now turn to the self-evaluation checklist at the end of the Response Booklet section for film 1.

(FILL OUT SELF-EVALUATION CHECKLIST)

SELF-EVALUATION CHECKLIST

INSTRUCTIONS

The checklist is designed to aid you in evaluating your own problem formulation performance as compared to that of the experienced physicians.

Part A of the checklist presents the titles of all problem formulations (and diagnostic possibilities) generated by the physicians. Part B presents the statements regarding possible relationships between problem formulations that appeared in the physicians' tentative assessments.
Place a check next to each item in the checklist which corresponds to one of your own responses. In order to check an item, there does not have to be an exact correspondence in wording; a general equivalence in meaning is sufficient.

(FILL OUT THE CHECKLIST)

    If you have checked...                          You may consider that the degree of
                                                    correspondence between your own
                                                    performance and that of the
                                                    experienced physicians is...

    All items marked (*), and some (or all)
    of the other items;                             high

    All items marked (*), and none of the
    other items; OR
    Some items marked (*), and some of the
    other items;                                    moderate

    None of the items marked (*), and some
    (or none) of the other items.                   low

Note: It should be borne in mind that the checklist is not necessarily exhaustive. If a larger group of physicians had viewed the film, it is possible that the checklist would have included some additional items. If your own set of problem formulations and/or tentative assessment included items that do not appear in the checklist, you cannot evaluate them by means of the checklist, but this does not necessarily mean that they are inappropriate.

SELF-EVALUATION CHECKLIST

Before filling out the checklist, read the instructions on the opposite page.

CHECKLIST FOR FILM 1

Part A: Problem formulations, and diagnostic possibilities

    1. GI disorder*
        a. ulcerative colitis, and/or regional enteritis*
        b. intestinal malignancy
        c. psychogenic problem
    2. anemia due to GI blood loss
    3. cardiovascular problem
    4. diabetes mellitus

Part B: Relationships between problem formulations

    1. anemia secondary to ulcerative colitis/regional enteritis, or intestinal malignancy

Note: An (*) indicates that the item was included in the responses of all or nearly all of the physicians.

APPENDIX D

RECOGNITION OF CUES

Film 8

Place a check in front of each cue that you remember seeing or hearing in the filmed interview.
__  1. has been drinking a lot recently
__  2. her voice quavers
__  3. has recently noticed some blurring of vision
__  4. abdominal pain is localized in right lower quadrant
__  5. is leaning forward, clutching abdomen
__  6. has vaginal discharge, along with itching
__  7. female
__  8. abdominal pain
__  9. no fever
__ 10. obese
__ 11. vaginal itch has re-occurred several times during past year
__ 12. feels nausea after eating fatty foods
__ 13. has been "tired & edgy" recently
__ 14. vaginal itch re-occurred 2 days ago
__ 15. boyfriend used contraception
__ 16. has had intermittent headaches recently
__ 17. sudden onset of abdominal pain this a.m.
__ 18. urine has had darker color recently
__ 19. breasts sore and swollen, for 1 week
__ 20. stopped taking birth control pills 1 year ago, due to vaginal itch
__ 21. pain radiates from abdomen to back
__ 22. burning on urination, 1 week
__ 23. has been feeling "funny" recently
__ 24. no recent change in bowel movements
__ 25. has noticed shortness of breath recently
__ 26. abdominal pain is diffuse
__ 27. projectile vomiting this a.m.
__ 28. pain relieved by lying down
__ 29. has been eating a lot of candy and cookies recently
__ 30. has felt "very depressed" recently
__ 31. last menses 8 weeks ago
__ 32. no bowel movement this a.m.
__ 33. menstrual cycle is usually irregular
__ 34. has felt dizzy recently
__ 35. student nurse
__ 36. anxious manner & tone of voice
__ 37. nausea for several days
__ 38. abdominal pain is intermittent and crampy
__ 39. has noticed no recent weight change
__ 40. vomiting began after breakfast this a.m.
__ 41. reports tingling sensation in her arms
__ 42. increased appetite recently
__ 43. skin rash on hands for several days
__ 44. has broken off with boyfriend 1 month ago
__ 45. has noticed recent increase in frequency of urination
__ 46. age 19
__ 47. sexually active
__ 48. steady increase of abdominal pain
__ 49. increased thirst recently
__ 50. abdominal pain relieved by vomiting
__ 51. was using contraceptive foam after stopped taking birth control pills
__ 52. has been eating irregular meals recently
__ 53. has pain on intercourse
__ 54. has not been able to concentrate recently
__ 55. had cheese sandwich & milk at midnight, 6 hours before onset of abdominal pain & vomiting
__ 56. rests her head on her hand
__ 57. vomits thin yellow liquid
__ 58. complains of frequent belching
__ 59. her condition is stable at present
__ 60. admits anxiety about pregnancy
__ 61. sudden chill last night
__ 62. usually has cramps with menses
__ 63. abdominal pain is sharp and steady
__ 64. vomiting, 4-5 times this a.m.

ADDITIONS TO PROBLEM FORMULATION SHEETS

Film 8

The attached sheet lists the cues from film 8 which the physicians used to generate problem formulations and diagnostic possibilities. You are to use this sheet to do the following:

1. In reading the attached list, you may notice some cues which you did not record, but which you now consider to be relevant to one or more of your problem formulations (or diagnostic possibilities).

   IF THIS IS THE CASE, add these cues to your response sheets. Record the cue number(s) in the space under the relevant problem formulation (or in the space next to the relevant diagnostic possibility).

2. Having read the attached sheet, you may have thought of some additional problem formulations.

   IF THIS IS THE CASE, record the title of each additional formulation on a response sheet. Underneath, in the space provided, list the number(s) of all relevant cues.
3. Having read the attached sheet, you may have thought of some additional diagnostic possibilities, related to the problem formulations you recorded after viewing the film, or related to the additional problem formulations (if any) you recorded in step 2 above.

   IF THIS IS THE CASE, record the title of each additional diagnostic possibility on the back of the appropriate problem formulation sheet. Next to it, in the space provided, list the number(s) of all cues of particular relevance.

Use the pen that has been provided for you to record any of the above additions in your response booklet. When you are finished please raise your hand.

FILM 8

List of cues which the physicians used to generate problem formulations

 1. age 19
 2. female
 3. student nurse
 4. no fever
 5. obese
 6. leaning forward, clutching abdomen
 7. anxious manner & tone of voice
 8. her condition is stable at present
 9. abdominal pain
10. sudden onset of pain this a.m.
11. pain is diffuse, not localized
12. steady increase of pain
13. pain is sharp & steady, not crampy
14. vomiting, 4-5 times this a.m.
15. vomits thin yellow liquid
16. no bowel movement this a.m.
17. no recent change in bowel movements
18. had cheese sandwich and milk at midnight, 6 hours before onset of abdominal pain & vomiting
19. has been feeling "funny" recently
20. has been "tired & edgy" recently
21. admits anxiety about pregnancy
22. last menses 8 weeks ago
23. sexually active
24. has broken off with boyfriend 1 month ago
25. he was using contraception; she was not
26. stopped taking birth control pills 1 year ago, due to vaginal itch
27. vaginal itch re-occurred 2 days ago
28. increased thirst recently
29. increased intake of fluids recently
30. increased appetite recently
31. no recent weight change
32. increased frequency of urination recently

APPENDIX E

QUESTIONNAIRE

The questionnaire was the same for both treatment groups except for items 14 and 15. The copy of the questionnaire contained in this appendix includes the Treatment II version of these items.
The Treatment I version of these items was as follows:

    14. The second viewing of the film helped me to consolidate my understanding of the case.

    15. The second viewing of the film was not worthwhile.

Participant ______

QUESTIONNAIRE

Part I

Please read carefully each of the statements below. For each statement, indicate your opinion by circling one of the five response options:

    SA = strongly agree
    A  = agree
    NO = no opinion
    D  = disagree
    SD = strongly disagree

Statements                                                        Response Options

 1. The instructions were generally clear and easy to follow.     SA  A  NO  D  SD

 2. The instructional sessions were too long.                     SA  A  NO  D  SD

 3. The actors who played the role of the patients in the films were very convincing.     SA  A  NO  D  SD

 4. The films provided a realistic simulation of the early part of the clinical workup.     SA  A  NO  D  SD

 5. The dialogue in the films was sometimes difficult to follow.     SA  A  NO  D  SD

 6. The physicians in the films did a good job of interviewing the patients.     SA  A  NO  D  SD

 7. I enjoyed watching the films.     SA  A  NO  D  SD

 8. As I watched the films, I was able to put myself into the role of the doctor.     SA  A  NO  D  SD

 9. The films presented a good selection of medical cases.     SA  A  NO  D  SD

10. The feedback materials were well organized and easy to follow.     SA  A  NO  D  SD

11. The feedback materials were sometimes overly redundant.     SA  A  NO  D  SD

12. The opportunity to compare my problem formulations to those of experienced physicians helped to improve my skill in generating initial problem formulations.     SA  A  NO  D  SD

QUESTIONNAIRE (cont.)

Statements                                                        Response Options

13. I found the feedback materials interesting.     SA  A  NO  D  SD

14. The second version of the films, which portrays the physician "thinking aloud", provided me with an understanding of the process by which experienced physicians generate initial problem formulations.     SA  A  NO  D  SD
15. The "think aloud" segments in the second version of the films tended to disrupt my own thinking process.     SA  A  NO  D  SD

16. The self-evaluation checklists helped me to evaluate my own performance as compared to that of the experienced physicians.     SA  A  NO  D  SD

17. My ability to generate a set of initial problem formulations has improved as a result of utilizing this instructional package.     SA  A  NO  D  SD

18. This instructional package is not appropriate for second-year medical students.     SA  A  NO  D  SD

    a. It would be more appropriate for first-year students.     SA  A  NO  D  SD

    b. It would be more appropriate for third-year students.     SA  A  NO  D  SD

19. For some of the cases, I didn't have sufficient medical knowledge to be able to generate appropriate problem formulations.     SA  A  NO  D  SD

20. If a library of films like these, with accompanying feedback materials, was available to medical students, I would make use of it.     SA  A  NO  D  SD

21. It would be more interesting to use the films and feedback materials in a group discussion setting (e.g., focal problems class) than in an individual self-instructional format.     SA  A  NO  D  SD

QUESTIONNAIRE (cont.)

Part II

After participating in one of the previous sessions, you may have pursued your interest in one or more of the cases outside of the instructional sessions. For example, you may have discussed the cases with other students or a faculty member, or you may have looked up materials pertaining to the cases in a medical reference book.

Please indicate below the ways (if any) in which you pursued your interest in the cases outside of the instructional sessions. Check all items that are applicable.

__ Case 1 (a 21-year-old college senior who complains of fatigue, abdominal pain and diarrhea).
    __a. I discussed the case with other student(s).
    __b. I discussed the case with faculty member(s).
    __c. I looked up relevant reference materials.
    __d. other (specify)

__ Case 2 (a 43-year-old landlady of a boarding house who complains of chest pain).
    __a. I discussed the case with other student(s).
    __b. I discussed the case with faculty member(s).
    __c. I looked up relevant reference materials.
    __d. other (specify)

__ Case 3 (a 30-year-old taxi driver who complains of urinary distress).
    __a. I discussed the case with other student(s).
    __b. I discussed the case with faculty member(s).
    __c. I looked up relevant reference materials.
    __d. other (specify)

__ Case 4 (a 40-year-old carpenter who complains of chest pain incurred while wrestling).
    __a. I discussed the case with other student(s).
    __b. I discussed the case with faculty member(s).
    __c. I looked up relevant reference materials.
    __d. other (specify)

__ Case 5 (a 19-year-old college sophomore who complains of headache and sleepiness).
    __a. I discussed the case with other student(s).
    __b. I discussed the case with faculty member(s).
    __c. I looked up relevant reference materials.
    __d. other (specify)

__ Case 6 (a 29-year-old lawyer who complains of low back pain).
    __a. I discussed the case with other student(s).
    __b. I discussed the case with faculty member(s).
    __c. I looked up relevant reference materials.
    __d. other (specify)

QUESTIONNAIRE (cont.)

Part III

Please indicate below any comments regarding this instructional package, or any suggestions for the use of these materials with medical students.

Part IV

We are interested in knowing how much contact with actual patients you have had prior to participating in this experiment. Please list below any type of experience you have had that involved contact with patients (e.g., experience as a physician's assistant, as a nurse, as a paramedical assistant). For each type of experience, please indicate the extent of the experience (e.g., 20 hours/week for 6 weeks; 40 hours/week for 1 year).

    Type of Experience                    Extent of Experience

APPENDIX F

PF AND R-PF SCORES: INSTRUCTIONS

A. PF score
1. This score is based on the titles of the S's problem formulations (PF) and diagnostic possibilities (DP).

2. Each PF or DP title is scored as follows:

    a. If the S's title is equivalent to one of the titles on the scoring sheet:

        (1) under the column "Type of Response," circle the code(s) corresponding to the way(s) in which the S recorded the title.

        (2) under the column "Pts.", circle the number of points for the title.

        (3) if the S has recorded a title in a way for which there is no code under the column "Type of Response," write in the type of response the S gave.

        (4) if the S fails to list any cues for a title, do not score this title.

        (5) if the S lists more than one DP in a single response space, score only one of the DPs listed: the one with the highest number of points.

    If, on reading the S's tentative assessment, he mentions a title that was not listed on a response sheet, this title may be scored providing that the S mentions at least two cues that led him to consider it. Under the column "Type of Response," write "in TA," and circle the points for the title.

    If the S's title is not equivalent to one of the titles in the scoring sheet: Check the list of "Other Acceptable Responses." If the title appears in this list, write in the title, and circle one point.

3. Sum the number of points circled.

PF and R-PF Scores: Instructions Continued

B. R-PF score

1. This score is based on the S's statements of functional relationships between PFs or DPs, as recorded in his tentative assessment (TA).

2. Each such statement is entered on the scoring sheet as follows:

    a. If a statement is equivalent to an item on the scoring sheet, circle the number of points next to the item.

    b. If a statement is not equivalent to an item on the scoring sheet: check the list of "Other Acceptable Responses." If the statement appears on this list, write in the statement and circle one point.
3. Keep in mind that statements of relationships between cues (rather than PFs or DPs) are not to be scored.

4. If the S lists functional relationships between PFs or DPs on his PF response sheets, but fails to mention them again in his TA, such relationships should be entered on the scoring sheet as indicated in rule B2. The scorer should indicate for such relationships "not in TA."

5. Sum the number of points circled.

CUE AND CUE-PF SCORES: INSTRUCTIONS

A. CUE score

1. The CUE score is based on the cues the S recorded, irrespective of the PF (or DP) under which he listed them.

2. The entries for this score are made under the column labeled "CUE."

3. For each cue the S recorded, circle the number of points on the scoring sheet corresponding to the cue.

4. Sum the number of points circled in the CUE column.

B. CUE-PF score

1. The CUE-PF score is based on the cues the S recorded under PF (or DP) titles included in each category across the top of the scoring grid.

2. The entries for this score are made in the cells of the cue (rows) x category (columns) scoring grid.

3. The PF (or DP) titles that are included in each category are specified at the end of the scoring grid.

4. For each PF (or DP) title the S recorded:

    a. determine if this title is included in one of the scoring categories;

    b. if so, circle the number of points (in the appropriate category x cue cell) for each cue the S recorded under this title.

    c. if there is an (*) next to a cue, this indicates that the S must mark the cue as "negative" for PFs in that column. If the S did not do so, change the sign of the cue from + to -, and circle the points.

5. After completion of step 4, sum the points circled (across columns) in each row and enter this sum in the column "Tot."

6. Sum the points recorded in the "Tot." column.

CUE and CUE-PF Scores: Instructions Continued

C. Both scores

1. In order for a cue to be scored, it is not necessary that the S mention all the details included in the cue description on the scoring sheet.
   Example: Film 7
   If the S lists "fever 101-102," or "fever 2 days," or "fever," he gets credit for cue 1.

2. However, it is necessary that the S list sufficient detail for the scorer to be able to determine unequivocally the cue to which the S's response refers.

   Example: Film 7
   If the S lists "cough," it is impossible to determine whether he is referring to cue 2, 3, 4 or 5. Thus, no points are scored.

3. If the S recorded a cue that is clearly incorrect (e.g., film 7: "no fever," "weight gain"), write the letters "inc" in the cell for the cue, and change the sign in front of the points from (+) to (-), or if the sign is already (-), leave it as (-).

Subject ______
PF score ______
R-PF score ______

PF Scoring Key--Film 7

[The "Pts." column of this key is not legible in this copy.]

    Titles                                   Type of Response
     1. cough and/or fever                   PF
     2. acute respiratory infection          PF   DP under 1, 3, 6
     3. respiratory problem                  PF   DP under 1
     4. chronic respiratory problem          PF   DP under 1, 3
     5. cancer (lung/bronchogenic)           PF   DP under 1, 3, 4
     6. infection                            PF   DP under 1, [remainder not legible]
     7. acute bronchitis                     PF   DP under 1, 2, 3, 6, 9
     8. pneumonia                            PF   DP under 1, 2, 3, 6
     9. bronchitis                           PF   DP under 1, 2, 3, 4, 6
    10. C.O.L.D./emphysema                   PF   DP under 1, 3, 4
    11. chronic bronchitis                   PF   DP under 1, 3, 4, 9
    12. TB                                   PF   DP under 1, 2, 3, 4, 6
    13. SVC syndrome                         PF   DP under 5
    14. other PF (maximum of 3)
        a.
        b.
        c.
    15. other DP (maximum of 3)                   DP under ___

R-PF Scoring Key--Film 7

[The "Pts." column of this key is not legible in this copy.]

    Statements of relationships
    1. acute respiratory infection superimposed on chronic respiratory problem or cancer
    2. SVC syndrome 2° to cancer
    3. other (maximum of 3)

Subject ______
CUE Score ______
CUE-PF Score ______

CUE and CUE-PF Scoring Keys: Film 7

[?] = value not legible in this copy.

                                                                          CUE-PF
    Cue                                                        CUE    ARI    CRP    CA     RP     TOT
     1. fever 101-102 (2 days)                                 [?]    +3     0      0      0
     2. chronic a.m. cough (2-3 yrs.)                          [?]    0      +3     +3     +3
     3. cough worse (2 days), acute cough                      3      +3     0      +3     +3
     4. woke up coughing (2 nights ago); got back to sleep
        propped on 2 pillows, orthopnea                        [?]    +2     0      0      +2
     5. wheezy cough observed                                  [?]    +1     +1     +1     +1
     6. sputum with a.m. cough (2-3 yrs.), productive
        a.m. cough                                             3      -3     +3     +3     +3
     7. change sputum color (yellow to green, 2 days)          3      +3     0      0      +3
     8. blood flecks in sputum (2 days), hemoptysis            2      +2     0      +2     +2
     9. weakness in leg muscles climbing stairs (3 wks.)       2      0      +2     +2     +2
    10. weight loss (5-10 lbs., in 1 yr.)                      3      -3     +3     +3     0
    11. smoker (20+ pack yrs.)                                 [?]    0      +3     +3     +3
    12. wife noticed face is "ruddier and chubbier" recently   2      -2     +2     +2     +2
    13. trouble buttoning collar recently, neck enlargement    3      -3     0      +3     +3
    14. age 57                                                 2      -2     +2     +2     0
    15. male                                                   2      -2     +2     +2     0
    16. executive                                              1      -1     0      0      0
    17. no chills                                              1      +1*    -1     -1     0
    18. no temp. spikes                                        1      +1*    -1     -1     0
    19. no S.O.B. on exertion                                  1      0      +1*    -1     +1
    20. physical and lab/X-ray (OK, 9 mo. ago)                 1      -1     +1*    0      0

    ARI = acute respiratory infection (pneumonia, acute bronchitis, viral/strep/bacterial/pneumococcal infection, pneumonitis, ______)
    CRP = chronic respiratory problem (C.O.L.D./C.O.P.D., chronic bronchitis, bronchiectasis, ______)
    CA  = cancer (bronchogenic/lung), including S.V.C. syndrome 2° to cancer, and metastases to thyroid
    RP  = respiratory problem (Code only those cues not coded under ARI, CRP, or CA)

Subject ______
PF score ______
R-PF score ______

PF Scoring Key--Film 8

[The "Type of Response" and "Pts." columns of this key are not legible in this copy.]

    Titles
     1. abdominal pain and/or vomiting/abdominal problem
     2. GI problem
     3. appendicitis
     4. gall bladder
     5. gastroenteritis
     6. ulcer
     7. pancreatitis
     8. polydipsia and/or polyuria
     9. diabetes mellitus
    10. psychological
    11. amenorrhea/dysmenorrhea
    12. pregnancy
    13. ectopic pregnancy
    14. GU infection/problem
    15. vaginal itch
    16. vaginitis/monilial
    17. cystitis/urinary tract infection
    18. V.D.
    19. P.I.D.
    20. obesity
    21. other PF (maximum of 3)
        a.
        b.
        c.

PF Scoring Key--Film 8 Continued

    22. other DP (maximum of 3)
        a.
        b.
        c.

R-PF Scoring Key--Film 8

    Statements of relationships                                            Pts.
    1. psychological problem aggravating organic problem(s)/organic
       problem(s) with "psychological overlay"                             3
    2. diabetes predisposes to GU infection/cystitis/vaginitis             3
    3. pregnancy: increases likelihood of GU infection/vaginitis           3
    4. other (maximum of 3)
        a.                                                                 1
        b.                                                                 1
        c.                                                                 1

CUE and CUE-PF Scoring Keys: Film 8

[The cue x category scoring grid for Film 8 and the category definitions that follow it are not legible in this copy.]

APPENDIX G

ANALYSIS OF THE STRUCTURE OF A SET OF PROBLEM FORMULATIONS

Example of two physicians' responses for Film 1

Subject: A

    Problem Formulations                  No. PF    No. S
    1. GI disorder                        1         1
        1a. ulcerative colitis/ileitis    1
        1b. GI malignancy                 1
        1c. psychogenic                   1
    2. diabetes                           1         1
    3. anemia (2° to 1a or 1b)            1         1
    4. renal problem                      1         1
    Total                                 7         4

    Structural Features
        H: 1(a,b,c)
        C: 1a-1b-1c, 1-2-4
        S: 1,2,3,4
        R: 3/1a or 1b

Subject: B

    Problem Formulations                  No. PF    No. S
    1. intestinal inflammation            1         1
        1a. regional enteritis            1
        1b. ulcerative colitis            1
    2. anemia (2° to 1)                   1         1
    3. cardiovascular (along with 1)      1         1
    Total                                 5         3

    Structural Features
        H: 1(a,b)
        C: 1a-1b
        S: 1,2,3
        R: 2/1

Note: No. PF = number of problem formulations
      No. S = number of subspaces
      Features: H = hierarchical organization
                C = competing formulations
                S = subspaces
                R = functional relationships
A response was not counted as a problem formulation unless there were at least two cues associated with it.

APPENDIX H

TABLE 28.--Comparison of the Participants with the Refusal/No Contact Group on Focal Problems Exam Scores.

    Group                   n     Mean     St. Dev.
    Participant             47    72.06    7.115
    Refusal/No Contact      15    72.67    7.556

    90% Confidence Interval:

Note: The comparison concerns the participants in the experiment versus the students who were sampled but refused to participate (or could not be contacted) except for the four students with exam scores considerably below all other students (i.e., 20% of refusal/no contact group, or 6% of target population). Scores were not available for one participant and for one refusal.

TABLE 29.--Adjusted Means on Number of Subspaces, and Number of Problem Formulations, by Experimental Condition.

    Variable                    Treatment I    Treatment II    Control
    No. Subspaces               8.3            7.9             6.5
    No. Problem Formulations    15.0           12.3            9.5

TABLE 30.--Multivariate Analysis of Covariance on Number of Subspaces and Number of Problem Formulations.

    F Tests                                   df      F         p
    Multivariate F test                       4,86    5.0314    .0011
    Stepdown F tests
        number of problem formulations        2,44    6.2399    .0042
        number of subspaces                   2,44    4.0060    .0254

TABLE 31.--Scheffé Post Hoc Comparisons on Number of Subspaces and Number of Problem Formulations.

    Variable                    Comparison    Confidence Interval    (1-α)%
    No. Subspaces               T1-T2         0.4  ± 1.39            95%
                                T1-C          1.8* ± 1.41            95%
                                T2-C          1.4  ± 1.39            95%
    No. Problem Formulations    T1-T2         2.8  ± 3.12            95%
                                T1-C          5.5* ± 5.03            99.9%
                                T2-C          2.8  ± 3.11            95%

Note: T1 = Treatment I; T2 = Treatment II; C = Control.
      * Significant group difference at the .05 level or better.

APPENDIX I

HOMOGENEITY OF REGRESSION

The sample estimates of the population regression slopes, reported in Table 32, were sufficiently dissimilar to indicate that a test of the ANCOVA model assumption of homogeneity of regression should be conducted.
For each dependent variable, a procedure described in Winer (1962) was used to test the null hypothesis that the population regression coefficients for the three experimental conditions are equal (H0: β1 = β2 = β3). The test consists of (1) partitioning the within-group variance into two components (S1 and S2), and (2) calculating an F ratio (F = S2/S1) which, if significant, leads to rejection of the null hypothesis. Since the desired outcome in conducting this test is to fail to reject the null hypothesis, the probability of a Type I error (α) for the set of four tests was set at .20 (i.e., α = .05 for each test) in order to reduce the possibility of making a Type II error.

The results of the tests of homogeneity of regression of each dependent variable (CUE, PF, CUE-PF, R-PF) on the covariate (Focal Problems Exam) are reported in Table 33. As indicated in this table, the F tests were nonsignificant for PF, CUE-PF and R-PF, indicating that for these three variables there was no evidence of departure from homogeneity of regression. For the variable CUE, on the other hand, the F test was significant. However, upon closer examination of the data, it was found that there was some evidence to suggest that this outcome constituted a Type I error (i.e., rejection of the null hypothesis when it is in fact true).

Examination of the regression and correlation coefficients for the covariate and CUE (Table 32) indicated that the significant F test had occurred due to the disparity between the Treatment II and control coefficients (correlations of -.32 and +.53, respectively). Since a negative correlation was quite unexpected on a priori grounds (i.e., the covariate might fail to correlate at all with the dependent variables, but should not correlate negatively), it was decided to plot the bivariate distribution for the Treatment II group (see Figure 4).
The plot of this distribution revealed that 15 of the 16 data points were scattered in a random-appearing cluster about the bivariate mean, but that one data point fell at a distance from this cluster, in the corner of the lower-right quadrant (i.e., the subject with the highest score on the covariate had the lowest score on CUE). The plot of the data suggested that this one extremely deviant observation was responsible for the negative coefficient for Treatment II which, although not significantly different from zero in itself, was sufficiently disparate from the control group coefficient to produce the significant F ratio in the test of homogeneity of regression.

When the regression coefficients for Treatment II were recomputed with this one subject dropped from the sample, a coefficient of zero was found for the regression of CUE on the covariate (see Table 34). With a zero coefficient for Treatment II, the disparity between the coefficients for the three groups is considerably reduced (i.e., .08, -.01, .53) and, it is reasonable to assume (given the outcome of the test for PF based on a highly similar set of coefficients), would have led to a nonsignificant F test. It may also be noted that with this one subject eliminated, the disparity between the regression coefficients for the other three variables either remained the same (in the case of R-PF) or was reduced (in the cases of PF and CUE-PF).

In sum, a careful examination of the data suggested that a highly deviant performance on the part of a single subject led to a spurious negative correlation between the covariate and CUE for Treatment II, and that this in turn was responsible for a Type I error on the test of homogeneity of regression of CUE. Thus, in the author's opinion, there was not sufficient evidence of departure from homogeneity of regression to invalidate the use of the ANCOVA model.
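
The F ratios reported in Table 33 can be checked from the tabled sums of squares. The following is a minimal sketch, not the computation actually used in the study. It assumes that each component is divided by its degrees of freedom (42 for S1, 2 for S2) before the ratio is formed, which is consistent with the tabled F values. The function names are hypothetical, and the p-value uses a closed form that holds only for an F distribution with two numerator degrees of freedom.

```python
# Within-group sums of squares from Table 33, split into the two
# components described in this Appendix: (S1, S2) for each variable.
TABLE_33 = {
    "CUE": (3506.45, 616.79),
    "PF": (5247.40, 532.44),
    "CUE-PF": (14157.08, 1639.14),
    "R-PF": (555.80, 46.12),
}
DF_S1, DF_S2 = 42, 2       # degrees of freedom reported in Table 33
ALPHA_PER_TEST = 0.05      # overall alpha of .20 spread over four tests


def f_ratio(s1, s2, df_s1=DF_S1, df_s2=DF_S2):
    """Ratio of the S2 mean square to the S1 mean square (assumed form)."""
    return (s2 / df_s2) / (s1 / df_s1)


def upper_tail_p(f, df_den):
    """P(F > f) for an F distribution with 2 numerator df.

    This closed form is valid only when the numerator df equals 2.
    """
    return (df_den / (df_den + 2.0 * f)) ** (df_den / 2.0)


def homogeneity_tests(table=TABLE_33):
    """Return {variable: (F, p, reject_H0)} for each dependent variable."""
    results = {}
    for name, (s1, s2) in table.items():
        f = f_ratio(s1, s2)
        p = upper_tail_p(f, DF_S1)
        results[name] = (f, p, p < ALPHA_PER_TEST)
    return results


if __name__ == "__main__":
    for name, (f, p, reject) in homogeneity_tests().items():
        print(f"{name:7s} F(2,42) = {f:.4f}  p = {p:.3f}  reject H0: {reject}")
```

Under these assumptions the sketch flags only CUE as significant at the .05 level, in agreement with the asterisk in Table 33.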
It should be noted that although the analyses described in this Appendix lead to the conclusion that the population regression slopes do not differ, the possibility must be borne in mind that a Type II error (i.e., failure to reject a false null hypothesis) could have occurred. If this were the case, the major conclusion of Chapter V--namely, that there was a significant treatment main effect on the variable PF--would have to be qualified. A zero (or nearly zero) regression slope for the two treatment conditions and a moderate positive slope for the control condition (see Table 32) would suggest an ability-by-treatment interaction, which, given the size of the differences between the group means (reported in Table 13), would probably be ordinal in nature. If this type of interaction did exist, it would imply that although all second-year students may improve their skills in generating initial problem formulations as a result of the training, lower ability students would tend to improve to a greater degree than higher ability students.

TABLE 32.--Regression and Correlation Coefficients for the Covariate with each Dependent Variable, by Experimental Condition.

Dep. Variable    Treatment I    Treatment II    Control
CUE              b = .090       b = -.485       b = .834*
                 r = .089       r = -.317       r = .533*
PF               b = .135       b = -.014       b = 1.073*
                 r = .084       r = -.011       r = .570*
CUE-PF           b = .296       b = -.455       b = 1.673*
                 r = .168       r = -.154       r = .505*
R-PF             b = .244*      b = -.041       b = .320*
                 r = .525*      r = -.075       r = .504*

* Coefficient is significantly different from zero at p < .05.

TABLE 33.--Tests of Homogeneity of Regression for CUE, PF, CUE-PF, R-PF.

Dependent Variable    Sources of Variation    df    SS          F
CUE                   Subjects: Group         44    4123.24     3.6939*
                        S1                    42    3506.45
                        S2                     2    616.79
PF                    Subjects: Group         44    5779.84     2.1308
                        S1                    42    5247.40
                        S2                     2    532.44
CUE-PF                Subjects: Group         44    15796.22    2.4314
                        S1                    42    14157.08
                        S2                     2    1639.14
R-PF                  Subjects: Group         44    601.92      1.7426
                        S1                    42    555.80
                        S2                     2    46.12

* Significant at p < .05.
[Scatter plot: CUE on the vertical axis (50 to 100), covariate on the horizontal axis (60 to 90).]

Figure 4. Bivariate Distribution of Treatment II Scores on the Covariate and the Dependent Variable CUE.

TABLE 34.--Regression and Correlation Coefficients for Treatment II with one Deviant Subject Eliminated.

Dependent Variable    b        r
CUE                   -.018    -.013
PF                     .216     .150
CUE-PF                 .371     .129
R-PF                  -.045    -.072

APPENDIX J

MODIFICATIONS OF THE OUTCOME FEEDBACK VERSION OF THE TRAINING MATERIALS

Recommendations for Instructional Application with Second-Year Medical Students

It is believed that the following modifications of the outcome feedback version of the training materials would result in an improved instructional package for training second-year medical students in the generation of initial problem formulations.

1. Modification of the instructional sequence with respect to the second viewing of the film.

a. In the present version of the materials, the film of the doctor-patient interview is presented a second time after the student receives the first feedback sheet and before he receives the second feedback sheet. This interpolation of the film between the two feedback sheets was intended to reduce the possibility of information overload and/or attentional decrement which, it was believed, might occur if all of the feedback material was presented at the same time. However, a number of students commented (in section 3 of the questionnaire) that it would be preferable to view the film after having received all the feedback material, particularly since it was the problem formulations on the second sheet which they most often failed to generate. In retrospect, the experimenter believes that precautions against information overload and/or attentional decrement probably were not needed, and that presentation of the film after all of the feedback information would probably enhance the student's ability to assimilate this information.

b.
Treatment I group responses to items 14 and 15, section 1 of the questionnaire, indicated that many students did not find the second viewing of the film to be worthwhile. Thus, a second recommendation is that this viewing be made optional. In a self-instructional setting this would mean that the student could elect not to view the film again at all, or, for an exceptionally difficult case, could choose to view it again several times.

2. Modification of the response format.

In the present version of the materials, a sizable amount of time was required for the student to record his responses (i.e., problem formulations, plus a cue list for each formulation). A good deal of time was also required for the experimenter (and would be required for an instructor) to read and score the students' responses. The following modifications of the response format would considerably reduce both of these time requirements:

a. The student lists all cues he obtained while viewing the film (i.e., data he used to generate one or more problem formulations).

b. The student receives a "Cue Feedback Sheet." This sheet presents a numbered list of all cues the physicians used to generate problem formulations.

c. The student records each problem formulation he generated, but rather than writing out a cue list for each formulation, he records the numbers of the cues on the Cue Feedback Sheet that are relevant to each of his formulations.

This response format was originally considered in designing the training experiment but was rejected because it would permit the student to generate problem formulations (at least in part) on the basis of the Cue Feedback Sheet, rather than solely on the basis of his own naturalistic observation while viewing the film.
However, since the results of the experiment indicated that, at least for second-year medical students, skill in cue detection on the basis of naturalistic observation was already well established, it is believed that the proposed modification of the response format would not diminish the instructional value of the exercises.

To summarize: a revised instructional sequence, incorporating the above modifications, would include the following steps:

1. The student reads the "nurse's sheet."
2. The film of the doctor-patient interview is presented.
3. The student records his cue list.
4. The student receives a "Cue Feedback Sheet" (described above).
5. The student records his problem formulations, and, for each formulation, lists the numbers of the cues (on the Cue Feedback Sheet) of relevance to it. He then writes his tentative assessment.
6. The student receives a "Problem Formulations Feedback Sheet" (i.e., a composite of the two feedback sheets provided in the present study).
7. The film is presented a second time (optional).
8. The student completes the Self-evaluation Checklist.