THE DESIGN, DEVELOPMENT, AND FIELDTEST OF AN EVALUATION FRAMEWORK FOR SHORT-TERM TRAINING PROGRAMS

By

Kent Jeffrey Sheets

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology, and Special Education

1983


ABSTRACT

THE DESIGN, DEVELOPMENT, AND FIELDTEST OF AN EVALUATION FRAMEWORK FOR SHORT-TERM TRAINING PROGRAMS

By

Kent Jeffrey Sheets

The study described in this dissertation originated from the need to identify an evaluation framework capable of assessing the impact of short-term training programs, specifically faculty development programs. An extensive review of the literature indicated that no appropriate evaluation approach existed. Although much of the literature on short-term faculty development programs reported that faculty development activities were successful and effective, results were based largely on self-reported and satisfaction data. This evidence was considered suspect by many authors. Therefore, a majority of the short-term training programs in existence were not evaluated in terms of impact on participants.

This study was conducted to determine if an evaluation framework suitable for evaluating the impact of short-term training programs on participants could be developed and how well this framework would function when applied to an existing short-term training program. An optimal evaluation framework for short-term training programs was designed eclectically by selecting elements and concepts from models and methods identified in the literature. The framework was fieldtested by evaluating a faculty development program involving 14 family physicians. Numerous methods were used to collect reaction, cognitive, and behavioral data from multiple information sources.

A metaevaluation was designed and conducted to assess the effectiveness of the fieldtest evaluation. An evaluator self-report, an interview of the program directors, and an analysis of evaluation procedures were used to gather data about the practicality, utility, and adequacy of the fieldtest procedures and outcomes.

The study's four conclusions are:

1. The Program had an impact on the participants, and the framework documented the impact.

2. The most effective and efficient evaluation procedures were the End-of-Week Evaluations, the final debriefing session, and the videotape rating scale.

3. Discrepancies in evaluation results should be expected when qualitative and quantitative data are gathered from a variety of information sources using different evaluation procedures.

4. The evaluation framework is not useful for the purpose of providing immediate formative evaluation information to decision makers.

Recommendations for further research were presented, and implications of the study for educational practice were discussed. In conclusion, a revised matrix of the evaluation framework was provided. The revised matrix reflected the results of the study.


To my wife, Barbara, for her support, understanding, and love throughout the writing of the dissertation.


ACKNOWLEDGEMENTS

A study of this nature is the product of the efforts and influence of individuals too numerous to recount.
Family, friends, teachers, colleagues, and fellow students have all contributed significantly to this effort, and I wish to acknowledge their support and contributions.

I am indebted to my parents for instilling in me at an early age a respect for knowledge, a zest for learning, and an appreciation of the importance of a job well done.

It was my great fortune to work with a committee composed of individuals who were my teachers, colleagues, and friends. Cass Gentry and Joe Levine were readily available for consultation and advice throughout the stages of my coursework and dissertation. Bill Anderson and Rebecca Henry supplied the idea and impetus for this study and provided tremendous support, encouragement, and guidance along the way. My chairman, Bruce Miles, has patiently nurtured me during my doctoral studies, keeping me on track, challenging me, and advising me.

As promised, I wish to thank Dan, Penny, Mike, and Eric for their time and effort in scoring the tests and videotape presentations. They contributed a substantial amount of their own valuable time to assist a fellow student, and their efforts were truly appreciated.

Many thanks also to my friends and colleagues in the College of Osteopathic Medicine at Michigan State University and to my present colleagues in the Department of Family Practice at the University of Michigan. They were extremely supportive throughout the writing of the dissertation.

A special note of thanks is extended to my good friends from the pre-MSU era, Big Andy (How many pages?), and Larry and Sally (How are things in Ann Arbor, Ken?), among others too numerous to name.

A medal for service above and beyond the call of duty goes out to all those who typed various drafts of the dissertation, especially Steve and Marianne, who labored long and hard at the CRT to type the final draft. Additional thanks go to Karen, Judy, Millie, and Blythe for their typing and editorial assistance along the way.

A final special word of thanks to the three people who made my years at MSU especially memorable and enjoyable. I feel fortunate to include Fred Benjamin and Eric Gauger among my friends; they helped make the low points tolerable, provided many high points themselves, and generally helped me maintain my "mental health."

Finally, thanks to the best friend from my stay at MSU, my wife, Barbara Beath. Barbara was also my editor, proofreader, and toughest critic. Most importantly, she was more than understanding when the dissertation had to come first. If and when she writes her own dissertation, I hope I can be half as understanding and supportive as she was of me.


TABLE OF CONTENTS

CHAPTER ONE: STATEMENT OF THE PROBLEM
    Introduction
    The Problem
    Purpose of the Study
    Limitations of the Study
    Research Questions
    Definition of Terms
    Organization of the Dissertation

CHAPTER TWO: REVIEW OF RELATED LITERATURE
    Introduction
    Evaluation of Faculty Development Programs in Medical Education
    Evaluation of Faculty Development Programs in Higher Education
    Evaluation Models
    Evaluation Methodology
    Metaevaluation
    Summary and Implications for the Study

CHAPTER THREE: PROCEDURES AND METHODS
    Introduction
    An Evaluation Framework for Short-term Training Programs
    Matrix of the Optimal Evaluation Framework
    Use of the Evaluation Framework
    Program, Subjects, and Setting
    Fieldtest of the Evaluation Framework
    Instruments
    Analysis Procedures
    Metaevaluation of the Fieldtest
    Summary

CHAPTER FOUR: RESULTS
    Introduction
    Results of the Fieldtest of the Evaluation Framework
    Summary of Results of Fieldtest
    Results of the Metaevaluation of the Fieldtest
    Summary of Results of Metaevaluation
    Summary of the Chapter

CHAPTER FIVE: SUMMARY AND CONCLUSIONS
    Introduction
    The Problem
    The Literature
    Procedures and Methods
    Results
    Discussion
    Conclusions
    Recommendations for Further Research
    Implications for Educational Practice
    Summary

LIST OF REFERENCES

LIST OF GENERAL REFERENCES

APPENDICES
    Appendix A: Background Information on the Family Medicine Faculty Development Program
    End-of-Week Evaluation Forms
    Cognitive Pretest
    Cognitive Test Rating Scale
    Videotape Rating Scale
    Interview Protocols
    Final Debriefing Questionnaire
    Metaevaluation Procedure: Program Director Interview
    Evaluation Report: Introduction
    Fieldtest Data: End-of-Week Evaluations
    Fieldtest Data: Fellow Interviews
    Fieldtest Data: Final Debriefing
    Fieldtest Data: Program Director Interview
    Fieldtest Data: Supervisor Interviews
    Metaevaluation Data: Program Director Interview

LIST OF TABLES
    Evaluation Model Comparisons
    Components of the Evaluation Framework
    Matrix of the Optimal Evaluation Framework
    Matrix of the Evaluation Framework as Applied to the Program
    Evaluation Factors
    Cognitive Test Results
    Cognitive Test Subscale Results
    ANOVA: Pretest vs. Delayed Posttest
    ANOVA: Posttest vs. Delayed Posttest
    Additional Study and Handout Use
    Mean Self-Ratings of Expertise
    Knowledge or Skills Used Since September
    Knowledge or Skills to be Used in the Next Six Months
    Mean Self-Ratings of Performance
    Mean Scores on Videotape Rating Scale
    Composite Scores and Ranks on Tests and Videotapes
    Difficulty Levels of Test Items
    Discrimination Levels of Test Items
    Test Difficulty and Discrimination Indices
    Responses to Research Question #2
    Responses to Research Question #3
    Responses to Research Question #4
    Responses to Research Question #5
    Additional Metaevaluation Questions and Responses
    Individual Ratings of Evaluation Procedures
    Mean Overall Ratings of Evaluation Procedures
    Summary of Responses to Research Questions
    Rankings for Three Selected Participants
    Strengths and Weaknesses of the Evaluation Framework
    Revised Matrix of the Evaluation Framework


CHAPTER ONE

STATEMENT OF THE PROBLEM

INTRODUCTION

This dissertation reports the procedures, results, and conclusions of a study concerned with the identification of a validated evaluation framework that can be applied to short-term training programs. The study focuses on a potential solution to the growing need to evaluate the impact of faculty development programs in post-secondary education.
A short-term training program is a program of from one hour to several weeks in length delivered to 50 or fewer participants. The program is designed to teach certain skills, techniques, or content or to change specific attitudes or behavior. A short-term training program may be an independent program or it may be a component of a larger or longer program. Examples of short-term programs include workshops, seminars, intensive courses, orientation sessions, and conferences.

THE PROBLEM

Short-term training programs are conducted regularly throughout the United States and the rest of the world in a variety of institutions and organizations, including schools, corporations, hospitals, businesses, churches, and the military. Support for this statement is provided by the large number of advertisements and notices for workshops, seminars, symposia, and other short-term training programs found in professional journals and periodicals.

In post-secondary education, short-term training programs are frequently used in faculty development programs directed toward the improvement of instruction and teaching. Gaff (1975) defined faculty development as "enhancing the talents, expanding the interests, improving the competence, and otherwise facilitating the professional and personal growth of faculty members, particularly in their roles as instructors" (p. 14). Other authors use the terms instructional improvement or teaching improvement to describe activities that fit Gaff's definition of faculty development.

Large amounts of time, effort, and resources have been and continue to be expended on the design and implementation of short-term training programs in a variety of settings and content areas. However, little is known about the impact of these programs because rarely are these programs systematically evaluated. Forman (1980) attempted to explain why there is little or no history of systematic evaluation of training in business or industry.

There appear to be three reasons which partly explain the low status of evaluation in training. The first is that, unlike education, a great deal of training occurs in the private, as opposed to the public sector. Since government and public foundations are not supporting these training programs, they cannot mandate evaluation.... Second, there is a general feeling (on the part of some people in business and industry) that educational methods often are not well suited to the real, everyday, outcome-oriented world of business. These people tend to distrust educational methods and techniques borrowed without adaption and revision; they want training evaluation to develop a character of its own.

The third reason for the low use of evaluation in training is that the field of training is in a state of tremendous growth and development. Training is now a several billion dollar a year industry in the United States and growing at an incredible rate. It is interesting to note that when education was in a similar state of growth, evaluation was not very significant either. (p. 48)

Pratt (1979) reported findings similar to those reported by Forman and also commented on the lack of impact evaluations.

The ultimate test of the quality of training is the impact the trained person has on some unknown future situation. This fact has been almost universally ignored in the evaluation of training....
Rather, evaluation of training has predominately focused on variables which deal with the actual process of training, including the instructor's style and technique, effectiveness of resources, and the student-instructor interactions. Additionally, evaluation typically focuses on the student's performance in relation to instructional objectives which, it is presumed, relate to knowledge and skill which will be useful at some future point in time. Often left unaddressed is the impact of training on practice and ultimately on the "system" in which the learner operates. (p. 350)

Forman and Pratt suggested that the status of training evaluation in business and industry needs to be improved substantially. They also noted that the focus of training evaluation efforts should shift from a heavy reliance on the assessment of participant satisfaction with the process of training to an assessment of trainee performance following completion of the training.

Training is conducted to improve performance, and performance should be measured on the job, not just after the completion of a training program in the classroom setting. If the classroom benefits of training are not retained and transferred to the job, then training has failed to reach its full potential. Of all the range of evaluation activities, this stage is most important for the documentation of the effects and impact of training, and it is the stage which most clearly distinguishes educational from training evaluation. (Forman, 1980, p. 51)

Forman suggested that the systematic evaluation of training that is absent in business and industry is present in education. However, according to a number of authors (Centra, 1976; Gaff, 1979; Hoyt & Howard, 1977; Levinson-Rose & Menges, 1981; Littlefield, Hendricson, Kleffner, & Burns, 1979; and Menges & Levinson-Rose, 1980), the literature of post-secondary education suffers from a shortage of reports of systematic evaluation of faculty development programs, including the short-term training programs often conducted within these faculty development programs. As Davis (1979) stated, "The major objective of all successful faculty development programs is to change the overt behavior of instructors in the classroom" (p. 125). However, the evidence supporting the successful change of overt teaching behavior of faculty development participants is for the most part based on satisfaction measures.

One of the glaring problems in the evaluation of faculty development research is the tendency of authors to try and change teaching behavior but only evaluate the participants' reports of satisfaction with the course or their views of its relevance or usefulness. Almost invariably, courses are rated highly.... This does not tell the reader anything about what the participants have learned from the course. Even self reports of what the faculty believe they may have gained may be deceptive. (Stephens, 1981, p. 10)

Donnelly, Ware, Wolkon, and Naftulin (1972) suggested:

Although there is great value to the satisfaction-type questionnaire, other kinds of data that permit the measurement of cognitive gain, attitudinal change, and ultimately behavioral change are crucial in evaluating any attempt at education. (p. 184)

Caldwell similarly reported in 1981:

Not only is there a dearth of preservice training programs for teachers of adults, but also most existing training programs lack an evaluation component. The absence of evaluation procedures constitutes a serious deficiency in training program models.
Far too often evaluation of training programs merely consists of questionnaires that elicit the responses of program participants. These questionnaires, or happiness indicators, measure the receptivity or responsiveness of the participants, but fail to measure the mastery of subject matter acquired by the participants or their attainment of program goals. (pp. 9-10)

In light of these reports, serious attempts should be undertaken to evaluate the impact of short-term training used within faculty development programs. These efforts should be designed to provide more rigorous data than the mere tabulation of participant opinions. Forman suggested the following guidelines for future evaluations of training:

In training, evaluations will be more focused and less extensive. Training evaluations will have to be clearly linked to improving the program, documenting its effects, increasing its usefulness, or having some other demonstrable impact. Evaluation, in short, will have to be held more accountable for itself.

Second, there will be changes in the data-gathering techniques used for the evaluation of training. In training, the emphasis must shift from survey techniques (questionnaires and interviews) and written tests to those that measure job performance, such as checklists, performance tests, observation scales, and role-play activities. Data must be gathered on what people can actually do, not just what they say they can do. (p. 50)

Similar guidelines could be formulated regarding evaluation of short-term training programs, whether conducted in schools, churches, business and industry, or elsewhere. The problem is that no appropriate evaluation framework designed specifically for short-term training programs appears to exist.

Baron and Baron (1980) suggested a possible solution to the problem when they proposed that specific evaluation designs should be developed for different types of programs.

We propose that evaluators abandon the aspiration to a single all-purpose research design. Instead, we suggest the development of several prototypes or ideal evaluation designs which fit different types of evaluation settings to varying degrees. As presently conceived, these prototypes could be generated both according to different conceptual orientations... and to the availability of time, money, and other resources. (p. 96)

Steele (1973) discussed the value of identifying an appropriate evaluation model and following an eclectic approach to its operationalization.

In most instances you will select certain parts of a pattern for systematic evaluation. There's a growing push toward selective evaluation. For example, R.E. Brack of the University of Saskatchewan suggests that you take an eclectic approach--first identify the questions about the program that need to be answered and then select the parts of a particular model that can help deliver these answers without trying to systematically operationalize the complete model. In this situation, however, an understanding of the total pattern helps you keep the component that's receiving major attention within a total perspective of programming relationships. (p. 54)

Patton (1980) presented another viewpoint on this issue when he discussed comments made by a group of noted evaluators including Worthen, Stake, Stufflebeam, and Popham.

The basic theme running through the comments of these evaluators was that their work is seldom guided by and directly built on specific evaluation models.
Rather, each evaluation problem is approached as a problem to be solved--and the resulting design reflects their thinking about the problem as opposed to an attempt to carefully follow a prescriptive model. In effect, these experienced evaluators are describing how the practice of evaluation research requires more flexibility than is likely to be provided by any single model. (p. 58)

The differing viewpoints represented in the previous three quotations are indicative of the controversial nature of the issues related to the design and use of evaluation models. Cronbach, Ambron, Dornbusch, Hess, Hornik, Phillips, Walker, and Weiner (1980) contributed to the controversy and listed the following among their "Ninety-Five Theses":

55. Much that is written on evaluation recommends some one "scientifically rigorous" plan. Evaluations should, however, take many forms, and less rigorous approaches have value in many circumstances. (p. 7)

In view of the controversy noted above, the need to evaluate the impact of short-term training programs, and the absence of any evaluation models designed specifically for short-term training, it is suggested that an evaluation framework for short-term training programs be designed. This evaluation framework should provide a mechanism that allows its users to conduct comprehensive evaluations of the outcomes of short-term training. At the same time, the evaluation framework should be flexible enough to be adapted to specific settings and allow its users to be eclectic in their operationalization of the framework.

PURPOSE OF THE STUDY

The study reported in this dissertation was conducted to determine whether an evaluation framework for short-term training programs could be developed and successfully implemented. As indicated earlier, short-term training is a popular training format. A great deal of time and resources have been and continue to be expended on short-term training programs with little or no assessment of their impact except for measures of participant satisfaction. It is becoming increasingly clear that individuals responsible for planning and implementing short-term training programs must also provide evidence that their programs are producing the desired impact on the ultimate target of the programs. It is assumed that an evaluation framework for short-term training programs would be of great interest to a number of these individuals.

LIMITATIONS OF THE STUDY

The evaluation framework resulting from this study was designed and developed based on a review of the literature of evaluation models and methodology. The framework was then applied to an existing short-term training program, a faculty development program for family practice physicians. The evaluation conducted on the faculty development program served as the fieldtest of the evaluation framework. The evaluator shared the results of the fieldtest with the two program directors of the faculty development program by means of a written evaluation report.

There are several limitations to this study. The evaluation framework was fieldtested with one particular short-term training program. The purpose of the evaluation was to determine if this program had an impact on its participants. The framework was fieldtested on only one group of participants in one specific type of short-term training. There was no control group against which this treatment group was compared. Thus, the concepts of internal and external validity were of great importance when considering the limitations of this study.
Campbell and Stanley (1966) made a distinction between internal and external validity by defining internal validity as "the basic minimum without which any experiment is uninterpretable" (p. 5). In contrast, external validity was concerned with the question, "To what populations, settings, treatment variables, and measurement variables can this effect be generalized?" (p. 5). The internal validity of the fieldtest of the evaluation framework was addressed in this study by attempting to control the classes of variables that Campbell and Stanley identified as potential threats to internal validity. All possible precautions were taken throughout the process of developing and administering the evaluation instruments, and while scoring and analyzing the data, to minimize the effects of these variables. The external validity of the fieldtest results, or the validity of the inferences that could be made beyond the fieldtest, was partially established by the fact that the type of training evaluated in the fieldtest was a commonly used approach to faculty development for physicians. While the fieldtest results may have limited generalizability to other populations and programs, there was sufficient external validity to make inferences related to other faculty development programs for physicians.

The inferences concerning the effectiveness of the evaluation framework as a mechanism for measuring the impact of short-term training programs were more limited. The evaluation framework was not tested on other types of short-term programs or with programs with different content. Thus, inferences could be made only to the evaluation of similar programs with similar populations. Based on this study it is difficult to claim that the specific short-term training program evaluated in the fieldtest would be similarly effective with a sample composed of non-physicians. Likewise, it is difficult to propose that the evaluation framework would be similarly effective with a program with different content, length, or teaching strategies. However, stronger conclusions and recommendations can be made concerning whether or not the short-term program had an impact in this particular situation and whether or not the evaluation framework was effective when applied to this particular short-term training program.

RESEARCH QUESTIONS

The following research questions were formulated to direct the study:

1. What specific problems were encountered in the fieldtest of the evaluation framework?

2. Was the evaluation framework practical in its use of resources?

3. Was the evaluation framework useful in providing information to the decision makers?

4. Were the methods and instruments used during the fieldtest of the evaluation framework technically adequate?

5. Were the methods and instruments used during the fieldtest of the evaluation framework conducted in an ethical manner?

A metaevaluation, an evaluation of an evaluation, was designed and conducted by the evaluator to answer the research questions and assess the quality of the evaluation conducted during the fieldtest. In this manner the effectiveness of the evaluation framework was assessed as well. The evaluator, with the assistance of the program directors and established evaluation standards and criteria, evaluated the process, procedures, and results of the fieldtest.

DEFINITION OF TERMS

Behavioral data: information related to the performance of short-term training program participants in a simulated or on-the-job setting.
Cognitive data: information related to the knowledge and skills of short-term training program participants.

Evaluation: the determination of the impact of a program upon the program participants with the purpose of providing information to decision makers for planning, implementing, rejecting, and/or improving the program.

Evaluation framework: a set of conceptual components and guidelines to be utilized in the design, development, and implementation of evaluations.

Faculty development: enhancing the talents, expanding the interests, improving the competence, and otherwise facilitating the professional and personal growth of faculty members, particularly in their roles as instructors. (Gaff, 1975, p. 14)

Fieldtest: a step in the systematic development of a process or product in which the process or product is used in a setting that approximates the ultimate setting in which the process or product is to be used.

Impact: the effect of program participation on a participant and/or the participant's organization in terms of changes in the participant's cognitive knowledge, behavior, performance, and/or attitude.

Metaevaluation: the process of delineating, obtaining, and using descriptive and judgmental information about the practicality, ethics, and technical adequacy of an evaluation in order to guide the evaluation and publicly report its strengths and weaknesses. (Stufflebeam, 1981, p. 151)

Reaction data: information related to the satisfaction of short-term training program participants with the content, instructors, and activities of the program.

Short-term training program: a training program lasting from one hour to several weeks, delivered to 50 or fewer participants, that is designed to teach certain skills, techniques, or content or to change specific attitudes or behavior; may be an independent program or a component of a larger program.

Training: activities conducted with the purpose of helping participants (trainees) learn specific skills, techniques, methods, or attitudes to help improve their performance, usually in a job-related setting.

ORGANIZATION OF THE DISSERTATION

In Chapter One the problem was outlined and described. Research questions were presented and key terms were defined.

The review of related research in Chapter Two examines the literature on evaluation of faculty development programs in medical education and higher education, evaluation models, evaluation methodology, and metaevaluation. The material presented in this chapter serves as the source of information for the design phase of the study.

In Chapter Three the evaluation framework is presented with an explanation and rationale for the methods used to conduct the fieldtest of the framework. Procedures for evaluating the fieldtest, the metaevaluation, are outlined.

The results of the fieldtest are presented in Chapter Four. The metaevaluation results are also provided in this chapter.

In Chapter Five the dissertation is summarized and the results of the fieldtest and metaevaluation are discussed and interpreted. Conclusions are drawn and recommendations for further research are suggested.


CHAPTER TWO

REVIEW OF THE RELATED LITERATURE

INTRODUCTION

This review examines research literature on the evaluation of faculty development activities in medical education and higher education, evaluation models, evaluation methodology, and metaevaluation. Information presented in this chapter was used to design the evaluation central to the study.
As a result, the chapter includes a discussion of the strengths and weaknesses of existing evaluation models and methods and metaevaluation models as they pertain to the usefulness of these models and methods for the evaluation of short-term training programs.

EVALUATION OF FACULTY DEVELOPMENT PROGRAMS IN MEDICAL EDUCATION

Faculty development programs have been popular in medical education for a number of years. Stephens (1981) conducted a review of the literature related to faculty development in medical education. Her review encompassed more than 40 articles and books dedicated to research on faculty development activities for medical teachers.

An area of medical education that has used faculty development workshops in recent years is a new medical specialty area, family medicine or family practice.

The establishment of family medicine as the newest medical specialty set the stage for the resurrection of the family doctor. The years from 1969 to present have witnessed an explosion of interest in this distinctive form of medical practice. (Canfield, 1976, p. 911)

This "explosion of interest" translated into medical school graduates selecting family medicine residency training and a resultant search for family medicine faculty by medical school administrators.

Since no reservoir of experienced family physicians has existed to meet the demand for faculty during the past ten years, most faculty members in family practice training programs entered teaching after a period of 10 to 20 years in either group or solo practice. (Ramsey & Hitchcock, 1980, p. 421)

Faced with the problem of hiring faculty with little or no teaching experience, departments of family medicine have been forced to rely on faculty development programs to train teaching faculty. The workshop has become a technique frequently used in these programs.

The search for an effective means to meet the faculty development needs of family medicine faculty revealed that workshops are a frequently used and effective method of promoting faculty development in general. (Bland, 1980, p. 8)

While there was little doubt about the accuracy of Bland's statement that workshops were a frequently used method of faculty development in family medicine, her comment concerning the effectiveness of workshops required further examination. Much of the research cited by Bland in support of that statement was based on self-reported data and satisfaction measures, rather than on objective outcome or impact measures. The literature of faculty development in medical education offered little evidence of evaluation of actual changes in participants' behavior due to short-term training. For example, an article by Bland, Reineke, Welch, and Shahady (1979) presented results of a study of the effectiveness of the two-to-three day workshop format for faculty development in family medicine.
As Stephens (1981) pointed out: Some type of systematic observation of teaching behavior is probably more useful than self report when assessing the impact of a workshop on teaching skills. This is not to suggest that ratings of faculty satisfac- tion with a workshop or course are not important. It is certainly crucial to please the consumers of a service. But this suggests measuring participants' satisfaction, using a rating scale of pre-post gains in teaching behavior as well as getting feedback on the structure of the workshop. (p. 10) Evaluation efforts should go beyond the collection of satisfaction rand selfereport data if the impact of programs on participants and the participants' organizations is to be determined. Stephens addressed this issue and commented on the scarcity of such efforts. Generalization of change to outside the workshop is an important concern in evaluation. It is also one that has been widely neglected. A few authors (e.g. Bland, 1979) asked the participants to report how their behavior has changed. This method has all the problems that any self-report measure does. Irby et al. (1976) used a self-report measure, but strengthened it considerably by also observing the lectures of the partici- pants at a later date. This is a practice that needs to be encouraged to establish the usefulness of faculty development. Once behaviors have generalized, they also need to be maintained. A change that lasts only for a few weeks or is exhibited only when a teacher is being observed is not a useful accomplishment. Follow-up contacts in faculty development research are as rare as attempts to assess generalization. (p. 11) Other research cited by Bland as proof of the effectiveness of faculty development workshops was examined. Three studies cited by Bland 16 (Adams, Ham, Mawardi, Scali, 8 Weisman, 1974; Koen, 1976; and Wergin, Mason, 8 Munson, 1976) relied heavily or solely on self-report data or were not concerned with workshops as an instructional format. Only one reference, Donnelly et al. (1972), reported results based on the use of tests to measure attitude change and cognitive learning. A subsequent study by Bland and Froberg (1982) also reported positive results of faculty deve10pment workshops, but these results were based primarily on participant self-ratings. The primary data gathering instruments were the partici- pant questionnaires (PQs), which asked for participants' self ratings of their abilities before and after the workshop or seminar. Because of their advantages in cost and efficiency, self-assessments are often seen by evaluators as the method of choice. Generally, self-assessments show moderate correlations with achievement or performance measures. It appears, however, that people may rate their own abilities somewhat higher than is warranted by their performance tests and also somewhat higher than they are rated by others, such as peers, superiors, or subordinates. (Bland 8 Froberg, 1982, p. 540) Further examination of the literature of faculty development activities in medical education yielded mixed results. Joorabchi and Chawhan (1975) reported that by ”using experiential learning methods in small groups with little or no didactic presentation, it was possible in a short time to change long-held educational views of diverse groups of medical educators” (p. 40). Pre- and post-tests of attitudes were used in this study to arrive at those results. 
A study by Warburton, Frenkel, and Snope (1979) used evaluation approaches including interviews, videotapes, and self-assessment measures. Some positive impact was shown in reducing anxiety and increasing comfort among faculty participants in activities related to teaching family medicine. A study by Walls (1979) used objective tests to measure impact of a faculty development program on family medicine 17 faculty. Positive results were reported, but no observation of behav- ioral change or other impact measures were examined. .A study reported in 1980 by Lawson and Harvill used self-reports of attitude change along with ratings of videotaped teaching performances to evaluate the effectiveness of a faculty‘ development program for residents. ”The results of the study described here indicate that short training programs can produce significant, observable improvement in physicians' teaching behavior” (p. 1003). No mention was made in the report of any attempts to measure cognitive change in the participants. A two-year-long faculty development program at the Michigan State University College of Osteopathic Medicine was evaluated with data gathered from program staff, faculty participants, and faculty non- participants. Although positive results were reported, the evaluation was based entirely on self-report data and there was no evidence pre- sented that any observation of faculty using the content of the workshops was conducted. No mention was made of the use of cognitive tests for evaluation purposes (Bell, Hunt, Parkhurst, 8 Tinning, 1979). Faculty development activities are well documented in the literature of medical education. Workshops were frequently used in these faculty development programs, especially those conducted with family medicine physicians. However, there was little or no evidence found in the literature that these faculty development activities were sufficiently evaluated in order to assess whether or not participants in these activities had actually changed their teaching behavior as a result of their participation. 18 EVALUATION OF FACULTY DEVELOPMENT PROGRAMS IN HIGHER EDUCATION The next section of the review of the literature focuses on the evaluation of faculty development programs in higher education, particu- larly those that may be classified as short-term training programs. Levinson and Menges (1979) reviewed the research literature on improving college teaching and reported less than encouraging results. "The literature on teaching improvement in higher education is larger than we had expected when we began this review. It is also of lower quality than we had hoped” (p. VIII-1). Levinson and Menges examined six major categories of methods of fostering faculty development, but had some pertinent comments to make about workshops and seminars. Perhaps the most frequent but least carefully evaluated instructional improvement activities are workshops and seminars.... A number of courses to train graduate teaching assistants have been systematically evaluated. Activities for experienced faculty, on the other hand, are typically evaluated rather informally by questionnaires distributed at the close of an event or soon thereafter. Participants are likely to be asked how they felt about the activity and what they learned from it. These comments, at least as described in reports and published articles, are usually positive, but permit no conclu- sions about impacts which persist beyond the event itself. (I). 
IV-l) In a subsequent work, Menges and Levinson-Rose (1980) stated again that "there have been virtually no adequate studies of the impact of workshops" (p. 2). In 1981, Levinson-Rose and Menges suggested the following guidelines for assessing impact. Because the most common data for evaluating workshops are participant satisfaction ratings (sometimes termed the ”happi- ness index”), we note problems of such estimates. When studies assess satisfaction and skill at preworkshop, end of workshop, and delayed posttest, the happiness index is known to be seriously misleading.... From such research we extrapolate several guidelines for workshop assessment, guidelines seldom followed in research we reviewed: 1) both immediate and delayed tests should be made... and 2) if participants' self-assess- ments are to be accurate, they should refer to specific behaviors. (PP. 409-410) l9 Littlefield et a1. (1979) supported the findings of Levinson and Menges and stated that ”systematic evaluations of faculty development programs are difficult to find” (p. 4). Littlefield et al. also cited the following quotation from Hoyt and Howard (1977) to support that statement. In summary, the literature is extremely sparse and the studies reported are uncommonly simplistic. Apparently, participants in faculty development programs have generally expressed satisfaction with them, a finding of doubtful meaning. There is some evidence that teaching methods may change in directions considered desirable by teaching authorities. No dependable evidence regarding impact on students was reported. (Hoyt 8 Howard, 1977, p. 2) Centra (1976) conducted a survey of colleges and universities in the United States to determine the status of faculty development practices in post-secondary education. A total of 756 institutions responded to Centra's survey. Of those, only 142 reported that they had evaluated their faculty development programs or activities, 332 had performed par- tial evaluations, while half of the programs had not been evaluated. A dozen or so respondents forwarded copies of their pro- gram evaluations. Judging from these, questionnaires or interviews with samples of faculty members were commonly used. Although such methods can prove helpful in tapping faculty reactions to particular services, or in ascertaining faculty awareness of a program, more sophisticated designs are probably needed to deal with such issues as accountability and the actual effects of various activities. (p. 42) Gaff (1979) pointed out the dearth of information on the impact of faculty deve10pment programs. "While the literature of faculty develop- ment is replete with descriptions and analyses of programmes, little evidence has been gathered about the impact of these programmes on participants or on their institutions" (p. 242). Gaff went on to state that emphasis has been on the establishment of faculty development pro- grams rather than on their evaluation. The evaluations that have been 20 conducted have been rather simplistic; participant reactions, annual reports, visits by outside evaluators, and case studies prepared by insiders or outsiders. These evaluations told more about the operation of the program that its outcome. Gaff and Morstain (1978) suggested the possible problems that could result from relying on such happiness measures rather than observing faculty development participants in action following interventions. 
"It is one thing for faculty to give a generally positive assessment of their experiences, even indicating specific benefits of teaching improvement activities, but it is quite another for them to actually do something different in their teaching" (p. 78).

The literature provided little empirical evidence that faculty development programs in higher education have made an impact on participants. Most evaluation efforts appeared to stop when the activity was over and did not attempt to observe the participants' behavior in actual or practice application situations following the faculty development activities.

EVALUATION MODELS

An examination of the literature on evaluation and evaluation models indicated there were numerous definitions and models of evaluation in existence. Worthen and Sanders (1973) compared eight different models, each with a different definition, purpose, and key emphasis. Steele (1973) examined over 50 different evaluation models, approaches, and frameworks. Other authors (Borich & Jemelka, 1981; Britan, 1978; House, 1978, 1980; and Taylor, 1976) categorized existing models according to philosophy, purposes, assumptions, and other criteria. However, the authors each had their own terminology and category scheme, and rarely did they coincide or agree. Volumes have been written on the definition, purposes, and methods of evaluation, but there has been little consensus among the experts in the field concerning definitions or categories of evaluations.

However, far from working against the prospective evaluator, this lack of consensus among the experts can be used to the practitioner's advantage. As mentioned in Chapter One, experienced evaluators reported they rarely followed a specific evaluation model. They were more likely to modify a model or models to suit a particular situation.

In many situations, rather than extensively adapting a particular approach, you might be better off to construct your own, borrowing the parts of other approaches that are most useful and building patterns and processes that are appropriate to your needs.

Don't search for the one way to do evaluation. Do search for the range of approaches that will best address your varied needs in program evaluation. (Steele, 1973, p. 55)

Patton (1980) went beyond Steele's suggestion of eclecticism to propose that it is the difference between the actual practice of evaluation and the ideal conceptualizations of evaluation that often leads to more meaningful and useful models. Patton also discussed some new options now available to evaluators.

In essence, the options open to evaluators have expanded tremendously in recent years. There are more models to choose from for those who like to follow models; there are legitimate variations in, deviations from, and combinations of models; and there is the somewhat model-free approach of problem-solving evaluators who are active, reactive, and adaptive in the context of specific evaluation situations and information needs. Cutting across the evaluation model options are a full range of methods possibilities, the choice in any particular evaluation to be determined by the purpose of the evaluation, and the nature of the evaluation process. (pp. 58-59)

Based on the comments of Patton and Steele, it appeared the evaluator was free to examine a variety of evaluation models and then select the aspects of a model or models that best suited a particular situation.
Following such a procedure, several models were examined to determine which had components suitable for the purpose of designing an evaluation framework to be used with short-term training programs. A number of evaluation frameworks, models, and approaches are briefly described, with emphasis on their strengths and weaknesses apropos to this study. The terms framework, model, and approach were used interchangeably in much of the literature reviewed and are used in the same manner throughout the dissertation.

Scriven, Stake, Stufflebeam, Tyler, Alkin, and Grotelueschen are the authors of the models discussed in this chapter. Over 50 evaluation models were identified and examined, and the models of these six individuals were selected because of their relevance to the study and their prominence in the evaluation literature. Additionally, several categories of models are described. A table that summarizes the strengths and weaknesses of the models is provided later in this section.

Scriven (1967) wrote philosophically about evaluation and compared concepts of evaluation such as goals versus roles, formative versus summative, and comparative versus non-comparative. Two concepts applicable to the problem of assessing impact of short-term training were intrinsic and pay-off evaluation. Intrinsic evaluation involved an assessment of the instruments or materials used in the program, while pay-off evaluation examined the effects of the materials or instruments on program participants. Both kinds of evaluation were relevant to determining program impact. Aside from these two concepts, Scriven's philosophical discussion of evaluation did not lend itself to the evaluation of short-term training programs. According to Worthen and Sanders, there were serious methodological problems in Scriven's approach to evaluation. There was no methodology provided for assessing the validity of evaluative judgments, and the approach contained several overlapping concepts. Except for the concepts of intrinsic and pay-off evaluation, this approach was not well suited to the purpose of this study.

Stake (1967) presented a much more descriptive and prescriptive model of evaluation than did Scriven. Stake's model was devoted to describing and judging educational programs using a formal inquiry process. One of the components of Stake's model provided for the assessment of program outcomes using a systematic approach that allowed for the use of relative and absolute judgments by the evaluator. However, Stake also called for the use of explicit standards, which may not always exist when determining changes in performance, attitudes, or behavior. Worthen and Sanders suggested that Stake provided inadequate data collection methods in his model and that some of the distinctions made between different cells of the model matrix were not clear and sometimes overlapped.

Stufflebeam's model of evaluation (1968) was a comprehensive approach to an evaluation of the context, input, process, and product of educational programs. The components of the model related to process and product evaluation were particularly applicable to an examination of program impact since these components focused on program activities and outcomes. Stufflebeam's view of evaluation included the concept that evaluation provided information to decision makers. This concept was
However, while the process and product components of the model had some utility, the context and imput components were not of similar value since they were more concerned with the planning of evalua- tions, thus negating the use of the model in its entirety. Tyler's approach to evaluation (1942, 1949) was clearly based on behavioral objectives and the assessment of whether they were being achieved by the learners. While this was a good measure of program impact, additional information was required to indicate whether learners used the content of the program, changed their behavior, or were satis- fied with the program. Tyler's approach was central to the task of evaluating the impact of short-term training programs, but his approach was not comprehensive enough because it failed to consider other impor- tant factors related to the impact of short-term training. Alkin (1969, 1972, Alkin 8 Fitz-Gibbon, 1975) presented a holistic approach to evaluation that was decision- and system-oriented. Alkin suggested that the impact of the program on other systems be examined using documentation and outcome evaluation. This concept was similar to Stufflebeam's notions of process and product evaluations. The value of Alkin's model lay in its attention to other systems that interact with the program and its participants. However, Alkin did not clearly deline- ate methods to be used within this model and the systems approach may be very costly and complex to implement due to the time and resources it requires. Crotelueschen (1980) presented a comprehensive approach to program evaluation. He described a classification scheme intended to specify 25 evaluation questions and clarify relationships among those questions. Grotelueschen's approacht included consideration of three purposes of evaluation (to justify, to improve, and to plan), four evaluation elements (participants, instructors, topics, and contexts), and four pro- gram perspectives (goals, designs, implementation, and outcomes). Of particular value to the study were Grotelueschen's descriptions of the three purposes of evaluation, the elements, and the outcome perspective. Grotelueschen's whole model was more complex and comprehensive than required for the specific purpose of assessing program impact. However, the concepts of determining the purpose and elements of evaluation, the formulation of sample questions, and the focus on outcomes appeared to be of particular value to the study. The remaining models were grouped under two categories of models rather than attributing them to a particular author. The first category reviewed was the transactional approach (Taylor, 1976) or the illumina- tive (Parlett 8 Hamilton, 1976) or contextual approach (Britan, 1978), depending upon which author was doing the categorizing or describing. The models in this category were primarily characterized by an intensive study of the whole program. Evaluation methods used with these models included observation, interviews, analysis of program documents, and other qualitative methods. The use of qualitative methods within the models in this category was applicable to impact evaluation, but the extensive use of observations and analysis of documents focused on imple- mentation rather than impact and did not appear to be useful as a complete approach. The clinical approach to evaluation (Glaser 8 Backer, 1972) was similar to the transactional category of evaluation. 
Glaser and Backer advocated a holistic systems approach which utilized subjective measurement, consultation, and feedback among its program evaluation methods. The subjective measurement methods were applicable to the study, but the consultation and feedback methods were implementation-focused, and there was a notable absence of any mention of the use of objective measures within this approach.

The strengths and weaknesses of the models described in this section are summarized in Table 1. While this review did not cover all the evaluation models identified during the search of the literature, it has mentioned those that were considered to be most applicable to the evaluation of short-term training programs. No single model was identified that was suited to the task of evaluating the impact of short-term training. However, concepts or elements that were relevant to this study were identified for possible inclusion in the evaluation framework to be designed.

Although they did not describe models of their own, several authors' views of evaluation and evaluation models were of interest and value. Steele (1973) suggested that evaluation should be conducted to judge and form conclusions and should be used as a management tool. She also said that program evaluation should be considered a generic term and that evaluators should look beyond objectives and results during program evaluations. This view was consistent with one of Scriven's concepts, goal-free evaluation, which suggested programs be evaluated without the evaluator's knowledge of stated program goals and objectives. Steele also suggested that unintended outcomes and results be sought and analyzed in evaluating a program.

Table 1. Evaluation Model Comparisons (the body of the table is not legible in this copy).

Patton (1978, 1980) called for qualitative, utilization-focused evaluation based on an eclectic approach to the process of evaluation. Patton also proposed the use of a holistic, naturalistic approach to evaluation in order to provide information to decision makers. This was similar to the purpose of the models developed by Alkin and Stufflebeam.

Among all the models examined, none focused exclusively on impact evaluation, although several (Alkin, Grotelueschen, Stake, and Stufflebeam) considered outcomes as a major component of their models. Bryk (1978) explained some of the problems inherent in an impact study:

First, for the program to be effective all subjects do not have to move in a particular direction, on all dimensions, for each unit of time.
Second, even if we could measure short-term changes with perfect validity, without an understanding from a clinical perspective of the individual program that generated the numbers, we may not know what values to place on them. As a consequence, we may be unable to interpret the results of the impact study.... Clearly, then, the questions we ask and the methods we employ must be carefully fitted to the nature of the program under study. (pp. 51-52)

Corbett (1979) reported on the absence of literature on impact evaluation. "Numerous evaluations of training design, methods, and techniques, as well as student learning in terms of educational objectives, have been reported, but very few on impact" (p. 347). In calling for impact evaluations, Pratt (1979) stated that "in impact evaluation, we are examining not just the impact of training but the relative impact of competing and complementing forces that potentially influence the agency, system, or practice under consideration" (pp. 351-352). Hunt (1978) suggested an approach to determining who and what to evaluate when assessing impact, but stopped short of suggesting methods to use.

Grenough and Dixon (1982) proposed a "systematic measurement process designed to demonstrate to management whether or not those trained use their experience" (p. 40). They described this process of assessing the impact of training in terms of evaluating the "utilization of training." Grenough and Dixon suggested that utilization of training may be measured either directly or retrospectively. In the retrospective mode, surveys, telephone interviews, and on-site interviews were used with existing descriptive and quantitative data. Although rather simplistic in its methodology and not yet fully developed, this approach seemed to have some potential value for this study.

For a variety of reasons, no single established evaluation model was well suited to the task of assessing program impact. Some models were too complex or costly to use. Others were too narrow in focus. However, several of the models contained components or presented concepts useful for impact evaluation. The concepts taken from models that were most useful for this study included intrinsic and pay-off evaluation, a decision-orientation, a systems-orientation, a holistic viewpoint, and a focus on utilization.

Intrinsic evaluation was a useful concept since it suggested the value of examining the materials used in a program as a means of determining what the outcomes should be. Pay-off evaluation was relevant because it was concerned with the impact and outcomes of program materials and activities. The concept of decision-oriented evaluation was appropriate because the definition of evaluation for this study included as its purpose the provision of information to decision makers. A systems-orientation was essential because impact may often best be assessed by gathering information from individuals in systems other than those in which the trainee functioned. The importance of a holistic viewpoint was based on the notion that all systems and components of a short-term training program should be considered as a whole in order not to isolate or neglect variables or factors that might have important significance when determining the impact of the program. Finally, the focus on utilization was relevant because it suggested the need to gather documentation that the content of the program was being used in the work setting.
A number of evaluation models and approaches were presented in this section of the literature review. The relevance of these models and approaches to the task of evaluating the impact of short-term training programs was discussed, and the strengths and weaknesses of each were identified. Finally, those components and concepts most useful for this study were identified and discussed. These components and concepts are reflected in the design of the evaluation framework outlined in Chapter Three.

EVALUATION METHODOLOGY

A prevalent theme found in the literature on evaluation methodology was that quantitative methods have dominated research and evaluation studies in the past and may need to be supplemented on some occasions by qualitative methods. Several authors (Cronbach et al., 1980; Filstead, 1979; Glaser & Backer, 1972; Patton, 1980; and Reichardt & Cook, 1979) suggested that studies be designed combining the two approaches rather than relying solely on one approach or the other.

The obtrusive nature of quantitative research methods was a major reason that some authors suggested there are situations when qualitative methods may prove to be more effective in conducting program evaluations. Glaser and Backer stated that "program evaluations do not always lend themselves to rigorously quantitative approaches" (p. 54). Patton (1980) supported Glaser and Backer and added that "on many occasions--indeed for most evaluation problems--a variety of data collection techniques and design approaches will be used" (p. 18). Cronbach et al. listed the following among their "Ninety-Five Theses":

54. It is better for an evaluative inquiry to launch a small fleet of studies than to put all its resources into a single approach.

56. Results of a program evaluation are so dependent on the setting that replication is only a figure of speech; the evaluator is essentially an historian.

59. The evaluator will be wise not to declare allegiance to either a quantitative-scientific-summative methodology or a qualitative-naturalistic-descriptive methodology.

60. External validity--that is, the validity of inferences that go beyond the data--is the crux; increasing internal validity by elegant design often reduces relevance. (p. 7)

95. Scientific quality is not the principal standard; an evaluation should aim to be comprehensible, correct, and complete, and credible to partisans on all sides. (p. 11)

Reichardt and Cook discussed the potential benefits of using qualitative and quantitative methods together. They stated that "two method-types can build upon each other to offer insights that neither one alone could provide" (p. 21). Filstead supported Reichardt and Cook and stated, "Qualitative methods are appropriate in their own right as evaluation-assessment procedures of a program's impact. Program evaluation can be strengthened when both approaches are integrated into an evaluation design" (p. 45).

By no means were qualitative methods presented as the sole approach to evaluation research. Rather, as with the selection of appropriate components from various evaluation models, one was urged to consider qualitative methods as yet another means of conducting evaluation research.

Quantitative approaches clearly may be warranted in some cases; however to maximize the utility of the data gathered to those who authorize its collection, and avoid damage to an on-going program, it may be useful to consider viable alternatives or supplements to standard quantitative or experimental methods. (Glaser & Backer, p.
54)

Filstead added, "Qualitative methods provide a basis for understanding the substantive significance of the statistical associations that are found" (p. 45).

Several other authors supported the use of multiple methods to evaluate the impact of programs.

A carefully designed strategy using mixed, multiple measures seems desirable. Although no single measure may be individually strong, several measures taken together can create a total picture that reliably captures the efficiency of an individual program. If a program is effective, then predictable patterns of outcome information ought to occur across multiple measures. (Bryk, 1978, p. 40)

Posavac and Carey (1980) "recommended that evaluators use multiple variables from a single source because the elevation of a single variable to be the criterion of success will probably corrupt it" (p. 54). Patton (1980) added that "multiple sources of information are sought and multiple resources are used because no single source of information can be trusted to provide a comprehensive perspective on the program" (p. 157). Cronbach et al. added, "Multiple indicators of outcomes reinforce one another logically as well as statistically. This is true for measures of adequacy of program implementation as well as for measures of changes in client behavior" (p. 8).

An example of suggested multiple criteria for evaluating training programs was found in the literature on training and development in business and industry. Kirkpatrick's four criteria for the evaluation of the effectiveness of training programs were cited throughout the literature (Brethower & Rummler, 1977; Goldstein, 1974; Kirkpatrick, 1967; Laird, 1978; Otto & Glaser, 1970; and Wexley & Latham, 1981). Wexley and Latham described the four criteria this way:

1. Reaction criteria measure how well the participants like the program, including its content, the trainer, the methods used, and the surroundings in which the training took place.

2. Learning criteria assess the knowledge and skills that were absorbed by the trainee.

3. Behavioral criteria are concerned with the performance of the trainee in another environment, i.e., the on-the-job setting.

4. Result criteria assess the extent to which cost-related behavioral outcomes have been affected by the training. (pp. 78-79)

Brethower and Rummler listed four potential levels of evaluation which were clearly based on Kirkpatrick's criteria:

1. Do trainees like the training?
2. Do trainees learn from the training?
3. Do trainees use what they learn?
4. Does the organization benefit from the newly learned performance?

In summary, the literature on evaluation methodology suggested various options related to the selection of evaluation procedures. These options included recommendations regarding types of methods to use, the value of multiple measures and sources of information, and criteria that could be used to evaluate various aspects of a short-term training program.

METAEVALUATION

The literature on metaevaluation was rather limited since it was a relatively new concept. Scriven first introduced the term in 1969, and he and Stufflebeam have been among its leading proponents.

Theoretically, meta-evaluation involves the methodological assessment of the role of evaluation; practically, it is concerned with the evaluation of specific performances. (Scriven, 1969, p. 36)

Good evaluation requires that evaluation enterprises themselves be evaluated.
Evaluations should be checked for problems such as bias, technical error, administrative difficulties, excessive costs and misuse. (Stufflebeam, 1981, p. 147)

Metaevaluation was essentially defined as the evaluation of evaluations, but the term has had different meanings for different authors. "There are as many potential conceptions of metaevaluation as there are of evaluation itself" (Stevenson, Longabaugh, & McNeill, 1979, p. 38). Some authors limited the focus of the concept of metaevaluation. Cook and Gruder (1978) used the term "to refer only to the evaluation of empirical summative evaluations--studies where the data are collected directly from program participants within a systematic design framework" (p. 6). Stufflebeam placed no such restrictions on the term in any of his writings (1974, 1978, 1981). He suggested that just as there were formative and summative evaluations, there should also be formative and summative metaevaluations. Stufflebeam placed no limitations on the type of evaluations that could be evaluated in a metaevaluation study.

A term often associated with and confused with metaevaluation was meta-analysis. Scriven (1980) defined meta-analysis as "a particular approach to synthesizing studies on a common topic, involving the calculation of a special parameter for each" (p. 83). Numerous studies on a common topic were analyzed together to look for trends or significance across studies. This kind of analysis was not a component of this study.

For the purposes of this study, the concept of metaevaluation is based on Stufflebeam's (1981) definition:

The process of delineating, obtaining, and using descriptive and judgmental information about the practicality, ethics, and technical adequacy of an evaluation in order to guide the evaluation and publicly to report its strengths and weaknesses. (p. 151)

The metaevaluation procedures described in Chapter Three and the results presented in Chapter Four are derived using Stufflebeam's definition for guidance.

Since metaevaluation was a relatively new concept, literature on the topic was scarce. In 1974, Stufflebeam reported that:

The state of the art of meta-evaluation is limited in scope. Discussions of the logical structure of meta-evaluation have been cryptic and have appeared in only a few fugitive papers.... The writings on meta-evaluation have lacked detail concerning the mechanics of meta-evaluation.... Finally, there are virtually no published designs for conducting meta-evaluation work. Overall, the state of the art of meta-evaluation is primitive, and there is a need for both conceptual and technical development of the area. (p. 4)

Seven years later Smith made a remarkably similar statement.

There has been relatively little work done to date in the area of meta-evaluation...with most efforts having been focused on the development of formal evaluation standards. The practice of meta-evaluation holds great potential, however, for illuminating the nature of evaluation practice, highlighting the difficulties of performing evaluations, and fostering a concern for excellence in evaluation service. (Smith, 1981, p. 263)

Smith also reported that:

Evaluators have consequently had little practice in conducting meta-evaluations and the literature on the subject is sparse.... The number of actual meta-evaluations is still very small and I know of no comparative studies of meta-evaluation procedures. (p. 266)

Stevenson et al.
reported an "absence of empirical literature on metaevaluation in the human services” (p. 45). These authors also reported that ”the literature on metaevaluation...has focused largely on the methodological soundness of an evaluation as the criterion for its ‘worth” (p. 44). Stevenson et al. noted that in many cases evaluators were interested in evaluating not only the means or methods of an evaluation, but also in examining its ends or outcomes or impacts on the organization or the rest of society. However, examples of these kinds of metaevaluations were not found in the literature. While little work has been done with metaevaluations, authors have suggested guidelines and models for metaevaluations. These authors included Stufflebeam, Cook and Gruder, and Millman (1981). Cook and Gruder presented seven models of metaevaluation based on time of the metaevaluation, status of the data, and the number of data sets involved. These models were best suited for use with large-scale evaluations such as city-wide, state-wide, or nation-wide evaluations of curricula, instructional innovations, or other large-scale programs. Millman presented alternative methods for metaevaluation such as criticism techniques often used in the arts and music. Millman also provided a checklist which could be used to evaluate evaluation programs and/or products. This checklist was based on a similar checklist, the Key Evaluation Checklist (KEC), which was outlined by Scriven in 1980. Heading #18 on Scriven's KEC, Metaevaluation, suggested that the other 17 items on the checklist could be applied to the evaluation while planning, implementing, and evaluating an evaluation. MHllman's checklist asked similar types of questions concerning preconditions, effects, and utility 37 of the program or product and of the evaluation that was conducted of the program or product. Since 1974, Stufflebeam's concept of metaevaluation has become more refined and further developed. As mentioned earlier, Stufflebeam suggested there could be both formative metaevaluations to guide the evaluation and summative metaevaluations to publicly report the strengths and weaknesses of evaluations. Stufflebeam (1981) also stressed that "metaevaluations must be a communication as well as a technical, data- gathering process" (p. 151). He considered metaevaluation to be both a process and a product. Stufflebeam also outlined four categories of evaluation standards that should be used to plan, conduct, and evaluate evaluations. These categories were: 1) utility standards 2) feasibility standards 3) propriety standards 4) accuracy standards The Joint Committee on Standards for Educational Evaluation built upon Stufflebeam's four categories and published Standards for Evaluations of Educational Programs, Projects, and Materials in 1981. This work detailed 30 standards within the four categories and proposed that the standards be used in planning, conducting, and evaluating evaluations. Many of these standards were similar to items on the checklists devised by Millman and Scriven. Included with each standard were an overview, guidelines, pitfalls, caveats, an illustrative case, and an analysis of the case. Baron and Baron (1980) discussed the history of ethics, standards, and guidelines for evaluations and expressed some strong opinions. 
Whereas we feel that basic ethical principles for evaluation should be universal and absolute, we believe that methodological standards should be particular and relative, for when we get to issues of methodology, we are dealing with decisions constrained both by situational realities about what is possible and by the state of the art in regard to new research design, theory and statistical approaches. (p. 89)

As reported earlier, there were few published accounts of metaevaluation studies. Giesen (1979) reported in her master's thesis the results of the evaluation of a particular evaluation model. She established six criteria based on Stufflebeam (1974) and evaluated an evaluation model based on those criteria. Her results showed that with some minor additions the model could be extremely useful and effective. Kennedy (1982) applied the evaluation standards developed by the Joint Committee to a three-year faculty development, curriculum revision project. She reported how the standards were used in four phases of the evaluation project: designing the evaluation, collecting the information, analyzing the information, and reporting the evaluation. She identified those standards which were extremely useful as well as those which seemed to be of little or no value for the individual project phases.

In summary, the literature on metaevaluation was limited, both in reports on how to conduct a metaevaluation study and in reports on the outcomes or results of such studies. Different authors have developed checklists and standards which can be used to plan, conduct, and evaluate evaluations. As these checklists and standards are used, more reports should be generated and added to the literature. This study incorporated a metaevaluation design based on some of the literature just described. This metaevaluation is described in Chapter Three and the results are presented in Chapter Four.

SUMMARY AND IMPLICATIONS FOR THE STUDY

The literature on evaluation of faculty development activities in medical education and higher education, evaluation models, evaluation methodology, and metaevaluation has been reviewed and discussed. This review revealed no single evaluation model properly suited to evaluate the impact of short-term training programs, thereby partially explaining the lack of such evaluation reports for faculty development activities in both medical education and higher education. Based on the review of the literature on evaluation models and methodology, some evaluation procedures and methods suitable for the evaluation of the impact of short-term training programs were identified. These methods and procedures were utilized to design an evaluation framework that is presented in Chapter Three. In addition to the evaluation framework, a plan for the metaevaluation of the fieldtest of the evaluation framework is also presented in Chapter Three.

CHAPTER THREE

PROCEDURES AND METHODS

INTRODUCTION

In this chapter, the procedures used in the design, development, and fieldtest of the evaluation framework for short-term training programs are presented. Based on the literature review in Chapter Two, an evaluation framework, which is referred to as an "optimal" framework, was designed. The optimal evaluation framework is described with its components and options as it is intended to be used.
The program evaluated during the fieldtest of the evaluation framework is described, including a matrix of the evaluation framework as derived from the optimal framework and applied in this particular situation. The matrix also includes the evaluation questions asked during the fieldtest. The instruments and analysis procedures used during the fieldtest are also presented in this chapter. The remainder of the chapter is devoted to a description of the metaevaluation of the fieldtest. The research questions originally stated in Chapter One are presented again, and the procedures used to answer the research questions are outlined.

AN EVALUATION FRAMEWORK FOR SHORT-TERM TRAINING PROGRAMS

The evaluation framework described in this chapter was designed and developed to provide a mechanism for evaluating the impact of short-term training programs on program participants. As defined in Chapter One, the evaluation framework is a set of conceptual components and guidelines to be utilized in the design, development, and implementation of evaluations. The major purpose of evaluations conducted using the evaluation framework for short-term training programs is to provide information to decision makers for planning, implementing, rejecting, and/or improving short-term training programs.

The evaluation framework was designed and developed based on information gathered during the review of the literature. Factors considered during the review and subsequent design included the importance of assessing program impact, the focus on providing information to decision makers, and the need to provide users of the framework as much flexibility as possible. One assumption considered during the review and design was the probability that users of the evaluation framework would not have access to participants in their short-term training programs prior to the beginning of the program. Thus, data could be gathered only during and/or after the program.

The five major components of the evaluation framework are presented in Table 2. These components were taken from the literature reviewed in Chapter Two and are discussed in greater detail in subsequent sections of this chapter.

TABLE 2
COMPONENTS OF THE EVALUATION FRAMEWORK

1. Type of data gathered. Source: Kirkpatrick (1967); Brethower & Rummler (1977); Wexley & Latham (1981). Rationale: Different types of data are required to conduct comprehensive evaluations.

2. Who or what is assessed. Source: Hunt (1978). Rationale: The object of evaluation efforts must be identified to facilitate the process.

3. Source of data. Source: Patton (1980); Cronbach et al. (1980). Rationale: Multiple sources of data help provide more comprehensive, reliable evaluation information.

4. Method of gathering data. Source: Bryk (1978); Posavac & Carey (1980). Rationale: Different methods of collecting evaluation data should be used depending on the situation.

5. Evaluation questions. Source: Grotelueschen (1980). Rationale: Sample evaluation questions facilitate the formulation of questions for specific settings and programs.

1. Type of data gathered

This component of the framework sets it apart from most of the existing evaluation approaches, models, and frameworks. Based closely on Kirkpatrick's four criteria for evaluation, the three types of data to be collected when using the framework are:

1) Reaction (satisfaction) data
2) Cognitive (learning) data
3) Behavioral (performance) data

Most of the research previously cited addressed only one, or at most two, of the three types.
Reaction data were the type most frequently collected from participants. Self-reports of cognitive or behavioral change were also used for evaluation purposes, but there was little evidence of the use of objective measures of cognitive or behavioral change.

It is important to recognize that favorable reaction to a program does not assure learning. All of us have attended meetings in which the conference leader or speaker used enthusiasm, showmanship, visual aids, and illustrations to make his presentation well accepted by the group. A careful analysis of the subject content would reveal that he said practically nothing of value--but he did it very well. (Kirkpatrick, 1967, p. 96)

Participant self-reports are the most common method of measuring change in management training programs. Unfortunately, most program evaluators and researchers believe the self-report to be among the least accurate and least consistent forms of measuring participant change. (Mezoff, 1981, p. 10)

2. Who or what is assessed

The purpose of the evaluation framework is to assess the impact of a short-term training program on its participants. This component is concerned with identifying who or what is assessed in order to determine program impact. The information gathered with the framework is ultimately concerned with assessment of the content, activities, and resources of the program for the purpose of making decisions, and is only secondarily concerned with the aptitude and achievement of the participants. However, it is often necessary to assess the participants in order to obtain accurate program evaluation data. In the framework, the participants are the object of cognitive and behavioral data collection, and the program is the object of specific reaction data collection.

3. Source of data

There are numerous potential sources of data which provide information about the participants and the program. Ideally, all those individuals in a position to comment on changes in participants resulting from the short-term training program should be considered as data sources. Minimally, the participants, program faculty, and supervisors of the participants should serve as sources of data about the participants and the program. Other possible sources of data include the participants' subordinates, peers, students, clients, family members, or others with whom the participants interact while applying the skills and techniques learned during the program. If the program content includes activities related to the creation of certain products or materials, it would also be possible to use examples of products or materials developed by the participants as sources of data. Cronbach et al. (1980) and Patton (1980) were among the authors who suggested the use of multiple sources of information when conducting evaluation studies.

4. Method of gathering data

The data-gathering methods used within the framework may be quantitative, qualitative, or both. The methods need not be the same for each type or source of data. Various methods may be used to collect data to answer a single evaluation question, or one method may be used to answer more than one evaluation question. The work of several authors (Baron & Baron, 1980; Bryk, 1978; Cronbach et al., 1980; Patton, 1980; and Posavac & Carey, 1980) provided the impetus for including this component in the framework.
Evaluation methods that may be used to evaluate short-term training programs include, but are not limited to, the following:

1) Interviews
   - telephone
   - personal
   - group

2) Questionnaires
   - semantic differential questions
   - open-ended questions
   - checklists

3) Tests
   - multiple-choice, true-false questions
   - essay, short answer questions
   - oral
   - simulations

4) Direct observation (live, videotapes, films, audiotapes)
   - checklists
   - rating scales
   - narrative accounts
   - diary

5) Participant and staff self-reports
   - checklists
   - rating scales
   - narrative accounts
   - diary

5. Evaluation questions

General evaluation questions are presented within this component to help guide the use of the framework. The inclusion of this component was prompted by Grotelueschen's (1980) use of similar questions in his evaluation model. The questions presented in the matrix of the optimal framework in Table 3 are examples of the types of questions that might be of value to users of the framework. The specific questions used in the fieldtest are presented later in this chapter.

MATRIX OF THE OPTIMAL EVALUATION FRAMEWORK

The five components of the evaluation framework have been arranged in a matrix in an attempt to represent the framework graphically. Within the structure of the matrix, different individuals, activities, and elements were placed in appropriate cells. The following abbreviations are used for elements identified in the matrix:

P - Participant
F - Faculty (program)
S - Supervisor of participant
S-R - Self-report
STP - Short-term training program
O - Others related to the participant (subordinates, peers, clients, students, family members)
Q - Questionnaire
I - Interview
VT - Videotape
DO - Direct observation

The elements displayed in the matrix in Table 3 depict an optimal configuration of the evaluation framework.

USE OF THE EVALUATION FRAMEWORK

Suggested guidelines for the use of the evaluation framework for short-term training programs are outlined in this section. Options and requirements to consider when operationalizing the framework are discussed. One of the requirements of the evaluation framework is that all three types of data--reaction, cognitive, and behavioral--be collected. To assess program impact on participants, it is important to assess cognitive and behavioral change in addition to participant satisfaction. Kirkpatrick (1967) is a major proponent of using multiple evaluation criteria or levels to assess the outcomes of training.

Another requirement is that the program and participants are assessed to gather reaction data, while participants alone are the object of cognitive and behavioral data-gathering activities. The ultimate objective of the evaluation is to evaluate a program's impact, but to conduct that evaluation it is necessary to evaluate participants as well as the program.
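Although the framework itself is conceptual rather than computational, a reader may find it easier to see how the five components fit together in a single concrete layout. The minimal Python sketch below is one hypothetical way a user of the framework might record an evaluation plan; the entries, sources, methods, and sample questions are illustrative assumptions, not the contents of the matrix in Table 3.

    # Hypothetical evaluation plan organized by the framework's components:
    # type of data, object assessed, sources of data, data-gathering methods,
    # and sample evaluation questions. All entries are illustrative only.
    evaluation_plan = {
        "reaction": {
            "object_assessed": ["program", "participants"],
            "sources": ["participants", "program faculty", "supervisors"],
            "methods": ["questionnaire", "interview"],
            "sample_questions": [
                "How satisfied were the participants with the program content?",
            ],
        },
        "cognitive": {
            "object_assessed": ["participants"],
            "sources": ["participants"],
            "methods": ["test", "self-report"],
            "sample_questions": [
                "How much of the program content did the participants learn and retain?",
            ],
        },
        "behavioral": {
            "object_assessed": ["participants"],
            "sources": ["participants", "supervisors", "others"],
            "methods": ["direct observation", "interview", "self-report"],
            "sample_questions": [
                "Which skills taught in the program are being used in the work setting?",
            ],
        },
    }

    # One requirement of the framework: all three types of data are collected.
    assert set(evaluation_plan) == {"reaction", "cognitive", "behavioral"}

A layout of this kind simply makes the framework's requirements explicit; the actual cells of the matrix would be filled in for the specific program being evaluated.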
[Table 3, the matrix of the optimal evaluation framework, and the tables and text of the intervening pages are not legible in this copy.]

TABLE 11
MEAN SELF-RATINGS OF EXPERTISE*

TOPIC                                          SEPTEMBER    MARCH
Elements of group development                     1.88       3.28
Clinical teaching technique                       2.67       3.82
Role of clinical supervision                      2.30       3.82
Constructive feedback in clinical education       2.70       3.64
Principles of learning and motivation             2.50       3.21
Teaching psychomotor skills                       2.44       3.71
Producing audiovisual materials                   1.83       3.18
Presentation skills                               2.25       3.79
Asking and answering student questions            2.45       3.39
Perspectives in learning                          1.63       2.11

* 5 points possible
The highest rated topics prior to September were clinical teaching technique, principles of learning and motivation, and constructive feedback. The lowest rated topics prior to September were perspectives in learning, producing audiovisual materials, and elements of group development. When they were interviewed in March, the participants were asked to indicate their current expertise in the topics. The top rated topics were presentation skills, clinical teaching technique, clinical supervision, teaching psychomotor skills, and constructive feedback. The lowest rated area in March was perspectives in learning. Statistical analysis was not conducted on these results, but the results are meaningful when compared to other results presented in this chapter. A discussion of these results is presented in the following summary of the cognitive data collected during the fieldtest.

Cognitive Data - Summary

An analysis of the cognitive data gathered during the fieldtest revealed some notable findings. Although not statistically measured, there was an apparent relationship between the test subscales, self-reports of expertise, and self-reports of handout use. There was also an apparent relationship between these cognitive data and a portion of the participant reaction data. For example, the fellows scored best on presentation skills on all three tests. Presentation skills was also the topic that ranked third in both ratings of expertise and handout use. It was also one of the most favorably received topics according to the satisfaction measures. Similar results across cognitive measures were found for the topics teaching psychomotor skills, clinical teaching, and producing audiovisual materials.

On the opposite end of the spectrum, the session presentation on perspectives in learning fared poorly on all measures. That topic received the lowest ratings of expertise and handout use, and the participants performed worst on that subscale on the pretest and delayed posttest, and second worst on the posttest. The topic also received more unfavorable reactions than any other.

In summary, there was high agreement across the cognitive measures at the top and bottom of the rankings. There also appeared to be a strong relationship between the cognitive data and reaction data for those topics at either end of the spectrum. The cognitive data also indicated that the fellows performed better on the posttest than on the pretest. Although delayed posttest results were lower than posttest results, the participants performed better on the delayed posttest than on the pretest. Thus, there was a change in the participants' knowledge, and the change was sustained over a period of six months.

Behavioral Data - Questions and Results

9. What types of skills and techniques did the participants use following the completion of the September session?

The participants were asked during the telephone interview to describe what specific knowledge or skills they were able to use in the six months since September. All 14 participants reported they were able to use some of the knowledge or skills learned during the September session.
"I'm really utilizing a combination of almost all Of them in my teaching in the clinic and the section on group development in my involvement in committees and other groups." Those topics which were identified at least five times are presented in Table 12. 85 TABLE 12 KNOWLEDGE OR SKILLS USED SINCE SEPTEMBER TOPIC/SKILL FREQUENCY OF RESPONSE Presentation skills 11 Audiovisual production 9 Clinical teaching 9 Teaching psychomotor skills 7 Clinical supervision 6 Constructive feedback 6 Group development 5 lil. What types of techniques did the participants expect to use in the next six months? Another question from the telephone interview questionnaire served to gather the data to answer this evaluation question. All 14 fellows expected to be able to use content from the September session in the next six months. "I'm going to have a quarter time teaching position starting July and I'll be precepting then.” Two of the participants reported their present situation did not provide many teaching Opportunities, but they were expecting to be doing more teaching in the future. ”I hope that in the next six months that I'll be in a position of doing more teaching.... My present situation just doesn't have very much opportu- nity to use a lot of these things." Topics which were identified at least five times are presented in Table 13. 86 TABLE 13 KNOWLEDGE OR SKILLS TO BE USED IN THE NEXT SIX MONTHS TOPIC/SKILL FREQUENCY OF RESPONSE Clinical teaching 8 Presentation skills 8 Group development 7 Teaching psychomotor skills 7 Audiovisual production 7 Clinical supervision 5 11. How did the supervisors perceive the participants' ability to apply the content of the session? .A question from the supervisor interview asked the supervisors to describe any new knowledge or skills the participants had used since September. The supervisors' responses indicated that all 14 fellows used some new knowledge or skills. The use of clinical teaching skills and presentation skills were reported by five supervisors. ”A couple of weeks ago I heard him present a talk. I think it was fairly evident from just watching him that he had picked up some skills in presentation of lectures." Four supervisors noticed fellows using skills related to group discussions. 12. 'How did the participants rate their own performance in a series of three completed simulations or presentations? During the interview the fellows were asked to rate their overall performance in three completed activities. A scale from one to five was used with one serving as the low end of the scale and five signifying the high end. The results of the ratings are provided in Table 14. The participants rated themselves highest in the clinical teaching simulation 87 and proposed research presentation. The lowest rating was for the prac- tice teaching exercise. 13. How did the participants rate their expected performance in a repeat of the series of three simulations or presentations? After the fellows assessed their overall performance in each of the three completed activities, they were asked to assess their expected performance if they were to repeat the experience. The same five-point scale was used and the results are presented in Table 14 with the results of the original ratings. 
TABLE 14
MEAN SELF-RATINGS OF PERFORMANCE*

ACTIVITY                          COMPLETED    EXPECTED
Clinical teaching simulation         3.41         4.11
Practice teaching exercise           3.23         4.19
Proposed research presentation       3.45         3.88

* 5 points possible

Although not statistically analyzed, the results showed positive change for each of the three activities. It was interesting to note that the least change was predicted for the presentation activity, even though it was rated highest initially. Comments made among the reaction data indicate a desire by some fellows for more practice giving presentations. Perhaps this desire was reflected in the ratings.

14. How well did the participants utilize the content of the session related specifically to presentation skills in two different presentations?

To answer this question, videotapes of two presentations given by the fellows were rated by trained observers using a 16-item videotape rating scale. The first videotaped presentation (VIDEO1) was completed
Three of the fellows performed at a substantially lower level than the rest of the group according to the program directors. Despite the poor performance of certain individuals, the program directors still concluded that this group was superior in the area of presentation skills to fellows of previous years. “When you look at all of them, they're 90 still a bit better than past groups. A couple are clearly low, but there's a couple of those in each group.” Since the presentations were given in September, January, and May, the program directors were able to Observe change over time. "It was easy for me to see changes from September to January for four of the fellows.” One program director indicated that 502 of the group had improved from September to May while the other director was more opti- mistic and suggested that as much as 701 of the group showed improvement. ”I saw a lot of them make attempts to use overheads and some organization that I had not seen before.” 16. Did the supervisors perceive any change in the participants' teach- ing behavior due to the session? Two questions on the interview addressed this evaluation question. The supervisors reported they had Observed 11 of the fellows teaching since September. The frequency of observations per supervisor ranged from once to five times or more. The types of teaching most frequently observed were clinical teaching, presentations, and group discussion. The supervisors also indicated they were able to judge from these observations whether the participants' teaching behavior had changed. Five supervisors noted that the fellows were more comfortable and/or confident in their teaching. ”He's more comfortable teaching, especially lecturing.” Three supervisors noticed greater organization in the teaching of the fellows. "He seemed to be more organized. I think he was conscientiously and consciously using some approach and technique, particularly with facilitating discussions.” Two of the supervisors attributed ' the changes in teaching behavior to the Program while two others felt that maturity and the passing of time may have been partially responsible for the change. 91 17. Did the participants perceive any change in their own role or func- tion in their home institutions due to the session? The participants were asked during the telephone interview if their role or function in their organization had changed due to their partici- pation in the September session. Seven of the fellows reported a change of one sort or another. "I think the biggest change has been in my self- image as a teacher. I didn't give any credence to thinking of myself as a teacher, but after the September session I felt like it was legitimate to think of myself as a teacher and developing as a teacher." Four fellows reported they were trying to do more teaching or had changed their teaching style. "I've tried to do more teaching and have changed in that regard. SO I think the way that I'm looked at by the faculty in my residency is slightly different." No fellows said they were not involved in teaching at the moment. "I'm not in a position where I'm doing much teaching.” 18. Did the supervisors perceive any change in the participants' role or function in their home institutions due to the session? The supervisors provided information during the interview that responded to this evaluation question. The supervisors reported that ten fellows had changed their role or function since September. Two fellows were serving as coordinators of preceptor programs. 
"He's more involved. He coordinates my preceptor program and my community health and psychia- try clinical rotations. He has taken a much more active part in the clinical teaching." Supervisor comments also indicated that two fellows were more involved in research since September. ”He has become more active in working with residents and in surveying patients as a beginning of a more active research activity.” 92 Behavioral Data - Summary The various methods used to collect behavioral data indicated that the participants, as a group, were using a number of the skills and techniques presented during the September 1981 session of the Program. There were occasions when data collected from one group were corroborated ‘by data gathered from a different group or with another procedure. For example, the supervisors reported that the fellows were better organized in their teaching. Organization of presentations was one of the characteristics of the fellows' presentations as identified by the videotape ratings. The information provided by the participants related to the know- ledge and skills they used in the six months was supported by the reports of the supervisors. The participants reportedusing knowledge or skills related to presentation skills, clinical teaching, and group development among the seven topic areas they identified. The supervisors commented that they observed the participants using skills related to these three topics. The program directors also noted the participants' use of some presentation skills during their presentations given during the fellow- ship. Thus, there was agreement among sources of information for this particular aspect of the evaluation. The participants all indicated they had improved as teachers due to the session. The supervisors were able to make a similar judgment about 11 of the 14 fellows. The program directors concluded that a majority of the fellows improved their presentation skills over the course of the fellowship, but the directors were extremely disappointed with the per- formance Of three individuals. The three individuals also ranked near the bottom of the group on the videotape rating of the two presentations. 93 The discrepancy of views on the relative improvement Of the three fellows is discussed in Chapter Five. In summary, positive results were identified via the behavioral data collection procedures. There were occasions when results based on one data source or procedure were confirmed by another data source's or method's results. There was one occasion when there was disagreement across measures and procedures. In the following summary of the results of the fieldtest, the results for all three types Of data collected are compared and major findings are presented. SUMMARY OF RESULTS OF FIELDTEST .A vast amount of information was collected to answer the 18 evalu- ation questions of the evaluation conducted during the fieldtest of the evaluation framework. The important aspects of the results have already been presented within the categories Of reaction, cognitive, and behav- ioral data. The major findings of the fieldtest are presented in this summary. The relationship of the results for the three types Of data gathered are also considered. The strengths and weaknesses of the content, instructors, and activities of the September 1981 session of the Program were identified .and agreed upon in most instances by the participants and program direc- tors. NO major disagreements were evident in the reactions of these two groups. 
The supervisors also provided favorable reactions to the Program although by necessity their comments were limited in scope. Moderate to high satisfaction was reported by the 14 fellows, 2 program directors, and 11 supervisors. 94 Among the results of the cognitive measures, there was an apparent relationship between participant test performance, self-reports of expertise, and self-reports of handout use. There was also evidence that suggested this relationship extended to the degree of satisfaction with the topics as well. The cognitive data also demonstrated that there was a meaningful change in the cognitive knowledge of the participants. This change occurred over the course of the two weeks and was still present six months following the session. The behavioral data indicated the participants were using a number of skills and techniques related to the September session. Support for this finding was provided by data originating from all three sources of information and from the different data-gathering methods. However, there was disagreement across sources as to the number of fellows who made noticeable improvement during the Program. The objec- tive assessments of the program directors and the videotape rating results did not coincide with the subjective assessment of the fellows and the objective views of the supervisors. This discrepancy is dis- cussed further in Chapter Five. .Across the three types of data there were some notable trends that deserve identification. The practical, concrete, skill-oriented presen- tations were most highly enjoyed. The content of these presentations was referred to the most, used the most, and used and learned the most successfully (according to cognitive test subscales and self-reports). The opposite was seen for the session on perspectives in learning which was viewed as theoretical and irrelevant. 95 In most instances, the information provided by the participants on different measures administered at various times over the nine-month period was consistent to the point of redundancy. There was sufficient reason to believe the data provided by the participants on the reaction measures and self-reports were reliable. The validity of the self- reports is discussed in Chapter Five. The data provided by the partici- pants, program directors, and supervisors also were in agreement in most instances. There was an apparent relationship between the performance of the participants on the three cognitive tests and the videotape ratings. Those individuals ranked at the top of the test results were likely to be highly ranked for the videotaped presentations. Similar results were found for those individuals who scored poorly on the cognitive tests. Scores and ranks for the fellows are presented in Table 16. The results presented to this point have demonstrated that the September 1981 session Of the Program was successful. The session participants learned new cognitive content, utilized important presenta- tion skills in videotaped presentations, and were able to use knowledge and skills learned during the session in their teaching activities in their home institutions. The September session appeared to have impact on their cognitive knowledge and behavior. RESULTS OF METAEVALUATION OF THE FIELDTEST After the results of the fieldtest of the evaluation framework were collected and reported to the program directors, the metaevaluation Of the fieldtest was conducted to collect data to answer the research questions stated in Chapter One. 
The summarized results of the 96 onmoaonunoo no: one oaouomoo .ouo 00 oaowmmon .muo 0N— : a «as a e 0.N¢ n m.N< 0n m.Nn m m.oN m n.~n on n 0.0w 0 0.~¢ n m.N0 Na Nm 0 n.0e an on 0.0: «as «as Nn 0.nn 0n n.0m m 0.0: Na c 0.Nq H 0.~m m n.n0 n m.m0 N 0.0m NH 0 0.~¢ e n.ne 0 0.~0 a 0.00 0n 0.~¢ 0n 0 0.80 m 0.0: e 0.00 N 0.NN N 0.0a 0 N m.N< Nd 0.0m w 0.~0 a n.0N m m.~m w an n.0N 0H m.om fin n.0m Nu 0.Nn w n.mq N «N 0.nN Nn m.Nm 0 0.m0 m n.0N Na n.0n 0 n 0.Nm 0 n.~e N 0.MN N m.~0 N 0.Nm m Nn n.0m o m.~e n 0.nN 0 0.nN Mn 0.0m e 0 m.~e N m.0e N m.~0 0 n.00 m 0.0: m n 0.Nn w 0.~e an 0.0a on n.0e 0n 0.~o N on 0.0a mu 0.~N on 0.em an n.0m on m.NN N uzmm «eN0m0H> uz M2m soozimOIvnm onn 00o .wonmoanooo onn .oHHoo ozone onn Ann: .nnONNo nmooH ozn now oofinman0uaa noon osn 00h mo>uw noon onsvoo Iona noon n00 nooHoo on nnonm :00 :ON .nwomo no>0 moo no>0 noon 00% noes woaanwuooo nmoh ow wawnnhno>o NH .oawn Hownnaw onn moan Iwnmsfi Naonnoqwov mo: n« .stno o>Nososonoaoo onoa o no: nn omsooom .o>«oo>=w noon no: onos HHono>0 monsvooono onn non .aonwono osn oauonso oNHoo ooonooaon osn oxNH owaann m0 nON o one no» .nmonnoom vohoaoo 00o .Inooo .Iono osn a.“ counoooxo oHoNooom N13 o5. 23o no :02: oeonnoa no: 020 No.3. .o>o: n.00no non... o3 noon onov gossamer oaoo oegonm one Noon non .nooN Nno>o monovooono ooonn «0 :o wsnnoomon onnoa on no: 3003 nH .Hoov nonouooo 0 non ewe o3 noon moonoooon osn wanna an oonmnnosn nNon o3 mmmzommmm .mMOHUNMHG zov nnosn on oo>a0>0u nnoumo 000 oann onnxo onn oonmnnoan noon onnsoon vHoHN 0n noonoo mnooa Isnnoon oaonnnaa NO on: onn on: Naoaqaaa o On noox no: ooanoanmnv aonmono noon on mononmnofiavo monsoooono connooao>o onn ono: Nvovaooxo moonsooon oon Nufinmsfi 0n ooao> noowonmwsm N0 sonnoanONOH ooovono monovooonm connooao>o onn on: .m .N .N monHmmbo OHmHummm .vovsooxo moonsomon onn Nunnmofi 0n o=Ho> noononmmso No sownoan0mon oosoonq oasonm sonnoOHo>o onH .ooononno on can sonnoan0ucn voooo: noon 00o .aoansaa o On noox on connoonono noon on .Hoownoonm on vaoono monnvooono aonnosao>o oss Nooonoooon no on: onn on Hoonnoono anoaoaonm monnosao>o onn mos Na zennmnpo momN "an onnmnno momaom|aoan0n0 00o connoonaooo onoa oon>0n0 0n commune coon onoa 693 00.303 .oaonn noon onn NO Ho>oH o>nnnow00 onn 950028 o>os 0033525 onn 3000 30: .m .eouunoaasm onoa oHnnHH o wannnoaom ooo 0n monoma o>oe nswaa o3 non .stooo no: nosn sonnosnowsa Honou>nvnu oon nom .oonnoanONON on“; monounsan nHom o3 moann onos onoca .Houoao: Nno> soon o>oz canon noon nsoasonn>0o 030 nnonn on manna» Nobosx onn m0 oaom wauanomnon aosn oo>noon0 Nada—now o>os 0300 o3 m.“ 96: on ox: 3003 00.» noon n00 nan .onoooono Hoonooa one: xnoz 00o monsnooa 00 Noon 0: anon Nona. nmoH wsnenNoo onoon 0oz No.50 .wsnnnom nwonn 0.“ 03.30 onn mafia: 030:3 onn mo oonnonsoaooov Isononoaoo 0:0 onoaoaoo vo>nooon oaoo no: manned: mos noon mounds—noun." Naco onn unannom 00% noon conned—noun: 2.3 no; .N Nsonwono onn nsooo .2. On an: :ON non—n moonnoosu 0:3on Hana—Hon Nno> mos no.3. .ono: ooonnmno onos o3 noon n0 owonon newnn oonoaoao noon moans—noun.“ out, onn mawoo onoB o3 nofin m0 vHOn noon once on: o3 302 .owa nw .mow Iona xnozoamnm sownosao>o onn on: .N mmmzommmd . much—.UMMHQ 2550mm mzonnmnao onnnonnm .moocoueoo vonmnooom m0 mnmononma woo mvooo onn on o>wo000mon on moo nonno0H0>o onn m0 noohno onn noooo maownooav nsoounnoo moonovo on no ohms some on vonooaoo 000 ooooo noon «0 on vasono vonooHHOO nonnoanousH unmN Nmnoxoa sonmwooo osn 0n moans—flown: monounrono 0N Manon: xnosoaonw 0033525 onn an: “my zonhmmbo 532mm: ma zonnmmeo mueon o3 nos? won 13%? 
noon NHHoon eon Noon w.“ onnanonov 0n 0: now noaooo on 01.03 mmmzommmm . mMOHUmmHn 25533 A.u.naoov nN nnn onn «woman on :0» now Hannov .oooooon oaooz Honmono onn nnws nonnnsom Naonoannon n.0on on: nwoooo on vonnnoooo onooaonnoon oaooaoo son soon n.oov or non .nn nnwz nonnaaou on.o3 oooooon .oow_ wnnnonnomlnonnosnounn onn ono: .N Nooeu>0no Nonn nonnosnownn onn no Nnnvnao> onn omoooo on 00» now anonov nmoooo an oonnnoooo .oo» monnmsnomon mo moonsoo onn onoa .n mumzommmm .mMOHUMMHQ zo no on vonooon moonmsaoaoo ona .onounononononnn oHnonnoaoom onsmoo 0n nonhuman NHHoonnmaonon moo Naonownoonono on vasonm connosflo>o no on coanoanouou o>nnonwao=v moo o>nnOnNnnoso .oo: oovnoncw onn now oanouaon Nannouonmuoo one anno> on no vo>nnno oownononononon onn nonn onaooo Haas nonn mNoa 0H vonaoaoaoaa nonn one voooao>ov n0 nooono on vaoono monnvooono moo mnooaonnonn wounonnowloonnoanONGN one .vooooouo on can nonnoanonnn onn no Nooowovo onn nonn om anonov :« connnooov on masons connoanomnn no moonooo one Nonosvovo NHHoONnnoon anosoaonm nonnosao>o onn no noonoaonu onn wounoe vow: oncoaonnman woo moonnoa onn onoz «a zonnmmaa momm "0* onHmMDO mum oHnnNH 0 now on monon nH .mNoNHooo 0n onov nonn No No0 noofinoo coo :0» mo nose no .mo» .nn wnwnmaono no nnomuo 000w o oooa :ON .00» .oow .coo 00h mo NnNHNnoNHonoa onn 00 none mo won>0aon no nnomwo 000» o ovoa 00w .nnwz o>nH 0n o>on ooh unannoaoo o.nonn oann oaoo onn no non .Nnnannozonoo now owaooe onn on non... soon «0 nOH o o.onon.a mmmzommmm m.mmoaomMHn =o onn on vonoomono moonooaoaoo onn onoz Neonmaono Naaooanoaonomm 00o Naonounoonooo onos onov o>Nn IonNHoov onn nonn noomoo n« on: NvouNHooo NHHoonnoaonoN0 00o NHonoNnoonooo onoa onoo o>nnon Iwnnosv onn nonn nooooo nu on: NNHHoonnoaon loam vonNHooo woo monooaaoo onoa onoo onn nonn oooovn>o ononn no: Noova>onm Nonn onaooon onn mo unadanownon onn ouoooo 0n =0N non anonoo nmnooo an oonnnomov onaoasnnoow wnqnonnomlnonnoanounu onn onoz .N .0 .0 .m mzonnmmaa onnnonmm 111 .owoowo oowwwn 0o moo ononn woo Bonn nnfia ooEnn HHo no noonw 00 onoa :0» .oo» mmmzommmm .mMOHUMNHQ zo onn no: .N Noonnooao>o onn no ooonnonHENH onn wonwoaoon .owonwoqu nooonnnoo no onoooao Ionw on« on noooon woo .noonnw .oooo nnooon oonnooaonro onn no: .n mzonnmnao Onmnonmm .wonoonono woo wonooooon ono onoofinoo ooaon onn no onoufioa woo onnwnn onn nonn oo .wonoowooo woo woownoow on wHoono ooonnooao>m .oonnoono>o onn no ooonnonaana onn monwoaoon .owonwonm nooonnnoo mo onooOHoon nnonn on noooon woo .noonnw .oooo on waoono onnooon oownooHo>o oonnnna woo Hono "mamM Nnooooa Hoonnno oo on wonoowooo nnoaoaonu oONnooHo>o onn mo noonwaowm onn moanow woo: onooaonnooa woo owonnoa onn ono: "0* ZOHHmNDU momnw Nonn onons ooo woo ooooanownoo nnonn o>noon0 Nanoonaw o3 onona ooo .nonnoNOn wownnoa onooaonnoon oan noun on won o3 NH“ .no JOOH on non: ono ooounooao>m nooz1u0Iwom onn .Nnnfinnonooooo HNono>0 no osnon on .Hoow noonw o o: wooaon onow nonh .oownonooo OnoN noo 30o Nonn ooo non: .ownOB nonno oH :NoHnonnowooonn wonow on.o3 non: on oonwow non: on... ooo on" wonoononon onot o3 nonn oONnooov Onoon one .ooonnonoooono onn .ooooanounoo Hoonoo onn no owonnon on“. 
.onooonownnoo ona .onow nooHHOO sonoaoo 0n Nnnoonnoooo nonn ooo wHooo o3 .nnoumo oONnooHHOO onow o onon nanoaonom mounow ooo onn woo Iona onn now ooo onn .wonnnoo o On oxoa o3 onNon> oan onn ononoonooon wdooo o3 .ooonooo nonno aonm onow onn now ooo o3 nonnn o3 woo .oaan onn .nooo onn m0 manon on noono ooonnooou oaoo o>on o3 nonn 93on>nonoa onn o.nH .wowwoannow Hoonu onn woo owonnon woo ooownonoooono ooonoowg onn 0w wan—03 o3 .onooooa o>annowoo mo oomn noonommnw o on On nonn noo: waoos o3 nwoonnao .noonnooo woNoHow .Inooo .Iono o>nnwomoo onn .ooONnooao>m noo3INOIwom onn ooo wnoos o3 .nonoonww noonowooo n0 nonoonnw onn oonn nonno ooooaoo now no“ o on nH mmmzommmm .mMOHUmmHa z nooa no oonnoanomow onn wown>0no wonnoa ocunooao>o nOan .0 N00» on ooHo> nooa mo oOHnoonONoN onn woww>0no oonooo oONnooHo>o nownz .m Nooow on wHoono nonn ooow noo ooo non: .N Nonomo nn 0w 0n onoa oON an Nanoonommnw 0w oON wHoOB non: .n 20:38 8.385 Naonwonm onn 0n woNHooo oona oonnooom 3n03oaonm oonnooao>o onn waw Naos 30m ”onammao 44mmzmo mmmzommmm 92¢ mzowfimmbo ZOHHmooon o3 .onow Nnon loanNNooo wowN>ono woo noo» nnnoON onn on ooon oounooHo>o noo» woo onooh oonnn n0N onnn oaoo onn wonow ooon o>on o3 woo sonmono onn N0 nooN nnNNN onn on oNnn non .noonnooaN o.nN HooN n.o0w o3 oooooon noo o.nH .owohfiooo Noo wonow N0 noN nnwoonn 0o .aonn wooonoown> o3 .onnoa> onno onn 0w 0n noo ow o3 oona oonnoonONoN nonnow o3 son nooNNo Nos n.“ non .noN oaofiwnonoa 9730.33 onn N0 Noo 0w 0n woaow on.o3 NH ooonowoow Noo owoa noo o>on oz .wonoaooo ono: ooonnooou noo oooooon NNNnoaNno .noN ooow o>.oON nonn owofinn Hooonnnwwo onn N0 Noo wonoooonoan n.oo>on o3 .NNon 0n Nanoo oNnnNH o on N93 nH .onow N0 oooNn noonoNNNw ooonn N0 nooo nowoo oonooooa Nooa 00n won o3 onNoz .oONnoonONoN nooa 00n on oonon o>on nnmao 03 .o>Nooononoaoo on on noaonno oo o« nonn ooo oooonooz onp. .ooanoonONoN N0 oooNn noon oonnn onn ono nona oo o« ooon 0n wowoooa 00» .33 oNnn on woow ooo nonn woo Nooowoowon oaoo ooo ononn woo .owonnoa oONnooHHOO onow noonoNNNw .onow N0 ooonooo noonoNNNw N0 NnoNno> o woo: nn woo ooooonoo o>NnooNNo woo o>NnNowoo nnon noono onow nooHNOO on nooonno oo onoa wNw nN nonn on nnwoonno one .nnooon .onoou>noooo onn woo onow o>NnNowoo ona o “adv OONNQEONHON mmmzommmm .mMOHUNMHn Zo onn N0 nnwoonno HHono>0 onn ooo non: .N Noo NHon noooH :ON wHooa onow N0 ooNn nuanz .0 No0 Naon ooN wHooa onow N0 oohn nonnz .0 $82.38 238% 114 m 4 m NHHAHno I m "Max none I N unnN mo mUZHHHDZH 0N mdmoa wannonnnoe Hanna ozon> InonoN nooN>nooom oBoN>nonoN zoaaom owoNnon ooonoowa> onoon o>nNowoo ooONnooHo>N noozINOIwou nubamoomm 115 0.N 0.N 0.N 0.N 0.N 0.N AN mo mquHO 24M! N.N N.N N.N 0.N 0.N 0.N N.N N.N N.N 0.N 0.N 0.N NUZMHUHNNN mombowmm Moau InonoN nooN>nooom ozon>nonoa BOHHom owownon ooonoown> unmon o>nnnomoo ooonnooao>m noasInoIoan mmbomuomm 116 End-of-Week Evaluations in terms of the utility and credibility factors. Low rating on time and resource efficiency brought down the overall rating for the videotape rating. The cognitive tests were rated low on all factors except data manageability. The interviews of the fellows and supervisors rated medium or below for all factors. The data collected with this procedure are summarized with the information gathered with the other two pro- cedures in the following summary of the results of the metaevaluation. SUMMARY OF RESULTS OF METAEVALUATION The three metaevaluation procedures successfully identified the strengths and weaknesses of the fieldtest of the evaluation framework. 
These strengths and weaknesses are addressed in major findings presented in this summary. A number of the previously presented results also have implications beyond the study and are discussed in Chapter Five. Table 27 contains the five research questions and a brief answer to each ques- tion. The answers are based on the results of the metaevaluation. The additional data gathered during the metaevaluation helped identify those procedures and data that were most useful to the program directors. The three preferred procedures were the End-of-Week Evalu- ations, final debriefing, and videotape ratings. 'The cognitive tests were rated lowest due to the question of the validity of the test data. Among the three types Of data collected, the program directors identified behavioral data as being more useful to them than cognitive and reaction data. However, the program directors indicated that a strength of the fieldtest of the evaluation framework was that reaction, cognitive, and behavioral data were collected using a variety of sources 117 TABLE 27 SUMMARY OF RESPONSES TO RESEARCH QUESTIONS RESEARCH QUESTION RESPONSE 1. What specific problems were The major problems related to the encountered in the fieldtest collection of behavioral data and of the evaluation framework? to the development, administra- tion, and validation of the cogni- tive tests. 2. Was the evaluation framework Yes, the program directors felt practical in its use of re- justified in committing the re- sources? sources required to conduct the fieldtest. 3. Was the evaluation framework Yes, the data were comprehensive useful in providing informa- and confirmed the program direc- tion to the decision makers? tors' subjective assessments of the program's quality and impact. 4. Were the methods and instru- Yes, with the exception of the ments used during the cognitive tests. There were fieldtest of the evaluation reasons to question the validity framework technically of the cognitive test results. adequate? 5. Were the methods and instru- Yes, the methods and instruments ments used during the were conducted in an ethical fieldtest of the evaluation manner. The evaluator was candid framework conducted in an in his interactions with people ethical manner? during the fieldtest. of information and collection methods. A weakness of the fieldtest was that too much of the information collected was redundant and at times overwhelming in its volume. This volume of information was attributed to the number of methods used and to the high proportion of Open-ended questions included on many of the instruments. In conclusion, the results of the metaevaluation demonstrated that the evaluation framework, as configured in the fieldtest, was successful in evaluating the impact of the September 1981 session of the Program. The program directors received valuable information used to make de- cisions about the program and the fieldtest produced results indicating 118 that cognitive and behavioral change had occurred in the group of fellows. SUMMARY OF THE CHAPTER The data gathered during the study were presented in this chapter. First the results of the fieldtest were provided. These results were paired with the appropriate evaluation questions, which were also grouped according to the type of data gathered, reaction, cognitive, or behav- ioral. Tables were used to summarize data when appropriate and summaries were pmovided for each data type. Finally, a summary of the fieldtest results was presented. 
The results of the metaevaluation of the fieldtest of the evaluation framework.were also provided in this chapter. A self-report prepared by the evaluator highlighted problems encountered during the fieldtest and served as a response to the study's first research question. Responses to the other four questions were furnished based on information collected during an interview with the program directors. The responses to additional questions asked during the interview were also supplied. The results of a third metaevaluation procedure, the rating of fieldtest evaluation procedures and results, were also presented. Finally, a summary of the results of the metaevaluation was provided. In the concluding chapter of the dissertation, a summary of the first four chapters is provided. The results of the fieldtest and metaevaluation are discussed and conclusions are drawn. In closing, recommendations are made for further research and implications of the study for educational practice are considered. CHAPTER FIVE SUMMARY AND CONCLUSIONS INTRODUCTION In this chapter the study is summarized. The problem, literature, procedures, and results of the study are reviewed. The study results are discussed and conclusions are drawn. In conclusion, recommendations for further research are suggested and implications of the study for educa- tional practice are considered. THE PROBLEM Short-term training is a popular format used in training and education throughout the United States. One specific purpose for which short-term training programs are frequently used is to improve the teaching skills of faculty in post-secondary education. Faculty development programs exist in a number of colleges and universities, particularly medical schools. The problem that led to this study was that few faculty development programs evaluate their effectiveness. The majority of existing faculty development programs relied on participant self-reports and reaction data to measure the effect of faculty development activities. Evidence of 120 cognitive or behavioral change in the participants was rarely reported. In addition, no evaluation approach designed specifically for short-term training programs was identified. In response to the problem, the study reported in this dissertation was designed and conducted to determine whether an evaluation framework for short-term training programs could be developed and successfully implemented. THE LITERATURE The literature of the evaluation of faculty development activities in post-secondary education and medical education was characterized by a dependence on self-reports and satisfaction data. Numerous authors, including Centra (1976), Stephens (1981), and Levinson and Menges (1979), stressed the importance of gathering objective data on cognitive and behavioral change to assess program impact on participants. The review of the literature on evaluation models confirmed the initial assumption that no single evaluation model existed that was suited to the task of evaluating short-term training. However, numerous concepts and components were identified within existing models that were applicable to an evaluation framework for short-term training programs. The section of the literature review that focused on evaluation methodology contained an inspection of several critical evaluation procedures. 
The evaluation procedures examined included the relative value of using quantitative and qualitative methods and the rationale for using multiple methods, multiple sources of information, and multiple levels of evaluation. Many of the concepts discussed in this section were incorporated into the design of the evaluation framework. 121 The final section of the literature review focused on the process of metaevaluation, a relatively new concept. Very few authors have contributed to the field of metaevaluation and the literature was correspondingly sparse. However, suggested approaches were presented and one approach became the framework for the metaevaluation conducted in this study. PROCEDURES AND METHODS In Chapter Three, the procedures used in the design, development, and fieldtest of the evaluation framework were described. The major components of the framework were detailed and displayed in a matrix. The program evaluated during the fieldtest was described and a matrix depicting the evaluation plan for the fieldtest was introduced. Evalu- ation instruments employed during the fieldtest were described and data analysis procedures were outlined. The chapter concluded with a descrip- tion Of the procedures used in the metaevaluation of the fieldtest. RESULTS Fieldtest results and outcomes of the metaevaluation were presented in Chapter Four. Results of the evaluation of the September 1981 session of the Family Medicine Faculty Development Program (Program) were reported with the 18 evaluation questions that guided the evaluation. Metaevaluation data were paired with the study's research questions and the evaluation standards. The major findings drawn from these results were presented and discussed. 122 DISCUSSION Issues related to the results of the study are discussed in this section of the chapter. The major issues include reaction data, cognitive data, behavioral data, evaluation procedures, and the evalu- ation framework. The five issues are considered in relation to the results of the fieldtest and metaevaluation and to issues discussed in previous chapters of the dissertation. The results of the fieldtest demonstrated the redundancy of the reaction data collected with various evaluation procedures from different information sources. The redundant data were reassuring information since the data collected with the End-Of-Week Evaluations in September ‘were supported by the data collected via the interviews in March and the final debriefing in May. The reaction data gathered from the program directors were consistent with the comments of the fellows in most instances and focused on the same weaknesses and strengths of the session. The redundancy of information was noteworthy in that the reliability and validity of the data gathered during the End-of-Week Evaluations were strengthened by the data collected by the evaluator during the inter- *views. The reliability and validity of the data collected in September and March were further supported by the data collected during the final debriefing in May. The reaction data were also notable in terms of their relationship with other outcomes of the study. There was a relationship between participant reactions to topics and presentations during the September session and their subsequent behavior and performance. 123 The terms ”relevant” and ”applicable” were frequently used by the fellows in their coments. 
The presentations the fellows enjoyed and perceived as most relevant or applicable to their present or future activities were also the subscales on which they had the highest scores on the cognitive tests. The reported use of handouts from these presen- tations and the reported use of skills and techniques related to these topics were also highest among all the topics and presentations. Par- ticipant comments related to the session on presentation skills were also supported by the videotape ratings, especially comments by the fellows pertaining to their improved organization while lecturing and teaching. Among the cognitive results, the issue of the poor quality of the test was the primary concern. Certain logistical constraints, such as lack of time to pilot the test prior to its use with the fellows, were considered previously. Suggestions for improving the test are described below. .A major drawback of the cognitive test was that it did not test the participants' ability to apply or generalize the content of the September session toIother situations. Over half the test items focused on recall of facts, lists, and definitions. The test could be improved substan- tially if more of the test items were rewritten to test application rather than recall of information. Bloom, Hastings, and Madaus (1971) discussed evaluation techniques for assessing application of instruction. Teachers and curriculum makers have long recognized that a student doesn't really ”understand" an idea or principle unless he can use it in new situations. Thus, application is fre- quently regarded as an indication that a subject has been adequately mastered. (p. 159) 124 Bloom et al. acknowledged the difficulty of developing items that measure the learner's ability to apply principles and generalizations to new problems and situations. The posing of new problems and situations is a difficult art in evaluation. It requires the evaluator to find or make new problems and situations within the grasp of his students. It is especially useful if the problems are real ones rather than contrived ones, with artificial or fictitious elements. Students find real problems more satisfying to attack than patently contrived problems, which can seem rather like puzzles and tricks to be solved. (p. 162) Rules for generating application-level test items were suggested by Bloom et al. In short, they stressed the importance that the problem situation be new, unfamiliar, or somehow different from the situations ‘used during instruction. The test item difficulty is determined in part ‘ by how different it is from the problems presented during instruction. The use of appropriate principles of generalization should be required to answer the test questions. The final rule suggested by Bloom et al. was that one or more of the following behaviors should be sampled by each test item. The student can determine which principles or generali- zations are appropriate or relevant in dealing with new problem situations. The student can restate a problem so as to determine which principles or generalizations are necessary for its solution. The student can specify the limits within which a particular principle or generalization is true or relevant. The student can recognize the exceptions to a particular generalization and the reasons for them. The student can explain new phenomena in terms of known principles or generalizations. The student can predict what will happen in a new situation by the use of appropriate principles or generali- zations. 
125 The student can determine or justify a particular course of action or decision in a new situation by the use of appro- priate principles or generalizations. The student can state the reasoning he employs to support the use of one or more principles or generalizations in a given problem situation. (p. 165) The rules proposed by Bloom et al. for developing application-level test items could be used to improve the cognitive test used during the fieldtest. Recall items could be rewritten or new items could be generated. A positive outcome of developing application-level items is that poor results may be attributed to the learner's failure to learn the material rather than the uncertainty induced by the questionable validity of the test items on the fieldtest cognitive tests. The other issue discussed in relation to the cognitive results of the fieldtest is the participants' reports of additional study and subsequent handout use. The most additional study of a topic reported by the fellows was less than half of the group, 6 of 14. The least additional study reported was 1 of 14. Several explanations are considered during the following discussion of the lack of additional study. One explanation for the lack of additional study is that the fellows did not have the time or motivation to do extra reading or attend additional workshOps on the topics of the September session. Conversely, the fellows may not have wanted to pursue additional study because they had learned what they needed to know about the topics and further study was not required. This issue was not resolved by the study, but additional study was not reported by a sufficient number of fellows to suspect that additional study had an effect on the retention Of knowledge as measured by the delayed posttest. 126 Handout use was more pronounced than additional study and may have affected delayed posttest scores. However, the topics for which handouts were used the most were also the topics which introduced the skills and techniques most frequently used by the fellows. The possibility that the delayed posttest scores were affected by referral to handouts was counterbalanced by the fact that the purpose of the September session was to provide the fellows with practical skills and techniques they could use while teaching. If no additional study was required due to partici- pant satisfaction with the content of the session and if handouts were used to prepare for specific teaching activities, then the session was successful in a manner which the delayed posttest could not measure. Several issues related to behavioral data collected during the study deserve discussion. The first issue is the videotape rating scale used to measure the fellows' ability to apply presentation skills in two videotaped presentations. The videotape rating scale was developed specifically to assess the presentation skills of the participants in the September 1981 session of the Program. Handout materials from the sessions on principles of learning and motivation and presentation skills provided the basis for the content Of the 16 items on the videotape rating scale form. Unlike the cognitive test results, the videotape ratings were of high validity and provided the program directors objective information related to the fellows' strengths and weaknesses as presenters. As a result, the video- tape rating scale was much more likely to be implemented by the program directors in the future than were the cognitive tests or interviews conducted during the fieldtest. 
127 The issue of self-report data is considered next. It was suggested in the faculty development literature that self-reports should not be relied upon as the sole evidence of the effectiveness of faculty development programs. Centra (1976), Stephens (1981), and Levinson and Menges (1979) were critical of the use of self-report data in the absence of cognitive or behavioral measures of faculty improvement as teachers. Yet in most instances reported in Chapter Two, reaction data and/or self-reports were used to measure impact of instructional improvement activities. Distrust of self-report data as evidence of behavioral change is not limited to the faculty deve10pment literature. Howard, Schmeck, and Bray (1979) reported: It is axiomatic that, given a choice between a self-report and a behavioral measure of the same phenomenon, researchers will choose the behavioral measure. Likewise, when behavioral and self-report indices of the same construct show substantial discrepancies, it is seen as a signal to suspect the self- report measure rather than the behavioral measure. (p. 129) Howard, Maxwell, Wiener, Boynton, and Rooney (1980) added: The status of self-report techniques in modern research is clearly that of a second-class citizen. Critiques of self- report approaches, representing detours on the road to a truly rigorous scientific discipline, are ubiquitous.... Researchers are advised to employ self-reports only if no behavioral index of a construct exists, such as with dogmatism, or if behavioral measures are too difficult or too costly to obtain. (p. 293) However, recent research has been conducted to determine whether the validity Of self-report data can be increased. A Retrospective Pretest- Posttest Design, similar to the approach used during the interviews with the fellows to determine expertise in the session topics and performance in the three simulations or presentations, was proposed by Howard et al. (1979). 128 This design would simply improve a modification of Campbell and Stanley's Design 4 to include a retrospective pretest at the time of posttesting. This is accomplished by asking subjects to respond to each item on the self-report measure twice. First, they are to report how they perceive themselves to be at present (Post). Immediately after answering each item in this manner, they answer the same item again, this time in reference to how they now perceive them- selves to have been just before the workshop was conducted (Retrospective Pre). Subjects are instructed to make the Retrospective Pre responses in relation to the corresponding Post response in order to insure that both responses are made from the same perspective. Each set of ratings is scored separately to yield a Post score and Retrospective Pre score. The results of a Retrospective Pretest-Posttest Design are still not conclusive, but the selection of the design is an option available to evaluators. Howard et al. (1980) stated, ”The present set of studies demonstrates that some of the evidence traditionally cited to demonstrate the lack Of accuracy of self-reports must be reconsidered" (p. 309). Thus, the evaluator forced to rely on self-reports to gather a portion of the data to evaluate a short-term training program should consider a Retrospective Pretest-Posttest Design to increase the accuracy of self- report data. The issue of accuracy of self-report data is also of importance when considering certain discrepancies between the subjective and objective data gathered during the fieldtest. 
In particular the data gathered for three fellows are discussed, since the program directors indicated their disappointment with their performance. Overall test results, presenta- tion skills subscale results, videotape ratings, program director comments, fellow self-reports, and supervisor comments are described. The fellows' rankings are presented in Table 28. With the exception of Fellow 6, who ranked near the top of the group on the posttest and delayed posttest, the Objective data indicated the 129 three fellows ranked near the bottom of the group on all five measures, including the presentation skills subscale. This finding was consistent with the program directors' assessment of the three individuals. We were disappointed in their skill and motivation level.... This group couldn't apply information, just rote recitation. We were disappointed in their ability to verbalize *what we were trying to teach.... In terms of taking advantage of what the program had to offer, they played around with their projects, nothing will change in their lives as a result of be- ing in our fellowship. The comments made by the fellows and the supervisors depicted a different view of the skills and abilities of the three fellows. The fellows all reported an increase in expertise in presentation skills from September to March in addition to predicting improved performance if the proposed research presentation were repeated. Use of the handouts related to presentation skills was reported by two of the fellows. Supervisors of two of the fellows noted increased organization in the fellows' teaching. The third supervisor reported having had little personal contact with the fellow since September and was unable to judge changes in teaching skills. An obvious discrepancy exists concerning the performance of three fellows. The comments of the participants and their supervisors do not coincide with objective test data, rating of two presentations, and comments Of the program directors. One explanation is that the three individuals did improve in the area of presentation skills as applied in their home institutions. This would explain the discrepancy between data collected based on program activities and data collected based on activities on the job. Another explanation might be that the fellows' improvement during the program activities was so slight compared to other fellows that it was not detected by the program directors or objective 130 MN 0N 0N NomnH> 0N NN mN NONQH> m NN 0N NN N 0 N m 0 N m m NN 0 NN. 0N MN NN «N «N N mNNHMm NN< mNNme 44¢ mANHMm NN< sbdqmm .mmfim Imm>o .mmxm Imm>o .mmmm Imm>o thmfifimom Nfimmafimom Bauhmmm mHz oONno>noon0 noonNn Nomnooo noonnoomInoononm o>Nnooooonnom woNoov onnooonINNom woNNoNnnow Noon ooONnooNo>m nooleOIwom onooouonnnom nonwono woNoNonn ononInnonm Nowonnnoo ooooanONnoo Noonoo n0 wonoNooNo on Eonwono onn N0 noonooo onn NNooo 0n NnNNNno nNonn o>Noonoo onoooNONnnoo onn wNw so: «In N Nowounnoo ooooonONnoo Noonoo no wonoNooNo on aonwono onn N0 noonooo onn NNooo onoooNONnnoo onn wNw NNos 30m H>.00 m Naonwono onn N0 noonooo onn No oonnoonon woo mononooN ooo nNonn o>Noonoo onooononnnoo onn wNw 30m MIm m NoNonon woo onooN onoooNONnnoo onn wNw nonwono onn N0 noonooo onn N0 none 30: mamma m Nonoo0NONnnoo onn onoa oonwono onn nnaa woNNoNnoo 30m 0m.30m m 235.5. 23 $9.25 <03 <20 ozNMmmaN NSF mo meHmm 0m MNQmo tun—NONE mz_0_om2 >I=E<0 155 156 The Family Medicine Faculty Development Program at Michigan State University began operation in July. 1978. 
The program is conducted by the Office of Medical Education Research and Development (OMERAD). in association with the departments of Family Practice (College of Human Medicine) and Family Medicine (College of Osteopathic Medicine) at Michigan State University. This program is supported by a grant from the Bureau of Health Manpower. Public Health Service. Michigan State University's program addresses two major objectives. They are: ' 1. to identify and train new physician teaching faculty for both allopathic and osteopathic family medicine training progams. and 2. to assist existing family medicine faculty in develop- ing and/or refining their pedagogical skills. These objectives are being met through three distinct yet coordinated program components. These components are: 1) a series of teaching skills workshops: 2) a teaching fellowship; and 3) a continuing professional devel0pment progam. Teaching Skills Workshops The teaching skills workshops are being offered to M.D. residents. D.O. interns. preceptors. and other part-time physician faculty who have informal teaching respon- sibilities in family medicine. The purpose of these workshops is to mobilize interest in teaching as a career and to improve the teaching skills of these physicians. The workshops are no longer than one day and focus on specific instructional planning. teaching. and evaluation skills. Each year. a total of eight workshops will be con- ducted for M.D. and DO. physicians in the Michigan area. Teaching Fellowship A teaching fellowship is being offered to M.D. and DO. physicians who have completed or are about to complete a family medicine residency program and to family medicine physicians who are just beginning their teaching career. The fellowship begins in September. and fellows spend one- and two-week sessions at Michigan State University throughout the remainder of the academic year. The fellowship is the equivalent of a three-month traineeship. The goal of the fellowship program is to provide these new faculty members with a proven base of skills in teaching. evaluation. and the management of instruction. Fellows participate in a series of workshops. seminars. and practice teaching situations in real and simulated Who WeAre.... clinical. lecture. and small group settings. A portion of the fellowship program is conducted at the fellow‘s home in- stitution. Here. fellows complete a variety of structured assignments under the supervision of a proiect faculty member. A stipend is available to help fellows defer the costs of participating in the program. Continuing Professional Development Program The continuing professional development program will be offered to existing M.D. and DO. physician faculty with regular teaching responsibilities in family medicine training programs. Beginning in 1982-83. the purpose of this program will be to reduce the rate at which full-time faculty members leave the teaching of family medicine and to provide a forum for the continuing professional development of faculty who cannot participate in the three-month traineeships. The program will include a series of interactive seminars that will allow full-time faculty members to meet with faculty from different in- stitutions to systematically develop solutions to chronic problems in the teaching of family medicine. The seminars will meet approximately ten times during a year in various residency program and family medicine depart- mental settings. Faculty Development Workshop Materials in addition to providing formal training programs. 
the Family Medicine Faculty Development Program at Michigan State University has developed a series of eight self-standing mediated faculty development workshops. The purpose of these workshops is to assist family medicine departments and residency programs in con- ducting their own faculty development programs. Each workshop package contains all the print and audiovisual materials necessary to conduct the workshop. A detailed administrator's guide explains all steps necessary for plan. ning. conducting. and evaluating each workshop. For additional information about the activities of Michigan State University's Family Medicine Faculty Development Program. please contact: Dr. William A. Anderson Office of Medical Education Research and Development A-209 East Fee Hall Michigan State University East Lansing. Michigan 48824 Phone: (517) 353-9656 MSU e an Alf-nave Anon Eew 0990mm“ luau-on APPENDIX B END-CF-HEEK EVALUATION FORMS FAMILY MEDICINE FACULTY DEVELOPMENT PROGRAM End-of-Heek Ewaluation week 1 September 11, 1981 PART I Please indicate your overall reactions to this past week's sessions by checking the appropriate box. If you have a specific suggestion about how a change should be made, write that suggestion in the appropriate space or at the end of this instrument. ' Aspect of Keep the Specific Program Same Increase Decrease Suggestion 1. Mount of Reading 2. Comfort of Room 3. length of Hor kshops 1;, Relevance of Information 5. Mount of Participation 6. Mount of Practice 7. lumber of Examples 8. level of Infor- mation Compared to My Mount of Knowledge 157 158 PART II Please respond to the following statements using the KEY’given below. KEY: §_A_ means you strongly agree with the statement, A means you agree, _l_l_ means you are uncertain, 2 means you disagree, and S_D means you strongly disagree. 1. I found the Tuesday orientation session very helpful. 2. The concepts presented in the small group process workshop were helpful. .___. SA A U D SD 3. I believe I can effectively use the principles of learning and motivation in my own teaching. SA A U D SD A. The clinical teaching technique session was helpful for under- standing my own teaching style and preferences. SA A U D SD 5. The session on curriculum development in sports medicine was useful . SA A U D SD 6. I have a better understanding of the similarities and differences between allopathic and osteo- pathic family medicine. 7. I can use some of the ideas and skills from the audiovisual workshop . SA A U D SD 159 PART III Please write your responses to the following questions in the space provided. 1. 2. 5. 6. What was the most helpful presentation or discussion during this past week? What presentation or discussion during this past week was not relevant to your needs? What part of the program gave you the most difficulty? What can we do to help you learn during the program? What other suggestions do you have fbr improving the program? What is your overall reaction to this week's program? 160 ADDITIONAL COMMENTS PART I FAMILY MEDICINE FACULTY EVELOPMENT PROGRAM End -of-Week Evaluation Week 2 September 18. 161 1981 Please indicate your overall reactions to this past week's sessions by checking the appropriate box. If you have a specific suggestion about how a change should be made, write that suggestion in,the appropriate space or at the end of this instrument. Aspect of Program Keep the Same Increase Decrease Specific Suggestion 1. 2. 5. 6. 7. 8. 
Amount of Reading Comfort of Room Length of Workshops Relevance of Information Amount of Participation Amount of Practice number of Examples level of Infor- mation Compared to My Amount of Knowledge 162 PART II Please respond to the following statements using the KEY given below. KEY: SA means you strongly agree with the statement, _A_ means you agree. U means you are uncertain, D means you disagree, and SD means you strongly disagree. 1. As a result of the psychomotor teaching skills session. I am better prepared to teach these types of skills. SA A U D SD 2. I will use the approach pre- sented for the teaching of psychomotor skills . SA A U D 35 3. The session on presentation skills will help me improve my own presentations. u. I feel more skilled as a clini- cal supervisor. SA A U D SD 5. I will use the ideas presented in the constructive feedback session. SA A u D 155" 6. I am more aware of my own think- ing as a physician as a result of the discussion on perspec- tives in learning. SA A U D 35 7. I found the clinical teaching practice teaching sessions (videotape) helpful. SA A Ti" _D "515‘ 8. I have a better idea of how to ask and answer student ques- tions. SA A U 4D 55 9. ‘10. 163 The "practicum" session (Thurs- day afternoon) should be con- tinued. I found the practice teaching assignment a valuable learning experience. SA SA 164 PART III Please write your responses to the following questions in the space provided. 1. What was the most helpful presentation or discussion during the past week? 2. What presentation or discussion during this past week was not relevant to your needs? 3. What, information or topic(s) on teaching was missing from this two-week session? u. What suggestions do you have fbr improving this two-week session? - 5. What is your overall reaction to this two-week session? 6. What research and evaluation topics would you like to see addressed in the January session? 165 ADDITIONAL COMMENTS APPENDIX,C COGNITIVE PRETEST Family Medicine Faculty Development Program Michigan State University September 8, 1981 PRETEST Instructions: This pretest is part of a program evaluation of the Family Medicine Faculty Development Fellowship Program. An- swer each item as completely as possible. Consideration will be given for partially correct responses. Thank you for your cooperation. 166 167 Small Group Development 1. List two components or dimensions of group development that a group leader should be aware of in order to monitor and influence the development of a small group. 2. List three characteristics of effective group functioning. 3. State three advantages of appropriately using a group discussion format. A. What types of actions might a group leader take when there is conflict (e.g., strongly opposing views, arguing, etc.) during a group discussion/meeting? -1- 168 Principles of Learning and Motivation 5. What techniques can be used to make learning meaningful to your students? 6. Describe what techniques you could use during a lesson to stimulate your students‘ attention. 7. What is the most appropriate use of modeling in instruction? 8. Describe how you could go about establishing and maintaining open communica- tion between you and your students. 9. What must be considered in determining which medium and strategy are appropriate for your presentation? Clinical Teaching Technique 10. Describe three roles that a clinical instructor might use in teaching medical students or residents. ll. 
What three topics should be discussed by the instructor immediately before observing an initial student/patient contact? -2- 169 12. What are four characteristics of effective feedback? 13. How might a lack of feedback negatively effect the learner? (Describe three) 1'4. What factors might inhibit the process of giving feedback: a. from the teacher to the learner? b. from the learner to the teacher on his/her role as a teacher? 15. Given a hypothetical six-week period during which you will be working closely with a student, which teaching technique will you probably use the most during weeks 1 and 6? 16. What are three goals of clinical supervision? 17. List three strategies that a clinical teacher might use in dealing with an anxious patient. 170 18. Physically arrange the examining room and position the following persons to provide an effective history gathering and/or examination opportunity for the resident. Use symbols to show desired position; point indicates direction facing or sitting. Patient a Resident & Clinical teacher Q Desk 1:] Examination table [:3 999cm» fl ‘——1 a Producing Audiovisual Materials 19. List four ways television can be used in undergraduate medical education. 20. List four steps in the process of selecting the appropriate media format for a presentation. -4- 21. 22. 171 State a "rule of thumb" for the maximum amount of printed material to be used on a projected visual image. List two advantages and two disadvantages of 35mm slides, overhead transparen- cies, and television in an instructional mode. Teaching Psychomotor Skills 23. 2t}. 25. 26. Situation: You have been assigned to teach a first-year medical student how to use and read a sphygmomanometer. Why is the use of objectives important in introducing a student to the proper use of the instrument? How would you determine prerequisite or entry skills before teaching the student how to use this instrument? What steps would you include when you introduce and demonstrate any new psychomotor skill to a student? What should you tell your student prior to a demonstration of the use of the sphygmomanometer? Why? -5- 27. 28. 29. 172 What should you have your student do immediately following the demonstration? Describe how and when you would provide feedback to a student practicing the use of the sphygmomanometer. Based on your experience as a family medicine physician, cite two different psychomotor skills that you could teach to first-year medical students. Presentation Skills 30. 31. 32. Situation: You have been asked to give a lecture presentation to a group of businessmen on the topic of hypertension. What should you try to accomplish in the introduction of your presentation? What techniques could you use to get the audience actively involved during the lecture? In what ways could you determine how much of the lecture has been understood by the audience? -6- 173 33. How would you determine the audience's personal reactions and attitudes toward your lecture? Write two specific questionnaire items you would use to elicit this information. 31;. Describe a situation in which a lecture would be an appropriate instructional method and explain why. Perspectives in Learning 35. What are three different types of learning a clinical faculty member would encounter? 36. Distinguish the cognitive view of learning from the behavioral view. 37. How might you draw an incorrect clinical conclusion by using an "availability" heuristic? -7- 38. 39. #0. 
174 What are the three factors affecting the predictive value of a laboratory test in a clinical setting? Describe the overall goal of applying decision analysis in clinical practice, i.e., why might decision analysis be useful to physicians? What does it mean to say that physicians work with probabilistic information or make judgments under uncertainty? Include at least one example in your response. APPENDIX D COGNITIVE TEST RATING SCALE Cognitive Test Rating Scale 0 - nothing present or completely wrong 1 - minimal answer, but something there is correct 2 - more than one component present or correct 3 - nearly all components present or correct or all correct or present APPENDIX E VIDEOTAPE RATING SCALE Videotape Rating Scale FELLOW SESSION: JAN___ DMEMED MAY SCALE: 0 - Not done 1 - Done poorly N 3 - Done moderately well u - 5 - Done very well thing the above scale, circle the appropriate nunber for each of the following items. In this videtaped presentation, the fellow: 1. 2. 3. 5. 6. 7. 8. 9. 10. 11. 12. Introduced the topic of the presenta- 0 1 2 3 tion. Related the presentation to the O 1 2 3 audience members' past, present, or future. Provided necessary aids to organize O 1 2 3 the presentation. Delivered, rather than read, the 0 1 2 3 presentation. Presented information in an organ- 0 1 2 3 ized, logical manner. used examples or illustrations. 0 1 2 3 Showed interest in the topic and O 1 2 3 enthusiasm in presenting it. summarized main points or ideas of O 1 2 3 the presentation. Maintained eye contact with members 0 1 2 3 of the audience. Maintained good posture throughout 0 1 2 3 the presentation. varied the rate and pace of the 0 1 2 3 presentation. used appropriate gestures during 0 1 2 3 the presentation. 177 13. Spoke clearly and audibly. 1“. Provided smooth transitions between main ideas or points. 15. Solicited ideas or questions from members of the audience. 16. Responded to ideas or questions from members of the audience. COMMENTS/NOTES: APPENDIX E INTER VIEW PROTmOLS FELLOW INTER VIEW QUESTIONNA IRE NAME DATE QUESTION #1 The first question deals with your previous knowledge of the content of the sessions that were conducted in September. I will read you the name of each topic and then will ask you to respond either "Yes" or "Nb" to the question that I will ask you about each topic. If you respond "yes" then I will ask you to rate your expertise in that topic PRIOR to the beginning of the September program. In making that rating you should keep in mind that a scale from 1 to 5 will be used, with 1 low, 3 medium, and 5 high. Any questions? (PAUSE) Okay, here goes. Before the September program did you have a background in or any previous experience with: ‘ elements of group development Y__ N_ 1 2 3 ll 5 clinical teaching technique I N 1 2 3 A 5 role of clinical supervision Y;__. N____ 1 2 3 A S constructive feedback in Y N 1 . 2 3 A 5 clinical education principles of learning and T___ N___’ 1 2 3 u 5 motivation teaching psychomotor skills Y___ N_ 1 2 3 ll 5 producing audiovisual Y N 1 2 3 ll 5 materials presentation skills I;__ N____ 1 2 3 A 5 asking and answering student I N 1 2 3 A 5 questions perspectives in learning Y N 1 2 3 u 5 (If yes for any of the above) l-bw would you rate your expertise in this topic prior to the September program on a scale from 1 to 5 with 1 low and 5 high? 
179 QUESTION #2 For the second question I would like to ask you if you have undertaken any additional study, such as reading, workshop attendance, CHE activities, or other methods of study or learning, in any of the topics of the September program since that program ended. As in the previous question I will read you the question and the topic and then ask you to respond either "Yes" or "No." I will also ask you to rate your current expertise in each of the topics. Again the scale will be from 1 to 5, with 1 low and 5 high. Any questions? (PAUSE) Okay, here goes. Since the September program ended have you undertaken any additional study in: elements of grow development Y_ N__ 1 2 3 ll 5' clinical teaching technique I N 1 2 3 u 5 role of clinical supervision Y;___ N___. 1 2 3 u 5 constructive feedback in Y N 1 2 3 ll 5 clinical education principles of learning and Y;___ N____ 1 2 3 A S motivation teaching psychomotor skills Y;__. N___’ 1 2 3 u 5 producing audiovisual Y N 1 2 3 A 5 materials presentation skills Y_ N__ 1 2 3 ll 5 asking and answering student I N 1 2 3 A 5 questions perspectives in learning I N 1 2 3 u 5 'How would you rate your expertise in this topic at this moment? 180 QUESTION :3 Have you used any of your notes or handouts from the September program since that program ended? I N (If yes) For which topics have you used your notes or handout materials? elements of group devel- !;___N____How'often? once 2-3 “-5 5+ opment clinical teaching tech- Y N How often? once 2-3 “-5 5+ nique role of clinical super- Y N How often? once 2-3 “-5 5+ vision constructive feedback Y N How often? once 2-3 “-5 5+ in clinical education principles of learning I;___N___ How often? once 2-3 “-5 5+ and motivation teaching psychomotor Y N____How often? once 2-3 “-5 5+ skills producing audiovisual T___ N___ How often? once 2-3 “-5 5+ materials presentation skills Y;__.N____Wow often? once 2-3 “-5 5+ asking and answering Y N How often? once 2-3 “-5 5+ student cue st ions perspectives in learning I N How often? once 2-3 “-5 5+ QUESTION l“ Have you shared your new’knowledge and skills that you learned during the September program with your colleagues or other people in your organization or community? I;___ N____ (If yes) Which of the following categories best describe how you shared your new knowledge and/or skills? You may choose more than one category if it is appropriate to your situation. The categories are: formal presentation individual consultation informal conversation(s) written communication(s) other (please specify) 181 QUESTION #5 In the six months since the end of the September program, have you had an opportunity to use any of the knowledge or skills that you learned during those two weeks? I (If yes) Please describe what specific knowledge or skills you have been able to use. Now please describe how you were able to use this specific knowledge or skills. QUESTION P 6 In the next six months, do you expect to have an opportunity to use any of the knowledge or skills you learned during the September program? Y N (If yes) Please describe what specific knowledge or skills you expect to be able to use. Now please describe how you expect to be able to use this knowledge or skill. 182 The next series of questions is concerned with the exercises or simulations that you participated in at MSU that were videotaped for you to review at a later time. QUESTION #7 Did you review the videotape in which you were placed in the role of a clinical teacher supervising a first-year resident? 
r_ N (If yes) On a scale from 1 to S, with 1 low and 5 high, how would you rate your overall perfomance as a clinical teacher in that videotape? 1 2 3 “ 5 If you were to go through that same clinical teaching simulation tomorrow, how would you rate your expected performance? Again, use a scale from 1 to 5. 123115 QUESTION # 8 Did you review the videotape of your presentation assignment? If you remember, that was the one on the last Friday of the September seesion where you were asked to teachisomething to someone. Y__ N__ (If yes) On a scale from 1 to 5, how would you rate your overall performance in that presentation assignment? 1 2 3 “ 5 If you were given the same assignment, to teach something to someone for twenty minutes, and you had to do it tomorrow, how would you rate your expected performance? 123“5 183 QUESTION #9 Did you review the videotape of the research and evaluation project presentation that you gave in January? Y N (If yes) On a scale from 1 to 5, how would you rate your overall performance in that presentation? Note that the focus is on your presentation and the associated skills, not on the content of the research or evaluation project that your presented. 123“5 If you have to give a similar presentation tomorrow, how would you rate your expected performance? 123115 QUESTION 5 10 Has your participation in the September program changed your role or function in your organization? For example, have your tried sane new teaching techniques or have you significantly changed any of your daily activities? I N (If yes) Please describe how your role or function has changed. QUESTION #1 1 Has your perception of teaching as a career changed since the completion of the September program? I N (If yes) Please describe how your perception of teaching as a career has changed. 184 QUESTION #12 Do you feel that the September program has helped you to become a better teacher? I (If yes) Please describe how the program has helped you become a better teacher. (If no) Please describe why the progam has not helped you become a better teacher. QUESTION :1; If a friend or acquaintance of yours was interested in becoming a faculty member in family medicine or wanted to become a better teacher of family medicine, would you recommend the September program to him/her? Y;___ N___ (If yes) Why would you recommend the September program to someone interested in becoming a faculty'member in family medicine? (If no) Why wouldn't you recommend the September program to someone interested in becoming a faculty’member in family medicine? QUESTION #1“ Is there anything else that has happened to you as a result of the September program that has not been covered by these questions? r_ u_ (If yes) Please explain or describe. 185 QUESTION #15 Do you have any additional comments or concerns that you wish to express at this time? I N (If yes) Please make them at this time. @SING REMARKS: That completes the questions that I have for you at this time. As I mentioned at the beginning of the interview, your responses and coments will remain confidential and no names will be used in the final evaluation report. Since many of you have expressed an interest in ,hearing the results of the evaluation, I will be presenting the results sometime during the May session. Thank you very much for your time and cooperation throughout both this interview and the times when you were taking the written test. Without your cooperation, a quality evaluation of this program would not be possible. 
Again, thank you very much for taking the time to talk with me at this time.

SUPERVISOR INTERVIEW QUESTIONNAIRE

NAME ____________________          DATE ____________

QUESTION #1

Once Dr. ____ learned about the FMFD Program, did you encourage him/her to participate in the program?
Y___ N___

(If yes) Why?

(If no) Why not?

QUESTION #2

Has Dr. ____ shared any of the information or new knowledge or skills that he/she learned about teaching during the September program with you or other members of your organization?
Y___ N___

(If yes) Which of the following method or methods best describe how he/she shared this information?

___ formal presentation
___ individual consultation
___ informal conversations
___ written communication
___ other (please specify)

QUESTION #3

Do you know if Dr. ____ has been able to use any of the new knowledge or skills related to teaching that he/she learned in September at MSU?
Y___ N___

(If yes) Please describe the types of knowledge and/or skills that Dr. ____ has been able to use, and the types of situations that they have been used in.

QUESTION #4

Have you observed Dr. ____ doing any teaching since late September? This could include activities such as one-on-one clinical teaching or supervision, small group discussion teaching, or formal lectures or presentations.
Y___ N___

(If yes) How often have you observed Dr. ____ doing some teaching since late September?

___ once
___ 2 to 3 times
___ 4 to 5 times
___ more than 5 times

QUESTION #5

Do you feel able to judge whether or not Dr. ____'s teaching behavior has changed since late September?
Y___ N___

(If yes) How has Dr. ____'s teaching behavior changed since September?

What do you think has caused the change in Dr. ____'s teaching behavior?

QUESTION #6

Have you noticed any change in Dr. ____'s role or function in your organization since the end of the September program? For example, has he/she become active in new areas of your program or has he/she taken on new responsibilities?
Y___ N___

(If yes) Please describe this change in Dr. ____'s role or function.

QUESTION #7

Has your program benefited in any way by Dr. ____'s participation in the FMFD Program?
Y___ N___

(If yes) Please describe how your program has benefited.

(If no) Please explain why you do not believe that your program has benefited.

QUESTION #8

Would you encourage another resident (or faculty member) from your program to participate in the fellowship program in the future?
Y___ N___

(If yes) Why?

(If no) Why not?

QUESTION #9

Do you have any additional comments about either Dr. ____'s teaching behavior or skills or about the FMFD Program that you would like to make at this time?

QUESTION #10

Would you like to receive a copy of the final evaluation report on the FMFD Program?
Y___ N___

(If yes) I will arrange for you to receive a final copy of the evaluation report.

QUESTION #11

Do you have any other comments or concerns that you would wish to express at this time?

CLOSING

That completes the interview. Thank you for your time and cooperation. Hopefully the results of this program evaluation can be used to improve the FMFD Program so that people like yourself will continue to send prospective teachers of family medicine to participate in the program. Thanks again for your time and comments.

PROGRAM DIRECTOR INTERVIEW QUESTIONNAIRE

1. Did you have any concerns about the September session of the FMFD Program before it started? For example, were there any new segments, new faculty, resource constraints, or other possible problems?

2. Did you have any concerns about the participants prior to the September session? For example, were you worried about the size of the group, the MD-DO mix, or the resident-faculty mix?

3. How did you feel after the completion of the September session? Were you satisfied with the individual segments, faculty, participants, or any other aspects of the session?

4. Based on your first impressions from reading their applications, talking to their supervisors, meeting them for the first time, or using any other information you had, who would you have picked as the fellows most likely to do well in the activities of the September session? Least likely to do well?

5. After observing the fellows during the two weeks in September, who appeared to have mastered the skills and techniques of that session (or perhaps had arrived on the scene with them already)? Who had made the most improvement over the two-week period of time? Had anyone slid back, regressed?

6. When the fellows came back in January and gave their proposed project presentations, who appeared to be the most skillful and effective presenters? Who had made the most improvement since September? Who had regressed or remained the same?

7. When they returned in May and gave their final project presentations, who appeared the most skillful and effective? Who had made the biggest improvement since January? Since September? Who were the biggest surprises, either positive or negative, to you over time from September to May?

8. Both of you worked closely with the fellows in preparing various presentations and in conducting their major projects. Which fellows showed during those contacts that they had a good command of the terminology, concepts, techniques, and skills covered during the September session?

9. Did any of the fellows do any follow-up work with you related to the topics of the September session? Did you supply any of them with any additional handouts, references, or any other information related to the September session?

10. Have you noticed or learned of any unintended or unplanned outcomes among the fellows as a result of the September session?

11. How would you compare the overall quality of the presentation skills of this group of fellows (based on the major project presentations) with those of previous groups of fellows? How would you explain this?

12. Do you have any additional comments to make concerning the fellows related to the activities of the September session?

APPENDIX G

FINAL DEBRIEFING QUESTIONNAIRE

FAMILY MEDICINE FACULTY DEVELOPMENT PROGRAM
1981-82 FINAL DEBRIEFING

PART I: Written Responses

Please write your responses to the following questions in the space provided.

1. What is your overall evaluation of the program?
2. What was missing most from the program?
3. What comments do you have about the administration of the program?
4. How would you rate your contribution to the program?
5. What would better prepare fellows for the program?
6. What comments do you have about the evaluation of the program? (pre/posttest, telephone interviews, end-of-week evaluations)
7. Comments

PART II: Discussion Topics

1. Major Projects
2. September Session: "Teaching and Learning"
3. January Session: "Research and Evaluation"
4. March Session: "Issues in Family Medicine"
5. May Session: "Administrative Skills"

APPENDIX H

METAEVALUATION PROCEDURE: PROGRAM DIRECTOR INTERVIEW

RESEARCH QUESTION: Was the evaluation framework practical in its use of resources?

SPECIFIC QUESTIONS:
1. Did the evaluation procedures produce information of sufficient value to justify the resources expended?

2. Were the evaluation procedures administered so that program disruption was kept to a minimum?

3. Did the use of multiple instruments appear to yield results that justified the extra time and effort involved in their development and administration?

RESEARCH QUESTION: Was the evaluation framework valuable in providing information to you as decision makers?

4. Did it provide information that answered specific questions that you had about the program?

5. Was the information that you received complete and comprehensive? Was there anything left out that you would like to have known?

6. How could the evaluation have been changed to provide more useful information?

RESEARCH QUESTION: Were the methods and instruments used within the evaluation framework technically adequate?

7. Were the sources of information described in enough detail for you to assess the validity of the information they provided?

8. Were the information-gathering instruments and procedures described in enough detail for you to assess the validity of the results they produced?

9. Were the information-gathering instruments and procedures described in enough detail for you to assess the reliability of the results they produced?

10. Was there evidence that the data were collected and analyzed systematically?

11. Did it appear that the quantitative data were appropriately and systematically analyzed?

12. Did it appear that the qualitative data were appropriately and systematically analyzed?

13. Were the conclusions presented in the evaluation report supported by the data?

RESEARCH QUESTION: Were the methods and instruments used within the evaluation framework ethical in dealing with people and organizations?

14. Was the evaluation report open, direct, and honest in its disclosure of pertinent findings, including the limitations of the evaluation?

15. Was the evaluation designed and conducted so that the rights and welfare of the human subjects were respected and protected?

ADDITIONAL QUESTIONS

16. What would you do if you were to do it again?

17. What was not done that should be done?

18. Which evaluation source provided the information of most value to you?

19. Which evaluation method provided the information of most value to you?

20. Which type of data would you rely on?

21. Which type of data would you least rely on?

22. What was the overall strength of the evaluation?

23. What are you doing this time?

APPENDIX I

EVALUATION REPORT: INTRODUCTION

EVALUATION OF THE SEPTEMBER SESSION OF THE FAMILY MEDICINE FACULTY DEVELOPMENT PROGRAM

ACADEMIC YEAR, 1981-82

EVALUATION REPORT PREPARED BY: Kent J. Sheets, Ph.D. (Cand.)
September 12, 1982

INTRODUCTION

The Family Medicine Faculty Development Program (FMFDP) conducted by the Office of Medical Education Research and Development (OMERAD) at Michigan State University (MSU) is supported by a grant from the Bureau of Health Manpower, Public Health Service. The two major objectives of this program are to identify and train new physician teaching faculty for family medicine training programs and to help current family medicine faculty develop and/or refine their teaching skills. One component of this program is a three-month teaching fellowship offered to allopathic (M.D.) and osteopathic (D.O.) physicians who have completed or are near completion of a family medicine residency program and to family medicine physicians with one year or less of academic teaching experience. It is a two-week session of this fellowship that is the subject of this evaluation report.

The goal of the fellowship is to provide the fellows with a foundation of skills in teaching, evaluation, and the management of instruction. The fellowship begins in September, and participants spend one- and two-week sessions at MSU throughout the remainder of the academic year. During these sessions at MSU the fellows participate in a series of workshops, seminars, and practice teaching situations conducted by nationally known medical educators. A stipend is available to help fellows cover the costs of participating in the fellowship.

The two-week session of the fellowship that was evaluated was conducted in September 1981. This session presented workshops and activities concerned specifically with techniques and principles related to teaching and learning in medical schools and residency training programs. A copy of the schedule for the September 1981 session is included in the appendices.

The FMFDP has been in operation since July 1978 and has been successful in meeting its goal of increasing the number of family physicians in academic positions, but the program directors have little empirical evidence demonstrating that the program has had an impact on the knowledge, skills, and performance of the participants. Therefore, the evaluation described in this report was conducted by the author in an attempt to determine whether the September session had an impact on the participants and/or their organizations. The evaluator utilized evaluation procedures already in use by the FMFDP staff and also developed some new evaluation instruments in order to gather different types of evaluation data from a variety of sources.

One objective of this evaluation was to provide information to the FMFDP Directors to help them improve and plan future offerings of the September session. A second objective was to determine whether there was any kind of evidence that the September session benefited the participants and/or their home institutions. While the emphasis of the evaluation focused on meeting these two objectives, it was also intended to explore alternative evaluation procedures that might be more effective in gathering information useful to the FMFDP Directors and also to look for any unintended or unexpected outcomes.

In the remainder of the evaluation report, the evaluation procedures used to gather the data presented in this report are briefly described. Examples of the evaluation instruments used to gather the data are provided in the appendices. The evaluation questions that were formulated to guide this study are presented with summaries of the results that correspond to each question or group of questions. Complete data sets appear in the appendices. In closing, a summary is provided outlining overall results of the evaluation. Recommendations and other comments are also presented.

APPENDIX J

FIELDTEST DATA: END-OF-WEEK EVALUATIONS

FAMILY MEDICINE FACULTY DEVELOPMENT PROGRAM
End-of-Week Evaluation
Week 1: September 8-11, 1981
SUMMARY

PART I

1. Amount of Reading (Keep the Same: 15; Increase: 1; Decrease: 2)
   Suggestions: Hard to keep track of what all the different handouts relate to;
maybe color code or index; more references, but continue to show priorities among references.

2. Comfort of Room (Keep the Same: 8; Increase: 10)
   Suggestions: Need windows in room; could be better; I prefer the basement, but we should change rooms occasionally; too congested an area; room cramped for the number of people; could have more room; subdue lighting slightly; room size was a little small (216); it's really fine, but windows are pleasanter if possible.

3. Length of Workshops (Keep the Same: 15)
   Suggestions: More regular breaks (10 minutes); decrease the a.m. audiovisual session--keep the movie, but a lot of the a.m. audiovisual session was covered in the p.m.'s; sports medicine lengthy; some seemed a little long, others were okay; decrease Thursday p.m. and Friday a.m. sessions.

4. Relevance of Information (Keep the Same: 13; Increase: 5)
   Suggestions: Could take more information time; keep the same except sports medicine; information seems vague, hard to apply; more details on osteopathic/allopathic, improve curriculum development; orient lecture topics as you go--not for the whole fellowship at one time; sports medicine--important issues raised, but not all questions thoroughly answered or prepared for, i.e., evaluation, personal/family needs of the busy practitioner in evaluation.

5. Amount of Participation (Keep the Same: 15)
   Suggestions: A little too much.

6. Amount of Practice (Keep the Same: 15)
   Suggestions: Need ways to apply information--everything sounds good, but I'm left wondering how to use it; increase for audiovisual; good for this stage; practice (mini sessions) important, but not always clear, maybe do a "role play" prior.

7. Number of Examples (Keep the Same: 17)

8. Level of Information Compared to My Amount of Knowledge (Keep the Same: 16; Increase: 2)
   Suggestions: Push harder.

PART II

Please respond to the following statements using the KEY given below.

KEY: SA means you strongly agree with the statement, A means you agree, U means you are uncertain, D means you disagree, and SD means you strongly disagree.

1. I found the Tuesday orientation session very helpful.
2. The concepts presented in the small group process workshop were helpful. (Responses: 7, 8, 1)
3. I believe I can effectively use the principles of learning and motivation in my own teaching. (Responses: 10, 7, 1)
4. The clinical teaching technique session was helpful for understanding my own teaching style and preferences. (Responses: 10, 4, 2, 1)
5. The session on curriculum development in sports medicine was useful. (Responses: 2, 5, 8, 3)
6. I have a better understanding of the similarities and differences between allopathic and osteopathic family medicine. (Responses: 2, 7, 3, 4, 2)
7. I can use some of the ideas and skills from the audiovisual workshop. (Responses: 9, 9)
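The Part II tallies above were counted by hand from the returned forms, item by item across the five-point agreement scale. As an illustration only, a short script along the following lines could produce the same kind of per-item tally and a simple weighted mean; the item label and the sample responses in it are hypothetical and are not taken from the actual forms.

    from collections import Counter

    # Five-point agreement scale used on the End-of-Week Evaluation (Part II).
    SCALE = ["SA", "A", "U", "D", "SD"]            # strongly agree ... strongly disagree
    WEIGHTS = {"SA": 5, "A": 4, "U": 3, "D": 2, "SD": 1}

    def summarize(item, responses):
        """Return a one-line summary: counts per category plus a weighted mean."""
        counts = Counter(responses)
        mean = sum(WEIGHTS[r] for r in responses) / len(responses)
        tally = ", ".join(f"{cat} {counts.get(cat, 0)}" for cat in SCALE)
        return f"{item}: {tally} (mean {mean:.2f})"

    # Made-up responses for one item, shown only to illustrate the output format.
    example = ["SA"] * 7 + ["A"] * 8 + ["U"]
    print(summarize("Small group process workshop was helpful", example))
    # Small group process workshop was helpful: SA 7, A 8, U 1, D 0, SD 0 (mean 4.38)

The same tally, read across the SA-to-SD columns, is what the parenthetical response counts above report.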
PART III

Please write your responses to the following questions in the space provided.

1. What was the most helpful presentation or discussion during this past week?

____ on motivation; the one by Dr. ____ on principles of learning; Dr. ____'s on teaching--both enjoyable and motivating; Dr. ____'s presentation was the most useful--had some concrete ideas I can carry with me; ____'s on teaching; principles of learning and motivation; ____--learning and motivation; ____--group also excellent; principles of learning and motivation; principles of learning and motivation, audiovisual materials, elements of group development; ____--held my attention, but also had a lot of concrete suggestions; audiovisuals; ____'s presentation on principles of learning and motivation; all presentations were very good--____'s was the most outstanding, however; toss-up between ____'s and ____'s; audiovisual.

2. What presentation or discussion during this past week was not relevant to your needs?

Discussion on DO/MD dichotomy--nothing really new (enjoyed Dr. ____ about paying for technology--this was on target!); curriculum development did not deal with the subject specifically on development; sports medicine--the presenters spent 1 1/2 hours talking about their program, 1/2 hour trying to talk about how it came about; a 2 hour talk about "setting up curriculums--logistic, organizational, practical aspects" would have been much more relevant to us; sports medicine; curriculum development; curriculum development--the lectures were very interesting and modelled difficulties of developing a curriculum area, but I still don't know if there is a "model" approach to curriculum development; curriculum development; Friday a.m.--D.O./M.D.; least relevant was probably curriculum development in family medicine--true, it was shown that an innovative program was started, but I still have many questions of the how; none; all seemed relevant; the session on curriculum development in sports medicine could have been more relevant--it did not answer many of my questions on the subject; none--all were good to excellent; sports--but don't can them, rather, help them abstract from their experience more of the general principles of starting a new program; the sports medicine and the discussion of issues in family medicine--the sports medicine curriculum is exciting, but ____ and ____ seemed to have trouble describing how they developed their curriculum and identifying what principles could be applied to the development of curriculum in another area, and I don't think Drs. ____ and ____ have a firm grasp of either the issues facing family medicine (which was to be the topic of this morning's talk) or the allopathic/osteopathic dichotomy (the topic discussed the most); sports medicine, allopathic vs. osteopathic discussion; curriculum development--sports medicine.

3. What part of the program gave you the most difficulty?

Sports medicine--curriculum planning, logistics of getting the course together was "fuzzy"; none; assimilating the large amounts of suggestions/information--I hope I will be able to use most information soon--so far it's all "tucked away"; parts that are vague, theoretical; length of the day in an enclosed room; clinical teaching--too much orientation (process) for future activities, not enough content (spread out content over the two weeks); coming up with "good" examples during ____'s discussion (practice segments)--this was very valuable, however; volume/session; probably the amount of material and novelty of the material and trying to absorb a good deal of it; hard to be so verbal all day--so much new language; nothing; so far, no area of difficulty; nothing has been that difficult so far; learning mod(el?) (mode?).

4. What can we do to help you learn during the program?

Provide written handouts for all attending people; increase my general knowledge in teaching skills with feedback to see how I am doing--specifically, I need help on how to answer and ask questions to help students and residents learn; provide us with opportunities to practice with feedback--this, I think, is coming up next week! Perhaps more small group tasks would be valuable; more practice, more models or examples to apply; I think you're doing it well; continue workshop and small group functions; possibly send out handouts in the summer with relevant leisure reading; you already are employing a lot of good educational techniques; indexing of readings and outlines; no specific ideas at present; continue the starting and stopping on time with appropriate breaks; more time trying to apply the ideas--more practice; provide exercises and clinical examples for abstract comments; more active participation.

5. What other suggestions do you have for improving the program?

So far good and no real improvement; great as it is--perhaps more chance for member interaction--sharing of difficulties/successes encountered in our respective programs; the humanistic orientation is hard to apply--I agree with it and try to apply it, but it needs to be more concrete, more examples--I buy into the concept, let's move ahead!; possibly a little shorter day; expand ____'s teaching time; revise and use another approach to curriculum development; would like more individual time to discuss projects with faculty (e.g., planning for Friday presentations or major project); more ____; so far, I am impressed; more practical time doing things; none at this time; the sessions on group process were difficult to correlate with clinical practice; more ____.

6. What is your overall reaction to this week's program?

Very good--I have been pleased with the information given so far, but need to apply it; very enthusiastic--pleased--looking forward to next week; good--glad I came; relevant, useful; slow start, but I feel well oriented, know all the group by name, and am looking forward to the second week; outstanding; excellent, but overwhelmed by Friday; good; positive; excellent; overall, very positive--I feel that it will be very helpful to me in my future teaching; excellent--keep up the good work; great! this is helping a lot; very good--you people do a good job; slightly disappointed; good except for sports medicine.

ADDITIONAL COMMENTS

Thursday happened to be boring to many people--not necessarily the topics, but the presentation; thanks!; I feel the social events are a wonderful plus to this program--the people running this are sensitive to this; would like some feedback on possible ways of handling problems female medical students may encounter; ____'s sessions were very good, but might have been better if there had been less initial discussion, a less-rushed presentation of the key issues, and then more time for discussion of those concepts at the end; two particular presentations needed more organization--sports and the DO/MD one; good idea to have the "social" BBQ the second night--helps establish ourselves as a viable group.

FAMILY MEDICINE FACULTY DEVELOPMENT PROGRAM
End-of-Week Evaluation
Week 2: September 18, 1981
SUMMARY

PART I

Please indicate your overall reactions to this past week's sessions by checking the appropriate box. If you have a specific suggestion about how a change should be made, write that suggestion in the appropriate space or at the end of this instrument.
1. Amount of Reading (Keep the Same: 11; Decrease: 2)
   Suggestions: Make sure that if a reading is needed for the next day that this is noted; make more explicit what we are to read; spread out more; amount of material OK for further reference; be more clear on specific readings for each session / remind us.

2. Comfort of Room (Keep the Same: 12; Increase: 1)
   Suggestions: E6 is much better--far away from hotel, though; E6 is fine; change rooms a.m./p.m.; E6 is good.

3. Length of Workshop (Keep the Same: 10; Decrease: 3)
   Suggestions: Gets long at the end of the week; finish on time; attend to breaks better; don't make any longer--6 hours leads to significant fatigue.

4. Relevance of Information
   Suggestions: Less "student" examples, let's try to focus directly on residents more.

5. Amount of Participation (Keep the Same: 7)

6. Amount of Practice (Keep the Same: 10; Increase: 3)
   Suggestions: Add practicum 1/2 day in 1st week; we need more practice trying on different styles; better! having residents run the "how to ask questions" session was very valuable and should be done more; except for practice preceptoring on video, would like more practice; practice supervisor role more.

7. Number of Examples (Keep the Same: 12)
   Suggestions: Maybe more relevant practice examples from own experience; use less videotape vignettes; fewer vignettes; change quality, good to increase hospital and residency examples.

8. Level of Information Compared to My Amount of Knowledge (Keep the Same: 11; Increase: 2)
   Suggestions: Better!; increase time, but try not to decrease information given out.

PART II

Please respond to the following statements using the KEY given below.

KEY: SA means you strongly agree with the statement, A means you agree, U means you are uncertain, D means you disagree, and SD means you strongly disagree.

1. As a result of the psychomotor teaching skills session, I am better prepared to teach these types of skills. (Responses: 8)
2. I will use the approach presented for the teaching of psychomotor skills. (Responses: 9, 3, 1)
3. The session on presentation skills will help me improve my own presentations. (Responses: 10)
4. I feel more skilled as a clinical supervisor. (Responses: 9, 1)
5. I will use the ideas presented in the constructive feedback session. (Responses: 8, 4, 1)
6. I am more aware of my own thinking as a physician as a result of the discussion on perspectives in learning. (Responses: 7, 3, 1, 2)
7. I found the clinical teaching practice teaching sessions (videotape) helpful. (Responses: 5, 5, 1, 1, 1)
8. I have a better idea of how to ask and answer student questions. (Responses: 5, 8)
9. The "practicum" session (Thursday afternoon) should be continued. (Responses: 10)
10. I found the practice teaching assignment a valuable learning experience.

PART III

Please write your responses to the following questions in the space provided.

1. What was the most helpful presentation or discussion during the past week?

Psychomotor skills, presentation skills; can't answer--different aspects of many were helpful; practice teaching exercises, teaching psychomotor skills, principles of learning and motivation; lecture presentation (____); I liked them all; practice teaching on videotape; presentation skills; those involving our role playing and practice teaching were equally most helpful; practice supervisor role--unfortunately, only done one time; I resent being videotaped and not having a chance to see it--I feel it is unfair and may refuse to participate next time just to make a point.
I've no problem with being videotaped--I want to learn from it, not just be a guinea pig for someone's research or tape development; teaching psychomotor skills; preceptoring role play on videotape; teaching psychomotor skills.

2. What presentation or discussion during this past week was not relevant to your needs?

Perspectives in learning was relevant but could have been better if done earlier in the program; we need a better presented example of curriculum development--a half day more specific on didactics of curriculum development might be more useful; curriculum development; last week--sports medicine, perspectives in family medicine; this week--perspectives in learning; role of clinical supervisor already done in my experience; none; teaching history; clinical supervision videotaping--some good content, but too much "uninvolved" time and lack of client review of videotape; all relevant; perspectives in learning; perhaps the perspectives in learning--could be made more relevant; perspectives in learning was interesting and I enjoyed group discussion, but I don't think it changed my behavior--it did increase my awareness of how I think in medicine--good articles; I'm still not sure what the purpose of the perspectives on learning session was--it didn't seem particularly meaningful.

APPENDIX K

FIELDTEST DATA: FELLOW INTERVIEWS

FELLOW INTERVIEW QUESTIONNAIRE

COMPOSITE RESULTS

QUESTION #1

Before the September program did you have a background in or any previous experience with:

elements of group development: Y 4, N 10; mean* 1.875
clinical teaching technique: Y 12, N 2; mean 2.666
role of clinical supervision: Y 10, N 4; mean 2.3
constructive feedback in clinical education: Y 10, N 4; mean 2.7
principles of learning and motivation: Y 6, N 8; mean 2.5
teaching psychomotor skills: Y 9, N 5; mean 2.44
producing audiovisual materials: Y 6, N 8; mean 1.83
presentation skills: Y 8, N 6; mean 2.25
asking and answering student questions: Y 10, N 4; mean 2.45
perspectives in learning: Y 8, N 6; mean 1.625

(If yes for any of the above) How would you rate your expertise in this topic prior to the September program on a scale from 1 to 5, with 1 low and 5 high?

*Mean calculated for those who answered "yes" to each of the items.

QUESTION #2

Since the September program ended have you undertaken any additional study in:

elements of group development: mean* 3.28
clinical teaching technique: mean 3.82
role of clinical supervision: mean 3.82
constructive feedback in clinical education: mean 3.64
principles of learning and motivation: mean 3.21
teaching psychomotor skills: mean 3.71
producing audiovisual materials: mean 3.178
presentation skills: mean 3.785
asking and answering student questions: mean 3.39
perspectives in learning: mean 2.107

*How would you rate your expertise in this topic at this moment?
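As the footnote to Question #1 notes, the means in these composite results are conditional: each is the average of the 1-to-5 self-ratings taken only over the fellows who answered "yes" to that item. As an illustration only, the calculation amounts to filtering on the yes/no answer before averaging; the function name and the sample records below are hypothetical, not data from the fieldtest.

    # Minimal sketch of the conditional mean used in the composite results:
    # average the 1-5 self-rating only for respondents who answered "yes".

    def mean_for_yes(responses):
        """responses: list of (answered_yes, rating) pairs; rating may be None."""
        ratings = [rating for answered_yes, rating in responses
                   if answered_yes and rating is not None]
        return sum(ratings) / len(ratings) if ratings else None

    # Made-up example: four "yes" respondents with ratings 2, 1, 3, 2 and one "no".
    example = [(True, 2), (True, 1), (True, 3), (False, None), (True, 2)]
    print(mean_for_yes(example))   # 2.0 -- the mean over the four "yes" respondents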
QUESTION #3

Have you used any of your notes or handouts from the September program since that program ended? Y 11, N 3

(If yes) For which topics have you used your notes or handout materials? How often?

elements of group development: Y 5, N 9 (once: 4; 2-3 times: 1)
clinical teaching technique: Y 9, N 5 (once: 2; 2-3 times: 5; 4-5 times: 2)
role of clinical supervision: Y 8, N 6 (once: 3; 2-3 times: 5)
constructive feedback in clinical education: Y 7, N 7 (once: 3; 2-3 times: 3; 4-5 times: 1)
principles of learning and motivation: Y 6, N 8 (once: 3; 2-3 times: 3)
teaching psychomotor skills: Y 6, N 8 (once: 6)
producing audiovisual materials: Y 9, N 5 (once: 3; 2-3 times: 3; 4-5 times: 3)
presentation skills: Y 9, N 5 (once: 5; 2-3 times: 2; 4-5 times: 2)
asking and answering student questions: Y 5, N 9 (once: 3; 2-3 times: 2)
perspectives in learning: Y 3, N 11 (once: 1; 2-3 times: 2)

QUESTION #4

Have you shared your new knowledge and skills that you learned during the September program with your colleagues or other people in your organization or community? Y 13, N 1

(If yes) Which of the following categories best describe how you shared your new knowledge and/or skills? You may choose more than one category if it is appropriate to your situation. The categories are:

 2 (16%)  formal presentation
 8 (61%)  individual consultation
13 (100%) informal conversation(s)
 2 (16%)  written communication(s)
 0 (0%)   other (please specify)

QUESTION #5

In the six months since the end of the September program, have you had an opportunity to use any of the knowledge or skills that you learned during those two weeks? Y 14, N 0

(If yes) Please describe what specific knowledge or skills you have been able to use.

Presentation skills (11); AV (9); clinical teaching (9); psychomotor (7); clinical supervision (6); feedback (6); group development (5)

QUESTION #6

In the next six months, do you expect to have an opportunity to use any of the knowledge or skills you learned during the September program? Y 14, N 0

(If yes) Please describe what specific knowledge or skills you expect to be able to use.

Clinical teaching (8); lectures (8); group development (7); psychomotor (7); AV (7); clinical supervision (5)

The next series of questions is concerned with the exercises or simulations that you participated in at MSU that were videotaped for you to review at a later time.

QUESTION #7

Did you review the videotape in which you were placed in the role of a clinical teacher supervising a first-year resident? Y 12, N 2

(If yes) On a scale from 1 to 5, with 1 low and 5 high, how would you rate your overall performance as a clinical teacher in that videotape?
Mean (N = 12): 3.416

If you were to go through that same clinical teaching simulation tomorrow, how would you rate your expected performance? Again, use a scale from 1 to 5.
Mean (N = 14): 4.107

QUESTION #8

Did you review the videotape of your presentation assignment? If you remember, that was the one on the last Friday of the September session where you were asked to teach something to someone. Y 12, N 2

(If yes) On a scale from 1 to 5, how would you rate your overall performance in that presentation assignment?
Mean (N = 11): 3.227

If you were given the same assignment, to teach something to someone for twenty minutes, and you had to do it tomorrow, how would you rate your expected performance?
Mean (N = 13): 4.192

One respondent found this "hard to rate" and did not provide a response.

QUESTION #9

Did you review the videotape of the research and evaluation project presentation that you gave in January? Y 10, N 3

(If yes) On a scale from 1 to 5, how would you rate your overall performance in that presentation?
Note that the focus is on your presentation and the associated skills, not on the content of the research or evaluation project that you presented.
Mean (N = 10): 3.45

If you have to give a similar presentation tomorrow, how would you rate your expected performance?
Mean: ____

One fellow did not complete this assignment. One fellow did not give a point rating for the second question.

QUESTION #10

Has your participation in the September program changed your role or function in your organization? For example, have you tried some new teaching techniques or have you significantly changed any of your daily activities? Y 7, N 7

(If yes) Please describe how your role or function has changed.

Trying to do more or new types of teaching (4); not doing much teaching right now (2)

QUESTION #11

Has your perception of teaching as a career changed since the completion of the September program? Y 9, N 5

(If yes) Please describe how your perception of teaching as a career has changed.

More interested in it (3); more comfortable, confident (3); view teaching as more of a science, less of an art (2)

QUESTION #12

Do you feel that the September program has helped you to become a better teacher? Y 14, N 0

(If yes) Please describe how the program has helped you become a better teacher.
(If no) Please describe why the program has not helped you become a better teacher.

Gave me a conceptual framework to use when teaching (7); gave me information and skills I lacked before (5); gave me practice on my presentation skills (3)

QUESTION #13

If a friend or acquaintance of yours was interested in becoming a faculty member in family medicine or wanted to become a better teacher of family medicine, would you recommend the September program to him/her? Y 14, N 0

(If yes) Why would you recommend the September program to someone interested in becoming a faculty member in family medicine?
(If no) Why wouldn't you recommend the September program to someone interested in becoming a faculty member in family medicine?

Can learn things that are useful as a teacher (5); physicians have little exposure to educational methods (4); gives you a structural framework for teaching (4); recommended it already (3)

QUESTION #14

Is there anything else that has happened to you as a result of the September program that has not been covered by these questions? Y 4, N 10

(If yes) Please explain or describe.

No common response given.

QUESTION #15

Do you have any additional comments or concerns that you wish to express at this time? Y 8, N 6

(If yes) Please make them at this time.

Don't understand where perspectives in learning fit in (3); concrete, practical things most helpful (2)

APPENDIX L

FIELDTEST DATA: FINAL DEBRIEFING

FAMILY MEDICINE FACULTY DEVELOPMENT PROGRAM
1981-82 FINAL DEBRIEFING
SUMMARY

PART I: Written Responses

1. What is your overall evaluation of the program?

- Overall program was worthwhile--some half-day sessions weren't worth the time and effort, travel, etc. Highly recommend program to those in or going into Family Practice Residency programs as instructors.
- Very useful--probably will prove of even more usefulness as I get more involved in the teaching of family medicine. I'd recommend it to a friend who is serious about wanting to teach family medicine.
- I feel that the program was valuable for me as a future family medicine educator and that I have learned a great deal of relevant information that will be useful to me in the future. I would recommend it (and have).
- Yes (would recommend/worth my time). Hopefully next year we can continue to have individuals from our school attend this fellowship.
- Worthwhile for those interested in teaching medicine at any level. I would recommend it to a friend who was definitely committed to a future in academic medicine. Those with only an interest in part-time teaching do not need so extensive a course. It was worth my time, if only I did not have to travel so much.
- Very worthwhile. Generally covered areas not covered at all in routine medical education. Good use of people from disciplines other than medicine whose expertise is very relevant to our tasks but who in general were familiar enough with the peculiarities of the medical system to be relevant.
- I enjoyed the program a lot and would recommend it to others.
- Overall it was a good program with real applications for any further teaching responsibilities I might have. The first and last sessions were of most benefit.
- Excellent--met my goals. Would (and have) recommended it to others.
- Sept: Very good. ____'s programs were helpful, some of clinical teaching with ____ were good. ____ and
lbpefully next year we can continue to have individuals from our school attend this fellowhip. Worthwhile for those interested in teaching medicine at any level. I would recamnend it to a friend who was definitely camnitted to a future in academic medicine. Those with only interest in part- time teaching did not need so extensive a course. It was worth my time if I did not have to travel so much. Very worthwhile. Generally covered areas not covered at all in routine medical education. (bod use of people from other disci- plines than medicine who's expertise is very relevant to our tasks but in general who were familiar enough with the peculiarities of the medical system to be relevant. I enjoyed the program a lot and would reconnnmend it to others. Orerall it was a good progran with real applications for any further teaching responsibilities I might have. The first and last sessions were of most benefit. Excellent—met my goals. Would (and have) recommended it to others. Sept: Very good. 's programs were helpful, some of clinical teaching with were good. and 230 2. 231 give good programs .on sports medicine, but bad programs on curriculun developnent. Jan: Research presentations were +/-. could have been better. March: was good, excellent, terrible. l‘hy: excellent in May; mediocre (due to dull topic) in March. We didn't care about the structure of TV caneras of his program. - Overall, the program was excellent. I would recommend it to any fanily practice resident, regardless of whether or ~not the person was pursuing a faculty position in family practice. Teaching skills are applicable to many environments. Exposure to grants and research will increase substantially, the likelihood of myself doing something like that. - Excellent. Yes (worth my time). ihve already recommended it. - Overall evaluation and reaction to the program was favorable. I feel that it was well organized, well presented, and that the topics were pertinent. I would recomend the progran to a friend who might be interested in a career in family medicine. It was worth the time, though I do feel the sane anount of material could be presented in a shorter period of time. What was missing most from the_program? - Specifics-a lot of conceptualization was done without getting into actual solutions-more concrete answers; granted that there are some. - I felt the marked contrast of being here and 1001 involved in teaching education, then being back hane and 1001 involved in being a resident was a disappointment. I know you've identified this problem, arranged site visits, made projects relevant to hanne institution. Still, the contrast persisted and I don't know if anything more could be done about it. Maybe if I hadn't been still a resident...but I'm grateful for the opportunity. - Perhaps the only thing that I feel was missing was more clinical relevance to the model of medical education used in my program (which is more attending than active preceptoring); although I'm not sure this affected my learning. - I was the only person involved in a situation not related to Residency Practice planning and mostly felt left out. - Much of my time is spent on teaching rounds at the hospital. This was totally ignored in the program and really needs to be addressed. The one lecture on presentation skills was useful, but much more time needs to be devoted to this topic. I gave several lectures this year and needed more work on presentation skills. Therefore, two areas that are predominant in teaching were 9.22 232 covered adequately. 
You should also include information on clinical teaching with small groups, i.e., conveying information during rounds while maintaining efficiency in patient care. You need to get into the specifics; that's what's really difficult. The theoretical perspectives by were useless to me. - Hard to find time and energy to apply many of the things we learned back in the work setting. - How to teach residents when backup at the outpatient clinic. Canputers. lake-up and structure of a residency progran. Books or good reading sources (not a voluninous list—make it short). - Some of the more practical aspects of being a faculty member, e.g., preparing budgets, interviewing candidates for residency slots or faculty positions. - Good examples of research and grants. Too much time picldng apart those with problems rather than seeing one good example as a model. Would also like more time for interchange of ideas and discussion about individual prograns, problems, and innovations. - Would have liked a little more on political issues. lhderstanding State and Federal financing into Family Practice. - Developing curriculun' in family practice was weak, as was program evaluation and funding of fanily practice. Post speakers were not family physicians—but this did not reduce their credibility, as they all seened to be very tuned in to fanily practice. - Perhaps too idealistic, not enough of how to deal with people who haven't been here; not enough about dealing with the hopelessly incompetent student (or adminstrator). - I think expectations were met with the sessions on teaching. An area where expectations were not truly met was in the area of medical writing. What comments do you have about the administration of the program? - Tough to get out of residency progran for 5 weeks without sane resentment—assistance at MSU better and more available than that at hane office. Tough to get to East lensing twice in the winter. Maybe schedule March or January session in April. logistics of leaving Friday afternoon to make flights, etc. - Well done. Schedules mailed out prior to sessions and checklists were ve useful. Spreading out sessions thoroughout the year was good an worth the hassle involved working around rotations. My program did not obstruct my caning here--I felt it was very important for my education and career. 233 lb real cannents—overall, a well-aaninistered progran. Schedule was flexible and we appreciated it. Assistance was adequate and always there. Could it be condensed into less time? On the other hand, the practice sessions were most valuable—we needed the chance to practice what we were learning. Well administered. Have one-hour lunch breaks. "Night out" not the night before presentations. On-site visits and assistance were adequate and I have no complaints. Because of living in Lansing, I think I assuned, as did faculty, that there would be plenty of time to meet and discuss the major project, etc. But in fact we only managed to meet once. I think it's probably important for local fellow to still be given meeting times during the fellowhip week (e.g., with advisors about projects). I suspect site visits have the positive effect of rekindling enthusiaan for the project. The visits were excellent and I received all the assistance I desired. Schedules and keeping to it were well done. The first two weeks took a chunk of time from that month's rotation. I'd suggest breaking it up. I'd also suggest scheduling the sessions for either the first or last week of a month. 
Sessions that occurred in the middle of a month disrupted that month's rotation. No administrative problems that I was aware of. Fine! Program was well administered while physically at MSU and also while away, by mail. Assistance was available when necessary, around the major project and also around scheduling time to make it to MSU. Making a "book" of the major projects was good. Shorter lunch (1 1/2 hours optimal). Overall well run. I liked having the teaching sessions first so that I could evaluate subsequent presentations based upon those principles. I feel the aaninstrators made a conscientious effort to run a very efficient program. It was nice to have assistance available to the fellows throughout the fellowhip. I had a little difficulty with the fellowhip schedule because of commitments to my primary job. 234 How would you rate your contribution to the program? I hope I contributed my share. I feel I was vocal in all the study groups, etc. One can always be more involved--I learned a lot during breaks and after hours in informal discussions with fellow. I feel I should have taken advantage of OMERAD fac il it ies more . Yes (I was involved as I wanted). Yes (I took advantage of fellowship opportunities). I believe that I was involved and took advantage of the educa- tional opportunities, participating actively in both the fel- lowhip sessions and the after-hours sessions wlnere a good deal of informal learning was available. Participation would increase with more diversity of topics. Involvenent could have been more if hane-base situations were more understand ing . I was average in my participation. I thonght that participation of the fellow was fine. Ideally I would have liked to do a project more directly related to my residency, but my needs were fairly clear to complete and write up my research project. I think I could have applied more of what I learned on a different kind of project. (lbwever, I was very grateful to have help doing what I did!) I rate my contribution as average to above average and received feedback from this. I probably talked too much. More "take-home" assignments. Not so much for my experience, because I think I did—tried a lot back hane, but maybe others would benefit from being pushed more to try some of the stuff. I felt involved and as though I had something to add. Specifically, "allowing" the fellows to experiment and be comfortable in the group" and discussing group dynamics was helpful. I occasionally felt I wanted to participate more but wasn't sure of how to do so. I would rate my contribution to the program as average. I feel that because of outside responsibilities, I was probably not quite as active as I would like. Also, since I had just obtained a new faculty position, I do not feel that I was canfortable enough with my new duties to‘ be able to fully tap the fellowhip. 5. 6. 235 What would better prepare fellows for the program? - I feel any fellow in a residency program, preferably 3rd year or' any attending in a F. P. residency program, is "prepared“ to take the program. - Probably most important is a fellow's commitment to teaching. Having a group full of cannitted educators will assure active and fruitful participation. - Reading list or expected topics before the program starts--could help us cover more material. - Better orientation materials before coming to session, describing the details of the year. - Talking to former fellow. - Possible reading sources before the program or articles to read. 
- Discussions with previous fellows (I did this and I think I had a pretty good perspective on the program before I applied).
- Everyone should be well versed in its content-expectations prior to coming, to get the most out of it.
- A 2-3 page handout describing some of the main areas covered in the program, i.e., teaching skills, research, grants, implementations, etc. Mention small group work, major project, understanding AV, etc.
- I think I would better prepare my program director for what the fellowship had to offer, to help the fellow use his/her residency as a more open "lab."
- I think future fellows might appreciate a preview of the sessions that will be offered over the course of the program before they start it in September.

6. What comments do you have about the evaluation of the program? (pre/posttest, telephone interviews, end-of-week evaluations)

- All are necessary, and I hated doing them. I'm the type who would like to know in advance what I am responsible to do.
- Pre/posttests frustrating. Sometimes they violated ____'s principle--"tell them what you want them to know." Knowing "the elements of group development" hasn't proved relevant to my job as a teacher. After 9 months, I don't know how I did on the pre/posttest
Nice setting—both classroom, canpus, and outdoor recreational activities. I'm sad it's over--I'll hope to keep in touch with many of the fellow. More out-of-state fellow who are definitely invested in an academic career. The stipend did not cover or barely covered my expenses; as compared to local people who had little or no expenses. That should be taken into account when determining the amount of the stipends in the future. Should, for practical reasons, have it in three sessions: 2 weeks in September, 2 weeks in February, and 1 week in lhy. I would like to know ways I can further my faculty developnent after this course. A few sessions could be eliminated or strengthened (e.g., curriculun developnennt could be more varied, practical; O". 's could be eliminated). Would a portion on use/abuse of PA's, nurse clinicians, other health professionals in practice and teaching be valuable? Add session on audits--this is an important camnittee function--basis for research and looking at problem areas. l-bw to successfully take charge or partake in an audit. The D.O. and M.D. mix in the program added interesting perspective to the educational process of the fellowship. A non-bias discussion group (too much political intrique with the session in the first week) around where the two professions fit into the health care system would be of interest--_especially some issues around manipulation. More practical aspects of how to help impaired physicians. less on docunenting their existence. Also, would have liked the OIERAD faculty to make more coments during Major Project presentations (although I appreciated your allowing the group to make conlnents). Kind of disappointed there. 238 PART II: Discussion Topics 1. 2. Major Projects - Site visits helpful A positive, worthwhile experience Progress reports, presentations, and notebooks helpful Learned from hearing other fellows' presentations Politically beneficial to sponsoring institution Suggestions: - Incorporate minor projects into major projects -‘ Snaller groups to discuss projects - More feedback - Don't require contract/signature from residency director - Make clear to director what is required of fellow and what they will receive in turn September Session: "Teachingand Learning: - Mixed reaction on length of session - Avoid pretest on first day - Structure of first session good-group interaction beneficial A. Elenents of O'oup (Developnent: - Highly theoretical-hard to grasp - More practical clinical examples D. E. F. G. 239 C1 in ical Teaching Technique: Did not use profile instrunent - Too medical student oriented; not designed for this audience - More practical application to hospital teaching rounds - Focus on resident teaching model, not preceptor model - Clinical teaching simulation helpful Onrriculun Development in Family Medicine .. O'o p and - Give basics of curriculun development and have fellows discuss their institutions specifically Issues in Family Medicine: - O‘op - Fertile topic, useless session Producing Audiovisual Materials: - Need more time - More medical photography Presentations Stills: - More in this area, including public speaking, meetings, lecture/interaction presentations - Add annall group teaching skills Perspectives in Teaming - Drop professional - Too theoretical; focus on practical concepts/skills 240 3. January Session: "Research and Evaluation" F. 
Need good example of research and good questionnaire Direct (hservation and Rating: - Topic useful - fiesentation weak - TV not appropriate Issues and Strategies for Clinical Evaluation: - Generally negative feedback - Needs to be more generalizable Planning and Conducting Research in Applied Settings: - Good information but poor presentation - Include people wlno have done research Pragram Evaluation: - Least helpful - Need more activity - Show good as well as bad examples Questionnaire Design : - Expand into practice (lecture/AV) Writing for Publication: - Good content - Disliked exercise - Use materials'that were requested to be brought “. 5. 241 March Session: "Issues in Family Medicine" A. D. Time lbnagement: - Useful Funding of Family Medicine: - Generally positive feedback - Be sure to get national focus Comittee Membership: -' Generally positive feedback - Get perspective of hospital aaninistrator Grant Witing: - Session well done - Activity frustrating--many points not clear - Need good exanples to clarify lbalth Policy and Planning: - Generally positive feedback May Session: "Administrative Skills" A. Aaninistrative Skills: - Discuss dealing with different administrators - Include budgeting basics and skills - Perhaps have an additional speaker 242 B. Hidden Curr iculun: - Relate to family medicine educators - Interesting in format ion - Think about a session on stress managenent A PPENDI X M FIELDTEST DATA: PROGRAM DIRECTOR INTERVIEW Interview Responses of FMFD Program Directors 6/16/82 KEY: A - Program Director A B - Progran Director 8 QUESTIONS #1 AND #2 The nunber was easily very close to being twice as large as ever before and this had implications for physical things, the roan, how are we going to monitor these people. We expanded the clinical teaching component and one of my concerns was whether or not those three sessions held together as a unit. How much was repetitive, how much was new, how much was consistent? More time, new content, added clinical teaching. technique content. This was expanded in response to previous evaluations. I knew that perspectives in learning was a high risk going in. The mix of M.D. - D.O.'s a concern. M.D.'s mostly third year residents; D.O.'s both residents and faculty, mostly faculty. Thus they found themselves in different teaching situations. Both would be doing clinical teaching, but the D. O.'s more likely to be doing formal classroom teaching. Mix-up between major project and assignments in the past. This time we stipulated the separation between the two from the very beginning. One thing I felt was useful at the end. QUESTION _Q My major surprise was the fact that the general class of activities was less well received than ever before and the fact that we lost a fellow due to . It did not go over well. 's clinical teaching stuff perceived better in earlier OfferIngs. mite surprised, they said it focused more on med students, preceptor model, rather than on teaching residents. Another surprise was that despite all our advance work and proscribing content and behavior, and talked about sports medicine rather than curriculun development. Along with and , they pointed up the weakness in using clinical personnel as faculty in this program. We need to use them, but they're not pulling the freight. Also surprised by the reception of the AV workshop. They wanted'more of it, wanted to actually produce the stuff. 243 244 We both came out with the notion that this is a nuts and bolts group. 
The group we had last year was more academic, liked topics of a general nature. Yet, I wasn't less satisfied than last year. We had a different group and we knew right then that certain things would go over well for them and that it required a different curriculum. Don't really know what happened with , either. was coming off a year's sabbatical and maybe wasn't as well tuned in as before. Disappointed in the skill and motivation level of the . Ones we had before I thought were of higher quality. were very good. This group couldn't apply infomation, just rote recitation. Disappointed in their ability to verbalize what we were trying to teach. I agree with that, also with knowing that that was the group that was encouraged by directors to come as opposed to they initiated the request. looking back that will make a tremendous difference in law we recruit classes in the mture. You can't force somebody to like teaching. QUESTION # “ Difficult to say who stars will be. Easier to identify the others, the losers. I knew, I could tell by the type of questions they asked, how much they knew about the program. When I visited there was already sane hesitancy on his part. and were both obvious risks, we knew already and who was using the fellowship to punp up his own program and to get information to use for his own program and grant. was ambiguous about what the fellows could do for him while was very specific, positive about possible projects. I had some similar experiences and had some surprises. Not impressed with . Didn't expect as much as we got. Sane with . Tet him in social setting, that colored my reactions. Sane for , obvious he wouldn't be superstar. QJestion just how far he could come. also obvious from day one that he would be a problem. On the other hand, guys like , I knew he was good as soon as I walked in the door. Same with . With one or two exceptions it was pretty easy to do. One thing that was disappointing, we needed more of a clarification of how many of these people were going into full-time teaching. Had several go into private practice which was different fran their applications. 245 We are pushing the next group harder to be honest with us as to lnow they're going to use it. QUESTION # 5 No baseline to judge it by. General impression was that they had a long way to go. They had been exposed to it, but were not very polished. QUESTION #6 Easy for me to see changes from September to January for , struck me as one really beginning to put things together . regressed, the focus was on research, he wasn't really sure what he was talking about, hadn't really done much in the way of presentations before. improved immensely. He started to spark in my mind at that the. He conceptualized a good evaluation scheme and laid it out in a well organized manner. I was sanewhat disappointed with . He's hot and cold. had no apparent carry-over. has no preparation. Except for the final major project presentation. He would not bring up anything that indicated preparation. Some individuals used presentation principles we talked about while others that watched them didn't pick that up or chose not to do likewise. I saw a lot of them make attempts to use overheads and some organization that I had not seen before. Sane modeling of that. Some pretty primitive, violated rules, but got sense of how to use AV. Became slightly more polished. Some improved, as a group saw about 50/50 improvement. More like 70/30 improvement. 70: showed improvement. I saw the really bad ones in September. , , . 
QUESTION #7

The major project presentations were the most well thought out statements they had made. Mainly because we pushed them. Reinforced in their minds what they would be presenting, that they would have to have handout materials. Motivated them to sit and think about what they were going to say.

Some had already given their presentations to other groups. Had some experience already presenting. As a group they were much better than January.

I'd agree with that. They felt that they had made improvement. They were proud of their efforts. There were few apologies made about the presentations or their papers. Whereas I had a lot of apologies in the earlier sessions.

_____ was a negative surprise for me. When I first met him he was always enthusiastic and asked questions, but there are still some basic teaching skills he's lacking. _____ came in a little lower than my expectations. I had higher expectations for him than I saw. I had two expectations: one for the person as he participated in the program and the other for what he accomplished with his major project. He came of his own volition and made a good permanent contribution to his program. But he didn't get involved as much as I would have liked.

QUESTION #8

B: _____ demonstrated understanding of how much information to put on slides.

A: No specific examples, but I do remember people, either in jest or in a lighthearted manner, talking about how someone else violated principles or didn't do this or that. _____, for example, didn't use overheads or handouts and they commented on this. A couple of times with _____ we went over basic principles of overhead design. _____ called about developing guidelines for supervision, referred to things talked about.

QUESTION #9

A: No. I gave out information on perspectives in learning to _____ and _____, gave them an extra article I had not given to the entire group. With the presentation skills they wanted more practice, not more content.

QUESTION #10

B: _____, when I went out to work with him on the major project, as an aside he asked me to help him with two major presentations he was giving. We sat down and discussed presentation skills, audience involvement techniques. He called us, and as a result of that he picked up three more lectures to give to first-year students. And he brought in his evaluations and they were excellent.

I don't know if we can classify submission of STFM papers as unintended outcomes, but I was pleasantly surprised that several people had done that.

QUESTION #11

B: When you look at all of them, they're still a little bit better than past groups. A couple are clearly low, but there's a couple of those in each group. I think they came in lower. I think we did more for them to clean up their acts. I think they went farther and ended up higher.

I disagree. I would classify, group wise, the overall entry skills of this group as higher than last year. I think we had a higher quality fellow this year, didn't take them as far, but overall quality this year is higher than last year. I think the presentations and major projects were clearly better than last year or any other year.

I agree. Would have to do an analysis of each one. Numbers deceive you too. More bad apples this year. In terms of taking advantage of what the program had to offer, they played around with their projects; nothing will change in their lives as a result of being in our fellowship. _____, I think, picked up some skills, but can just write those other three off. _____ came in at a higher level, he was a former teacher. _____ right up there, _____, _____, _____. _____ also a former teacher. _____ did well.
_____'s presentation not great, neither was _____'s. _____ tried hard but had not quite put it all together yet. Hard to separate the presentations from the quality of the project.

QUESTION #12

_____ is a counterexample of what I originally thought. Overall I was happy with the group, didn't have major concerns, little disappointed in their inability to generalize or translate any information we gave them. If we didn't have a practice session and talk about specifics, they gave up pretty quickly. We had to make the application for them. They couldn't. There were times the presenters were not what they wanted and they got passive. I was a little disappointed in that.

We have yet to underestimate their abilities. By successive approximations we're coming closer to a better program. I'm not going to throw out anything based on the reactions of this group. Will make curricular changes, do that all the time. It's more that we learn as they do. We have to be careful taking their opinions of what they say they can do.

APPENDIX N
FIELDTEST DATA: SUPERVISOR INTERVIEWS

SUPERVISOR INTERVIEW QUESTIONNAIRE
NAME: COMPOSITE RESULTS

QUESTION #1

Once Dr. _____ learned about the FMFD Program, did you encourage him/her to participate in the program?   Y 14   N 2

(If yes) Why? (If no) Why not?

Dr. _____ had expressed interest in teaching as a career (6)
I felt Dr. _____ needed some skills improved (3)
I saw Dr. _____ as a potential faculty member (3)

QUESTION #2

Has Dr. _____ shared any of the information or new knowledge or skills that he/she learned about teaching during the September program with you or other members of your organization?   Y 12   N 3

(If yes) Which of the following method or methods best describe how he/she shared this information?

 3 (25%)  formal presentation
 4 (33%)  individual consultation
12 (100%) informal conversations
 1 (8%)   written communication
 0 (0%)   other (please specify)
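The percentage column above is computed against the number of supervisors who answered "yes" to the question (12 for Question #2), and respondents could mark more than one method, which is why the column can total more than 100%. The short sketch below (Python) illustrates that tabulation only; the method labels are taken from the item above, but the response records are made up for the example and are not the study's raw data.

    from collections import Counter

    # Hypothetical records: one list of selected sharing methods per supervisor
    # who answered "yes" (multiple selection allowed). In the questionnaire
    # above there were 12 such supervisors.
    responses = [
        ["informal conversations"],
        ["informal conversations", "formal presentation"],
        ["informal conversations", "individual consultation"],
        # ... one entry per "yes" respondent
    ]

    counts = Counter(method for selected in responses for method in selected)
    n_yes = len(responses)

    for method, count in counts.most_common():
        # Each percentage is the count divided by the number of "yes"
        # respondents, so the column may sum to more than 100%.
        print(f"{count:2d} ({count / n_yes:4.0%}) {method}")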
QUESTION #3

Do you know if Dr. _____ has been able to use any of the new knowledge or skills related to teaching that he/she learned in September at MSU?   Y 15   N 0

(If yes) Please describe the types of knowledge and/or skills that Dr. _____ has been able to use, and the types of situations that they have been used in.

Clinical teaching skills (5)
Presentation skills (5)
Group discussion skills (4)

QUESTION #4

Have you observed Dr. _____ doing any teaching since late September? This could include activities such as one-on-one clinical teaching or supervision, small group discussion teaching, or formal lectures or presentations.   Y 12   N 3

(If yes) How often have you observed Dr. _____ doing some teaching since late September?

 1  once                  Clinical teaching (5)
 1  2 to 3 times          Presentations (5)
 2  4 to 5 times          Group discussions (3)
 _  more than 5 times

QUESTION #5

Do you feel able to judge whether or not Dr. _____'s teaching behavior has changed since late September?   Y 12   N 3

(If yes) How has Dr. _____'s teaching behavior changed since September?

More confident, comfortable (5)
More organized (3)

What do you think has caused the change in Dr. _____'s teaching behavior?

Exposure to program (2)
More mature (2)

*Two different supervisors were asked questions 3-5 for one of the fellows.

QUESTION #6

Have you noticed any change in Dr. _____'s role or function in your organization since the end of the September program? For example, has he/she become active in new areas of your program or has he/she taken on new responsibilities?   Y 10   N 4

(If yes) Please describe this change in Dr. _____'s role or function.

Trying to do some research (2)
Serving as coordinator of preceptor program (2)

QUESTION #7

Has your program benefited in any way by Dr. _____'s participation in the FMFD Program?   Y 13   N 1

(If yes) Please describe how your program has benefited. (If no) Please explain why you do not believe that your program has benefited.

Has shared new information with us (3)
Has expanded our educational base, content wise (3)
The major project (3)

QUESTION #8

Would you encourage another resident (or faculty member) from your program to participate in the fellowship program in the future?   Y 14   N 0

(If yes) Why? (If no) Why not?

Opportunity for resident to check out academics as a career (2)
Opportunity for young faculty to grow (2)
Trying to accelerate development of a young program (2)

*Question not repeated for three supervisors with more than one fellow in program.

QUESTION #9

Do you have any additional comments about either Dr. _____'s teaching behavior or skills or about the FMFD Program that you would like to make at this time?   Y 11   N _

Logistics of fellowship very workable (2)
Would be helpful if supervisors knew more about what they can do to help (2)

QUESTION #10

Would you like to receive a copy of the final evaluation report on the FMFD Program?   Y 1_   N 0

QUESTION #11

Do you have any other comments or concerns that you would wish to express at this time?   Y 2   N __

Comments combined with #9.

APPENDIX O
METAEVALUATION DATA: PROGRAM DIRECTOR INTERVIEW

The evaluator conducted a telephone interview with the two directors of the FMFDP on Friday, October 22, 1982. The information gathered during this interview was used to answer four research questions as part of the metaevaluation of the fieldtest of the evaluation framework. A transcription of that interview follows.

KEY: A - Program Director A
     B - Program Director B
     E - Evaluator

RESEARCH QUESTION #2: Was the evaluation framework practical in its use of resources?

SPECIFIC QUESTION: Did the evaluation procedures produce information of sufficient value to justify the resources expended?

B: I felt, in general, yes. But I think that there were specific sources of data that were expensive and time-consuming for everybody and we got very little out of, in particular the pretest and posttest. It could be a weakness in the instrument, but I felt that there wasn't a lot of information that came out of it, and a lot of time went into getting the faculty to write the items, going over and rewriting the items, scoring the items, and it took a fair amount of time for the fellows to complete it. I just kind of discounted the data at the end. That for me was one of the major weaknesses.

A: I guess I wouldn't come down as hard on the cognitive assessments, but they were a pain in the chops. But I think that like any test there's always a high initial cost. What I've been thinking about, we did it once, would it be worth doing every year? And my answer to that would be, "No, I don't think so." I felt justified in using the resources that we did for a one-shot deal. I don't think that it would be worth repeating all of those every year. But I think they did provide some valuable data that we just didn't have.

B: One of the reasons I felt the paper-pencil test was weak was that the main emphasis of those two weeks is heavy on skills that are assessed in some form of demonstration. I think it's real hard for faculty to write items that assess whether someone has identified a skill or not.
I felt that is where we kind of fell down. That's why I say that it might not be the method that was bad, but the fact that if we are going to continue doing this we need to come up with items that are better written and more valid given the skills.

SPECIFIC QUESTION: Were the evaluation procedures administered so that program disruption was kept to a minimum?

I didn't see that it intruded much at all; the only possible exception is the cognitive pre-, post-, delayed posttest, but other than that we're doing most of those things anyway during the actual two weeks of the program. You did a lot of things like your telephone calls and that outside the program; it didn't really disrupt the program, so no, I don't think the procedures overall were that invasive at all.

I pretty much concur on that. Are you interested in knowing if it had been disruptive if we had been doing the evaluation? Or only if it was conducted with an external person doing it? If I personally had been conducting it, I think it would have been very disruptive as to running the other parts of the program, trying to do both of these simultaneously.

Do you think the results would have been different as well?

Yes, I think they probably would have been.

How?

Well, it's hard to speculate, but usually someone who has designed and delivered part of the program becomes so invested in what it is they are evaluating that it becomes hard to be totally unbiased. My guess is when I spoke with directors and conducted some of those other activities that the results might have been different.

I'm not so sure. I think that anybody who's doing an evaluation or who has an educational background would probably have gotten the same story.

SPECIFIC QUESTION: Did the use of multiple instruments appear to yield results that justified the extra time and effort involved in their development and administration?

My inclination is that because it was a more comprehensive study it was definitely justified for doing it the initial time. I think one of the things you learn is where you get redundant information, and then you can ask yourself, "Given that you want a certain quality of information, which source might I pick?" If everything is just confirming what you hear over and over again, you can start to select out the procedure that gives you the best information for some of the least effort. I think that with all the telephone calls plus the debriefing at the end as well as the End-of-Week Evaluations, we were starting to get some redundancy in our information. If I were to do it again, I would probably chop out parts of that.

I basically agree.

RESEARCH QUESTION #3: Was the evaluation framework valuable in providing information to you as decision makers?

SPECIFIC QUESTION: Did it provide information that answered specific questions that you had about the program?

I think it did. At the time we were in our fourth year in operation and there were some nagging questions that I always thought about, but because of the complexity had never mounted a sufficient evaluation effort to answer those questions. In a sense your framework answered those questions, more on a confirmatory note. I had sort of hunches, but now I had data that told me that we were doing the right things or we were way off-base here; that was very helpful to me. Again, the question comes, would I do this every year, and my answer would be "Probably not."
Maybe if there were changes in the structure of the program, if we went to another format or if we had significant changes in the faculty that presented during the session, then I would want to repeat. But as the program stabilizes, now that I have this information I feel much more secure in knowing that we have some data that says we're doing something right.

I'd agree with that.

SPECIFIC QUESTION: Was the information that you received complete and comprehensive? Was there anything left out that you would like to have known?

Perhaps the only bit of information that was missing, and I don't know how in the world you ever could have collected it, was some documentation of them using the skills in their setting. In other words, they told us they do lectures and work with medical students, but if we could have actually observed them performing some of the skills taught during the September session in their own environment, that would have been very helpful to me.

I would agree that it's always ideal to see if there is direct application data. There were times when, if anything, I felt that I was maybe inundated with information. A lot of individual information was useful, but I might have wanted to see something a little more summarized. But that's more preference than anything else. There was no problem with what I got, but as an administrator, and having a certain level of comfort with the program as it is, I would like to have been able to get a one- or two-page executive summary of just the bottom line.

SPECIFIC QUESTION: How could the evaluation have been changed to provide more useful information?

Maybe changing the cognitive level of the test items, making them more application and problem-solving rather than recall.

Yes, we really loaded them with that; I think that was a function of writing them kind of at the last minute. One idea, just for organization, given that it's a program evaluation and really a fellowship evaluation where you look at the whole thing: maybe it would be a useful summary to look at the major goals of that two-week session, and the reason I'm saying goals and not objectives is because you get so bogged down once you get into all 35 objectives, but maybe if you looked at the major goals and determined if we have evidence that they were attained or not or to what extent, that may be another way of organizing it. I'm not so certain that would have been better, but it's something to think about since that was really the macropurpose from our point of view: "Were our goals reached?"

Another possibility would be to get some baseline data on the performance skills, like how well could these people do presentations before they got there and how well could they do clinical teaching before they got here. I recognize some of the problems in doing that, but in an ideal world it would have been nice to have some entry level behavior.

Yes, I've wondered about that, too. Another thing that would be ideal, but would also put constraints on the fellowship, is if we could somehow standardize the presentations so that they were more equivalent to begin with. In other words, we give them a lot of freedom to do different sorts of activities, and if we were to give them more guidelines...

Proscribe it a little more...

...it would be easier for us to determine if they have really been applying some of the things that we have been talking about, but at the same time they would lose the freedom of picking what they want to do.
RESEARCH QUESTION #4: Were the methods/instruments used within the evaluation framework technically adequate?

SPECIFIC QUESTION: Were the sources of information described in enough detail for you to assess the validity of the information they provided?

Yes, I'd agree, because I think we're familiar with it. The question would be more appropriate for someone who isn't as familiar with the program as we are. I've got no problem with it, but I don't know how someone who isn't intimately familiar with the program design would respond.

The point is that we worked with you to help develop some of these things, so we knew from the very beginning where you were going, but the question as I understand you are asking it is whether in the report, in the summaries, there is sufficient information to assess the validity of the instruments. My answer would have to be "yes" because we were involved very closely, but I don't know if somebody else reading this report, like my project officer in Washington, would be able to. I would say for me that it's no problem; for others I just don't know.

I would agree.

SPECIFIC QUESTION: Were the information-gathering instruments and procedures described in enough detail for you to assess the reliability of the results they produced?

B: I think that there's a lot of room, just in the design of it, for unreliability, but at the same time, given an evaluation, that's something you have to constantly live with. But I think you made a strong effort with your interrater reliability scores and by letting us see the items. It makes it fairly easy to assess it, but I think we have to still live with the fact that some interviews may have taken a different tone than others did based on how well they knew us and how well you know of their programs and all that. So I think that there's room for unreliability, but I think that you just made a real good effort at removing as much of that as you can from it.

I would agree with that.
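The interrater reliability scores referred to in the exchange above are not detailed in this transcript. Purely as a point of reference, the sketch below shows one conventional way agreement between two raters of the videotaped presentations might be summarized (simple percent agreement and Cohen's kappa); the ratings are invented, and this is not presented as the computation actually used in the fieldtest.

    from collections import Counter

    def percent_agreement(rater_a, rater_b):
        """Proportion of items on which the two raters gave the same rating."""
        matches = sum(a == b for a, b in zip(rater_a, rater_b))
        return matches / len(rater_a)

    def cohens_kappa(rater_a, rater_b):
        """Agreement corrected for chance, for two raters using the same categories."""
        n = len(rater_a)
        observed = percent_agreement(rater_a, rater_b)
        freq_a = Counter(rater_a)
        freq_b = Counter(rater_b)
        expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
        return (observed - expected) / (1 - expected)

    # Hypothetical ratings of 10 videotaped presentations on a 1-5 scale.
    rater_a = [4, 3, 5, 2, 4, 3, 3, 5, 2, 4]
    rater_b = [4, 3, 4, 2, 4, 3, 2, 5, 2, 4]

    print(f"Percent agreement: {percent_agreement(rater_a, rater_b):.2f}")
    print(f"Cohen's kappa:     {cohens_kappa(rater_a, rater_b):.2f}")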
SPECIFIC QUESTION: Was there evidence that the data were collected and analyzed systematically?

B: I'd say "yes."

A: Yes.

E: Was there any area where error may have entered into the process?

B: I'd say the cognitive tests, just by the constraints that we had for them. They could have easily been done in groups or with the information right there in front of them. Looking at their answers I don't think they did that, but I think that was certainly a possibility.

A: I'd agree.

SPECIFIC QUESTION: Did it appear that the quantitative data were appropriately and systematically analyzed?

A: From what I saw, I would say yes.

B: Yes, I would agree. I think that you made a good effort at analyzing it.

SPECIFIC QUESTION: Did it appear that the qualitative data were appropriately and systematically analyzed?

A: Yes, as much as you can subject any of that data to analysis.

B: And also it tends to get a little voluminous at times. I think that's where I got inundated with information.

SPECIFIC QUESTION: Were the conclusions presented in the evaluation report supported by the data?

A: Yes.

B: I'd agree.

RESEARCH QUESTION #5: Were the methods/instruments used within the evaluation framework ethical in dealing with people and organizations?

SPECIFIC QUESTION: Was the evaluation report open, direct, and honest in its disclosure of pertinent findings, including the limitations of the evaluation?

A: Yes.

SPECIFIC QUESTION: Was the evaluation designed and conducted so that the rights and welfare of the human subjects were respected and protected?

Yes, I think that you were up front with them at all times and there were no hidden agenda. You did not tell them they were being audiotaped, to improve the quality of responses and to make them more relaxed. Also to improve the accuracy of the responses you recorded.

Additional questions were asked that did not relate specifically to any of the research questions. These questions did relate to the question of how well the evaluation framework had functioned when it was applied to the September session of the FMFDP.

What would you do if you were to do it again?

I think it is a job for someone other than the director or assistant director. In terms of a report, I would do the End-of-Week Evaluations, I would do the cognitive pre-, post-, delayed posttests, although I would want that to be a different type of cognitive measure. I would do the videotape presentations and ratings. I would do the final debriefing. I guess it's the interview that I have some question about in terms of the cost, the time, and I think we can get the data from other sources.

Yes, I think I would agree with that, because we are getting an awful lot of information directly from the fellow. They have input at the debriefing, with the End-of-Week Evaluations, and it seems to me that the interviews are an expensive way to add to that.

How about interviewing the supervisors?

I guess in my mind that, other than for the public relations payoff, I don't see that that contributed a lot in terms of information. It gave us a lot of satisfaction indices, but in terms of providing us with information that we could use to improve and revise the program, I didn't see a lot there.

I guess I'm a little more split, in that sometimes I think the PR factor actually does have its own positive impact on the evaluation, and as you get closer to understanding the types of settings under which they're working, the kinds of responsibilities that they are initiating rather than just being given. I can't remember if it was read or if I was picking it up from you along the way, but I was learning things that I might not have learned through any other source.

What was not done that should be done?

You know what I would do? I would in some way incorporate the two visits we make to a setting, the one for the pre- and the one during the fellowship, somehow incorporate that into a data collection effort. I don't have specifics, but I think I would use that somehow to collect data.

Which evaluation source provided the information of most value to you?

The participants.

The fellow.

Which evaluation method provided the information of most value to you?

For me, the ratings of the actual performance, the presentations.

It depends on which question; let me qualify it that way. The basic question that I was interested in was "To what degree is what we're doing transferable?" In other words, what can they now put into operation, and that data helped me a great deal. In terms of overall acceptability, the End-of-Week Evaluations are what I look at.

Almost exactly. There's not a single instrument that I would have my confidence in, but if I had to pick two working together, one where we directly observe their performance and one where they give us information on how well we did on our presentations, the two together would be the best guess.

Which type of data would you rely on?

Behavioral.

Performance data.

Which would you least rely on?

Cognitive.

The cognitive and the supervisors' report.

What was the overall strength of the evaluation?
The strength for me is that it did make an attempt to collect data about both cognitive and affective outcomes, and it used a variety of different sources of data, different data collection methods, and there was some redundancy, and I think that's good.

I think the strength of it was that you managed to hone in on what I think are the three best types of information. The weakness I felt was that in an attempt to be comprehensive we might have taken in too much information. I think maybe we had too many measures under each of those different types. That's what I've learned from it anyway; we didn't know that going in.

What are you doing this time?

It may be a little early to tell. We haven't implemented any of the additional things that you've done yet; primarily our questions were answered, the big ones. I don't think we've made any decisions if we're gonna do any of the follow-up interviews yet, but it may affect how we gather information when we go out to do the site visits. We certainly videotaped them. No thought of doing any analysis. It's not because we don't feel it's important; like I said earlier, this is the fifth year of the program and we have been doing the same thing for three years, and your evaluation came in the fourth year and provided confirmatory data. In other words, we got confirmed that what we were doing was right.

Another thing along with that. I think what we discovered was that some of our initial assessment of their performance was substantiated with fairly rigorous ratings of those. I don't think we were that far off. It's not like they weren't rated before. We sat down and we had our own criteria that we judged them against and that we gave them feedback on. What the rating process did was just formalize that more.

Anything else?

I was real happy with it.

Nothing other than I think you did a real nice job.