This is to certify that the thesis entitled COMPARISON OF THE PRETEST/POSTTEST VS POST/THEN SURVEY TOOL IN THE CONTEXT OF CONSUMER FOOD SAFETY EDUCATION presented by TRENT WILSON WAKENIGHT has been accepted towards fulfillment of the requirements for the MASTER’S degree in PUBLIC RELATIONS.
Michigan State University

COMPARISON OF THE PRETEST/POSTTEST VS POST/THEN SURVEY TOOL IN THE CONTEXT OF CONSUMER FOOD SAFETY EDUCATION
By Trent Wilson Wakenight
A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of MASTERS IN ARTS
Department of Advertising
2004

ABSTRACT
Faced with decreasing budgets, food safety educators must develop effective evaluation strategies in order to accurately prove intervention impacts to funding providers. While many educators use the pretest/posttest self-report tool to record consumers’ knowledge changes, results can be skewed because consumers misperceive their knowledge levels in the pretest stage, over- or underestimating their knowledge. Subject to an altered frame of reference, or “response shift bias,” after the intervention, consumers’ knowledge is thus reflected inaccurately in the posttest. This can lead to conservative survey scores and inaccurate intervention impact(s). An alternative, the post/then tool, retrospectively measures participants’ resulting knowledge change by asking the pretest and posttest questions after the intervention. This reduces response-shift bias and more accurately measures impact(s).
This study compared pretest/posttest and post/then mean scores within the context of food safety education for seniors. A pretest/post/then survey was also given to further estimate the accuracy of results. While the latter approach failed due to an inadequate number of participants, the post/then approach gave a more accurate portrayal of impacts and less conservative scores in eight of eight dimensions versus the discovery of meaningful scores in just five of eight dimensions in the pretest/posttest tool. These results can be used to provide justification for program existence and the maintenance of funding.

ACKNOWLEDGMENTS
Thank you to my committee for their guidance, expertise and patience. Thank you, Dr. Sandi Smith, for graciously accepting the role of committee member on such short notice, and for your guidance and insight in assisting in the crunching of data and exploration of SPSS. Thank you, Dr. Nora Rifon, for taking time out of your very busy schedule to offer direction and insight, and for leading me through the challenges of survey implementation and dependent and independent variables. You have always welcomed me into your office even on the busiest of days and taken time to discuss not only coursework, but family life, as well. Thank you, Dr. Wrigley, for three years of direction, guidance and caring. As I look back on my thirty-three years, it is teachers who have always made the most impact in my life. You are one of those individuals who have instilled in me curiosity and inspiration, and taught me that justice, equal rights and tolerance are things very much worth fighting for. Thank you for your kindness and compassion. Lastly, thank you to my family: Lisa, Shelby and Cody. Your devotion and love during these three years have made my chore tolerable and worthwhile. I can only hope, Lisa, that our academic achievements are inspiration to Shelby and Cody to achieve all that they set out to do and that no goal is too big, no mountain too high.

TABLE OF CONTENTS
LIST OF TABLES
INTRODUCTION
BACKGROUND
THE CONTEXT OF FOOD SAFETY, EDUCATION AND EVALUATION
LITERATURE REVIEW
SELF-REPORT SURVEYING
MEASURING CHANGE
RESPONSE SHIFT BIAS
POTENTIAL PROBLEMS IN USING THE SELF-REPORT TOOL
ATTRIBUTES OF PRE-TEST TOOL
THE RETROSPECTIVE-THEN MEASURE PROVIDES A SOLUTION
CHALLENGES IN USING THE POST/THEN APPROACH
TYPES OF POST/THEN SURVEYS
A COMPARISON OF METHODS WITHIN THE CONTEXT OF CONSUMER FOOD SAFETY EDUCATION FOR SENIORS
RESULTS
CONCLUSIONS AND DISCUSSION
EVALUATION FROM A BROADER VIEWPOINT
APPENDICES
APPENDIX A: DEMOGRAPHIC INFORMATION FOR STUDY PARTICIPANTS
APPENDIX B: SURVEY TOOL USED IN SENIOR FOOD SAFETY EDUCATION INTERVENTION
BIBLIOGRAPHY

LIST OF TABLES
TABLE 1: COMPARISON OF MEAN SCORES FOR CONDITIONS ONE AND TWO
TABLE 2: PAIRED SAMPLE T-TESTS
TABLE 3: FIVE AREAS THAT ASSIST IN EVALUATING A PROGRAM
APPENDIX A: DEMOGRAPHIC INFORMATION FOR STUDY PARTICIPANTS
A1. STATISTICS
A2. AGE
A3. GENDER
A4. OCCUPATION
A5. EDUCATION LEVEL
A6. INCOME
APPENDIX B: SURVEY TOOL USED IN SENIOR FOOD SAFETY EDUCATION INTERVENTION

INTRODUCTION
As funding levels for health education programming face increasing scrutiny in government and university settings, funding recipients’ accountability has taken on greater importance. Within the state of Michigan, as the government considers further funding cuts for organizations such as University Cooperative Extension Services, educators need to accurately evaluate the results of educational programming in order to convey the impact of that programming (Thompson, 1985; Decker and Yerka, 1990; Gentry-Van Laanen, 1995; Schalock and Bonham, 2003; Laughlin, 2004; Raidl, 2004). Food safety educators, particularly within Michigan State University Extension, predominantly use the pretest/posttest self-report survey tool to attempt to understand the impacts of their programming efforts.
In the Michigan Department of Agriculture Consumer Food Safety Mini-grant Program, a total of 67 funding proposals have been reviewed since 2002. Forty-eight of those proposals have been from, or involve, Michigan State University Extension educators. Of the 48, just 38 have suggested using evaluation, and 28 of those proposals have proposed using the pretest/posttest method of evaluation. In 2003, three of the funded consumer food safety education programs conducted evaluations using a comparison between the pretest/posttest and retrospective-pretest/posttest survey methods. In all three cases, the retrospective pretest/posttest measure, also known as the post/then method (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979), provided less conservative, and in some educational dimensions, much more dramatic results than did the pretest/posttest tool. Based on these informal comparisons, and other comparisons within contexts such as leadership training (Mezoff, 1981), financial education (University of Tennessee, 2004), human relations training (Mezoff, 1979, as cited in Mezoff, 1981), assertiveness (Howard, Schmeck and Bray, 1979), nutrition education (Rockwell and Kohn, 1989) and food safety (Raidl, 2004), it has been shown that the intervention treatment creates a perceptual shift in the mind of the workshop participant in terms of his or her estimate of knowledge, awareness, skills, or attitude. This shift is a “response shift bias” (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979) and is evidenced in participants’ over- or underestimation of the given dimension in their responses to the pretest measure (Howard, Schmeck and Bray, 1979).
This can ultimately lead researchers to an inaccurate evaluation of the impacts of an intervention exercise (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979; Howard, Schmeck and Bray, 1979; Howard, 1980; Mezoff, 1981; Pohl, 1982; Preziosi and Legg, 1983; Sprangers and Hoogsraten, 1988; Rockwell and Kohn, 1989; Rohs and Langone, 1996; Bedingham, 1998; Rohs, Langone and Coleman, 2001; Rohs, 2002; Utah State University, 2004). This study proposes that a comparison of results from the pretest/posttest and the post/then self-report survey tool within the context of food safety education for seniors will produce the same results. The post/then survey will provide less conservative results and, consequently, a higher degree of accuracy than the pretest/posttest survey tool, thus ensuring a more favorable evaluation of the program effects. Additionally, the study will administer a three-tiered pretest-post/then design. Evidence of participants’ over- or underestimation of knowledge is expected. Differences between the pretest and retrospective ratings would indicate a change in the knowledge dimension, reflective of others’ work using this retrospective testing approach, and provide a more accurate assessment of seniors’ pre-intervention food safety knowledge (Sprangers and Hoogsraten, 1989; Sadri and Snyder, 1995).

BACKGROUND
In a practice that’s becoming more commonplace (Arnold, 2002), community and health educators (Chapman-Novakofski, 1997) and Extension educators and their community collaborators (Duncan, 1997) are linking the justification of programming efforts to the analysis of evaluation results (Bennett, 1979).
The “Impact Indicators Project (IIP)” (Chapman, Clark, Boeckner, McClelland, Britten, and Keim, 1995; McClelland, Keim, Britten, Boeckner, Chapman, Clark and Mustian, 1995) reports that Extension field staff and Extension nutrition specialists believe that evaluation is valuable in showing the impact of programming efforts, and is also needed to secure new or continued program funding (Chapman-Novakofski, Boeckner, Canton, Clark, Keim, Britten, McClelland, 1997). This belief is echoed in other summaries expressing the value of evaluation in proving accountability (Thompson, 1985; Decker and Yerka, 1990; Gentry-Van Laanen, 1995; Schalock and Bonham, 2003; Laughlin, 2004; Raidl, 2004) and determining whether or not funding is being spent wisely (Ostroff, 1991; Mann, 1996; Diem, 2003). Within evaluation theory, researchers, including Michael Scriven (1980), criticize the use of evaluation as a tool for providing information to decision-makers. Evaluation, in Scriven’s opinion, is best suited for determining merit or worth. Similarly, evaluation yields insight into the value of a program, write Shadish, Cook and Leviton (1991). While Extension field staff or education specialists might contribute to the social-science aspects of this argument, few, despite having a need for evaluation, use evaluation tools to judge the worth of an endeavor or communicate intervention impacts on a routine basis (Phillips, 1991; Goldstein, 1994; Mann, 1996; Chapman-Novakofski, et al., 1997). One reason is the importance placed on the educational program itself and educators’ reluctance to spend equal time and energy on the evaluation. As one Extension educator commented, “We favor program takeoffs, not their endings” (Rohs, et al., 2001), implying that the cost of program evaluation outweighs the organizational incentives in designing and implementing the programming (Campbell, 1971).
Further reluctance is based on evaluators’ fear that the results may point out that a program is not meeting the desired expectations (Mann, 1996). Campbell (as cited in Shadish, et al., 1991) reports that the whitewashing of social programs through uncritical evaluation and the promotion of overly optimistic reports are key detriments to the furthering of social programming. Client needs, he argues, rather than budget justification or political behavior, need to drive social programming. The IIP study pointed out other barriers that include the perceptions that: a) clients are resistant to participating in evaluations; b) demographic details are hard to obtain and the request for that information can be offensive to some; and c) literacy and time constraints make written surveys difficult to complete (Chapman-Novakofski, et al., 1997). Despite the barriers, program educators recognized their responsibility to communicate results to funding providers. Considering these barriers, coupled with the satisfying of financial needs, accountability needs, and the need to determine program value, the implementation of an efficient and easy-to-use tool that accurately records intervention effects is essential. Congruent with this is the use of an evaluation that conveys the difference in particular dimensions before training and after training to indicate a change in behavior, knowledge, skills, abilities, attitudes, or a combination of these (Sadri and Snyder, 1995; Poling, 1999). The self-report survey tool is one way to gauge these changes, and, if documented successfully, can “demonstrate the usefulness of the training enterprise” (Mann, 1996).

The context of food safety, education and evaluation
The current estimate of incidences of food-borne illness in the United States is 76 million per year (Mead, Slutsker, and Dietz, 1999). This is equivalent to one in every four people suffering from a food-related illness each year.
While most illness is realized as gastro-intestinal ailments such as diarrhea or vomiting, an estimated 325,000 U.S. citizens are hospitalized and 5,000 die each year due to food-borne illness (Mead, Slutsker, and Dietz, 1999). The effort to decrease the estimated number of incidences of food-related illness is the mission of organizations throughout the world, including the United States Department of Agriculture (USDA), the U.S. Food and Drug Administration (U.S.F.D.A.), and the National Food Safety and Toxicology Center (N.F.S.T.C.) at Michigan State University. Despite local and national educational efforts and advances in technology and research, however, food-borne illness still affects millions each year. Additionally, the economic cost of illness and combating illness in our nation is estimated at $20 to $40 billion per year (E.P.A., U.S.D.A., D.H.H.S., 1997). As it is believed that most food-borne illness incidences are preventable through the education of food producers, processors, retailers and consumers, increasing the knowledge of and actualization of proper food handling practices can presumably lower these numbers (E.P.A., U.S.D.A., D.H.H.S., 1997). Though decades have been spent providing food safety information to consumers, more could be done to develop effective evaluation methods. Rennie (1994) points out that current methods of evaluation do little to make a convincing case for the continuance of food safety education in its current form. In fact, reports taken from formal courses indicate that food handling practices have not improved, drawing into question how those practices are evaluated (Rennie, 1994). A measurement of food safety variables was examined in a review of food service manager training in the U.S. from 1971 to 1984 (Julian, 1984). Of the variables available for evaluation, the knowledge of various food safety principles provided the most reliable information about managers’ skills and abilities.
And yet, when addressing knowledge as a variable, the evaluator must recognize the limitation of knowledge retention (Mann, 1996) and the actualization of that knowledge (Rennie, 1994). Consumers’ practice of food safety principles was examined in a 2000 study by Anderson that placed video cameras in the kitchens of study participants. The food safety practices that these consumers claimed to actualize and understand in a pretest survey were shown to be quite different when participants prepared a meal and cleaned their kitchens. Through observational methods, this study showed that self-reported behavior does not always translate into the actualization of that behavior, or “that consumers don’t always do what they say they do” (Anderson, 2000). One outcome of the study is that consumers may not have been equipped with an appropriate knowledge level of the very principles that were being investigated. The Anderson study was funded at approximately $50,000 and conveys that, while observational measurements can be useful, this method can also be difficult and expensive to arrange. Therefore, developing the best possible and least costly self-report measures, a statistical procedure for analyzing self-report data, and an effective research design is very important (Howard, Schmeck and Bray, 1979). Perhaps reflective of the challenges in conducting observational studies, it was estimated in 1981 (Mezoff, 1981) that 80 percent of evaluation is performed using self-reporting tools. While a current estimate could not be found, an examination of food safety curricula shows that the self-report tool is still widely used.
In response to both the need for effective self-reporting mechanisms versus observational methods, and the assessment of knowledge versus behavior, Howard (1980) found that if detrimental circumstances akin to self-report tools such as the pretest/posttest can be controlled for, self-report tools addressing knowledge can actually be more accurate than behavior measurement, whether observational or otherwise. In 2002, Raidl conducted a study within the University of Idaho Cooperative Extension Nutrition Program (ENP) that reflected this notion and supported the notion that knowledge can lead to positive changes in behavior. The study involved a total of 112 ENP participants in a nutrition and food safety intervention and found that knowledge of suitable practices allowed consumers to make favorable changes in several dimensions. Whether knowledge measurement or behavioral measurement is conducted, the bottom line is results, not just whether or not people are learning, according to AT&T’s Marc Rosenberg (McCune, 1994). The way to assess these results is to evaluate where one is now, where one wants to go and the best way of getting there (Bedingham, 1998).

LITERATURE REVIEW

Self-report surveying
The self-report survey tool can be effective due to low design and execution costs, the ease of information collection from a group of people at one time, and the inducement of a feeling of inclusion on the part of participants. Participants’ anonymity is easier to guarantee through surveying than through other means, and, as the surveyor is present, the clarification of survey questionnaire items is possible (Mezoff, 1981; Poling, 1999). Despite these benefits, it is believed that the majority of experimental interventions that seek to measure change through the use of a self-report tool, including a common method such as the pretest/posttest survey, have apparent problems (Sadri, 1995).
The most notable of these detriments are threats to internal validity (Howard, Schmeck and Bray, 1979).

Measuring change
Self-report surveying points to reliance upon the comparison of gain scores. While perhaps integral to the perceived necessity for the pretest/posttest method, this is a process open to other difficulties not present in the comparison of single measurements (Cronbach and Furby, 1970, as cited in Howard, Schmeck and Bray, 1979).

Response shift bias
One difficulty in the comparison of gain scores is the influence of historical and instrument threats to internal validity (Howard, Schmeck and Bray, 1979). The implementation of true experimental designs is seen as a countermeasure to internal validity threats (Campbell and Stanley, 1963). When considered singly, the threats of history and instrumentation can be controlled within true experimental designs. However, due to an interaction between the treatment effects and the measurement tool, when a participant moves from the pre-intervention survey stage to the post-intervention survey stage, the potential for a shifting in his or her perspective for responding is triggered (Rohs, 1999). In understanding respondents’ perception shift due to history and instrumentation effects within a pretest/posttest intervention session, evaluators must be sensitive to the levels of knowledge that participants bring to the table. Research studies in the U.S. and U.K. have shown that approximately 50 percent of managers and salespersons are unfamiliar with their own strengths or weaknesses (Bedingham, 1998). As participants may be unfamiliar with these dimensions or others, including knowledge of a subject, they have no frame of reference or common metric from which to respond to the instrument in the pretest stage (Cronbach, et al., 1970, as cited in Howard, Schmeck and Bray, 1979; Rohs, et al., 2001; Raidl, 2002).
This can prompt an individual to misrepresent his knowledge level on the pretest survey (University of Tennessee, 2003). This is described by Preziosi and Legg (1983) as a “glowing (over-) estimate” or an underestimate of his knowledge of a subject. As becomes evident in the post-intervention stage, participants have undergone a historical shift as the reference frame by which they judge their strengths, weaknesses, skills, attitudes, or knowledge has changed (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979; Howard, Schmeck and Bray, 1979; Mezoff, 1981; Preziosi and Legg, 1983; Sadri and Snyder, 1995). This change in an individual’s basis for judging her perceptions for a given dimension or her understanding of the standard for measurement is the response shift (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979) and could counterfeit the true effects of an intervention assessment (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979; Howard, Schmeck and Bray, 1979; Howard, 1980; Mezoff, 1981; Pohl, 1982; Preziosi and Legg, 1983; Sprangers and Hoogsraten, 1988; Rockwell and Kohn, 1989; Sadri and Snyder, 1995; Rohs and Langone, 1996; Bedingham, 1998; Rohs, et al., 2001; Rohs, 2002; Utah State University, 2004). The end result is a “response shift bias” (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979). Because effects of the intervention treatment can be manifest in the pretest/posttest outcomes, this could be viewed as a threat to internal validity according to Campbell and Stanley (1963). Bray and Howard (1980) argue, however, that threats to internal validity and construct validity are challenging to discriminate. They propose that response-shift bias could also be construed as a threat to construct validity, according to Cook and Campbell (1976). Either way, Bray and Howard (1980) contend that this bias can threaten the proper evaluation of treatment effects.
For evaluators to compare pretest scores to posttest scores, a common metric must exist between the two sets of scores. The inaccuracy occurs when an evaluator wrongly assumes a uniform reference frame for each participant when they move from the pretest to posttest (Campbell, et al., 1963; Caporaso, 1973; Neale and Liebert, 1973; Linn and Slinde, 1977; Mezoff, 1981). In this way, the response shift can create misleading conclusions and an inaccurate assessment of the training benefits (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979; Howard, Schmeck and Bray, 1979; Howard, 1980; Mezoff, 1981; Pohl, 1982; Preziosi and Legg, 1983; Sprangers and Hoogsraten, 1988; Rockwell and Kohn, 1989; Rohs and Langone, 1996; Bedingham, 1998; Rohs, et al., 2001; Rohs, 2002; Utah State University, 2004). Of pertinence to this study, this reference frame shift has been shown in educational settings that involve acquiring knowledge or the learning of basic teaching skills (Rohs, et al., 2001). Additionally, Howard, Schmeck and Bray (1979) contend that the potential for the threat of historical and instrumental effects peaks when the treatment purpose is a change in the understanding of the variable being measured. When the response-shift phenomenon is at work, researchers who may prefer posttest comparisons between treatment and control groups must alter their approach. As the self-report instrument asks the respondents themselves to serve as raters who then undergo historical change, the experimental intervention can alter subjects’ understanding of their level of functioning or interpretation of self-report items (Howard, Schmeck and Bray, 1979). Because the treatment groups would develop different histories than control groups (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979), the comparison of the treatment and control groups is confounded as ratings within each group evolve with different scales in mind.
A study conducted by Rockwell and Kohn (1989) illustrates response shift bias. In examining nutritional behavior, consumers were asked, “Do you include one food rich in Vitamin C in your diet daily?” To answer accurately, the respondent must know which foods are rich in vitamin C. Because a participant may not have this knowledge, he may overestimate vitamin C intake on the pretest. Suppose the participant has increased vitamin C intake as a result of the program. On the posttest, if the participant reports a similar or the same level of vitamin C intake as on the pretest, the posttest level is accurate, but because the pretest was an overestimate, it will appear that little or no change in behavior has occurred. Such a result makes it appear that the program had no effect on behavior when, in fact, the program increased vitamin C intake. For managers assessing program effectiveness, the pretest/posttest results can be disappointing. Mezoff (1981) reports that comparisons in literature on this subject point out minimal differences between the two scores, and the consequent results can fail to reveal the true value of a training endeavor (Rohs and Langone, 1996). Additionally, the training may be judged as ineffective, irrelevant or unnecessary (Mezoff, 1981). While response-shift bias is cited as one reason for the development of a more accurate and useful self-report tool, there are other potential problems.

Potential problems in using the self-report tool

Without proper procedural steps, the survey test may not be valid
Although the administration of a valid and well-designed pretest/posttest instrument can provide data about participants’ knowledge change, the steps for this procedure can be beyond the scope of many program administrators (University of Tennessee, 2003). These steps can include a careful review of the program materials in order to develop a valid survey tool that reflects intervention content.
A review of the instrument by subject-matter experts is recommended to confirm that the test items cover the content and measure knowledge appropriately. A field test of the instrument is also recommended, performed through the administration of the tool to a group of individuals similar to those who will participate in the program (University of Tennessee, 2003).

A participant is more willing to “admit” to actual behavior or practices in the post stage
By the time a participant has reached the end of the session, she may be more willing to “admit” actual behavior or practices (Wodarski, 1987). This can have detrimental effects for food safety educators who use the pretest/posttest methodology. In such an instance, a participant could have overestimated her true behavior in the pretest stage, thus presenting a scenario of little or no change when scores are compared (Raidl, 2002). An example is a self-report of the frequency with which she washes her hands. The same participant may indicate a more honest and lower number in the posttest stage. A researcher using these results to evaluate the benefits of his programming could presume that this decrease was due to the training meant to increase positive behavior. Additionally, there is evidence that a score for a particular dimension could actually increase following an intervention intended to decrease that dimension (Rohs, et al., 2001).

Participants may not provide complete data
As evaluators require complete data for program assessment, participants’ inability to fill out both the pretest and posttest could render the responses unusable (Utah State University, 2004). A recent study by Raidl (2004) found incompletion rates of 16 percent and 15.6 percent, respectively, for pretest and posttest surveys that dealt with nutrition, food safety and resource management behaviors.
In using a post/then measure, one that poses the pretest and posttest questions following the intervention, there were no incomplete surveys in an overall population of 566 participants. In accounting for this difference, Howard, Schmeck and Bray (1979) believe that participants’ lack of clarity on what the pretest is asking them may cause incompletion. Furthermore, participants may view the post-survey as a waste of time that could be better spent on the actual class (Marshak, deSilvva and Silberstein, 1998).

The notion of “taking a test” can deter some from participation
Some adult audiences may be turned off by the notion of taking a “test” and choose not to participate (University of Tennessee, 2003).

It takes too much time to administer
In giving the pretest/posttest, extra time must be given for administering the test twice, once at the beginning and once at the conclusion (University of Tennessee, 2003).

The survey administrator is unclear of the situation analysis
One task of a corporate administrator can be the use of performance reviews. Bedingham (1998) reports that this could be a leader within an organization who might also be unclear about an employee’s attributes and deficiencies and what that individual can do to improve. In such a situation, basing an employee’s review upon data from pretest/posttest responses in an attempt to measure attitude or other dimensions could confound the situation.

Attributes of pre-test tool
While the pretest tool carries some potentially negative characteristics, including the development of a response-shift bias, there are some positive attributes that this pre-intervention tool can provide.
These benefits include increasing a participant’s willingness to learn and setting the tone to make him more receptive to learning; gaining a participant’s psychological involvement in the interview process; encouraging him to invest in the initial cost and thus creating a greater sense of value for the training; the establishment of benchmark data; a participant’s potential familiarity with this type of measurement tool; enhancing discussion of the subject as participants can develop familiarity with the subject; and familiarizing him with key concepts (Belasco and Trice, 1969; Mezoff, 1983; Chapman-Novakofski, et al., 1997; Utah State University, 2004). Lewin (1947), in his description of the first three steps of change, notes that the pretest can “unfreeze” trainees, allowing them to relinquish their traditional perceptions, beliefs or behaviors. As part of the “unfreezing,” the pretest provides motivation by increasing anxiety and decreasing complacency, resulting in more positive results.

The retrospective-then measure provides a solution
There are many studies that compare the pretest/posttest survey tool with the post/then survey. For instance, in a management development program using 20 people in two groups, Preziosi and Legg (1983) found indicators of impact to be higher for the group using the post/then tool. Other learning environments have shown the benefits of this tool.
Within studies on leadership skill building (Mezoff, 1981), interviewing skills training (Howard, Dailey, and Gulanick, 1979; Howard, Schmeck and Bray, 1979), assertiveness training (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979), human relations training (Mezoff, 1979, as cited in Mezoff, 1981), learning in classroom settings (Howard, Schmeck and Bray, 1979), behavior change (Rhodes and Jason, 1987; Rohs, 1999), educational settings (Bray and Howard, 1980), changes in attitude regarding individuals with HIV/AIDS (Riley and Greene, 1993), changes in knowledge levels of nutritional behavior (Rockwell, et al., 1989), and improvement in teaching skills (Bray and Howard, 1980), the post/then method provided less conservative results than the pretest/posttest tool.

In the post-then scenario, the participant is asked two types of questions, and only at the conclusion of an intervention. The first question is a posttest question and asks the participant how she now perceives her behavior, skills or knowledge following the program. The second, also asked following the intervention, asks the subject to look back and retrospectively rate each dimension based on how she perceived herself prior to the session (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979). On occasion, the order in which these two questions are asked is reversed. Sprangers and Hoogsraten (1989) have shown that varying the order of the then-measurement and the posttest questions does not affect participants’ responses.

While the benefits of the post/then survey have been shown in other fields and seem to support the use of this tool, there is no significant representation of the use of the retrospective measure within consumer food safety education. Recently, Raidl (2004) investigated consumers’ food safety behaviors following the completion of six nutrition program lessons.
Retrospective survey results indicated an increase in the frequency of positive behaviors, and a decrease in negative behaviors.

Assessing the retrospective survey method within the consumer food safety context is particularly crucial for two reasons. First, as explained earlier, the accurate analysis of impact plays a key role in establishing justification for program funding. Second, there is a pressing need for program improvements that could aid in the decrease of current levels of food-borne illness in the United States.

In the pretest/posttest and post/then comparative studies listed above, researchers point out nine benefits inherent in the use of the post/then tool:

The post/then tool helps better document results

The use of the post/then tool is beneficial in facilitating a better documentation of results and can show a higher level of impact through retrospective assessment (Rohs, et al., 2001). The timing of the questions is the key to this benefit (Rockwell and Kohn, 1998). Because the questions are asked following the intervention, a subject’s ratings can be made from the same perspective, yardstick, or frame of reference and thus avoid response-shift bias, seemingly making the retrospective measure more accurate (Howard, Ralph, Gulanick, Maxwell, Nance, and Gerber, 1979; Howard, 1980; Mezoff, 1981; Preziosi and Legg, 1983; University of Tennessee, 2003; Pratt, McGuigan and Katzev, 2000; Mincemoyer and Perkins, 2001; Rohs, et al., 2001).

Research points out that when a comparison between the pretest/posttest and post/then brought differing results, the pre/post approach gave a more conservative estimate of the treatment effect. A likely conclusion, then, is that the pre/post tool gives an underestimation of the benefits of the intervention, thus leading to an inaccurate documentation of those benefits (Howard, 1979; Rohs, et al., 2001).
Additionally, Howard and Dailey (1979) and Hoogsraten (1982) show that scores from the post/then tool can provide evidence of change that is aligned with objective measures of change to a more significant degree than scores from the pretest/posttest method. This also verifies that a response shift occurred during the pretest/posttest use (Rohs, et al., 2001).

The post/then allows for more validity in determining a pre-intervention assessment

The post/then tool has been shown as a reliable means of gathering an accurate measure of participants’ real levels of functioning prior to the intervention. Sadri (1995) investigated a series of interventions that led to the conclusion that “reinterpretations of the response scale served to increase subjects’ ability to rate themselves accurately after the intervention.” Through qualitative research, respondents concluded that their pre-intervention ratings were inaccurate due to increased insight into their abilities following the session.

Other researchers (Howard, Schmeck and Bray, 1979; Bray, et al., 1980; Pratt, McGuigan and Katzev, 2000) also have found a “significantly greater” level of validity in using the retrospective method of pretest versus the more common self-report pretest done at the beginning of a session. The Howard study featured a 5-day workshop on interviewing techniques. The researchers discovered significant differences between the pre and then scores. Additionally, in 9 of 13 instances, more favorable results came from use of the post/then tool versus the pretest/posttest.
Memory bias and social acquiescence are non-factors in use of the retrospective measure

While response shift effects have been determined to be treatment dependent (Rohs, 2002), a report of lower scores in the “then” portion of the testing might be assumed to stem from participants exhibiting a “memory bias.” This bias is defined as a participant’s consciously overstating his posttest score or underrating his score on the “then” test, based on the memory of the pretest response. Additionally, it may be assumed that a respondent biased his answers to provide instructors with more favorable results. This desire is called social acquiescence. This benefaction on the part of a participant could actually result in inaccurate research findings (Howard, 1980).

In examining these potential threats, several studies have refuted these assumptions. Howard, Schmeck and Bray (1979), and Howard, et al. (1981), in considering memory bias, social acquiescence and pretest scores, found undergraduate students’ recollected memory ratings to be considerably lower than their actual pretest or retrospective-pretest ratings. Rohs and Langone (1997) found similar results in the assessment of memory regarding pretest scores. In reporting recollections of pretest scores in a leadership class, not one undergraduate student reported an accurate pretest score.

The post/then tool is suitable for new or complex subject matter

Several studies (Chapman-Novakofski, et al., 1997; Duncan and Goddard, 1997; Rockwell and Stevens, 1992; Stevens and Lodl, 1999) have found that the post-then method is especially adaptable with learners in exploring new or complex issues. While consumers may purport that food safety knowledge is not new to them and that they already understand it, as pointed out earlier (Anderson, 2000), this is not always reflected in how they put this knowledge into practice.
It could be inferred, then, that some food safety principles contain new or complex issues that are, for some consumers, not fully realized or understood.

Participants are more likely to admit negative behavior at the end of the intervention

As the post/then measurement is completed at the conclusion of an intervention, a participant may have a higher level of comfort and will be more likely to admit in the evaluation to something considered by educators or society as inappropriate in some way (Kiernan, 2002; Raidl, 2002).

The post/then is more efficient and affordable as the survey is given just once

The post/then survey is used just once, at the conclusion of a study, and therefore can record dimensions efficiently and in a short time frame (Rockwell, et al., 1989; Rohs, et al., 2001). While time is a resource, the financial expenses in recording and demonstrating impact also make the post/then method more economical than direct observational techniques or other approaches (Preziosi, et al., 1983).

The data from comparative pretest/posttest and post/then studies is easy to analyze

In comparative studies between pretest/posttest scores and post/then scores, a simple one-way analysis of variance and comparison of means have been deemed sufficient (Mezoff, 1981), and have been used along with paired sample t-tests (Duncan and Goddard, 1997; Mincemoyer and Perkins, 2001; Raidl, 2004). This would eliminate the resources that would go into anything other than simple data entry and number-crunching and could possibly endear the method to educators who were previously fearful of the assumed costs of this process.

More accurate results derived from the post/then survey tool allow for more confidence in assessing and making adjustments to the intervention activities

As Suchman (1967, as cited in Shadish, et al., 1991) contends, the “success” of an evaluation project is largely dependent on its usefulness in improving services.
Because the post/then method can provide more accurate results, educators can more confidently use this tool to draw conclusions about their intervention activities and make any necessary adjustments. In their 1989 nutrition study, Rockwell and Kohn suggested using post/then survey results to examine changes that participants made and link those with program content and teaching techniques. Conversely, if participants did not change their behavior in a certain area, they suggest altering the teaching method or amount of emphasis placed on a topic.

Results from the post/then mechanism can help better address accountability needs

Lastly, in examining the advantages of the retrospective-pre/posttest for evaluators, the payoff is literally that: proving a higher level of impact can enhance one’s bargaining position when negotiating a training budget (Preziosi, et al., 1983). For those involved in the Extension organization, the post/then can more effectively address accountability needs, a vital concern when budget discussions are taking place (Rockwell, et al., 1989). As Preziosi and Legg (1983) stated, “HRD department resources used to implement this type of evaluation can certainly be justified by the benefits they produce.”

Challenges in using the post/then approach

While justification for using the post/then approach exists, there are some potential cautions and downsides that must be considered.

Instructions on how to respond must be clear

Use of the post/then tool can be severely hampered if respondents lack an understanding as to how to complete the measure. Directions must be clear and concise (Rockwell, et al., 1989; Rohs, et al., 2001).

The measurement scale needs to be appropriate for the subject and levels of change

It is urged that the instrument identify specific behaviors, knowledge areas or attitudes that may change and that the appropriate scale be designed to test the amount of self-reported change (Rockwell, et al., 1989).
Types of post/then surveys

In further scrutiny of the post/then survey method versus the traditional pretest/posttest model, the University of Tennessee Cooperative Extension Services (2003) engaged in a national financial knowledge initiative. The group encouraged its presenters of financial information to consumers to use one of two types of retrospective posttest tools or one of two post/then tools:

Post-Program Knowledge Change Example 1:

For each of the statements below, circle the one response that best represents your level of agreement with that statement.

Response Scale:
SD = STRONGLY DISAGREE
D = DISAGREE
N = NEITHER AGREE NOR DISAGREE
A = AGREE
SA = STRONGLY AGREE

As a result of this program:                                      Strongly            Strongly
(Circle One Response)                                             Disagree            Agree
1. My knowledge of (topic/issue) has increased.                   SD    D    N    A    SA
2. I am more aware now of (topic/issue) than I was before.        SD    D    N    A    SA
3. I have a better understanding of (topic/issue) than I did
   before.                                                        SD    D    N    A    SA

One caution in using a method that utilizes the abbreviation style of answer, as shown above, is that explaining this method of response may require an additional step in the instruction process.

Post-Program Knowledge Change Example 2:

As a result of this program, my knowledge about (topic/issue) has:

NOT INCREASED     INCREASED       INCREASED     INCREASED
AT ALL            VERY LITTLE     SOME          A LOT

Post-then-Pre Knowledge Change Example 1:

For each of the topics listed below: 1) In the LEFT column, circle the ONE number that best reflects your level of knowledge AFTER the program, then 2) In the RIGHT column, circle the number that you think best reflects your level of knowledge PRIOR TO participating in the program.
Knowledge Level                         Knowledge Level
AFTER Program                           PRIOR TO Program
LOW         HIGH      TOPICS            LOW         HIGH
1  2  3  4  5         Topic #1          1  2  3  4  5
1  2  3  4  5         Topic #2          1  2  3  4  5
1  2  3  4  5         Topic #3          1  2  3  4  5
1  2  3  4  5         Topic #_          1  2  3  4  5

Post-then-Pre Knowledge Change Example 2:

Directions: Read each of the following topics and, in the left half of the table, rank your level of understanding at the present time AFTER participating in this program. NEXT, think back to your level of understanding about each topic BEFORE you participated in this program and rank your “BEFORE” level in the right half of the table. Circle the appropriate numbers using the following key:

1 = NO UNDERSTANDING
2 = LITTLE UNDERSTANDING
3 = MODERATE UNDERSTANDING
4 = QUITE A BIT OF UNDERSTANDING
5 = ALMOST COMPLETE UNDERSTANDING

MY UNDERSTANDING
How would you describe your
understanding of the following:      After Program        Before Program
1. Topic #1                          1  2  3  4  5        1  2  3  4  5
2. Topic #2                          1  2  3  4  5        1  2  3  4  5
3. Topic #3                          1  2  3  4  5        1  2  3  4  5
4. Topic #4                          1  2  3  4  5        1  2  3  4  5

Through the use of both post/then methods, the group found that the retrospective pretests offered a “significantly greater” validity than measures of change using the more common self-report pretest.

A COMPARISON OF METHODS WITHIN THE CONTEXT OF CONSUMER FOOD SAFETY EDUCATION FOR SENIORS

The present study attempts to demonstrate, using two different groups, the response-shift bias effect and how its interpretation can have useful implications for food safety educators. While the traditional pretest/posttest and retrospective-then/posttest tools were implemented, a combination of the pretest/retrospective-then and posttest design was also administered to one group. Justification for this approach is based on several reasons. Greater evidence of a response shift can be seen if the pre-then-post model is used.
Though differences between the pretest/then scores would indicate the existence of change, the additional retrospective-then versus posttest scores allow for measurement free from response shift bias as the tests are completed in close proximity (Sadri, 1995).

Additionally, by adding a pretest to the post/then approach, a participant’s knowledge, ability or skill level can be better understood. Mann (1997) supports the use of this approach as it provides a “more realistic picture” of improvement due to training. For example, if the comparison of results yields small differences, this can indicate a high degree of skill or sophistication or appropriateness in the dimension(s) being taught. This could suggest that the individual engage in a different type or level of training program. Large differences between the comparison of pretest to posttest and retrospective-then to posttest could indicate that the respondent has no or very little awareness of the subject and that a more comprehensive training or educational program is required (Mezoff, 1981).

By using all three steps, insights can also be gained regarding the accuracy of information provided to respondents prior to the event. For instance, managers examining the ability of employees should make sure that the employees are aware of the entire range of skills, so that they can judge their abilities within a knowledgeable frame of reference (Mann, 1997).

A further reason for using this three-step approach is inherent in its outcomes. This approach makes it possible to better determine how accurately the retrospective measure captures participants’ pre-intervention knowledge. While the claim that this measure is more accurate is well-documented throughout the retrospective-test literature, very few studies take the three-tiered approach to verify greater accuracy in representing consumers’ true pretest knowledge and knowledge gains (Sprangers and Hoogsraten, 1989; Sadri and Snyder, 1995).
Lastly, while afforded the benefit of the doubt, a respondent may wish to “psych out” the questionnaire or provide socially desirable answers. The use of some identification such as a name or code number to match post/then results with the pretest findings could eliminate this challenge (Mezoff, 1981).

Background

For this study, the measure taken was knowledge of proper food safety practices in accordance with Bandura’s self-efficacy theory (1977, 1982). An explanation of the curriculum was intended to empower seniors with the “belief that (they) can perform successfully the behavior required to produce designated types of performance.” Additionally, seniors in the study had to receive assurance that this perceived self-efficacy could be matched by a certain behavior that would lead to certain outcomes. In other words, using the same cutting board to cut up chicken and salad could be avoided through specific behavior that would result in the outcome of not getting sick.

The combination of self-efficacy and outcome expectations, Bandura’s Social Cognitive Theory (Raab, 1997, as cited in Medeiros, Hillers, Kendall, Mason, 2001), has been shown by AbuSabha and Achterberg (1997, as cited in Medeiros, Hillers, Kendall, Mason, 2001) to lead to desired outcomes and to be a powerful predictor of human behavior. This approach followed Howard’s 1980 study indicating that proper control of response-shift bias through the use of the retrospective measure could lead to a higher level of accuracy versus behavior measurement.

While this study aims to measure seniors’ knowledge of food safety concepts within a framework of pretest and retrospective-pretest survey effects upon the posttest mechanism, various independent variables may be at play.
For instance, knowledge of food safety principles may be affected by culinary educational experience, gender, marital status, dependence upon others for food or food preparation, family size, and whether the person is an urban or rural dweller. Furthermore, some of these factors may interact with each other, such as dependence and age, so that not everyone is affected in the same way (Cicciarella, 1995). The U.S.D.A. Food Safety and Inspection Service (2002) describes the variance in seniors’ “food lifestyles” by stating, “Some seniors are homebound and must rely on delivered food. Others are new widowers with little cooking experience. Whether seniors are part of these groups or are experienced cooks, adhering to. . .up-to-date food safety guidelines is just plain good wisdom.”

Subjects

With the voluntary participation of seniors visiting senior meal sites in Northeastern Michigan, interventions were conducted on March 17, 2004, in Deckerville, Mich., and on March 30, 2004, in Cass City, Mich. The first group consisted of 41 participants, while the second group consisted of 13 seniors. The two groups were very similar in gender proportions, age distribution and age range. Of those who provided this information, 34 were female and two were male. The average age was 74, with 35 percent of seniors falling between 66 and 75 years of age. Additionally, those seniors who responded to demographic queries reported an average income of less than $12,630.80 and an average educational level of high school completion, and all but one, a farmer, were retired (Appendix A).

Seniors, as well as pregnant women, children, and immune-compromised individuals, are a crucial audience for food safety education. These groups are more susceptible to acquiring a food-borne illness and have more difficulty recovering from illness, according to the United States Food and Drug Administration (U.S.F.D.A.).
The senior segment can also be challenging in that most have conceptualized ideas about food safety knowledge from having spent a “lifetime” in the kitchen preparing and handling food on their own terms.

Procedure

In condition one, the first group of 41 seniors completed the retrospective-then and posttest survey following the 30-minute program. In conditions two and three, a group of 13 seniors completed the pretest and then split into two sections. While 10 completed the posttest only (condition two) immediately following the program, only three seniors in the first section successfully completed the retrospective-then and posttest survey (condition three) following the session. Within both groups, there were seniors who chose not to complete a survey.

A control or comparison group was not used because, due to response shifts, such groups are unable to accurately provide the comparison sought and may be impractical to use (Rohs, 2002). According to Rohs, “The score on a given scale may have a different meaning for the participant group than the control group.” This document previously cited response shift studies that explain this decision.

Treatment

The curriculum used during the intervention was developed by the United States Food and Drug Administration (U.S.F.D.A.) and is entitled, “Food Safety for Seniors.” Each senior received a copy of the U.S.F.D.A. “Food Safety for Seniors” guide, along with a food thermometer and a cutting board. The curriculum was taught by a nutrition specialist from the Human Development Commission of Caro, Mich., a 2004 Michigan Department of Agriculture Consumer Food Safety Mini-grant recipient. The intervention consisted of a PowerPoint presentation of the four basic steps to food safety: clean, cook, chill and avoid cross-contamination.
Following the session and evaluation, seniors were invited to participate in a hand washing experiment using a UV blacklight and luminescent handgel with the intention of increasing each senior’s understanding of how to properly wash one’s hands.

Instrument

The dependent measure consisted of one self-report measure. The measure consisted of eight items tapping participants’ perceptions of their food safety knowledge. A heading asked for an estimation of “knowledge of the following:” and self-report items were phrased in statements such as, “How to thoroughly wash your hands,” and “How to use a food thermometer” (Appendix B).

A paper-pencil self-report measure was used to gather information on knowledge levels. Based on the University of Tennessee Cooperative Extension model (2003), nutrition studies (Rockwell, et al., 1989; Raidl, 2004) and three unpublished trial studies conducted by the N.F.S.T.C. in 2003, the measure sought reactions to eight food safety competencies on a 5-point scale (1=none, 5=complete). Alpha reliability computations for the eight items, when presented in the pretest, retrospective-then test, and the posttest, averaged .98, .96 and .87, respectively. The factor structure of the scale was deemed unidimensional as exploratory factor analyses for the pretest, retrospective-then test and posttest scales yielded primary eigenvalues of 5 and above and subsequent eigenvalues at substantially less than 1. Due to the alpha reliability and factor analysis scores, the results must be analyzed as a whole. Using examples from prior studies (Mezoff, 1981; Rockwell and Kohn, 1989), a comparison of means was conducted.

RESULTS

Significant results were found in the comparison between the retrospective-then/posttest group and the pretest/posttest group (Table 1). Unfortunately, due to a low number of subjects and low participation rate in condition three (only three complete surveys were collected), that data was rendered useless.
It could be assumed that, as this measure was administered in a group setting, the influence of other group members (Poling, 1999) created a climate where those who were completing the then/post measure experienced anxiety as the posttest respondents had completed their survey. Additionally, the site at which the intervention was conducted serves meals immediately following the activity, possibly prompting some to drop their pencils early and pick up their lunch plates.

Table 1: Comparison of mean scores for conditions one and two.

                                       Condition 1: Then/post (n=41)     Condition 2: Pre/post (n=10)
Variable                               Then    Post    % increase        Pre     Post    % increase
How to thoroughly wash hands           3.93    4.68    16.1              4.9     4.9     0
Why washing hands is important         3.88    4.4     11.9              5       4.7     -3
Knowledge of cross-contamination       3.46    4.51    23.3              3.9     4.2     7.2
How to use a food thermometer          3.02    4.17    27.6              2.8     4.7     41.5
Why a food thermometer is important    3.17    4.34    27                2.5     4.4     43.2
How to properly chill food             3.7     4.7     21.3              4.1     4.3     4.7
How to properly store food             3.7     4.54    18.6              2.9     4.5     35.6
How to handle take-home food           3.29    4.41    25.4              4       4.8     16.7
Total:                                                 21                                17

Significant differences were found between the scores from condition one and condition two. The pretest/posttest group of seniors who completed the self-report measures at the beginning and at the conclusion of the training only reported meaningful differences in scores on 4 of the 8 food-handling dimensions. However, the retrospective-then/posttest group, condition one, reported meaningful levels of change in 8 of the 8 food-handling dimensions. A closer inspection of pretest and retrospective-then mean scores reveals that condition one participants (retrospective-then/posttest group) reported significantly lower mean pretest scores on 5 of the 8 questions than did the pretest/posttest group.

In addition to comparing the means for each dimension, paired-samples t-tests for conditions one and two were conducted (Table 2).
Paired-sample t-tests examine whether the mean difference between two sets of ratings given by the same respondents, such as then-test and posttest ratings, differs from zero.

Table 2: Paired sample t-tests

               Pretest    Then-test    Posttest    t       Sig. (2-tailed)
Condition 1    X          3.52         4.47        -8.4    0.0
Condition 2    3.83       X            4.56        -2.2    .056

In condition one, the significance is less than .05, thus rejecting the null hypothesis that the population mean difference in scores is 0. As the significance in condition two is just above .05, the null hypothesis cannot be rejected. In comparison, it can thus be concluded that the ratings of participants in condition one, both made after the treatment, were not subject to a response-shift bias. The treatment did create a reference frame shift for respondents in condition two.

In comparison, the average knowledge gain in the post/then condition (21 percent) versus the pretest/posttest condition (17 percent) is very similar to other comparisons in food safety knowledge studies conducted by the N.F.S.T.C. and Raidl (2004). Unfortunately, the low sample sizes for conditions two and three are significant limitations.

This data suggests that a response shift may have taken place among the participants who were asked to answer each assessment item twice, once before the training and then again after the training (pretest/posttest group). In doing so, these participants were evaluating themselves with a shift in their standard of measurement or level of understanding between their pretest responses (recorded before the program) and posttest responses (how they felt now). Thus, the comparisons of their pretest ratings of food safety knowledge to their post ratings reflect a less accurate assessment of their knowledge gains than did those in the participant group who rated their knowledge level twice at the conclusion (post/then group).

The recommendation from this study to purveyors of food safety knowledge to consumers is that the retrospective-then survey approach be considered more strongly.
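As an illustration of the two calculations behind Tables 1 and 2, the following is a minimal sketch and not the procedure actually used in this study. It rests on two assumptions. First, the "% increase" figures in Table 1 appear consistent with the change being expressed as a share of the posttest mean; this is inferred from the published numbers and is not a formula stated in the text. Second, the five pairs of ratings fed to the t-test are hypothetical values invented for the example, because respondent-level data are not reported here. The function names are likewise illustrative.

```python
import math
import statistics


def pct_change_vs_post(before, after):
    """Change expressed as a percentage of the *after* score.

    Assumption: this convention is inferred from Table 1, not stated by the author.
    """
    return (after - before) / after * 100


def paired_t(first, second):
    """Paired-sample t statistic for (first - second) and its degrees of freedom."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if std_err == 0:
        raise ValueError("all differences are identical; t is undefined")
    return statistics.mean(diffs) / std_err, len(diffs) - 1


# Item means copied from Table 1, condition one (then-test and posttest).
then_means = [3.93, 3.88, 3.46, 3.02, 3.17, 3.70, 3.70, 3.29]
post_means = [4.68, 4.40, 4.51, 4.17, 4.34, 4.70, 4.54, 4.41]
overall = pct_change_vs_post(statistics.mean(then_means),
                             statistics.mean(post_means))

# HYPOTHETICAL ratings for five respondents; these are not study data.
then_ratings = [3, 3, 4, 2, 3]
post_ratings = [4, 5, 4, 4, 5]
t_stat, df = paired_t(then_ratings, post_ratings)

print(f"overall change, condition one: {overall:.1f} percent")
print(f"paired t = {t_stat:.2f} with {df} degrees of freedom")
```

Subtracting the later rating from the earlier one yields a negative t when scores rise, which matches the sign convention in Table 2. The resulting |t| is compared with the two-tailed critical value for its degrees of freedom; with nine degrees of freedom (ten respondents) that value is 2.262 at the .05 level, which is consistent with condition two's t of -2.2 falling just short of significance. Note also that dividing by the posttest mean gives smaller percentages than the more common convention of dividing by the starting score.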
In addition, “data-snooping” into specific items revealed an interesting flux in scores related to seniors’ knowledge of food thermometer use. For both groups, the pretest estimates of dimensions four and five revealed lower mean scores than for other items. This resulted in knowledge gains ranging from 27 percent to 43.2 percent. An inference that can be made is that the food thermometer dimension is a subject that participants, in their estimation, truly do not know enough about or understand in a significant manner. As just ten completed surveys were available from the treatment group in condition two, further investigation of the thermometer knowledge dimension may be warranted.

CONCLUSIONS AND DISCUSSION

While results from condition three were unusable in this study due to the low sample size, future examination of the response-shift bias phenomenon should consider this three-step approach, as suggested by Sprangers and Hoogsraten (1989), and Sadri and Snyder (1995). This is particularly crucial within the field of food safety as effective evaluation tools are needed in the face of budget constraints and belt-tightening.

The comparison of results from conditions one and two met expectations for this study, based on comparative studies within other fields. However, the response-shift bias, while evident in this example, was not as powerful as in other food safety studies using this comparison. In unpublished studies from the N.F.S.T.C., for example, the difference in overall percentages from pretest/posttest to post/then scores was as high as twenty percent. A larger sample size in both conditions could provide more evidence of this shift.

Evaluation theorist Michael Scriven developed the modus operandi method for inferring cause when better methods are not practical (1976, as cited in Shadish, Cook and Leviton, 1991). In inferring causes for the findings of this particular study, income could be a possible cause, based on known evidence.
The smaller difference in this example could be due to seniors’ familiarity with many of the food safety concepts presented in the curriculum. This coincides with anecdotal findings suggesting that those subpopulations most concerned with food and food security (defined here as the stockpiling and making available of food for low-income individuals) are more conscious of food safety principles. As the most commonly cited income level for this audience was less than $9,999.00, the seniors in this study would be considered low-income individuals. Quite possibly, the development of food safety curriculum that specifically addresses subpopulations that differ by age and income may need to be considered.

Along with the inference of cause(s), the sustainability of knowledge gain must be questioned. In accordance with the hierarchy of program change tool for Extension educators (Bennett, 1979), an additional step in assessing participants’ benefits from an educational intervention of this type would be an assessment of durability of knowledge change. For instance, within the knowledge measure, a suitable assessment could result in the statement, “95 percent of seniors can recall the four steps of safe food handling one year after learning them.” While seniors in this study indicated an increase in their levels of knowledge, the creation of a means to contact participants at a later point in time would enhance the credibility of this study. Unfortunately, as affirmed in this study, participants using self-report measures may be reluctant to provide needed contact information.

Lastly, interesting implications can be drawn from the examination of dimensions four and five: the use and importance of food thermometers. Though the U.S.D.A.
Food Safety and Inspection Service reports that seniors are more likely to own a food thermometer than other subpopulations (2001), a Pennsylvania State University study (Gettings and Kiernan, 2001) points to seniors’ lack of understanding of how to use a food thermometer or of the importance of food thermometer use. Results from focus groups found that many seniors failed to use a meat thermometer, instead relying on a specific amount of cooking time and using utensils to cut food open and checking doneness by sight. "We heard comments such as, 'If you take chicken out and see blood, then you know you have to leave it in longer,' and, 'I wiggle the turkey leg, and if it's loose, I guess it's done,'" said Gettings. "Barriers to adopting the proper method included resistance to change, the perceived inconvenience of using a thermometer and a lack of resources -- they say they don't own and can't afford a thermometer," according to Gettings.

Recommendations from the Gettings and Kiernan study include educating consumers through the use of visual aids, free programs in health center settings and free food thermometers to promote participation. All of these recommendations were followed in this study.

The U.S.D.A.-F.S.I.S. 2001 findings report that the “overall use of thermometers is low.” While meat and poultry are increasingly recognized as high-risk food for food-borne illness, only three percent of consumers check their burgers with a food thermometer, and in observational studies, most do not know how to interpret thermometer readings. As a result, the study found that 82 percent undercooked chicken, a source of the harmful pathogen, salmonella (U.S.D.A.-F.S.I.S., 1999). The Anderson study in 2000 found similar results as just five percent of participants used a food thermometer to check for doneness.
To understand consumers’ perceptions regarding food thermometer use, the U.S.D.A.-F.S.I.S. conducted six focus groups in 1997 with consumers ranging in age from 21 to more than 70. In all six groups, the use of a food thermometer was not offered as a means of keeping safe from food-borne illness. Seniors seemed far more likely to use a thermometer than young adult and young parent groups, yet overall, participants seemed far less likely to use food thermometers than is suggested by past research.

While the historical information on seniors’ use of food thermometers may contribute to the explanation of the results in dimensions four and five, two other possibilities exist.

First, handwashing, the storage of food, or the washing of cutting boards, for example, may all be behaviors that seniors view as easily achievable because the process is simple to follow. One can assume that most seniors have access to warm water and soap, for example, and handwashing is fairly easy to comprehend. The use of a food thermometer differs in that it is the adoption of a form of technology and a behavior, rather than the simple adoption or modification of a behavior. With the listing of different cooking temperatures for different foods and the multitude of different types of thermometers available, there could be underlying complexities that affect seniors’ adoption of this practice. Could the fear of technology be one of these underlying barriers?

Second, in dimensions four and five, seniors may have felt more inclined to admit to lower levels of knowledge because of social desirability bias (Fisher, 1993). In other words, admitting that one has a lower level of knowledge regarding food thermometers may be more socially acceptable in respondents’ minds than admitting that one does not know very much about thorough handwashing.
No one wants to be thought of as dirty or unclean, and the belief that others may be watching and that it is more socially desirable to profess knowledge of handwashing or other dimensions could have actually biased respondents’ answers on dimensions other than food thermometer use and importance.

With these findings in mind, the thermometer component should be examined more closely, and consideration may need to be given to promoting thermometer use as a socially desirable behavior, or a lack of thermometer use as socially undesirable. Due to the unidimensional nature and high alpha scores for this particular survey, an accurate breakout of these specific dimensions was unattainable. In the future, educational interventions should be reassessed to adequately meet the needs of subpopulations. Future study could focus on the education and evaluation of this component in comparison with other fundamental food safety concepts.

Evaluation from a broader view

Lastly, this program must be judged not just empirically, but also from a broader view. When broader conceptual questions are asked, some determination can be made as to whether this program is “good” or has worth (Shadish, Cook, Leviton, 1991). While many of these questions may have multiple answers, a linking of factual claims with knowledge claims can point to an understanding of program value (Cronbach & Meehl, 1955).

Shadish, Cook and Leviton (1991) pose key questions in five program areas that can aid in the broader evaluation of a program (Table 3). While many of these questions are more the stuff of theory, they can perhaps point to judgments regarding this specific program. As this study has attempted to address the value of education within a specific social context and the fulfillment of evaluation in meeting accountability needs, the areas of social programming and valuing will be examined.
Table 3: Five areas that assist in evaluating a program

Social programming: What are the important problems this program could address? Can the program be improved? Is it worth doing so? If not, what is worth doing?

Knowledge use: How can I make sure my results get used quickly to help this program? Do I want to do so? If not, can my evaluation be useful in other ways?

Valuing: Is this a good program? By which notion of “good”? What justifies the conclusion?

Knowledge construction: How do I know of all this? What counts as a confident answer? What causes that confidence?

Evaluation practice: Given my limited skills, time, and resources, and given unlimited possibilities, how can I narrow my options to do a feasible evaluation? What is my role: educator, methodological expert, or judge of program worth? What questions should I ask, and what methods should I use?

Social programming

What are the important problems this program could address?

The intended purpose of the food safety education program for seniors was the provision of knowledge in order to cause seniors to employ safer food handling practices within their own kitchens. As seniors are a crucial audience when it comes to food-borne illness susceptibility (U.S.F.D.A.), any positive changes that this program can create are important.

Can the program be improved?

Scriven (1980, as cited in Shadish, Cook and Leviton, 1991) contends that all logic and products of evaluation be “unpacked,” and that the process, materials, treatment length, tests, and all possible variants be examined (1973, as cited in Shadish, Cook and Leviton, 1991). Within this “broadening of horizons,” and with this specific study in mind, improvements could certainly be made based upon the meeting of seniors’ needs. For starters, one might call into question the U.S.F.D.A. materials from which the instruction was provided.
Additionally, the personnel conducting the intervention could be changed, or their delivery improved based on the needs of the audience. The venues that hosted the sessions could, according to Scriven (1974, as cited in Shadish, Cook and Leviton, 1991), also be approached.

In improving conveyance of the program curriculum, one might consider application of Health Belief Models within a food safety context. Schafer, Schafer, Bultena, and Holberg (1993, as cited in Medeiros, Hillers, Kendall, Mason, 2001) found that a combination of perceiving potential food-related illness as a personal threat, conveying the benefits of following specific actions, and high self-efficacy leads to engagement in food safety behavior. The current study, while attempting to encourage the practice of specific actions and self-efficacy, may have spent too little time convincing seniors of the potential harm of food-borne illness.

Greater consideration could also be given to the use of personal efficacy as a means of empowering participants to make changes in their behavior. Part of developing this sense of empowerment is the development of confidence in one’s actions. Barrell (1995) suggests that self-confidence and a feeling of being in control can motivate participants to perform at higher levels. Personal efficacy is derived from an internal locus of control. The greater the internal locus of control, the greater the chance that participants will attribute their success to their own abilities and not to luck or chance, as do persons with an external locus of control. When participants make the connection between their thoughts and the actions those thoughts lead to, they can positively affect their own performance.

Researcher Kim Witte suggests other variables can also play a part in behavior modification, per most health behavior change models. These include fear, barriers, subjective norms, defense avoidance and reactance.
Along with the Health Belief Model, another approach, though less common, is Witte’s Extended Parallel Process Model (EPPM). The EPPM addresses fear as a behavioral modifier (Witte, 1992). The first part of this model emphasizes the magnitude of a threat and the probability that the threat will occur. The second component of the EPPM focuses on the efficacy of the recommended response with regard to response-efficacy and self-efficacy. Within efficacy, steps are presented with the intention of increasing self-efficacy as a means to avert or minimize the threat.

Fear appeals, when employed correctly, are useful in health behavior change (Witte & Allen, 2000, as cited in Borg, Cicotte, Finks, Mercado, 2000). Fear appeals attempt to motivate individuals to perform certain recommended behaviors by scaring people into action (Morman, 2000, as cited in Borg, et al., 2000).

While fear can play a powerful role in getting people to adopt recommended behaviors, caution must also be given to the level of fear the food safety educator imposes upon the audience. As food is a commodity that plays a vital role in the lives of each of us on a daily basis, the provocation of fear may require tempering to a certain degree, so that individuals’ consumption habits are not seriously affected and so that it does not create a neurosis leading to the desire to constantly “keep clean.” Anecdotal evidence exists of over-frequent handwashing, for example, on the part of children, washing to the point of skin damage to their hands for fear of “germs.”

Other factors that could enhance this study include the collection of a greater amount of sample data. This would allow for more critical evaluation of the effort, possibly preventing the reporting of overly optimistic results. As Campbell argues (as cited in Shadish, et al., 1991) in his description of social programming, uncritical evaluations do not aid in the promotion of useful social knowledge.
Lastly, thorough quantitative or qualitative goal-free examination is needed to target the variants that could possibly improve this program. One component of this could be the procurement of voluntary testimonials from those who have participated in the treatment, as suggested by Campbell (1969, as cited in Shadish, et al., 1991). While this information could be useful in presenting program results, it could also provide thoughts on the examination and improvement of message content and delivery. As Scriven argues (1974, as cited in Shadish, et al., 1991), if some doubt is placed on the effects of the treatment, then the “merit of the materials must be judged less positively.”

Is it worth doing so?

As has been shown from knowledge-based intervention efforts, there is concern over knowledge retention (Mann, 1996) and knowledge actualization (Rennie, 1994). Is the information being acted upon and to what degree? What elements of the information are being actualized versus those that are not, and why? The question that arises then is, “What are the appropriate criteria for declaring that an individual has achieved a desired behavior?” (Medeiros, Hillers, Kendall, Mason, 2001).

Within food safety, today’s actions can lead to tomorrow’s illness. Thus the goal is performance of the appropriate behaviors 100 percent of the time. As Medeiros, et al. (2001) suggest, program directors must decide where to set their own goal criteria for evaluation of a successful program. While zero tolerance is ideally the goal for food safety education, the reality is that food handling errors will most certainly continue. An assessment of current behavior versus desired behavior is one approach. Another is assessing movement over time toward achieving a desired behavior.

Based on the estimate of the number of incidences of illness that occur each year (Mead, et al., 1999), seniors do have health and economic needs for food safety information.
The worth of improving the program is apparent. Careful consideration, however, must be given to seniors’ level of need versus other forms of programming (contrary to the concept of simply holding a program to absolute standards, as cited in Shadish, et al., 1991). Using Scriven’s key evaluation checklist (1974, as cited in Shadish, et al., 1991), a needs assessment would indicate whether there is a “desperate need,” or just “a possible significant need.” Determinations like this could drive the decision to consider this a program of value and worthy of improvement.

If not, what is worth doing?

As Michigan State University embarks upon the 2004 food thermometer campaign partnership with the U.S.D.A., scrutiny regarding the focus of the campaign has arisen. While there is some evidence that poor personal hygiene and improper handwashing are responsible for a greater estimated number of incidences of food-borne illness, according to Bryan (1988, as cited in Medeiros, et al., 2001), consideration must be given to what is being taught within food safety and, going beyond comparative standards (that might just look at a program being good versus its alternative, as suggested by Scriven and Campbell, as cited in Shadish, et al., 1991), what practices are most likely to be accepted and adopted.

In her 2000 observational study, Janet Anderson found that a higher number of consumers were observed using a food thermometer in their kitchens than reported on the pretest self-report that they use a food thermometer. In most other areas, however, the actual practice of food safety behaviors, such as handwashing, was lower than reported. The inference could be that fewer barriers exist pertaining to the use of food thermometers.

As Medeiros, et al. (2001) found, many programs to date have put all of their eggs in one basket, attempting to teach food safety practices in a number of areas all at once. Medeiros, et al.
(2001) also find that none of the programs included evaluation tools organized to measure specific behavioral constructs. The problem in clumping subject matter, they report, is that programs cannot be re-focused to meet the needs of an audience for a specific construct based on that audience’s knowledge of the construct.

By basing a program on a specific behavioral construct, outcome measures could be linked to show the effectiveness of specific constructs in reducing food-borne illnesses. This approach is in line with Campbell’s description of validity: the validity of a claim that A caused a change in B, with causation meaning the change in B would not have occurred without A (1986, as cited in Shadish, et al., 1991).

The organizers of the Michigan State University and U.S.D.A. food thermometer campaign have recognized the need for focusing based on several observations (Conley, Andreasen, Durch, Stafford, McPeak, 2002):

1) There are already many handwashing education campaigns in existence.

2) Consumers’ adoption of proper handwashing behavior faces a considerable number of barriers such as “no time for washing,” “soap is not available,” or simply, inadequate practice of the behavior.

3) Prior U.S.D.A. research has shown that increased marketing focus on the use of food thermometers within specific segments of the population doubled the practice of the behavior between 1998 and 2001.

4) Certain segments of the population can be more easily moved from pre-contemplation to contemplation than others, providing a “low-hanging fruit” scenario.
5) More than 50 percent of consumers already own a food thermometer, whereas the share using thermometers “always” or “often” is closer to three percent.

Researchers involved in the project believe that because most consumers already own a thermometer and may be familiar with thermometer use associated with preparing a turkey at Thanksgiving, the task of moving toward an increase in the use of thermometers for meat or other forms of poultry may be a “hard-sell,” but is definitely not a “no-sell.”

Valuing

The value of a program and declaring it “good” or “bad” can depend on valuation, argues Rescher (1969, as cited in Shadish, Cook and Leviton, 1991). He points out that this valuation can be compared with other programs or metrics and/or assessed with numerical measurements.

In valuing a social program, such as food safety education for seniors, evidence, logic and rationality all play a role (Shadish, et al., 1991), but need can dictate a determination of value, argues Scriven (1980, as cited in Shadish, et al., 1991). Furthermore, a decision must be made as to the detrimental or harmful effects if a program is discontinued and, consequently, a need goes unfulfilled (Beauchamp, 1982, as cited in Shadish, et al., 1991).

As this specific program attempted to provide seniors with the means to counter the potential for food-borne illness, a significant health and economic burden, and provided positive numerical results, one might argue that a need exists. The program cannot be judged on its own merits, however. Comparatively speaking, multiple perspectives are needed, according to Scriven (as cited in Shadish, et al., 1991), as a single study is “incomplete.”

Additionally, in fulfilling the need of food safety education for seniors, the findings of the social program must be placed in the hands of those for whom the outcomes might have the most value. As Shadish, et al.
(1991) contend, identifying specific users of the evaluation could increase the probability of creating useful information. Evaluation researchers such as Carol Weiss and Joseph Wholey (as cited in Shadish, et al., 1991) believe that policymakers and decision makers should be recipients, as they are responsible for social policy. Clients, service providers and local administrators should also be included, argues researcher Robert Stake (as cited in Shadish, et al., 1991). For educators and evaluators, serving those in need, while meeting stakeholders’ values and goals, should be a priority.

APPENDICES

APPENDIX A
DEMOGRAPHIC INFORMATION FOR STUDY PARTICIPANTS

A1. CUMULATIVE STATISTICS FOR PARTICIPANTS IN THE MARCH 17 AND MARCH 30, 2004, WORKSHOPS

                                             AGE       GENDER   OCCUPATI   EDUCLVL   INCOME
N   # of responses submitted                 44        36       40         29        19
    # that did not respond to these items    10        18       14         25        35
Mean                                         74.7273                                 -12630.8421

A2. AGE - MEAN: 74.73

                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid     62.00          2         3.4         4.5               4.5
          64.00          1         1.7         2.3               6.8
          65.00          1         1.7         2.3               9.1
          66.00          3         5.1         6.8              15.9
          67.00          3         5.1         6.8              22.7
          69.00          1         1.7         2.3              25.0
          70.00          3         5.1         6.8              31.8
          71.00          2         3.4         4.5              36.4
          72.00          2         3.4         4.5              40.9
          73.00          3         5.1         6.8              47.7
          74.00          2         3.4         4.5              52.3
          75.00          2         3.4         4.5              56.8
          76.00          1         1.7         2.3              59.1
          77.00          1         1.7         2.3              61.4
          78.00          1         1.7         2.3              63.6
          79.00          2         3.4         4.5              68.2
          80.00          1         1.7         2.3              70.5
          81.00          3         5.1         6.8              77.3
          82.00          2         3.4         4.5              81.8
          83.00          2         3.4         4.5              86.4
          84.00          1         1.7         2.3              88.6
          85.00          2         3.4         4.5              93.2
          86.00          3         5.1         6.8             100.0
          Total         44        74.6       100.0
Missing   System        10        25.4
Total                   54       100.0

A3. GENDER

                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid     (blank)       36        39.0        39.0              39.0
          f             34        57.6        57.6              96.6
          m              2         3.4         3.4             100.0
          Total         36       100.0       100.0

A4. OCCUPATION

                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid     (blank)       40        32.2        32.2              32.2
          farmer         1         1.7         1.7              33.9
          ret           39        66.1        66.1             100.0
          Total         40       100.0       100.0

A5.
EDUCATION LEVEL

                              Frequency   Percent   Valid Percent   Cumulative Percent
Valid     (blank)                29        50.8        50.8              50.8
          College graduate        2         3.4         3.4              54.2
          Grade school            2         3.4         3.4              57.6
          High school            15        25.4        25.4              83.1
          Some college            5         8.5         8.5              91.5
          Some high school        5         8.5         8.5             100.0
          Total                  29       100.0       100.0

A6. INCOME

                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid     -29999.00       1         1.7         5.3               5.3
          -19999.00       1         1.7         5.3              10.5
          -14000.00       5         8.5        26.3              36.8
          -9999.00       12        20.3        63.2             100.0
          Total          19        32.2       100.0
Missing   System         35        67.8
Total                    54       100.0

APPENDIX B
SURVEY TOOL USED IN SENIOR FOOD SAFETY EDUCATION INTERVENTION

Below are statements about food safety knowledge. We would like you to tell us how complete your knowledge is of each piece of food safety information before today’s workshop. For each item on the left, circle a number in the box labeled “Before today’s workshop” that best describes your knowledge today. If you circle a number “1,” for example, then you estimate that your knowledge level of that subject is “none.” If you circle a number “5,” for example, then you estimate that your knowledge level is “complete.”

Please estimate your knowledge of the following:

MY KNOWLEDGE: Before today’s workshop

                                                    None   little   moderate   quite a bit   complete
1. How to thoroughly wash your hands                  1       2        3            4            5
2. Why hand washing is important                      1       2        3            4            5
3. How to avoid cross-contamination                   1       2        3            4            5
4. How to use a food thermometer                      1       2        3            4            5
5. Why I should use a food thermometer                1       2        3            4            5
6. How to safely chill food                           1       2        3            4            5
7. How to safely store food                           1       2        3            4            5
8. How to safely handle food that I take home
   from a restaurant or other place                   1       2        3            4            5

BIBLIOGRAPHY

Anderson, J. (2000).
A camera’s view of consumer food handling and preparation practices. Final report prepared for the United States Food and Drug Administration. North Logan, Utah: Spectrum Consulting.

Anderson, J. (2002). What consumers say they do... what they actually do: A comparison? Presented Sept. 18, 2002, at Thinking globally - working locally: A conference on food safety education. Orlando.

Arnold, M. (2002). Be logical about program evaluation: Begin with learning assessment. Journal of Extension, 40(3).

Bennett, C.F. (1976). Analyzing impacts of Extension programs (ESC-575). Washington, DC: Extension Service, USDA.

Bandura, A. (1980). Gauging the relationship between self-efficacy, judgment and action. Cognitive Therapy and Research, 4(2), 263-268.

Bandura, A. (1982). Self-efficacy mechanism in human agency. American Psychologist, 12(4), 30-40.

Bandura, A. and Adams, N.E. (1977). Analysis of self-efficacy theory of behavioral change. Cognitive Theory and Research, 1, 278-310.

Barrell, J. (1995). Working toward student self-direction and personal efficacy as educational goals. North Central Regional Educational Laboratory, http://www.ncrel.org/sdrs/areas/issues/students/learning/lr200.html.

Belasco, J. and Trice, H. (1969). The Assessment of Change in Training and Therapy. New York, McGraw-Hill.

Bennett, C.F. (1977). Analyzing impacts of extension programs. Washington, DC: Cooperative Extension Service, U.S. Department of Agriculture. ESC 575.

Bennett, C.F., & Rockwell, K. (1996, draft). Targeting outcomes of programs (TOP): An integrated approach to planning and evaluation. Washington, DC: CSREES, USDA.

Borg, M., Cicotte, D., Finks, C., Mercado, C. (2000). Using fear to reach military members: A content-analytic study of Defense Department PSAs. University of Oklahoma, http://www.ou.edu/deptcomm/dodjcc/groups/ 01A1frntroduction.html.

Bray, J.H. and Howard, G.S. (1980). Methodological considerations in the evaluation of a teacher-training program.
Journal of Educational Psychology, 72(1): 62-70.

Campbell, D. and Stanley, J. (1963). Experimental and Quasi-experimental Designs for Research on Teaching. Chicago, Rand McNally.

Caporaso, J.A. (1973). Quasi-experimental Approaches to Social Science: Perspective and Problems. Evanston, IL, Northwestern University Press.

Centers for Disease and Control. (1997). Surveillance for Food-borne Disease Outbreaks -- United States, 1993-1997. Atlanta, GA: U.S. Department of Health and Human Services, Public Health Service, CDC.

Chapman, K., Clark, C., Boeckner, L., McClelland, J., Britten, P., & Keim, K. (1995). Multistate impact indicators project. Proceedings from the Society for Nutrition Education Annual Meeting, 20, 45.

Chapman-Novakofski, K., Boeckner, L.S., Canton, R., Clark, C.D., Keim, K., Britten, P., McClelland, J. (1997). Evaluating evaluation: What we’ve learned. Journal of Extension, 35(1).

Cicciarella, C. (1995). Experimental Research Concepts. http://www.latech.edu/tech/education/cicciarella/hpe509/experim.txt.

Conley, S., Andreasen, A., Durch, J., Stafford, S., McPeak, H. (2002). (unpublished) USDA’s food thermometer education campaign: A social marketing approach to change behavior and prevent foodborne illness. A report for the Journal of Nutrition and Behavior.

Cook, T.D. and Campbell, D.T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston, Houghton-Mifflin.

Cronbach, L., Meehl, P. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.

Decker, D.J. and Yerka, B.L. (1990). Organizational philosophy for program evaluation. Journal of Extension, 28, 28-29.

Diem, K.G. (2003). Program development in a political world: It’s all about impact! Journal of Extension, 41(1).

Dillman, D.A. (1978). Mail and telephone surveys: The total design method. New York: John Wiley and Sons, Inc.

Duncan, S. and Goddard, H.W. (1997).
Training in evaluation for parent education programs using the National Extension Parent Education Model (NEPEM). Journal of Extension, 35(2).

Fisher, R.J. (1993). Social desirability bias and the validity of indirect questioning. Journal of Consumer Research, 20, 303-315.

Gentry-Van Laanen, P. (1995). Evaluating extension program effectiveness: Food safety education in Texas. Journal of Extension, 33(5).

Gettings, M. and Kiernan, N. (2001). Practices and perceptions of food safety among seniors who prepare meals at home. Journal of Nutrition Education, 33(3), 148-154.

Goldstein, I.L. (1994). Training in Organisations. Brooks/Cole, Pacific Grove, CA.

Grove, D.A. and Ostroff, C. (1990). Program evaluation. In Wexley, K. and Hinricks, J. (Eds), Developing Human Resources. BNA Books, Washington, DC.

Hoogsraten, J. (1982). The retrospective pretest in an educational training context. Journal of Experimental Education, 50: 200-04.

Howard, G.S. (1980). Response-shift bias: A problem in evaluating interventions with pre/post self-reports. Evaluation Review, 4(1): 93-106.

Howard, G.S. and Dailey, P.R. (1979). Response-shift bias: A source of contamination of self-report measures. Journal of Applied Psychology, 64: 144-50.

Howard, G., Dailey, P., and Gulanick, N. (1979). The feasibility of informed pretests in attenuating response-shift bias. Applied Psychology Measurement, 3, 481-494.

Howard, G.S., Ralph, K.M., Gulanick, N.A., Maxwell, S.E., Nance, D.W. and Gerber, S.L. (1979). Internal validity in pre-test-post-test self-report evaluations and a reevaluation of retrospective pre-tests. Applied Psychological Measurement, 3: 1-23.

Howard, G.S., Schmeck, R.R. and Bray, J.H. (1979). Internal invalidity in studies employing self-report instruments: A suggested remedy. Journal of Educational Measurement, 16(2), 129-135.

Howard, G.S., Millham, J., Slaten, S. and O’Donnell, L. (1981). Influence of subject response-style effects on retrospective measures.
Applied Psychological Measurement, 5: 144-150.

Kiernan, N. (2002). Program evaluation, http://www.extension.psu.edu/evaluation.

Kirkpatrick, D.L. (1960). Techniques for evaluating training programs. Journal of the American Society of Training Directors, 14: 13-32.

Laughlin, K. and Randall, K. (accessed on January 16, 2004). Extension Tactics: Program Evaluation. University of Idaho Agricultural Communications, 3.

Lewin, K. (1947). Group decision and social change. In T. Newcomb and E. Hartley (Eds), Readings in social psychology. Holt: New York, 330-345.

Linn, R.L., and Slinde, J.A. (1977). The determination of the significance of change between pre- and post-testing periods. Review of Educational Research, 47: 121-150.

Macro International Inc. (1998). Unpublished data. Focus groups on barriers that limit consumers’ use of thermometers when cooking meat and poultry products. Submitted to the United States Department of Agriculture Food Safety and Inspection Service, Jan. 1998.

Marshak, S.H., deSilvva, P. and Silbertstein, J. (1998). Evaluation of peer-taught nutrition education for low-income parents. Journal of Extension, 27, 19-21.

McClelland, J., Keim, K., Britten, P., Boeckner, L., Chapman, K., Clark, C., & Mustian, R. (1995). Measuring dietary knowledge and behavior using impact indicators. Proceedings from the Society for Nutrition Education Annual Meeting, 20, 46.

McCune, J. (1994). Measuring the value of employee education. Management Review, 83(4).

Mead, P.S., Slutsker, L., Dietz, V. (1999). Food-related illness and death in the United States. Emerging Infectious Diseases, 5: 607-25.

Medeiros, L., Hillers, V., Kendall, P., Mason, A. (2001). Evaluation of food safety education for consumers. Journal of Nutrition Education, 33, 27-34.

Mezoff, B. (1981). How to get accurate self-reports of training outcomes. Training and Development Journal, 56-61.

Mezoff, B. (1983). Six more benefits of pretesting trainees. Training, 45-47.

Mincemoyer, C. and Perkins, D. (2001).
Building your youth development toolkit: A community youth development orientation for Pennsylvania 4-H / youth programs. Journal of Extension, 39(4).

National Food Safety and Toxicology Center, http://www.foodsafe.msu.edu

Neale, J.M. and Liebert, R.M. (1973). Science and Behavior: An Introduction to Methods on Research. Englewood Cliffs, NJ, Prentice-Hall.

Patton, M.Q. (1986). Utilization-focused evaluation. Beverly Hills, CA: Sage.

Phillips, J. (1991). Handbook of Training Evaluation and Measurement Methods. Gulf Publishing, Houston, TX.

Pohl, N.P. (1982). Using retrospective pre-ratings to counteract response-shift confounding. Journal of Experimental Education, 50(4), 211-214.

Pratt, C., McGuigan, W. and Katzev, A. (2000). Measuring program outcomes: Using retrospective pretest methodology. American Journal of Evaluation, 21, 341-349.

Raidl, M. (2002). Extension nutrition program uses retrospective survey to determine impact on participants. Impact, University of Idaho Extension.

Raidl, M. (2004). Unpublished data. Use retrospective surveys to obtain complete data sets and measure impact in extension program. University of Idaho Cooperative Extension Services.

Rhodes, J.E. and Jason, L.A. (1987). The retrospective pretest: An alternative approach in evaluating drug prevention. Journal of Drug Education, 17: 345-356.

Riley, J.L. and Greene, R.R. (1993). Influence of education on self-perceived attitudes about HIV / AIDS among human services providers. Social Work, 38(4): 396-401.

Rockwell, S.K. and Kohn, H. (1982). Post-then pre evaluation. Journal of Extension, 27: 19-21.

Rockwell, S.K., & Stevens, G.L. (1992). How accurate are pretest self-report measures that assess adults' knowledge and behavior? Paper presented to the American Evaluation Association.

Rohs, F. and Langone, C.A. (1996). Measuring leadership skills development: A comparison of methods. Association of Leadership Educators Proceedings, Burlington, Vermont, 7: 73-78.

Rohs, F.
and Langone, C.A. (1997). Increased accuracy in measuring leadership impacts. Journal of Leadership Studies, 4(1): 150-158.

Rohs, F. and Langone, C.A. (1998). Response-shift bias in student self-report assessments. NACTA Journal, 42(1): 46-49.

Rohs, F. (1999). Response-shift bias: A problem in evaluating leadership development with self-report pretest-posttest measures. Journal of Agricultural Education, 40(4): 28-37.

Rohs, F., Langone, C., Coleman, R. (2001). Response-shift bias: A problem in evaluating nutrition education using self-report measures. Journal of Nutrition Education, 33(3), 165-170.

Rohs, F. (2002). Improving the evaluation of leadership programs: Control response shift. Journal of Leadership Education, 1(2).

Sadri, G. and Snyder, P. (1995). Methodological issues in assessing training effectiveness. Journal of Managerial Psychology, 10(4).

Schafer, R., Schafer, E., Bultena, G., Holberg, E. (1993). Food safety: An application of the health belief model. Journal of Nutrition Education, 25, 17-23.

Schalock, R. and Bonham, G. (2003). Measuring outcomes and managing for results. Evaluation and Program Planning, 26(3), 229-235.

Shadish, W., Cook, T., Leviton, L. (1991). Foundations of program evaluation. Newbury Park, CA: Sage.

Sprangers, M. and Hoogsraten, J. (1988). Response-style effects, response-shift bias and a bogus pipeline: A replication. Psychological Reports, 62: 11-16.

Sprangers, M. and Hoogsraten, J. (1989). Pre-testing effects in retrospective pre-test-post-test designs. Journal of Applied Psychology, 74, 555-72.

St. Pierre, R.C. (1982). Specifying outcomes in nutrition education evaluation. Journal of Nutrition Education, 14(2), 49-51.

Stevens, G. and Lodl, K. (1999). Community coalitions: Identifying changes in coalition members as a result of training. Journal of Extension, 37(2).

Summers, J.C., Miller, R.W., Young, R.E., & Carter, C.E. (1981, July).
Program evaluation in extension: A comprehensive study of methods, practices and procedures (Executive Summary). Morgantown, WV: West Virginia University Cooperative Extension Service, Office of Research and Development.

Thompson, C. (1985). Mixing apples and oranges: Results can be aggregated across individual programs. Journal of Extension, 23(1).

United States Department of Agriculture, http://www.usda.gov.

United States Department of Agriculture. FY88-91 Extension's food nutrition and health program emphases and outcome indicators. Washington, DC: Home Economics and Human Nutrition Extension Service-USDA.

United States Department of Agriculture Food Safety and Inspection Service. (1999). Progress report on salmonella testing of raw meat and poultry products. http://www.fsis.usda.gov/OA/background/salmtest4.htm, Oct., 1999.

U.S. Environmental Protection Agency (EPA), U.S. Department of Agriculture (USDA), Department of Health and Human Services (DHHS). (1997). Food safety from farm to table: A national food safety initiative. A report to the President. U.S. Environmental Protection Agency, Washington, DC.

United States Food and Drug Administration, http://vm.cfsan.fda.gov

United States Food and Drug Administration, http://vm.cfsan.fda.gov/~dms/seniorsb.html, accessed on April 22, 2004.

United States Food Safety and Inspection Service. (2002). Seniors need wisdom on food safety. http://www.fsis.usda.gov/OA/pubs/seniors.htm.

University of Tennessee Cooperative Extension Service. Financial security in later life. Accessed on December 12, 2003, http://web.utk.edu/~aee/ParticipantChanges.htm.

Upshaw, V., Umble, K., Orton, S. and Matthews, K. (2000). Unpublished data. North Carolina Institute for Public Health, School of Public Health, University of North Carolina at Chapel Hill.

Voichick, J. (1991). Impact indicators project report. Madison, WI: Extension Service-USDA.

Warner, P. D., & Maurer, R. C. (1984). Methods of program evaluation.
Lexington, KY: University of Kentucky.

Witte, K. (1992). Putting the fear back into fear appeals: The extended parallel process model. Communication Monographs, 59, 329-349.

Witte, K. & Allen, M. (2000). A meta-analysis of fear appeals: Implications for effective public health campaigns. Health Education & Behavior, 27, 591-615.

http://www.ca.uky.edu/agpsd/soregion.htm

http://www.extension.psu.edu/evaluation/resources.html

www.cyfernet.org/training/evaluation.ppt, Evaluation: Using multiple methods to demonstrate impacts. Utah State University. Accessed on April 20, 2004.