This is to certify that the thesis entitled

The Influence of Cues About Relevance and Quality on the Selection of Instructional Films

presented by

Joseph Michael Memmel

has been accepted towards fulfillment of the requirements for Ph.D. degree in Education

Copyright by
JOSEPH MICHAEL MEMMEL
1978

THE INFLUENCE OF CUES ABOUT RELEVANCE AND QUALITY ON THE SELECTION OF INSTRUCTIONAL FILMS

By

Joseph Michael Memmel

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Secondary Education and Curriculum

1978

ABSTRACT

THE INFLUENCE OF CUES ABOUT RELEVANCE AND QUALITY ON THE SELECTION OF INSTRUCTIONAL FILMS

By

Joseph Michael Memmel

The Problem

Traditionally, behavioral models and methods have not been used to evaluate the effectiveness of film descriptions which are commonly found within the marketplace. To date, the type of information typically provided within film descriptions has usually been based upon untested, a priori assumptions about the characteristics which define a "good" description. This study explored a different alternative--definition of a "good" film description in terms of empirical evidence obtained of its actual effectiveness.

Objectives

1. To define a behavioral model of film selection based upon the perceived relevance and quality of instructional films, and the basic assumptions, theoretical foundations and measurement methods upon which the model was based.
2. To obtain evidence of the validity and reliability of the model.
3. To determine if the model can be used to evaluate the effectiveness of different film description styles.
Methodology

First, the model, a forced-choice decision model, was defined in terms of two basic types of information cues, instructional pertinence indicators (IPI's) and film quality indicators (FQI's). The cues assist one to make three fundamental types of film selection judgments--instructional pertinence (relevance), film quality and betterness. The latter are judgments of the degree to which a given film is perceived to be "better" to use than another for a particular instructional purpose.

Second, the related literature and research was reviewed to identify ways in which the model was supported by other studies and writings.

Third, an experiment was administered to test important assumptions of the model and its use. The experiment was employed to determine the degree to which IPI's and FQI's actually influenced the film selection process. The experiment was designed to compare (a) the similarity of rated selection judgments elicited from three experimental film description styles with (b) counterpart judgments elicited from the corresponding films.

The three description styles all contained similar arrays of IPI's. But they varied in terms of the type of FQI's provided. One style contained valid overall ratings of film quality. Another contained invalid quality ratings; the third, no quality ratings.

Three independent variables were investigated: (a) treatments, the type of decision-making information provided--films or film descriptions; (b) stimulus conditions, the particular circumstances posed for the selection judgments; and (c) films, the specific stimulus films which were used. Rated relevance, quality and betterness judgments were obtained as dependent variables.

Mean ratings were used as basic indices of response. Subjective measures were also used. Analysis of variance (ANOVA) and Spearman rank correlation statistical analysis techniques were used to compare the subjects' mean ratings.
General Findings

1. Although all of the assumptions of the film selection model were not confirmed, the model was generally found to be a valid, meaningful one.
2. The model was reliable in that the general principles of selection defined by the model were confirmed. However, the model was unreliable in that the ANOVA and rank correlation measurement methods did not always produce similar experimental results. This was attributed to the use of small sample sizes.
3. The model was found to be useful for evaluating the effectiveness of the experimental film descriptions.
4. Each of the independent variables significantly influenced the experimental results. Strong (p ≤ .01) treatment, stimulus condition, and film effects occurred.
5. Strongly divergent (.01 ≤ p ≤ .15) rating tendencies resulted from the presence and the lack of overall ratings of film quality within the experimental descriptions.
6. Distinct types of IPI's and FQI's were identified which can be included potentially within film descriptions.

Implications and Recommendations

Both were provided, related to practical, theoretical and experimental design considerations.

To Marlex, "Marvelous" Michelle, "Charming" Chad, and "Lovely, Little" Lanette--
for the inspiring beauty, charm, warmth and "special" glow of their love . . . .
and to our Mothers and Dads--
for giving us our start, and the opportunity to share the joys, trials, marvels and triumphs of life within our Creator's incredible universe.

ACKNOWLEDGMENTS

To the following people, I offer a special "Thank You" in appreciation of the unique support and assistance which they gave in helping me to complete this dissertation.

To members of my doctoral program committee: Dr. Ted Ward, Dissertation Director; Dr. Paul Witt, past Chairman; Dr. Castelle Gentry, present Chairman; Dr. Norman Bell; and Dr. James Page. I am especially indebted to Dr.
Ward for his ongoing guidance and support while conceptualizing and organizing the initial drafts of this document; and to Dr. Gentry, who supervised the preparation of the final draft.

To the administrators and information specialists of the graduate research libraries at the following universities, who helped me to complete the review of related research and literature: Michigan State University, East Lansing; University of Wisconsin-Milwaukee; New Mexico State University, Las Cruces; and the University of Washington, Seattle.

To those who served as film evaluators for the experimental phase of this study, namely: Drs. Leon Williamson, Sherry Engle, Chuck Bomont, John Thomas and Alma Barba of the College of Education, New Mexico State University; Drs. Kenneth Melgard, Simons and Ewing of the College of Natural Science, New Mexico State University; Dr. Phil Dillard and Red Hall of the Audiovisual Center, New Mexico State University; Messrs. Bob Price and Dick Hanks of the Southwest Regional Media Center for the Deaf, New Mexico State University; and Mss. Linda Howard, Chris Ramsey, Linda Hanks and Roberta White, residents of Las Cruces, New Mexico.

To Dr. Don Ferguson, Associate Dean, College of Education, New Mexico State University, for his assistance in obtaining permission to use NMSU facilities, resources and students for the experimental phase of the study.

To all of the "volunteers" who served as experimental subjects, for giving their time, interest and commitment to help.

To Ms. Sue Cooley for her conscientious efforts in preparing and submitting the final, polished copies of this document.
And especially--to my wife and family, for their admirable help in making the write-up and submission of this study a more bearable task

--To my magnificent wife, Marlex, for her tireless devotion and persistence in typing the initial rough drafts, and for her valiant, continual support during the rather prolonged time period during which this project was pursued;

--To Michelle, Chad and Lanette, for their heart-warming smiles and delightful antics when they visited with me at my desk; and for forgiving me for not being a better "Dad" whenever I had to "scoot" them away.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS AND SYMBOLS

Chapter

I. INTRODUCTION
  Purpose of the Study
  Objectives of the Study
  Method of Investigation
  Conceptualization of the Film Selection Model
  The Review of Related Literature
  The Experimental Phase of the Study
  Rationale: The Problem and Need for the Study
  Definitions
  General Assumptions of the Study
  Film Selection Decision Models
  Relevance and Quality Information Cues
  The Film Selection Process
  The Influence of Film Descriptions
  The Design and Evaluation of Film Descriptions
  The Film Selection Model
  The Conceptual Model of Film Selection
  Latent vs. Manifest Levels of the Model
  General Assumptions of the Model
  Hypotheses
  General Scope and Limitations of the Study
  The Significance of the Study
  Theoretical Contributions
  Practical Contributions
  Organization of the Study

II. REVIEW OF RELATED RESEARCH AND LITERATURE
  Introduction
  Purpose of the Review
  Objectives of the Review
  Organization of the Review
  Factors Which Influence Film Quality Judgments
  Film Quality: Basic Views
  Film Quality Appraisal Methods
  Film Quality Evaluators-Selectors
  Film Evaluation-Selection Criteria
  Evaluation-Selection Criteria Associated With Different Views of Film Quality
  The Inherent Attributes View of Film Quality: Associated Evaluation-Selection Criteria
  The Technical Quality View of Film Quality: Associated Evaluation-Selection Criteria
  The Instructional Design View of Film Quality: Associated Evaluation-Selection Criteria
  The Effects-Assessment View of Film Quality: Associated Evaluation-Selection Criteria
  Factors Which Influence Relevance Judgments
  Definitions of Relevance and Relevance Judgment
  Related Research
  The Reliability and Validity of Relevance Judgments
  Relevance and Quality Information Cues Present in Film Descriptions
  The Importance of Film Descriptions and Information Cues Exhibited by Them
  Relevance and Quality Cues Suggested From Film Description Content Analyses
  Unique Relevance and Quality Cues
  Criticisms Made of Information Cues Supplied in Film Descriptions
  Recommendations Made for Improving Information Cues Supplied in Film Descriptions
  General Conclusions Drawn From the Review
  The Distinction Between Relevance and Quality Cues
  Evaluation-Selection Criteria Which Influence Relevance and Quality Judgments
  Characteristics of Films Which Influence Relevance Judgments Elicited From Them
  Nonfilmic Factors Which Influence Both Relevance and Quality Judgments Made About Films
  Characteristics of Films Which Influence Quality Judgments Elicited From Them
  The Extent to Which Available Film Descriptions Exhibit Relevance and Quality Information Cues
  Assumptions About Instructional Pertinence Indicators and Film Quality Indicators
  Assumptions: Instructional Pertinence Indicators
  Assumptions: Film Quality Indicators
  General Assumptions
  Instructional Pertinence Indicators and Film Quality Indicators Suggested From the Literature for Inclusion Within Film Descriptions
  Critique
  The State-of-the-Art of the Related Research and Literature
  Needed Research and Evaluation Efforts: Critical Questions
  Summary

III. EXPERIMENTAL DESIGN AND METHODOLOGY
  Purpose
  The Experimental Method
  Independent Variables
  Dependent Variables
  Subjects
  Treatment Groups
  Stimulus Materials
  Procedure
  The Experimental Design
  Design Schematics
  The Stimulus Conditions
  Experimental Units: Units of Analysis
  General Measurement Methods: The Manifest Level of the Film Selection Model
  Rating Scales
  Anticipated Rating Patterns
  Subjective Comments
  The Measurement Methods Used for Comparing the Experimental Abstracts
  Measures: Reliability and Consistency
  Definitions of Reliability and Consistency
  Indices of Reliability and Consistency
  Other Descriptive Indices
  Measurement-Related Assumptions
  General Assumptions
  The Film Selection Process
  Film Descriptions
  The Film Selection Model
  The Indices of Reliability and Consistency
  The Quality Measure
  The Relevance Measure
  The Experimental Abstract Styles
  The Subjective Comments
  Nuisance Variables
  Limitations of the Experimental Design and Methodology
  The Repeated Measures Design
  Sampling Considerations
  The Independent Variables
  Measurement Methods
  Nuisance Variables
  The Data Analysis Methods
  Problems Encountered
  Summary

IV. EXPERIMENTAL FINDINGS AND RESULTS
  General Findings: Summary
  The General Question Posed by the Hypotheses
  The Hypotheses
  Critical Assumptions
  Hypothesis One: The Subjects' Quality Ratings, ANOVA Indices
  Hypothesis One
  General Findings
  Mean Ratings and Standard Deviations
  Hypothesis Two: The Subjects' Relevance Ratings, ANOVA Indices
  Hypothesis Two
  General Findings
  Mean Ratings and Standard Deviations
  Mean Ratings: Combined Cell Means
  The ANOVA Analyses: Combined Cell Means
  Hypothesis Three: The Subjects' Betterness Ratings, ANOVA Indices
  Hypothesis Three
  General Findings
  Mean Ratings and Standard Deviations
  Abstract vs. Film Shifts
  Hypothesis Four: The Subjects' Relevance Ratings, Rank Indices
  Hypothesis Four
  General Findings
  The Rank Indices
  Hypothesis Five: The Subjects' Ratings, Rank Indices, Combined Means
  Hypothesis Five
  General Findings
  The Rank Indices
  The Film Evaluation Panel's Quality Ratings
  General Findings
  Overall Means and Standard Deviations
  The Difference Between the Overall Means
  The Mean Criterion Ratings
  The Reliability of the Ratings
  The Film Evaluation Panel's Subjective Comments
  General Findings
  Strengths vs. Nonstrengths
  Subjective Comments vs. Mean Criterion Ratings
  The Subjects' Subjective Comments, Quality Measure
  General Findings
  Strengths vs. Nonstrengths
  Type P, Q, and PQ Comments
  The Subjects' vs. Panel's Responses, Quality Measure
  General Findings
  Subjective Comments: Similarities
  Subjective Comments: Dissimilarities
  Subjective Comments vs. Overall Mean Ratings
  The Subjects' Subjective Comments, Betterness Measure
  General Findings
  Type P, Q, and PQ Comments
  Evaluation-Selection Referent Criteria
  Critical Assumptions: The Experimental Phase of the Study
  General Findings
  Relevance Ratings: O3 Stimulus Condition
  Macro-Type Rating Patterns
  The Intermeasure Correlations

V. SUMMARY, CONCLUSIONS, IMPLICATIONS, AND RECOMMENDATIONS
  The Intent and Emphases of the Study
  Purpose of the Study
  Objectives of the Study
  Method of Investigation
  Rationale: The Problem and Need for the Study
  The Model of Film Selection
  Assumptions: The Film Selection Process
  Assumptions: Film Descriptions
  Assumptions: The Model
  The Review of Related Literature and Research
  The Experimental Design and Methodology
  Overview
  The Hypotheses
  Measurement Methods
  Critical Measurement Assumptions
  The Experimental Findings and Results
  General Findings: The Hypotheses
  General Findings: The Experimental Abstract Styles
  General Findings: The ANOVA Analyses
  General Findings: The Model of Film Selection
  General Findings: Measurement Methods
  Conclusions: The Experimental Phase of the Study
  The Hypotheses
  The Experimental Abstract Styles
  The Model of Film Selection
  Discussion
  Interpretation of the Experimental Results
  Assumptions of the Study: Revisions and Additions
  Use of the Model for Evaluating Film Descriptions
  The Generalizability of the Experimental Results
  Implications of the Study
  The Use of Film Quality Ratings
  The Reliability of Film Descriptions
  Teacher Training
  Applications of the Model
  The Design of Film Descriptions
  Use of the Model for Evaluating Film Descriptions
  Relevance Theory
  Recommendations
  Refinement and Testing of the Model
  Refinement of the Measurement Methods
  Testing the Use of the Model for Evaluating Film Descriptions
  Defining Descriptions in Behavioral Terms
  Testing Other Applications of the Model
  Development of Other Behavioral Models

APPENDICES
  A. THE RATING INSTRUMENTS USED BY THE FILM EVALUATION PANEL
  B. THE RATING INSTRUMENTS AND SURVEY FORM USED BY THE EXPERIMENTAL SUBJECTS
  C. THE EXPERIMENTAL ABSTRACTS
  D. THE FILM EVALUATION PANEL'S QUALITY RATINGS
  E. THE FILM EVALUATION PANEL'S SUBJECTIVE COMMENTS
  F. RAW DATA: THE SUBJECTS' RATINGS
  G. THE SUBJECTS' SUBJECTIVE COMMENTS OBTAINED FROM THE QUALITY MEASURE
  H. THE ANOVA ANALYSIS MADE OF THE SUBJECTS' BETTERNESS RATINGS
  I. THE ANOVA ANALYSES MADE OF THE SUBJECTS' QUALITY RATINGS
  J. THE ANOVA ANALYSES MADE OF THE SUBJECTS' RELEVANCE RATINGS
  K. THE EXPERIMENTAL DESIGNS USED FOR THE RELEVANCE, QUALITY, AND BETTERNESS MEASURES

BIBLIOGRAPHY

LIST OF TABLES

Table

1. Instructional Design Characteristics of Instructional Films
2. Empirical Research Conclusions About Factors Assumed to Influence Relevance Judgments Obtained From Print-Form Documents and Document Representations
3. Instructional Pertinence Indicators Suggested for Inclusion Within Film Descriptions
4. Film Quality Indicators Suggested for Inclusion Within Film Descriptions
5. Means and Standard Deviations of the Overall Quality Ratings Elicited From the Experimental Subjects and Film Evaluation Panel
6. Session One vs. Session Two: Mean Quality Ratings
7. The O1F1 Stimulus Condition: Means and Standard Deviations of the Relevance Ratings Elicited From the Experimental Subjects
8. The O2F2 Stimulus Condition: Means and Standard Deviations of the Relevance Ratings Elicited From the Experimental Subjects
9. The O3F1 Stimulus Condition: Means and Standard Deviations of the Relevance Ratings Elicited From the Experimental Subjects
10. The O3F2 Stimulus Condition: Means and Standard Deviations of the Relevance Ratings Elicited From the Experimental Subjects
11. Mean Relevance Ratings: Combined Means Across the Four Stimulus Conditions, by Sessions and Treatment Groups
12. Mean Relevance Ratings Obtained for Films One and Two, for the High and Mediocre Relevance Conditions, Combined, by Treatment Group and Session
13. Mean Relevance Ratings Obtained for the High and Mediocre Relevance Conditions for Films One and Two, Combined, by Treatment Group and Session
14. Mean Relevance Ratings Obtained for the Mediocre and High Relevance Conditions, for Sessions One and Two Combined, by Treatment Group and Stimulus Condition
15. Means and Standard Deviations of the Betterness Ratings Obtained From the Experimental Subjects, by Treatment Groups and Sessions
16. Means and Standard Deviations of the Quality Ratings Elicited From the Film Evaluation Panel
17. The Mean Criterion Ratings Obtained From the Film Evaluation Panel
18. Frequency Distribution of Type P, Q, and PQ Comments Elicited From the Experimental Subjects for the Quality Measure, for Films One and Two, Combined
19. Frequency Distribution of Type P, Q, and PQ Comments Elicited From the Experimental Subjects for the Betterness Measure, by Treatment-Session Groups
20. The Percentage of Responses Obtained for the Evaluation-Selection Factors Referred to Most Often in the Subjects' Betterness Comments, by Type of Comment and Selection Decision
21. Mean Relevance Ratings Obtained for Films One and Two for the O3 Objective, During Session One, by Treatment Groups
22. The Correlation Between the Subjects' Relevance and Quality Ratings
23. The Correlation Between the Subjects' Quality and Betterness Ratings
24. The Correlation Between the Subjects' Relevance and Betterness Ratings
D1. The Panel's Ratings for Film One Obtained From the Film Quality Rating Instrument
D2. The Panel's Ratings for Film Two Obtained From the Film Quality Rating Instrument
E1. The Strengths of Film One vs. the Nonstrengths of Film Two: Representative Subjective Comments Made by the Film Evaluation Panel
F1. The Treatment Group I Ratings
F2. The Treatment Group II Ratings
F3. The Treatment Group III Ratings
F4. The Treatment Group IV Ratings
G1. Illustrative Examples of Type P Comments Elicited From the Experimental Subjects for the Quality Measure
G2. Illustrative Examples of Type Q Comments Elicited From the Experimental Subjects for the Quality Measure
G3. Illustrative Examples of Type PQ Comments Elicited From the Experimental Subjects for the Quality Measure
H1. Results of the Treatments (4) x Sessions (2) ANOVA Analysis Made of the Subjects' Betterness Ratings
I1. Results of the Treatments (2) x Sessions (2) x Films (2) ANOVA Analysis Made of the Quality Ratings Obtained From Treatment Groups I and IV
I2. Results of the Sessions (2) x Films (2) ANOVA Analysis Made of the Treatment Group I Quality Ratings
J1. Results of the Treatments (4) x Sessions (2) ANOVA Analysis Made of the Relevance Ratings Obtained From the O1F1 Stimulus Condition
J2. Results of the Treatments (4) x Sessions (2) ANOVA Analysis Made of the Relevance Ratings Obtained From the O3F2 Stimulus Condition
J3.
Results of the Treatments (4) x Sessions (2) x Stimulus Conditions (2) ANOVA Analysis Made of the Relevance Ratings Elicited From the Mediocre Relevance Condition, for the O3F1 and O3F2 Stimulus Conditions Combined
J4. Results of the Treatments (4) x Sessions (2) x Stimulus Conditions (2) ANOVA Analysis Made of the Relevance Ratings Elicited From the High Relevance Condition, for the O1F1 and O2F2 Stimulus Conditions Combined
J5. Results of the Treatments (4) x Films (2) ANOVA Analysis Made of the Subjects' Relevance Ratings Elicited for the O3 Objective During Session One

LIST OF FIGURES

Figure

1. The Forced-Choice Film Selection Process
2. Latent and Manifest Levels of the Forced-Choice Film Selection Process
3. The Forced-Choice Film Selection Process Used by the Experimental Subjects
4. The Units of Analysis: The Eight Treatment-Session Groups
5. The Overall Design of the Experimental Phase of the Study
6. Rating Scales Used With the Film Selection Model
7. Possible Combinations of Relevance, Quality, and Betterness Ratings for Forced-Choice Selection Decisions Involving Pairs of Films
8. Expected Rating Patterns: Macro-Types I, II, III, and IV
9. Expected Micro-Type Rating Patterns for Forced-Choice Selections
10. The Type IV Micro-Type Rating Patterns
11. Types of Ranked Means Profiles Obtained for the Treatment-Session Groups
12. Treatment-Session Group Comparisons Associated With Corresponding Consistency and Reliability Indices
13. Mean Quality Ratings: The Interaction of Treatment Groups and Films, by Sessions
14. Mean Quality Ratings: The Interaction of Treatment Groups and Sessions, by Stimulus Films
15.
Mean Relevance Ratings: The Interaction of Treatment Groups and Sessions, by Stimulus Conditions
16. Mean Relevance Ratings: The Interaction of Sessions and Treatment Groups, by Stimulus Conditions
17. The Interaction of Stimulus Conditions and Sessions, by Treatment Groups
18. The Intrasubject and Intersubject Abstract vs. Film Shifts Noted for the Relevance Stimulus Conditions, by Films and Treatment Groups
19. Mean Betterness Ratings: The Interaction of Treatment Groups and Sessions
20. The Intrasubject and Intersubject Abstract vs. Film Shifts Noted for the Betterness Measure, by Treatment Groups
21. Frequency Distribution of the Quality Comments, by Type of Comment and Treatment-Session Groups
22. Frequency Distribution of the Betterness Comments, by Type of Comment and Treatment-Session Groups
K1. The Experimental Design for the Relevance Measure
K2. The Experimental Design for the Quality Measure
K3. The Experimental Design for the Betterness Measure

LIST OF ABBREVIATIONS AND SYMBOLS

ANOVA .... Analysis of variance
B .... Betterness
C .... Conditions (stimulus conditions)
D .... Mean rating difference
F .... F-test value; films
F1 .... Film one
F2 .... Film two
FQI .... Film quality indicator
FQRI .... Film Quality Rating Instrument
IPI .... Instructional pertinence indicator
M .... Mean
ND .... The "no difference" betterness option
O1 .... Objective one
O2 .... Objective two
O3 .... Objective three
p .... Probability
P .... Pertinence-oriented (type P comments); Preference (teaching)
PQ .... Pertinence and quality-oriented
q1 .... Quality rating, session one
q2 .... Quality rating, session two
Q .... Quality; quality-oriented (type Q comments)
r ....
Product-moment correlation
r1 .... Relevance rating, film one
r2 .... Relevance rating, film two
R .... Relevance (instructional pertinence)
S .... Subjects
S .... Session, sessional; subject
S1 .... Session one
S2 .... Session two
SD .... Standard deviation
t1 .... Treatment group I, session one
t2 .... Treatment group II, session one
t3 .... Treatment group III, session one
t4 .... Treatment group IV, session one
'6] .... Treatment group I, session two
‘32 .... Treatment group II, session two
tfl3 .... Treatment group III, session two
tfl4 .... Treatment group IV, session two
T .... Treatment; total
T1 .... Treatment group I
T2 .... Treatment group II
T3 .... Treatment group III
T4 .... Treatment group IV

CHAPTER I

INTRODUCTION

This chapter describes the following facets of the study: the purpose and objectives; the underlying rationale--the problem investigated and the need for the study; the method of investigation; the basic, underlying assumptions and conceptual foundations; the hypotheses formulated for the experimental phase; the general scope, limitations, significance, and organization of the study; and key terms and phrases used to describe the study.

Purpose of the Study

Overall, this study was designed to investigate the nature and influence of relevance and quality information cues exhibited by films and film descriptions upon the film selection process, from a behavioral viewpoint of the selection process.

Objectives of the Study

The major objectives of this study were the following:

1. To define a behavioral model of film selection based upon judgments made of the perceived relevance and quality of instructional films, and the basic assumptions, theoretical foundations and measurement methods upon which the model was based.
2. To obtain evidence of the validity and reliability of the model.
3.
To determine if the model can be used to evaluate the effectiveness of different film description styles.

Method of Investigation

Three essential tasks were pursued in this study: (a) definition of the model of film selection, (b) analysis and synthesis of the related research and literature, and (c) design and administration of the experimental phase of the study.

Conceptualization of the Film Selection Model

The model was defined in terms of a relatively simple notion, namely, that three fundamental types of judgments are made by selectors: judgments of instructional pertinence (relevance to a defined instructional situation) and qualitative worth (film quality), and judgments of betterness when the strengths and nonstrengths of two different films are compared for a particular instructional purpose. The model was based upon the assumed influence of information cues exhibited by films and film descriptions, which serve as instructional pertinence indicators and film quality indicators.

The model was developed in several stages. First, the process by which selectors use instructional pertinence indicators and film quality indicators was defined in behavioral, measurable terms--in terms of rated judgments of relevance, quality, and betterness made by selectors. Second, a methodology for comparing the relative magnitudes of ratings obtained from film descriptions and the corresponding films was developed. Third, the fundamental principles of selection and the corresponding selection judgment patterns implied by the model were defined. The selection principles predicted that several basic types of rating patterns would tend to be exhibited when the model of film selection is used to compare selector responses. Fourth, general assumptions underlying the model were defined and critical assumptions were identified to enable verification of the inherent validity and reliability of the model.
Fifth, formal hypotheses were established for the experimental phase of the study.

The Review of Related Literature

The literature was reviewed essentially to identify documents useful for defining the theoretical, conceptual, and behavioral foundations of the study.

Four literature bases were reviewed: the literature and research dealing with (a) the theory and practice of the instructional materials evaluation-selection process, (b) the design and evaluation of media descriptions, (c) film evaluation-selection criteria, and (d) factors which influence the relevance judgment process.

A critique of the related literature and research was provided. In addition, insights gained from the review were incorporated into sections of this study describing (a) assumptions about instructional pertinence indicators and film quality indicators, (b) other assumptions of the film selection model investigated, and (c) the kinds of instructional pertinence indicators and film quality indicators which can potentially be included in film descriptions.

The Experimental Phase of the Study

An experiment was administered to determine the degree to which relevance and film quality information cues supplied in several different film description styles actually influenced the film selection process. The experiment was designed to compare (a) the similarity of rated judgments of the relevance, quality, and betterness of a pair of films, elicited from the experimental film descriptions, with (b) counterpart judgments elicited from the corresponding films. Emphasis was placed upon comparison of the magnitudes and rank order relationship of mean ratings obtained from several groups of selectors. Emphasis was also placed upon identification and clarification of the types of information cues perceived by selectors during the selection judgment process, as indicated by the kinds of film evaluation-selection criteria considered for a given judgment.
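The comparison of magnitudes and rank order of mean ratings described above can be illustrated with a short Python sketch. It is offered only as a hedged illustration and is not part of the original study: the ratings, the group labels, and the helper names `average_ranks` and `spearman_rho` are all hypothetical, and the rank correlation is computed as the Pearson correlation of the two rank vectors.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation (assumes neither input is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ratings, invented for illustration; these are NOT data from the study.
film_group = {
    "relevance, film one": [6, 5, 6, 7],
    "relevance, film two": [3, 4, 2, 3],
    "quality, film one": [5, 6, 5, 6],
    "quality, film two": [4, 3, 4, 4],
}
abstract_group = {
    "relevance, film one": [6, 6, 5, 6],
    "relevance, film two": [3, 3, 3, 4],
    "quality, film one": [4, 4, 5, 4],
    "quality, film two": [4, 5, 4, 5],
}

judgments = list(film_group)
film_means = [mean(film_group[j]) for j in judgments]
abstract_means = [mean(abstract_group[j]) for j in judgments]

# Magnitude comparison: difference between abstract-based and film-based means.
for j, f, a in zip(judgments, film_means, abstract_means):
    print(f"{j}: film mean {f:.2f}, abstract mean {a:.2f}, difference {a - f:+.2f}")

# Rank order comparison: do the two sources order the judgments the same way?
rho = spearman_rho(film_means, abstract_means)
print(f"Spearman rho between the two sets of means: {rho:.2f}")
```

In this sketch, small mean differences and a rank correlation near +1 would suggest that a description style elicits judgments resembling those elicited by the films themselves; whether a given difference is statistically significant is a separate question that the sketch does not address.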
The effectiveness of three experimental film description styles was evaluated during the experimental phase of the study. The styles varied in terms of the type of film quality information supplied or not supplied within the descriptions.

The basic question investigated by the experimental phase of the study was the following: "Does the use of overall ratings of film quality within film descriptions make a difference?" Or, said another way, "Are selection judgments significantly influenced by the use of overall ratings of film quality within film descriptions?" In answering these questions, an evaluation of the actual effectiveness of the three experimental film description styles was obtained. At the same time, empirical evidence was obtained of the validity and reliability of the model of film selection.

Rationale: The Problem and Need for the Study

One can reasonably assume that the selection and use of instructional films is substantially influenced by the type of information supplied in film descriptions. But the type of information typically found in film descriptions leaves much to be desired. To illustrate, according to S. C. Johnson (1971),

     Virtually all current media information systems, including even university film library catalogs, do not meet even minimal criteria for integrity, availability, and usefulness of information (p. 71).

In the words of Heinich (1973), "The simple listings we have are primitive compared to the kinds of information decision-makers need" (p. 23).

To date, the design and evaluation of instructional media descriptions in general, and of film descriptions in particular, have been accomplished essentially through the use of subjective, opinion-oriented assessment procedures rather than through use of objective, empirical, experimental investigation methods.
Characteristics which define an effective film description have been defined primarily on a priori grounds, in terms of recommendations and suggestions made in the literature about criteria and factors of importance to the film evaluation-selection process.

This study was pursued to help define and clarify some basic theoretical perspectives and behavioral measurement methods which can potentially be used to define the characteristics of an effective film description on empirical, experimental grounds.

The study focused upon several voids noted in the related literature and research. First, the film selection literature has so far paid little attention to what is known about the relevance judgment process, although that knowledge bears on the design and evaluation of film descriptions. Second, few documents have been written which define the film selection process in measurable, behavioral terms. Third, and in particular, the literature has to date given little treatment to several fundamental questions which must be addressed in order to define a behavioral theory of film selection and film description design. The questions from which this study arose are these:

1. What kinds of judgments, cognitive comparisons, do selectors make during the selection process?

2. What kinds of measurable responses are representative of the types of film selection judgments made by selectors?

3. What measures, baseline measurements, indices, and patterns of selection responses can be used to reveal the degree of influence of different film description styles upon the selection process?

4. What tests of significance, statistical or otherwise, can be used to identify meaningful, reliable differences noted about selection responses which are made under controlled, experimental conditions?

This study is based upon the assumption that if a media description does not yield valid, reliable, predictable user responses, it probably is not a "good" description.
The study was pursued because no evidence from controlled, empirical studies was found in the literature indicating the degree to which media descriptions yield valid, reliable, predictable responses when read or compared by selectors. Such evidence should prove useful for improving the design of media descriptions.

The study is also based upon another important assumption: that suggestions and recommendations about the specific types of information which should be included in media descriptions should (a) arise from a clearly defined conception or theory of the media selection process and (b) be substantiated by empirical investigation.

The three film description styles evaluated during the experimental phase of the study were designed specifically for this study. The basic film description common to each style was defined on the basis of studies by Gilkey (1962, p. 99) and S. C. Johnson (1971, pp. 149-50). Gilkey and Johnson found that the kinds of film description information preferred by science and social studies educators include, among other things, the following: a paragraph-length subject matter content description; indications of the purpose and audience level; a list of subject matter topics treated; an appraisal, including indications of the strengths and weaknesses of a given film, when appropriate; and an overall rating of quality.

The three experimental film description styles were compared in an attempt to define the optimal amount and type of information needed in film descriptions to elicit consistent, predictable selection responses--responses similar to those made upon actual preview of a given set of films.
The experimental film description styles represent a compromise in design between the somewhat "simplistic" description styles commonly used in the marketplace (styles which typically provide two to four sentences of information), and other styles which supply a page or more of detailed, product evaluation information, investigated and described by Bond (1972a, 1972b), Gilkey (1962), and S. C. Johnson (1971).

Definitions

The following definitions are provided for key terms used to describe this study.

Abstract: A description of the unique nature and characteristics of an instructional material or set of materials, designed to aid selectors to make judgments about the use of the material for particular instructional situations, consistent with judgments made from actual review of the materials. An abstract would typically include from one-half to one page of single-spaced typed information.

Betterness: The attribute appraised when one (a) judges the degree to which a particular instructional material is perceived to be "better" than another, or (b) selects a particular item as being better than another from choices available, to fulfill the demands of a particular purpose. The betterness attribute is assumed to reflect any and all factors deemed to be significant to the selector, including nonqualitative factors such as cost, color, film length, etc.

Betterness judgment: The process of judging the "betterness" attribute. This process yields judgments referred to as "betterness judgments." For the study, betterness judgments measured in the form of ratings are referred to as "betterness ratings" or "betterness measures." The process of obtaining the judgments is referred to as the "betterness measure."

Conceptual model of film selection: A comprehensive description or systemic representation, a paradigm, of the film selection process. The particular conceptual model of film selection investigated is defined in terms of three criteria:

1.
The underlying assumptions of the model;

2. The measurement methods used to implement the model; and

3. Expected types and patterns of responses made by selectors involved in making "film quality," "relevance," and "betterness" judgments.

Evaluation: The process of "quality judgment." This process yields judgments referred to as "evaluations" or "quality judgments." The process may also involve "relevance judgments," if and when relevance is deemed to be a quality criterion. If the object, process, or activity being judged is compared to another, "betterness judgments" may also be involved in producing the evaluation. However, evaluation is viewed in the study primarily as the process by which the value of the thing judged is assessed against distinct, specifiable quality criteria rather than relevance criteria.

Film description: A verbal summary of the unique characteristics of a given film provided to aid selectors with the selection process. The verbal summary may also include indications of the film's quality. For this study, the term "film description" refers generically to the type of summary information provided in conventional film reviews, film evaluations, and film catalogs, for a given film. For the experimental phase of this study, however, the particular descriptions used as stimulus materials are referred to as the "experimental stimulus abstracts" or "experimental abstracts." For this study, two basic film description styles were investigated: abstracts containing instructional pertinence indicators (IPI's) and those containing both IPI's and film quality indicators (FQI's). (The term "film description" is also sometimes used in this study to refer to the process of describing films, i.e.
, as the "film description process.")

Film description style: The characteristic nature of a given type of film description, as defined by the kind of information elements or cues it contains, irrespective of their arrangement or format within the description.

Film evaluation: The process of judging the quality of an instructional film in terms of its intended purpose, intended target population of users, and related criteria or indicators of excellence. The process yields judgments referred to as "film evaluations" or "film quality judgments." The person or group of persons who make the judgments are referred to as "film evaluators."

Film selection: The process of selecting (accepting or rejecting) films for particular instructional purposes.

Film quality: A measure of the "excellence" exhibited by a film in relation to its intended purpose, intended target population of users, and related criteria or indicators of excellence.

Film quality indicators: Characteristics of films or other cues present in films or film descriptions which provide an indication of the excellence of a film. This includes for example, for films, characteristics such as the technical quality of the sound, narration script, and visual treatment; the validity and accuracy of the message; the clarity of purpose; and other factors described in Chapter II. For abstracts, indications of the aforementioned film characteristics would be examples, including overall ratings of excellence, critical appraisal remarks, awards of merit received, usage data about the effectiveness of the film with specific target audiences, and other factors described in Chapter II. (The abbreviation "FQI" refers to the term "film quality indicator" throughout the text of this study.)

Frame of reference: The context established against which a thing judged is related, when a judgment is made about it (e.g., the "instructional situation" specified for a particular relevance, quality, or betterness judgment).
Instructional film: A 16mm film produced for instructional or educational purposes, rather than other purposes (e.g., the type of film listed in the NICEM Index to 16mm Educational Films [National Information Center for Educational Media, 1973]). Also referred to as a "film" or "educational film."

Instructional pertinence: The attribute expressed by the degree to which an instructional material or set of materials is judged to be "logically related, pertinent" to a specified instructional situation, by a selector.

Instructional pertinence indicators: Characteristics of instructional materials or other cues present in the materials or descriptions of the materials, which assist selectors to judge the relevance of the materials for specific instructional situations. This includes for example, cues or characteristics regarding the subject matter nature, vocabulary-comprehension level, message slant, target population slant, or viewpoint expressed by an instructional film, and other factors discussed in Chapter II. (The abbreviation "IPI" refers to the term "instructional pertinence indicator" throughout the text of this study.)

Instructional situation: The particular purpose and set of circumstances for which instructional materials are chosen for use by selectors. The combination of situational variables which defines the "frame of reference" used for selection judgments made by selectors.

Judgment: The process of appraising the degree of relationship between an object, process, or activity judged, and a frame of reference. The judgment process yields appraisals referred to as "judgments."

Quality: The attribute which is a measure of the "excellence" demonstrated by an object, process, or activity judged. The attribute is judged in terms of criteria, the indicators of excellence, used as the frame of reference for judgments made.

Quality judgment: The process of judging the "quality" attribute.
This process yields appraisals referred to as "quality judgments." For this study, quality judgments measured in the form of ratings are referred to as "quality ratings" or "quality measures." The process of obtaining the rated judgments is referred to as the "quality measure."

Relevance: The attribute measured by the degree to which a given consideration is judged to be "logically related, pertinent" to another consideration. For relevance judgments made in the study, the terms "relevance" and "instructional pertinence" are synonymous. In more specific terms however, instructional pertinence is the particular "aspect of relevance" described by Saracevic (1970, pp. 120-21), defined for relevance judgments made by the experimental subjects. For this study, "relevance" is considered to be defined as "instructional pertinence."

Relevance judgment: The process of appraising the "relevance" attribute. This process yields appraisals referred to as "relevance judgments." For the study, relevance judgments measured in the form of ratings are referred to as "relevance ratings" or "relevance measures." The process of obtaining the rated judgments is referred to as the "relevance measure."

Selection: The process of selecting (accepting or rejecting) instructional materials for particular instructional purposes. Two basic selection processes exist: the selection of materials for materials collections and the selection of materials from materials collections.

Selection judgment: An appraisal or judgment made by a selector during the selection process. Three types of selection judgments were addressed by the study: relevance, quality, and betterness judgments.

Selector: The person who selects an instructional material or set of materials from available alternatives, for a given purpose. The "judge" involved in making materials selection judgments (e.g., a teacher, curriculum specialist, experimental subject, etc.).
This person is not necessarily the actual user of the materials however, as in the case when a curriculum specialist, film library director, or curriculum committee selects materials for use by others. Also, when the selector is involved in making quality judgments, he is simultaneously serving as an "evaluator."

General Assumptions of the Study

Assumptions follow about (a) basic film selection decision processes used by selectors, (b) the general film selection process, (c) the influence of film descriptions and relevance and quality information cues upon the selection process, and (d) the design and evaluation of film descriptions.

Film Selection Decision Models

This study assumes that two basic decision models tend to operate during the film selection process: a criterion-referenced "threshold" of acceptance decision process and a "forced-choice" decision process. The model of selection investigated by this study is a forced-choice decision model.

The threshold decision process implies that certain criteria must be met before a given film is selected as acceptable for use. When the criteria are not met, the film is rejected. The threshold decision model can operate when one or more films are considered for selection purposes. To illustrate, the threshold model operates when (a) a teacher considers using a film suggested by a colleague or some other information source or (b) when a film library selection committee is seeking films to add to a film collection. One or more films may be accepted for use when the "threshold" decision process is used.

The forced-choice decision process operates when two or more films are considered for a given selection purpose and only one can be selected for use, for whatever reason.
For example, the forced-choice decision process operates when one considers several films described in a film catalog, or films available from several sources, and there is enough time available to use only one film for a particular instructional purpose.

The forced-choice decision model defines film selection as the decision process which occurs when one film, of those considered, is ultimately judged as being "better" to use. The better film is selected, accepted, while the other film or films are rejected for the intended use.

The forced-choice model demands that one of several films be chosen as the most acceptable. In contrast, the threshold model makes no such demand of acceptability. If appropriate, any and all films considered can be accepted or rejected.

Other important assumptions of the study follow, listed by topic.

Relevance and Quality Information Cues

1. Two types of information cues conveyed by both films and film descriptions influence the nature of selection judgments made by selectors, namely, instructional pertinence indicators (IPI's) and film quality indicators (FQI's).

2. Film quality is a characteristic of films which can be validly and reliably measured and validly described or indicated in various ways within film descriptions.

3. Instructional pertinence is a film usage consideration which can be validly and reliably measured and validly described or indicated in various ways within film descriptions.

4. The fundamental literature base most likely to provide assistance in defining the nature and potential utility of specific IPI's and FQI's is that dealing with (a) the evaluation, selection, and description of films and other learning materials and (b) the concept of "relevance" associated with the field of information science.

The Film Selection Process

1. Two distinctly different film selection processes operate within the marketplace: selection for and from film collections.

2.
Film selectors select films for intended uses, for specific instructional purposes, situations, and target populations of learners, although selectors may not always consciously recognize this reality nor take it seriously.

3. When seeking films, the selector's basic goal or task is to choose films which are the most relevant and of the highest quality of those available for potential use for a particular purpose.

4. Numerous pairs of instructional films exist within the marketplace which (a) can be used potentially for the same or similar instructional purposes and (b) also vary significantly in terms of recognizable, measurable, film quality attributes.

5. Although various factors operate within the marketplace which tend to control the relevance and quality of films chosen for use by selectors, research and evaluation efforts are needed to find ways to help selectors to choose the most relevant and highest quality films available to them.

6. The selection of instructional films can be improved by improving the kinds of IPI's and FQI's which are supplied within film descriptions.

The Influence of Film Descriptions

1. When films are selected for use from film collections, the selections are often made on the basis of film descriptions which are read, rather than upon actual prior previewing of the corresponding films.

2. Both the type and quantity of information elements or cues supplied in film descriptions may affect the nature of selection judgments elicited from the descriptions.

3. Selectors who use film descriptions for selection purposes will usually accept, as credible and reliable, quality ratings or other indications of film quality provided within the descriptions, supplied by way of a credible film evaluator or group of evaluators.

The Design and Evaluation of Film Descriptions

1.
Because film selectors select films for specific uses, i.e., for distinct instructional purposes, situations, circumstances, curricula, and target populations of learners to be served, such factors can and should serve as key reference points, as "instructional anchor referents," for the design of effective film descriptions.

2. Three critical factors can be used to define an effective film description:

   a. It should contain information demonstrated to assist, to positively influence, the making of film selection decisions.

   b. It should contain an adequate array of both IPI's and FQI's.

   c. It should provide valid, reliable, representative information about important relevance and quality characteristics noted about films.

3. Film descriptions can be said to contain an adequate array of IPI's and FQI's, if and when they tend to elicit selection judgments which are similar to, and consistent with, those elicited from the corresponding films.

4. Film descriptions can be designed in ways to contain an adequate array of IPI's and FQI's.

5. The content validity and face validity of a given film description can be determined by assessing the nature of information contained within the description, on the basis of a priori criteria, which define the characteristics of the style or type of description prepared.

The Film Selection Model

1. The model is a valid, reliable model of the film selection decision-making process.

2. The actual validity and reliability of the model can be assessed by determining the degree to which critical assumptions of the model hold true, as demonstrated by evidence obtained from the experimental phase of this study and other studies.

3. The model can be used meaningfully and reliably to evaluate the actual effectiveness of different film description styles.

The Conceptual Model of Film Selection

The model is depicted in schematic form in Figures 1, 2, and 3.
As defined for this study, the model pertains primarily to the process of selection from film collections. In general, the model is also applicable to the process of selection for film collections, with a simple modification: change the term "instructional" to a more encompassing term such as "curricular" or "usage," as appropriate.

Figure 1 shows that forced-choice selection decisions are basically decisions to accept or reject a given film from those available for use for a particular instructional situation. The decisions occur after relevance, quality, and betterness judgments are made by selectors, as a result of a prior match/mismatch, compatibility assessment process. During the matching process, the characteristics of available films are compared with the characteristics and requirements of a given instructional situation, to determine the degree to which the films match the situation.

[Figure 1.--The forced-choice film selection process.]

Latent vs. Manifest Levels of the Model

Figure 2 depicts two distinct levels of the selection process which are important to distinguish, namely, the latent and manifest levels.1 The latent level deals with unobservable, abstract, cognitive-affective dimensions of the selection process, whereas the manifest level deals with observable, measurable aspects of the selection process.

Figure 2 shows that, at the latent level, selection decisions result from a cognitive-affective compatibility assessment process. During this process, the characteristic nature of available films and other media/method alternatives (when appropriate) are compared with perceived instructional needs and requirements.
The comparisons made focus upon instructional pertinence and film quality considerations. Subsequently, relevance and quality judgments are made, and in turn, betterness judgments and specific accept/reject selection decisions for each film considered.

Figure 2 shows that, at the manifest level, selection decisions are expressed in visible, observable terms. To illustrate, a given selector might indicate or state the following: "I'll rate film X as such"; or "I would prefer to use film Y because. . . ."; or "This particular item is more appropriate . . . , more useful . . . , would be better to use . . . , is a better film . . . , etc." Responses made by selectors at the manifest level result from perceptions elicited from specific films, descriptions of films, or other sources of information such as professional colleagues. Specific instructional situations are expressed in terms of specific "instructional anchor referents," including instructional intents (goals, purposes, missions, objectives, etc.), target group requirements (characteristics of specific learners, school environments, teachers, etc.), and other pertinent factors.

[Figure 2.--Latent and manifest levels of the forced-choice film selection process.]
The film selection process used by the experimental subjects who participated in this study is depicted in Figure 3. The figure illustrates that selection judgments and decisions made by the subjects were expressed in the form of relevance, betterness, and quality ratings. The ratings were assumed to be accurate, reliable reflections of selection judgments made by the experimental subjects. Subjects assessed the degree of match/mismatch between (a) characteristics of the experimental films noted from reading either type I, II, or III abstracts of the films, or from previewing the films, and (b) characteristics and requirements of hypothetical instructional situations specified in the data-collection instruments prepared as stimulus materials for the subjects. Copies of the instruments, the Abstract Questionnaire and the Film Questionnaire, are provided in Appendices B1 and B2, respectively. Copies of the type I, II, and III experimental abstracts are provided in Appendix C.

[Figure 3.--The forced-choice film selection process used by the experimental subjects.]

General Assumptions of the Model

The following are important general assumptions of the model:

1. In essence, forced-choice selection from film collections is:

   a.
basically a function of the interaction of two related but unique user perceptions, instructional pertinence and film quality.

   b. an interactive function of the influence of instructional pertinence indicators (IPI's) and film quality indicators (FQI's) perceived by selectors.

   c. a function which can be manifested in terms of relevance, film quality, and betterness judgments made by selectors.

2. The attributes "relevance," "film quality," and "betterness" are unique, measurable variables.

3. Relevance, film quality, and betterness judgments are value judgments, perceptions unique to a given film selector.

4. The most important, meaningful IPI's and FQI's perceived by selectors tend to influence the selection process.

5. Generally speaking, IPI's will tend to influence relevance judgments, and FQI's quality judgments.

6. Relevance and quality judgments are independently made. Generally speaking, the perceived quality of a film will not tend to influence the degree to which the film is perceived to be relevant to a given instructional situation. Likewise, the perceived relevance of a film to a given instructional situation will not tend to influence the perceived quality of the film.

7. Selectors tend to choose films for instructional purposes according to five basic selection principles:

Principle A: In general, selectors tend to choose as better those films perceived to be higher in relevance to an intended use and higher in quality, when provided with choices perceived to be significantly different in quality and relevance to an intended use.

Principle B: When selectors are confronted with pairs of films perceived to be significantly different in relevance to an intended use, the film deemed to be higher in relevance will tend to be selected as better to use regardless of the perceived quality of the two films.

Principle C: When pairs of films are perceived to be relatively equivalent in relevance to an intended use, and relatively equivalent in quality as well, selectors will tend to indicate that neither film is better to use for the intended situation.

Principle D: When pairs of films are judged to be relatively equivalent in relevance to an intended use, but significantly different in quality, the film judged to be higher in quality will tend to be selected as being better to use.

Principle E: The strength of a betterness decision is a direct function of the degree to which a selector perceives that the films compared are different in relevance to the intended usage situation and different in quality.

8. The use of FQI's within film descriptions, including overall ratings of film quality, can affect the nature of selection judgments elicited from the descriptions. In particular:

a. The presence of FQI's can significantly influence "quality" and "betterness" judgments,

b. The use of FQI's will not significantly influence "relevance" judgments, and

c. The lack of FQI's can significantly influence "quality" and "betterness" judgments.

9. In general, film descriptions can be classified meaningfully into three basic types:

Type I: Descriptions containing IPI's, but no FQI's.

Type II: Descriptions containing IPI's and valid FQI's.

Type III: Descriptions containing IPI's and invalid FQI's.

10. Film descriptions will tend to elicit selection judgments according to the following four principles:

Principle One: Type I descriptions will tend to elicit both betterness and quality judgments which are not similar to those elicited from (a) the corresponding films and (b) type II descriptions.

Principle Two: Type II descriptions will tend to elicit relevance, quality, and betterness judgments which are similar to those elicited from the corresponding films, if the descriptions contain an adequate array of IPI's and FQI's.

Principle Three: Type III descriptions will tend to elicit betterness and quality judgments which are not similar to those elicited from (a) the corresponding films and (b) type II descriptions.

Principle Four: Film descriptions containing an adequate array of IPI's will tend to elicit relevance judgments which are similar to those elicited from the corresponding films, regardless of the presence or lack of FQI's within the descriptions.

Hypotheses

Five hypotheses were formally tested during the experimental phase of the study.

Hypothesis One: The mean magnitude of quality ratings elicited from the experimental abstract style lacking the overall ratings of film quality will be significantly different than that elicited from the experimental films.

Hypothesis Two: The use of overall ratings of film quality within the experimental abstracts will not significantly alter the mean magnitude of relevance ratings elicited from the abstracts.

Hypothesis Three: The use of overall ratings of film quality within the experimental abstracts will significantly alter the mean magnitude of betterness ratings elicited from the abstracts.

Hypothesis Four: The use of overall ratings of film quality within the experimental abstracts will not significantly alter the rank order relationship of mean relevance ratings elicited from the abstracts.

Hypothesis Five: The use of overall ratings of film quality within the experimental abstracts will significantly alter the rank order relationship of mean relevance, quality, and betterness ratings, when combined, elicited from the abstracts.

General Scope and Limitations of the Study

Related considerations were partially described in previous sections of this chapter. However, some additional comments about the scope and limitations of the experimental phase of the study are in order.

The experimental phase of the study was not based upon a tried, proven theoretical foundation or measurement model.
Rather, it was based upon a previously untested modification of a measurement method described and used by Vinsonhaler (1966, pp. 1-11)2 for assessing the predictive validity of document abstracts. The modification was designed to incorporate measurement techniques appropriate to the nature of the model of film selection investigated.

Other than Vinsonhaler's work, few studies or guidelines were found which described the use of behavioral measures for evaluating the comparative merits of different types of audio-visual media descriptions. Hence, the selection of appropriate measures and measurement methods to use for this study was somewhat precedential.

Several measurement techniques were incorporated into the model of film selection investigated. The model of selection was based upon the use of both objective and subjective measurement methods. Commonly used statistical analysis methods and indices were employed, namely, frequency distributions, Spearman rank correlations, and the analysis of variance technique.

The degree to which the measurement methods used by this study reflected mutually supportive experimental results and conclusions was assumed to provide a useful check of the validity and reliability of the model. This study assumed that if the model was or was not a valid, reliable one, the critical assumptions underlying the measurement methods employed by the model would tend or not tend, respectively, to be verified, confirmed, by the experimental data.

The experimental data were analyzed and interpreted, therefore, for two essential purposes: first, to determine whether or not the experimental hypotheses were confirmed or rejected, and second, to determine the degree to which critical assumptions of the model were verified or not verified.

The model was based upon numerous assumptions described in this chapter and Chapter III.
Hence, verification of the validity and reliability of the model was limited in this study to tests of only the most critical assumptions of the model.

The Significance of the Study

The significance of this study lies within the theoretical and practical contributions made.

Theoretical Contributions

Four theoretical contributions were made. First, the study defined and investigated the film selection process in terms of both relevance and product quality theory. The study is one of few which has done so. Second, the study defined and field-tested a conceptual model of film selection which illustrates how descriptions are actually used in the making of selection judgments. Third, the study provided evidence of the validity, reliability, and utility of the model investigated. Fourth, the study provided a theoretical foundation, a conceptual framework, useful for continuing research into (a) factors which influence the film selection process and (b) evaluation of the effectiveness of different styles or types of film descriptions.

The theoretical significance of the study lies not so much in providing ultimate answers to questions and hypotheses specified, but rather in identifying and clarifying, on the basis of experimental evidence, decision points at which greater critical thinking needs to be pursued.

The study builds upon the experimental efforts of Gilkey (1962) and S. C. Johnson (1971), who found that selectors prefer film descriptions which critically appraise the merit of the product described, i.e., the "quality" of the product.

The study also builds upon a study by Latzke (1971) and the general literature dealing with the evaluation-selection of films and other learning materials. Latzke's study and the general literature suggest the importance of providing information within film descriptions about the "quality" of the films described, to aid the selection process.
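
As an aside on the model itself, the five selection principles (A through E) listed among the general assumptions can be illustrated with a minimal sketch. The sketch is not part of the study: the rating scale, the threshold standing in for a "significantly different" perception, and the additive strength measure are all assumptions made for illustration only, and the names Film and better_film are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Film:
    name: str
    relevance: float  # perceived relevance rating (hypothetical scale)
    quality: float    # perceived quality rating (hypothetical scale)


def better_film(a, b, threshold=1.0):
    """Return (name of the film judged better, or None; decision strength).

    `threshold` is an assumed stand-in for the point at which two ratings
    are perceived as significantly different; the study fixes no such value.
    """
    rel_diff = a.relevance - b.relevance
    qual_diff = a.quality - b.quality

    if abs(rel_diff) >= threshold:
        # Principles A and B: a perceived relevance difference decides,
        # regardless of the perceived quality of the two films.
        choice = a if rel_diff > 0 else b
    elif abs(qual_diff) >= threshold:
        # Principle D: equivalent relevance, so quality decides.
        choice = a if qual_diff > 0 else b
    else:
        # Principle C: equivalent on both counts, so neither is better.
        choice = None

    # Principle E: strength grows with both perceived differences.
    # The simple sum used here is an illustrative assumption.
    strength = abs(rel_diff) + abs(qual_diff)
    return (choice.name if choice else None), strength
```

For example, under these assumptions a film rated higher in relevance but lower in quality is still returned as the better choice (Principle B), whereas two films rated within the threshold of each other on both attributes yield no preference (Principle C).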

This study was founded upon a theoretical stance distinctly different from that expressed traditionally within the literature. Traditionally, an effective film description has been defined predominantly in terms of subjectively determined factors and characteristics suggested from expert opinion and user preference studies. In contrast, however, this study defined an effective film description in objective, behavioral terms as one which elicits selection decisions and judgment patterns which are basically similar to those elicited by the film portrayed by the description.

The comprehensive work of Saracevic (1970b, pp. 111-51),3 who reviewed in 1971 the body of research and literature on relevance judgment and its measurement, substantiated the need for further definition and clarification of the concept of "relevance" and the need for related experimental studies. Saracevic's review and others by Cuadra (1966-73) and Penland (1972, pp. 488-543) indicated the general lack of relevance research done to date, involving non-print media.

This study is unique in that it investigated and defined film selection in terms of the simultaneous interplay of three perceptions, "instructional pertinence," "film quality," and "betterness," perceptions which are basically "aspects of relevance," as defined by Saracevic (1970b, pp. 120-21).

Practical Contributions

This study addressed a very practical matter, the design and evaluation of film descriptions made available within the marketplace. The study is of practical value to media information researchers, media selection researchers, and persons involved in the preparation of descriptions of films and other instructional materials.

The study addressed a fundamental, underlying question of both theoretical and practical significance: "What constitutes a 'good,' 'effective' film description?" In response to this question, the study provided the following:

1.
A comprehensive list of the basic types of relevance and film quality cues which can be included in film descriptions;

2. Evidence of the impact upon the materials selection process of supplying overall ratings of excellence as film quality cues within film descriptions;

3. Evidence of the adequacy vs. inadequacy of three specific film description styles--the type I, II, and III film abstract styles--defined for the study; and

4. Evidence of the effectiveness and limitations of the measurement methods and procedures used by the study to evaluate the effectiveness of the experimental abstracts.

Organization of the Study

This study is described in four subsequent chapters, summarized as follows:

Chapter II: Review of Related Research and Literature summarizes the literature and research pertinent to the purposes and objectives of the study. The chapter closes with listings of assumptions and conclusions drawn from the review about the nature, influence, and types of relevance and film quality information cues which can be supplied potentially in film descriptions. Thereafter, a critique of the state-of-the-art of the related research and literature is provided, including recommendations for further research.

Chapter III: The Experimental Design and Methodology describes the purpose of the experimental phase of the study; the measurement methods employed by the model of film selection investigated, the assumptions upon which they were based, and limitations of the model; and application of the model during the experimental phase of the study. The application is described in terms of the particular experimental design and measurement methods used and related critical assumptions.

Chapter IV: The Experimental Findings and Results objectively summarizes the results of the experimental phase of the study.
It describes the results of statistical analyses in support or rejection of the experimental hypotheses, results pertinent to critical assumptions underlying the study, results pertaining to the three film description styles which were evaluated, and the reliability of the experimental data.

Chapter V: Summary, Conclusions, Implications, and Recommendations summarizes the purposes, objectives, methods, and results of the study. Conclusions drawn from the experimental phase of the study are provided, as well as a discussion of the experimental results. The chapter closes with a discussion of the implications of the study, and offers recommendations for further research.

The Appendices which follow contain tables and figures too cumbersome to include within the body of the study, and examples of stimulus materials and data-collection forms used during the experimental phase of the study.

Footnotes--Chapter I

1. The notion of latent vs. manifest levels of the model was derived from Cook's (1971) predictive model of relevance decision making (pp. 34-56).

2. The "Predictive Validity as Relevance Prediction: Design IV" method discussed by this document (pp. 9-10) was used as the basis for the experimental design of the study.

3. The basic content of this article is also contained in the literature review section of Saracevic's unpublished doctoral dissertation, "On the Concept of Relevance in Information Science," Case Western Reserve University, 1970a.

CHAPTER II

REVIEW OF RELATED RESEARCH AND LITERATURE

Introduction

Purpose of the Review

The primary purpose of this review was three-fold:

1. To identify and summarize documents useful in defining the theoretical and conceptual foundations of the study.

2. To identify the types of relevance and quality information cues which should be considered for inclusion within film descriptions; instructional pertinence indicators (IPI's) and film quality indicators (FQI's) worthy of additional research and investigation.

3.
To provide a brief critique of the state-of-the-art of the literature in terms of development of a behavioral theory of film description design, a theory based upon the influence of relevance and quality information cues on the film-selection process.

Objectives of the Review

The literature was reviewed, specifically, to identify writings which:

1. Suggested factors that can influence film quality and relevance judgments made by selectors.

2. Described the importance of film descriptions and the information cues provided by them.

3. Described the kinds of film quality and relevance information cues present in film descriptions.

4. Illustrated the variety of types of film evaluation-selection criteria considered to be important, meaningful, to different groups of film evaluators-selectors--criteria serving as the origin of film quality and relevance cues perceived by selectors.

5. Illustrated the variety of sources and viewpoints from which relevance and quality information cues can be drawn.

The critique was provided to suggest recommendations for research and evaluation efforts needed to develop a behavioral theory of film description design.

Organization of the Review

The following topics serve as major organizational headings for this chapter: (a) factors which influence the making of film quality judgments; (b) evaluation-selection criteria associated with different views of film quality; (c) factors which influence relevance judgments; (d) quality and relevance information cues present in existing film descriptions; (e) general conclusions drawn from the review; (f) assumptions about instructional pertinence indicators and film quality indicators derived from the review; (g) instructional pertinence indicators and film quality indicators suggested for inclusion within film descriptions; and (h) the critique.

Factors Which Influence Film Quality Judgments

Film Quality: Basic Views

The basic view of film quality held by film selectors was the first major factor suggested by the review of related research and literature as influencing film quality judgments. Four distinct views of film quality dominated discussions in the literature: the "inherent attributes," "technical quality," "instructional design," and "effects assessment" views. These basic views provide the conceptual foundation for the list of IPI's and FQI's described in Chapter II.

Inherent attributes.--This traditional view of film quality defined the value of instructional films in terms of a priori evaluation-selection criteria. The criteria defined or implied recognizable film characteristics which serve as "indicators of excellence" when making film quality judgments.

The "inherent attributes" view of film quality stressed use of subjective judgment and interpretation processes and traditional film evaluation-selection criteria for determining what constitutes a "good" film, i.e., a film which is high or higher in quality in contrast to one which is low or lower in quality. The view was reflected by many writers who recommended, overall, a broad range of criteria to use for assessing film quality.

Technical quality.--In general, the technical quality view referred to a distinct subset of "inherent attributes" or factors related to the professional caliber of the "audio" and "video" make-up of films, namely, technical elements under the control of the producer and production team. The "technical quality" view of film quality was generally associated with film evaluation activities of media production specialists or others familiar with film production technique.

Instructional design.--This view of film quality assessment stressed attention upon film design elements, i.e., specific characteristics or attributes of films demonstrated or assumed to influence learning from films.

The "instructional design" view basically defined a high-quality film as one designed and structured in ways which result in learning consistent with the intended purpose of the film. The fundamental, critical question asked by proponents of the view was, "What film design characteristics help or hinder persons in learning from films, as demonstrated by empirical research or field test-usage evidence?"

Although "instructional design" characteristics, like "technical quality" characteristics, were found to be recognizable "inherent attributes" of films, each of these types of characteristics is treated in discussions to follow, as a separate entity.

Effects assessment.--This view of film quality was characterized by a concern for the impact of films upon viewers of the films, rather than by a concern for film attributes or characteristics per se. According to the "effects assessment" view, a "good" film produces intended, desired responses or behavior changes within film viewers.

Film Quality Appraisal Methods

The particular method used in evaluating films was the second major factor suggested by the review of related literature as influencing film quality judgments. Methods of appraisal were found to be important because they determine the kind of information which is produced and communicated to others in film descriptions, regarding "evaluations" which have been made about a given film.

Documents reviewed here deal primarily with film evaluation methodology considerations which influence the type, validity, credibility, and reliability of IPI's and FQI's that are eventually supplied in film descriptions. The documents deal with the following topics: basic types of evaluation methods, characteristics of film evaluation forms and instruments, and types of response modes used to describe judgments made by film evaluators.

Carpenter (1969) reported that methods used to assess the quality of instructional materials were one of nine critical factors which influence judgments made about the quality of the materials. Carpenter's study concluded that assessment of the quality of instructional materials cannot be limited to just the appraisal of inherent characteristics or attributes of the materials. Rather, the appraisal must also include assessment of the context within which the materials are produced, used, and evaluated. Carpenter's study provided one of the most comprehensive, systematic discussions found in the literature of factors assumed to influence quality judgments made about instructional materials (pp. 1-41).

According to S. C. Johnson (1971), three basic methods of educational appraisal have been used traditionally to assess the quality of instructional films: (a) the "personal judgment and assessment" method, founded upon subjective opinions and interpretations about film characteristics and attributes which define the value of a given film; (b) the "experimental" method, based upon use of classical, empirical research investigation methods to determine the effectiveness or impact of a given film; and (c) the "psychometric validation" method based upon efforts to validate the effectiveness of films for specific instructional purposes and target populations of learners, through the collection and use of learner performance data (pp. 36-38). Analysis of Johnson's literature review revealed that film evaluation efforts associated with these three assessment traditions have tended to merge together in many instances, with newer forms and methods of appraisal, e.g., those based upon use of preference ratings, opinion surveys, behavioral task analyses, subject matter content analysis, and classroom interaction analyses (pp. 22-28, 36-48).

Hoban's classic study (1942) investigated the reliability and internal consistency of several methods of evaluating films.
Methods studied by Hoban included, among others, those based upon use of evaluation forms; judgments made by panels of experts of the potential value and effectiveness of films; judgments made by students and teachers of the actual value and effectiveness of the films; teacher, student, and parent interviews; stenographic records of classroom discussions; and anecdotal records of student behavior.

In 1955, two noteworthy documents were produced by the Instructional Film Research Program at Pennsylvania State University. Edwards' report described the conceptual and procedural basis for a film evaluation method involving statistical analysis of rank order responses obtained from panels of film evaluators (pp. 1-26). Greenhill's report described the development and empirical validation of a rating instrument for the evaluation of training films (pp. 1-68).

Through the years a variety of evaluation forms and instruments have been devised to enhance the precision and objectivity of the film evaluation process. Guss (1957) analyzed and noted characteristics of film evaluation forms used prior to 1950. Guss found that, in general, they: (a) identified two basic types of information about films--"facts about films" and "judgments of their value"; (b) were designed in recognition of different levels of experience, training, and interests of audiences which view films; (c) varied in emphasis in terms of factors evaluated; and (d) were similar in that the purpose of the film was the most important evaluation factor considered (p. 49). Film evaluation forms and instruments described in the literature since Guss's study have exhibited similar characteristics.

The historical development and characteristics of film evaluation forms and instruments were reviewed by Guss (1952), Gilkey (1962), Latzke (1971), and S. C. Johnson (1971).
Analysis of reviews provided by these authors revealed that most forms and instruments developed prior to 1970 solicited, in various combinations, three basic types of responses: (a) indications of filmic characteristics selected from a checklist of factors, (b) ratings of criteria or factors described in terms of scaled values, and (c) responses elicited from questions designed to yield subjective opinions and comments about specific evaluation factors.

In 1969, EPIE reported its assessment of the state-of-the-art of evaluation forms and practices commonly used by elementary and secondary schools nationwide, to assess the quality of instructional materials, including films. The report criticized the general nature of conventional evaluation-selection practices, in that they were too often characterized by narrowness of viewpoints and perspectives, resulting in lack of attention to important evaluation-selection factors, and limitations of use of both objective and subjective judgment techniques (pp. 4, 94-99). Specific criticisms directed toward the use of evaluation forms were these: the rare use of student input; lack of use of questions targeted upon precise evaluation factors; lack of mention of the relationship of learner characteristics to use of materials evaluated, other than to reading level considerations; and lack of indication that materials would likely benefit or not benefit specific types of students, especially ethnic groups, socioeconomic groups, or other students with special interests or needs (pp. 94-99).

Film Quality Evaluators-Selectors

The type of persons involved in the evaluation-selection process was the third major factor suggested by the review of related literature as influencing film quality judgments.
This factor was found to be important because persons with different backgrounds, training, and experience reflect different viewpoints and priorities about (a) the film quality assessment process in general and (b) the evaluation-selection criteria base used during the process in particular. This factor was also found to be important, because it can influence the credibility and perceived value of information provided to others in film descriptions, about a given film evaluation which has been made.

Baird (1973a) found that persons closest to instruction, i.e., teachers, instructors, and faculty members, were most often recommended and actually found to play the primary role in film evaluation-selection activities. Other persons recommended as film selectors were subject matter consultants and department heads (pp. 22-23). Persons found to actually be involved in the selection process were: media specialists, subject specialists, department heads, media program administrators, school principals, film library directors, and librarians (pp. 24-26).

Greenhill (1955, pp. 22-23), EPIE (1969, pp. 94-99), and Carpenter and Froke (1968, p. 29) expressed concerns about the need to improve the validity and reliability of film quality assessments made by different types of evaluators-selectors. They suggested several avenues by which to achieve the task: (a) improvement of instruments designed to aid the evaluation-selection process and (b) improvement of the judgmental skills of the evaluators-selectors through appropriate training activities.

Greenhill (1955) stressed the need for training film evaluators in the interpretation and use of evaluation instruments.
In particular, he emphasized that evaluators must (a) clearly understand the objectives specified for films to be evaluated, the characteristics of the target audience, the conditions under which the films are to be used, and the exact interpretation of each criterion judged; and (b) judge each criterion independently (pp. 22-23).

Film Evaluation-Selection Criteria

The particular type and kind of appraisal criteria considered during the film evaluation-selection process was the fourth major factor suggested by the review of related literature as influencing film quality judgments.

A broad range of criteria was suggested for film evaluation-selection purposes. The specific types and kinds of criteria recommended, categorized by the view of film quality assessment with which they were associated, are discussed in detail in forthcoming sections of this chapter.

Evaluation-Selection Criteria Associated With Different Views of Film Quality

Information about criteria of significance to the design, evaluation, and selection of films was assumed by this study to be important to include in film descriptions. The study assumed that appropriate expressions of such criteria within film descriptions assist selectors to determine the degree to which films are "relevant" to and of sufficient "quality" for a given instructional purpose or situation.

Documents reviewed in this section describe (a) representative types, categories, and kinds of criteria recommended from formal studies and investigations or from persons or groups with extensive experience in the materials evaluation-selection process; or (b) unique criteria not commonly considered by others. The documents are grouped into four basic categories, according to the degree to which they dominantly display viewpoints related to the film evaluation, film selection, instructional materials evaluation, or instructional materials selection processes.

The Inherent Attributes View of Film Quality: Associated Evaluation-Selection Criteria

Associated criteria were suggested from both the film evaluation-selection literature and the general materials evaluation-selection literature.

Film evaluation criteria.--Studies by Guss (1952, pp. 34-35, 309-11) and Baird (1973b, p. 5)1 offered useful summaries of film evaluation criteria commonly suggested by proponents of the inherent attributes view of film quality.

After analyzing 50 lists of film evaluation criteria, Guss (1952) reported that criteria such as these were commonly recommended ones: accuracy, authenticity and scholarship; social significance; clearly definable teaching purposes; unity; technical excellence; and general effectiveness. Evaluation criteria not commonly accepted were these: the manner of organizing content, the use of still picture techniques, the scope and depth of coverage of subject matter treated, and aesthetic considerations (pp. 34-35).

After surveying 12 university and film college libraries, Guss (1952) found these criteria to be generally accepted ones:

1. Psychological factors: For example, whether the film invited audience participation, identification, and ego-involvement; depicted well-supported main ideas; provided incentives; or stimulated interest.

2. Technical factors: Such as sufficient photographic quality, intelligibility of the sound, adequacy of orientational devices used, appropriate pictorial representation, and effective sound accompaniment.

3. Content factors: The adequacy and appeal of the subject matter treatment; the accuracy and appropriateness of representations portrayed; lack of serious omissions, misconceptions, and stereotypes; and the range of content and nature of examples used, especially in relation to the intended audience level and purpose of the film.

4.
General factors: For instance, unity and wholeness achieved through use of complementary and supplementary component parts and the significance of the film's purpose (pp. 309-11).

In a more recent study of film evaluation-selection criteria used by large university film libraries, Baird (1973b) identified 21 commonly recommended film evaluation criteria, which he classified into six general groups, listed below. In general, all of the criteria were rated as being "important" to "very important" by directors of the 128 libraries surveyed (p. 5).

1. Curriculum Criteria
   a. Appropriateness for grade specified
   b. Correlation with specific curriculum programs

2. Affective Criteria
   a. General overall effect
   b. Motivational quality and interest
   c. Aesthetic value

3. Content Criteria
   a. Datedness in styles, procedures, etc.
   b. Scope or coverage
   c. Appropriate emphasis of ideas
   d. Unity of the parts (wholeness, continuity, etc.)
   e. Order of presenting ideas, concepts, etc.

4. Technical Criteria
   a. Production date (datedness)
   b. Overall technical quality
   c. Color vs. black and white

5. Filmic Criteria
   a. Appropriate use of the film medium
   b. Pacing (presentation rate)
   c. Creative film making
   d. Appropriate orienting devices illustrating size and space relationships

6. Utilization Criteria
   a. Clear objectives
   b. Purpose of film (basic, enrichment, introductory, etc.)
   c. Learning approach (inductive, deductive, etc.)
   d. Type of film (documentary, demonstration, dramatization, etc.)

In the Educational Film Library Association (EFLA) manual for film evaluators, Jones' (1967) description of a "good" film included emotional quality factors such as the ability to "evoke a response"; to "leave the audience wanting to see more"; to create lasting, vivid expressions; and to not be offensive in any way (pp. 7-9). A list of 61 critical questions was also provided in the EFLA manual by Limbacher (in Jones, 1967, pp.
18-19), categorized in terms of their relationship to six considerations: subject matter content, psychological impact, artistic-technical values, social and ethical values, entertainment values, and audience reactions. Three general criteria specified for development by EFLA evaluations were "how" the film can be used, "by whom," and "for what purpose" ("Brief Guide," 1967, p. 1).

Landers (n.d.) emphasized that "the basic criterion must always be the need of the institution for the subject covered; does it fit a specific requirement of the curriculum?" (p. 1). Other criteria offered by Landers concerned factors such as the credibility of the production; "slanting," which may result in misconceptions; the use of visuals well-coordinated with the script; the suitability of vocabulary level and tone of presentation; the digestibility of information, i.e., not too little or too much; the essentialness of information provided; and appropriateness of the film to the background of the audience (p. 1).

Some noteworthy criteria of significance to film evaluation in general, but not commonly mentioned, were offered by Sherman, Johnson, and Payne. For instance, the "authority figure" represented by the film narration was suggested by Sherman (1968) and others as a significant evaluation criterion (p. 9). Johnson (1971) suggested that two types of instructional purposes can be distinguished and used as evaluation criteria: teaching purposes and learning purposes. Teaching purposes dealt with possible instructor uses of a film; learning purposes with behavioral objectives which specified tasks a student should be able to accomplish as a result of using the film (p. 43). Payne (1952), who evaluated physical education films for girls, noted three criteria of broader potential use: the type of film (instructional, motivational, etc.), level of skill demonstrated, and usefulness with males or females or both (p. 156).
Several authors focused attention upon criteria pertinent to the film evaluation needs of specific subject matter disciplines and special interest target groups: for example, the National Coordinating Council on Drug Abuse (1970, 1971), drug education films; Friedman (1961) and Smith (1958), science films; and Shetler (1961) and Cockrum (1955), music education films.

Film selection criteria.--Baird's study (1973b) made several noteworthy contributions. First, his study stressed the distinction between film evaluation criteria and film selection criteria. Baird demonstrated that criteria used in the evaluation of films are not necessarily the same as those used in the final selection of films. Some criteria were shown to serve distinct roles as either selection criteria or evaluation criteria, while others serve dual roles as both evaluation and selection criteria (pp. 2-10).

Second, Baird's study noted the distinction between selection for and from film library collections. Baird concluded that:

Evaluation criteria used by teachers to select films for classroom use are (and perhaps should be) different than the criteria used by a film library staff to select films for inclusion into the library (1973a, p. 31).

Third, Baird's study defined the film evaluation-selection process in terms of four separate essential steps:

1. Identification of films available for potential use,
2. Determination of those to be previewed for formal evaluation purposes,
3. Evaluation of films selected for previewing, and
4. Final selection of those to be obtained (1973b, pp. 2, 7, 10).

Fourth, Baird's study showed that some film quality assessment criteria were considered to be more important than others at each of the above successive steps (1973b, pp. 2-7). Baird found that film quality ratings supplied by published reviews (EFLA, Landers, etc.) were not considered to be very important for determination purposes, i.e., for selecting films for previewing purposes (1973b, pp.
4, 7). Additionally, Baird found that evaluation ratings supplied by potential faculty users, datedness, and grade level were deemed to be important final selection criteria, among others.

It is interesting to note, however, that film quality ratings supplied by other persons or groups were shown to be of only mediocre to low importance for final selection decision-making purposes. Ratings of film library directors and composite ratings of evaluation committees were shown to be mediocre in importance, and student ratings and published ratings, low in importance (1973b, pp. 6, 7).

Sherman's 1958 study dealing with the evaluation and selection of films used by elementary school teachers concluded that: "There seems to be no single criterion that will of itself guarantee acceptance or rejection of a film. Different reasons will determine the [final selection] recommendation of different films" (p. 115). Sherman found that the underlying reasons why films were usually rejected for use, in decreasing order of frequency, included: (a) too difficult or advanced for the grade level, (b) not relevant to the curriculum, (c) content not well organized, (d) faulty reception or technical quality of the film, (e) too many concepts, (f) uninteresting, (h) undesirable emotional effect (1958, p. 113).

"Appropriate use of the medium" was often suggested as a film evaluation-selection criterion. Wittich and Schuler (1967) described two unique, related instructional contributions which have been made traditionally by films. Films have:

1. Provided learning experiences in vicarious form which are basically "inaccessible," i.e., too difficult, too time consuming, too costly, or impossible to experience directly; and

2. Simplified and clarified instructional content to be learned through use of various audiovisual cueing and photographic techniques; accompanying narrative, dialogue, and sound effects; and subjective camera angles.
Photographic techniques mentioned ran the range from direct life-size photography to close-up, microscopic, telephoto, x-ray, infrared, time-lapse, slow-motion, animation, "behind-the-scenes," and "candid camera" techniques (pp. 410-19).

Boucher (1971) described the contributions of the film medium in terms of the unique advantage offered by films for training purposes. Boucher noted that when selecting films as a medium of instruction, one should consider that films can be used to overcome intellectual or reading barriers, capture the continuity of real action, provide a "front seat" view for demonstrations, be used for testing purposes, facilitate the learning of tasks and procedures, and serve as a model for guided mental practice (pp. 33-34).

Harrison's text (1973) on film library administration expressed a view of film selection closely related to that of this study. Selection for film library collections was suggested by Harrison as being concerned primarily with two endeavors. The first and most important was the selection of a balanced collection of films relevant to the interests and foreseeable demands of clientele to be served; the second, the selection of films of adequate quality (p. 264).

Instructional materials evaluation criteria.--Published by the Association for Educational Communications and Technology, Selecting Media for Learning (1974) contained two pertinent articles. The articles by Rosenberg and Anderson listed numerous factors to be considered in assessing the sensitivity with which minority ethnic groups are treated in instructional materials.

A study by Steiner (1972) documented the failure "of much of the materials used in classrooms to be relevant and interesting to students, especially those from minority backgrounds" (p. 2090-A).
The study described the theoretical background for development and use of "micro-relevant" materials, materials based upon notions of visual literacy, and "things specifically related to the child's own immediate environment and milieu" (p. 2090-A).

Van Etten (1969) voiced the need for greater attention to use of product descriptions derived from the evaluation process. Van Etten noted that too much emphasis has been placed upon "evaluation" and not enough on "analysis and description" of the content and intended uses of the material (pp. 1-15).

Instructional materials selection criteria.--Goff (1970) noted that the selection of materials is justified only when the materials demonstrate "curricular validity" and "content validity" (pp. 41-42).

Brown and Norberg (1965) proposed a materials evaluation-selection scheme in which selection was based upon three factors: the educational goals of a given institution, the expected contributions related to instructional objectives, and standards used to assist the making of discriminative judgments when comparing alternative products (pp. 72-74).

The Joint Committee of the National Education Association and the Association of American Publishers (1972) reported that the instructional materials selection process is evolving, changing in emphasis and needs. Their publication emphasized the need to place greater attention upon selection criteria which reflect the multi-ethnic backgrounds and interests of today's pupils, students, and teachers (pp. 1-60).

The Technical Quality View of Film Quality: Associated Evaluation-Selection Criteria

Associated criteria were suggested primarily from the film evaluation literature.

Film evaluation criteria.--Overall, technical quality discussions generally considered the appropriateness of use of audio and visual aspects of the film medium, under control of the film designer and production crew.
Related discussions were summarized in the literature by Jones and Limbacher (in Jones, 1967, pp. 7, 17, 18, 26, 27) and Guss (1952, pp. 309-11).

Visual technical quality considerations generally referred to factors such as the clarity of focus; color and contrast control; color vs. black/white factors; proper film exposure; the caliber of photographic composition, camera work, acting, and editing techniques; and the nature of settings used.

Audio considerations were usually concerned with the clarity, intelligibility, pacing, and synchronization of the sound; the nature, tone, and style of the dialogue and narration; and the appropriateness of background music and sound effects used.

The Instructional Design View of Film Quality: Associated Evaluation-Selection Criteria

During the last decade or so, the term "instructional design" has conveyed various meanings in the literature, depending upon the context of its use. For this study, however, the term "instructional design" was associated with "instructional design characteristics"--characteristics defined as those recognizable attributes of films and other learning materials which influence learning, as demonstrated by empirical research or field test-usage evidence.

The related research and literature suggested that four basic categories of instructional design characteristics can be defined for instructional films, namely, characteristics related to these aspects of film making and film use: (a) learning purposes and objectives, (b) learner vocabulary levels, (c) learner comprehension levels, and (d) learner-audience involvement considerations. Documents reviewed in this section describe film characteristics commonly mentioned as evaluation-selection criteria in the general literature, characteristics which fall under the four categories above.

Related research publications were reviewed by several authors: Schmidt, Miller, Hsia, Popham, Torkelson, Briggs, Saettler, and Edling.
A summary of instructional design characteristics and criteria suggested by these authors and others follows.

Film evaluation criteria.--Miller (1970) identified nine pre-production design elements, including four primary and five secondary elements, of significance to the design of films produced for elementary school use. The primary elements were concerned with use of student participation, knowledge of results, redundancy, and attention-directing techniques. The secondary elements stressed the use of introductory information, organizational outlines, and reviews; the readability of the film commentary; and color as a discriminating cue (pp. 31-78, 109-16).

Schmidt (1972) analyzed the nature of instructional design elements considered in the production of 20 outstanding films used for elementary and secondary school level instructional purposes. He identified over 50 "operational generalizations," i.e., instructional design principles, supported by empirical research evidence or by consensus of expert instructional film makers. Instructional design characteristics suggested from Schmidt's study dealt with considerations such as:

1. The sequencing, repetition, and redundancy of pictorial-verbal stimuli;

2. The use of cues, prompts, attention gaining/directing devices, information organizers, and other programmed instruction variables;

3. The pacing and rate of development of information presented; and

4. The relevance, appropriateness, familiarity, and simplicity of stimuli presented (pp. 61-90, 115-20).

Torkelson's review (1968) of instructional film research reported several other considerations found to influence learning from films: the use of worksheets with films, to specify distinct viewing aims for pupils; the tailoring of films designed for opinion change purposes, to the characteristics of the intended audience; and revision of films on the basis of field test results (pp. 132-34).
Hoban (1971), who recently reviewed the state-of-the-art of instructional films, suggested that the attention-holding ability of films was an important instructional design element. He noted, also, that the actual attention-holding ability of most instructional films was somewhat questionable (p. 6).

Hoban (1967) proposed that three fundamental factors determine the value of an educational film: "the purpose of its use," "the strong and weak points of the film" in relation to its purpose, and "the responses of students" which reveal that the film can actually fulfill its purpose (p. 9). Hoban stressed that:

In evaluating motion pictures, as in evaluating anything else, it is not enough to ask, "Is this a good film?" The question is not whether the film is good, but good for what purpose, with whom, and under what circumstances (p. 9).

Film selection criteria.--Allen (1967) suggested that films should be selected for use on the basis of their ability to achieve specific types of learning objectives. According to Allen, available empirical research evidence has demonstrated that instructional films can be used very effectively for objectives such as learning visual identifications, principles, concepts, rules, and procedures and somewhat less effectively for objectives such as the learning of factual information and skilled perceptual motor acts and the development of desirable attitudes, opinions, and motivations (pp. 27-31).

Instructional materials evaluation criteria.--Eash (1969) designed an instrument for the evaluation of curriculum materials based upon analysis of the instructional design characteristics of the product and other factors.
Four categories of factors were assessed by the instrument: the nature of objectives specified for the product, the product's organizational approach (scope and sequence), modes of transaction used to engage and direct the learner, and the nature of feedback and evaluation techniques used to assess the material's effectiveness (pp. 18-24).

Eash (1974) indicated that use of his instrument frequently identified "paradoxes and contradictions" present in the design of products evaluated (pp. 38-39). Also, Eash stressed the need for describing the population of students used as the "anchor referent" for evaluations made, and the need to verify the evaluation by obtaining empirical evidence from actual tryout of the material (1969, p. 18). An example of evaluations produced for the instructional film medium through use of Eash's instrument was reported by EPIE ("Thorne Marine," 1972, pp. 5-7).

The "treatment variables" described in Popham's review (1969) of curriculum materials research were basically instructional design variables: introductory and summary information organizers, opportunities for relevant practice, built-in knowledge of results and learner interest considerations, prompts and cues, information provided by single and multiple information channels, and the sequencing and pacing of information presented (pp. 322-31).

Briggs' review (1968) of research on "learner variables and educational media" emphasized that these factors influence learning from curriculum materials: learner characteristics such as IQ, special aptitudes, and entering competencies; sensory modes used; the sequencing of instruction; and programing variables such as size of step, frequency of feedback, and the variety of examples used (p. 161).
An extensive review of the research on single and multiple channel information processing by Hsia (1971) concluded that the well-synchronized audio-visual presentation results in more effective communication than just an audio or video presentation alone (p. 66). Hsia indicated that either or both the audio and video channels of an audiovisual presentation can become overloaded, since the load capacity varies with each (pp. 65-66). Hsia also noted that the information-processing rate varies considerably for individuals responding to audiovisual presentations. Hsia reported that the processing rate is influenced by five factors: the dimensionality of stimuli presented (audio, video, or combined audio-video); the difficulty level; and the relevance, familiarity, and associability of information provided (p. 63).

Gropper (1968) investigated the role of different types of visuals--"criterion visuals" and "intermediary visuals"--in the instructional process. He found that visuals played a variety of roles in instruction, of significance to the instructional design of films. Gropper reported that visuals can be used effectively to cue responses, reinforce responses, serve as examples to facilitate acquisition of discriminations and generalizations, and facilitate the transfer of learning to a verbal situation (pp. 353-54, 361).

According to Latzke (1971), Stevens and Morrisett (1967) developed and validated an instrument for evaluating social studies curriculum materials, based upon six criteria. The criteria, ranked by order of importance found, were these: (a) rationale and objectives, (b) instructional theory and teaching strategies, (c) content, (d) descriptive characteristics, (e) antecedent conditions, and (f) overall judgment (Latzke, 1971, pp. 22-23).

Krathwohl (1965) indicated that the conceptual basis for the specification and taxonomic classification of objectives could serve as a possible framework for evaluating instructional materials (pp.
83-92). He mentioned that three publications could assist with the task: Mager's Preparing Instructional Objectives (1962), Bloom and others' Taxonomy of Educational Objectives--Cognitive Domain (1956), and Krathwohl and others' Taxonomy of Educational Objectives--Affective Domain (1964).

Simpson (1966-67) reported development of a classification system for educational objectives related to the psychomotor skills domain, which could also possibly serve as a framework for materials evaluation (pp. 110-44).

Instructional materials selection criteria.--EPIE's "how to" handbook on materials selection (1973) and Tyler, Klein, and Michael's Recommendations for Curriculum and Instructional Materials (1971) provided two of the most comprehensive lists of criteria and factors recommended in the literature for the design, evaluation, and selection of instructional materials.

EPIE's handbook (1973) contained a "criterion check list" to aid schools and institutions to develop effective materials selection procedures. The list was composed of over 65 factors related to five major considerations: the producer of the material, administrative requirements, curricular focus, pedagogical requirements, and evaluation requirements. Factors of potential significance to film selection, among others recommended by EPIE, included:

1. The professional credentials of the producer and the evidence he supplies to back up claims made for the material.

2. The scope and sequence of the material and the "point of view" expressed regarding the "treatment of minorities, ideologies, personal and social values, sex roles."

3. The suitability of the material in terms of socioeconomic, geographic, and ethnic considerations.

4. The teacher's particular "pedagogical style."

5. The compatibility of the material with particular views of learning.

6. The methods and approach used to assess the effectiveness of the materials.

7. The source and type of evidence made available to verify the effectiveness of the material for the intended set of learners (p. 13).

A broad range of criteria was offered by Tyler, Klein, and Michael (1971). Some of the more basic criteria of significance to the selection of instructional films were the following:

1. Rationale: Criteria dealing with the underlying conceptual, theoretical, and philosophic bases of the materials.

2. Objectives: Criteria dealing with the type, source, significance, and value of instructional objectives specified for the materials.

3. Learner characteristics: Indications of the ways in which materials are relevant and effective to use with students exhibiting specific learning characteristics and needs.

4. Evaluation-effectiveness: Factors dealing with methods used to evaluate the effectiveness of the materials and the type of evaluation and field-test revision data supplied to potential users.

5. Usage conditions: Indications of specific conditions and requirements to be met for optimal, effective use of the materials.

6. Practicality: Factors associated with the practical realities of using the materials.

7. Information dissemination: Indications of the availability and accessibility of information provided to potential users, describing considerations above (pp. 25-39).

Hoye (1970) mentioned three questions that should be asked before final selection of instructional materials is made: (a) How do the materials relate to the instructional objectives specified for the learner? (b) Have the materials been field-tested? and (c) Is there an adequate description of field test data upon which to base an intelligent selection decision? (pp. 364-65).
Saettler's review of research (1968) on "design and selection factors" revealed that these instructional design considerations influence learning from films: the interaction of information presented by single and multiple stimulus channels (audio-video), the nonverbal content of films, and word-picture relationships conveyed (pp. 120-23).

Focusing on a "systems approach" to learning viewpoint, Phillips' review (1966) of research on the implementation of learning materials indicated that materials selection should be based upon five key considerations: learner characteristics, the characteristics of the medium, teacher characteristics, intended instructional methods, and administrative requirements (pp. 373-79).

Summary: Instructional design characteristics of instructional films.--Instructional design characteristics identified from the literature reviewed in this section, and the source (author) of the characteristics, are summarized in Table 1. Most of the characteristics listed in the table were derived from the operational generalizations described by Schmidt (1972, pp. 61-90, 115-20).

Table 1.--Instructional design characteristics of instructional films.a

Each entry gives the instructional design characteristic, followed by its source.b

Purpose-Objective Characteristics

1. Exhibits purposes, objectives, and approaches which are directed:
   a. primarily, toward the learning of concepts, principles, rules, procedures, and visual-spatial distinctions. Source: Allen
   b. secondarily, toward the learning of facts, perceptual motor skills, attitudes, motivations, and opinions. Source: Allen
   c. toward specific audiences exhibiting defined characteristics. Source: Schmidt, Torkelson
   d. toward highly specific expectations of learnings to be accomplished for specific learning domains (cognitive, affective, or psychomotor). Source: Allen
2. Exhibits use of visual rather than audio stimuli, to assist the learning of visual-spatial discriminations. Source: Schmidt
3. Exhibits use of audio rather than visual stimuli, to assist the learning of temporal distinctions (such as rhythm, sequence, speech, etc.). Source: Schmidt
4. Exhibits use of multiple stimulus channels, both audio and video, to aid the learning of concepts involving time and spatial distinctions. Source: Schmidt
5. Incorporates revisions suggested from learner response data obtained from use of preliminary versions. Source: Torkelson, Edlingc

Vocabulary Level Characteristics

1. Exhibits readability levels for commentary and dialogue which match the level of the intended audience. Source: Schmidt, Miller
2. Exhibits use of labels, names, and technical terms familiar to the intended audience, unless defined in the film. Source: Schmidt

Comprehension Level Characteristics

1. Organizational Structure-Sequence Characteristics
   a. Exhibits use of pictorial stimuli followed by verbal responses or labels. Source: Schmidt
   b. Exhibits use of information organizers: introductions, summaries, or outlines, to alert the learner of what to expect or to emphasize what is important to remember. Source: Schmidt, Miller, Popham
   c. Exhibits redundancy, repetition of main ideas, examples, sequences, or concepts treated. Source: Schmidt, Miller
   d. Exhibits a balanced dialogue and commentary; not "too little" or "too much" commentary. Source: Schmidt
   e. Exhibits simplicity: uses only the most essential, relevant, necessary audio-visual stimuli. Source: Schmidt
2. Programing Characteristics
   a. Provides feedback, i.e., confirmation of logical consequences or knowledge of results to the viewer. Source: Schmidt, Miller, Briggs, Popham
   b. Provides a variety of appropriate examples to illustrate ideas emphasized or important to remember. Source: Schmidt, Briggs
   c. Exhibits use of prompts, cues, criterion visuals, and attention gaining-directing devices, to call attention to details relevant to the purpose-objectives of the film. Source: Miller, Schmidt, Gropper, Popham, Hsia
   d. Exhibits use of color cues when color discriminations are necessary, and use of color in nondistracting ways. Source: Schmidt, Miller
   e. Exhibits use of the subjective angle of view (the view from the eye of the demonstrator) when psychomotor skills demonstrated in the film are to be learned. Source: Schmidt
   f. Exhibits a pacing rate appropriate to the level of the intended audience. Source: Schmidt, Hsia
      (1) Rates of development or tempo slow enough to enable comprehension by the intended audience. Source: Schmidt, Miller, Hsia
      (2) A rate of presentation about 110-140 words per minute for commentary and dialogue. Source: Schmidt
      (3) Use of pauses or other methods of slowing down the rate of development, when attention shifts from one information source to another. Source: Schmidt
      (4) Use of appropriate step sizes. Source: Briggs
3. Audio-Visual Synchronization Characteristics
   Exhibits use of well-synchronized, complementary audio-visual stimuli (mutually supportive, mutually relevant); demonstrates use of nonconflicting information channels. Source: Schmidt, Hsia, Saettler
4. Information Load Characteristics
   Presents a reasonable amount of information, both general and specific, to enhance comprehension and retention of main points emphasized by the commentary or dialogue. Source: Schmidt, Hsia

Learner-Audience Involvement Characteristics

1. Exhibits content and treatment methods tailored to the interest, appeal and other relevant characteristics of the intended learners. Source: Schmidt, Popham, Torkelson
2. Exhibits use of audience participation techniques (e.g., questions or statements which invite viewer participation, whether covert or overt). Source: Schmidt, Miller
3. Exhibits use of prompts, cues, and attention gaining-directing devices to call attention to important details or considerations pertinent to the purpose-objectives of the film. Source: Schmidt, Miller, Popham
4. Provides built-in opportunities for relevant practice, whether covert or overt. Source: Schmidt, Popham
5. Exhibits use of active rather than passive sentence structure in the commentary. Source: Schmidt
6. Provides supplementary materials such as worksheets, to engage the learner in pertinent learning tasks. Source: Torkelson

aThis study assumed that film evaluators can be trained, potentially, to recognize the instructional design characteristics listed in Table 1. The study also assumed that few films, if any, would likely exhibit all or even a high percentage of characteristics noted. This is so, because of the creative diversity of products developed by the film industry, and evolving definitions and notions of what constitutes an "instructional" film.

bSource of data: Characteristics listed above were derived from experimental research findings and conclusions which support the characteristics specified, reviewed by the authors indicated. The work of authors listed is reviewed on pages 57-65.

cEdling, 1968, p. 184.

The Effects-Assessment View of Film Quality: Associated Evaluation-Selection Criteria

The "effects assessment" view of film quality was characterized by a concern for the impact of films upon film users. Hoban (in Dale et al., 1937) was one of the earliest writers to express this view when he noted that two basic ways exist by which to improve the evaluation and selection of instructional materials. "One is through the analysis of the material itself, and the second is through the analysis of pupil responses to this material" (p. 252).

The basic essence of the "effects assessment" view was succinctly defined by Carpenter and Froke, the Educational Products Information Exchange (EPIE), and Popham.
Carpenter and Froke (1968) defined the quality of instructional materials in terms of the conceptual, attitudinal, and behavioral changes which result from use of the materials; changes associated with the three basic learning domains--the cognitive, psychomotor, and affective domains (p. 30). EPIE (1973) described effects assessment in terms of (a) direct, observable "effects" of instructional materials on learners and teachers; (b) "affects" which are negative and positive feelings evoked; and (c) "side effects," unanticipated effects and affects which occur ("What Is EPIE," 1973). Popham (1967) defined effects in terms of "validated" effects: Instructional products are "validated" when empirical evidence is available which confirms the product's ability to produce a desired behavior change in the intended learners (p. 403).

Some writers made a distinction between actual effects and potential or estimated effects; for example, Carpenter and Froke (1968), on the basis of whether or not the effects are directly observed or measured, or predicted from inspection or review of the material (pp. iii, v).

Film evaluation criteria.--Carpenter and Froke (1968) devised an evaluation form to assess the potential effectiveness of instructional films and/or instructional television programs. Evaluation criteria assessed by the form fell under six general categories: instructional objectives, instructional content, the manner of presenting content, technical quality, learner stimulation, and overall evaluation (pp. 30-36).

The Computer-Based Project for the Evaluation of Media (1971, 1972a, 1972b) experimented with a variety of methods, data bases, and innovative techniques for evaluating instructional films and other materials used in the field of special education.
The project evaluated over 400 films, while experimenting with the use of five types of effects assessment data bases: student attention data, student interviews, responses to open-ended questions, teacher interviews, and field test data (1971, pp. 1-41).

Instructional materials evaluation criteria.--Quisenberry and others (1974) listed and described the rationale for 35 factors useful for evaluating films and filmstrips designed for use with young children. The factors were clustered in seven categories, which dealt primarily with the potential effects of the materials upon learnings to be acquired. The categories dealt with these considerations: aesthetic values, concept development, experiences with literature, interpersonal relationships, language development, movement, and self-actualization (pp. 60-61).

Komoski (1974) described his "learner-verification-revision" view of instructional materials evaluation. Komoski noted that high-quality materials are those which have been (a) field-tested to ascertain "what specifically learners have learned" and "what specifically they were supposed to but did not learn" from the materials; and (b) modified, revised accordingly, to ensure their effectiveness (pp. 363-64).

Edling's review (1968) of research on educational objectives and educational media cited several studies which demonstrated that the effectiveness of instructional materials, including films, can be improved when revised on the basis of learner-response data. One particularly notable study was reported by Vandermeer, Morrison, and Smith (1965). These researchers found that film revision efforts attending to instructional design characteristics significantly improved the effectiveness of a pair of instructional films, as measured by mean pre- and post-revision achievement test scores (Edling, 1968, p. 184).
Instructional materials selection criteria.--Stake (1967) emphasized the importance of providing, to materials selectors, information about user satisfactions and dissatisfactions, information important to different users, and information about differences among products (pp. 7-8). Stake concluded that selectors need both soft and hard evaluation data, collected from producers, analysts, and users.

They need . . . to know the physical properties of the product, the purpose for which it was intended, the conditions under which it has been used, the actual results of its use, the judgments of users and observers (p. 9).

The National Center on Educational Media and Materials for the Handicapped developed an evaluation form attending to four major factors: the significance of the purpose of the product being evaluated, the relationship of the product to specific learner characteristics, the face validity of the product, and evidence of the product's actual effectiveness obtained from field testing or other usage situations ("Report on the NCEMMH Media Selection Conference," 1973, pp. 24-25).

Factors Which Influence Relevance Judgments

Topics treated in this section include definitions of relevance and relevance judgment, related research studies, and the reliability and validity of relevance judgments.

Definitions of Relevance and Relevance Judgment

The concept of relevance received considerable attention during the past two decades or so within the information science literature. Noteworthy definitions obtained from the information science literature follow.

Definitions of relevance.--Clason (1973) defined relevance in terms of pertinence: "Pertinency, Relevance: The state or quality of implying close, logical relationship with and importance to the matter under consideration" (p. 278). "Pertinent, Relevant: The quality of being related to a subject sought" (p. 277).

Hillman (1964) described relevance in terms of "conceptual relatedness" (pp. 26-34).
Saracevic (1970b) described relevant information in terms of factors such as "utility," "importance," "appropriateness, applicability," and "usefulness" (p. 121).2 On the basis of his comprehensive review of the literature, Saracevic concluded that:

Relevance indicates a relation between a source and a destination in a communication process; more specifically, relevance is a measure of effective contact between a source and a destination (p. 112).

Halpin and others (1970) noted that:

Although a precise definition of information relevance varies with the specific research area, it is usually viewed as information that is useful to a subject in accomplishing an assigned task (p. 1).

Cook (1971) stated that: "Relevance decisions are generally regarded as an individual's internal, subjective evaluation of how well a document3 meets the requirements of an information need" (p. 29).

Definitions of relevance judgment.--Cuadra and Katter (1967a) concluded that relevance judgment is basically a "content-matching" process, in which the content of a document is matched with the requirements of a stated query or information request (p. 77).

Saracevic (1970b) stressed the distinction between relevance judgment as a process, and as a product. He defined the process as an assessment of the degree of "effective contact" between an information source and an information need. He emphasized that any given judgment "is only one of a variety of available indicators of the effectiveness of contact" established (p. 138).

Cook (1971) defined the judgment process in terms of a model of decision making conceptualized by Martin and Wilkens (1968). The underlying principle of the decision model was concerned with the notion of a relevance "threshold" used by individuals to discriminate perceptions of relevance from nonrelevance (Cook, 1971).
Perceptions of relevance occurred, theoretically, when the "value" of a particular message or recognizable part of a message exceeded some minimal limit, a limit established subjectively by a given individual (Cook, 1971, pp. 34-56).

Travers (1968) described relevance judgment in terms of a number of complex, intellectual processes and functions which operate when individuals attend to stimuli presented during a communication process. Travers' work suggested that any given relevance judgment involves these and other cognitive, information-processing functions:

1. "Selecting," "matching," "monitoring," and "correlating" functions which help individuals to discriminate pertinent from nonpertinent information.
2. "Prioritization" and "selective filtering" mechanisms which help individuals to assess the relative "value" of information which is processed (pp. 82-90).

Related Research

Risner (1971) evaluated the effectiveness of film descriptions and subject headings contained in the NICEM master data base. He investigated the degree to which the descriptions, the subject headings, and the films described were perceived to be relevant to each other by users of NICEM film catalogs. Risner proposed that the following contextual factors were important considerations involved in relevance judgments made about instructional films:

1. Intended or anticipated instructional purposes.
2. The particular film considered for use.
3. The particular descriptive statements used to describe a given film, such as annotations, subject headings, or bibliographic data.
4. Subject areas treated in specific instructional situations, such as third grade reading.
5. The cognitive set of users, as influenced by their background and experience.
6. Requests, stated or implied, which reflect "the anticipated information need of the user."
7. The particular educational environment within which a film is to be used, e.g., a specific ethnic or socioeconomic community.
8.
Work functions such as teaching.
9. The types of responses made by individuals when confronted with specific types of film information (pp. 27-28).

Upon completion of his review of related investigations, Saracevic (1970b) concluded that five sets of factors tended to influence the relevance judgment process: (a) the particular document or document representation analyzed; (b) the particular query or information request made; (c) the judgment conditions and situations used; (d) the mode of expression used to display ratings obtained; and (e) people, the judges involved in making the judgments (pp. 135-37).

The major conclusions reached by Saracevic for each of the above factors are summarized in Table 2. Saracevic based his conclusions upon a wide range of experiments completed prior to 1970, including two relatively large, comprehensive series of studies conducted by Cuadra and Katter (1967a, 1967b) and Rees and Schultz (1967). All of the experiments reviewed by Saracevic dealt with judgments made about print-form documents (books, articles, abstracts, or other textual materials).

Table 2.--Empirical research conclusions about factors assumed to influence relevance judgments obtained from print-form documents and document representations.

1. Factor: Documents and Document Representations
a. Documents and document representations appear to be the major factors likely to influence relevance judgments, of the five factors listed in this table. In particular, the following variables appear to influence relevance judgments made:
(1) The subject matter content of the document or representation, in comparison to the subject matter content of queries;
(2) Elements of style;
(3) The specificity of content treated; and
(4) The length and detail of information supplied.

2. Factor: Queries
a. If statements in documents resemble statements in queries, the documents are likely to be deemed relevant to the query.
The more one knows about a query, the more he tends to be stringent in judgments made (more items tend to be perceived as not relevant or less relevant). The less one knows about a query, the more he will likely be lenient in judgments made (more items will tend to be perceived as being relevant or more relevant). Significant differences found in studies involving use of queries may have resulted from the types of queries used.

3. Factor: Judgmental Conditions and Situations
a. Different definitions of relevance may yield differences in relevance ratings obtained. Situations in which judgments are made under pressure tend to yield more lenient ratings (higher, inflated ratings). Designation of different intended uses for documents may produce differences in judgment patterns.

4. Factor: Mode of Expression
a. Different kinds of scales (rating, ranking, and ratio scales) can be expected to yield slight differences in judgments, in general. When rating scales containing more rating categories (up to ten) are used, raters feel more certain about judgments made. The end points of rating scales tend to be used most often for judgments made; generally, documents are deemed to be very relevant or very nonrelevant.

5. Factor: People
a. Subject matter knowledge appears to be the most influential characteristic of judges which influences relevance judgments made. Subject matter experts tend to be more stringent in their judgments than persons with little knowledge of the subject treated; nonexperts tend to be more lenient. Or, in more general terms, the more one knows about the subject treated, the less relevant items appear to be; the less one knows, the more everything tends to be perceived as being relevant or more relevant. The judgments

Statements of overall value. Ratings of overall value. Strengths and weaknesses. Technical quality. Product comparison information. Usage data.
Educational author (including producers, consultants, and sponsoring agency) (pp. 68, 73, 78).

The majority of film descriptions analyzed by Latzke contained only three types of information elements: physical data (primarily title, source, and film length); audience or age-grade level; and a nonevaluative content description, one to three sentences in length. Ratings of overall value were provided as film quality cues, predominantly, in descriptions containing evaluative data (pp. 68, 78).

Gilkey (1962) analyzed and described the content of film descriptions commonly used prior to 1962 (pp. 29-34). Information entries and relevance and quality cues exhibited by the descriptions were similar to those isolated by Latzke (1971).

The publication, Guides to Educational Media, prepared by Rufsvold and Guss (1971) described information entries contained in 19 commonly available film guides (pp. 12-74). Several unique information entries provided relevance cues in a few of the guides, entries which were not isolated by Latzke (1971). The unique entries were these: availability of information, series title, TV clearance, suggested uses, performers, edition or version, and country of origin. Unique entries which reflected quality cues were these: interest level and evaluator. The majority of guides contained these nine information entries: title, source, descriptive annotation, film length, color or black/white, distributor, producer, target audience, and date. Only three guides used "evaluative annotation" entries, to provide film quality information cues (pp. 12-74).

Gilkey (1962) and S. C. Johnson (1971) analyzed the preferences of science and social studies educators, respectively, for particular kinds and combinations of film description information elements. Both investigators obtained similar findings from their studies.
The studies expressed favor toward descriptions which supplied the following types of information: bibliographic data; a list of topics treated; a one-paragraph content description; an evaluative appraisal, including grade-level, teaching purposes, strengths and weaknesses, and an overall rating of excellence; and a listing of related materials, textbook correlations, supplementary readings, and related activities (Gilkey, 1962, p. 99; S. C. Johnson, 1971, pp. 149-50).

Unique Relevance and Quality Cues

A number of film description styles were noted which illustrated relevance and quality cues not commonly exhibited by most descriptions.

Relevance information cues.--The Computer-Based Project for the Evaluation of Media developed film descriptions which included:

1. Lists of validated multiple choice test items, coded by grade level, recommended for use with specific objectives established for the films.
2. Descriptors and codes associated with (a) emerging learning hierarchies, taxonomies of objectives and curriculum-instructional classification systems, and (b) related objectives and test items.
3. Vocabulary analyses based upon comparison of words voiced in film commentaries, with a standard list (Bond, 1972a, pp. 89-103; 1972b, pp. 126-35).

Film reviews produced by the National Coordinating Council on Drug Education (1970, pp. 1-51; 1971, pp. 1-64) contained descriptors and identifiers specifying these target groups: suburban, inner-city, parents, professional personnel, program planners, and school administrators.

The Library of Congress (MARC) Development Office (1970) developed a machine-readable descriptive cataloging standard to aid production of film catalogs and cataloging cards used in libraries and learning resource centers.
The cataloging format included use of these types of information entries: the language of the sound track (English, French, etc.); country of producer; geographic descriptors and notes; notations regarding film usage limitations, and actors or actresses portrayed; equipment limitations; and filmic techniques used such as animation, live action, or otherwise (pp. 13-60). Additionally, special entries were designed to describe films for archival collections: category (type) of film; accompanying production and publicity materials; presentation format; and censorship information (pp. 18, 42-43, 54-55).

The Environment Film Review (1972) described critical reviews of 627 films dealing with environmental science and environmental education topics. Descriptors in five detailed indexes were included to aid identification of pertinent films: title, key word, subject, industry, and sponsor indexes.

Film quality information cues.--The Computer-Based Project for the Evaluation of Media (1972) developed film descriptions which included:

1. Evaluations prepared for materials producers, based upon field test evidence.
2. Indices of effectiveness obtained for specific learning objectives.
3. Attention profile data summarizing the degree to which students maintained eye contact with a given film when they viewed it.
4. Student reaction data (Bond, 1972a, pp. 89-103; 1972b, pp. 126-35).

Film reviews produced by the National Coordinating Council on Drug Education (1970, pp. 1-51; 1971, pp. 1-64) included evaluative comments provided by a panel of film evaluators. The panel represented viewpoints of various drug education experts, specialists, students, lay persons, and persons involved in the operation of drug education programs. The reviews were often negative in tone, critical of films using questionable approaches or containing many inaccuracies.

A distinctive film review format was illustrated by Ladd (1972).
Along with conventional film information, Ladd's reviews included a composite set of reactions of ten students taken at random, and judgments of a teacher, subject matter specialist, and audiovisual media specialist. Ratings were provided for approximately 30 different evaluation factors for a given film. From five to ten factors were rated by each evaluator, pertinent to his/her special viewpoint or expertise (pp. 299-303).

The Library of Congress MARC format (1970) illustrated two types of film quality information cues: producer/production credits and references to published reviews of a given film (MARC, 1970, pp. 33-55).

Films judged to demonstrate superior cinematic treatment and/or subject coverage were marked with asterisks in the Environment Film Review (1972) to draw attention to them. Similarly, Wood (1974) reported that the Booklist designates worthy film reviews with "stars," "as a mark of quality," to draw attention to especially good materials (p. 219).

EPIE's evaluations (1970) of ten drug abuse films contained questionnaire data obtained from groups of student evaluators from schools differing in socioeconomic characteristics. The evaluations provided information on the frequency and nature of responses made about the effectiveness of the films, strengths and weaknesses noted, and "likes" and "dislikes" found ("Ten Drug Abuse Films," 1970, pp. 16-27).

Criticisms Made of Information Cues Supplied in Film Descriptions

Numerous indications were found in the literature that the design and content of film descriptions commonly used by selectors are somewhat questionable.

In his article dealing with film information system specifications, S. C. Johnson (1971) stated:

Virtually all current media information systems, including even university film library catalogs, do not meet even minimal criteria for integrity, availability and usefulness of information (p. 7).
Indications that teachers feel available film descriptions are inadequate were reported by Shetler (1961, p. 3811), Wilson (1972, p. 28), and Latzke (1971, pp. 99-100). Latzke described a survey of teachers in New York State (Films for Education, 1963) which revealed that teachers generally felt uncertain about the potential contributions films could make, based upon reading film descriptions provided in conventional film catalogs.

Latzke's study (1971) of film descriptions supplied in professional periodicals reported several significant conclusions. First, the type of information recommended by film evaluation-selection experts and desired by film selectors is not usually contained in the descriptions. Second, educators, editors of serial publications, and media specialists generally support the need for better descriptive and evaluative reviewing of media (pp. 99-100).

Gilkey (1962) reported that nonevaluative film descriptions were almost universally rejected by participants of his study, which investigated types of film descriptions preferred by science educators (p. 98).

A state-of-the-art assessment of audiovisual reviews by McDaniel in 1970 severely criticized the nature of reviews supplied in professional periodicals, including film reviews. McDaniel's major criticisms dealt with the quality, consistency, and "intellectual reliability" of information provided; the general lack of indication to readers of the criteria base used for reviews supplied; and the lack of variety of viewpoints expressed, especially negative, critical remarks (pp. 84-86).

Palmer's study (1973) reported criticisms of six often-used sources of published film reviews and evaluations.
Noteworthy criticisms mentioned were these: the lack of relevance of information provided to the needs and interests of clientele served; the lack of indication of who actually participated in making a given evaluation; the variability of the quality of information supplied; and the noncommittal, neutral nature and brevity of information supplied (pp. 296-98).

After viewing the literature dealing with the film evaluation-selection process, Baird (1973a) concluded:

While some value is gained by published evaluations and reviews, there seems to be much that can and should be done to make these more helpful and available to those who evaluate and select educational films (p. 31).

Recommendations Made for Improving Information Cues Supplied in Film Descriptions

Latzke (1971) recommended five major roles for film reviews, which imply distinct information requirements for the content of effective film descriptions. The roles recommended were these:

1. Identification of films pertinent to specific user needs, interests, and purposes.
2. Final selection of films, especially when enough time for formal previewing is unavailable.
3. Improved film utilization, by basing selection upon recognizable strengths and weaknesses of a given film.
4. Selectors' professional growth, by basing selection upon criteria and judgmental factors related to specified film uses, as summarized by credible film reviewers.
5. Product improvement and development, by improving critical knowledge and awareness of qualities of the medium (pp. 31-35).

Latzke (1971) found that about 50 percent of the film descriptions supplied by the periodic literature in 1969 contained evaluative information elements of any sort (p. 183).
He noted that the majority of film descriptions supplied could have been improved significantly by supporting ratings, given with statements of the strengths and weaknesses of the films; by supplying effects assessment information obtained from actual use of the materials; and by including information about comparisons made with other products and suggestions for particular uses (p. 102).

Wilson (1972) recommended the use of "behavioral descriptors" within media descriptions, to succinctly describe behaviors, tasks, or skills to be accomplished as a result of using a given media product (p. 29).

Fourteen information elements were identified as being critical to the description of nonprint media, including films, by the Systems and Standards for the Bibliographic Control of Media Institute, 1969-70. Described by Brong (1972), the recommended elements included the title, creator, edition statement, production date, release date, producer, distributor, physical characteristics (such as length, color, sound, etc.), series statements, descriptive annotation, subject headings or tracings, location indicators, intended audience, and notes for amplifying and extending the description (pp. 58-61).

In 1973, the Association for Educational Communications and Technology (AECT) published its Standards for Cataloging Non-Print Materials, by Tillin and Quinly. According to AECT, these standards are compatible with the Library of Congress MARC format ("AECT Materials," n.d., p. 12).

General Conclusions Drawn From the Review

The review of related literature and research dealt with a wide range of viewpoints and empirical evidence. The generalizations that follow summarize important inferences and conclusions drawn from the review.

The Distinction Between Relevance and Quality Cues

1. The distinction is useful in aiding clarification of the types and kinds of information which may be beneficial to include in film descriptions.
2.
The distinction is more meaningful for film description purposes (i.e., for describing and interpreting analyses and evaluations made about films) than for actual evaluation-selection purposes.
3. When evaluating or selecting films, the distinction that a given criterion is a significant criterion, useful as an indicator of excellence or quality, is more important than the distinction that the criterion is classified as a "relevance" or "quality" criterion.
4. Whether an information element or indicator is judged to be a "relevance" or "quality" cue depends primarily upon the frame of reference used for judgments made and the definitions of relevance and quality which are used.
5. Meanings are in people, not films or film descriptions. Therefore, film evaluators and selectors will identify important characteristics and attributes of films, depending upon the degree to which (a) the particular attributes are clearly defined and (b) people are trained or assisted to recognize various relevance and quality information cues exhibited by films and film descriptions.

Evaluation-Selection Criteria Which Influence Relevance and Quality Judgments

1. Standards do not yet exist which are generally agreed upon across disciplines and special interest groups, which define the characteristics of a "high quality" film.
2. Criteria used in making relevance and quality judgments vary widely in terms of source, emphasis, definition, and associated meanings. They are subject to much subjective interpretation. They do not convey the same meanings to all who use them.
3. A comprehensive "master list" of criteria applicable to all evaluation-selection purposes is not presently available nor likely to ever be produced, due to the diversity of related viewpoints and opinions which exist and the complexity of the subject.
4.
Criteria which are important to use in the selection of films will vary depending upon the purpose and stage of the selection process; e.g., whether the selection is for or from a film collection, or more specifically, for identification, determination, previewing, or formal purchase or usage decision-making purposes.4
5. Generally speaking, evaluation-selection criteria actually used in the evaluation-selection process are not necessarily stable commodities. Although some criteria can be defined rather precisely, the degree to which they are perceived to be meaningful, important, useful criteria may change significantly across time, across items which are evaluated or selected, and across both individuals and groups.

Characteristics of Films Which Influence Relevance Judgments Elicited From Them

Important, influential film characteristics are these:

1. Purposes and objectives.
2. Subject matter emphases.
3. Other content emphases.
4. Intended audiences and users.
5. Target population slant.
6. Instructional design.
7. Vocabulary level.
8. Comprehension level.
9. Intellectual, skill, or affect level.5
10. Production format.
11. Physical characteristics.
12. Datedness.
13. Other unique attributes.

Nonfilmic Factors Which Influence Both Relevance and Quality Judgments Made About Films

1. Relevance and quality judgments can be made meaningfully, reliably, only when made in relation to "anchor referents" which define the frame of reference established for the judgments.6
2. The key issues are not "relevance" and "quality" per se, but rather, relevance and quality in relation to who, what, where, when, why, how, and how well considerations pertinent to intended uses of films.
3. These are important nonfilmic factors which influence relevance and quality judgments made about films:
a.
(1) Their subject matter interests and expertise.
(2) Their training and experience, especially as evaluators-selectors.
(3) The particular viewpoints they hold about film quality and relevance.
(4) Their perceptions of potential benefits to be gained by using a given film.
(5) Their perceptions of the overall compatibility of a given film with the overall context of its intended use.
(6) The particular evaluation-selection criteria which they use in making a given judgment.
(7) Their perceptions of the credibility of information and information sources considered.
b. Information requests and request statements
(1) The purpose of the request.
(2) The clarity, specificity, and complexity of the request.
c. Judgment conditions and circumstances
(1) Evaluation-selection purposes.
(2) Appraisal methods used.
(a) The type of appraisal made.
(b) The characteristics of evaluation forms and instruments used.
(3) Definitions of relevance and quality used.
(4) Anchor referents used to define the judgment purpose and context.
(5) Learning environments and settings in which films are to be used.
(6) Film descriptions.
(a) The types of IPI's and FQI's provided within them.
(b) Descriptors and indexing schemes used in indexes to film descriptions.
(7) Intended learning or teaching purposes, objectives and uses.
(8) Environmental pressures (e.g., time constraints).
d. Modes of response
(1) The response mode used to exhibit judgments made (i.e., ratings, subjective comments, checklists, etc.).

Characteristics of Films Which Influence Quality Judgments Elicited From Them

These filmic characteristics are important influences:

1. Technical quality attributes.
2. Instructional design characteristics.
a. Purpose-objective attributes.
b. Vocabulary level attributes.
c. Comprehension level attributes.
d. Learner-audience involvement attributes.
3. Other inherent attributes.
a. Content validity.
b. Contemporariness-datedness.
c. Message slant.
d. Overall message design and treatment.
4. Other characteristics--those which elicit other perceptions of relevance or quality.
The Extent to Which Available Film Descriptions Exhibit Relevance and Quality Information Cues

1. In general, only these types of IPI's are commonly found in most film descriptions used within the marketplace:
a. Subject matter emphases.
b. Purposes or objectives.
c. Target audiences or audience levels.
d. Physical data (title, length, color, and date).
e. Descriptive annotations, usually one to three sentences in length.
2. In general, most film descriptions used within the marketplace:
a. Do not reflect a large number or wide range of the types of relevance and quality cues identified by this study.
b. Contain few or none of the types of FQI's identified by this study. When they do, the FQI's are usually present in the form of overall ratings of excellence or value and/or indirect quality indicators.
c. Are designed for bibliographic control purposes and/or to aid identification of films for previewing purposes.
d. Are not necessarily designed to aid selection from film collections.
e. Are not necessarily designed to assist the final selection decision-making process.

Assumptions About Instructional Pertinence Indicators and Film Quality Indicators

Important assumptions derived from the review of related research and literature follow.

Assumptions: Instructional Pertinence Indicators

1. Instructional pertinence indicators (IPI's) are information cues exhibited by films and film descriptions which assist selectors to judge the degree to which a given film is pertinent, logically related, to a given instructional situation.
2. IPI's assist selectors to judge the degree to which a given film meets specific criteria or limits of acceptability, established or assumed for different levels or gradations of instructional pertinence.
3. IPI's exhibited by films are recognizable characteristics or cues which indicate these film attributes:7
a. Purposes or objectives.
b. Subject matter emphases.
c. Other content emphases.
d. Intended audiences and users.
e. Target population slant.
f.
Rationale (theoretical or conceptual basis).
g. Vocabulary level.
h. Comprehension level.
i. Intellectual, skill, or affect level.
j. Film medium characteristics.
k. Physical characteristics.
4. IPI's exhibited by film descriptions are words, descriptors, statements, phrases, or other information elements which describe the IPI's described in assumption three above, and these related film usage considerations:
a. Usage suggestions.
b. Curriculum integration suggestions.
c. Accompanying guides and materials.
d. Related field test or usage data.

Assumptions: Film Quality Indicators

1. Film quality indicators (FQI's) are information cues exhibited by films and film descriptions, which assist evaluators and selectors to judge the value, excellence, or effectiveness of a given film for a given instructional situation.
2. FQI's assist selectors to judge the degree to which a given film meets specific criteria or limits of acceptability, established or assumed for different levels or gradations of quality.
3. The general types of FQI's exhibited by films can serve as indicators of these types of film quality:
a. Inherent quality (inherent value or excellence).
b. Technical quality.
c. Instructional design quality.
d. Actual effectiveness.
e. Potential effectiveness (predicted, estimated effectiveness).
4. The general types of FQI's which can be provided in film descriptions are these:8
a. Ratings.
b. Critical appraisals.
c. Standard comparisons.
d. Awards of merit.
e. Effectiveness indices.
f. Effectiveness estimates: indices of potential effectiveness.
g. Technical quality indicators.
h. Instructional design quality indicators.
i. Inherent quality attributes.
j. Evaluations supplied from specific sources.
k. Indirect quality indicators.

General Assumptions

The following assumptions apply to both IPI's and FQI's.

1. Some kinds of IPI's and FQI's exhibited by films are relatively easy to pinpoint, isolate, and describe, while others are not.
This is so because of the complexity of information cues exhibited by films, the overlapping nature and duration of the cues, and other factors.
2. IPI's and FQI's are definable, in terms of the general types of observable film attributes and characteristics which imply their existence and presence.
3. IPI's and FQI's can be defined meaningfully, reliably, only in relation to "instructional anchor referents."[9] The anchor referents define characteristics of instructional situations which serve as the frame of reference for judgments of instructional pertinence and quality made about films. Important instructional anchor referents are these:
   a. Subjects and topics: specific subjects and topics of instruction.
   b. Instructional purposes: intended goals, objectives, and film uses.
   c. Curricula: specific curricula and instructional programs dealing with particular subject matter areas or disciplines.
   d. Learners: specific types, groups, or levels of learners which comprise the film audience.
   e. Learner characteristics: such as aptitude, IQ, achievement level, vocabulary-comprehension level, reading ability, etc.
   f. Learning domains: cognitive, psychomotor, and attitudinal-affective dimensions of learning.
   g. Learning conditions: specific instructional or film usage conditions which must be met.
   h. Educational environments: schools or other places of learning with distinct geographic, socio-cultural, socioeconomic, socio-political, ethnic-racial, or religious characteristics (rural, inner-city, urban, suburban, Catholic, industrial-corporate, etc.).
   i. Instructional viewpoints: particular schools of thought, theories of instruction, or philosophies of instruction which are commonly espoused and practiced by film users.
   j. Film users: specific types of users, such as teachers, instructors, adults, parents, industrial personnel, administrators, professionals, etc.
4.
IPI's and FQI's exhibited by both films and film descriptions are major factors which influence the selection of films both for and from film collections.
5. IPI's and FQI's are best defined in terms of specific film evaluation-selection criteria and instructional anchor referents.
6. Two distinctly different types of IPI's and FQI's exist: those exhibited by films and those by film descriptions.
7. IPI's and FQI's exhibited by films are not the same as the verbal expressions (words, phrases, descriptive statements, etc.) used to describe them in film descriptions. IPI's and FQI's provided in film descriptions are only symbolic referents which convey limited information about film characteristics, attributes, and usage effects noted by film evaluators.
8. IPI's and FQI's lie in the mind of the beholder, because meanings associated with specific evaluation-selection criteria are in people, not films or film descriptions.
9. Some information cues serve singular roles as IPI's or FQI's, whereas other information cues play dual roles as both IPI's and FQI's.
10. A comprehensive "master list" of IPI's and FQI's applicable to the description of any and all films and usage purposes is not presently available nor likely ever to be produced. Specific IPI's and FQI's will vary with the information demands and requirements of different subject matter disciplines and special interest groups.
11. Likely, some general types of IPI's and FQI's can be defined which cut across subject matter disciplines and special interest groups. However, other specific IPI's and FQI's are likely unique to specific disciplines and special interest groups.
12. Some IPI's and FQI's which can be provided in film descriptions have equivalent, recognizable counterparts as film attributes or characteristics, while other IPI's and FQI's do not.
13. IPI's and FQI's can be provided in film descriptions for two basic purposes: to aid the selection of films for or from film collections.
14.
The types and specific kinds of IPI's and FQI's which can be included in film descriptions to aid selection for and from film collections may not necessarily be the same.

Instructional Pertinence Indicators and Film Quality Indicators Suggested From the Literature for Inclusion Within Film Descriptions

Table 3 lists instructional pertinence indicators, and Table 4, film quality indicators, derived from the review of related literature and research. The IPI's and FQI's listed in the tables were identified by this study as being worthy of potential inclusion within film descriptions, and of additional research and experimentation. The indicators are categorized into functional groups according to the general types of IPI's and FQI's identified by this study.

Tables 3 and 4 provide a comprehensive summary of the kinds of specific indicators reflected across authors and publications reviewed in Chapter II, for each type of IPI or FQI noted. Both commonly suggested and unique types of IPI's and FQI's are listed.[10]

Table 3.--Instructional pertinence indicators suggested for inclusion within film descriptions.

Definition: Instructional Pertinence Indicators
Instructional pertinence indicators are words, descriptors, statements, phrases, or other information cues provided in film descriptions, which assist selectors to judge the degree to which materials are "pertinent, logically related" to important "instructional anchor referents," listed below, which define the particular requirements of a given instructional situation:
   1. Instructional purposes
   2. Subjects and topics
   3. Curricula
   4. Learners
   5. Learner characteristics
   6. Learning domains
   7. Learning conditions
   8. Educational environments
   9. Instructional viewpoints
   10. Types of film users[a]

A. Indicators to Assist Selection for Film Collections
   1. To assist identification of available films pertinent to user needs and interests
   2. To assist determination of which films to preview and evaluate
   3.
To assist evaluation of films
   4. To assist final selection of films

B. Indicators to Assist Selection From Film Collections[b]
   (Same as A above)

C. Purpose-Objective Indicators
   1. Stated or implied teaching-learning intents:
      a. Cognitive: related to specific knowledges or intellectual skills; visual discriminations and identifications; facts, concepts, principles, rules, or procedures to be learned
      b. Psychomotor: related to specific manual skills, tasks, competencies, or behaviors to be learned or demonstrated
      c. Affective: related to specific feelings, attitudes, opinions, or other psychological-emotional states to be learned or demonstrated
   2. Statements of behavioral objectives
   3. Behavioral descriptors[c]

D. Content Emphases Indicators
   1. General subjects treated
   2. Specific topics treated
   3. Basic concepts and main ideas treated
   4. Details or examples used to illustrate or support concepts and main ideas treated
   5. Major content themes and relationships treated
   6. Content emphasized in support of the purposes and objectives of the film
   7. The scope and coverage of content treated
   8. The organizational pattern, structure, and sequence of the presentation
   9. The basic approaches and methods used to treat the subject content (e.g., case-study, experimental, didactic, dramatic, historical analysis, explanatory, open-ended, contrastive, etc.)
   10. The basic viewpoints expressed (e.g., scientific, moralistic, conservative, practical, authoritarian, satirical, critical, etc.)
   11. The particular biases, slant, and tone of the film message
   12. Major settings depicted (socioeconomic, geographic, vocational, historical, avocational, recreational, etc.)
   13. Characterization (the types of individuals, people, sociocultural groups, sexes, age levels, social institutions, etc., portrayed)
   14. Micro-relevance characteristics[d]
   15. The contemporariness and datedness of the film content

E. Intended Audience Indicators
   1. The intended film user (teacher, instructor, parent, professional, industrial, etc.)
   2. The intended learner (pupils, students, trainees, adults, general public, etc.)
   3. The intended learner level
      a. Grade or academic level: preschool, fifth, high school, etc.
      b. Chronological age, mental age, or IQ
      c. Beginner, intermediate, advanced, expert, etc.
   4. Expected learner prerequisites: Indications of prerequisite levels of knowledge, skill or experience, etc., expected of the intended learner, to optimally benefit from the film message

F. Target Population Slant Indicators
   Indications of the actual slant of the material. Characteristics of the material which indicate its limitations and general usage potential for specific target groups
   1. Indications of values represented, characteristic of specific:
      a. Socio-geographic locales (urban, suburban, rural, inner city, etc.)
      b. Socioeconomic and sociocultural groups (e.g., ethnic minority groups, religious groups, etc.)
      c. Sociopolitical concerns (e.g., sexism, racism)
      d. Subject matter disciplines or special interest groups
   2. The authority figure represented by the commentator
   3. Particular settings depicted or characterizations portrayed
   4. Content which is likely to be very offensive or repulsive to some groups

G. Rationale Indicators
   1. The underlying theoretical or conceptual foundation, conceptual scheme, view of learning, or pedagogical approach serving as the basis for development and use of the material
   2. Important instructional design characteristics of the material
   3. Characteristics of the material pertinent to specific subject matter disciplines or special interest groups

H. Field Test or Usage Data
   1. The "anchor referents" designated for effectiveness data supplied for the film
   2. The "anchor referents" for which the film has been demonstrated to be an effective or ineffective instructional medium
   3. Sources of field test or usage data

I. Suggested Usage Indicators
   1. Suggested instructional uses or usage conditions
   2. Specific limitations of use
   3. Suggested student-teacher activities
   4. Suggested student evaluation or self-evaluation activities
   5. Special requirements to be met for effective use
      a. User competencies, qualifications, requirements
      b. User training requirements
      c. Use with other components of a series
   6. Suggested readings and materials

J. Curriculum Integration Indicators
   1. Curriculum integration or articulation suggestions
   2. Textbook correlation information
   3. Series information for packages or groups of materials
   4. Curriculum or instructional classification codes or descriptors (for classifying the subject matter content, the nature of related objectives, developmental tasks or instructional activities, etc.)[e]
   5. Indications of particular curricula for which the materials were prepared (e.g., BSCS Biology, PSSC Physics, etc.)

K. Accompanying Guides or Materials
   1. Teaching guides, curriculum integration aids, etc.
   2. Testing and evaluation questions or devices
   3. Teacher training materials
   4. Sources of published evaluations and reviews of the material
   5. Actual evaluations, reviews, or product effectiveness reports
   6. Subject matter content analyses
   7. Transcriptions of the commentary, dialogue, or script
   8. Shot-sequence listings or descriptions
   9. Supplemental readings or other supplementary materials

L. Vocabulary Level Indicators
   1. The intended and actual grade levels of the commentary and dialogue (preschool, lower elementary, fifth grade, high school, college, etc.)
   2. The readability or reading level of the commentary and dialogue, in relation to predefined levels or standards
   3. A listing of the prerequisite vocabulary needed for adequate comprehension
   4. Comparisons of words found in the commentary and dialogue, with standard lists.
   5. Lists of words sampled from the commentary and dialogue
   6. Complete transcriptions of the commentary and dialogue
   7. A subjective or objective analysis of the vocabulary level
   8. Word counts for new words, reused words, and sight words
   9. Quantified indices (ratings, frequency distributions, other statistical indices, etc.) of the vocabulary level

M. Comprehension Level Indicators
   1. The intended and actual audience levels (preschool, lower elementary, fifth grade, high school, college, etc.)
   2. Prerequisite knowledge, skills, experiences or abilities, etc., needed for effective comprehension
   3. Quantified indices (ratings, frequency distributions, etc.) of the comprehension level
   4. The language of the commentary and dialogue, if not English
   5. A subjective or objective analysis of the comprehension level
   6. Student attention data

N. Intellectual, Skill, and Affect Level Indicators
   1. Taxonomic classification levels, codes, or descriptors for content, objectives, or evaluation questions specified for the film
      a. Cognitive[f]
      b. Psychomotor or behavioral[g]
      c. Affective[h]
   2. Student response classification codes or descriptors which illustrate responses elicited from the film or from associated testing and evaluation questions[i]

O. Film Medium and Production: Format Characteristics
   1. The type of film (documentary, newsfilm, TV-cleared, cartoon, demonstration, fictional, silent, training, discovery-inquiry, etc.)
   2. Basic and supplementary film techniques used (animation, time-lapse, slow motion, aerial photography, stop-action, montage, etc.)

P. Physical Data
   1. Title
   2. Length (running time)
   3. Color or black/white
   4. Rental price
   5. Purchase price
   6. Place of origin (e.g., the country, if not the USA)
   7. Date (release, production, or copyright date)

[a] These "instructional anchor referents" are those defined on pp. 100-101.
[b] The four-step process defined by Baird (1973b, pp.
2, 7, 10) for film library selection appears to be applicable to the selection from film collections process as well. Baird's study suggested that different information elements may be useful for different steps of the film selection process.
[c] Refers to behavioral descriptors described by Wilson (1972, pp. 28-29).
[d] Refers to "things specifically related to the child's own immediate environment and milieu," suggested by Steiner (1972, p. 2090A).
[e] Refers to codes and descriptors provided for Computer Based Resource Units (CBRU) and those suggested by Heiss and Mischio's social learning curriculum model, described by Bond (1972b, pp. 116-18, 121-23).
[f] For example, classification schemes and categories described by Bloom and others (1956).
[g] For example, those described by Simpson (1966-67, pp. 110-44).
[h] For example, those described by Krathwohl and others (1964).
[i] For example, Heiss and Mischio described five basic student responses: labeling, detailing, inferring, predicting, and generalizing. These categories have been used to classify evaluated materials, as reported by Bond (1972b, pp. 52-54, 123).

Table 4.--Film quality indicators suggested for inclusion within film descriptions.

Definition: Film Quality Indicators
Film quality indicators are words, descriptors, phrases, statements, or other information cues provided in film descriptions, which assist selectors to judge the quality of instructional materials for specific instructional situations; i.e., to judge the potential excellence, value or effectiveness of the materials in terms of important "instructional anchor referents" listed below:
   1. Instructional purposes
   2. Subjects and topics
   3. Curricula
   4. Learners
   5. Learner characteristics
   6. Learning domains
   7. Learning conditions
   8. Educational environments
   9. Instructional viewpoints
   10. Types of film users[a]

A. Indicators to Assist Selection for Film Collections
   1.
To assist identification of available "high quality" films pertinent to user needs and interests
   2. To assist determination of which films to preview
   3. To assist evaluation of films
   4. To assist final selection of films

B. Indicators to Assist Selection From Film Collections[b]
   (Same as A above)

C. Ratings (scaled)
   Ratings made about specific film evaluation-selection criteria and factors. For example:
   1. Ratings pertaining to specific views and aspects of film quality
      a. Ratings of actual effectiveness (measured, observed)
      b. Ratings of potential effectiveness (predicted, claimed, estimated)
      c. Ratings of overall quality
   3. Published ratings
   4. Ratings supplied by important evaluation sources

D. Critical Appraisals
   Subjective appraisals, objective or critical analyses, or critical comments made in terms of specific film evaluation-selection factors and criteria. For example:
   1. Appraisals pertaining to specific views and aspects of film quality
   2. Appraisals supplied by important evaluation sources
   3. Appraisals of:
      a. Overall quality
      b. Strengths and weaknesses noted
      c. Comparative merits, when a given film is contrasted with other products
      d. Paradoxes and contradictions noted[c]
      e. Specific quality characteristics, attributes, or film effects noted
      f. The potential usefulness and effectiveness of accompanying guides or materials
      g. The trustworthiness (validity, reliability, credibility, etc.) of evaluative information supplied
      h. Published ratings or other evaluations made of a given film
   4. Criterion-specific reasons underlying why a specific rating or appraisal was made

E. Standard Comparisons
   Ratings or critical appraisals made in terms of predefined standards of film quality

F. Awards of Merit
   Awards of excellence won by a given film for meeting established product development standards or other standards of film quality

G.
Effectiveness Indices
   Indications of the actual effectiveness (observed or measured) of a given film for specific instructional purposes, users, learners, and learning circumstances, obtained from field testing or actual usage situations. Indications of the actual outcomes (observed or measured effects) resulting from use of a given film; or comparisons of actual outcomes made in relation to intended outcomes (predicted, expected, desired, or claimed effects)
   1. Effectiveness domains
      a. Cognitive: related to the learning of specific knowledge or intellectual skills; visual discriminations and identifications; or facts, concepts, principles, procedures, and rules
      b. Psychomotor-behavioral: related to the learning of specific manual skills, tasks, or behaviors
      c. Affective: related to the learning of particular feelings, attitudes, opinions, or other psychological-emotional states
   2. Measures of effectiveness
      a. Cognitive:
         (1) Achievement test scores
         (2) Pass/fail achievement results
         (3) Response error rates
      b. Psychomotor-behavioral:
         (1) Behavioral performance measures and test results
         (2) Aptitude test results
         (3) Student attention measures
         (4) Interaction analyses measures
         (5) Behavior acquisition, change, and extinction rates
      c. Affective:
         (1) Opinions and attitudes
         (2) User satisfactions (likes-dislikes, satisfactions-dissatisfactions, etc.)
         (3) User preferences (preferences-nonpreferences, interests or appeals, attractiveness-repellence, etc.)
         (4) Measures of other feelings, moods, motivations, or psychological-emotional states
      d. Inter-domainal:
         (1) Standardized test results
         (2) Norm-referenced and criterion-referenced measures
         (3) Self-report measures
         (4) Retention measures (retention of skills, behaviors, knowledge, or affects learned)
         (5) Baseline comparisons
         (6) Experimental research comparisons
             (a) Experimental vs. control group
             (b) Single shot vs. replicated assessments
   3.
Evidence of effectiveness
      Indications of considerations below, addressed by field test tryout reports, actual usage reports, and claims or testimonials made about a given film by producers or evaluators of the product:
      a. Actual outcomes, both positive (desirable) and negative (undesirable)
      b. Expected, predicted, desired outcomes vs. actual outcomes
      c. Intended vs. unintended outcomes
      d. Concomitant learnings found to occur when the product is used
      e. Related cost-efficiency, cost-effectiveness, and cost-benefit considerations
      f. Methods used to collect, analyze, and interpret measures of effectiveness obtained
   4. Types of indices
      Related ratings, critical appraisals, standard comparisons, awards of merit, and objective indices
      Objective indices
      a. Ratings
      b. Gain scores (pre/post-test changes)
      c. Efficiency indices (assessments of speed or rate of learning)
      d. Statistical indices
         (1) Frequency distributions (quantity per factor or variable isolated)
         (2) Ratios, percentages, averages, medians, etc.
         (3) Correlations (rank or product-moment)
         (4) Measures of significant differences
      e. Graphic indices (charted, graphed, profiled)

H. Effectiveness Estimates: Indices of Potential Effectiveness
   Predictions or estimates of the potential value, excellence, or effectiveness of a given film. For example, predictions or estimates of:
   1. The types of instructional purposes, learners, and users likely to benefit most or least from use of a given film
   2. The potential of the film to achieve its intended purpose, for intended learners, for specific types of users and learning circumstances
   3. Specific responses of learners or learning outcomes (cognitive, psychomotor-behavioral, affective) likely to result from use of the films, whether desirable or undesirable
   4. Related cost-efficiency, cost-effectiveness, or cost-benefit considerations
   5.
The degree to which a given film exhibits characteristics consistent with claims for it (face validity)
   Types of Indices: Related ratings, critical appraisals, objective indices, standard comparisons, and awards of merit

I. Technical Quality Indicators
   Indications of judgments made about the professional caliber of the technical characteristics of a given film, under the control of the film designer and production crew, such as:
   1. Visual quality: The clarity of focus; color and contrast control; color vs. black/white factors; proper film exposure; photographic composition, camera work, acting and editing techniques; and the choice and nature of settings and visual aids used
   2. Audio quality: The clarity, intelligibility, and pacing of the sound; the nature, tone, and style of the dialogue and narration; and the appropriateness of background music and sound effects
   3. Audio-visual synchronization: The use of complementary (mutually supportive, mutually relevant) audio-visual stimuli; use of nonconflicting information channels; overall synchronization
   4. Overall technical quality: Overall audio-visual quality; overall use of the unique characteristics and advantages of the film medium.
   Types of Indices: Related ratings, critical appraisals, objective indices, standard comparisons, and awards of merit

J. Instructional Design Quality Indicators
   Indications of judgments made about the following filmic factors which influence learning from films (factors described on pages 57-69)
   1. Purpose-objective factors
   2. Vocabulary level
   3. Comprehension level and pacing
      a. Organizational structure and sequence
      b. Programing
      c. Audio-visual synchronization
      d. Information load
   4. Learner-audience involvement factors
   Types of Indices: Related ratings, critical appraisals, objective indices, standard comparisons, and awards of merit

K.
Inherent Quality Attributes
   Filmic characteristics which imply that a given film exhibits inherent value, excellence or worth, including technical quality, instructional design quality, and other film quality characteristics described on pages 47-69
   Types of Indices: Related ratings, critical appraisals, objective indices, standard comparisons, and awards of merit

L. Evaluations Supplied From Specific Sources
   Assessments of film quality provided by persons or groups having specific types of expertise or credentials; for example, evaluations supplied from:
      Film producers and distributors
      Evaluation panels
      Professional groups
      The serial literature
      Miscellaneous reviewing agencies (EFLA, Landers, etc.)
      Individual evaluators or reviewers (trained and untrained)
      Subject matter experts
      Teachers, instructors, faculty
      Learners: students, pupils, trainees
      Media production specialists
      Learning specialists (educators, psychologists)
      Film librarians
      Administrators
      Lay persons
      Parents
      Others

M. Indirect Quality Indicators
   Information elements which provide secondary quality cues or information credibility cues, which are not directly observable as film characteristics; for example:
   1. Film sources (producer, distributor, sponsor, author)
   2. Dates (release, production, or copyright date)
   3. Production credits (creative consultants, credentials of film source, etc.)
   4. Sources of evaluation data
   5. Sources of effectiveness data
   6. The number of copies of a particular film which have been circulated
   7. Indications of developmental/revision procedures used to produce a given film

[a] These "instructional anchor referents" are those defined on pages 100-101.
[b] The four-step process defined by Baird (1973b, pp. 2, 7, 10) for film library selection applies here as well, to the selection of films from film collections.
[c] Refers to "paradoxes and contradictions" noted by Eash (1974, p.
38), regarding strengths and weaknesses found when one analyzes the instructional design characteristics of materials described by Eash (1969, pp. 18-24) in terms of internal consistency, unity, and face validity considerations.
[d] The contemporariness or datedness of a given film can often be inferred from production, copyright, or release dates supplied in film descriptions.

Critique

The related literature and research was reviewed in terms of the major focus of this study: investigation of the influence of relevance and quality information cues upon the film selection process, from a behavioral viewpoint of the selection process. Following is a brief discussion of the state-of-the-art of the related research and literature. Thereafter, a summary of important questions raised by the review is provided, dealing with needed research and evaluation efforts. Answers are needed to the questions posed, to develop a behavioral theory of film description design, a theory based upon the influence of relevance and film quality information cues.

The State-of-the-Art of the Related Research and Literature

The related research and literature dealt with the theory and practice of several basic processes, namely, film evaluation, film selection, and the design and evaluation of film descriptions. Related subject-topic emphases and gaps noted within the literature are summarized below.

Literature emphases.--Overall, the related literature basically discussed assumptions and opinions about subjects and topics treated, rather than related empirical, research-oriented evidence and findings. Although a number of significant studies were reported, by far, opinion-oriented discussions dominated the literature.
In addition to subjects and topics emphasized by this review, other important topics which have been discussed considerably in the literature to date were these: general problems and difficulties of the evaluation-selection process; people who should be involved in the evaluation-selection process, and the training and experience required of them; the availability and use of published reviews; and materials evaluation-selection methods and procedures (Baird, 1973a, pp. 22-30).

General gaps within the literature.--Overall, eight significant gaps were noted:

1. Only a few empirical investigations and research studies closely related to this study were reported.
2. Relatively few writings made any distinction between relevance and quality considerations.
3. No conceptual models were described of the film evaluation-selection process, i.e., cognitive, decision-making, perceptual, or behavioral models.
4. Critical discussions were not abundantly offered. In general, the literature dealt with many theoretical and practical evaluation-selection considerations, but most were not critically examined or questioned.
5. Relatively few writings provided useful state-of-the-art assessments, i.e., information summaries dealing with what is known, assumed, and unknown about subjects or topics treated.
6. Most selection-oriented writings did not make distinctions between the selection for and from film collections processes.
7. No comprehensive theory was defined or described by any author, of the film evaluation-selection process.
8. No comprehensive discussion was found of the relevance or film quality judgment processes per se, as related to the film evaluation-selection process.

Needed Research and Evaluation Efforts: Critical Questions

The review of related research and literature raised many questions worthy of additional attention.
The critical questions noted below address subjects and topics which have been treated little, or not at all, within the literature to date.

Film selection theory.--

1. In what ways can the film selection process be conceptualized and modeled in behavioral, empirical terms, using notions of relevance and film quality as the basis of definition?
2. What kinds of judgments, cognitive comparisons, do selectors make during the film selection process?
3. What kinds of measurable responses are representative of the types of film selection judgments made by selectors?
4. What cognitive processes and decisions occur when film selectors make relevance, quality, or other film selection judgments?
5. What types of relevance and film quality information cues help what particular groups of selectors to make what kinds of film selection decisions?
6. Are "instructional pertinence" and "film quality" significantly different parameters which can be assessed and measured independently, or are they really interdependent aspects of the same parameter, "relevance"?
7. Is the selection of films from film collections a multi-stage selection process similar to the four-step process identified by Baird?[11]
8. What environmental factors dealing with relevance and quality assessment considerations need to be controlled, to improve the film selection process?

Film selection practice.--

1. What film quality and relevance assessment controls are used by film libraries to develop and maintain satisfactory film collections?
2. What kinds of instructional pertinence and film quality indicators are considered in film evaluation-selection forms and instruments used by different types of film libraries?
3. To what extent are sources of information other than film library catalogs actually used by selectors as sources of relevance and film quality information; and for what particular selection purposes are they used?

Film description design and evaluation: theory.--

1.
What can be learned from the literature and research dealing with (a) perception theory; (b) the theory and practice of information science, especially the literature dealing with relevance assessment, documentation, abstracting and evaluation considerations; and (c) product development and evaluation theory, to improve the design and evaluation of film descriptions?
2. What kinds of behavioral assessment methods can be used to evaluate the effectiveness of different film description styles?
3. What specific measures, baseline measurements, and indices of selector response can be used to determine the degree to which different film description styles elicit desirable, intended selection judgments?
4. What tests of significance, statistical or otherwise, can be used to define the strengths and nonstrengths of different types of film descriptions, in behavioral terms?
5. What kinds of relevance and quality indicators should be included in a given film description style, to assure that it fulfills the information demands and requirements of the particular selectors for which it was designed?
6. Can a general purpose film description style be designed and used effectively for multiple selection purposes, or are different film description styles needed to provide an adequate array of relevance and film quality indicators for different purposes and target selector groups?
7. What characteristics and attributes define a "good" film description style for a given target selector group, in terms of the types of relevance and film quality indicators which can be provided in film descriptions?
8. What is the underlying rationale, defined in both theoretical and practical terms, and in terms of film quality and relevance considerations, behind the design of various styles of film descriptions which have been produced to date?
9.
What are the simplest and most effective ways possible to provide important instructional pertinence indicators and film quality indicators in film descriptions? 10. What kinds of instructional pertinence indicators and film quality indicators should be included in film descriptions designed to aid the selection of films fygm_film collections? 11. Can a film description style be designed to enable selectors to accurately and reliably predict the actual effective- ness of films for particular educational purposes, to preclude the necessity of having to preview a film before actually using it? 12. How can professional talent, professional organizations, and computer technology be integrated into a productive working relationship to develop more effective research methods, methods needed to improve the design and evaluation of film descriptions? 13. What guidelines can be provided to help develop useful standards for the design and evaluation of film descriptions? Film description design and evaluation: practice.-- 1. What is the actual incidence of instructional pertinence and film quality indicators within currently available film des- cription styles? 2. What methods, and particularly what guidelines, instru- ments, and forms, have been developed to date, to assist the design, preparation, and evaluation of film descriptions? 3. What kinds of film description styles actually influence the film selection process, and for what selection purposes and target selector groups? 121 4. To what degree do presently available film description styles enable specific selector groups to make accurate, reliable predictions of the actual effectiveness of a given film? 5. How reliable are currently available methods used to design, prepare, and evaluate film descriptions? Summary This study investigated the influence of quality and rele- vance information cues upon the selection of instructional films. 
The related research and literature was reviewed to identify documents useful in defining the theoretical and conceptual foundations of the study. The review focused attention upon four important topics:

1. Factors which influence the making of film quality and relevance judgments.
2. The importance of film descriptions and the types of information cues provided within them, to the film evaluation-selection process.
3. Types of evaluation-selection criteria associated with basic views of film quality, criteria assumed to be important to mention within film descriptions.
4. The characteristic nature of quality and relevance information cues exhibited by existing film descriptions.

General conclusions drawn from the review were provided for each of the four topics above. As well, a critique of the state-of-the-art of the related research and literature was provided, including recommendations for needed research and evaluation efforts.

Emphasis was placed by this review upon:

1. Identification of the kinds of relevance and film quality information cues which can be supplied potentially, within film descriptions, as instructional pertinence indicators (IPI's) and film quality indicators (FQI's);
2. Definition of underlying assumptions about the nature and characteristics of IPI's and FQI's; and
3. Identification of research and evaluation efforts needed to develop a behavioral theory of film description design, a theory based upon the influence of relevance and quality information cues upon the film selection process.

The following critical factors were suggested by the review as influencing judgments made about the relevance and quality of instructional films:

1. The characteristics of a given film.
2. The characteristics of film descriptions.
3. People, especially:
   a. Their basic viewpoint about the film evaluation-selection process;
   b. The types of evaluation-selection criteria which they consider to be meaningful; and
   c. The perceived value, compatibility, and credibility of information considered by them during the judgment process.
4. Information requests and request statements.
5. Judgment conditions and circumstances, especially:
   a. Selection purposes considered, i.e., whether a given selection is made (a) for or from a film collection and (b) for identification, determination, evaluation, or final selection purposes.
   b. Definitions of relevance and quality considered.
   c. Anchor referents used as the frame of reference for judgments made.
   d. Types of appraisal methods used.
   e. The purpose for which and the environmental context within which a given film is to be used.

The types of IPI's and FQI's identified from the review were defined in terms of ten "instructional anchor referents" assumed to serve as the frame of reference for relevance judgments, quality judgments, and selection decisions made by film selectors. The instructional anchor referents identified from the review were the following:

1. Subjects and topics.
2. Instructional purposes.
3. Types of curricula.
4. Types of learners.
5. Learner characteristics.
6. Learning domains.
7. Learning conditions.
8. Educational environments.
9. Instructional viewpoints.
10. Types of film users.

The types of IPI's suggested for inclusion within film descriptions were the following; namely, indications of:

1. Instructional purposes or objectives.
2. Subject matter emphases.
3. Other content emphases.
4. Intended audiences and users.
5. Target population slant.
6. Rationale (theoretical or conceptual basis).
7. Vocabulary level.
8. Comprehension level.
9. Intellectual, skill, or affect level.
10. Film medium characteristics.
11. Physical characteristics.

The types of FQI's suggested for inclusion within film descriptions were these:

1. Ratings.
2. Critical appraisals.
3. Standard comparisons.
4. Awards of merit.
5. Effectiveness indices.
   a. Estimates of predicted or potential effectiveness.
   b. Indications of actual effectiveness.
6. Inherent quality attributes.
   a. Technical quality indicators.
   b. Instructional design quality indicators.
   c. Other.
7. Evaluations supplied from specific sources.
8. Indirect quality indicators.

Footnotes--Chapter II

¹ This report is a 12-page summary of Baird's doctoral dissertation (Baird, 1973a).

² This article is basically excerpted from pp. 1-161 of Saracevic's doctoral dissertation (Saracevic, 1970a).

³ The term "document" refers here to primary, print-form information sources such as books or articles, in fashion similar to the general definition of the term commonly used in the field of information science. In the broader sense of the term, films are considered to be "documents" for this study, since they are primary sources of nonprint information.

⁴ This generalization refers to the four stages of selection identified by Baird (1973b, pp. 2, 7, 10).

⁵ Refers to taxonomic levels of the cognitive, psychomotor, and affective domains exhibited by films.

⁶ Refers to Eash's notion of "anchor referents," in Eash (1969, p. 18).

⁷ Specific examples of these types of IPI's are illustrated in Table 3, pp. 104-108.

⁸ Specific examples of these types of FQI's are illustrated in Table 4, pp. 109-114.

⁹ The notion of "instructional anchor referents" was derived from Eash (1969, p. 18), who stressed the importance of designating the "sample pupil population" which serves as the "anchor referent" for evaluations made by instructional materials evaluators.

¹⁰ IPI's and FQI's "suggested" from the review of related literature and research are those implied from the review, as concluded by the investigator. No documents reviewed for this study directly addressed the existence of IPI's or FQI's. The notion of the existence of IPI's and FQI's was postulated, created for this study, by the investigator.

¹¹ Refers to the four stages of selection identified by Baird (1973b, pp. 2, 7, 10).

CHAPTER III

EXPERIMENTAL DESIGN AND METHODOLOGY

This chapter describes the experimental phase of the study.
These topics are addressed: the purpose of the experimental phase; the experimental method and design; and related measurement methods, assumptions, and limitations.

Purpose

The experimental phase of the study was designed to obtain empirical evidence of:

1. The validity and reliability of the model of film selection, and
2. The utility of the model for evaluating the effectiveness of the experimental film description styles.

The Experimental Method

An exploratory, fixed effects experiment was designed to compare the similarity of film selection judgments elicited from (a) three specially designed film description styles (abstracts), with (b) corresponding judgments elicited from the films described by the descriptions.

Independent Variables

Three independent variables were investigated: (a) treatments (the types of decision-making information provided: filmic or abstracted); (b) stimulus conditions (the set of circumstances posed to elicit the judgments); and (c) stimulus films (two specific experimental films).

Dependent Variables

Three dependent variables were measured:

1. Relevance (R), defined as the degree to which a given experimental film was judged to be "logically related, pertinent to" a given instructional situation posed in the stimulus instruments used by the experimental subjects.
2. Film quality (Q), defined as the degree to which a given experimental film was judged overall to be from "poor" to "excellent" in quality for use with fifth grade elementary school pupils. Subjects were required to interpret the meaning of "quality" using criteria and factors deemed to be uniquely important to them.
3. Betterness (B), defined as the degree to which a given subject felt that one of the experimental films was "better" to use than the other, for a particular instructional situation.
The hypothetical instructional situations used to obtain measures of the dependent variables are illustrated in the rating instruments used by the experimental subjects (Film Questionnaire, Appendix B2; Abstract Questionnaire, Appendix B1).

Subjects

Sixty-four elementary school teachers served as subjects for the experiment. The teachers were drawn primarily from the southern portion of the State of New Mexico. Most of the subjects were elementary school teachers in the Las Cruces, New Mexico, area and smaller communities within the state. They were selected arbitrarily (rather than randomly), primarily from teachers enrolled in summer session classes at New Mexico State University during the summer of 1973.

All of the subjects were volunteer participants. Each was required to have one or more years of elementary school teaching experience to participate within the experiment. Ninety-one percent of the subjects were females. Only six men participated as subjects.

Treatment Groups

Four groups were used, referred to as treatment groups I (T1), II (T2), III (T3), and IV (T4), respectively. Subjects were randomly assigned to the groups, sixteen per group.

Treatment groups I, II, and III used different film description styles as experimental stimulus materials during the first session of the experimental phase of the study, whereas treatment group IV used the corresponding films as stimulus materials during the first session. Later, each of the treatment groups used the experimental films as stimulus materials during a second experimental session.

Stimulus Materials

Three types of stimulus materials were used: the experimental films, the experimental abstracts, and a pair of parallel instruments which requested responses of the subjects.

The experimental films.--Two films were selected from a regional film library to meet four basic criteria. The films were to be:

1. Similar in purpose, i.e., usable for a common instructional intent or objective.
2. Dissimilar in overall quality.
3. Suitable for use with fifth grade elementary school pupils.
4. Similar in terms of running time (length) and color.

The two films selected treated the subject "water," and were arbitrarily designated as film one and film two, respectively. Film one was basically a science-oriented film which explored the physical properties and characteristics of water. Film two was a social-studies-oriented documentary which illustrated and explained how water is obtained from its natural sources, purified, and transported to cities for various uses. The two films are described in detail in the experimental abstracts (Appendix C).

The quality of the experimental films was assessed through use of the Film Quality Rating Instrument (Appendix A1) by a sixteen-member film evaluation panel. The panel consisted of four each of elementary school teachers, elementary education professors, media specialists, and subject matter specialists. The panel members were chosen from persons recommended by College of Education faculty and science department heads at New Mexico State University. The members were selected on the basis of demonstrated skill in evaluating instructional films. The four types of panel members were selected so that different basic viewpoints would be represented by the composite film quality ratings obtained from the panel.

The mean quality ratings obtained from the film evaluation panel indicated that film one tended to be rated as "good," and film two as "average." The means were found to be significantly different at the p < .05 level, based upon a t test for correlated sample means (Ferguson, 1966, pp. 169-71).

Indices of intersubject agreement were computed for the quality ratings obtained, using a split-half reliability assessment method. Pearson product-moment correlations were used to obtain the indices.

The experimental abstracts.--All of the experimental abstracts were designed to meet six essential criteria.
The abstracts:

1. Were similar in terms of basic content, format, and arrangement.
2. Contained the same general types of instructional pertinence indicators (IPI's).
3. Contained an array of IPI's exhibiting content and face validity.
4. Contained IPI's which were relatively neutral and objective in tone, providing no direct indications of the quality of a given film.
5. Were similar in length, from three-fourths to one page in length.
6. Were readable by elementary school teachers. The abstracts did not contain any vague, ambiguous, unfamiliar terms or phrases, especially technical ones.

The abstracts differed, however, as follows:

Type I abstracts contained no overall ratings of film quality.

Type II abstracts contained "valid" overall ratings of film quality as assessed by the sixteen-member panel of film evaluators. Film one was rated "good," film two as "average," for grades 4-6.

Type III abstracts contained "invalid" overall ratings of film quality, ratings diametrically opposed and somewhat more dissimilar than the ratings added to the type II abstracts. Film one was rated "fair," film two as "very good," for grades 4-6.

Treatment groups I, II, and III used the type I, II, and III abstract styles respectively, as stimulus materials during the first experimental session. The type I, II, and III abstract styles are illustrated in Appendices C1, C2, and C3, respectively.

The type II and III abstract styles also contained the following footnote regarding the quality ratings supplied within them: "Rating scale: Very poor, Poor, Fair, Average, Good, Very Good, Superior. Ratings determined by a panel of elementary teachers, elementary education professors, subject matter specialists, and audio-visual media specialists."

The quality ratings supplied for grades 4-6 for the type II abstracts varied one unit, and for the type III abstracts, three units, in relation to the seven-unit rating scale which was used.
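The one-unit and three-unit differences just stated can be checked with a short sketch. It assumes that the seven labels quoted in the abstract footnote are ordered, equally spaced scale positions; the names RATING_SCALE, GRADES_4_6, and unit_difference are illustrative only and do not come from the study.

```python
# Seven-unit rating scale as listed in the abstract footnote, lowest to highest.
RATING_SCALE = ["Very poor", "Poor", "Fair", "Average", "Good", "Very Good", "Superior"]


def unit_difference(rating_a, rating_b):
    """Return the number of scale units separating two ratings."""
    return abs(RATING_SCALE.index(rating_a) - RATING_SCALE.index(rating_b))


# Grades 4-6 ratings for (film one, film two) in each rated abstract style.
GRADES_4_6 = {
    "Type II": ("Good", "Average"),     # "valid" ratings from the panel
    "Type III": ("Fair", "Very Good"),  # "invalid", reversed ratings
}

for style, (film_one, film_two) in GRADES_4_6.items():
    print(style, unit_difference(film_one, film_two))
```

Under that assumption the type II ratings differ by one unit and the type III ratings by three, in agreement with the text.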
A second film quality rating was also supplied in the type II and III abstract styles, as noted below:

Abstract     Film One                      Film Two
Style        Quality Ratings               Quality Ratings

Type II      Grades 4-6: Good              Grades K-3: Poor-Fair
             Grades 7-9: Average-Good      Grades 4-6: Average

Type III     Grades 4-6: Fair              Grades K-3: Good
             Grades 7-9: Fair-Average      Grades 4-6: Very Good

The second rating was provided to suggest potential limitations of use of the experimental films for different audiences, and to subdue attention to the quality rating supplied for the films for fifth grade level pupils.

Overall ratings of excellence were used as film quality indicators in the experimental abstracts, for the sake of simplicity. This approach was used to allow more space to be used within the film descriptions for the provision of IPI's.

The following information elements were provided in the experimental abstracts as IPI's:

1. Physical data.
2. Audience level.
3. A list of subject areas treated.
4. A list of subject topics treated.
5. A one-paragraph-length synopsis of the subject matter content emphasis.
6. A brief vocabulary-comprehension level analysis.
7. Suggested uses.
8. Target population slant indicators.

Criteria used to define the types of IPI's supplied by the experimental abstracts are specified in "The Form for Assessing the Content and Face Validity of Film Abstracts" (Appendix A2) developed for this study. The criteria were selected for use because they were commonly suggested as being important selection criteria during the literature review.

Instruments.--Two parallel instruments were used by the experimental subjects, the "Abstract Questionnaire" (Appendix B1) and the "Film Questionnaire" (Appendix B2).

The questionnaires were similar in that all requested the subjects to provide the four relevance ratings and the betterness ratings desired for the study.
The cover sheets provided a self-read set of directions for the subjects which explained how they were to participate within the experiment. As well, each instrument requested the subjects to indicate if they had ever seen film one or film two prior to the study. Also, a one-page "Elementary Teacher Survey" form (Appendix B3) was appended to the instruments. The survey form requested background information about the training, experience, teaching preferences, and working environment of the subjects.

The questionnaires were different in that:

1. Those used during session one for the T2 and T3 treatment groups did not contain film quality rating requests and related subjective comments requests, while all other questionnaires did.
2. The order of presentation of questions pertaining to the seven experimental stimulus conditions was randomized.
3. The order of presentation of questions pertaining to film one and film two was both counterbalanced and randomized. In 50 percent of the instruments the questions addressed film one first and film two second; in the remainder, the order was reversed.
4. The order of presentation of the pair of experimental abstracts was alternated from instrument to instrument.

The instruments were field-tested with seven members of the film evaluation panel on several occasions, before being used with the experimental subjects. This procedure was used to identify revisions necessary to simplify the instruments, to assure that they conveyed intended meanings, and to rectify unanticipated problems associated with their use. The instruments were modified accordingly, based upon feedback obtained from the panel members.

Procedure

Sixty-four available subjects were identified. Most were provided with a copy of the "Elementary Teacher Survey Form" approximately one to two weeks prior to participation in the experiment.
The completed forms were used to identify, and counterbalance across treatment groups, teachers with strong science or social studies teaching preferences, and teachers with other characteristics considered to be potential experimental nuisance variables.

Session one.--Sixteen subjects were randomly selected as members of treatment group IV, and scheduled across a two-day midweek period to view the experimental films. (During the randomization procedure a few counterbalancing adjustments were made, however, involving several subjects.) Most of the subjects were scheduled to participate in session one at times convenient to them, in groups of two to five subjects, in one of six treatment periods. The periods were held throughout the day from early morning through late afternoon.

The "Film Questionnaires" used as stimulus materials were randomly sorted into two piles: those requiring the use of film one first, and those requiring the use of film two first. The questionnaires were then passed out alternately from each pile, as subjects came along.

When all subjects scheduled for a given period were present and had completed reading the set of directions accompanying the questionnaires, the first film was shown. Approximately ten minutes later, after all subjects had completed responses associated with the first film, the second film was shown. Thereafter the subjects completed responding to the questionnaires.

The use of film one or film two as the first film shown was alternated, every other treatment period. And in most cases the experimental films were viewed in the same room, a room commonly used for film viewing purposes.

The following week the remaining forty-eight volunteer subjects were randomly assigned (and partially counterbalanced) to treatment groups I, II, and III. The subjects met the investigator individually or in small groups throughout the day, also across a two-day midweek period.
The subjects met at times convenient to them in a prearranged room in the New Mexico State University Library. Each of these subjects used the Abstract Questionnaires as experimental stimulus materials.

Session two.--Session two was held approximately two weeks after participation in session one, for all subjects. Most of the subjects participated in session two at times most convenient to them. This was achieved by providing session one subjects with a schedule indicating the time and place of session two film showings, which were scheduled every hour on the hour from early morning through early evening, during the middle of the week.

During session two, a "Film Questionnaire" was distributed to each subject, followed by presentation of the experimental films and the recording of subject responses. Subject participation in session two was directed in the same manner as participation of the treatment group IV subjects during session one. The order of presentation of the stimulus films was also alternated for each group of subjects.

The films were shown by necessity in two different locations, one of which was the same room used by the treatment group IV subjects during session one. Both rooms were good film viewing rooms, however (free from noise, good lighting control and projection screens, etc.).

The films were usually viewed in groups of three to six subjects, although the number of viewers present per showing ranged from one to eight subjects.

Sessions one and two.--The investigator served as the proctor for all experimental sessions, with all subjects. At all sessions conversation with the subjects was minimal. The questionnaires were used as the primary vehicle for explaining and directing subject participation within the experiment.

At all sessions, the subjects were asked to refrain from talking to others about the experiment until after all of the session two periods were completed.
Most (90 percent or more) of the film showings were held in the two film viewing rooms used for session two. These rooms were located on the New Mexico State University campus. All other showings were held in other suitable locations.

The Experimental Design

Design Schematics

The basic experimental design used was the repeated measures design illustrated in Figure 4. Ratings and subjective comments obtained from the eight treatment-session groups depicted in Figure 4 were compared to test the experimental hypotheses and critical assumptions of the study. The eight groups are referred to as the t1, t2, t3, t4 and t1', t2', t3', t4' treatment-session groups, for session one and two, respectively.

Treatment                  Session    Session
Group        Subjects      One        Two

T1           S1-S16        t1         t1'
T2           S17-S32       t2         t2'
T3           S33-S48       t3         t3'
T4           S49-S64       t4         t4'

Figure 4.--The units of analysis: the eight treatment-session groups.

The overall design of the experimental phase of the study is modeled in Figure 5. Figure 5 shows the seven basic ratings which were usually obtained for each treatment-session group: the four relevance ratings, a pair of quality ratings, and a betterness rating. Only five ratings were obtained for treatment groups II and III during session one, however, because quality ratings were not requested or obtained, as indicated by the four session one cells blocked in with X's in Figure 5.¹

Individual experimental designs for the relevance, quality, and betterness measures, excerpted from Figure 5, are illustrated in Figures K1, K2, and K3 (Appendix K), respectively.
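As a rough illustration of the repeated measures layout described above, the sketch below builds the eight treatment-session groups and the number of ratings requested in each. The subject ranges and rating counts are taken from the text and Figure 4; the function and variable names (subject_range, ratings_requested, cells) are hypothetical and are not part of the study.

```python
from itertools import product

GROUPS = ["T1", "T2", "T3", "T4"]
SUBJECTS_PER_GROUP = 16


def subject_range(group_index):
    """Return (first, last) subject numbers for a zero-based group index."""
    first = group_index * SUBJECTS_PER_GROUP + 1
    return first, first + SUBJECTS_PER_GROUP - 1


def ratings_requested(group, session):
    """Seven ratings per cell; T2 and T3 gave no quality ratings in session one."""
    relevance, quality, betterness = 4, 2, 1
    if session == 1 and group in ("T2", "T3"):
        quality = 0
    return relevance + quality + betterness


# Keys follow the text's labels: t1..t4 for session one, t1'..t4' for session two.
cells = {}
for (index, group), session in product(enumerate(GROUPS), (1, 2)):
    label = f"t{index + 1}" + ("'" if session == 2 else "")
    cells[label] = {
        "group": group,
        "session": session,
        "subjects": subject_range(index),
        "ratings": ratings_requested(group, session),
    }

for label, cell in cells.items():
    print(label, cell)
```

Because every subject appears in both a session one and a session two cell, each row of Figure 4 is a within-subject (repeated measures) comparison.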
The Stimulus Conditions

Seven specific stimulus conditions were employed, as illustrated in Figure 5, referred to as the F1, F2, O1F1, O2F2, O3F1, O3F2, and O3(F1F2) stimulus conditions. For the F1 and F2 stimulus conditions ratings were obtained about film one and film two, respectively. For the remaining stimulus conditions, ratings were obtained about film-objective combinations; i.e., for film one and film two in relation to specific units of study referred to as the O1, O2, and O3 objectives, respectively. For example, for the O1F1 condition F1 was rated in relation to the O1 objective; for the O3F2 condition F2 was rated in relation to the O3 objective; and likewise for the O2F2 and O3F1 conditions. For the O3(F1F2) condition both film one and film two were considered in relation to the O3 objective.

[Figure 5.--The overall design of the experimental phase of the study: sessions, dependent measures, and stimulus conditions for each treatment group. Legend: F1 = film one; F2 = film two; O1 = objective one; O2 = objective two; O3 = objective three; R = relevance measure; Q = quality measure; B = betterness measure.]

The F1 and F2 stimulus conditions were used to elicit quality ratings from the experimental subjects. The O1F1, O2F2, O3F1, and O3F2 stimulus conditions were used to elicit the relevance ratings. The O3(F1F2) stimulus condition was used to elicit the betterness ratings.

It is important to remember that the notations "F1" and "F2," when used as stimulus condition designations, refer to ratings made about F1 and F2 respectively, whether film one or film two or descriptions of them were used to elicit the ratings obtained. In other words, the F1 and F2 designations do not imply that ratings were elicited from the use of F1 or F2 respectively, as stimulus materials.
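The seven stimulus conditions and the measure each was used to elicit can be summarized as a small lookup table. This is only a sketch of the mapping stated above; the dictionary layout and the helper conditions_for are assumptions made for illustration.

```python
# Each stimulus condition: the film(s) rated, the objective (unit of study)
# they were rated against, and the dependent measure the condition elicited.
STIMULUS_CONDITIONS = {
    "F1":       {"films": ("F1",),      "objective": None, "measure": "quality"},
    "F2":       {"films": ("F2",),      "objective": None, "measure": "quality"},
    "O1F1":     {"films": ("F1",),      "objective": "O1", "measure": "relevance"},
    "O2F2":     {"films": ("F2",),      "objective": "O2", "measure": "relevance"},
    "O3F1":     {"films": ("F1",),      "objective": "O3", "measure": "relevance"},
    "O3F2":     {"films": ("F2",),      "objective": "O3", "measure": "relevance"},
    "O3(F1F2)": {"films": ("F1", "F2"), "objective": "O3", "measure": "betterness"},
}


def conditions_for(measure):
    """List the stimulus conditions used to elicit a given dependent measure."""
    return [name for name, spec in STIMULUS_CONDITIONS.items()
            if spec["measure"] == measure]


for measure in ("quality", "relevance", "betterness"):
    print(measure, conditions_for(measure))
```

The table makes the count of seven basic ratings explicit: two quality, four relevance, and one betterness condition.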
The O1F1 stimulus condition was designed to yield a high relevance rating on the average. This condition was contrived by designating as the O1 objective a unit of study demonstrated to be very pertinent to film one.

The O2F2 stimulus condition was also designed to yield a high relevance rating on the average. This condition was contrived by designating as the O2 objective a unit of study demonstrated to be very pertinent to film two.

The O3F1 and O3F2 stimulus conditions were also designed to yield high and relatively equivalent relevance ratings, on the average. These conditions were contrived by designating as the O3 objective a unit of study demonstrated to be pertinent to both F1 and F2.

The particular rating circumstances associated with each specific stimulus condition are illustrated in the rating instruments used by the experimental subjects (Film Questionnaire, Appendix B2; Abstract Questionnaire, Appendix B1).

Experimental Units: Units of Analysis

Subjects served as experimental units for the experimental phase of the study. Data analysis comparisons made of ratings and subjective comments obtained from the eight treatment-session groups depicted in Figure 4 included only comparisons within treatments (measures of intrasubject agreement) and within sessions (measures of intersubject agreement). Diagonal comparisons (across sessions and treatment groups, both) were not of interest to this study, because interpretation of the factors which may have influenced the experimental results associated with such comparisons was confounded, indiscernible.

General Measurement Methods: The Manifest Level of the Film Selection Model

Both objective and subjective measurement methods were used.

Rating Scales

The rating scales used to manifest the relevance, quality, and betterness judgments elicited from use of the model are illustrated in Figure 6, and in the stimulus instruments (Abstract Questionnaire, Appendix B1; Film Questionnaire, Appendix B2).
Anticipated Rating Patterns

Described next are the patterns and combinations of ratings expected to be illustrated by responses obtained from selectors, when forced-choice selection judgments are made involving comparisons of pairs of films. The rating patterns can be expected whether the selection judgments are elicited from films or film descriptions.

Note that when the model is used to compare judgments made about a given pair of films, one of the stimulus films must be designated as film one (F1) and the other as film two (F2). (This decision is an arbitrary one; either film can be designated as film one or two.)

[Figure 6.--Rating scales used with the film selection model: the relevance scale, the quality scale, and the betterness scale.]

[Figure 7.--Possible combinations of relevance, quality, and betterness ratings for forced-choice selection decisions involving pairs of films. Legend: F1 = film one; F2 = film two; r1 = relevance rating for F1; r2 = relevance rating for F2; q1 = quality rating for F1; q2 = quality rating for F2.]

Possible rating patterns.--The twenty-seven "blocks" within the three-dimensional matrix depicted in Figure 7 represent the total range of possible response patterns which can occur; combinations of the three basic types of rating results obtainable from a given rater, for each of the three measures assessed. For example, the lower-right-hand-corner block of the matrix illustrates this response pattern: when relevance and quality ratings for film one (F1) are both found to be greater than those for film two (F2) respectively, with F1 rated as better than F2 (F1 > F2) for the particular forced-choice stimulus condition posed.
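The count of twenty-seven blocks follows from crossing three possible outcomes on each of the three measures. A minimal sketch of that enumeration is given below; the outcome labels, including "ND" for the "no difference" betterness option, are shorthand assumed here for illustration.

```python
from itertools import product

RELATIONS = (">", "=", "<")      # film one's rating compared with film two's
BETTERNESS = ("F1", "ND", "F2")  # film chosen as better; ND = no difference

# One block per combination of relevance, quality, and betterness outcomes.
blocks = [
    {"relevance": f"r1 {r} r2", "quality": f"q1 {q} q2", "betterness": b}
    for r, q, b in product(RELATIONS, RELATIONS, BETTERNESS)
]

print(len(blocks))  # 3 x 3 x 3 combinations

# The example block from the text: F1 higher in relevance and quality, F1 chosen.
example = {"relevance": "r1 > r2", "quality": "q1 > q2", "betterness": "F1"}
print(example in blocks)
```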
Macro-type rating patterns.--Figure 8 illustrates four other important, expected rating patterns for relevance, quality, and betterness judgments from a given selector. The figure shows in schematic form the four basic patterns of responses defined below as type I, II, III, and IV rating patterns. The particular film selection principle (of those defined in chapter I, p. 27) illustrated by each type of response pattern is noted in parentheses.

Type I Rating Pattern (Principle C)
   a. The pair of relevance ratings obtained are equal;
   b. The pair of quality ratings obtained are equal; and
   c. The "no difference" betterness option is selected for the situation posed.

Type II Rating Pattern (Principle D)
   a. The pair of relevance ratings obtained are equal;
   b. One of the two films is judged to be higher in quality; and
   c. The film judged to be higher in quality is selected as better to use for the situation posed.

Type III Rating Pattern (Principle B)
   a. One of the two films is judged to be higher in relevance; and
   b. The film judged to be higher in relevance is selected as being better to use for the situation posed.

Type IV Rating Pattern (Antithesis of Principle A)
   Any and all patterns exhibited by a given selector other than those predicted as Type I, II, and III patterns above.

These patterns are referred to as macro-type rating patterns.

Micro-type rating patterns.--Nine specific types of rating patterns, referred to hereafter as micro-type rating patterns, were expected to be exhibited in most cases by responses obtained from the experimental subjects. The nine micro-type patterns, the characteristic rating conditions and selection principles associated with each, and the probability of chance occurrence of each are illustrated in Figure 9.

The nine micro-type patterns are logical, predictable ones, consistent with the previously described selection principles A, B, C, and D.
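The macro-type definitions above amount to a simple decision rule, sketched below. The function macro_type, its argument names, and the "ND" code for the "no difference" option are assumptions for illustration; the rule itself restates the Type I through IV definitions.

```python
from collections import Counter
from itertools import product


def macro_type(r1, r2, q1, q2, choice):
    """Classify one selector's ratings; choice is "F1", "F2", or "ND"."""
    if r1 != r2:
        # Type III: the more relevant film is chosen, whatever the quality ratings.
        preferred = "F1" if r1 > r2 else "F2"
        return "III" if choice == preferred else "IV"
    if q1 != q2:
        # Type II: relevance is equal, so the higher-quality film is chosen.
        preferred = "F1" if q1 > q2 else "F2"
        return "II" if choice == preferred else "IV"
    # Type I: relevance and quality are both equal, so "no difference" is chosen.
    return "I" if choice == "ND" else "IV"


# Representative rating pairs: film one higher, equal, film two higher.
PAIRS = [(2, 1), (1, 1), (1, 2)]
tally = Counter(
    macro_type(r1, r2, q1, q2, choice)
    for (r1, r2), (q1, q2), choice in product(PAIRS, PAIRS, ("F1", "ND", "F2"))
)
print(dict(tally))
```

Tallied over all possible combinations, types I, II, and III account for 1 + 2 + 6 = 9 combinations, matching the nine logical micro-type patterns; every other combination is classified as type IV.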
Eighteen other micro-type rating patterns can be defined, listed in Figure 10, corresponding to the remaining set of possible rating combinations suggested by Figure 7. The eighteen other micro-type patterns are all illogical ones. All contradict selection principles A, B, C, and D. All fall under the Type IV rating pattern illustrated in Figure 8.

Frequency distributions.--For each treatment-session group, the number and percentage of subjects whose responses illustrated macro-type I, II, III, and IV rating patterns were tabulated for the forced-choice stimulus condition O3(F1F2) which was posed. This was done to determine the degree to which selection principles A, B, C, and D were reflected by the subjects' responses.

[Figure 8.--Expected rating patterns: macro-types I, II, III, and IV.]

Type of Rating Pattern    Rating Results Obtained for Pairs of Films    Selection      Probability of Chance Occurrence
                                                                        Principle      for a Given Stimulus Condition
Macro-type  Micro-type    Relevance    Quality    Betterness            Illustrated    Micro-type    Macro-type
I           I-a           r1 = r2      q1 = q2    No difference         C                0.4%          0.4%
II          II-a          r1 = r2      q1 > q2    F1                    D                4.9%          9.8%
            II-b          r1 = r2      q1 < q2    F2                                     4.9%
III         III-a         r1 > r2      q1 = q2    F1                    B                2.4%         33.2%
            III-b         r1 > r2      q1 > q2    F1                                     7.1%
            III-c         r1 > r2      q1 < q2    F1                                     7.1%
            III-d         r1 < r2      q1 = q2    F2                                     2.4%
            III-e         r1 < r2      q1 > q2    F2                                     7.1%
            III-f         r1 < r2      q1 < q2    F2                                     7.1%

Legend: F1 = Film one    r1 = Relevance rating for F1    q1 = Quality rating for F1
        F2 = Film two    r2 = Relevance rating for F2    q2 = Quality rating for F2

Figure 9.--Expected micro-type rating patterns for forced-choice selections.

Related       Rating Results for Pairs of Films
Macro-type    Relevance    Quality    Micro-type    Betterness
I             r1 = r2      q1 = q2    IV-a          F1
                                      IV-b          F2
II            r1 = r2      q1 > q2    IV-c          F2
                                      IV-d          No difference
                           q2 > q1    IV-e          F1
                                      IV-f          No difference
III           r1 > r2      q1 = q2    IV-g          F2
                                      IV-h          No difference
                           q1 > q2    IV-i          F2
                                      IV-j          No difference
                           q2 > q1    IV-k          F2
                                      IV-l          No difference
              r2 > r1      q1 = q2    IV-m          F1
                                      IV-n          No difference
                           q1 > q2    IV-o          F1
                                      IV-p          No difference
                           q2 > q1    IV-q          F1
                                      IV-r          No difference

Legend: F1 = Film one    r1 = Relevance rating for F1    q1 = Quality rating for F1
        F2 = Film two    r2 = Relevance rating for F2    q2 = Quality rating for F2

Figure 10.--The type IV micro-type rating patterns.

A clear majority of subjects were expected to exhibit macro-type I, II, and III rating patterns, rather than macro-type IV patterns, whether the ratings were elicited from use of the experimental films or abstracts. This prediction was a reasonable one, consistent with selection principles A, B, C, and D.

Subjective Comments

Subjects were requested to describe the particular criteria or key factors which they considered in making their quality and betterness ratings. Their comments were requested to determine the degree to which film quality indicators and instructional pertinence indicators influenced their quality and betterness judgments.

The comments were analyzed in two successive stages. During the first stage, all of the responses were read to identify the general types of comments which were made, overall. The general types of evaluation-selection criteria referred to in the responses were also identified and categorized.
Three basic types of comments were then defined, referred to hereafter as type P, Q, and PQ comments, as noted below:

Type of Comment    Referent Criteria
P                  Instructional pertinence factors
Q                  Film quality criteria
PQ                 Both instructional pertinence and film quality criteria

During the second stage, the frequencies with which the types of referent criteria and comments were exhibited by the treatment-session groups were tallied.

Sentence structural elements (phrases, clauses, and sentence fragments) were used as the unit of analysis for the response frequency counts which were made.

Representative examples of the type P, Q, and PQ comments are illustrated in Tables G1, G2, and G3, respectively (Appendix G). The tables list strength-oriented and nonstrength-oriented comments, categorized by the dominant evaluation-selection referent criterion exhibited by each.

Both abstract-elicited and film-elicited subjective comments were obtained for the betterness measure. However, only film-elicited comments were obtained for the quality measure; and no comments were solicited for the relevance measure.

The Measurement Methods Used for Comparing the Experimental Abstracts

Both objective and subjective measurement methods were used.

Measures: Reliability and Consistency

Two basic types of measures were obtained, namely, measures of:

1. The reliability of ratings elicited from the experimental films; and
2. The consistency (similarity) of ratings elicited from a given abstract style, and corresponding ratings elicited from the experimental films or another abstract style.

Definitions of Reliability and Consistency

Two types of reliability are referred to in this study, intersubject and intrasubject reliability, defined as follows:

Intersubject Reliability: A measure of the degree to which the same pattern of responses is elicited from a given stimulus (films) presented to different groups of people.
Intrasubject Reliability: A measure of the degree to which the same pattern of responses is elicited from a given stimulus (films) presented in succession to the same group of people.

Similarly, two types of consistency are referred to in this study, defined as follows:

Intersubject Consistency: A measure of the degree to which the same pattern of responses is elicited from a stimulus (film or film description) and a surrogate stimulus (corresponding film description) presented to different groups of people.

Intrasubject Consistency: A measure of the degree to which the same pattern of responses is elicited from a stimulus (film or film description) and a surrogate stimulus (corresponding film description) presented in succession to the same group of people.

The term "consistency" is also sometimes used in the generic, nontechnical sense in this study. When so used, the term refers to the degree to which patterns or types of responses being compared are generally the "same," or "similar" to one another, as reflected by the particular index of response used for a given comparison of interest.

The definitions of reliability and consistency used in this study are based upon definitions of "reliability," "generality," and "predictive validity" described by Vinsonhaler (1966, pp. 1-4).²

Indices of Reliability and Consistency

Mean ratings and Spearman rank correlation coefficients (Ferguson, 1966, pp. 216-18) were used primarily as indices of reliability and consistency.

Mean ratings: ANOVA indices.--Mean relevance, quality, and betterness ratings obtained for the respective stimulus conditions, for each of the eight treatment-session groups, served as basic indicators of the degree of similarity of the selection judgments made by the experimental subjects. For the sake of comparison with the rank indices obtained, comparisons made of the mean ratings are referred to as the ANOVA indices.
Analysis of variance (ANOVA) methods, multiple comparison techniques, and F tests were used to distinguish similar mean rating distributions from dissimilar ones.

Rank correlation indices.--Spearman rank correlation coefficients were obtained by correlating pairs of ranked means profiles defined for the eight treatment-session groups. The two types of profiles correlated are illustrated in Figure 11.

To obtain the ranked means profiles, means obtained for the quality and betterness measures were converted into "adjusted" means: means adjusted to a common, ten-point metric corresponding to that used for the relevance measure. The rank order of the particular means of interest for each treatment-session group was then determined, to define each given profile. Thereafter, profiles of a given type were correlated in pairs within treatments and sessions, to obtain the desired indices of reliability and consistency.

The two types of indices obtained are referred to as the relevance and the combined-means rank indices respectively, according to the particular measures reflected by the indices, as noted in Figure 11.

                                    Type of Profile
Measure       Stimulus Condition    Relevance    Combined Means
Relevance     O1F1                  rx           rx
              O2F2                  rx           rx
              O3F1                  rx           rx
              O3F2                  rx           rx
Betterness    O3(F1F2)                           rx
Quality       F1                                 rx
              F2                                 rx

Legend: rx = the rank order of the mean rating obtained for a given stimulus condition and treatment-session group

Figure 11.--Types of ranked means profiles obtained for the treatment-session groups.

For session one, the grade 4-6 quality ratings specified in the type II and III experimental abstracts were used as mean ratings, for purposes of deriving rank values for the combined-means profiles associated with treatment groups II and III.

Corresponding indices.--Figure 12 illustrates the particular treatment-session group comparisons made for the corresponding indices of consistency and reliability which were obtained.
Of twenty-eight possible pair-wise comparisons of the eight treatment-session groups investigated, only sixteen of the comparisons were of interest to the study.

The reader should note from Figure 12 that two types of intersubject consistency indices were investigated: those involving ratings elicited from both abstracts and films, and those involving comparisons between abstracts only (referred to hereafter as abstract vs. film and abstract vs. abstract indices, respectively).

Significance tests: rank indices.--To determine if the ranked means profiles obtained from the experimental treatment-session groups varied significantly, the correlation obtained for a given pair of profiles was compared to the appropriate critical value (p = .01, one-tail) for the Spearman rank correlation coefficient (Ferguson, 1966, pp. 219-20, 414). Correlations which varied significantly from zero were assumed to reflect ranked means profiles which were similar, whereas correlations which did not vary significantly from zero were assumed to reveal profiles which were significantly dissimilar.

                          Treatment Groups Compared for the Indices
Type of Consistency       For the Consistency    For the Reliability    Type of Reliability
Indices                   Indices                Indices                Indices
Intrasubject              t1 vs. t1'             t4 vs. t4'             Intrasubject
consistency               t2 vs. t2'                                    reliability
                          t3 vs. t3'
Intersubject              t1 vs. t4              t1' vs. t4'            Intersubject
consistency               t2 vs. t4              t2' vs. t4'            reliability
(abstract/film)           t3 vs. t4              t3' vs. t4'
Intersubject              t1 vs. t2              t1' vs. t2'            Intersubject
consistency               t1 vs. t3              t1' vs. t3'            reliability
(abstract/abstract)       t2 vs. t3              t2' vs. t3'

Figure 12.--Treatment-session group comparisons associated with corresponding consistency and reliability indices.
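As an illustration of how a rank index of this kind can be computed, the sketch below ranks two mean-rating profiles and correlates the ranks using the standard untied-ranks Spearman formula, rho = 1 - 6(sum of d^2) / (n(n^2 - 1)). It is a simplified example under stated assumptions and not the study's actual computation: the profile values are invented, the function names are hypothetical, tied means are assumed not to occur, and the conversion to "adjusted" ten-point means is omitted. The critical value for the significance test would still have to be taken from a published table.

```python
from itertools import combinations


def rank_descending(values):
    """Return 1-based ranks (1 = largest mean); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_rho(profile_a, profile_b):
    """Spearman rank correlation between two equal-length profiles of means."""
    n = len(profile_a)
    ranks_a = rank_descending(profile_a)
    ranks_b = rank_descending(profile_b)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Invented mean ratings for four stimulus conditions (illustration only).
abstract_profile = [8.1, 6.4, 7.2, 5.0]
film_profile = [7.9, 7.0, 6.5, 4.8]
rho = spearman_rho(abstract_profile, film_profile)

# Eight treatment-session groups yield 28 possible pair-wise comparisons.
possible_pairs = list(combinations(range(8), 2))
print(round(rho, 3), len(possible_pairs))
```

With these invented values the two profiles differ only in the order of the second and third conditions, so the coefficient is high but below 1. The count of 28 matches the number of possible pair-wise comparisons of eight groups cited in the text.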
Significance tests: ANOVA indices.--To determine whether or not the magnitude of mean ratings obtained for different treatment-session groups varied significantly, an omnibus ANOVA F-test was first used to compare the variance distributions of ratings obtained for a given stimulus condition or set of conditions. The p ≤ .01 level was used for all initial F-tests. Significant F-tests were assumed to reflect the presence of two or more significantly different, i.e., "inconsistent" or "unreliable," mean ratings.

If a given F-test did not imply the presence of significant treatment-session group differences, all treatment-session group comparisons associated with the test were assumed to reflect mean ratings which were relatively "similar," i.e., "consistent" or "reliable."

To isolate mean treatment-session group differences implied by significant omnibus F-tests, four types of follow-up tests were used, as appropriate.

All intrasubject comparisons made (for pairs of means within treatments, across sessions) were tested using an independent F-test procedure.

Intersubject comparisons were made (for pairs of means within sessions, across treatments) using one of the three testing procedures which follow:

The Duncan multiple range comparison technique (Edwards, 1968, pp. 130-35)³ was used for comparing unit cell means.

The F-test values and p levels for marginally significant intersubject comparisons were estimated through use of the following formula, for nonhierarchical ANOVA designs (e.g., treatments [4] x sessions [2]) (df 1, 30; Winer, 1971, pp. 380, 442):

$$F = \frac{(\bar{X}_1 - \bar{X}_2)^2}{2\,MS_e / n}$$

where $\bar{X}_1$ and $\bar{X}_2$ are the particular unit cell means of interest; $MS_e$, the within cell (interaction) error term; and $n$ the number of subjects per treatment group.
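A small numerical sketch may help to show how a two-mean comparison of this kind is evaluated. It assumes the formula has the standard form for comparing two cell means, F = (X1 - X2)^2 / (2 MSe / n), with 1 degree of freedom in the numerator. The function name and all input values are hypothetical and are not taken from the study's data.

```python
def pairwise_f(mean_1, mean_2, ms_error, n):
    """F ratio (1 numerator df) for the difference between two cell means.

    Assumes the standard form (mean_1 - mean_2)^2 / (2 * ms_error / n),
    where n is the number of subjects per treatment group.
    """
    return (mean_1 - mean_2) ** 2 / (2.0 * ms_error / n)


# Invented values for illustration only.
f_value = pairwise_f(mean_1=7.5, mean_2=6.0, ms_error=4.5, n=8)
print(f_value)  # 2.0
```

The resulting F value would then be compared with the tabled critical value for the stated degrees of freedom at the chosen significance level.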
The following formula was used to test, across treatments, pairs of overall interaction means (combined unit cell means) for hierarchical ANOVA designs (e.g., treatments [4] x sessions [2] x stimulus conditions [4]) (df 1, 30; Winer, 1971, p. 544):

$$F = \frac{(\overline{A_1B_x} - \overline{A_2B_x})^2}{2\,[MS_{e(a)} + (q - 1)\,MS_{e(b)}] / nrq}$$

where $\overline{A_1B_x}$ and $\overline{A_2B_x}$ are the interaction cell means (combined unit cell means) of interest; $MS_{e(a)}$ and $MS_{e(b)}$ the treatment and column error terms; $n$ the number of subjects per treatment group; $q$ the number of levels of the first repeated factor; and $r$ the number of levels of the second repeated factor. ("q" and "r" are the same values for the 4 x 2 x 2 designs used in this study. Therefore, the same formula was usable for tests based upon the 4 x 2 x 2 designs.)

Other Descriptive Indices

Other simple statistical indices found to be useful for illustrative purposes were also employed, including Pearson product-moment correlations (Ferguson, 1966, pp. 110-12), frequency distributions, ratios, proportions, and average values.

Measurement-Related Assumptions

The experimental phase of the study was founded upon a number of important measurement-related assumptions which follow. The critical assumptions which were tested for the study are each preceded by an asterisk (*).

The degree to which the critical assumptions were confirmed by the experimental data is described in Chapters IV and V.

General Assumptions

*Reliable film selection judgments in the form of relevance, quality, and betterness ratings can be elicited from both films and film descriptions.

*Rated judgments of film quality, relevance, and betterness elicited from films can be used effectively as a baseline of comparison for corresponding judgments elicited from different film description styles.
Indices of response which show significant differences (p ≤ .01) indicate that the corresponding selection judgments associated with the indices have been influenced significantly in different ways by the experimental treatments which elicited the responses.

*The rating patterns exhibited by the experimental subjects will tend to reflect the principles of selection defined by the film selection model.

*The types of subjective comments made by the subjects and the frequency with which they occur will tend to support the ratings which are made.

*Rated relevance, quality, and betterness judgments will tend to be made as independent judgments. The ratings will not tend to be strongly correlated (p ≤ .01).

The Film Selection Process

Relatively large differences in the perceived relevance of different films will significantly influence (p ≤ .01) the magnitude of rated relevance and betterness judgments made about the films, whereas relatively small differences will not.

Relatively large differences in the perceived quality of different films will significantly influence (p ≤ .01) the magnitude of rated quality and betterness judgments made about the films, whereas relatively small differences will not.

Film Descriptions

*Film description styles which are not "good" styles will tend to elicit mean quality and betterness ratings which vary significantly (p ≤ .01) from those obtained from the corresponding films, whereas styles which are "good" will not.

*Film description styles which are "good" styles will tend to elicit rank correlation indices of consistency (abstract vs. film) which are significant from zero at the p ≤ .01 level, whereas styles which are not "good" styles will not.

If a film description style tends to exhibit behavioral responses (mean ratings and ranked means) characteristic of a "good" style, it contains an adequate array of instructional pertinence indicators and film quality indicators.
If a film description style does not tend to exhibit behavioral responses (mean ratings and ranked means) characteristic of a "good" style, it does not contain an adequate array of instructional pertinence indicators and film quality indicators.

*Both the lack of and the presence of film quality indicators (including overall ratings of film quality) within film descriptions can cause the magnitude of rated quality and betterness judgments elicited from the descriptions to vary significantly (p ≤ .01) from the corresponding ratings elicited from the instructional films.

*The presence of film quality indicators (including overall ratings of film quality) within film descriptions can cause the descriptions to elicit rated betterness judgments which vary significantly (p ≤ .01) from those elicited from film descriptions which do not contain the film quality indicators.

*Neither the presence nor the lack of film quality indicators (including overall ratings of film quality) within film descriptions will cause the descriptions to elicit rated relevance judgments which vary significantly (p ≤ .01) from those obtained from the corresponding films.

*Film descriptions which contain film quality indicators (including overall ratings of film quality) will not elicit rated relevance judgments which vary significantly (p ≤ .01) from those obtained from film descriptions which lack the film quality indicators.

The Film Selection Model

*The model is a reliable model of the forced-choice film selection process.

*The model can be used to distinguish "good" film description styles from those which are not "good."

*The model can be used to distinguish film description styles which are "effective" or "more effective" ones (adequate, suitable, reliable, desirable, etc.) from "less effective" ones.
Macro-type I, II, III, and IV rating patterns and a wide range of micro-type patterns will tend to be exhibited by groups of film selectors, when the model is used to elicit forced-choice selection judgments.

*Macro-type I, II, and III (rather than IV) rating patterns will tend to be reflected predominantly by the subjects' responses, whether the rating patterns are elicited from the experimental films or descriptions.

The frequencies of the different types of rating patterns exhibited by responses elicited from use of the model will tend to vary with the particular stimulus films compared, and the frame of reference (context, instructional situations, stimulus conditions) established for the selection judgments which are requested.

The Indices of Reliability and Consistency

*The ANOVA and rank indices will tend to exhibit similar, mutually supportive experimental findings--both the indices of reliability and the indices of consistency--for different styles of film descriptions which are compared via use of the model.

*The intrasubject and intersubject indices obtained for a given type of index (ANOVA, rank, reliability, consistency, abstract vs. film) will tend to demonstrate similar, mutually supportive experimental findings.

Measures of intrasubject and intersubject reliability obtained from films will tend to be relatively equivalent; neither type will tend to be higher or lower than the other.

Measures of intrasubject and intersubject consistency (abstract vs. film) obtained for a given film description style will tend to be relatively equivalent; neither type will tend to be higher or lower than the other.

The degree to which corresponding indices of consistency and reliability are similar or dissimilar is a direct function of the degree to which a given film description or description style validly represents characteristics of the corresponding film(s) which influence the selection decision-making process.
When corresponding indices of consistency and reliability are very similar or dissimilar for a given film description or description style, the description or style can be referred to as exhibiting "high" or "low predictive validity," respectively.⁴

The rank correlation indices.--Rank correlation indices of reliability obtained from use of the model will tend to be:

*a. High (+.65 and above),
*b. Significant from zero at the p ≤ .01 level, and
 c. Higher in magnitude than indices of consistency elicited from film descriptions.

*The ANOVA indices: mean ratings.--Corresponding film-elicited, mean relevance, quality, and betterness ratings will tend to be similar. The means will not vary significantly at the p ≤ .01 level.

Mean ratings of relevance, quality, and betterness elicited from films and film descriptions will tend to be stable, reliable means, even though individual rater judgments can be expected to vary considerably for a given treatment-session group, for any given measure.

The Quality Measure

The mean quality ratings elicited from the experimental films from the experimental subjects will not vary significantly (p ≤ .01) from those obtained from the film evaluation panel.

As for the film evaluation panel, the mean quality ratings elicited from film one from the experimental subjects will be significantly higher (p ≤ .01) than those elicited from film two, for each respective experimental treatment group.

The instinctive, intuitive definition or view of film quality held by most teachers is likely to be relatively stable over short periods of time. Hence, it is just as fruitful to not define the term "quality" as to define it, in data-collection instruments used by the experimental subjects.

The Relevance Measure

For the majority of subjects associated with each of the experimental treatment-session groups, the relevance ratings elicited from film one and film two for the forced-choice stimulus condition will not:

a. Vary substantially, nor
b. Show systematically shifted magnitudes (i.e., the ratings for film one will not tend to be greater or lesser than those for film two).

The mean relevance ratings elicited from the experimental subjects will tend to fall within the 7.0 to 9.0 rating range.

The Experimental Abstract Styles

General assumptions.--The three experimental abstract styles are distinctly different film description styles.

*Each of the abstract styles will tend to exhibit a unique, distinctive pattern of selection judgment response tendencies.

The type I abstract style.--*This style is not an effective style. It will tend to elicit indices of consistency characteristic of a style which is not a "good" style.

The type II abstract style.--*This style is an effective style. It contains an adequate array of instructional pertinence indicators and film quality indicators. It will tend to elicit indices of consistency characteristic of a style which is a "good" style.

*As well, the type II style will elicit selection judgment response tendencies which:

1. Are most similar overall to those obtained from the experimental films.
2. Vary significantly (p ≤ .01) from those obtained from the type I and III abstract styles.

The type III abstract style.--*This style is not an effective style. It will tend to elicit indices of consistency characteristic of a style which is not a "good" style.

*Additionally, the type III style will elicit selection judgment response tendencies which:

1. Are least similar overall to those obtained from the experimental films.
2. Vary significantly (p ≤ .01) from those obtained from the type I and II abstract styles.

The Subjective Comments

*Subjective comments obtained from use of the model can be defined meaningfully as being instructional-pertinence-oriented and film-quality-oriented comments, in terms of the types of evaluation-selection referent criteria exhibited by the comments.
Comments obtained for quality ratings will tend to predominantly exhibit film-quality-oriented, evaluation-selection referent criteria, rather than instructional-pertinence-oriented referent criteria.

Comments obtained for relevance ratings will tend to predominantly exhibit instructional-pertinence-oriented referent criteria, rather than film-quality-oriented referents.

Comments obtained for betterness ratings will tend to exhibit both instructional-pertinence-oriented and film-quality-oriented referent criteria.

The types of referent criteria which will be predominantly exhibited for betterness ratings will tend to vary directly with the types of rating patterns (macro-type and micro-type) which are displayed.

*Groups of selectors who exhibit significantly different (p ≤ .01) rating distributions will also tend to exhibit subjective comments and associated indices of response which are distinctively different.

The frequencies with which specific types of comments are made and referent criteria are exhibited can be used meaningfully as indices of response, for comparing the effectiveness of different film description styles.

Nuisance Variables

*Significant main sessional effects (p ≤ .01) will not tend to be exhibited systematically within the experimental measures, across the treatment groups. Nor will they tend to be revealed by the film-elicited responses obtained from treatment group IV.

The subjects' teaching preferences will not bias the rated judgments obtained for any of the treatment-session groups. The subjects' ratings will be made independently of their teaching preferences.

Limitations of the Experimental Design and Methodology

Related limitations deal with the validity and reliability of the experimental method and design, the control of experimental error, and other important factors of significance to the interpretation and generalizability of the experimental results.
Noteworthy limitations are treated in relation to the following topics: the repeated measures design; the selection and use of the experimental subjects; sampling and randomization considerations; the independent and dependent experimental variables; nuisance variables; the particular measurement and data analysis methods used; and specific problems which were encountered during the experimental phase of the study.

The Repeated Measures Design

One of the major limitations of the repeated measures design, the basic design used for the experimental phase of the study, is that it usually produces systematic differences within subject responses associated with successive measures or trials, because "the effects of prior treatments are not usually erasable," according to Campbell and Stanley (1969, p. 6). Therefore, to reduce the influence of participation within session one upon session two responses, a two-week "dissolution period" was used between the sessions. This approach was used to increase the likelihood that the experimental subjects would "forget" details and information cues associated with the stimulus conditions and the particular responses which they made.

Theoretically speaking, use of the two-week interim period increased the possibility of uncontrolled "historical effects" (Campbell & Stanley, 1969, p. 5); i.e., the effects of events occurring between sessions one and two which could have significantly altered the nature of subject responses made in session two. Randomization of the experimental subjects across the treatment groups was assumed to have remedied this possibility, however.

The particular experimental design used in this study confounded treatment influences with sessional influences for treatment groups I, II, and III. The remote possibility was also present that the experimental results were influenced by history x treatment group interaction effects.
Such effects were very unlikely, however. Nevertheless, if significant sessional or history influences occurred (contrary to basic assumptions of the study), one would expect that one or more treatment groups during session two would exhibit response patterns significantly different from those associated with treatment group IV during session one.

Analysis of responses associated with treatment group IV and other treatment groups combined, via use of the two-way ANOVA statistical analysis and interaction plotting methods, was expected to allow important discriminations to be made about likely treatment vs. sessional or history influences.

Because both history and session effects were adequately controlled for by the overall experimental design and data analysis methods used, the investigator assumed that any significant differences found for responses distinctly associated with the experimental abstracts could be attributed to treatment effects rather than to sessional or history effects.

Sampling Considerations

The samples of teachers, films, abstract styles, and stimulus conditions investigated were all arbitrarily selected (rather than randomly selected). The samples were also relatively narrow in coverage, scope, and dimension. Factors investigated were not representative of the broader, general populations of items from which the samples were selected.

The Independent Variables

Treatments: abstract styles.--The three experimental film descriptions were designed by the investigator. They were not representative of commonly used film description styles.

Stimulus conditions.--The instructional situations specified as the frame of reference for the selection judgments which were made involved the following instructional anchor referents only: use of the experimental films to supplement or enrich a fifth-grade elementary school class in the southwestern United States; and three specific units of instruction, the O1, O2, and O3 objectives.
Because the stimulus films were not actually designed for the same specific instructional purpose, a somewhat "artificial" purpose was contrived, the O3 objective, for the forced-choice stimulus condition.

The relevance-oriented stimulus conditions investigated tended to elicit "high" and "mediocre" relevance ratings. Conditions tending to elicit "low" relevance ratings were not used.

Stimulus films.--One pair of films was used: a pair which did not meet, as adequately as desired, an important criterion established for the experimental design, the equivalence of purpose criterion. Although both films treated a common subject, the films conveyed distinctly different film messages. As well, the stimulus films tended to be viewed as "mediocre" in relevance to the forced-choice selection situation posed, rather than "high," as desired.

Measurement Methods

Related limitations of the experimental method and design deal with the validity and reliability of methods used to measure and assess four major considerations: the predictive validity of document abstracts; the validity of the experimental abstracts; the quality ratings elicited from the experimental films, from both the film evaluation panel and the experimental subjects; and the dependent variables.

The predictive validity of document abstracts.--According to Vinsonhaler (1966), important factors which influence measures of the predictive validity of document abstracts are these:

1. The characteristics of the documents which are abstracted;
2. The purposes for which the abstracts and documents are used;
3. The characteristics of the individuals who use the documents and the abstracts;
4. The importance and kind of information included within or excluded from the abstracts; and
5. The type of response used as the measure of predictive validity (i.e., categorical, comparative, rated, ranked responses, etc.) (pp. 4, 7, 10, 11).
Vinsonhaler (1966) also noted three other limitations of predictive validity indices obtained for document abstracts, which help to define the limits of this study. First, in order to use the indices effectively as indicators of the validity of the abstracts, the measured responses upon which they are based should demonstrate both intrasubject and intersubject reliability (p. 4). Second, the indices are best suited for comparisons of abstract styles designed for a specific set of documents which are representative of a particular collection of documents (p. 11). Third, in order to demonstrate that a given abstract style has general predictive validity, its predictive validity must be demonstrated for various kinds of subject responses (p. 7).

The rank correlation indices of consistency obtained for the relevance measure for this study are synonymous with the indices of predictive validity described by Vinsonhaler (1966, pp. 1-2, 4, 9-10). Hence, the results of this study are generalizable only to situations and circumstances implied by the above factors and limitations.

The indices of consistency and reliability obtained for this study are normative, i.e., valid for groups of selectors rather than for any given individual. The indices are based upon the averaging and counterbalancing of both typical and atypical ratings, resulting from individuals with different interests, knowledge, and rating skills, and from measurement error (p. 11). One of the advantages of the indices is that they tend to "average out" atypical responses, and therefore to exhibit relatively stable magnitudes, even though "subjects usually show a fairly wide range of individual differences in comparative and discrimination behavior" (p. 11).

The validity of the experimental abstracts.--Formal measures of the validity of the experimental abstracts were not obtained prior to the experimental phase of the study.
The validity of the type II abstract style, the style assumed to most adequately represent and best describe the experimental films, was determined on the basis of a priori criteria and related subjective opinions. The criteria used are specified in "The Form for Assessing the Content and Face Validity of Film Abstracts" (Appendix A2) developed for this study. Preliminary use of the form with selected members of the film evaluation panel suggested that the experimental abstracts were suitable ones. Hence, the collection and comparison of formal validity measures was not pursued via further use of the form.

The indices of consistency obtained as measures of the validity of the experimental abstracts are measures of predictive validity for rated responses only. Other types of predictive validity indices were not obtained by this study.

The validity of the film quality ratings elicited from the film evaluation panel.--The mean quality ratings supplied within the type II experimental abstracts were obtained from the film evaluation panel through use of the "Film Quality Rating Instrument" (Appendix A1). The content and face validity of the instrument were established by including within it items based upon factors commonly suggested, in the literature reviewed, as being important film evaluation criteria.

The instrument was developed for the study by necessity. No other instrument found during the review of related literature was appropriate for this study.

The reliability of the film quality ratings elicited from the film evaluation panel.--The rating instrument used by the panel was field-tested with selected members of the panel, using various films as stimulus materials, and modified accordingly to simplify and improve it before being used to evaluate the experimental films. The actual reliability of the instrument was not determined prior to the study, however.
The reliability of the quality ratings was measured in terms of intersubject reliability only, using a split-half reliability method. The reliability of the rating instrument was measured using an analysis of variance approach described by Winer (1971, pp. 283-89).⁵

The validity of the measures obtained of the dependent variables.--A basic, a priori assumption of this study is that the relevance, quality, and betterness parameters exhibit adequate face validity when viewed together, in composite fashion, as the model of film selection investigated.

The rating scales employed within the Film Questionnaire and the Abstract Questionnaire to manifest the judgments of relevance, quality, and betterness made by the experimental subjects were created for the study. Field testing of the questionnaires suggested that the rating scales could be used effectively for their intended measurement purposes. The review of the literature did not suggest other rating scales as being more appropriate to use.

The film quality ratings elicited from the film evaluation panel and the experimental subjects both served as valid indices of the quality of the experimental films. The mean quality ratings associated with each set of ratings represented distinctly different measures, however.

The mean quality ratings elicited from the panel represented combined measures of film quality, as perceived by the four types of evaluators on the panel, namely, elementary school teachers, elementary education professors, media specialists, and subject matter specialists. On the other hand, the ratings elicited from the experimental subjects represented measures of the film quality perceptions of elementary school teachers only.

Evaluation criteria specified for measures of the quality of the experimental films varied for the film evaluation panel and the experimental subjects.
Nineteen specific factors were considered as evaluation criteria within the rating instrument used by the panel, before the panel rated the overall quality of the films. However, the instinctive, intuitive definition or viewpoint of film quality held by each experimental subject was used in singular fashion as the film evaluation criterion by the subjects.

"Film quality" was not defined for the subjects for two basic reasons. First, it seemed fruitful to determine the existing, natural, undisturbed film evaluation perceptions of the experimental subjects rather than to provide them with specific film evaluation criteria to consider, which likely would have steered their thinking in other directions. Second, and somewhat paradoxically, measurement errors were expected to be significantly higher, rather than lower, if "film quality" was defined. Without some sort of training, errors associated with the subjects' abilities to recognize and accurately interpolate the corresponding value of specific film evaluation criteria to be rated were expected to increase considerably.

For the combined-means rank correlation indices obtained, because the quality ratings supplied in the experimental abstracts were used to compute the indices, the indices did not measure the same parameter for the t1, t2, and t3 treatment-session groups. For t1, the indices measured the actual influence of the type I abstract style upon selector responses; for t2 and t3, the indices measured the hypothetical influence of the type II and type III abstract styles upon selector responses.

Use of the repeated measures experimental design was expected generally to deflate the magnitude of treatment-session group differences found, and to mask the presence of borderline treatment-session group influences. As well, the use of very small sample sizes required the use of tests of statistical significance which were very stringent.
Consequently, the overall likelihood of making type II interpretation errors⁶ was expected to be somewhat inflated for the study.

The reliability of the measures obtained of the dependent variables.--Only film-elicited indices of reliability were obtained. Abstract-elicited indices of reliability were not employed. The reliability of the abstract-elicited ratings could be estimated only from the standard deviations obtained of the ratings.

The rank correlation indices used for treatment-session group comparisons were based upon very small sample sizes. The number of factors correlated was from four to seven. Consequently, the rank correlation significance tests were very stringent.

The significance tests used for the ANOVA indices were also very stringent, since they too were based upon small sample sizes (n = 16).

Despite the small sample sizes which were used, generally speaking, both the ANOVA and rank indices of reliability obtained were relatively stable ones. However, the rank indices of consistency were found to be distorted, requiring subjective interpretation of related experimental findings.

Nuisance Variables

Because small sample sizes were used for the experimental treatment groups, several efforts were pursued to control nuisance variables which were thought to be capable of producing undesirable, systematic effects upon subject responses.

For example, to reduce the potential differential influence of subject motivation and fatigue, the subjects were allowed to participate during all experimental sessions at times convenient to them, from early morning through early evening hours. The frequency distribution of subjects who participated during different periods of the day was relatively equivalent across the periods, for all experimental sessions.

The sequence of presentation of the experimental films and of the experimental abstracts was alternated during all experimental sessions.
Approximately 50 percent of the subjects responded to film one first and film two second, and vice versa, during each experimental session. As well, approximately 50 percent of the subjects who responded to a given film first during session one responded to the other film first during session two, in each treatment group.

The sequence of presentation of the stimulus conditions provided in the Abstract Questionnaires and the Film Questionnaires was both randomized and counterbalanced. This procedure was used to prevent questions from being asked systematically about the relevance, quality, or betterness measures, per se, in the beginning, middle, or final sections of the questionnaires.

The Data Analysis Methods

In order to compute the frequency of the type I, II, III, and IV rating patterns which occurred, it was necessary to define the difference between ratings which were "equal," relatively "equivalent," and dissimilar. Equal ratings were those indicated by the same numeral on a given scale (e.g., 1,1; 7,7; 10,10; etc.). Equivalent ratings were those within one rating unit of each other (e.g., 4,5; 7,6; 9,8; etc.). Ratings which were not equal or equivalent included all other combinations.

Problems Encountered

Two noteworthy problems were encountered during the experimental phase of the study, each of which has been alluded to in prior sections of this chapter.

First, because the two experimental films used were not truly designed for the same instructional purpose, the forced-choice situation posed (the O3F1F2 stimulus condition) was likely not highly representative of those commonly encountered by teachers as film selectors. The experimental conditions actually established were somewhat different from those desired. Consequently, it was necessary to interpret the results of the experimental phase of the study accordingly.
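The distinction between "equal," "equivalent," and dissimilar ratings defined under The Data Analysis Methods can be expressed as a small rule. The following is a minimal sketch only, assuming integer ratings made on the same scale; the function name classify_rating_pair is hypothetical and is not taken from the study.

```python
def classify_rating_pair(first: int, second: int) -> str:
    """Classify two ratings made on the same integer scale.

    Hypothetical helper illustrating the definitions in the text:
    equal      -> same numeral (e.g., 7,7)
    equivalent -> within one rating unit of each other (e.g., 4,5)
    dissimilar -> every other combination
    """
    difference = abs(first - second)
    if difference == 0:
        return "equal"
    if difference == 1:
        return "equivalent"
    return "dissimilar"


# Pairs taken from the examples quoted in the text, plus one dissimilar pair.
for pair in [(1, 1), (4, 5), (9, 8), (2, 6)]:
    print(pair, classify_rating_pair(*pair))
```

Counting how often each label occurs across a group's rating pairs would then give the kind of frequency tally the text describes, although the exact tallying procedure used in the study is not specified here.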
Second, while designing the experimental phase of the study, the investigator assumed that the raw score ratings for sessions one and two could be correlated directly, within treatment groups, to provide meaningful, independent indices of reliability and consistency for the relevance, quality, and betterness measures, respectively. However, the anticipated skewed data distributions did not encourage the use of such correlations, at least without the use of other, more desirable measures. Consequently, the rank correlation indices of consistency and reliability used as experimental measures were incorporated after the experimental phase of the study was designed and administered.

Summary

Chapter III described the experimental phase of the study. These major topics were addressed: the purpose of the experimental phase; the experimental method and design; and related measurement methods, assumptions, and limitations.

The purpose of the experimental phase of the study was twofold; namely, to obtain evidence of:

1. The validity and reliability of the model of film selection investigated by the study, and

2. The utility of the model for evaluating the effectiveness of the experimental film description styles.

The basic experimental design used was a repeated measures design. By way of the design, ratings of relevance, quality, and betterness were elicited from three sets of experimental abstracts, which varied in terms of the kind of information supplied about the overall quality of a pair of experimental stimulus films employed for the study. The ratings were compared with those elicited from the experimental films to determine the degree to which the films and abstracts elicited similar response patterns.

The experimental subjects were elementary school teachers, primarily women. All were volunteer subjects. They were randomly assigned in groups of sixteen to four experimental treatment groups.
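The random assignment of subjects in groups of sixteen to four treatment groups can be sketched as follows. This is an illustration under stated assumptions, not the procedure the investigator actually used: the function name assign_to_groups, the subject identifiers, and the seed are all hypothetical.

```python
import random


def assign_to_groups(subject_ids, group_names=("I", "II", "III", "IV"),
                     group_size=16, seed=None):
    """Shuffle the subjects and deal them into equal-sized groups.

    Hypothetical helper: the group labels mirror the four treatment
    groups named in the text; everything else is assumed.
    """
    subjects = list(subject_ids)
    if len(subjects) != len(group_names) * group_size:
        raise ValueError("number of subjects must equal groups x group size")
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    rng.shuffle(subjects)
    return {
        name: subjects[i * group_size:(i + 1) * group_size]
        for i, name in enumerate(group_names)
    }


groups = assign_to_groups(range(1, 65), seed=1978)
for name, members in groups.items():
    print(name, len(members))
```

Shuffling once and slicing guarantees that every subject lands in exactly one group and that all groups have the same size.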
Three independent variables were investigated: treatments (the type of decision-making information provided: filmic or abstracted); stimulus conditions (the set of circumstances posed to elicit the judgments); and stimulus films (two specific experimental films). The dependent variables were the attributes relevance, quality, and betterness defined by the film selection model.

The experimental subjects were the experimental units. The units of analysis were the eight basic treatment-session groups with which ratings were associated for a given dependent measure.

Two parallel questionnaires were used as instruments by the experimental subjects, the Abstract Questionnaire and the Film Questionnaire.

During session one, treatment groups I, II, and III used abstract styles I, II, and III as stimulus materials, respectively, whereas treatment group IV used the experimental films. All of the abstracts contained the same basic array of instructional pertinence indicators. However, the abstract styles differed basically, as follows. The type I abstracts contained no ratings of film quality. The type II abstracts contained "valid" ratings, as assessed by a formal film evaluation panel. The type III abstracts contained "invalid" ratings, ratings diametrically opposed to and more dissimilar than those supplied in the type II abstracts.

Approximately two weeks after participation in session one, all subjects participated in session two, using the experimental films as stimulus materials.

Two fundamental types of measures were compared, measures of reliability and consistency. Mean ratings and correlation coefficients obtained for ratings elicited from the films only served as indices of reliability for the selection judgments which were made.
The "indices of reliability" served as the baseline of comparison for corresponding measures obtained from the experimental abstracts, which were defined as "indices of consistency."

The particular film evaluation-selection criteria underlying the judgments of film quality and betterness made by the experimental subjects were also investigated. This was achieved by analyzing comments solicited from the subjects which described why they made the ratings they did. The comments were analyzed to determine if film quality or instructional pertinence factors tended to influence the particular rating patterns exhibited by the experimental treatment-session groups.

Measurement-related assumptions were identified underlying the film selection process, the experimental design, the model of film selection, and the particular methods used to operationalize the model.

Important limitations of the experimental phase of the study were also identified, dealing with the validity and reliability of the experimental method and design, the control of experimental error, and other factors of significance to the interpretation and generalizability of the experimental results. Noteworthy limitations were described regarding these factors: the repeated measures design, sampling and randomization considerations, the independent and dependent experimental variables, the particular measurement and analysis methods used, and specific problems which were encountered.

Footnotes--Chapter III

¹Quality ratings were not requested for treatment groups II and III during session one, because the stimulus abstracts used by the subjects contained overall ratings of film quality. The subjects would likely have "parroted" back the same ratings indicated in the abstracts.

²Vinsonhaler (1966, pp.
1-4) defined the "predictive validity" of a "document abstract" in terms of the similarity of relevance judgments elicited from a given abstract style and the corresponding documents described by the abstracts. Hence, the measures of consistency used in this study were essentially measures of the predictive validity of the experimental abstract styles. Also, Vinsonhaler suggested the use of measures of "reliability" and "generality" as the baseline of comparison for measures of predictive validity. He defined reliability in terms of the degree of "intrasubject agreement" exhibited by judgments elicited from the documents, and generality in terms of the amount of "intersubject agreement" displayed by judgments obtained from the documents. In this study, the experimental films serve as the "documents" described by the experimental abstracts.

³See also Duncan (1955, pp. 1-42).

⁴The term "predictive validity" used here refers to Vinsonhaler's notion of the predictive validity of document abstracts, defined in Vinsonhaler (1966, pp. 1-2). Vinsonhaler defined "predictive validity" as (a) "the degree of response similarity between a document and its abstract" (p. 1) and (b) "the degree to which subjects' responses to the abstract may be used to predict responses to the complete document" (p. 2). Limitations of measures of the predictive validity of document abstracts noted by Vinsonhaler are discussed on pp. 172-173 of this chapter.

⁵See also "Hoyt Reliability" in Helmstadter (1964, pp. 73-74); "The Analysis-of-Variance Approach to Reliability" in Guilford (1954, pp. 383-85); or Hoyt (1941, pp. 153-60).

⁶Type II errors: errors of not finding significant differences among data elements compared, when significant differences are truly present but remain undetected because they are small or attenuated by experimental error.

Keppel (1973) noted that too much emphasis has been placed upon avoidance of type I errors in the statistical analysis literature.
(Type I error: the error of finding significant differences among data elements compared, when true, significant differences are not really present.) According to Keppel, the use of excessively high confidence limits in initial studies may result in the making of serious type II errors. Real significant differences may remain undetected, retarding development of the research under investigation. Keppel also noted that "bogus" findings resulting from type I errors will tend to be refuted by follow-up investigations and other research findings.

With Keppel's comments in mind, a p ≤ .10 confidence level was originally established for data analysis purposes for the experimental phase of this study. However, because numerous marginally significant differences (.02 ≤ p ≤ .18) were found within the experimental data, which may have been caused by several types of rating errors which were identified, a p ≤ .01 level was used. This more stringent approach was taken to reduce both the experimentwise and per comparison error rates for type I errors.

For a comprehensive discussion of type I and type II error considerations, see Keppel (pp. 153-55, 162-63). See Keppel also for a comprehensive discussion of experimentwise and per comparison error rates (pp. 134-35, 152-63).

CHAPTER IV

EXPERIMENTAL FINDINGS AND RESULTS

This chapter is divided into eleven discussion sections, each of which begins with a summary of related findings. The chapter is organized in the following way. First, an overview of the general findings of the experimental phase of the study is given. The overview summarizes findings dealing with the experimental hypotheses and important critical assumptions underlying the study. Second, data and supplemental findings are reported for each hypothesis. Third, results are presented of analyses made of the film evaluation panel's quality ratings and subjective comments, and the subjects' subjective comments.
Fourth, related details are treated regarding the critical assumptions of the study.

General Findings: Summary

The General Question Posed by the Hypotheses

Are film selection judgments significantly influenced by the use of overall ratings of film quality within film descriptions?

Overall, the experimental evidence revealed that rated judgments of the quality, relevance, and betterness of instructional films can be substantially and systematically influenced by both the presence and the lack of overall ratings of film quality within film descriptions. Strongly divergent rating tendencies resulted from the presence and the lack of the overall ratings within the experimental abstracts.

When compared to the corresponding film-elicited judgment tendencies which occurred, the abstract styles which contained the overall ratings showed tendencies to elicit mean relevance and betterness ratings which were substantially inflated in value (.03 ≤ p ≤ .15), whereas the abstract style which lacked the overall ratings tended to elicit mean relevance, quality, and betterness ratings which were substantially deflated in value (.02 ≤ p ≤ .15).

As well, the experimental evidence revealed that the rated film selection judgments were significantly influenced (p ≤ .01) by the presence of the overall ratings of film quality--both the valid and invalid ratings. But to the contrary, the experimental evidence did not confirm that the lack of overall ratings of film quality within the type I abstracts significantly altered the ratings elicited from the abstracts.

The degree to which the ratings obtained for the experimental phase of the study were "significantly influenced" varied with (a) the types of judgments compared, (b) the particular measurement methods and indices used to manifest and compare the judgments, and (c) the baseline of comparison which was used. In particular:

1.
The rank order relationships of the mean relevance, quality, and betterness ratings, combined, were significantly altered by the presence of both the valid and invalid ratings of film quality within the experimental abstracts, but not by the lack of the overall ratings (Hypothesis Five).

2. The rank order relationship of the mean relevance ratings was significantly altered by the presence of the invalid ratings, but not by the presence of the valid ratings or by the lack of the overall ratings (Hypothesis Four).

3. The mean magnitude of the relevance ratings was significantly altered (in one of four cases) by the presence of the invalid ratings, but not by the presence of the valid ratings or by the lack of the overall ratings (Hypothesis Two).

4. The mean magnitudes of the quality ratings elicited from the abstract style which lacked overall ratings of film quality were not significantly changed (Hypothesis One).

5. The mean magnitudes of the betterness judgments obtained from the experimental abstract styles were not significantly altered (Hypothesis Three).

The Hypotheses

Hypothesis One was rejected.
Hypothesis Two was rejected.
Hypothesis Three was rejected.
Hypothesis Four was rejected.
Hypothesis Five was confirmed.

Critical Assumptions

The model of film selection.--As predicted:

1. The model was found to be a reliable one in that macro-type I, II, and III rating patterns (rather than macro-type IV) tended to be reflected predominantly by the subjects' responses, whether the ratings were elicited from the experimental films or the experimental abstracts.

2. It was possible to determine and meaningfully compare the general effectiveness of the experimental abstract styles. For example, the least effective abstract style was clearly distinguished from those which were more effective ones.

3.
Both the presence and the lack of overall ratings of film quality within the experimental abstracts substantially influenced the magnitude of the relevance, quality, and betterness ratings elicited from the abstracts.

4. It was possible to classify the subjective comments obtained into distinct categories, defined in terms of the types of instructional-pertinence-oriented and film-quality-oriented evaluation-selection referent criteria exhibited by the comments.

5. The types of subjective comments made by the subjects and the film evaluation panel, and the frequency with which they occurred, tended to support the ratings which were made.

But contrary to expectation:

1. "Good" film description styles were not clearly distinguished from those which were not "good." Contradictory experimental results were obtained.

2. The relevance, quality, and betterness ratings were not always independently made. Strong intermeasure correlations (p ≤ .01) were sometimes found.

The experimental abstract styles.--As predicted:

1. Each of the abstract styles exhibited a unique, distinctive pattern of rating tendencies.

2. The type III style, which contained the invalid overall ratings of film quality, was found to be the least effective style. It elicited response tendencies which were most extreme, i.e., least similar to those obtained from the experimental films.

However, contrary to expectation:

1. None of the abstract styles was clearly found to be an effective style. Each of the styles showed tendencies to elicit distorted mean relevance ratings, and mean rating ranks which varied significantly (p ≤ .01) from the film-elicited ranks.

2. Each of the styles exhibited characteristics of styles which were both "good" and not "good."

3. The response indices obtained from each of the three abstract styles were sometimes systematically "shifted" in the same direction.

4. The type I style, which contained no overall ratings of film quality:

a.
elicited indices of response overall most similar to those obtained from the experimental films.

b. did not always tend to elicit indices of consistency characteristic of a style which was not a "good" style.

5. The type II style, which contained the valid overall ratings, did not:

a. always tend to elicit indices of consistency characteristic of a "good" film description style.

b. elicit rating tendencies overall most similar to those obtained from the experimental films.

6. The type III style did not always tend to elicit indices of consistency characteristic of a style which was not a "good" style.

7. The type I, II, and III styles tended to elicit mean ratings which were "low," "higher," and "highest" in value, respectively.

The measurement methods.--As predicted:

1. The experimental films tended to elicit relatively stable mean rating distributions.

The film-elicited indices of reliability, both ANOVA and rank, usually provided a relatively stable baseline of comparison for the corresponding indices of consistency which were obtained.

The film-elicited frequencies obtained of the type P, Q, and PQ comments were not always stable, however. The frequencies associated with some of the treatment-session groups showed distinctly different fluctuation patterns.

2. In some instances a given treatment-session group showing anomalous mean ratings or rating patterns also exhibited anomalous subjective indices of response (subjective comments).

3. Overall, the rank and ANOVA indices of reliability tended to indicate similar experimental findings at the p ≤ .01 level.

4. The intrasubject and intersubject indices of consistency (abstract vs. film) obtained for a given type of index (rank, ANOVA) tended to show similar experimental results at the p ≤ .01 level.

5. The film-elicited responses (treatment group IV) did not tend to exhibit strong or marginally significant sessional effects.

However, contrary to prediction:

1.
The ANOVA and rank indices of consistency did not always tend to indicate similar, mutually supportive experimental results at the p ≤ .01 level. Nor did the intrasubject and intersubject indices of reliability.

2. The repeated measures method employed by the model tended to produce strong overall sessional effects (p ≤ .01) within the experimental data. The effects were exhibited by the relevance and quality measures. Numerous marginally significant (.02 ≤ p ≤ .15) simple sessional effects were also revealed by the abstract-elicited relevance and quality measures. As well, sessional effects were noted in the frequency distributions obtained of the types of subjective comments which were made.

3. Numerous marginally significant rating differences (.02 ≤ p ≤ .15) were found. Although the differences partially represent sampling discrepancies and accumulations of other experimental error factors, the rating tendencies associated with the differences were generally consistent with the principles of selection defined for the study.

The experimental conditions.--

1. The desired experimental stimulus conditions were not achieved. A substantial number of the subjects' film-elicited ratings were expected to exhibit the microtype II-a pattern (r1 = r2; q1 > q2; F1 = better) for the O3F1F2 stimulus condition. But this did not occur. Nevertheless, the objectives of the experimental phase of the study were still adequately met.

2. As expected, the experimental variables "films," "stimulus conditions," and "treatments" were sometimes found to elicit significantly different rating distributions.

3. Also as predicted, the subjects' teaching preferences did not interfere as an experimental nuisance variable.
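The rank indices summarized above were computed over only four to seven ranked items (Chapter III), which is why their significance tests are described as very stringent. The sketch below illustrates that point. It is a hedged illustration only: it assumes Spearman's rho without tied ranks and an exact two-sided permutation test, neither of which is confirmed as the procedure or tables actually used in the study, and the mean ratings in the example are invented, not taken from the experimental data.

```python
from itertools import permutations


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def exact_two_sided_p(x, y):
    """Share of all rank orderings at least as extreme as the observed rho."""
    observed = abs(spearman_rho(x, y))
    x_ranks = ranks(x)
    extreme = total = 0
    for candidate in permutations(ranks(y)):
        total += 1
        if abs(spearman_rho(x_ranks, candidate)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical mean ratings for four items (illustrative values only).
film_means = [4.9, 5.3, 5.1, 5.5]
abstract_means = [4.4, 4.8, 4.6, 5.0]
print(spearman_rho(film_means, abstract_means))
print(exact_two_sided_p(film_means, abstract_means))
```

Under these assumptions, even a perfect rank agreement across four items yields a two-sided exact probability of 2/24 (about .083), so a p ≤ .01 criterion cannot be met with four items; with seven items, 5,040 orderings are possible and the criterion becomes attainable.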
Hypothesis One: The Subjects' Quality Ratings, ANOVA Indices

Hypothesis One

The mean magnitude of quality ratings elicited from the experimental abstract style lacking the overall ratings of film quality will be significantly different from that elicited from the experimental films.

General Findings

Hypothesis One.--The hypothesis was rejected. In almost all cases, the abstract-elicited and film-elicited mean ratings did not vary significantly from each other.

The indices of reliability.--The experimental films were found in all cases to elicit relatively stable, reliable rating distributions. The corresponding mean ratings elicited from each of the experimental films did not vary significantly.

The ratings elicited from the experimental abstracts are assumed to be reliable ones. (No formal indices of the reliability of the abstract-elicited ratings were obtained for this study.)

The indices of consistency.--The intrasubject and intersubject ANOVA indices (abstract/film) indicated similar experimental results when comparisons were made of the unit cell means. In each case the means compared did not vary significantly. Relatively strong, marginal sessional differences were noted, however (film two, p = .02; film one, p = .06).

Comparisons made of the combined cell means (abstract/film) showed dissimilar results. The intrasubject means varied significantly (p ≤ .01) in two of three cases, whereas the intersubject means varied only marginally (p = .14).

The experimental abstracts.--The type I experimental abstracts tended to elicit mean quality ratings which were lower than those elicited from the corresponding experimental films (Figures 13 and 14). In some cases the abstract-elicited means were significantly lower, but in most cases they were not. The significant differences found (intrasubject indices) were attributed to sessional measurement errors, i.e., to the presence of sessional effects.
Hence, overall, the abstract-elicited and film-elicited mean ratings were relatively similar in nature.

The independent variables.--The stimulus films, films one and two, did not elicit significantly different rating distributions. Contrary to prediction, the means elicited from films one and two were essentially equivalent. No significant main film effects were noted. Table 5 and Figure 13 show, however, that within the treatment groups the means elicited for film two tended to be higher than those for film one, whether elicited from the experimental abstracts or films. (The other two independent variables, treatments and stimulus conditions, were not of interest for Hypothesis One.)

Nuisance variables.--Significant overall main sessional effects were found. The subjects' mean ratings shifted upward in all cases from session one to session two, whether elicited initially from the experimental abstracts or from the experimental films (Figure 14). As anticipated, the subjects' teaching preferences were not strongly associated with the quality ratings which they made. The Pearson product-moment correlations (intrasubject, session one vs. session two) obtained of the subjects' quality and preference ratings tended to be low in value (most were between -.19 and .37).

ANOVA analyses.--The omnibus ANOVA analyses revealed the presence of no significant overall main treatment effects, and no significant overall main film effects. Strong overall main sessional effects (p ≤ .01) were identified by two of five related omnibus analyses. Significant, simple main sessional effects were not found via follow-up analyses which were made, however. No significant interaction effects were found, although a relatively strong, marginal overall treatment x session effect was noted (p = .03). Marginal, simple treatment x session interaction effects were associated with ratings obtained from film one (p = .12) and film two (p = .23).
Marginal, simple treatment x film interaction effects were also found, associated with ratings obtained from the film evaluation panel and treatment group I (p = .15), and from the panel and treatment group IV (p = .17). No other noteworthy interaction effects were revealed.

Mean Ratings and Standard Deviations

The mean ratings and standard deviations obtained from the experimental subjects and evaluation panel are listed in Table 5. The relationship of the means listed in Table 5 is illustrated in Figures 13 and 14. Figure 13 shows the interaction of films and treatment groups for each experimental session. Figure 14 depicts the interaction of sessions and treatment groups for each stimulus film, and for films one and two, combined. Unit cell means and combined means associated with the intrasubject and intersubject indices of consistency obtained are illustrated in Table 6.

Important mean rating differences identified via the ANOVA analyses are indicated in Tables 5 and 6 and Figures 13 and 14. The footnotes to the tables and figures list the particular analyses which revealed the differences, and the location within the appendices of corresponding, conventional ANOVA results tables. The footnotes also summarize, for each analysis, the particular F, df, and p values obtained. (These footnoting procedures will be used when appropriate, for other data tables presented in later sections of this chapter. Results tables are provided in the appendix for analyses showing significant differences at p ≤ .01, only.)

Table 5.--Means and standard deviations of the overall quality ratings elicited from the experimental subjects and film evaluation panel.

                                    Session S1            Session S2
                                    F1        F2          F1        F2
Panel (films F1, F2)      M         *4.81     *4.19**
                          SD        (1.12)    (1.21)
Treatment Group   T1      M         4.44**    4.81***     5.38**    5.62***
                          SD        (1.36)    (1.32)      (1.54)    ( .80)
                  T2      M         ...       ...         5.56      5.50
                          SD        ...       ...         (1.26)    ( .81)
                  T3      M         ...       ...         5.06      5.19
                          SD        ...       ...         (1.52)    (1.27)
                  T4      M         4.94      5.25**      5.06      5.50
                          SD        (1.43)    (1.43)      (1.43)    (1.09)
Grand Subject Means:
                  Session one(a)    4.69      5.03
                  Session two(b)    ...       ...         5.27      5.45
Film-Elicited Means, Combined(c):   Film one 5.20         Film two 5.41

Note: Means superscripted with the same number of asterisks, within columns and rows, were marginally different.
(a) n = 16. (b) n = 64. (c) n = 80 (sessions one and two, combined).
*p < .15. **p < .10. ***p < .05.

Table 6.--Session one vs. session two: Mean quality ratings.

                                    Session S1            Session S2
                                    F1        F2          F1        F2
Treatment Group Means     T1        4.44c     4.81b       5.38c     5.62b
                          T4        4.94      5.25        5.06      5.50
Column Means      S x F             4.69b''   5.03b'      5.22b''   5.56b'
                  T x S   (T1S1)    4.63a'                (T1S2)    5.50a'
                          (T4S1)    5.10d                 (T4S2)    5.28
                  Session           4.86a                 5.39a

Note: Means with the same superscripts (a, a; b'', b''; etc.) varied as noted below. (The significance level decreases progressively, from a to d.) See Appendix I for related ANOVA results tables.
a(F = 11.7; df 1,30; p < .01); (Table I1).
a'(F = 16.3; df 1,15; p < .01); (Table I2).
b(F = 7.1; df 1,15; p = .02); (ANOVA).
b'(F = 4.5; df 1,30; p = .04); (ANOVA).
b''(F = 5.3; df 1,30; p = .0[illegible]); (ANOVA).
c(F = 4.0; df 1,15; p = .06); (ANOVA).
d(F = 2.4; df 1,30; p = .14); (ANOVA).
eA relatively strong but marginal overall treatment x session interaction effect was found (F = 4.9; df 1,30; p = .03); (Table I1).

[Figure 13 plot: mean quality rating (vertical axis) for the treatment groups and the panel; panels for Session One and Session Two.]

Figure 13.--Mean quality ratings: The interaction of treatment groups and films, by sessions.

Note: Primed asterisks (*') denote intersubject differences. The shaded portions show the range of the film-elicited means.
a(F = 3.6; df 1,30; p = .07); (ANOVA).
b(df 1,30; p = .05; Duncan's).
c(F = 2.4; df 1,30; p = .14); (ANOVA).
dThe T1 vs. T5 and T4 vs. T5 interactions were both marginal; p = .15, p = .17, respectively (ANOVA).
*p ≤ .15. **p ≤ .10. ***p ≤ .05.

[Figure 14 plot: mean quality rating (vertical axis) by session (S1, S2) for the treatment groups and the panel; panels for Film One, Film Two, and Combined Means.]

Figure 14.--Mean quality ratings: The interaction of treatment groups and sessions, by stimulus films.

Note: Plain asterisks (*) denote intrasubject differences. Primed asterisks (*') denote intersubject differences. See Appendix I for related ANOVA results tables. The shaded portions show the range of the film-elicited means.
a(F = 2.4; df 1,30; p = .14); (ANOVA).
b(F = 4.0; df 1,15; p = .06); (ANOVA).
c(F = 7.1; df 1,15; p = .02); (ANOVA).
d(F = 16.3; df 1,15; p ≤ .01); (Table I2).
e(df 1,30; p = .05, Duncan's).
fEach of the three interactions was marginal: film one, p = .12; film two, p = .23; combined means, F = 4.9, df 1,30; p = .03.
*p ≤ .15. **p ≤ .10. ***p ≤ .05. ****p ≤ .01.

Hypothesis Two: The Subjects' Relevance Ratings, ANOVA Indices

Hypothesis Two

The use of overall ratings of film quality within the experimental abstracts will not significantly alter the mean magnitude of relevance ratings elicited from the abstracts.

General Findings

Hypothesis Two.--The hypothesis was rejected. The mean magnitude of the relevance ratings was significantly altered (p ≤ .01) by the presence of the invalid ratings, but not by the presence of the valid ratings or by the lack of the overall ratings.

The indices of reliability.--Overall, the experimental films were found to elicit relatively stable, reliable rating distributions. Only a few relatively minor rating anomalies were noted.

The indices of consistency.--Overall, the intrasubject and intersubject ANOVA indices revealed similar experimental results. In most cases when unit cell means were compared, both the intrasubject and intersubject indices indicated that the abstract-elicited and film-elicited mean relevance ratings did not vary significantly. In most cases as well when combined cell means were compared, both the intrasubject and intersubject indices (abstract/film) essentially showed the same results noted above for the unit cell comparisons.

The experimental abstracts.--In all cases the intersubject, abstract vs.
abstract comparisons revealed that the mean ratings elicited from each abstract style did not vary significantly from each other. And in most cases the abstract vs. film comparisons indicated that none of the three abstract styles elicited mean ratings which varied significantly from those obtained from the corresponding films.

Overall mean rating pattern.--The relevance means tended to exhibit three distinct "shifting" effects (overall main effects).

1. The "fail safe" phenomenon,^1 across sessions. The mean ratings obtained for the "high" relevance conditions (O1F1 and O2F2) tended to upshift from session one to two, whereas those obtained for the "mediocre" relevance conditions (O3F1 and O3F2) tended to downshift from session one to two (Figure 17).

2. "Abstract vs. film" shifts, during session one. The mean ratings obtained for film two from all of the abstracts were all higher than those actually elicited from film two. In most cases (five of six), the abstract-elicited mean ratings obtained for film one were lower than the film-elicited means (Figure 18).

3. Similar intersubject and intrasubject "abstract vs. film" shifts. The "fail safe" sessional shifts tended overall to be in the same direction as the "abstract vs. film" shifts which occurred during session one.

The independent variables.--Generally speaking, the stimulus films, films one and two, did not elicit significantly different rating distributions. The mean ratings elicited from the type III abstract style for films one and two varied significantly, however. The experimental treatments did not always elicit similar ratings. The mean relevance ratings obtained from the type III abstract style and the experimental films varied significantly.
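The "fail safe" sessional shift described above can be checked against the unit cell means reported in Tables 7-10 of this chapter. The sketch below averages the four treatment-group means for each stimulus condition and takes the session-two minus session-one difference. Averaging cell means in this way assumes equal group sizes, which is an assumption made here for illustration; the function and variable names are also illustrative.

```python
from statistics import mean

# Unit cell means as (session one, session two) pairs for treatment
# groups T1..T4, transcribed from Tables 7-10.
cell_means = {
    "O1F1": [(7.38, 8.88), (7.88, 8.88), (8.00, 8.56), (8.25, 9.06)],
    "O2F2": [(8.38, 8.31), (8.44, 8.63), (8.44, 7.56), (7.63, 8.56)],
    "O3F1": [(5.44, 4.69), (6.50, 5.38), (5.69, 5.44), (5.75, 5.63)],
    "O3F2": [(7.00, 5.81), (7.44, 6.38), (8.13, 6.44), (5.63, 5.81)],
}


def sessional_shift(pairs):
    """Return (session-one mean, session-two mean, shift) across groups."""
    s1 = mean(p[0] for p in pairs)
    s2 = mean(p[1] for p in pairs)
    return s1, s2, s2 - s1


shifts = {cond: sessional_shift(pairs) for cond, pairs in cell_means.items()}
for cond, (s1, s2, shift) in shifts.items():
    direction = "upshift" if shift > 0 else "downshift"
    print(f"{cond}: {s1:.2f} -> {s2:.2f} ({direction})")
```

Under these assumptions the two "high" relevance conditions shift upward and the two "mediocre" conditions shift downward, although the O2F2 shift is very small.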
Although the "high" and "mediocre" relevance stimulus conditions tended to produce sessional "fail safe" rating distributions which were opposite in direction, the corresponding mean ratings obtained for a given stimulus condition did not tend to vary significantly.

Nuisance variables.--Strong, overall sessional effects (p ≤ .01) were found in the analyses of the unit cell means and combined means. Most of the simple main sessional effects found were not significant ones, however. As anticipated, the subjects' teaching preferences were not strongly associated with the relevance ratings which they made. As for the quality ratings, the Pearson product-moment correlations obtained of the subjects' relevance and preference ratings tended to be low in value (all of the values were between -.33 and .37).

ANOVA analyses.--The omnibus analyses revealed the presence of no overall main treatment effects and few overall main film effects. Several strong overall main sessional effects were found, however. Follow-up analyses revealed that simple main effects were most often associated with treatment group three; the effects were induced from use of the type III abstracts. Relatively few simple main effects occurred. Several significant, overall main session x stimulus condition interaction effects occurred (in three of nine cases). No other significant interaction effects were found. Several marginally significant, overall main interaction effects were also found: treatment x session (p = .03; p = .19), treatment x condition (p = .08), and session x condition (p = .02) effects. Simple interaction effects were not tested for the relevance measure.

Mean Ratings and Standard Deviations

Tables 7, 8, 9, and 10 list the mean ratings and standard deviations obtained for the O1F1, O2F2, O3F1, and O3F2 stimulus conditions, respectively. Tables 7, 8, 9, and 10 reveal that numerous, marginally significant mean rating differences were found.
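The check on teaching preference as a nuisance variable rests on Pearson product-moment correlations between preference ratings and the other ratings. A minimal sketch of that coefficient follows. The two rating lists are hypothetical and serve only to show the computation; the function name is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings from eight subjects (illustration only).
relevance = [8, 7, 9, 6, 8, 5, 7, 9]
preference = [3, 5, 4, 4, 2, 5, 3, 4]

r = pearson_r(relevance, preference)
print(f"r = {r:.2f}")
```

A coefficient near zero indicates that preference and the rating of interest vary largely independently, which is the pattern the low values reported above suggest.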
Table 7.--The O1F1 stimulus condition: Means and standard deviations of the relevance ratings elicited from the experimental subjects.

Stimulus Condition O1F1
                                 Session S1          Session S2
Treatment Group   T1    M        *'e,f 7.38***b      8.88***b
                        SD       (2.47)              ( .58)
                  T2    M        7.88**c             8.88**c
                        SD       (2.42)              (1.58)
                  T3    M        *'f 8.00            8.56
                        SD       (2.00)              (1.89)
                  T4    M        *'e 8.25*d          9.06*d
                        SD       (2.35)              (1.48)
Column means                     7.88****a           8.85****a

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences.
a(F = 12.24; df = 1, 60; p = .005); (Table J1, Appendix J).
b(F = 6.8; df = 1, 15; p = .02); (ANOVA).
c(F = 3.5; df = 1, 15; p = .08); (ANOVA).
d(F = 2.9; df = 1, 15; p = .11); (ANOVA).
e(F = 2.84; df = 1, 30; p = .11); (Winer formula).
f(F = 2.02; df = 1, 30; p = .18); (Winer formula).
*p ≤ .20. **p ≤ .10. ***p ≤ .05. ****p ≤ .01.

Table 8.--The O2F2 stimulus condition: Means and standard deviations of the relevance ratings elicited from the experimental subjects.

Stimulus Condition O2F2(a)
                                 Session S1          Session S2
Treatment Group   T1    M        8.38                8.31
                        SD       (1.54)              (1.62)
                  T2    M        8.44                **'d 8.63
                        SD       (1.54)              (1.20)
                  T3    M        8.44**b             **'d 7.56**b
                        SD       (1.09)              (2.03)
                  T4    M        7.63*c              8.56*c
                        SD       (2.55)              (1.50)
Column means                     8.22                8.27

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences.
aA marginally significant overall treatment x session interaction occurred (F = 2.25; df = 3, 60; p = .09); (ANOVA).
b(F = 3.3; df = 1, 15; p = .09); (ANOVA).
c(F = 2.9; df = 1, 15; p = .11); (ANOVA).
d(df = 3, 60; p = .10, Duncan's).

Table 9.--The O3F1 stimulus condition: Means and standard deviations of the relevance ratings elicited from the experimental subjects.

Stimulus Condition O3F1
                                 Session S1          Session S2
Treatment Group   T1    M        *'c 5.44            4.69
                        SD       (2.55)              (2.27)
                  T2    M        *'c 6.50**b         5.38**b
                        SD       (2.06)              (2.57)
                  T3    M        5.69                5.44
                        SD       (2.79)              (2.60)
                  T4    M        5.75                5.63
                        SD       (3.41)              (2.91)
Column means                     5.85*a              5.29*a

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences.
a(F = 2.63; df = 1, 60; p = .12); (ANOVA).
b(F = 3.2; df = 1, 15; p = .10); (ANOVA).
c(F = 2.20; df = 1, 30; p = .16); (Winer formula).
*p ≤ .20. **p ≤ .10.

Table 10.--The O3F2 stimulus condition: Means and standard deviations of the relevance ratings elicited from the experimental subjects.

Stimulus Condition O3F2(a)
                                 Session S1              Session S2
Treatment Group   T1    M        **'e,h 7.00*b           5.81*b
                        SD       (1.96)                  (2.45)
                  T2    M        ***'f 7.44*c            6.38*c
                        SD       (2.06)                  (2.36)
                  T3    M        ****'g,h 8.13****d      6.44****d
                        SD       (1.62)                  (1.93)
                  T4    M        ****'e,f,g 5.63         5.81
                        SD       (2.89)                  (2.73)
Column means                     7.05****a               6.11****a

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences.
aAn overall main treatment and main session effect occurred, each (Table J2, Appendix J). (Fs = 7.45; df = 1, 60; p = .01); (Ft = 2.17; df = 3, 60; p = .10).
b(F = 2.6; df = 1, 15; p = .13); (ANOVA).
c(F = 2.5; df = 1, 15; p = .14); (ANOVA).
d(F = 9.6; df = 1, 15; p = .01); (ANOVA).
e(df = 3, 60; p = .10, Duncan's).
f(df = 3, 60; p = .05, Duncan's).
g(df = 3, 60; p = .01, Duncan's).
h(F = 2.39; df = 1, 30; p = .14); (Winer formula).

Interaction data plots: Unit cell means.--The relationships of the means listed in Tables 7-10 are illustrated in Figures 15, 16, and 17. Figures 15 and 16 show the interaction of treatment groups and sessions for each stimulus condition. Figure 17 illustrates the interaction of stimulus conditions and sessions, by treatment groups. The intersubject (session one) differences noted in Tables 7-10 are similarly noted with asterisks in Figure 15; the intrasubject (sessional) differences, in Figure 17; and both, in Figure 16. The sessional "fail safe" effects which occurred are best illustrated in Figure 17.
The figure depicts the "upshift" tendency from session one to two for the "high" relevance conditions (O1F1 and O2F2); and the corresponding "downshift" tendency for the "mediocre" relevance conditions (O3F1 and O3F2).

Although Hypothesis Two was sometimes confirmed, Figures 15-17 clearly illustrate the influence of the film quality indicators supplied in the type II and III abstracts. Of the eight mean ratings elicited from each of the O3F1 and O3F2 conditions, the highest rating obtained for each condition was elicited from the type II and III abstract styles, respectively; the style supplied with the "higher" film one vs. film two rating, respectively. The figures show that the mean rating elicited from the type III abstracts (treatment group III) for the O3F2 condition was strongly different than the corresponding film-elicited mean. Both the intersubject and intrasubject ANOVA comparisons showed significance at p ≤ .01.

Four relationships depicted in Figures 15, 16, and 17 also demonstrate the influence of the lack of film quality indicators in the type I abstracts. To illustrate, two relationships are directly attributable to the relatively "low" mean elicited for the O1F1 condition from the type I abstract style. First, the means elicited for the O1F1 condition from treatment group one are the only O1F1-elicited means which exhibited (a) a strong, "fail-safe" upshift from session one to two (p = .02), and (b) a substantial intersubject shift (p = .11). Second, the "low" type I abstract-elicited mean is the lowest of the four session one means obtained for the O1F1 condition. Third, in all cases the type I abstract-elicited means were the lowest values obtained from the three experimental abstract styles. Fourth, the treatment group I means obtained for both the O3F1 and O3F2 conditions during session one are closest to the average film-elicited means obtained for each condition, respectively.
Overall, these four findings imply that the subjects' judgments were more "uncertain" about the high O1F1 relevance condition, and more "certain" about the mediocre relevance conditions, when elicited from the type I abstracts (rather than from the type II and III abstracts).

Figures 15 and 17 reveal another important relationship. The abstract-elicited means for the O3 objective for film two were all substantially higher than those for film one. This relationship was strongly maintained in session two, as well. In contrast, the corresponding film-elicited means obtained from treatment group IV were relatively similar; and the mean for film one, higher than that for film two, in session one.

Numerous main treatment effects are illustrated in Figure 15 (parallel, nonintersecting treatment plots). Although none of the effects was significant, the treatment group III vs. IV main effect was marginally significant (p = .04), as was the treatment group II vs. IV effect (p = .12). The main treatment effects illustrate the "carry-over" of learnings and response tendencies made in session one, to session two. The effects imply, as expected, that the subjects' perceptions during session one were often not totally dissolved by the two-week waiting period used between sessions one and two. The perceptions still transferred over to session two.

Figure 17 also shows the relatively strong session x condition interactions, and main stimulus condition effects which occurred. (The main effects were expected, especially between the high and mediocre relevance conditions.)

The abstract vs. film shifts.--Figure 18 compares the intrasubject and intersubject shifts which occurred. The figure illustrates the two general filmic tendencies of the abstract-elicited responses. Mean responses for film one were usually lower, and for film two usually higher, than the corresponding film-elicited means obtained in both sessions one and two.
(The O3F1 condition was an exception to this finding.) Figure 18 also illustrates the "carry-over" effects which occurred. The effects are particularly noticeable in the T2 and T3 panels, film two, for the O3F2 condition. The intrasubject, session two means (film) in each case are noticeably inflated, when contrasted with the corresponding intersubject, session one means (film), below each.

[Figure 15 plot: mean relevance rating (vertical axis) by stimulus conditions and sessions (horizontal axis), one line per treatment group.]

Figure 15.--Mean relevance ratings: The interaction of treatment groups and sessions, by stimulus conditions.

Note: 1. Shaded portions define the range of the five film-elicited means. 2. Primed asterisks (*') denote intersubject differences (abstract/film).
*p ≤ .15. **p ≤ .10. ***p ≤ .05. ****p ≤ .01.

[Figure 16 plot: mean relevance rating (vertical axis) by stimulus conditions and treatment groups (horizontal axis); separate markers for session one, session two, and the range of film-elicited means.]

Figure 16.--Mean relevance ratings: The interaction of sessions and treatment groups, by stimulus conditions.

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences (abstract/film).
*p ≤ .15. **p ≤ .10. ***p ≤ .05. ****p ≤ .01.

[Figure 17 plot: mean relevance rating (vertical axis) by treatment groups and sessions (horizontal axis: S1 and S2 within T1, T2, T3, T4), one line per stimulus condition (O1F1, O2F2, O3F1, O3F2).]

Figure 17.--The interaction of stimulus conditions and sessions, by treatment groups.

Note: Asterisks (*) denote intrasubject differences.

[Figure 18 plot: mean relevance rating (vertical axis) for abstract vs. film, in panels for film one and film two by treatment group (T1, T2, T3); separate line styles for intrasubject consistency and intersubject consistency.]

Figure 18.--The intrasubject and intersubject abstract vs. film shifts noted for the relevance stimulus conditions, by films and treatment groups.

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences.
*p ≤ .20. **p ≤ .10. ***p ≤ .05. ****p ≤ .01.

Mean Ratings: Combined Cell Means

Mean relevance ratings obtained for various combinations of stimulus conditions are listed in Tables 11, 12, 13, and 14. Table 11 lists the grand means obtained from each treatment-session group (the average mean rating obtained across the O1F1, O2F2, O3F1, and O3F2 stimulus conditions, combined). Table 12 provides the mean ratings obtained from films one and two, during sessions one and two, respectively (the average of the ratings obtained from the high and mediocre relevance conditions, combined). Table 13 contains the mean ratings elicited from sessions one and two, for the mediocre and high relevance conditions (the average of the ratings elicited from films one and two, combined). Table 14 gives the mean ratings of sessions one and two combined, for films one and two, for both the mediocre and the high relevance conditions.

The ANOVA Analyses: Combined Cell Means

The combined means analyses revealed significant differences more often, and for more treatment groups, than did the analyses associated with the unit cell means. All of the treatment groups except treatment group one exhibited strong intrasubject differences.
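Each combined cell mean is an average of unit cell means. As a hedged illustration, the sketch below averages the O3F1 and O3F2 unit cell means from Tables 9 and 10 for each treatment group and session. With equal group sizes (an assumption made here), the results agree to two decimals with the mediocre relevance columns of Table 13. The dictionary layout and names are illustrative.

```python
# Unit cell means as (session one, session two), from Tables 9 and 10.
o3f1 = {"T1": (5.44, 4.69), "T2": (6.50, 5.38), "T3": (5.69, 5.44), "T4": (5.75, 5.63)}
o3f2 = {"T1": (7.00, 5.81), "T2": (7.44, 6.38), "T3": (8.13, 6.44), "T4": (5.63, 5.81)}


def combine(first, second):
    """Average two sets of cell means, group by group and session by session."""
    return {
        group: tuple((x + y) / 2 for x, y in zip(first[group], second[group]))
        for group in first
    }


# Combined "mediocre relevance" means (films one and two together).
mediocre = combine(o3f1, o3f2)
for group, (s1, s2) in mediocre.items():
    print(f"{group}: S1 = {s1:.2f}, S2 = {s2:.2f}")
```

The same averaging, applied across sessions or across all four stimulus conditions instead, gives the other combined tables, again on the assumption of equal cell sizes.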
Treatment group I showed no significant intrasubject or intersubject rating differences. Treatment group II showed strong filmic differences (p ≤ .01) for the mediocre relevance condition, as did treatment group III (Table 14). Treatment group III also exhibited strong intrasubject sessional differences for the ratings elicited from film two (Table 12). As well, treatment group III exhibited strong intersubject abstract vs. film differences for film two (Table 12).

The analyses made of the unit cell means and combined means were similar in that both types tended to show numerous marginally significant rating differences (.02 ≤ p ≤ .15). As well, both types of analyses revealed (a) no significant abstract vs. abstract differences and (b) that the type III abstract style was the only style to elicit ratings which varied significantly (p < .01) from those obtained from the corresponding films.

Table 11.--Mean relevance ratings: Combined means across the four stimulus conditions, by sessions and treatment groups.

                         Session One                 Session Two
Treatment Group          (O1F1 O2F2 O3F1 O3F2)       (O1F1 O2F2 O3F1 O3F2)
T1                       7.05                        6.92
T2                       *'7.56                      7.31
T3                       *'7.56**a                   7.00**a
T4                       *'6.81                      7.26
Column means             7.25                        7.13

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences (abstract/film).
a(F = 4.8; df 1,15; p = .04); (ANOVA).
*p ≤ .15. **p ≤ .05.

Table 12.--Mean relevance ratings obtained for films one and two, for the high and mediocre relevance conditions, combined, by treatment group and session.

                         Film one (O1F1, O3F1)       Film two (O2F2, O3F2)
Treatment Group          S1          S2              S1                   S2
T1                       6.41        6.78            **'7.69*             7.06*
T2                       7.19        7.13            ***'7.94d            7.50
T3                       6.84        7.00            c****'8.28****b      7.00****b
T4                       7.00        7.34            c****'6.63d          7.18
Column means             6.86        7.06            7.63***a             7.19***a

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences (abstract/film).
a(F = 4.2; df 1,60; p = .04); (ANOVA).
b(F = 13.4; df 1,15; p < .01); (ANOVA).
c(F = 8.5; df 1,30; p < .01); (Winer formula).
d(F = 5.4; df 1,30; p = .03); (Winer formula).
*p ≤ .15. **p ≤ .10. ***p ≤ .05. ****p ≤ .01.

Table 13.--Mean relevance ratings obtained for the high and mediocre relevance conditions for films one and two, combined, by treatment group and session.

                         Mediocre Relevance (O3F1, O3F2)    High Relevance (O1F1, O2F2)
Treatment Group          S1            S2                    S1            S2
T1                       6.22**        5.25**                7.88*         8.59*
T2                       **'6.97**     5.88**                8.16*         8.75*
T3                       **'6.91**     5.94**                8.22          8.06
T4                       **'5.69       5.72                  7.94***c      8.81***c
Column means             6.45****a     5.70****a             8.05****b     8.56****b

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences (abstract/film).
a(F = 8.1; df 1,60; p = .01); (Table J3, Appendix J).
b(F = 6.6; df 1,60; p = .01); (Table J4, Appendix J).
c(F = 4.5; df 1,15; p = .05); (ANOVA).
*p ≤ .15. **p ≤ .10. ***p ≤ .05. ****p ≤ .01.

Table 14.--Mean relevance ratings obtained for the mediocre and high relevance conditions, for sessions one and two combined, by treatment group and stimulus condition.

                         Mediocre Relevance                  High Relevance
                         (sessions S1 and S2 combined)       (sessions S1 and S2 combined)
Treatment Group          O3F1          O3F2                  O1F1          O2F2
T1                       5.06**        6.41**                8.13          8.34
T2                       5.94****c     **'6.91****c          8.38          8.53
T3                       5.57****b     d***'7.29****b        8.28          8.00
T4                       5.69          d***'5.72             8.66          8.09
Column means             5.56****a     6.58****a             8.36          8.24

Note: 1. Plain asterisks (*) denote intrasubject differences. 2. Primed asterisks (*') denote intersubject differences (abstract/film).
a(F = 11.1; df 1,60; p < .01); (Table J3, Appendix J).
b(F = 9.2; df 1,15; p = .01); (ANOVA).
c(F = 7.7; df 1,15; p = .01); (ANOVA).
d(F = 4.5; df 1,30; p = .04); (Winer formula).
*p ≤ .15. **p ≤ .10. ***p ≤ .05. ****p ≤ .01.

Hypothesis Three: The Subjects' Betterness Ratings, ANOVA Indices

Hypothesis Three

The use of overall ratings of film quality within the experimental abstracts will significantly alter the mean magnitude of betterness ratings elicited from the abstracts.

General Findings

Hypothesis Three.--The hypothesis was rejected. In all cases, the abstract-elicited and film-elicited mean ratings did not vary significantly from each other.

The indices of reliability.--In all cases, the experimental films were found to elicit relatively stable, reliable rating distributions. The mean betterness ratings elicited from the films did not vary significantly from each other.

The indices of consistency.--The intrasubject and intersubject ANOVA indices (unit cell means, abstract vs. film) exhibited similar experimental results.

The experimental abstracts.--The mean betterness rating elicited from the type I abstracts (which contained no overall ratings of film quality) was very similar to, but lower than, that obtained from the experimental films. The two abstract styles which contained overall ratings of film quality elicited mean betterness ratings which were substantially inflated, when contrasted with those elicited during session one from (a) the abstract style which did not contain overall ratings of film quality (p = .05, .10) and (b) the experimental films (p = .11, .14) (Figure 19). But overall, the differences noted from the abstract vs. abstract and abstract vs. film comparisons were only marginal ones.

The overall mean rating pattern.--The actual pattern was similar to that predicted, in that the type II abstract style elicited mean ratings which did not vary significantly from those obtained from the experimental films. The pattern was most dissimilar, in that both the type I and type III abstract styles did not elicit mean ratings which varied significantly from the film-elicited means, or from those obtained from the type II abstract style.
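The table and figure notes in this chapter flag differences with one to four asterisks according to the p value obtained. A small helper makes that coding explicit. The default thresholds follow the four-level notes (.15, .10, .05, .01); some tables use .20 as the lowest level, so the thresholds are a parameter. The function name is illustrative and not part of the study.

```python
def asterisk_code(p, thresholds=(0.01, 0.05, 0.10, 0.15)):
    """Return the asterisk flag for a p value.

    Thresholds are listed strictest first; the strictest level a p value
    meets determines how many asterisks it receives.
    """
    for rank, limit in enumerate(thresholds):
        if p <= limit:
            return "*" * (len(thresholds) - rank)
    return ""  # not flagged at any listed level


# The marginal p values cited for the betterness comparisons, plus two others.
for p in (0.01, 0.05, 0.10, 0.11, 0.14, 0.22):
    print(p, asterisk_code(p) or "(not flagged)")
```

Passing `thresholds=(0.01, 0.05, 0.10, 0.20)` reproduces the coding used in the tables whose lowest level is p ≤ .20.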
Additionally, the mean betterness ratings exhibited distinct overall "shifting" effects (overall main effects) similar to those noted about the relevance ratings:

1. The "fail safe" phenomenon, across sessions. The mean ratings elicited from treatment groups II and III tended to "downshift" from session one to two, whereas the ratings obtained from treatment groups I and IV tended to "upshift" from session one to two (Figure 19).

2. Similar intrasubject and intersubject "abstract vs. film" shifts. The "fail safe" sessional shifts tended overall to be in the same direction as the "abstract vs. film" shifts which occurred during session one (Figure 20).

The independent variables.--Neither of the variables, stimulus films or treatments, showed significantly different rating distributions. Only one stimulus condition, the O3(F1F2) condition, was tested; the influence of different conditions was not ascertained for this hypothesis.

Nuisance variables.--The mean sessional effects found were not significant ones. As anticipated, the subjects' teaching preferences were not strongly associated with the betterness ratings which they made. As for the quality and relevance ratings, the Pearson product-moment correlations obtained of the subjects' betterness and preference ratings tended to be low in value (seven of the eight values were between -.30 and .31).

ANOVA analysis.--The omnibus analysis which was administered revealed the presence of no overall main treatment effects (p = .22), sessional effects, or treatment x session interaction effects (p = .21). As implied by previous comments, no simple main effects were found. Simple interaction effects were not tested for the betterness measure. The results of the omnibus ANOVA analysis made of the betterness ratings are provided in Appendix H1.

Mean Ratings and Standard Deviations

Table 15 lists the mean betterness ratings and the standard deviations obtained, by treatments and sessions.
The table reveals that the mean ratings exhibited a number of marginal magnitude differences (.05 ≤ p ≤ .14). The treatment x session interaction plot of the means is provided in Figure 19.

Table 15.--Means and standard deviations of the betterness ratings obtained from the experimental subjects, by treatment groups and sessions.

                                 Session S1          Session S2
Treatment Group   T1    M        4.56***a,b          5.44
                        SD       (2.30)              (2.47)
                  T2    M        6.38**b,d           5.38
                        SD       (1.96)              (2.36)
                  T3    M        6.55***a,c          6.06
                        SD       (2.27)              (2.54)
                  T4    M        4.88*c,d            5.81
                        SD       (2.91)              (2.40)

a(df 3,60; p = .05); (Duncan's).
b(F = 3.0; df 1,30; p = .10); (Winer).
c(F = 2.8; df 1,30; p = .11); (Winer).
d(F = 2.5; df 1,30; p = .14); (Winer).
*p ≤ .15. **p ≤ .10. ***p ≤ .05.

[Figure 19 plot: mean betterness rating (vertical axis, 4.0 to 7.0) by session (S1, S2), one line per treatment group.]

Figure 19.--Mean betterness ratings: The interaction of treatment groups and sessions.

Note: The shaded portion shows the range of the film-elicited means.
a(df = 3,60; p = .05, Duncan's).
b(F = 3.03; df 1,30; p = .10); (Winer).
c(F = 2.80; df 1,30; p = .11); (Winer).
d(F = 2.50; df 1,30; p = .14); (Winer).
*p ≤ .15. **p ≤ .10. ***p ≤ .05.

Abstract vs. Film Shifts

The intrasubject and intersubject, abstract vs. film shifts which occurred are illustrated in Figure 20. The figure reveals that the direction of the shifts was the same for the two treatment groups which used the type II and III abstract styles, the styles which contained overall ratings of film quality. On the other hand, the direction of the shifts elicited from treatment group I was not the same as that elicited from treatment groups II and III. It was opposite in direction. (Although not previously illustrated, when plotted, the mean ratings obtained for the quality measure from treatment group I (type I abstracts) show abstract vs. film shifts similar to that illustrated in Figure 20 for treatment group I, for the betterness measure.)

[Figure 20 plot: mean betterness rating (vertical axis, 4.0 to 7.0) for abstract vs. film, in one panel per treatment group; separate line styles for intrasubject consistency and intersubject consistency.]

Figure 20.--The intrasubject and intersubject abstract vs. film shifts noted for the betterness measure, by treatment groups.

*p = .11. **p = .14.

Hypothesis Four: The Subjects' Relevance Ratings, Rank Indices

Hypothesis Four

The use of overall ratings of film quality within the experimental abstracts will not significantly alter the rank order relationship of mean relevance ratings elicited from the abstracts.

General Findings

Hypothesis Four.--The hypothesis was rejected. The rank order relationship of the mean relevance ratings elicited from the type III abstract style varied significantly (p = .01) from that elicited from the experimental films and the other two abstract styles.^2

The indices of reliability.--The intrasubject and intersubject indices of reliability did not produce the same experimental results. In all cases the intersubject reliability indices produced very stable, reliable rank correlation values (1.00 in all cases). To the contrary, although the intrasubject rank index obtained was relatively high in value (.80), it was only marginally reliable (p ≤ .10).

The indices of consistency.--The intrasubject and intersubject indices of consistency (abstract vs. film) were similar in that none of the indices were significant from zero (equal to 1.00, n = 4). The indices were dissimilar in that (a) the intersubject indices (abstract vs.
film) varied significantly from the corresponding indices of reliability, whereas (b) the intrasubject indices did not.

The experimental abstracts.--The three abstract styles were similar in that both the intrasubject and the intersubject (abstract/film) indices of consistency elicited from each were not significant from zero. Overall, the rank coefficients elicited from the type III abstracts were least similar to those elicited from the films. The type I and II abstract styles produced ranks which were identical in order, but which varied significantly from those obtained from the type III abstract style.

The independent variables.--The major variable of interest, treatments, showed significantly different rating distributions. (The variables "stimulus films" and "stimulus conditions" were not tested for this hypothesis.)

Nuisance variables.--The sessional effects which were found within the experimental ratings tended to depress the intrasubject coefficients of reliability, and inflate the intrasubject coefficients of consistency which were obtained.

The Rank Indices

The rank coefficients follow for the three types of indices of consistency which were obtained for the experimental abstract styles. The corresponding indices of reliability are also provided. The treatment-session groups which were compared to obtain the indices are noted in parentheses.

Coefficients significant from zero are coded with an asterisk (*). The critical value of the Spearman rank correlation coefficient which was used for significance testing purposes was 1.00 (p = .05; n = 4). (A more stringent test, at p = .01, was not possible.)

Intrasubject indices.--

    Type I abstracts      .80   (t1 vs. t1')
    Type II abstracts     .80   (t2 vs. t2')
    Type III abstracts    .40   (t3 vs. t3')
    Reliability index     .80   (t4 vs. t4')

Intersubject indices, abstract vs. film.--

    Abstracts: Consistency Indices       Films: Reliability Indices
    Type I (t1 vs. t4)      .60          1.00*  (t1' vs. t4')
    Type II (t2 vs. t4)     .60          1.00*  (t2' vs. t4')
    Type III (t3 vs. t4)    .00          1.00*  (t3' vs. t4')

Intersubject indices, abstract vs. abstract.--

    Abstracts: Consistency Indices       Films: Reliability Indices
    Type I vs. type II      1.00*        1.00*  (t1' vs. t2')
    Type I vs. type III      .80         1.00*  (t1' vs. t3')
    Type II vs. type III     .80         1.00*  (t2' vs. t3')

Hypothesis Five: The Subjects' Ratings, Rank Indices, Combined Means

Hypothesis Five

The use of overall ratings of film quality within the experimental abstracts will significantly alter the rank order relationship of mean relevance, quality, and betterness ratings, when combined, elicited from the abstracts.

General Findings

Hypothesis Five.--The hypothesis was confirmed. The rank order relationships of the mean ratings obtained from the type II and III abstract styles varied significantly (p = .01) from each other, from those elicited from the experimental films, and from that of the type I abstract style.3

The indices of reliability.--As for the mean relevance ranks, the intrasubject and intersubject combined means rank indices did not produce the same experimental results. In all cases, the intersubject reliability indices produced very stable, reliable rank correlation values (.93 to 1.00). To the contrary, the intrasubject rank index obtained was only marginally reliable (p = .02), although it was relatively high (.86) in value.

The indices of consistency.--The intrasubject and intersubject indices exhibited the same pattern of results obtained for Hypothesis Four for the relevance measure, per se. The intrasubject and intersubject indices of consistency (abstract vs. film) were similar in that none of the indices were significant from zero (>= .89, n = 7). The indices were dissimilar in that (a) the intersubject indices varied significantly from those obtained from the experimental films, whereas (b) the intrasubject indices did not.
Hence, as for the relevance measure, the intrasubject and intersubject indices of consistency did not show similar experimental results.

The experimental abstracts.--As implied by previous comments, the three abstract styles were similar in that the mean ranks elicited from each varied significantly from those elicited from the experimental films. The abstracts were dissimilar in that they each elicited mean ranks which were significantly different from the others. The type I abstracts elicited indices of consistency most similar to the experimental films. The type II and III abstracts elicited indices of consistency least similar to the experimental films.

The independent variables.--The variable treatments showed significantly different rating distributions. (The variables "stimulus films" and "stimulus conditions" were not tested directly for this hypothesis.)

Nuisance variables.--As for the relevance rank indices, the sessional effects present within the relevance, quality, and betterness ratings tended to (a) depress the intrasubject indices of reliability and (b) inflate the intrasubject indices of consistency which were obtained.

The Rank Indices

The rank coefficients follow for the three types of indices of consistency which were obtained for the experimental abstract styles. The indices of consistency and the corresponding indices of reliability are displayed in a fashion similar to that used to report the relevance ranks obtained for Hypothesis Four.

Once again, coefficients significant from zero are coded with an asterisk (*). The critical value of the Spearman rank correlation coefficient which was used for significance testing purposes was .89 (p = .01, n = 7).

Intrasubject indices.--

    Type I abstracts      .82   (t1 vs. t1')
    Type II abstracts     .57   (t2 vs. t2')
    Type III abstracts    .54   (t3 vs. t3')
    Reliability index     .86   (t4 vs. t4')

Intersubject indices, abstract vs. film.--

    Abstracts: Consistency Indices       Films: Reliability Indices
    Type I (t1 vs. t4)      .75           .96*  (t1' vs. t4')
    Type II (t2 vs. t4)     .32           .93*  (t2' vs. t4')
    Type III (t3 vs. t4)    .39           .96*  (t3' vs. t4')

Intersubject indices, abstract vs. abstract.--

    Abstracts: Consistency Indices       Films: Reliability Indices
    Type I vs. type II       .68          .96*  (t1' vs. t2')
    Type I vs. type III      .71         1.00*  (t1' vs. t3')
    Type II vs. type III     .39          .96*  (t2' vs. t3')

The Film Evaluation Panel's Quality Ratings

Two basic factors were analyzed: the relative magnitude of the panel's ratings and the reliability of the ratings.

General Findings

1. The overall mean quality rating obtained for film one was significantly higher (p < .01) than that of film two.

2. The mean criterion ratings, the ratings obtained from each question posed by the Film Quality Rating Instrument, generally supported the overall mean quality ratings which were obtained.

3. The reliability of the mean ratings and of the Film Quality Rating Instrument were found to be adequate for the purpose of the experimental phase of the study.

Overall Means and Standard Deviations

The mean ratings elicited from questions 20 and 21 of the Film Quality Rating Instrument (Appendix A1) were averaged and used as the quality rating for "grades 4-6" in the type II experimental abstracts. To compute the averages, the mean ratings obtained from question 20 were first converted to a seven-unit metric (the scale used as the basis of ratings made by the experimental subjects).

Table 16 summarizes the means and standard deviations of the ratings obtained from the panel. The average value found for film one was 5.03; for film two, 4.25. These values corresponded closely to the "good" (5.0) and "average" (4.0) ratings established for the ratings used by the experimental subjects. Hence, the rated quality of film one was designated "good" and that of film two, "average" in the type II abstracts.

The actual raw score ratings obtained from the film evaluation panel are provided in Appendices 01 and 02.
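The conversion and averaging described above can be illustrated with a short calculation. The sketch below is only an illustration, not the author's documented procedure: it assumes that the six-unit means were rescaled linearly by a factor of 7/6 (an assumption which happens to reproduce the converted values listed in Table 16), and the function names are hypothetical.

```python
# Hypothetical helper names; the linear 7/6 rescaling is an assumption,
# chosen because it matches the converted values listed in Table 16.


def to_seven_unit(six_unit_mean: float) -> float:
    """Rescale a mean from the six-unit scale to the seven-unit metric."""
    return six_unit_mean * 7.0 / 6.0


def abstract_quality_rating(q20_six_unit: float, q21_seven_unit: float) -> float:
    """Average the converted question 20 mean with the question 21 mean."""
    return (to_seven_unit(q20_six_unit) + q21_seven_unit) / 2.0


# Panel means from Table 16: question 20 (six-unit scale) and
# question 21 (seven-unit scale), for film one and film two.
film_one = abstract_quality_rating(4.50, 4.81)
film_two = abstract_quality_rating(3.69, 4.19)

print(f"Film one: {film_one:.2f}")
print(f"Film two: {film_two:.2f}")
```

Under this assumption the two averages round to 5.03 and 4.25, the values reported above for films one and two.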
The Difference Between the Overall Means

Table 16 shows that the mean rating for film one was significantly greater than the mean rating for film two, for each of three comparisons made (p < .05, df = 15 in each case), whether ratings elicited from question 20, question 21, or questions 1-20 combined, of the rating instrument, were used as the basis of comparison.

The mean difference (D) between the means, and the product-moment correlation (r) of the 16 pairs of ratings associated with each comparison, follow:

    Rating Source       D      r
    Question 20        .81    .17
    Question 21        .62    .13
    Questions 1-20     .58    .03

The Mean Criterion Ratings

Questions 1-20 of the Film Quality Rating Instrument were each designed to address a specific film quality criterion. The mean ratings obtained for each criterion question are summarized in Table 17. The standard deviation, mean difference, rank of the mean differences, and the difference between the corresponding standard deviations of the ratings for film one and film two are also provided in Table 17 for each criterion question.

Table 16.--Means and standard deviations of the quality ratings elicited from the film evaluation panel.

    Measures Obtained From Questions 20, 1-20, and 21 of the Film Quality Rating Instrument

                     6-Unit Rating Scale      7-Unit Rating Scale(a)
    Film  Measure     Q20       Q1-20       Q20(b)   Q1-20(b)   Q21        Average (Q20+Q21)/2
    F1    M          4.50**     4.70*       5.25     5.48       4.81***    5.03
          SD          .89        .77        ...      ...        1.06
    F2    M          3.69**     4.12*       4.31     4.81       4.19***    4.25
          SD         1.08        .88        ...      ...        1.21

    Note: Means with the same number of asterisks, within columns, were found to be significantly different, based upon t-tests of the difference between means of correlated samples.
    a The t-test was administered for the Q21 column values only. Values in the other three columns were not tested.
    b These values were converted from the corresponding 6-unit scale values.
    *p < .05.  **p < .025.  ***p < .01.
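The note to Table 16 refers to t-tests of the difference between means of correlated samples. As a hedged illustration of how such a statistic can be obtained from summary values alone, the sketch below uses the standard paired-samples identity var(d) = s1^2 + s2^2 - 2(r)(s1)(s2). The function name is hypothetical, and it is assumed that the tabled standard deviations are sample (n - 1) values. The example plugs in the question 20 figures (six-unit scale) and is not a reproduction of the author's own computations.

```python
import math
from typing import Tuple


def paired_t_from_summary(mean_diff: float, sd1: float, sd2: float,
                          r: float, n: int) -> Tuple[float, int]:
    """Return (t, df) for a correlated-samples t-test from summary statistics.

    Assumes sd1 and sd2 are sample standard deviations (n - 1 denominator)
    and r is the product-moment correlation of the n paired ratings.
    """
    # Variance of the paired differences, from the two variances and r.
    var_diff = sd1 ** 2 + sd2 ** 2 - 2.0 * r * sd1 * sd2
    # Standard error of the mean difference.
    se = math.sqrt(var_diff / n)
    return mean_diff / se, n - 1


# Question 20: D = .81, SDs of .89 and 1.08, r = .17, 16 pairs of ratings.
t_value, df = paired_t_from_summary(0.81, 0.89, 1.08, 0.17, 16)
print(f"t({df}) = {t_value:.2f}")
```

With these inputs the statistic comes to roughly 2.5 on 15 degrees of freedom. A low correlation between the paired ratings, as here, leaves the standard error close to what two independent samples would give.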
The following findings, illustrated in Table 17, define the basic differences identified between the quality of films one and two.

1. The mean ratings for film one tended to be higher than those for film two, across the twenty criterion questions (in eighteen of twenty cases).

2. Six general criteria and twelve specific criteria exhibited relatively large mean rating differences (D) across films one and two. The criteria are listed below, according to the (D) magnitudes obtained:

    General Criteria                              D
    F.  Visual quality                           .98
    I.  Overall message design                   .91
    E.  Expected audience reaction               .84
    D.  Contemporariness-datedness               .63
    G.  Sound quality                            .62
    A.  Content validity                         .50

    Specific Criteria                             D
    12. Use of the film medium                  1.57
    18. Use of mood-tone                        1.31
    10. Interest-appeal                          .93
    13. Visual content selection                 .74
    11. Evokes constructive impressions          .74
    14. Overall visual quality                   .63
     9. Datedness                                .63
     3. Subject matter treatment                 .63
    15. Sound quality                            .62
     2. Subject matter value                     .62
    19. Overall design: learner attributes       .50
     7. Vocabulary                               .49

3. The mean quality ratings obtained for each criterion question for film one (mean range = 4.13 to 5.25) were all higher in value than the overall mean rating obtained for film two (4.12).

The Reliability of the Ratings

Three relationships were investigated: the reliability of the overall mean ratings, the rating instrument, and the mean criterion ratings.

The overall mean ratings.--The split-half, intersubject reliability indices obtained for the experimental films from the panel were these:

    Film one    .96 (p < .01)
    Film two    .60 (p < .10)

The correlation coefficients above indicate that the reliability of the film one ratings was more stable than that of film two.

The Film Quality Rating Instrument.--As computed via the ANOVA reliability assessment method (Winer, 1971, pp.
283-89), the indices of intersubject agreement of ratings obtained for questions 1-20 of the instrument were these (df = 79):

    Film one    .46 (p < .01)
    Film two    .58 (p < .01)

The overall means of the ratings obtained from each panel member for questions 1-19 of the instrument were also correlated with the ratings elicited from question 20, for each experimental film. The product-moment correlation coefficients computed for each film were these (df = 14):

    Film one    .80 (p < .01)
    Film two    .83 (p < .01)

The ratings obtained from question 20 (six-point scale) and question 21 (seven-point scale) were relatively consistent. When correlated, the sixteen pairs of ratings yielded the following product-moment correlation values (df = 14):

    Film one = .82 (p < .01)
    Film two = .86 (p < .01)

The mean criterion ratings.--The standard deviation data summarized in Table 17 indicate that, overall, the mean criterion ratings were reliable ones. Table 17 reveals that the overall mean standard deviations obtained for film one (1.00) and film two (1.19) were relatively equivalent.

The range of the standard deviations obtained for the mean criterion ratings is noticeably greater for film two, though, as noted below, implying that the ratings for film two were relatively less reliable than those obtained for film one.

                Lowest-Highest           Deviation
    Film        Standard Deviations      Range
    Film one     .82 - 1.24               .42
    Film two     .83 - 1.77               .94

Table 17 also indicates that overall, relatively small standard deviation differences were present within the pairs of ratings (film one and film two) associated with a given criterion. To illustrate, fourteen of the twenty D(SD) values shown were less than .30, whereas the six highest were .74, .46, .44, .40, .38, and .32, respectively.

Table 17.--The mean criterion ratings obtained from the film evaluation panel.

    FQRI                                  Film One       Film Two
    Q#(a)  Criterion                      Mean   SD      Mean   SD      D(b)   Rank   D(SD)(c)
           A. Content validity (.50)
     1        SM authenticity             5.25   .86     4.81   .83      .44   13      .03
     2        SM value                    5.00   .82     4.38  1.09      .62    9.5    .27
     3        SM treatment                4.69  1.08     4.06   .85      .63    7      .23
     4        Contemporary need           4.69   .95     4.38  1.41      .31   15      .46
           B. Purpose-objective (.09)
     5        Clarity                     4.25  1.18     4.38  1.36     -.13   19      .18
     6        Achievability               4.13  1.02     3.81  1.42      .32   14      .40
           C. Voc-comp level (.25)
     7        Vocabulary                  4.43  1.03     3.94  1.77      .49   12      .74
     8        Comprehension               4.31  1.01     4.31  1.45      .00   17      .44
     9     D. Datedness                   5.19  1.11     4.56  1.09      .63    7      .02
           E. Expected reaction (.84)
    10        Interest-appeal             4.31  1.01     3.38  1.20      .93    3      .19
    11        Evokes + impressions        4.19  1.17     3.81  1.33      .74    4.5    .16
           F. Visual quality (.98)
    12        Use of medium               4.88  1.02     3.31  1.20     1.57    1      .18
    13        Content selection           4.81   .83     4.07  1.10      .74    4.5    .27
    14        Overall quality             4.88  1.02     4.25   .77      .63    7      .25
    15     G. Sound quality               5.00   .89     4.38  1.02      .62    9.5    .13
           H. Message slant (.10)
    16        Positive slant              4.88  1.02     4.63  1.09      .25   16      .07
    17        Negative learnings          4.94  1.24     5.00  1.15     -.06   18      .09
           I. Overall design (.91)
    18        Mood-tone                   4.81   .83     3.50  1.21     1.31    2      .38
    19        Learner attributes          4.31  1.01     3.81  1.33      .50   11      .32
    20     J. Overall rating              4.50   .89     3.69  1.08      .81   ...     .19
           Overall means                  4.70  1.00     4.12  1.19      .57           .19

    a FQRI Q# = Film Quality Rating Instrument question number.
    b D = Mean rating for film one minus mean rating of film two.
    c D(SD) = Standard deviation of ratings for film one minus the standard deviation of ratings for film two.

The Film Evaluation Panel's Subjective Comments

The panel's comments were analyzed in terms of two basic factors: (a) the strengths and nonstrengths of the experimental films and (b) the general degree to which the comments supported the mean ratings obtained from the panel.

General Findings

1. The subjective comments generally supported (a) the difference found between the panel's overall mean ratings and (b) the mean criterion ratings, the ratings obtained for each of the questions asked of the Film Quality Rating Instrument.

2. Numerous strengths and nonstrengths were identified about both films one and two. However, overall, the panel's subjective comments, like the panel's quality ratings, tended to indicate that the quality of film one was distinctly higher than that of film two.

Strengths vs. Nonstrengths

The basic strengths and nonstrengths identified by the panel were those listed below:

Mutual Strengths of Films One and Two:

1. The basic educational value of the subject matter which was treated.
2. The general visual quality of the films and their use of color photography.
3. The length of the films.

Additional Strengths, Film One:

1. Its effective use of photography, sound, and other strengths of the film medium.
2. Its use of explanation, demonstration, and example.
3. Its suitability of use for different grade levels, including the fifth grade.
4. Its presentation and review of important vocabulary terms.
5. Its basic attention-holding characteristics.
6. Its use of a question-oriented narrative.

Additional Strengths, Film Two:

1. Its relatively simple level of presentation.
2. Its behind-the-scenes treatment of a topic often taken for granted by children.
3. It treated only a few basic concepts.

Mutual Nonstrengths:

1. The subject matter content was somewhat outdated in each film.
2. The characterization was limited; no minorities or females were shown in either film.
3. Some subject matter topics were somewhat underdeveloped in each film.

Additional Nonstrengths, Film One:

1. The animated portions sometimes conveyed overly simplified ideas.
2. The film's concept-information load, comprehension level, and presentation pace were somewhat advanced for use with slower fifth grade level pupils.
3. Ambiguous, erroneous information was sometimes presented.

Additional Nonstrengths, Film Two:

1. The vocabulary-language level and presentation pace were rather low (oriented toward lower grade levels) for use with fifth grade pupils.
2. The film's interest-appeal, intellectual level, and attention-holding characteristics were relatively poor.
3. The script and narration were rather monotonous and dull.
4. Overall, the film did not use very effectively the strengths of the film medium.

Subjective Comments vs. Mean Criterion Ratings

Table E1 (Appendix E) lists and contrasts (a) the general types of comments made about the quality of film one which were strength-oriented and (b) comments about film two which were nonstrength-oriented. Comments are listed for each criterion question addressed by the Film Quality Rating Instrument, ordered by the mean difference (D) found for the ratings associated with each question. The table shows that overall, the basic differences found between the mean criterion ratings were consistently supported by the types of comments which were made.

The Subjects' Subjective Comments, Quality Measure

The comments were analyzed in terms of two factors: (a) the perceived strengths and nonstrengths of the films and (b) the frequency with which the type P, Q, and PQ comments were made.

General Findings

1. Overall, the comments tended to refer to both instructional pertinence and film quality criteria.

2. Strengths and nonstrengths of both films were commonly mentioned.

3. The comments generally supported the relative magnitude of the mean ratings which were made and the basic similarities and differences noted about the ratings.

4. The frequency distribution patterns of the types of comments which were made (all of which were film-elicited) were relatively similar, although the pattern associated with treatment group three, session two, tended to exhibit more extreme fluctuations (Figure 21).

5. Sessional effects were noted in the frequency distributions.

Strengths vs. Nonstrengths

The most commonly mentioned strengths and nonstrengths of the films were those below:

Mutual Strengths of Films One and Two:

1. Their general usability with fifth grade pupils.
2. Their basic educational value and subject matter appeal.
3. Their technical quality attributes.

Additional Strengths, Film One:

1. Its clarity of presentation.
2. Its use of explanations, demonstrations, and examples.
3. Its use of a question-oriented narrative.
4. Its attention-holding characteristics.
5. Its expected positive effects upon pupils (or its actual positive effects upon the subjects).
6. Its presentation and review of important vocabulary terms.

Additional Strengths, Film Two:

1. Its applicability of use in Southwest schools.
2. Its use of a general summary of key points at its ending.
3. Its general organizational structure and sequencing.
4. Its relatively simple vocabulary-language level, concept-information load, and presentation pace.

Nonstrengths of Film One:

1. Its limitations of use with slower or less-advanced pupils.
2. Its somewhat advanced concept-information load and presentation pace.
3. Its somewhat advanced comprehension-difficulty level.
4. Its technical subject matter emphasis and vocabulary.
5. Its lack of emphasis upon some important topics.

Nonstrengths of Film Two:

1. Its limitations of use with more advanced pupils.
2. Its attention-holding characteristics.
3. Its lack of emphasis upon some important topics.

Type P, Q, and PQ Comments

Representative examples of the type P, Q, and PQ comments which were obtained are illustrated in Tables G1, G2, and G3, respectively (Appendix G). The tables list the kinds of strength-oriented and nonstrength-oriented comments which were commonly made, categorized according to the dominant referent criterion (evaluation-selection criterion) exhibited by the remarks.

The frequency distribution of the type P, Q, and PQ comments obtained for films one and two, combined, is summarized in Table 18 and plotted in Figure 21. The tabled and plotted values clearly show that:

1. A substantial number of subjects made film-quality-oriented comments (Q + PQ) and instructional-pertinence-oriented comments (P + PQ).

2. The film-quality-oriented comments tended to be made more often than the instructional-pertinence-oriented comments (Q + PQ / P + PQ).

3. The six indices of response obtained for the session one comments from the treatment IV subjects were all higher (sessional effect) than the corresponding indices obtained for the session two comments.

4. The treatment group III response frequencies were less stable than those obtained for the other groups.

Table 18.--Frequency distribution of type P, Q, and PQ comments elicited from the experimental subjects for the quality measure, for films one and two, combined.

    Index:                  Session S1    Session S2
    Type of Comment         T4            T1      T2      T3      T4
    P                       16             6.5    10      25      13.5
    Q                       26.5          17.5    17      26      23
    PQ                      35.5          27.5    25.5    19      26
    T                       78            51.5    52.5    70      62.5
    P+PQ                    51.5          34      35.5    44      39.5
    Q+PQ                    62            45      42.5    45      49
    (P+PQ)/T                 .66           .66     .68     .63     .63
    (Q+PQ)/T                 .80           .88     .81     .65     .78
    (Q+PQ)/(P+PQ)           1.20          1.32    1.20    1.02    1.24

Figure 21.--Frequency distribution of the quality comments, by type of comment and treatment-session groups. (Vertical axis: frequency, 0 to 80; horizontal axis: index, type of comment: P, Q, PQ, P+PQ, Q+PQ, Total.)

The Subjects' vs. Panel's Responses, Quality Measure

General Findings

1. The comments of both the subjects and the panel generally supported the mean ratings which were made, and the basic similarities and differences noted about the ratings. Both the types of comments and their frequency of occurrence tended to support the mean rating distributions which were obtained.

2. Overall, the subjects' and panel's opinions were basically reversed about the relative strengths and nonstrengths of the films. The panel tended to identify more strengths about film one, and was also more critical of the nonstrengths of both films than the subjects.
To the contrary, the subjects tended to identify far more strengths than nonstrengths about film two, and to be least critical of the nonstrengths of film two.

3. Overall, the subjects tended to mention less frequently, and to omit, some categories of evaluation-selection criteria considered by the panel.

4. Type P, Q, and PQ comments were commonly made by both groups. And film-quality-oriented comments (type Q and PQ) were made more often than instructional-pertinence-oriented comments (type P and PQ) by both groups.

5. The subjects tended to make more type P comments than did the panel.

Subjective Comments: Similarities

The basic strengths and nonstrengths of the experimental films which were generally agreed upon by both the panel members and the subjects were those listed below.

Mutual Strengths of Films One and Two:

1. Both exhibited intrinsic, educational value.
2. Both were usable, potentially, with fifth grade pupils.
3. Both exhibited general subject matter appeal.

Additional Strengths of Film One:

1. Its use of explanation, demonstrations, and examples.
2. Its use of a question-oriented narrative.
3. Its general clarity of presentation.
4. Its general attention-holding characteristics.
5. Its presentation and review of important vocabulary terms.
6. Its general technical quality.

Additional Strengths of Film Two:

1. Its relatively simple level of presentation.
2. Its use of a summary of important points and terms at its ending.

Mutual Nonstrengths:

1. Both films lacked emphasis upon some important topics.
2. Both films exhibited limitations of use with some types of fifth grade pupils.

Additional Nonstrengths of Film One:

1. Its somewhat "advanced" comprehension-difficulty level and rate of presentation for "slower" or "less-advanced" pupils.

Additional Nonstrengths of Film Two:

1. Its relatively poorer attention-holding characteristics, overall.
2. Its limitations of use with "more-advanced" or brighter pupils.
The panel's and subjects' opinions were similar as well, in that both were basically split regarding the appropriateness of the comprehension-difficulty-pacing level of film one. Comments commonly indicated that the level was either "too advanced" or "not objectionable."

Other general similarities were also apparent. For example, a broad range of selection-evaluation criteria was reflected overall, by both the panel's and subjects' responses. The panel's and subjects' comments referred to many of the same film evaluation-selection criteria.

The subjects' comments were most similar to those made by the teacher and elementary educator members of the film evaluation panel.

Subjective Comments: Dissimilarities

The following generalizations summarize the basic differences noted between the panel's and subjects' comments.

Generally speaking, the subjects were not as sensitive to all of the evaluation-selection criteria mentioned by the panel, especially those pertinent to the nonstrengths of film two. In particular, the subjects were usually less sensitive and less responsive to nonstrengths dealing with these factors:

1. Technical quality attributes, especially the handling and selection of visual content, the sound quality, and use of the strengths of the medium.

2. Message slant characteristics.

3. Content validity attributes, especially the authenticity (up-to-dateness, accuracy), educational value, and treatment of the subject matter.

Few comments were made by the subjects about the factors above (other than subject matter treatment).

The subjects' comments were most dissimilar from those made by the subject matter specialist and media specialist members of the panel.
The media specialists were much more critical of these characteristics of film two: the clarity and accuracy of its animated illustrations; the subject matter inaccuracies which it presented; and the film's relatively low intellectual level, poor overall message design, and poor anticipated audience reaction characteristics.

The subjects' type P comments frequently referred to factors such as: the assumed purpose of the film's use (especially the hypothetical unit of study); characteristics of school settings and/or curricula with which they were familiar; the subject matter emphases of the films; and the readiness level or other characteristics of the intended, fifth grade audience.

The subjects' comments often revealed different subject matter emphases concerns, however, than did the panel's. The subjects' comments often dealt with the relationship of the emphases of the experimental films to the unit of study posed for the relevance measure in the film questionnaires. The panel's were usually concerned, instead, with subject matter authenticity (accuracy, up-to-dateness) and general educational value considerations.

Subjective Comments vs. Overall Mean Ratings

As expected, several general relationships were observed about the subjective comments and ratings obtained, whether elicited from the panel or the subjects, namely these:

1. If a given film was rated above average, usually the rater's comments were primarily strength-oriented. The converse was usually found when a given film was rated below average.

2. The more extreme a given rating was found to be, the more extreme also were the corresponding subjective comments.

3. When mediocre ratings were made, both strengths and nonstrengths were commonly mentioned.
The Subjects' Subjective Comments, Betterness Measure

The comments were analyzed in terms of these two factors: (a) the frequency with which type P, Q, and PQ comments were made and (b) the frequency with which specific film evaluation-selection criteria were referred to in the comments, overall.

General Findings

1. The subjects tended to make both instructional-pertinence-oriented comments (type P and PQ) and film-quality-oriented comments (type Q and PQ). The pertinence-oriented comments were made much more often, however. In contrast, type Q comments, per se, were made very infrequently.

2. The kinds of P, Q, and PQ comments made were similar to those obtained for the quality measure.

3. The "subject-matter emphasis" evaluation-selection criterion was referred to much more often than any other single criterion.

4. Overall, the film-elicited frequency distribution values obtained for the type Q comments were relatively similar. However, those obtained for the type P and PQ comments--instructional-pertinence-oriented comments--varied considerably (Figure 22).

5. Overall, the frequency distribution patterns of the comments elicited from each of the three experimental abstract styles were similar; but each was recognizably different than those obtained from the experimental films (Figure 22).

6. The overall frequency distribution pattern elicited from the type II abstracts was most similar to that elicited from the experimental films, whereas the type I and III abstract styles elicited patterns which were least similar to those elicited from the films (Figure 22).

7. Relatively strong sessional effects were associated with the frequency distribution patterns elicited from each of the three experimental abstract styles (Figure 22).

Type P, Q, and PQ Comments

Table 19 lists the frequencies of the types of comments which were obtained.
The table indicates that instructional-pertinence-oriented comments (P + PQ) were made from 2.23 to 3.62 times as often as film-quality-oriented comments (Q + PQ), per treatment-session group.

Frequency distribution plots.--The frequency values listed in Table 19 are plotted in Figure 22. The figure clearly shows that instructional-pertinence-oriented comments, rather than film-quality-oriented comments, were predominantly made.

The experimental abstracts.--Figure 22 also shows that:

1. Overall, the type I and III abstract styles tended to elicit inflated frequency values.
2. All three abstract styles tended to elicit frequency values for instructional-pertinence-oriented comments (P and PQ) and total values, which were substantially more inflated than the corresponding values obtained during session two. In contrast, the corresponding (session one vs. session two) film-elicited values obtained from treatment group IV did not tend to fluctuate appreciably.

Evaluation-Selection Referent Criteria

A tally was made of the frequency with which the subjects' betterness comments referred to specific film evaluation-selection criteria--the referent criteria identified from analysis of the film quality comments. Eleven criteria tended to be considered by the subjects who chose film one or two for the betterness decision; and four, by the subjects who selected the "no difference" option. The eleven criteria and the absolute percentage of responses which referred to them, by type of comment and selection decision, are summarized in Table 20.

Table 20 shows that the subject matter emphasis of the films was clearly the most-often-considered criterion for the three types of selection decisions which were made. The subjects also tended to refer to (a) usage limitations, restrictions, and suggestions; (b) the intended learner; and (c) the purposes for which the film was to be used.
As well, the subjects' comments were often phrased in terms of ratings of excellence, i.e., statements of the degree to which a given factor was deemed to be important or significant (e.g., "excellent, average, very important," etc.).

Critical Assumptions: The Experimental Phase of the Study

Related findings deal with (a) the starting conditions which were desired for the experiment--in particular, the relevance and quality perceptions of the subjects, (b) the experimental abstract styles, (c) the macro-type patterns which were exhibited by the subjects' ratings, and (d) the interdependence of the experimental measures.

Table 19.--Frequency distribution of type P, Q, and PQ comments elicited from the experimental subjects for the betterness measure, by treatment-session groups.

                          Session S1                  Session S2
Index: Type of Comment    T1    T2    T3    T4        T1    T2    T3    T4
P                         39    48    45    42        30    28    39    47
Q                          2     6     3     5         7     3     5    13
PQ                        28    13    25    18        16    16     8     6
Total                     69    67    73    65        53    47    52    66
P+PQ                      67    61    70    60        46    44    47    53
Q+PQ                      30    19    28    23        23    19    13    19
(P+PQ)/Total             .97   .91   .96   .92       .87   .94   .90   .80
(Q+PQ)/Total             .44   .28   .38   .35       .43   .40   .25   .29
(P+PQ)/(Q+PQ)           2.23  3.21  2.50  2.60      2.00  2.30  3.62  2.79

[Figure 22: frequency (vertical axis, 0 to 80) plotted against index: type of comment (Q, PQ, Q+PQ, P, P+PQ, Total).]

Figure 22.--Frequency distribution of the betterness comments, by type of comment and treatment-session groups.

Table 20.--The percentage of responses obtained for the evaluation-selection factors referred to most often in the subjects' betterness comments, by type of comment and selection decision.

                                                   Selection Decision
Type of
Comment   Evaluation-Selection Factor         Film One   No Diff.   Film Two
          Subject matter emphasis               35%        46%        27%
          Usage limitations                      6%        21%        10%
P         Intended learner                       9%        ...         8%
          Instructional purpose                  3%        11%         7%
          Curriculum integration                 4%                    5%
          Educational environment                                      4%
          Ratings of excellence                  8%         9%         6%
Q         Effects: actual, expected              7%                    4%
          Instructional design: programing       3%                    2%
          Learner interest-involvement           6%                    3%
PQ        Vocabulary-language level              3%                    3%
          Total                                 84%        87%        79%

Note: The values provided refer to the absolute percentage of the total number of responses tallied per type of selection decision (column).

General Findings

The experimental conditions.--

1. As desired, the mean film-elicited relevance ratings obtained for films one and two for the O3 objective did not vary considerably. However, overall, the individual subjects tended to view one of the films as being higher in relevance than the other, in all of the eight treatment-session groups.
2. Contrary to expectation, the mean quality ratings elicited from the subjects by film one did not tend to be significantly higher than those elicited by film two. Instead, the subjects tended to rate the quality of film two as higher than film one when they viewed the experimental films. Also, the film-elicited mean ratings obtained for films one and two did not vary significantly in any of the treatment-session groups.
3. As predicted, the subjects' rated teaching preferences were not found to be strongly correlated with their relevance, quality, and betterness ratings.

The experimental abstracts.--

1. The type II abstracts were expected to exhibit characteristics of a "good" film description style. The ANOVA indices of consistency tended to support this expectation but the rank indices did not.
2. The type I and III abstracts were expected to exhibit characteristics of film description styles which were not "good." The rank indices of consistency tended to support this expectation but the ANOVA indices did not.
3. The type I and III abstracts and the type II and III abstracts, respectively, were expected to elicit rating distributions which were significantly dissimilar.
This prediction was confirmed by the rank indices, but not confirmed by the ANOVA indices.
4. The type I and II abstracts were predicted to elicit rating distributions which were significantly dissimilar. The intersubject combined means rank indices confirmed the prediction. However, the ANOVA and intersubject rank (relevance) indices did not.

The interdependence of the measures.--Contrary to expectation:

1. Both the experimental films and the experimental abstracts exhibited some strong (p ≤ .01) intermeasure correlation coefficients.
2. The intermeasure correlations exhibited systematic main effects (positive values vs. negative values) which were associated with specific experimental variables.

Relevance Ratings: O3 Stimulus Condition

Table 21 lists the mean relevance ratings obtained for films one and two for the O3 objective, during session one, for the respective treatment-session groups. The table shows that a very strong overall main film effect occurred (p = .005). The overall effect was clearly associated with the three treatment groups which used the experimental abstracts as stimulus materials. The mean ratings elicited from the type III abstracts varied significantly (p = .01), although those elicited from the other treatment groups did not. However, the pairs of ratings elicited from the type I and II abstracts were marginally different in each case (p = .09 and .09, respectively).

Table 21.--Mean relevance ratings obtained for films one and two for the O3 objective, during session one, by treatment groups.

                     Session One: Stimulus Condition
Treatment Group      O3F1          O3F2
T1                   5.44*b        7.00*b
T2                   6.50*c        7.44*c
T3                   5.59**d       8.13**d
T4                   5.75          5.63
Column Means         5.84**a       7.05**a

Note: Plain asterisks (*) denote intrasubject differences.
a (F = 8.9; df 1,60; p = .005); (Table J5).
b (F = 3.2; df 1,15; p = .09); (ANOVA).
c (F = 3.4; df 1,15; p = .09); (ANOVA).
d (F = 8.4; df 1,15; p = .01); (ANOVA).
*p ≤ .10. **p ≤ .01.

The differences found between the mean relevance ratings obtained for the O3F1 and O3F2 stimulus conditions follow, by treatment-session group.

Treatment Group    Session One    Session Two
T1                 1.56           1.12
T2                  .94           1.00
T3                 2.44*          1.00
T4                  .12            .18

The figures show that the significant difference obtained for the ratings elicited from the type III abstracts (*) was substantially larger in value than all others.

The mean rating differences above are strongly subdued, however, when contrasted with the actual differences perceived by the individual raters associated with each treatment-session group. The figures which follow show the average absolute rating difference obtained per treatment-session group, when each subject's ratings (rather than group means) were used as the basis of comparison.

Treatment Group    Session One    Session Two
T1                 2.81           2.62
T2                 1.56           1.37
T3                 3.18*          2.00
T4                 2.75           2.31

The figures show that the average, actual rating difference per subject was greater than 23 percent for five of the eight groups--a substantial difference. Note also that the average actual rating difference exhibited by the type III abstract-elicited ratings (*) was approximately 32 percent.

Macro-Type Rating Patterns

The percentages of subjects who made type I, II, or III rating patterns (combined) are noted below for the respective treatment-session groups.

Treatment Group    Session One    Session Two
T1                 69%            75%
T2                 56%            62%
T3                 75%            56%
T4                 75%            75%

For the sake of contrast, the corresponding probability of chance occurrence was 43.4 percent for each group (n = 16). These figures reveal that the tendency of the subjects' ratings to exhibit type I, II, and III rather than type IV macro-type patterns was relatively strong, as predicted, whether the ratings were abstract-elicited or film-elicited.
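
The contrast drawn in this section between a difference of group means and the average absolute difference per subject can be illustrated with a short sketch. The ratings below are hypothetical and are not data from the study; the ten-step scale used to express a difference as a percentage is an assumption inferred from the percentages quoted in the text.

```python
from statistics import mean

SCALE_STEPS = 10  # assumed number of steps on the rating scale


def group_mean_difference(film_one, film_two):
    """Absolute difference between the two group mean ratings."""
    return abs(mean(film_two) - mean(film_one))


def mean_absolute_difference(film_one, film_two):
    """Average of each subject's own absolute rating difference."""
    return mean(abs(b - a) for a, b in zip(film_one, film_two))


def as_percent_of_scale(difference, steps=SCALE_STEPS):
    """Express a rating difference as a percentage of the scale length."""
    return 100.0 * difference / steps


# Hypothetical ratings from four subjects (one pair of ratings per subject).
film_one = [8, 3, 7, 4]
film_two = [4, 7, 5, 8]

print(group_mean_difference(film_one, film_two))     # 0.5
print(mean_absolute_difference(film_one, film_two))  # 3.5
```

In this made-up example the subjects disagree about which film is more relevant, so their individual differences largely cancel in the group means even though each subject sees a clear difference.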

The Intermeasure Correlations

The product-moment correlation coefficients obtained as indices of the dependence-independence of the quality, relevance, and betterness measures are provided in Tables 22, 23, and 24. In all cases, the relevance ratings used for computational purposes were those obtained for the O3 objective.

Table 22 lists the coefficients obtained for the relevance and quality ratings. Table 23 gives the correlations for the quality and betterness ratings. Table 24 shows the values derived using the relevance and betterness ratings.

Contrary to expectation, the tables show that these ratings were strongly correlated:

1. The film-elicited quality and relevance ratings, for both films one and two, treatment group IV (whereas all of the abstract-elicited intercorrelations were relatively low) (Table 22).
2. The film-two-elicited quality and betterness ratings, treatment group IV (Table 23).
3. The relevance and betterness ratings elicited from:
   a. the type I abstracts, for film two;
   b. the type III abstracts, for film one, session one; and
   c. film one, session one (Table 24).

The tables also reveal several other noteworthy unexpected findings:

1. The intermeasure correlations involving the betterness ratings were always negative (less than zero) for film one, and positive (greater than zero) for film two. This occurred whether the ratings were elicited from the experimental films or the experimental abstracts (Tables 23 and 24).
2. In almost all cases, the intercorrelations obtained of the relevance and quality ratings were positive coefficients (Table 22).

Table 22.--The correlation between the subjects' relevance and quality ratings.

Relevance vs. Quality Ratings

                               Treatment Group
Session   Stimulus Film     T1      T2      T3      T4
S1        F1                .27     ...     ...     .58*
S1        F2                .38     ...     ...     .63*
S2        F1                .28     .35     .33     .61*
S2        F2               -.04     .21     .05     .66*

Note: Pearson product-moment correlations were used. The relevance ratings were those obtained for the O3 objective.
*p < .01 (p = .01 at .57; n = 16).

Table 23.--The correlation between the subjects' quality and betterness ratings.

Quality vs. Betterness Ratings

          Stimulus Film:       Treatment Group
Session   Quality Rating    T1      T2      T3      T4
S1        F1               -.38     ...     ...    -.43
S1        F2                .49     ...     ...     .63*
S2        F1               -.15    -.05    -.40    -.46
S2        F2                .05     .21     .16     .60*

Note: Pearson product-moment correlations were used.
*p < .01 (p = .01 at .57; n = 16).

Table 24.--The correlation between the subjects' relevance and betterness ratings.

Relevance vs. Betterness Ratings

          Stimulus Film:       Treatment Group
Session   Relevance Rating  T1      T2      T3      T4
S1        F1               -.20    -.51    -.58*   -.59*
S1        F2                .52*    .17     .39     .33
S2        F1               -.32    -.29    -.20    -.28
S2        F2                .71*    .06     .21     .45

Note: Pearson product-moment correlations were used. The relevance ratings were those obtained for the O3 objective.
*p < .01 (p = .01 at .57; n = 16).

Footnotes--Chapter IV

¹The studies by Cuadra, Katter and others (1967a, 1967b) and other works have referred to the "fail safe" phenomenon. This occurs when raters exhibit relevance ratings which are lower in value than expected. Because of uncertainty factors associated with the relevance judgment process, raters will sometimes "play safe." They underestimate the value of a given relevance judgment and make relevance ratings which tend to be more lenient than normal in value.

However, the relevance ratings obtained in this study tended to show "fail safing" in opposite directions for "high" and "mediocre" relevance situations. Ratings made for "high" situations were often lower than expected, whereas those made for "mediocre" situations were often higher than expected. As well, the betterness and quality ratings exhibited related tendencies. For example, the session one ratings elicited from both the type I abstracts and films tended to be lower than the corresponding session two values elicited from the films, although the quality and betterness ratings in each case were "mediocre" in value.

²The rank coefficients obtained for Hypothesis Four indicated that all of the abstract-elicited ranks varied significantly (p = .01) from the film-elicited ranks. This finding suggested that accumulated error factors, rather than the experimental treatments, induced the "significant" treatment differences implied by the coefficients. Figure 17 (p. 215) suggests that such may have been the case.

To illustrate, the rank order fluctuations reflected by the means depicted in Figure 17 may have been induced from sampling error factors or other specific rating error factors described on p. 303. Within the treatment groups depicted, rank order fluctuations are indicated by disordinal (criss-crossed) interaction plots, whereas similar rank order relationships are indicated by ordinal (not criss-crossed) interaction plots. Hence, theoretically, at the p ≤ .01 level, means from other samples could have fallen within the "not significantly different" range (p ≤ .02) of the depicted means, resulting in ordinal interaction plots and similar rank order relationships.

However, Hypothesis Four was still rejected for two reasons. First, the type III abstract-elicited rank relationship varied significantly (p = .01) from those elicited from the other two abstract styles--despite the presence of rating error effects common to the three abstract styles. Second, the rank order relationship of the means elicited from the type III abstract style was clearly altered by the extremely inflated (p ≤ .01, ANOVA) mean rating obtained for the O3F2 stimulus condition.

³As for Hypothesis Four, the combined means rank coefficients obtained for Hypothesis Five indicated that all of the abstract-elicited ranks varied significantly (p = .01) from the film-elicited ranks. This finding also suggested that accumulated error factors, rather than the experimental treatments, induced the "significant" treatment differences implied by the coefficients, as noted in footnote 2.

However, Hypothesis Five was still confirmed, for these reasons. First, one can reasonably assume that the rating error effects which were abstract-elicited were essentially similar for each of the abstract styles. Therefore, the significant differences registered by the rank indices for the abstract vs. abstract comparisons reflected real treatment differences. Second, when the adjusted means (ten-step rating scale) obtained for the quality and betterness measures were plotted in Figure 17 (p. 215), the quality means for the type II and III abstract styles were clearly outside the range of the corresponding means elicited from the films and the type I abstract style. Hence, the rank order relationships of the combined means associated with the type II and III abstract styles were clearly altered by the magnitude of the quality means. (In contrast, the magnitude of the means associated with the type I abstract style always fell within the range of ratings attributable to general sampling and rating error fluctuations. Therefore, the jumbled rank order relationship of the means was basically a reflection of experimental error factors, as noted in footnote 2.)

CHAPTER V

SUMMARY, CONCLUSIONS, IMPLICATIONS, AND RECOMMENDATIONS

Topics of discussion treated by this chapter are organized as follows. First, a general summary is provided of the purpose, objectives, and methodology of the study, the model of film selection which was investigated, and the review of related literature and research. Second, an overview description of the experimental phase of the study is given and a summary of the experimental findings and results. Third, conclusions drawn from the experimental phase of the study are treated, followed by a general discussion of the experimental findings and conclusions. Fourth, implications of the study and recommendations for further research are discussed.

The Intent and Emphases of the Study

Purpose of the Study

Overall, this study was designed to investigate the nature and influence of relevance and quality information cues exhibited by films and film descriptions upon the film selection process, from a behavioral viewpoint of the selection process.

Objectives of the Study

The major objectives of this study were the following:

1. To define a behavioral model of film selection based upon judgments made of the perceived relevance and quality of instructional films, and the basic assumptions, theoretical foundations, and measurement methods upon which the model was based.
2. To obtain evidence of the validity and reliability of the model.
3. To determine if the model can be used to evaluate the effectiveness of different film description styles.

Method of Investigation

Three basic tasks were pursued by this study. First, the model of film selection was defined and conceptualized. Second, the related literature and research was reviewed to identify ways in which the model was supported by other studies and writings. Third, an experiment was designed and administered to test important assumptions of the model.

Rationale: The Problem and Need for the Study

Traditionally, a "good" film description has been defined primarily on a priori grounds, in terms of recommendations and suggestions offered in the literature about the desirable characteristics of an effective film description. This study explored a different alternative--definition of a "good" film description in terms of empirical evidence obtained of its actual effectiveness.

The study focused upon several voids noted within the related literature and research. First, to date, few documents have treated the film selection process and the design and evaluation of film descriptions from a behavioral measurement viewpoint.
Second, the media selection literature has not generally attended to what is known about the relevance judgment process that bears on the design and evaluation of film descriptions.

The study was founded upon an important underlying assumption: that suggestions and recommendations about the specific types of information which should be included in film descriptions should (a) arise from a clearly defined conception or theory of the film selection process and (b) be substantiated by empirical investigation.

The Model of Film Selection

The essential characteristics of the model and important assumptions upon which it was founded follow.

The model, a forced-choice decision model, was defined in terms of a relatively simple notion. Namely, three fundamental types of film selection judgments are made by selectors when they select films from film collections--judgments of:

1. Instructional pertinence, the degree to which a given film is perceived to be relevant to a defined instructional situation;
2. Film quality, the degree of "excellence" or quality exhibited by a given film, in terms of the intended purpose and use of the film; and
3. Betterness, the degree to which one of a given pair of films is perceived to be "better" to use than another for a particular instructional situation.

These judgments are made whether the reading of film descriptions or the actual previewing of films is used as the basis of selection.

According to the model, film selection decisions result from a cognitive-affective compatibility assessment process. During the process the characteristics of available films are compared with perceived instructional needs and requirements. The comparisons focus upon instructional pertinence and film quality considerations. Subsequently, relevance and quality judgments are made; and in turn, betterness judgments and specific accept/reject selection decisions for each film considered during the process.

The model was defined in terms of two levels of the film selection process: the latent level, dealing with unobservable, cognitive-affective dimensions of the selection process; and the manifest level, dealing with observable, measurable aspects of the selection process.

Assumptions: The Film Selection Process

In essence, the forced-choice film selection process is:

1. Basically an interactive function of the influence of two types of information cues--instructional pertinence indicators (IPI's) and film quality indicators (FQI's)--which are exhibited by both films and film descriptions.
2. A function which can be manifested in terms of relevance, film quality, and betterness judgments.

Relevance, quality, and betterness judgments are independently made. Generally speaking, the perceived quality of a film will not tend to influence the degree to which the film is perceived to be relevant to a given instructional situation. Likewise, the perceived relevance of a film to a particular instructional situation will not tend to influence the perceived quality of the film.

Film selectors will tend to choose films for instructional purposes according to five basic selection principles:

Principle A: In general, selectors will tend to choose as better, films perceived to be higher in relevance to an intended use and higher in quality, when provided with choices perceived to be different in quality and relevance to an intended use.

Principle B: When selectors are confronted with pairs of films perceived to be different in relevance to an intended use, the film deemed to be higher in relevance will tend to be selected as better to use regardless of the perceived quality of the two films.

Principle C: When pairs of films are perceived to be relatively equivalent in relevance to an intended use, and relatively equivalent in quality as well, selectors will tend to indicate that neither film is better to use for the intended situation.

Principle D: When pairs of films are judged to be relatively equivalent in relevance to an intended use, but different in quality, the film judged to be higher in quality will tend to be selected as being better to use.

Principle E: The strength of a betterness decision is a direct function of the degree to which a selector perceives that the films compared are different in relevance to the intended usage situation and different in quality.

Assumptions: Film Descriptions

A "good" film description style is one which tends to elicit selection judgments which are similar to those elicited from the corresponding films.

If a film description style does not tend to elicit selection judgments similar to those elicited from the corresponding films, it is not a "good" style.

If a film description style tends to elicit relevance, quality, and betterness judgments similar to those elicited from the corresponding films, it contains an adequate array of IPI's and FQI's.

The use of FQI's within film descriptions, including overall ratings of film quality, can affect the nature of selection judgments elicited from the descriptions. In particular:

1. The presence of FQI's can significantly influence quality and betterness judgments.
2. Relevance judgments are not significantly influenced by the presence or lack of FQI's.
3. The lack of FQI's can significantly influence quality and betterness judgments.

The effectiveness of film descriptions can be enhanced by improving the kinds of IPI's and FQI's which they contain.

Assumptions: The Model

The model is a valid, reliable model of the forced-choice film selection process.

The model can be used with the measurement methods employed during the experimental phase of the study to:

1. Evaluate the general effectiveness of different film description styles, and
2. Define and distinguish "good" film description styles from those which are not "good."
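
Selection Principles A through D, stated earlier in this section, can be read as a simple decision rule. The sketch below is only an illustration under stated assumptions: the function name, the numeric rating inputs, and the `tolerance` that defines "relatively equivalent" are all hypothetical and do not come from the study. Principle E (strength of the decision) is not modeled, because the text does not give its functional form.

```python
def betterness_choice(rel_one, rel_two, qual_one, qual_two, tolerance=1.0):
    """Return 'film one', 'film two', or 'no difference'.

    tolerance is a hypothetical threshold: rating gaps no larger than
    this are treated as "relatively equivalent".
    """
    rel_gap = rel_one - rel_two
    qual_gap = qual_one - qual_two

    # Principles A and B: a perceived relevance difference decides the
    # choice, regardless of the perceived quality of the two films.
    if abs(rel_gap) > tolerance:
        return "film one" if rel_gap > 0 else "film two"

    # Principle D: relevance is equivalent, so perceived quality decides.
    if abs(qual_gap) > tolerance:
        return "film one" if qual_gap > 0 else "film two"

    # Principle C: equivalent in both relevance and quality.
    return "no difference"


print(betterness_choice(8, 5, 4, 9))      # film one (relevance dominates)
print(betterness_choice(6, 6.5, 5, 8))    # film two (quality breaks the tie)
print(betterness_choice(6, 6.5, 5, 5.5))  # no difference
```

Checking relevance before quality is what encodes Principle B; reversing the two tests would describe a different model of selection.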

The actual validity and reliability of the model can be assessed by determining the degree to which critical assumptions of the model hold true, as demonstrated by evidence obtained from the experimental phase of this study and other studies.

The Review of Related Literature and Research

The literature was reviewed to identify writings which:

1. Revealed factors that can influence film quality and relevance judgments made during the film selection process, and
2. Suggested the types of relevance and quality information cues which should be considered for inclusion within film descriptions--instructional pertinence indicators and film quality indicators worthy of additional research and investigation.

Conclusions and assumptions drawn from the literature review were summarized: conclusions about the relevance and film quality judgment process, and assumptions about instructional pertinence indicators and film quality indicators. A critique of the related research and literature was also provided.

The review suggested that relevance and film quality judgments made about instructional films can be influenced by various factors. The most important factors identified were these:

1. The unique information content characteristics of films.
2. The information content characteristics of film descriptions.
3. People, especially:
   a. Their basic viewpoint about the film evaluation-selection process;
   b. The types of evaluation-selection criteria which they consider to be meaningful; and
   c. The perceived value, compatibility, and credibility of information considered by them during the judgment process.
4. Information requests and request statements.
5. Judgment conditions and circumstances, especially:
   a. The particular selection purposes which are considered (for example, previewing vs. final selection);
   b. Definitions of relevance and quality;
   c. Anchor referents used as the frame of reference for judgments which are made;
   d. The types of appraisal methods used; and
   e. The purpose for which and the environmental context within which a given film is to be used.

The literature review also revealed that the influence of quality and relevance information cues on the film selection process is highly dependent upon (a) the ability of evaluators and selectors to recognize important, meaningful quality and relevance cues exhibited by films; and (b) ultimately, therefore, upon the particular types and kinds of instructional pertinence indicators and film quality indicators provided in film descriptions.

Eleven different types of instructional pertinence indicators and eight types of film quality indicators were identified.

The types of instructional pertinence indicators suggested for inclusion within film descriptions were the following; namely, indications of:

1. Instructional purposes or objectives
2. Subject matter emphases
3. Other content emphases
4. Intended audiences and users
5. Target population slant
6. Rationale (theoretical or conceptual basis)
7. Vocabulary level
8. Comprehension level
9. Intellectual, skill, or affect level
10. Film medium characteristics
11. Physical characteristics.

The types of film quality indicators suggested for inclusion within film descriptions were these:

1. Ratings
2. Critical appraisals
3. Standard comparisons
4. Awards of merit
5. Effectiveness indices
   a. estimates of predicted or potential effectiveness
   b. indications of actual effectiveness
6. Inherent quality attributes
   a. technical quality indicators
   b. instructional design quality indicators
   c. other
7. Evaluations supplied from specific sources
8. Indirect quality indicators.

The Experimental Design and Methodology

Overview

An experiment was designed to determine the degree to which relevance and film quality information cues supplied in several different film description styles actually influenced the film selection process.
The experiment was structured to compare the similarity of selection judgments made about a pair of films which were elicited from (a) the films, with (b) corresponding judgments elicited from the descriptions of the films.

Three independent variables were investigated: treatments (the type of decision-making information provided; films or descriptions); stimulus conditions (the circumstances posed for the judgments); and films (the specific experimental films).

Rated judgments of the relevance, quality, and betterness of the stimulus films were obtained as dependent measures (variables) for several hypothetical instructional situations. The stimulus films were selected so that they would be perceived as being relatively equivalent in relevance to one of the situations, which was employed to elicit the betterness judgments. The ratings obtained for each situation served as basic measures of the degree to which the experimental subjects were "influenced" by the type of information presented in the experimental abstracts and films.

Sixty-four elementary school teachers served as subjects for the experiment. The subjects were randomly assigned to four treatment groups, referred to as groups I, II, III, and IV. The first three treatment groups used different sets (styles) of film descriptions as stimulus materials for the experiment. Treatment group IV used the experimental films as stimulus materials. Approximately two weeks later during a second experimental session, all subjects used the experimental films as stimulus materials.

Three sets of film descriptions were used, referred to as type I, II, and III abstracts or abstract styles, respectively (Appendices C1, C2, and C3). The three styles were similar in content and design in that they provided the same basic information about each respective stimulus film. Corresponding descriptions of the films each contained the same array of "instructional pertinence indicators."
The styles varied however, in terms of the type of film quality information supplied or not supplied within the des- criptions. The styles contained different information about the rated "quality" of the pair of experimental films, as noted below: Type I abstracts contained no overall ratings of film quality for the films (which were arbitrarily designated as film one and film two, respectively). Type II abstracts contained yeljd overall ratings of film quality as assessed by a lG-member panel of film evaluators. Film one was rated "good"; film two, "average." 281 Type III abstracts contained invalid overall ratings of film quality, ratings diametrically opposed and somewhat more dis- similar than the ratings added to the type II abstracts. Film one was rated "fair,“ film two as "very good." Treatment groups I, II, and III used abstract styles I, II, and III, respectively, as stimulus materials during the first session of the experiment. The experimental abstract styles were particularly chosen, so that a secondary purpose would be served by the experimental phase of the study; namely, evaluation of the effectiveness of film des- cription styles which contained or did eet_contain overall ratings of film quality. The Hypotheses The hypotheses were established to determine whether or not relevance, quality, and betterness judgments are actually influenced according to the assumptions and principles defined by the model of film selection. The general question underlying the hypotheses was the fol- lowing: Are film selection judgments significantly influenced by the use of overall ratings of film quality within film descriptions? Hypothesis One. The mean magnitude of guality ratings elicited from the experimental abstract style lacking the overall ratings of film quality will be significantly different than that elicited from the experimental films. flypothesis Two. 
The use of overall ratings of film quality within the experimental abstracts will not significantly alter the mean magnitude of relevance ratings elicited from the abstracts.

Hypothesis Three. The use of overall ratings of film quality within the experimental abstracts will significantly alter the mean magnitude of betterness ratings elicited from the abstracts.

Hypothesis Four. The use of overall ratings of film quality within the experimental abstracts will not significantly alter the rank order relationship of mean relevance ratings elicited from the abstracts.

Hypothesis Five. The use of overall ratings of film quality within the experimental abstracts will significantly alter the rank order relationship of mean relevance, quality, and betterness ratings, when combined, elicited from the abstracts.

Measurement Methods

Both objective and subjective measurement methods and indices of response were employed. To facilitate comparison of the subjects' responses, emphasis was placed upon the use of indices of reliability and consistency.

Indices of reliability were obtained as measures of the similarity and stability of judgments elicited from the same film. Indices of consistency were obtained as measures of the similarity of judgments elicited from a given film description style, and corresponding judgments elicited from:

1. The films described by the descriptions (referred to as abstract vs. film indices); or
2. Another description style (referred to as abstract vs. abstract indices).

Both intrasubject and intersubject indices were obtained.

Mean ratings and Spearman rank correlations were used as basic indices of reliability and consistency. Also, the frequencies with which specific types of rating patterns and subjective comments were made were used as supplemental, descriptive indices.

Subjective comments were solicited for the quality and betterness measures to identify why the experimental subjects made the types of ratings which were made.
The comments were analyzed to determine if film quality or instructional pertinence factors influenced the rating patterns obtained from the experimental treatment groups. This was achieved by comparing the general frequency with which film quality and pertinence factors were referred to in responses associated with specific stimulus conditions, as indicated by the type of film evaluation-selection criteria considered for a given judgment.

Analysis of variance (F tests) and tests of the significance of rank correlations from zero were used to identify significantly different rating distributions. A .01 significance level was used for all formal significance tests.

For the sake of comparison, the rank correlations and mean ratings were referred to as the rank and ANOVA indices.

Critical Measurement Assumptions

General assumptions.--Reliable film selection judgments in the form of mean relevance, quality, and betterness ratings can be elicited from both films and film descriptions.

Rated judgments of film quality, relevance, and betterness elicited from films can be used effectively as a baseline of comparison for corresponding judgments elicited from different film description styles.

Indices of response which show significant differences (p ≤ .01) indicate that the corresponding selection judgments associated with the indices have been influenced significantly in different ways by the experimental treatments which elicited the responses.

The rating patterns exhibited by the subjects will tend to reflect the principles of selection defined by the model of film selection.

The types of subjective comments made by the experimental subjects and the frequency with which they occur will tend to support the ratings made by the subjects.
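As an illustration of testing a rank correlation against zero, the following Python sketch computes a Spearman coefficient and an exact permutation p-value for two short lists of mean ratings. The ratings, the function names, and the choice of a permutation test are assumptions made for illustration; the study does not state which procedure or tables were used for its own significance tests.

```python
from itertools import permutations
from statistics import mean


def average_ranks(values):
    """Rank values 1..n (ascending); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman_rho(x, y):
    """Spearman coefficient: the Pearson correlation of the two rank lists."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y):
    """Exact two-sided p-value: share of all orderings of y's ranks whose
    absolute correlation with x's ranks is at least the observed one."""
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    hits = total = 0
    for perm in permutations(ry):  # n! orderings, so keep n small
        total += 1
        if abs(pearson(rx, perm)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical mean ratings for six stimulus conditions (not data from the study).
film_means = [4.6, 3.9, 3.1, 2.4, 1.8, 1.2]
abstract_means = [4.4, 4.0, 2.9, 2.6, 1.5, 1.3]

rho = spearman_rho(film_means, abstract_means)
p = permutation_p_value(film_means, abstract_means)
print(f"rho = {rho:.3f}, p = {p:.4f}, significant at .01: {p <= 0.01}")
```

Because the permutation test enumerates every ordering, it is only practical for a handful of conditions. With so few ranked items, even a single swap of adjacent ranks in this hypothetical six-item case is enough to push the p-value above .01.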
The indices of reliability and consistency.--The ANOVA and rank indices will tend to exhibit similar, mutually supportive experimental findings, for both the indices of reliability and the indices of consistency.

The intrasubject and intersubject indices obtained for a given type of index will tend to demonstrate similar, mutually supportive experimental findings.

The rank correlation indices.--The rank correlation indices of reliability will tend to be high (+.65 and above) and significant from zero at the p ≤ .01 level.

Film description styles which are "good" styles will tend to elicit rank correlation indices of consistency (abstract vs. film) which are significant from zero at the p ≤ .01 level, whereas styles which are not "good" will not.

The mean ratings: ANOVA indices.--Film description styles which are "good" styles will tend to elicit mean relevance, quality, and betterness ratings which do not vary significantly (p ≤ .01) from those obtained from the corresponding films, whereas styles which are not "good" will not.

Corresponding, film-elicited, mean relevance, quality, and betterness ratings will tend to be similar. The means will not vary significantly (p ≤ .01).

The experimental abstracts.--Each of the abstract styles will elicit unique, distinctive selection judgment tendencies.

The type I abstract style, the style lacking the overall ratings of film quality, is not a "good" style.

The type II abstract style, the style containing the valid overall ratings of film quality:
a. is a "good" style. It contains an adequate array of instructional pertinence indicators and film quality indicators.
b. will elicit the judgment tendencies which, of the three experimental abstract styles, are most similar overall to those obtained from the experimental films.

The type III abstract style, the style containing the invalid overall ratings of film quality:
a. is not a "good" style.
b.
will tend to show judgment tendencies overall which are least similar to those obtained from the experimental films.

The Experimental Findings and Results

General findings and results are provided here for the experimental hypotheses, the experimental abstract styles, the ANOVA analyses, the model of film selection, and the measurement methods which were employed.

General Findings: The Hypotheses

Hypothesis One.--The mean magnitude of quality ratings elicited from the experimental abstract style lacking the overall ratings of film quality will be significantly different than that elicited from the experimental films.

This hypothesis was rejected. The lack of overall ratings of film quality within the abstracts did not significantly alter (p ≤ .01) the mean quality ratings which were elicited.

Hypothesis Two.--The use of overall ratings of film quality within the experimental abstracts will not significantly alter the mean magnitude of relevance ratings elicited from the abstracts.

Hypothesis Two was rejected. The presence of the invalid overall ratings of film quality significantly altered (p ≤ .01) the mean relevance ratings which were elicited.

Hypothesis Three.--The use of overall ratings of film quality within the experimental abstracts will significantly alter the mean magnitude of betterness ratings elicited from the abstracts.

The hypothesis was rejected. Neither the presence nor the lack of overall ratings of film quality significantly altered (p ≤ .01) the mean betterness ratings elicited from the abstracts.

Hypothesis Four.--The use of overall ratings of film quality within the experimental abstracts will not significantly alter the rank order relationship of mean relevance ratings elicited from the abstracts.

The hypothesis was rejected. The rank order relationship of the mean relevance ratings was significantly altered (p = .01) by the presence of the invalid ratings of film quality.
Hypothesis Five.--The use of overall ratings of film quality within the experimental abstracts will significantly alter the rank order relationship of mean relevance, quality, and betterness ratings, when combined, elicited from the abstracts.

This hypothesis was confirmed. The rank order relationships of the combined mean ratings were significantly altered (p = .01) by the presence of both the valid and the invalid overall ratings.

General Findings: The Experimental Abstract Styles

The abstract styles elicited some of the selection judgment response characteristics which were predicted. However, other unexpected judgment tendencies and results occurred.

For example, overall, the ANOVA indices indicated that all of the abstract styles were "good" styles (p ≤ .01), whereas the rank indices indicated that all of the abstract styles were not "good" styles (p = .01).

Also, the type I, II, and III abstract styles showed tendencies to elicit mean relevance and betterness ratings which were "low," "higher," and "highest" in value, respectively. Said another way, the abstract style which lacked the overall ratings of film quality tended to elicit deflated ratings; and the abstract styles which contained the overall ratings, inflated ratings.

General Findings: The ANOVA Analyses

As expected, the repeated measures experimental design which was used did not usually indicate the presence of significant overall main treatment effects (p ≤ .01).

Significant simple treatment effects (p ≤ .01) were found, however. The effects were usually associated with treatment group III, the group which used the abstracts which contained the invalid overall ratings of film quality. The effects were always associated with the relevance measure. The quality and betterness measures revealed the presence of no significant simple treatment effects.

Strong stimulus condition effects were found. (The experiment was structured to induce them.
Hence they were expected to be found.)

Strong, overall main sessional effects were revealed by the relevance ratings (p ≤ .01; intrasubject, abstract vs. film). Several strong simple sessional effects were also found, associated with the abstract style which contained the invalid overall ratings of film quality. The film-elicited ratings did not usually show sessional effects, however (strong or marginal).

The relevance ratings also showed the presence of strong (p ≤ .01) overall main film effects and main session x stimulus condition interaction effects. The abstract styles which contained the overall ratings of film quality elicited simple film effects.

General Findings: The Model of Film Selection

The experimental evidence generally supported the basic assumptions of the model. Not all of the underlying assumptions of the model were found to be true, however.

General Findings: Measurement Methods

Generally speaking, the measurement methods and response indices employed were useful for comparing the abstract-elicited and film-elicited judgments which were obtained. However, not all of the underlying measurement assumptions were confirmed by the experimental results.

Confirmed assumptions.--As predicted:

1. The subjects' rating patterns tended to reflect, in the majority of cases, the principles of selection defined by the film selection model.
2. The film-elicited ratings were usually stable, reliable ones, serving as an effective baseline of comparison for the abstract-elicited ratings.
3. It was possible to identify instructional-pertinence-oriented and film-quality-oriented subjective comments which were made by the subjects. As well, the comments usually supported the ratings which were made.

Unconfirmed assumptions.--Contrary to expectation:

1. The ANOVA and rank indices of consistency did not always tend to indicate similar experimental results. Nor did the intrasubject and intersubject indices of reliability.
2.
"Good" film description styles were not clearly distinguished from styles which were not "good."
3. The relevance, betterness, and quality ratings were not always independently made. Strong intermeasure correlations were sometimes found. In particular, the film-elicited relevance and quality ratings were strongly correlated (p = .01).

Conclusions: The Experimental Phase of the Study

General conclusions follow for the experimental hypotheses, the experimental abstract styles, and the model of film selection.

The Hypotheses

The general question posed by the hypotheses.--Are film selection judgments significantly influenced by the presence or lack of overall ratings of film quality within film descriptions?

General conclusions: Overall, the experimental evidence revealed that film selection judgments can be substantially influenced by both the presence and the lack of overall ratings of film quality within film descriptions.

Strongly divergent judgment effects resulted from the presence and the lack of the overall ratings within the experimental abstracts. When compared to the corresponding film-elicited judgment tendencies which occurred, the abstract styles which contained the overall ratings showed tendencies to elicit relevance and betterness judgments which were substantially inflated in value (.03 ≤ p ≤ .15), whereas the abstract style which lacked the overall ratings tended to elicit relevance, quality, and betterness judgments which were substantially deflated in value (.02 ≤ p ≤ .15).

As well, the experimental evidence revealed that film selection judgments can be significantly influenced (p ≤ .01) by the presence of overall ratings of film quality--both valid and invalid ratings. By contrast, the experimental evidence did not confirm that the lack of overall ratings of film quality within film descriptions will significantly alter the selection judgment process.
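The distinction drawn above between differences that are significant at the .01 level and differences that are only substantial can be illustrated with a minimal Python sketch of an F ratio. This is a simplified one-way, between-groups calculation, not the repeated measures design the study used, and the ratings, group labels, and function name are hypothetical.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of rating lists."""
    all_ratings = [r for g in groups for r in g]
    grand_mean = mean(all_ratings)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual ratings around their own group mean.
    ss_within = sum((r - mean(g)) ** 2 for g in groups for r in g)
    df_between = len(groups) - 1
    df_within = len(all_ratings) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical relevance ratings from three small groups (not data from the study).
no_ratings_group = [2, 3, 3, 4]
valid_ratings_group = [3, 4, 4, 5]
invalid_ratings_group = [4, 5, 5, 6]

f_ratio, df_b, df_w = one_way_f(
    [no_ratings_group, valid_ratings_group, invalid_ratings_group]
)

# 8.02 is the tabled .01 critical value of F for (2, 9) degrees of freedom.
CRITICAL_F_01 = 8.02
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}; significant at .01: {f_ratio >= CRITICAL_F_01}")
```

In this made-up case the group means rise steadily from the first group to the third, yet the F ratio falls short of the .01 critical value. That is the same kind of pattern the conclusions describe as substantial but not significant.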
The degree to which the judgments obtained for the experimental phase of the study were "significantly influenced" varied with (a) the types of judgments compared, (b) the particular measurement methods and indices used to manifest and compare the judgments, and (c) the baseline of comparison which was used.

Hypothesis One.--The mean magnitude of quality ratings elicited from the experimental abstract style lacking the overall ratings of film quality will be significantly different than that elicited from the experimental films.

Conclusion: The quality judgments elicited from the abstract style which lacked the overall ratings of film quality did not vary significantly (p ≤ .01) from the film-elicited judgments.

Hypothesis Two.--The use of overall ratings of film quality within the experimental abstracts will not significantly alter the mean magnitude of relevance ratings elicited from the abstracts.

Conclusion: The lack of overall ratings did not significantly influence (p ≤ .01) the relevance judgments which were made. Nor did the presence of the valid overall ratings.

By contrast, the relevance judgments were sometimes significantly inflated (p ≤ .01) by the presence of the invalid overall ratings, when compared to those obtained from the experimental films. Overall, though, the relevance judgments were not generally altered significantly by the presence of the invalid ratings.

Hypothesis Three.--The use of overall ratings of film quality within the experimental abstracts will significantly alter the mean magnitude of betterness ratings elicited from the abstracts.

Conclusions: Neither the presence nor the lack of overall ratings of film quality within the experimental abstracts significantly altered the betterness judgments. Whether elicited from the abstracts which contained or did not contain the overall ratings, the abstract-elicited judgments did not vary significantly (p ≤ .01) from each other or from the film-elicited judgments.
Hypothesis Four.--The use of overall ratings of film quality within the experimental abstracts will not significantly alter the rank order relationship of mean relevance ratings elicited from the abstracts.

Conclusions: The general pattern of the relevance judgments was significantly altered (p = .01) by the presence of the invalid overall ratings of film quality. The pattern of judgments elicited from the abstract style which contained the invalid overall ratings varied significantly from the patterns obtained from the experimental films and the other two abstract styles.

But the pattern of relevance judgments was not clearly changed significantly by the presence of the valid overall ratings of film quality, or by the lack of the ratings. The abstract vs. film indices indicated that the lack of overall quality ratings and the presence of the valid ratings within the experimental abstracts significantly altered the relevance judgments which were obtained. However, the alterations were confounded with rating error effects which may have induced the experimental findings. Hence, one cannot unequivocally conclude that the alterations were induced by the presence or lack of the overall ratings.

Hypothesis Five.--The use of overall ratings of film quality within the experimental abstracts will significantly alter the rank order relationship of mean relevance, quality, and betterness ratings, when combined, elicited from the abstracts.

Conclusions: The general patterns of the relevance, quality, and betterness judgments, combined, were significantly altered (p = .01) by the presence of both the valid and the invalid overall ratings of film quality. The judgment patterns elicited from the abstract styles which contained the overall ratings varied significantly from each other, from those elicited from the experimental films, and from that of the abstract style which lacked the overall ratings.
But the experimental evidence did not clearly indicate whether or not the lack of overall ratings of film quality significantly influenced the general pattern of judgments obtained. As with Hypothesis Four, the rank indices indicated that the judgment pattern elicited from the abstract style which lacked the overall ratings varied significantly from that obtained from the experimental films. However, again, the differences noted were confounded with rating error effects. The judgment differences were not clearly attributable to the influence of the lack of the overall ratings.

The Experimental Abstract Styles

General conclusions.--As predicted:

1. Each of the abstract styles elicited a unique, distinctive pattern of selection judgment tendencies.
2. The abstract style containing the invalid overall ratings of film quality was found to be the least effective style, overall.

However, contrary to expectation:

1. None of the three abstract styles was found to be an effective film description style. Although the ANOVA indices revealed that each of the three styles was a "good" style, other characteristics were noted about each of the styles which indicated that they were not "good" styles.
2. None of the experimental abstract styles contained an adequate array of instructional pertinence indicators and film quality indicators.
3. The three abstract styles exhibited these characteristics which indicated that the styles were not effective ones:
a. Each style showed tendencies to distort the relevance judgments which were made from them. The relevance judgments were always inflated for film two and usually deflated for film one when compared to those elicited from the corresponding films during session one.
b. The rank indices of consistency always indicated that the abstract-elicited judgments varied significantly (p = .01) from the film-elicited judgments.
The type I abstract style.--As expected, this style, which lacked overall ratings of film quality, usually elicited quality judgments which were substantially different (.02 ≤ p ≤ .14) than those obtained from the experimental films.

But contrary to expectation, this style:

1. Did not always elicit relevance, quality, and betterness judgments characteristic of a style which was not "good."
2. Did not tend to elicit quality and betterness judgments which varied significantly (p ≤ .01) from those obtained from the experimental films.
3. Elicited judgment tendencies, overall, which were most similar to those obtained from the experimental films.
4. Tended to elicit:
a. quality, relevance, and betterness judgments which were lower in value than those obtained from the experimental films and the abstract styles which contained overall ratings of film quality.
b. betterness judgments which were closely similar to those obtained from the experimental films, but substantially lower in value (.11 ≤ p ≤ .14) than those obtained from the abstract styles which contained the overall ratings.
c. relevance judgments which varied marginally (.02 ≤ p ≤ .18) from those which were film-elicited, and those obtained from the abstract styles which contained the overall ratings.
d. relevance and betterness judgments which were sometimes strongly correlated (p = .01) for film two.

The type II abstract style.--As predicted, this style, which contained the valid overall ratings of film quality, elicited:

1. Relevance and betterness judgment tendencies which did not vary significantly (p ≤ .01) from those obtained from the experimental films.
2. Selection judgment tendencies overall which were generally dissimilar from those obtained from the abstract style which contained the invalid overall ratings.

But contrary to prediction, the style:

1. Did not always elicit judgments characteristic of a "good" style.
2. Did not elicit judgment tendencies overall which were most similar to those obtained from the experimental films.
3. Showed tendencies to elicit relevance and betterness judgments which were substantially inflated (.03 ≤ p ≤ .14) when compared to those obtained from the experimental films.
4. Elicited a general pattern of relevance, quality, and betterness judgments which varied significantly (p = .01) from that obtained from the experimental films and the abstract style which lacked the overall ratings of film quality.

The type III abstract style.--As anticipated, this style, which contained the invalid overall ratings of film quality, elicited:

1. Judgment tendencies overall which were most extreme; they were least similar to those obtained from the experimental films.
2. A general pattern of relevance, quality, and betterness judgments which varied significantly (p = .01) from that obtained from the experimental films and the other two abstract styles.

However, contrary to the results expected, this style:

1. Did not always demonstrate judgments characteristic of a "good" style.
2. Did not usually elicit betterness judgments which varied significantly (p ≤ .01) from those obtained from the experimental films, and the abstract style which contained the valid overall ratings.
3. Tended to elicit both relevance and betterness judgments which were substantially inflated in value (.04 ≤ p ≤ .15) compared to those obtained from the experimental films.
4. Sometimes significantly distorted relevance judgments. They were found to vary significantly (p ≤ .01) from those obtained from the experimental films. As well, the general pattern of the relevance judgments varied significantly from those obtained from the other two abstract styles.

The Model of Film Selection

General conclusions are offered next about the model of film selection. Thereafter, the critical assumptions underlying the model which were confirmed and unconfirmed by the experimental phase of the study are listed.
General conclusions.--

1. Overall, the model is a valid, reliable model of the forced-choice film selection process. Although not all of the critical assumptions underlying the model were found to be true, the general validity and reliability of the model was essentially confirmed by the experimental evidence.
2. The model can be used meaningfully with the measurement methods employed during the experimental phase of the study, to evaluate the effectiveness of different film description styles.
3. The general effectiveness of each of the experimental abstract styles was identified. As well, the tendencies of the abstract styles to elicit film selection judgments which were similar to or dissimilar from each other, and the experimental films, were revealed.
4. Overall, each abstract style was shown to elicit a distinctive pattern of film selection judgment tendencies, both predictable and unpredictable ones.
5. The reliability of the measurement methods used with the model to distinguish "good" film description styles from those which were not "good" was not confirmed. The measurement methods provided contradictory results. Likely, though, this occurred because of the use of small sample sizes and the use of a small number of stimulus conditions.

Confirmed vs. unconfirmed assumptions.--These assumptions of the model were generally confirmed by the experimental phase of the study:

The forced-choice film selection process is a function which can be meaningfully manifested in terms of relevance, quality, and betterness judgments.

Film selectors will tend to choose films for instructional purposes according to the principles of selection defined by the model.

Both the presence and the lack of film quality indicators within film descriptions can substantially influence the nature of relevance, quality, and betterness judgments which are elicited from the descriptions.
The general pattern of relevance, quality, and betterness judgments, combined, which are elicited from film descriptions can be significantly altered (p ≤ .01) by the presence of invalid overall ratings of film quality within the descriptions.

However, other important assumptions of the model were not confirmed by this study. Contrary to expectation:

The attributes relevance, quality, and betterness do not always operate as independent variables. Judgments made of one attribute can influence judgments made of the other attributes.

The presence of invalid overall ratings of film quality within film descriptions can significantly alter (p ≤ .01) the relevance judgment process.

Neither the presence nor the lack of overall ratings of film quality within film descriptions will necessarily significantly alter (p ≤ .01) the betterness judgment process.

The lack of overall ratings of film quality within film descriptions will not necessarily significantly alter the quality judgment process.

The presence of valid overall ratings of film quality within film descriptions can significantly alter the general pattern of relevance, quality, and betterness judgments elicited from the descriptions.

Film descriptions defined in terms of a priori criteria to be "good" or not "good" will not necessarily exhibit relevance, quality, and betterness judgment tendencies which indicate the same respective characteristics in terms of the behavioral response criteria investigated by this study.

Discussion

This section treats four topics: the interpretation of the experimental results underlying the conclusions which were made; new and revised assumptions of the study derived from the experimental findings; the strengths and limitations of the model of film selection as used to evaluate the experimental abstract styles; and the generalizability of the experimental results.
Interpretation of the Experimental Results

When first recognized, the contradictory results obtained from the ANOVA and rank indices were somewhat disconcerting. However, upon further analysis of the experimental data, the discrepancies noted were resolved.

Critical analysis of the experimental findings provided two important conclusions. First, all of the abstract styles demonstrated characteristics of both "good" styles and styles which were not "good," because the two types of indices reflected different judgment characteristics of the abstract styles. The indices were sensitive to different rating relationships.

Second, the use of small sample sizes and a small number of stimulus conditions resulted in:

a. a .01 significance level which was somewhat "stringent" for both the ANOVA and the rank order significance tests; and
b. the rank indices being overly sensitive to rating error effects and normal rating fluctuations which occurred. Consequently, the differential influences of the experimental treatments were clouded by the rating fluctuations and error effects.

Four types of rating error effects occurred which tended to distort the magnitude of the mean ratings and the rank coefficients which were obtained:

Sessional effects--The tendency for indices of response to be higher or lower during a given experimental session.

Abstract vs. film effects--The tendency for ratings to be higher or lower for a given film when abstract-elicited.

Fail safe effects--The tendency for raters to "play safe," to subdue or inflate ratings because of uncertainty factors.

"Halo" effects--The tendency for ratings obtained of two independent variables to be highly correlated (e.g., relevance vs. quality).

The marginal differences revealed by the ANOVA indices, and the rank coefficients which were not significant (p = .01) from zero, both reflected the cumulative effects of the four types of rating errors.
Nevertheless, the marginal differences and rank indices also clearly reflected the differential treatment influences which occurred. The general tendency for the type I, II, and III abstract styles to elicit "low," "higher," and "highest" mean rating values, respectively, implied that the differential treatment effects were real ones, even though they were not usually registered as significantly different ones (p ≤ .01) by the ANOVA indices.

As well, the strongly subdued rank coefficients obtained from the two abstract styles which contained the overall ratings of film quality were direct reflections of the influence of the quality ratings which were provided in the two abstract styles. One or both of the quality ratings supplied in the pairs of film descriptions were clearly outside the range of the corresponding quality ratings elicited from the films and the abstract style which lacked the overall ratings. Hence, the corresponding rank order relationships of the mean relevance, quality, and betterness ratings associated with the two abstract styles were substantially shifted as a result.

In contrast, the magnitude of the ratings elicited from the abstract style which lacked the overall ratings of film quality (type I) always fell within the range of ratings attributable to general sampling and rating error fluctuations. Therefore, the jumbled rank order relationships associated with the type I abstract style were basically reflections of experimental error factors.

One can reasonably assume that the rating error effects which were abstract-elicited were essentially similar for each of the abstract styles. Therefore, the significant differences registered by the rank indices for the abstract vs. abstract comparisons were considered to reflect real treatment influences.
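The mechanism described above, in which out-of-range quality ratings shift the rank order of the combined mean ratings, can be sketched in Python. All numbers below are hypothetical and chosen only to show the mechanism. The simple no-ties Spearman formula and the assumption that subjects' quality ratings follow the ratings supplied in the descriptions are illustrative choices, not details reported by the study.

```python
def ranks(values):
    """Return 1-based ascending ranks; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_no_ties(x, y):
    """Spearman coefficient via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Order of the six combined mean ratings in every list below:
# relevance film one, relevance film two, quality film one,
# quality film two, betterness film one, betterness film two.
film_elicited = [3.8, 3.6, 3.9, 3.3, 3.5, 3.1]

# Hypothetical description without quality ratings: every mean is a little
# lower (deflated), but the rank order of the six means is unchanged.
without_ratings = [3.6, 3.4, 3.7, 3.0, 3.3, 2.9]

# Hypothetical description with invalid quality ratings: the two quality
# means are pulled outside the film-elicited range in opposite directions.
with_invalid_ratings = [3.8, 3.6, 2.4, 4.5, 3.5, 3.1]

rho_without = spearman_no_ties(film_elicited, without_ratings)
rho_invalid = spearman_no_ties(film_elicited, with_invalid_ratings)
print(f"without ratings: rho = {rho_without:.2f}")
print(f"invalid ratings: rho = {rho_invalid:.2f}")
```

The sketch leaves out the rating error fluctuations that the text holds responsible for the jumbled type I rank orders. It isolates only the effect of moving two of the six means outside the range of the rest.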
The frame of reference (judgment conditions) used to elicit the relevance judgments obtained for the forced choice situation clearly influenced the magnitudes and standard deviations of the relevance ratings which were obtained. Likely, the frame of reference was not one which induced feelings of "certainty." The somewhat nebulous judgment context established may partially explain why the lack of and presence of overall ratings of film quality caused the abstract styles to elicit the divergent mean rating tendencies which were found.

Although the ANOVA and rank indices revealed contradictory findings, they were still similar in three important respects. They both indicated that the abstract style which contained the invalid ratings elicited selection judgments which were most dissimilar to those obtained from the films. As well, they both showed that the abstract style which lacked the overall ratings elicited judgments which were most similar to those obtained from the films. And, too, they both revealed that film descriptions which contain overall ratings of film quality can elicit selection judgment tendencies distinctly different from those which do not contain the ratings.

Hence, despite the presence of contradictory findings, systematically induced error effects, and the confounding of abstract-elicited and film-elicited effects by the experimental design employed, the differential effects of the three experimental abstract styles upon the film selection judgment process were clearly interpretable ones.

Assumptions of the Study: Revisions and Additions

The experimental findings suggested that some of the basic assumptions of the film selection model and the measurement methods upon which it was based should be restated or added as follows.
The film selection process.--The types of instructional pertinence indicators (IPI's) and film quality indicators (FQI's) perceived by film selectors can both significantly influence the kind of relevance, quality, and betterness judgments which are made by selectors.

Both the presence and the lack of film quality indicators within film descriptions can significantly influence the relevance, quality, and betterness judgment tendencies which are elicited from the descriptions.

The attributes relevance, quality, and betterness are not independent perceptions of film selectors. Judgments of one attribute can significantly influence judgments made of the other attributes.

Relevance, quality, and betterness judgments will tend to be made with less certainty from film descriptions than from the corresponding films.

Judgments of film quality obtained for a given film from different types of evaluators can vary significantly. Ratings of film quality obtained from a given group of evaluators or selectors may therefore vary significantly from the overall ratings of film quality supplied in film descriptions, which are obtained from a given panel of film evaluators.

Film descriptions.--For a given situation, film descriptions which lack film quality indicators will tend to elicit relevance, quality, and betterness judgments which are lower in value than those elicited from the corresponding films, and from descriptions which contain positive ratings of film quality (ratings of "average" or above).

For a given situation, film descriptions which contain positive ratings of film quality will tend to elicit relevance, quality, and betterness judgments which are higher in value than those elicited from the corresponding experimental films.
For a given situation, film descriptions which contain higher ratings of film quality will tend to elicit relevance, quality, and betterness judgments which are higher in value than descriptions which contain lower ratings of film quality.

Film descriptions will tend to elicit selection judgments according to this basic principle: Descriptions which do not contain valid IPI's and FQI's will tend to elicit relevance, quality, and betterness judgments which are significantly different than those elicited from descriptions which contain valid IPI's and FQI's.

If a film description style shows tendencies to elicit relevance, quality, and betterness judgments substantially dissimilar to those elicited from the corresponding films, it does not contain an adequate array of IPI's and FQI's.

Measurement methodology.--The types of measurement methods and indices used to compare selection judgments made by film selectors can significantly influence the magnitude of the indices obtained, and conclusions made about the similarity and reliability of the judgments.

For some situations, relevance, quality, and betterness judgments can tend to be strongly correlated.

Mean relevance, quality, and betterness ratings elicited from film descriptions will tend to exhibit minor magnitude fluctuations--distortions--when compared to those elicited from corresponding films. The fluctuations, which can sometimes be substantial, will inevitably occur because of uncertainties and other limitations associated with the making of selection judgments from film descriptions.

Significance testing procedures used to identify significantly different selection judgment tendencies must adequately account for judgment errors which will occur.

Measures of intrasubject and intersubject reliability will not necessarily tend to be equivalent; one type may tend to provide index values which are significantly higher or lower than the other for a given type of index (rank, ANOVA, frequency).
Measures of intrasubject and intersubject consistency obtained for a given film description style will not necessarily tend to be equivalent; a given type of index (rank correlation, ANOVA, frequency) can tend to elicit measures of intrasubject and intersubject consistency which vary significantly.

Intrasubject rank indices of reliability will tend to be lower in value than corresponding intersubject rank indices of reliability.

Intersubject ANOVA indices of reliability will tend to be more sensitive to real selection judgment differences elicited from different films than intrasubject (sessional) ANOVA indices of reliability. The intrasubject (sessional) indices are subject to greater measurement error than the intersubject indices.

Rank indices of consistency will tend to be subject to greater measurement error distortions than ANOVA indices of consistency at a given level of confidence, especially when small sample sizes are used. The rank indices will tend to reflect the intensity of rating error effects as well as treatment effects present in judgment responses which are obtained.

Use of the Model for Evaluating Film Descriptions

The film selection model, as used with the particular measurement methods and experimental design which were employed for this study, demonstrated both strengths and limitations as a vehicle for evaluating the effectiveness of the experimental abstract styles.

Major strengths.--Seven important strengths were noted.

As predicted, film description styles were found which exhibited characteristics of "good" styles and styles which were not "good."

As predicted, the experimental film descriptions, which were designed to induce different selection judgment tendencies, elicited judgment tendencies which were distinctly different.

Judgment tendencies were accurately identified which were systematically "shifted" in similar fashion by two or more of the experimental film descriptions.
This characteristic enabled identification of film description styles which induced similar vs. dissimilar judgment tendencies, as well as gross judgment distortions.

The intermeasure correlations obtained served as useful diagnostic indices. For example, the abstract style which lacked the overall ratings of film quality (type I) was the only style which showed a strong correlation between relevance and betterness judgments. This finding implied that the corresponding sets of judgments were distinctly different than those obtained from the films and the other film description styles.

The use of different types of response indices, baselines of comparison, and data plots enhanced interpretation of the experimental results. Each of these approaches helped to reveal unique as well as similar response tendencies and relationships which occurred.

The intersubject indices usually provided the least distorted view of real treatment effects, for these indices were not affected by sessional influences as were the intrasubject indices.

Both the rank and ANOVA indices of reliability were generally very stable.

Overall, the film-elicited indices generally served as an effective baseline of comparison for the abstract-elicited indices.

Major limitations.--Six noteworthy limitations were identified.

"Good" film descriptions were not clearly distinguished from those which were not "good."

The repeated measures design which was used induced sessional "carry over" effects. The responses in session two were systematically influenced by the experimental stimulus conditions established for session one. Hence, even though a two-week time lapse occurred between the experimental sessions, residual effects of the session one responses were noted in the session two responses. Consequently, the usefulness of the session two responses as a baseline of comparison was diminished.
The sessional effects and other systematic rating errors which occurred tended to cloud interpretation of the experimental results.

Resolution of the contradictions implied by different response indices required subjective interpretation of the experimental results. Hence, the objectivity of the interpretation process was lessened somewhat.

Abstract-elicited indices of reliability were not obtained. Hence, the stability of the abstract-elicited responses was not confirmed by this study. The indices would have been useful for defining the "normal" range of the rating error effects which were identified and, in turn, for distinguishing real differential treatment effects from the rating error effects which occurred. In particular, the indices could have been useful in distinguishing the intensity of abstract-elicited vs. film-elicited effects which were confounded by the experimental design. Some of the response tendencies which occurred may have been induced because film descriptions influence the judgment process in ways which films do not, and vice versa.

The use of small sample sizes and the use of only a few stimulus conditions produced two undesirable results. First, numerous marginally significant (.02 ≤ p ≤ .15) treatment differences were found. But they were confounded with the presence of the rating error effects. Hence, it was difficult to determine the degree to which the marginal differences were real ones. Second, the rank indices of consistency were grossly distorted by the rating error effects which occurred. The intrasubject rank indices of reliability were also distorted.

The Generalizability of the Experimental Results

The experiment administered for this study was an exploratory one designed to provide preliminary evidence rather than to yield broadly generalizable results.
Because of the limited number and arbitrary (rather than random) selection of the stimulus conditions which were investigated, the generalizability of the experimental findings is basically limited to conditions similar to those posed by the experiment.

Conclusions drawn from the experimental phase of the study are basically tentative ones, for the model of film selection investigated was not validated prior to this study. The conclusions would best be viewed in relation to findings obtained from replications of the study and other related experiments.

Implications of the Study

Numerous implications and critical questions were raised. They are summarized here in relation to seven topics: (a) potential undesirable effects of providing film quality ratings within film descriptions; (b) the reliability of film descriptions; (c) teacher training considerations; (d) potential applications of the selection model; (e) the design and evaluation of film descriptions; (f) use of the model for evaluating film descriptions; and (g) use of the model for relevance and media selection research purposes.

The Use of Film Quality Ratings

One of the assumptions which led to conception of this study was that ratings of film quality supplied within film descriptions will usually influence the selection process in beneficial ways. However, the experimental evidence provided some indications that the film selection process may be served just as adequately and perhaps more conveniently by not providing quality ratings within film descriptions.

If provided in film descriptions, quality ratings would certainly serve some useful purposes. Nevertheless, the selection process might also be hampered by the presence of the ratings. To illustrate, through time, many films will likely become outdated within a given film collection. And as they do, likely also, so will the film quality ratings.
Theoretically, the reliability of the ratings could continue to deteriorate until they eventually provided very distorted, invalid information, and induced significantly distorted selection judgments. To prevent this sequence of events from occurring, it would be necessary to periodically revise the ratings based upon updated evaluations of the quality of the corresponding films.

The Reliability of Film Descriptions

The study also provided evidence that a highly reliable film description likely cannot be devised. Because film descriptions are verbal translations of filmic characteristics and relationships, information distortions of one form or another are inevitably reflected by the descriptions.

Several related questions were suggested by the study. First, what is the optimal level of specificity which should be exhibited by an effective film description? Second, what criteria can be used most effectively to define the actual reliability of a given film description style? Third, how can the information distortion characteristics of film descriptions be minimized? Fourth, how reliable are different types of film descriptions which are available within the marketplace?

The experimental results implied that the level of detail provided in the experimental abstract styles may have been excessive. The results also suggested the following paradox: Supplying too much detail can cause undesirable, distorted perceptions from "cueing effects" which are triggered, while supplying too little detail can cause perceptual distortions from lack of certainty. This paradox indicates that basic limitations exist which will control the degree to which a minimal level of information distortion can be attained.

Teacher Training

The experimental subjects considered fewer evaluation-selection criteria, overall, than did the film evaluation panel, when they rated the quality of the experimental films. This finding raised three noteworthy questions.
Were the subjects less skilled or just less sensitive to the broader range of evaluation-selection criteria considered by the panel? Did the subjects lack adequate training and experience in film evaluation theory and practice? Would replication of the study also show selection judgment differences between teacher-elicited and panel-elicited judgments?

Applications of the Model

The model has several possible applications other than evaluating the effectiveness of different film description styles and the specific types of information cues which they contain. The model could also be used as an evaluation tool to evaluate:

1. The film evaluation-selection skills of teachers and other film users.

2. The effectiveness of film description catalogs and film collections.

3. The effectiveness of other types of media-materials descriptions and audiovisual products.

4. The general evaluation-selection skills of persons and groups who choose materials for educational purposes.

As well, the model can be used potentially as a research tool to learn more about:

1. Factors which influence the film selection process in general, and the relevance, quality, and forced-choice selection judgment processes in particular; and

2. The reliability of different measurement methods and response indices which can be used to operationalize behavioral models of the media selection process.

The Design of Film Descriptions

This study suggested that the reliability of selection judgments made from film descriptions will usually be somewhat questionable. Hence, a strong case can be made for continuing research and evaluation efforts which (a) identify the degree to which available film descriptions elicit reliable selection judgments, and (b) define the ways in which descriptions can be designed to elicit the most reliable judgments possible.
The instructional pertinence indicators and film quality indicators identified from the review of related research and literature can provide a useful starting point for further research and evaluation activities.

The literature review revealed that a variety of viewpoints and considerations can be expressed potentially within film descriptions about the characteristics and effects of instructional films. Two critical questions which were not well answered by this study, however, were the following. How can the judgments of film evaluation-selection panels be summarized effectively within film descriptions? Can the judgments of different types of evaluators and selectors be summarized in ways which will benefit the selection process?

Use of the Model for Evaluating Film Descriptions

Two precautions were reinforced by the experimental findings.

First, because different measurement methods and indices can provide different experimental results, caution will need to be taken when the model or other models are used, to prevent erroneous interpretations and conclusions from being made, especially if different studies emphasize different measurement approaches.

Second, the experimental findings clearly supported Vinsonhaler's (1966) contention that if behavioral responses are used as the criterion for determining the effectiveness of a given media description style, the style should demonstrate its effectiveness across a variety of types of responses and stimulus conditions.

Relevance Theory

Several other noteworthy questions were also raised by the study, which bear upon the relevance judgment process and effective use of the model of selection as a research and evaluation tool. For example, is quality perception really a specific type of relevance perception? Are film quality indicators really relevance information cues? Or do film quality indicators just serve dual roles as quality cues and instructional pertinence indicators?
If other types of film quality indicators (rather than overall ratings of film quality) are supplied in film descriptions, will the indicators significantly influence the relevance judgment process? And conversely, if different combinations of instructional pertinence indicators are provided in film descriptions, will the indicators significantly influence the quality judgment process?

Strong indications were found by this study that the variables "instructional pertinence" and "film quality" are sometimes interdependent aspects of the same parameter, "relevance." The degree to which these variables cannot be defined and measured as independent variables may therefore impose limitations upon the degree to which the model of selection can be effectively used. The reliability of results obtained from significance tests used with the model could be substantially affected for situations showing strong interdependence. If so, the significance testing procedures might have to be made more stringent.

Recommendations

Six basic recommendations are offered, dealing with needed research and other follow-up activities which would be beneficial to pursue. Additional suggestions are also given in the discussion of each recommendation. The recommendations are these:

1. Continue to refine and test the model of selection.

2. Continue to refine the measurement methods used to operationalize the model.

3. Continue to test use of the model for evaluating film description styles.

4. Define the state-of-the-art of film descriptions in behavioral terms.

5. Test other applications of the model.

6. Develop other behavioral models of the selection process.

Refinement and Testing of the Model

The experimental phase of this study should be replicated using an improved experimental design. Replication of the study would be particularly helpful in confirming or disconfirming the assumptions of the model which were not tested by this study.
A fully factorial experimental design would be beneficial to use. (Quality ratings should be solicited from all treatment groups during session one.) A representative sample of stimulus conditions should also be used--conditions which will tend to elicit "low," "mediocre," and "high" relevance ratings, as well as "low," "mediocre," and "high" quality ratings.

The variables "relevance" and "film quality" should be explicitly defined and clarified for subjects involved in follow-up studies. Alternative definitions should be explored and tested.

Attention should be placed upon identification of the degree to which instructional pertinence judgments, film quality judgments, and betterness judgments influence each other, and clarification of the ways in which the judgments can be elicited as unique, independent judgments.

Other ways to manifest instructional pertinence, quality, and betterness judgments in the form of observable, measurable responses should be explored and tested. The relevance research literature should be reviewed for related suggestions. The studies by Cuadra and Katter (1967a, 1967b) and Saracevic (1970a, 1970b) would be helpful in accomplishing this task.

Refinement of the Measurement Methods

Behavioral measurement methods, criteria, and significance testing procedures other than those which were tested by this study should be explored, which can be used effectively to define "good" vs. less desirable film description styles. The critical question to be answered is, "What criteria and procedures can be used to define 'significantly different' judgment tendencies which are elicited from various styles?"

The measurement methods employed in this study are somewhat primitive. The methods should be improved and compared with others which can also be used to operationalize the model. Emphasis should be placed upon identification of the most sensitive, reliable, and convenient methods which can be used.
The use of frequency distributions as indices of response should be investigated. A wide variety of frequency distribution indices can be obtained from use of the model. To illustrate, the frequency distributions of these factors could be used meaningfully for comparative purposes:

1. The types of betterness decisions which are made.

2. The types of rating patterns which are exhibited (macro-types and micro-types).

3. Response "switches": the degree to which subjects switch betterness decisions or rating categories from session to session.

4. Individual vs. group response characteristics.

5. The magnitude of the difference noted between paired ratings.

6. Intermeasure rating similarities exhibited by two or more measures obtained from a given subject (e.g., lowest vs. highest values).

7. The rate of occurrence of significant main effects and simple interaction effects.

8. The types of evaluation-selection referent criteria exhibited by subjective comments.

"Thresholds" of acceptance/rejection could be established as significance testing criteria for these and other frequency distribution indices, based upon a priori ratios or percentages of occurrence. As well, conventional nonparametric significance tests could also be used with the indices.

Comparisons should be made of the sensitivity and reliability of conventional parametric and nonparametric significance testing methods which can be used, as well as methods based upon other "thresholds" of acceptance/rejection.

The measurement literature and relevance research literature should be reviewed to identify other suggestions for improving the measurement methods used to operationalize the model. For instance, the types of rating scales and response modes (rated vs. ranked judgments, comparative vs. absolute judgments, etc.) which may be best to use should be identified and tested. The work of Cuadra and Katter (1967a, 1967b) would be helpful for this task.
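As one illustration of how a frequency distribution index and an a priori "threshold" of acceptance/rejection might be operationalized, the sketch below counts response "switches" between two sessions and compares the switch rate to a threshold. It is a hedged example only: the function names, the decision labels, the data, and the 25 percent threshold are all hypothetical and are not drawn from this study.

```python
from collections import Counter


def switch_rate(session_one, session_two):
    """Proportion of subjects whose betterness decision changed between sessions.

    Both lists must hold one decision per subject, in the same subject order.
    """
    if len(session_one) != len(session_two):
        raise ValueError("sessions must cover the same subjects in the same order")
    switches = sum(1 for a, b in zip(session_one, session_two) if a != b)
    return switches / len(session_one)


def exceeds_threshold(rate, threshold):
    """True if the observed rate is above the a priori acceptance threshold."""
    return rate > threshold


def decision_frequencies(decisions):
    """Frequency distribution of the betterness decisions made."""
    return Counter(decisions)


# Hypothetical forced-choice decisions ("A" or "B" judged the better film)
# for eight subjects in two sessions; not data from the study.
session_one = ["A", "A", "B", "A", "B", "B", "A", "A"]
session_two = ["A", "B", "B", "A", "A", "B", "A", "B"]

rate = switch_rate(session_one, session_two)
print(decision_frequencies(session_one))
print(rate, exceeds_threshold(rate, threshold=0.25))
```

The threshold is fixed before the data are examined, which is what distinguishes this kind of criterion from a conventional significance test; choosing its value would itself require the kind of comparative study recommended above.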
The "normal" range of fluctuation which can be expected for the types of rating errors which were found in this study should be determined.

Two or more types of indices of consistency should be used in a given study so that the effectiveness of different indices can be compared.

Both objective and subjective measures should be used in follow-up studies. The subjective comments obtained in this study provided valuable insights into why specific types of judgments were obtained. Subjective comments should be obtained for each of the three types of judgments which are elicited from use of the model. (This study did not solicit comments for the relevance judgments.)

Instruments to aid categorization of the types of comments made by subjects and the types of evaluation-selection referent criteria exhibited by the comments should be developed. The reliability of associated frequency distributions would be substantially improved via use of appropriate instruments. The reliability of the instruments and of the frequency distributions obtained from their use should be determined.

Testing the Use of the Model for Evaluating Film Descriptions

Although the model was found to be useful in evaluating the experimental film descriptions, the stimulus conditions used during the experimental phase of the study were hypothetical ones. The use of the model would be best tested under conditions which were not merely hypothetical ones.

The model should be tested by using it to evaluate the effectiveness of several different film description styles available within the marketplace. Styles which vary distinctly in terms of the types of instructional pertinence indicators and film quality indicators contained within them should be compared in initial follow-up studies.

Experimental designs which are more reliable than the one used for this study need to be identified and used with the model.
Designs should be used which eliminate or at least control the types of confounded effects and rating error effects identified by this study. Experimental designs which provide generalizable results should also be used, as well as designs which increase the power of significance tests.

The publication by Campbell and Stanley (1969) should be consulted for appropriate suggestions. The publication describes factors which influence the internal validity and external validity (generalizability of results) of a given experimental design. It also describes the strengths and limitations of various designs.

Although the repeated measures experimental design generates undesirable sessional effects, experimentation with it should still continue, for the design provides the opportunity to acquire intrasubject indices of reliability which can be compared to corresponding intersubject indices.

The minimal and optimal sample size and number of stimulus conditions which should be used in comparing different film description styles need to be identified.

Defining Descriptions in Behavioral Terms

The model of selection could potentially be used to begin defining the effectiveness of available film descriptions in behavioral terms. Several types of follow-up studies would be useful.

For example, it would be both interesting and worthwhile to compare a representative sample of commonly used film description styles that were very different in terms of the types of instructional pertinence indicators and film quality indicators which they contained. The relevance, quality, and betterness judgments elicited from the styles could then be compared in terms of basic similarities and differences. This approach would likely reveal some of the general characteristics of descriptions which influence the reliability of selection judgments that are elicited from descriptions.

Another study which could be pursued would be the following.
Preferred and recommended film description styles would be identified from appropriate surveys aimed at specific target groups. Once again, the selection judgments elicited from the styles could be compared for basic similarities and differences. Hopefully, both types of styles would tend to elicit reliable judgments. If not, follow-up experiments could be pursued to determine the effects of revising the styles in ways thought to be beneficial.

Comparisons could also be made of commonly used styles which did not contain film quality indicators, to identify the influence of different combinations of instructional pertinence indicators. By the same token, comparisons could be made as well of styles which contained different film quality indicators.

Overall, a series of follow-up studies would be beneficial which were designed to:

1. Identify the combinations of instructional pertinence indicators and film quality indicators which tend to elicit reliable vs. unreliable judgment tendencies; and

2. Determine, along a common baseline of comparison, the actual reliability of commonly used, preferred, and recommended description styles.

A factor which should also be useful to investigate is the effect of step-size between ratings supplied in pairs of descriptions. The experimental results of this study implied that as the step-size increases, the influence of the ratings also increases.

Another beneficial study would be one designed to determine the deterioration rate of film descriptions, that is, the rate at which film descriptions become outdated and unreliable. Similar studies could also be done to determine the general reliability and the deterioration rates of ratings, evaluative comments, and other quality indicators which are supplied in descriptions.

Testing Other Applications of the Model

Follow-up studies would be useful which tested different applications of the model.
Initial studies should be designed in ways which allow the generalizability of the underlying assumptions of the model to be verified. Studies which simultaneously compare the evaluation-selection skills of different target groups, and the effectiveness of different measurement methods, would also be worthwhile.

The general reliability of the model as a research and evaluation tool would be best confirmed by testing different applications of the model.

Development of Other Behavioral Models

Other behavioral models of the film selection and media-materials selection processes can surely be defined, which could be used to evaluate the effectiveness of film description styles.

The information science and decision-making research literature should be reviewed to identify constructs which can be used to build other models of the selection process.

Footnotes--Chapter V

1. The "halo" effect refers to the tendency for ratings of specific qualities or attributes to be made in directions similar to the general impression in one's mind (e.g., favorable, unfavorable) about the item or thing which is being judged. Guilford (1954) distinguished "halo" effects from "logical errors," which produce similar results--stronger correlation values than normally expected. "Logical errors" occur when judges "give similar ratings for traits [factors, attributes, variables] that seem logically related in the minds of the raters" (p. 279). For related discussions, see Guilford (1954, pp. 278-80), Thorndike (1920, pp. 25-29), and Jackson and Messick (1967, pp. 126-31, 150).

APPENDICES

APPENDIX A

THE RATING INSTRUMENTS USED BY THE FILM EVALUATION PANEL
[The rating instrument forms reproduced at this point are illegible in the source scan.]
neewpemexm o ememewee eoz e upeewww_o me xmem 8» Team 0 uemspem: ceee >gm> p mm: seem Acm> .29.: E eoz mcwcem zgm> emueeueo xcm> P upemwmm_o Le xmem ee» gee: p Nppecm>e .empeee; eee emuempmm ppm: pempeee peemm> m.epwm mce mH .m. A.epm .xseemmeeene eweeemmpmu ce eweeemegewe .eewuee izepm .mmeepumewa ”mmeemeemm emmEee ewecee meePeee ece eewees we mmeueee ”seemeswee "me seemv Nmmeuce>ee ea emme aseegmeuene mmeuewe eemues ea meewce mmeepcsemu Peemp> mc< .Np >PH4<=O 4<=mH> Nmeemweee emmcee mzp cw;e_z mmewpmmw m>wueemumcee me meewmmmceew m>wueemmmcee mxe>m ea zpmxwp ape; mge mm ._F Nmeemweee emote» mge ea me em x_m¥__ spew mgu we mewpemeee nee mcpummcmuew 3e: .op zoahump cewmemzmceseu mgp mH .m 334 ucmEeemcp Lewmmeem o pcmspemm» cewgmeem o xpmxwuu xgm> uez m>wmwmem xgm> c Neexee xtm>.mmm qu—eeo Lewmmeem m acmspemgh Leee xmm> p xpmeee xgm> emm: F xpmxwm xgm> m>_uemmz xgm> — »_ex_e xcm> meepeee ceee xgm> Nmeemweee ummgeu me» we :eem :ewuemppe eee mummgmecw .Pm>m— mme men e» .Ppecm>e .emeeeee eee emeemgp spew mnu mH A.eum ”mpeemw> me mcwpece: eee commempmm ”mpxum m.mmuwgz pewmem mg» magm>wpme we mpzum m.meuecgee mnu “meemmmm eceem eee e_mes eeeecmxeee we mm: "mmewmeeuv Nmmeemee m EFWN IIIIP mg» m>mmzee epm; ea ppm; emme mcep eee eeee mc< .m_ .mp P2m£hmo mwo Nmeemweee pmmcee mgu ewcuwz emeceeemmc me em apmxwp .mmeemwuue mFeecwmmeee x—Pepememm .mmewcmemp ueeuPEeecee m—eegwmmeee mm< A.eum .eewuepcmmmmemcmwe .mee_ue~wmmeeeme;e mcweememe .meweewmce .mepe meece sec» seemmcm "meme“ eeem me mce_uec eee mme_e> —eweem emeemeee "cme_meeuv Nmmeuee cw m>wewmee appecmcmm peepm mmemmms m.spmm mg» mm .mp .mF hz<4m moPeee .xuwpeee :_ Pecemmmmwece HemEeemcu _e:mw> mg» mm .v~ 335 N 0000 m:_eceem -ueo xpegh .Lewcmeem o m Q'Q'Q'Q' Amp-epv Am-uv Ae-ev Am-¥v Leee xcm> mcweeeumuee x_:gp .mewcmeemnun Leeuunm 000m hLm>II© LOOQIIN eeeuuum Leee xgm>uup mmecm>mp meemweee zuem Le» .EFPm m_;u Lem m:_pec ppecm>e eeem e we mueewumm ammo gee» mFegwecm .3epme mpeem Heweeum ms» em mcwmcmmmm ._N Ammcweec cmnme 
_Pe we mmecm>e mgu Ap_cemmmemc uezv Nmeemweee mmmceu mcu me» 5.?» mwge we mmmcm>weem$wm —ewu:muee ece ammFeee me“ we mewueg —Fecm>e eeem e me yes: .ON onmmmmezH heo 336 ”Fm>me eewmemzmceseo .N ”memEuemmp pempeeu .m upm>mm be3eeee> ._ ANemsepwe xppemc spew mwsm mp m~m>mp meecm ewmwemem peg: gemv ”mpm>mm meecw ”meewuewcummm mmema “Nmecmweee pmmeee mg» saw; emme me pmme spew menu agave mmwuw>muee pecemmeecumcw peg: gem me ewe: geezezv "meempmmmmem mmem: "mmmmmcxemz meme: ”msummmmum come: .0 A.>gemmmeme m? mewm gmzee mmnv .emzmw>mce emew m>ez 3e» Efiwm mge we mewem_cmeeece;e ms» ueeee zepme x_mmwce acmeeee mmempe .NN Asp_m emcemv IIIemzme> spec emewc mee toe emmem m_;e mme,«« 337 Pm>me cewmemcmgeeeo .N uemsuemm» “emueeu .m u_m>me zeepeeeee> .— ANemgepme a—Pemm ape; menu we mpm>m~ meemm emmwemem Hes: geuv ”mpm>m_ meecu .m ”meewpewcummm mmem: .e thmeemweee peace» msp new: emme me meme E_ww mwcp agave mmwuw>wuee Peeemueemmmcw peg: mew Le ewe: peg: cHV "mcewpmmmmem mmem: .u ”mmmmmcxemz gene: .e umfimcmcum Lowe: .e A.xcemmmeme $_ mewm cmgee mmav .em3m_>mce ewe“ m>ec sex E__m me“ me mewemwcmpeegece mcm eeeee zepme prmwme acmEEee mmempm .MN Ae__a ecoeemv ,,,eezee> sp_c ecoumm ace toe eeeem m_;e em=,,, APPENDIX A2 FORM FOR ASSESSING THE CONTENT AND FACE VALIDITY OF FILM ABSTRACTS Film Title: Evaluator: Date: Directions: l. Familiarize yourself with the rating criteria, film abstracts and rating options used in this instrument. After viewing a given film, rate each of the l7 criteria specified (l2 in first set, 5 in second). Rate responses by encircling the value representing your judgment of the degree to which information in the abstract satisfies each criterion. If a particular criterion does not apply, put a line through the rating scale for that criterion. The rating options are "l" or "2" for each criterion. A "l" indi- cates that the criterion has ndi_been satisfied; a "2" indicates that the criterion Hee_been satisfied. 
Keep in mind the following constraint: The amount of information in the synopsis cannot exceed one paragraph in length. Hence, conciseness is a premium; use of key adjectives, phrases and modifiers is a must. As well, all modifiers used should be neutral or noncritical in tone.

LASTLY AND MOST IMPORTANTLY:

   a. Please note suggestions for improvement for each criterion not satisfied (on the abstract sheet, front or back side) by specifying what additional information is needed to satisfy the criterion.

   b. Please encircle any information in the abstract which appears to be invalid or misleading.

If anything is ambiguous or not clear in meaning, please ask for clarification.

Criteria for Rating the Information Adequacy of Film Characteristics Described in the Abstract

A. To what degree are the following 12 characteristics of the film directly defined within or inferable from information in the film abstract?

B. Is information about the film characteristic accurate and representative of dominant features of the film?

Information Adequacy Rating: (1) Needs Improvement; (2) Adequate

FILM CHARACTERISTICS

1. Purpose--Objective?    1  2

2. Subject Area--Topic Focus?    1  2

3. Content Themes and Relationships?    1  2

4. Audience Level? (Primary, Early El, Upper El, JHS)    1  2

5. Target Population Slant?    1  2
   (Is the film slanted in any way and would best be used or not used with specific groups? e.g., for religious, minority, suburban, urban, rural, or ethnic groups; distinct age groups or levels; geographic groups, etc.)

6. Unique Potential Instructional Uses?    1  2
   (Basic instruction, enrichment, remediation, discussion, inquiry units, creativity assignments)

7. Major Settings?    1  2

8. Characterization?    1  2
   (Which social groups, institutions and individuals are dominantly portrayed in terms of school-age level, sex, nationality-ethnicity, culture, geographic locale, race, religious or political belief, working class, socioeconomic class, trade, profession, occupation, social status, social historical period, famousness-popularity, or other significant consideration?)

9. Type of Film?    1  2
   (e.g., fictional, documentary, inquiry-oriented, animated, cartoon, training, TV-cleared, etc.)

10. Basic Message Treatment and Form    1  2
   (Dominant treatment techniques or organizational forms used such as cinema verite, multi-image, montage, time lapse, or slow motion photography, animation, no narration, etc.)

11. Approach-Method-Style    1  2
   (e.g., case-study, instructive, investigative, problem analysis, problem solving, exploratory, inquiring, historical, chronological, contrastive, creative, etc.)

12. Viewpoint-Mood-Tone    1  2
   (Humorous, entertaining, dramatic, slanted, critical, indicative, factual, demonstrative, open-ended, practical, theoretical, scientific, moralistic, religious, logical, ironic, satirical, inspirational, etc.)

Criteria for Rating the Overall Information Adequacy of Film Abstracts

To what degree are the following abstract characteristics met?

1. Subject Matter Relevance and Content Focus?    1  2
   (Are these the dominant factors captured in the abstract? They should be.)

2. Uniqueness?    1  2
   (Is the unique nature, content, and subject matter focus of the film adequately indicated in the abstract?)

3. Readability-Comprehension?    1  2
   (Is the abstract vague or ambiguous in any way? Are any technical or professional terms used, not likely to be recognizable by the average elementary teacher?) (Please encircle unrecognizable terms and vague or ambiguous portions of the abstract and indicate which is which.)

4. Overall Validity?    1  2
   (Is the overall abstract a valid, truthful, accurate representation of the essential nature, content and subject matter focus of the film?)

5. Neutrality?    1  2
   (Is the abstract neutral in tone with no critical overtones present, positive or negative, or hints of qualitative worth?)

Comments or Suggestions:

APPENDIX B

THE RATING INSTRUMENTS AND SURVEY FORM USED BY THE EXPERIMENTAL SUBJECTS

APPENDIX B1

THE ABSTRACT QUESTIONNAIRE

Session Code:                    Social Security #
Session Date:       Time:        Subject #

Directions:

1. Please indicate your social security number in the upper right-hand corner of this page. Your SS# will be used to associate your responses from this session with others to follow, and for this purpose only. All responses to questions asked by this study will remain anonymous.

This information packet contains descriptions of a pair of elementary level, 16 mm films and a series of related questions. Pull out the film descriptions (next two pages) for your use while answering the questions.

After reading and comparing the film descriptions, answer the questions which follow in the order in which they are presented. You may refer to the film descriptions as needed.

Please respond to questions in the order in which they are presented. Your responses will be of two types:

   a. Rated responses should be encircled on scales provided. Encircle the one numeric value which represents your best judgment of the questions asked. Only one value should be encircled on any scale. Please encircle a value for all scales provided.

   b. Comments should be printed or legibly written. Respond in as much detail or depth as needed to make your point.

Please respond to all questions asked.

If you have any questions as you progress through the information packet, ask the proctor for assistance.

When finished:

   a. If you have not previously filled out an "Elementary Teacher Survey" form for this study, please complete the one attached.

   b. Return this information packet to the proctor before leaving. You may leave whenever you are finished.

   c. Please check with the proctor to verify when and where your final session will be.

   d. Thank you for your cooperation and participation!

Note: The pages in this questionnaire are representative only. The sequence of pages (prior to last page) and sequence of stimulus conditions ([ ]'s) was randomly presented in each questionnaire.

Question: If Film One and Film Two were available locally through a rental or loan agency, what would you assume the overall quality of each film to be for use with a fifth grade level class? Indicate the quality below by encircling the most appropriate value for each film.

Film One:   1   2   3   4   5   6   7     [F1]
(1 = Very Poor, 2 = Poor, 3 = Fair, 4 = Average, 5 = Good, 6 = Very Good, 7 = Superior)

Film Two:   1   2   3   4   5   6   7     [F2]
(1 = Very Poor, 2 = Poor, 3 = Fair, 4 = Average, 5 = Good, 6 = Very Good, 7 = Superior)

Note: This sheet was used by the treatment group I subjects only.

Read the following definition and respond accordingly to the questions posed.

Relevance--a measure of the degree to which a given consideration is logically related or pertinent to another consideration.

Question: Suppose you are planning to teach a typical fifth grade elementary class in the southwestern United States.

   a. Rate the degree to which you feel Film One is relevant in general to a hypothetical fifth grade level unit of study entitled: "Characteristics of Water."     [01F1]

      Film One:   0   1   2   3   4   5   6   7   8   9   10
      (0 = Not Relevant, 5 = 50% Relevant, 10 = 100% Relevant)

   b. Rate the degree to which you feel Film Two is relevant in general to a hypothetical fifth grade level unit of study entitled: "Man's Water Supply."     [02F2]

      Film Two:   0   1   2   3   4   5   6   7   8   9   10
      (0 = Not Relevant, 5 = 50% Relevant, 10 = 100% Relevant)

Read the following definition and respond accordingly to the question posed.

Relevance--a measure of the degree to which a given consideration is logically related or pertinent to another consideration.
Question: Suppose you are planning to teach a typical fifth grade elementary class in the southwestern United States. Rate the degree to which you feel Film Two and Film One are relevant in general to a hypothetical fifth grade level unit of study entitled: "What Can We Find Out About Water and Its Use?" (encircle one value for each film)

Film Two:   0   1   2   3   4   5   6   7   8   9   10     [O3F2]
(0 = Not Relevant, 5 = 50% Relevant, 10 = 100% Relevant)

Film One:   0   1   2   3   4   5   6   7   8   9   10     [03F1]
(0 = Not Relevant, 5 = 50% Relevant, 10 = 100% Relevant)

Question: Suppose you are planning to teach a typical fifth grade elementary class in the southwestern United States. Rate the degree to which you feel Film One or Film Two would be better for you to use for the following situation:

To supplement or enrich a hypothetical fifth grade level unit of study entitled: "What Can We Find Out About Water and Its Use?"     [03035)]

(encircle one response only)

1   2   3   4   5   6   7   8   9
(1 = Film One Much Better, 5 = No Difference, 9 = Film Two Much Better)

Question: Why did you rate your response to the above question as you did? Please describe the criteria or key factors which you considered in making your decision.

Comment:

(B-1)

Question: Have you ever seen Film One or Film Two? Check the appropriate response for each below.

Film One:   Yes   No   Not Sure
Film Two:   Yes   No   Not Sure

Note: Please complete the "Elementary Teacher Survey" form (next page) if you have not done so as yet for this study.

You have completed this session. Thank you for your cooperation and participation. Return this packet to the proctor and check with the proctor to verify when and where your final session will be. Your final session will require about 30-40 minutes of participation time.

(5-1)

APPENDIX B2

THE FILM QUESTIONNAIRE

Session Code:                    Social Security #
Session Date:       Time:        Subject #

Directions:

1. Please indicate your social security number in the upper right-hand corner of this page.
Your SS# will be used to associate your responses from this session with those from prior ones, and for this purpose only. All responses to questions asked by this study will remain anonymous.

This information packet contains a series of questions about two elementary school level films which you will preview. The first film you view will be referred to as Film One and the second, Film Two, in all cases in this information packet.

After viewing Film One, respond to the three questions about Film One in the packet. Questions about Film One are on the next two pages.

After completing the questions for Film One, stop to view Film Two. The projectionist will start the films whenever everyone appears to be ready. After viewing Film Two, respond to all other questions.

After reading through direction #9 below, skim through the information packet to give yourself an overview of the questions to be asked about the films. You will not be told anything else about the films other than what you can infer from information in the packet. Therefore, as you view each film, try to determine the overall relationship of the film to questions asked in the packet.

Please respond to questions in the order in which they are presented. Your responses will be of two types:

   a. Rated responses should be encircled on scales provided. Encircle the one numeric value which represents your best judgment of the question asked. Only one value should be encircled on any scale. Please encircle a value for all scales provided.

   b. Comments should be printed or legibly handwritten. Respond in as much detail or depth as needed to make your point.

7. Please respond to all questions asked.

8. If you have any questions, ask the projectionist for assistance.

9. When finished:

   a. If you have not previously filled out an "Elementary Teacher Survey" form for this study, please complete the one attached.

   b. Return this packet to the projectionist before leaving. You may leave whenever you finish.
Thank you for your participation!

Note: The pages in this questionnaire are representative only. The sequence of pages (prior to last page) and sequence of stimulus conditions ([ ]'s) was randomly presented in each questionnaire.

Read the following definition and respond accordingly to the question posed.

Relevance--a measure of the degree to which a given consideration is logically related or pertinent to another consideration.

Question: Suppose you are planning to teach a typical fifth grade elementary class in the southwestern United States.

   a. Rate the degree to which you feel Film One is relevant in general to a hypothetical fifth grade level unit of study entitled: "Man's Water Supply."     [01F1]

      Film One:   0   1   2   3   4   5   6   7   8   9   10
      (0 = Not Relevant, 5 = 50% Relevant, 10 = 100% Relevant)

   b. Rate the degree to which you feel Film One is relevant in general to a hypothetical fifth grade level unit of study entitled: "What Can We Find Out About Water and Its Use?"     [03F1]

      Film One:   0   1   2   3   4   5   6   7   8   9   10
      (0 = Not Relevant, 5 = 50% Relevant, 10 = 100% Relevant)

(R-32)

Question: Rate the overall quality of Film Two for use with a fifth grade level class, by encircling the most appropriate value below. Base your judgment upon whatever criteria or factors you deem important. (Please do not downgrade the film because of scratches or bad splices, etc. on it, since such flaws are beyond the control of the study.)     [F2]

Film Two:   1   2   3   4   5   6   7
(1 = Very Poor, 2 = Poor, 3 = Fair, 4 = Average, 5 = Good, 6 = Very Good, 7 = Superior)

Question: Why did you rate the overall quality of Film Two as you did? Please describe the criteria or key factors you considered in making your decision.

Comment:

(Q-22)

Question: Suppose you are planning to teach a typical fifth grade elementary class in the southwestern United States.
Rate the degree to which you feel Film One or Film Two would be better for you to use for the following situation:

To supplement or enrich a hypothetical fifth grade level unit of study entitled: "What Can We Find Out About Water and Its Use?"     [03(F1lel

(encircle one response only)

1   2   3   4   5   6   7   8   9
(1 = Film One Much Better, 5 = No Difference, 9 = Film Two Much Better)

Question: Why did you rate your response to the above question as you did? Please describe the criteria or key factors which you considered in making your decision.

Comment:

Read the following definition and respond accordingly to the question posed.

Relevance--a measure of the degree to which a given consideration is logically related or pertinent to another consideration.

Question: Suppose you are planning to teach a typical fifth grade elementary class in the southwestern United States.

   a. Rate the degree to which you feel Film Two is relevant in general to a hypothetical fifth grade level unit of study entitled: "What Can We Find Out About Water and Its Use?"     [O3F2]

      Film Two:   0   1   2   3   4   5   6   7   8   9   10
      (0 = Not Relevant, 5 = 50% Relevant, 10 = 100% Relevant)

   b. Rate the degree to which you feel Film Two is relevant in general to a hypothetical fifth grade level unit of study entitled: "Characteristics of Water."     [Oze]

      Film Two:   0   1   2   3   4   5   6   7   8   9   10
      (0 = Not Relevant, 5 = 50% Relevant, 10 = 100% Relevant)

(R-33)

Question: Rate the overall quality of Film One for use with a fifth grade level class, by encircling the most appropriate value below. Base your judgment upon whatever criteria or factors you deem important. (Please do not downgrade the film because of scratches or bad splices, etc. on it, since such flaws are beyond the control of the study.)     [F1]

Film One:   1   2   3   4   5   6   7
(1 = Very Poor, 2 = Poor, 3 = Fair, 4 = Average, 5 = Good, 6 = Very Good, 7 = Superior)

Question: Why did you rate the overall quality of Film One as you did? Please describe the criteria or key factors you considered in making your decision.
Comment:

(Q-21)

Question: Have you ever seen Film One or Film Two? Check the appropriate response for each below.

Film One:   Yes   No   Not Sure
Film Two:   Yes   No   Not Sure

Please complete the "Elementary Teacher Survey" form (next page) if you have not done so as yet for this study.

Note: You have completed this session. Thank you for your cooperation and participation. Return this packet to the proctor and check with the proctor to verify when and where your final session will be. Your final session will require about 30-40 minutes of participation time.

(5-1)

APPENDIX B3

ELEMENTARY TEACHER SURVEY FORM

(Please print)

Name (last) (first):                    Social Security #:

Male or Female? (encircle one)

Local Phone (summer session):

Classes enrolled in during summer session, July 2-August 10:

   Dept. & No. | Course Title | Time | Days | Bldg. & Rm. | Instructor

Total number of undergraduate and graduate credit hours accumulated in Audiovisual Instruction, Educational Media or Instructional Technology (approx.): ___ cr. hrs.

Total number of graduate credit hours accumulated in Elementary Education (approx.): ___ cr. hrs.

Total number of years of full-time elementary school level teaching experience:
   (K-3) ___ yrs.
   (4-6) ___ yrs.
   (7-8) ___ yrs.
   Total ___ yrs.

Indicate the last year during which you taught full-time at the elementary school level.

Rate the degree to which you prefer teaching social studies topics as opposed to science topics, at the elementary school level: (encircle one response only)

1   2   3   4   5   6   7   8   9
(1 = Strongly Prefer Science Topics, 5 = No Distinct Preference, 9 = Strongly Prefer Social Studies Topics)

10. Encircle the types of communities below in which you have taught full-time at the elementary school level.

   a. Kind: rural, urban, suburban

   b. Population (approx.): 5,000 or less; 5,000-15,000; 15,000-50,000; 50,000-100,000; 100,000-500,000; 500,000 or more

11. On the reverse side, describe any special training or experience you have had related to selection and evaluation of instructional materials (coursework, summer institutes, inservice programs, etc.).

Local Address (summer session): (street and number) (community) (zip code)

APPENDIX C

THE EXPERIMENTAL ABSTRACTS

APPENDIX C1

THE TYPE I EXPERIMENTAL ABSTRACTS

Film #1                    11 minutes, color

LEVEL: Intermediate, Junior High (Grades 5-9)

SUBJECT AREAS: Science, Physical Science

SUBJECT TOPICS TREATED: Water, Properties of Water, Water Vapor, Steam, Ice, Fog, Properties of Matter, States of Matter, Evaporation, Condensation, Molecules, Molecular Theory, Solutions, Dissolution

SYNOPSIS: An instructional film which investigates, demonstrates and explains through the use of animation, time-lapse photography and simple experiments, the nature, behavior and properties of water. Shows how water exists in three states of matter--solid, liquid and gas, and what happens to it when it changes states. Explores and explains how and why heat increases the dissolving rate of water; how wind and heat speed up its evaporation rate, and that water vapor is an invisible gas. Uses a variety of familiar examples as illustrations: sugar dissolving in water; wet clothes hung out to dry; a sweating root beer glass; fog; the making of ice cubes; a steaming kettle and a steam locomotive. Poses questions throughout the film and uses an intermediate-level boy at his home to perform experiments suggested by the narrator. Shows key vocabulary terms on the screen.

VOCABULARY--COMPREHENSION LEVEL: Grade 5 and above. Somewhat technical, using terms such as molecules, condense, condensation, evaporate, evaporation, dissolve, substance, temperature, beaker, flask, water vapor, particles. Uses a concise format; presents information at a fairly quick pace.
POTENTIAL USES: Basic, supplemental, enrichment or remedial instruction about the nature, behavior and properties of water or changes in states of matter. Supplemental or enrichment instruction for all subject topics listed above. Use near the middle or end of a unit of study after some introduction to the concepts and vocabulary treated. Usable as a prelude or follow-up to experimentation by pupils. Possible usage restriction: demonstrates two experiments which could be dangerous if an unsupervised youngster tried to do them on his own. Warns pupils not to experiment on their own, however.

TARGET POPULATION SLANT: The content treatment appears to be pitched to grades 5 and 6. Slanted by omission; shows no minorities or females.

Film #2                    11 minutes, color

LEVEL: Primary, Intermediate (Grades 3-6)

SUBJECT AREAS: Social Science, Economics, Natural Resources, Conservation

SUBJECT TOPICS TREATED: Water, Water Resources, Community Water Supplies, Water Economics, Filtration Plants, Pumping Stations, Dams, Occupations, Water Supply Workers

SYNOPSIS: A documentary film which explores the question, "Why does water cost money?" by showing how a city gets its water supply. Shows through live and behind-the-scenes photography in the western USA, how water is obtained from its natural sources, cleansed and treated in a filtration plant and transported to the city for various uses. Reveals that rain and snow, the basic sources of water, eventually become stored in rivers, lakes, dams, streams, and underground reservoirs, both near and far away from actual users. Explains how pumping stations, pumps, aqueducts, water mains and pipes are used to carry water from places of storage to places of use. Illustrates and describes how special workers and machines are needed to provide adequate community water supplies. Depicts the role of workmen such as well diggers, dam builders, equipment operators, filtration plant employees, laboratory technicians, trench diggers, pipe layers and a water meter reader. Closes with a summary of main ideas treated.

VOCABULARY--COMPREHENSION LEVEL: The content treatment seems mixed; the narrative and vocabulary directed to primary pupils; the visual treatment to grades 4 and above. The vocabulary is basically nontechnical in nature. Uses some technical terms such as pumps, wells, pumping station, filtration plants, chemicals. Uses a slow presentation and delivery pace in general.

POTENTIAL USES: To supplement or enrich a unit of study on water, natural resources, or cities to explore the behind-the-scenes work necessary to provide an adequate community water supply. Usable at any stage of a unit; for instance to introduce a unit, for discussion purposes, or as a review.

TARGET POPULATION SLANT: Slanted by omission; shows no minorities or females.

APPENDIX C2

THE TYPE II EXPERIMENTAL ABSTRACTS

Film #1                    11 minutes, color

LEVEL: Intermediate, Junior High (Grades 5-9)

SUBJECT AREAS: Science, Physical Science

SUBJECT TOPICS TREATED: Water, Properties of Water, Water Vapor, Steam, Ice, Fog, Properties of Matter, States of Matter, Evaporation, Condensation, Molecules, Molecular Theory, Solutions, Dissolution

SYNOPSIS: An instructional film which investigates, demonstrates and explains through the use of animation, time-lapse photography and simple experiments, the nature, behavior and properties of water. Shows how water exists in three states of matter--solid, liquid and gas, and what happens to it when it changes states. Explores and explains how and why heat increases the dissolving rate of water; how wind and heat speed up its evaporation rate, and that water vapor is an invisible gas. Uses a variety of familiar examples as illustrations: sugar dissolving in water; wet clothes hung out to dry; a sweating root beer glass; fog; the making of ice cubes; a steaming kettle and a steam locomotive.
Poses questions throughout the film and uses an intermediate-level boy at his home to perform experiments suggested by the narrator. Shows key vocabulary terms on the screen.

VOCABULARY--COMPREHENSION LEVEL: Grade 5 and above. Somewhat technical, using terms such as molecules, condense, condensation, evaporate, evaporation, dissolve, substance, temperature, beaker, flask, water vapor, particles. Uses a concise format; presents information at a fairly quick pace.

POTENTIAL USES: Basic, supplemental, enrichment or remedial instruction about the nature, behavior and properties of water or changes in states of matter. Supplemental or enrichment instruction for all subject topics listed above. Use near the middle or end of a unit of study after some introduction to the concepts and vocabulary treated. Usable as a prelude or follow-up to experimentation by pupils. Possible usage restriction: demonstrates two experiments which could be dangerous if an unsupervised youngster tried to do them on his own. Warns pupils not to experiment on their own, however.

TARGET POPULATION SLANT: The content treatment appears to be pitched to grades 5 and 6. Slanted by omission: shows no minorities or females.

OVERALL RATING FOR AUDIENCE LEVELS:
   * Grades 4-6: Good
   * Grades 7-9: Average-Good

*(Rating Scale: Very Poor, Poor, Fair, Average, Good, Very Good, Superior) Ratings determined by a panel of elementary teachers, elementary education professors, subject matter specialists and audiovisual media specialists.

Film #2                    11 minutes, color

LEVEL: Primary, Intermediate (Grades 3-6)

SUBJECT AREAS: Social Science, Economics, Natural Resources, Conservation

SUBJECT TOPICS TREATED: Water, Water Resources, Community Water Supplies, Water Economics, Filtration Plants, Pumping Stations, Dams, Occupations, Water Supply Workers

SYNOPSIS: A documentary film which explores the question, "Why does water cost money?" by showing how a city gets its water supply. Shows through live and behind-the-scenes photography in the western USA how water is obtained from its natural sources, cleansed and treated in a filtration plant and transported to the city for various uses. Reveals that rain and snow, the basic sources of water, eventually become stored in rivers, lakes, dams, streams, and underground reservoirs, both near and far away from actual users. Explains how pumping stations, pumps, aqueducts, water mains and pipes are used to carry water from places of storage to places of use. Illustrates and describes how special workers and machines are needed to provide adequate community water supplies. Depicts the role of workmen such as well diggers, dam builders, equipment operators, filtration plant employees, laboratory technicians, trench diggers, pipe layers and a water meter reader. Closes with a summary of main ideas treated.

VOCABULARY--COMPREHENSION LEVEL: The content treatment seems mixed; the narrative and vocabulary directed to primary pupils; the visual treatment to grades 4 and above. The vocabulary is basically nontechnical in nature. Uses some technical terms such as pumps, wells, pumping station, filtration plants, chemicals. Uses a slow presentation and delivery pace in general.

POTENTIAL USES: To supplement or enrich a unit of study on water, natural resources, or cities to explore the behind-the-scenes work necessary to provide an adequate community water supply. Usable at any stage of a unit; for instance to introduce a unit, for discussion purposes, or as a review.

TARGET POPULATION SLANT: Slanted by omission; shows no minorities or females.

OVERALL RATING FOR AUDIENCE LEVELS:
   * Grades K-3: Poor-Fair
   * Grades 4-6: Average

*(Rating Scale: Very Poor, Poor, Fair, Average, Good, Very Good, Superior) Ratings determined by a panel of elementary teachers, elementary education professors, subject matter specialists and audiovisual media specialists.
APPENDIX C3 THE TYPE III EXPERIMENTAL ABSTRACTS Film #1 11 minutes, color leyep: Intermediate, Junior High (Grades 5-9) SUBJECT AREAS: Science, Physical Science SUBJECT TOPICS Water, Properties of Water, Water Vapor, Steam, Ice, TREATED: Fog, Properties of Matter, States of Matter, Evapora- tion, Condensation, Molecules, Molecular Theory, Solutions, Dissolution SYNOPSIS: An instructional film which investigates, demonstrates and explains through the use of animation, time-lapse photography and Simple experiments, the nature, behavior and properties of water. Shows how water exists in three states of matter--solid, liquid and gas, and what happens to it when it changes states. Explores and explains how and why heat increases the dissolving rate of water; how wind and heat speed up its evaporation rate, and that water vapor is an invisible gas. Uses a variety of familiar examples as illustra- tions: sugar dissolving in water; wet clothes hung out to dry; a sweating root beer glass; fog; the making of ice cubes; a steaming kettle and a steam locomotive. Poses questions throughout the film and uses an intermediate—level boy at his home to perform experiments suggested by the narrator. Shows key vocabulary terms on the screen. VOCABULARY--COMPREHENSION LEVEL: Grade 5 and above. Somewhat tech- nical, using terms such as molecules, condense, condensation, evapo- rate, evaporation, dissolve, substance, temperature, beaker, flask, water vapor, particles. Uses a concise format; presents information at a fairly quick pace. ~ POTENTIAL USES: Basic, supplemental, enrichment or remedial instruc- tion about the nature, behavior and properties of water or changes in states of matter. Supplemental or enrichment instruction for all sub- ject t0pics listed above. Use near the middle or end of a unit of study after some introduction to the concepts and vocabulary treated. Usable as a prelude or follow-up to experimentation by pupils. 
Possible usage restriction: demonstrates two experiments which could be dangerous if an unsupervised youngster tried to do them on his own. Warns pupils not to experiment on their own, however.

TARGET POPULATION SLANT: The content treatment appears to be pitched to grades 5 and 6. Slanted by omission; shows no minorities or females.

OVERALL RATING FOR AUDIENCE LEVELS:
*Grades 4-6: Fair
*Grades 7-9: Fair-Average
*(Rating Scale: Very Poor, Poor, Fair, Average, Good, Very Good, Superior)
Ratings determined by a panel of elementary teachers, elementary education professors, subject matter specialists and audiovisual media specialists.

Film #2
11 minutes, color

LEVEL: Primary, Intermediate (Grades 3-6)

SUBJECT AREAS: Social Science, Economics, Natural Resources, Conservation

SUBJECT TOPICS TREATED: Water, Water Resources, Community Water Supplies, Water Economics, Filtration Plants, Pumping Stations, Dams, Occupations, Water Supply Workers

SYNOPSIS: A documentary film which explores the question, "Why does water cost money?" by showing how a city gets its water supply. Shows through live and behind-the-scenes photography in the western USA, how water is obtained from its natural sources, cleansed and treated in a filtration plant and transported to the city for various uses. Reveals that rain and snow, the basic sources of water, eventually become stored in rivers, lakes, dams, streams, and underground reservoirs, both near and far away from actual users. Explains how pumping stations, pumps, aqueducts, water mains and pipes are used to carry water from places of storage to places of use. Illustrates and describes how special workers and machines are needed to provide adequate community water supplies. Depicts the role of workmen such as well diggers, dam builders, equipment operators, filtration plant employees, laboratory technicians, trench diggers, pipe layers and a water meter reader. Closes with a summary of main ideas treated.
VOCABULARY--COMPREHENSION LEVEL: The content treatment seems mixed; the narrative and vocabulary directed to primary pupils; the visual treatment to grades 4 and above. The vocabulary is basically nontechnical in nature. Uses some technical terms such as pumps, wells, pumping station, filtration plants, chemicals. Uses a slow presentation and delivery pace in general.

POTENTIAL USES: To supplement or enrich a unit of study on water, natural resources, or cities to explore the behind-the-scenes work necessary to provide an adequate community water supply. Usable at any stage of a unit; for instance to introduce a unit, for discussion purposes, or as a review.

TARGET POPULATION SLANT: Slanted by omission; shows no minorities or females.

OVERALL RATING FOR AUDIENCE LEVELS:
*Grades K-3: Good
*Grades 4-6: Very Good
*(Rating Scale: Very Poor, Poor, Fair, Average, Good, Very Good, Superior)
Ratings determined by a panel of elementary teachers, elementary education professors, subject matter specialists and audio-visual media specialists.

APPENDIX D

THE FILM EVALUATION PANEL'S QUALITY RATINGS

Table D1.--The panel's ratings for film one obtained from the Film Quality Rating Instrument.
Panel Members (left to right: Elementary Teachers, Media Specialists, Elementary Educators, Subject Specialists)

FQRI Q#a      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
1             6    6    5    5    6    6    6    5    6    5    5    5    5    3    5    6
2             5    6    5    6    5    6    5    5    6    4    4    5    3    5    5    5
3             5    5    6    5    5    6    5    5    5    3    4    6    4    4    2    5
4             5    6    6    5    5    5    5    5    6    3    4    5    3    4    4    4
5             5    5    5    4    5    6    4    6    4    4    4    5    2    4    2    3
6             5    5    5    4    5    5    4    5    4    4    4    5    2    4    2    3
7             5    6    5    5    5    5    4    5    5    5    3    5    2    4    4    3
8             5    5    5    5    5    5    4    5    5    5    3    5    2    4    3    3
9             6    6    5    4    6    5    6    6    6    3    4    6    3    6    5    6
10            5    5    5    5    5    5    5    4    5    3    3    5    2    4    3    5
11            6    5    5    5    5    5    5    4    4    2    3    5    2    4    4    3
12            5    5    6    6    6    6    5    5    5    3    4    5    4    4    3    6
13            5    5    6    6    5    5    5    5    5    4    4    5    4    3    4    6
14            5    5    6    6    5    6    5    5    6    4    4    5    4    3    3    6
15            6    6    5    5    5    6    6    5    6    5    4    5    4    5    4    3
16            6    6    6    5    5    5    6    5    5    5    3    5    3    5    3    5
17            6    6    6    5    6    5    5    5    6    4    3    6    4    2    4    6
18            5    5    6    6    5    5    5    5    5    4    4    5    4    4    3    6
19            3    5    5    6    5    5    4    5    5    4    3    5    3    5    3    3
20            5    5    5    5    5    6    5    5    5    4    4    5    3    3    4    3
21            5    6    6    6    5    6    4    6    5    4    5    5    3    4    4    3
Totalb      104  108  108  103  104  108   99  100  104   77   74  103   63   80   70   90
Meanc       5.2  5.4  5.4  5.2  5.2  5.4  4.9  5.0  5.2  3.9  3.7  5.2  3.2  4.0  3.5  4.5
SDd         .69  .50  .50  .67  .41  .50  .69  .46  .70  .81  .57  .37  .93  .92  .95 1.36

aFQRI Q# = Film Quality Rating Instrument question number.
bColumn totals, questions 1-20.
cThe mean rating obtained per panel member (questions 1-20).
dThe standard deviation of the ratings obtained per panel member (questions 1-20).

Table D2.--The panel's ratings for film two obtained from the Film Quality Rating Instrument.

Panel Members (left to right: Elementary Teachers, Media Specialists, Elementary Educators, Subject Specialists)

FQRI Q#a      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
1             4    5    5    5    5    5    6    5    6    5    5    3    5    5    3    5
2             4    5    5    3    3    5    6    4    6    2    5    5    4    5    4    4
3             2    4    4    3    4    5    5    4    5    4    5    4    4    5    3    4
4             2    4    5    2    4    6    6    5    6    3    5    5    5    3    3    5
5             2    4    5    3    3    6    6    4    4    6    5    5    2    5    4    6
6             2    4    6    3    3    5    6    4    4    1    5    4    2    5    3    4
7             3    4    6    2    1    5    6    2    6    1    6    5    5    4    4    3
8             4    5    6    3    4    5    6    2    6    1    6    5    4    4    4    4
9             4    5    5    3    6    5    6    5    6    5    5    3    4    3    3    5
10            4    2    5    2    3    3    6    3    4    1    4    3    4    3    4    3
11            2    3    5    2    4    5    6    4    5    1    5    4    4    4    4    3
12            2    2    4    2    3    5    5    4    5    5    2    3    3    3    2    3
13            3    2    4    2    4    5    5    4    5    5    4    4    4    4    3    6
14            3    4    5    3    4    5    5    4    5    4    4    4    5    5    3    5
15            5    4    5    2    3    5    5    3    6    4    5    5    5    5    4    4
16            5    5    5    3    5    5    6    4    5    3    5    5    5    5    2    6
17            6    5    6    4    5    5    6    5    6    2    5    6    5    5    3    6
18            4    3    4    2    3    5    5    2    4    1    5    3    4    3    3    5
19            4    4    5    2    2    5    5    2    4    1    5    5    4    4    4    5
20            3    4    5    3    3    5    5    3    4    1    4    4    3    4    3    5
21            4    4    5    3    4    6    6    3    4    2    5    5    2    5    4    5
Totalb       68   78  101   54   72  100  112   73  102   56   95   85   81   84   66   91
Meanc       3.4  3.9  5.1  2.7  3.6  5.0  5.6  3.7  5.1  2.8  4.8  4.3  4.1  4.2  3.3  4.6
SDd        1.19 1.02  .69  .80 1.14  .56  .50 1.04  .85 1.79  .85  .91  .94  .83  .66 1.05

aFQRI Q# = Film Quality Rating Instrument question number.
bColumn totals, questions 1-20.
cThe mean rating obtained per panel member (questions 1-20).
dThe standard deviation of the ratings obtained per panel member (questions 1-20).

APPENDIX E

THE FILM EVALUATION PANEL'S SUBJECTIVE COMMENTS

[Table E1 (the panel's subjective comments): text not legible in this copy.]
APPENDIX F

RAW DATA: THE SUBJECTS' RATINGS

[Tables F1-F4 (the ratings of treatment groups I-IV): data not legible in this copy.]

APPENDIX G

THE SUBJECTS' SUBJECTIVE COMMENTS OBTAINED FROM THE QUALITY MEASURE

[Tables G1 and G2 (illustrative samples of the subjects' comments): text not legible in this copy.]
[Table G3 (illustrative samples of the subjects' comments): text not legible in this copy.]

APPENDIX H

THE ANOVA ANALYSIS MADE OF THE SUBJECTS' BETTERNESS RATINGS

Table H1.--Results of the treatments (4) x sessions (2) ANOVA analysis made of the subjects' betterness ratings.

Source                  SS     df       MS        F       p
$ (subjects)           ...     ..      ...      ...     ...
T (treatments)       32.15      3    10.72      1.6     .22
$(T)                414.09     60     6.90      ...     ...
S (sessions)           .20      1      .20       .0     .99
TS                   22.96      3     7.65      1.6     .21
$(T)S               288.34     60     4.81      ...     ...

APPENDIX I

THE ANOVA ANALYSES MADE OF THE SUBJECTS' QUALITY RATINGS

Table I1.--Results of the treatments (2) x sessions (2) x films (2) ANOVA analysis made of the quality ratings obtained from treatment groups I and IV.

Source                  SS     df       MS        F       p
$ (subjects)           ...     ..      ...      ...     ...
T (treatments)         .50      1      .50       .2     .63
$(T)                 63.50     30     2.12      ...     ...
S (sessions)          9.03      1     9.03   11.7**     .00
TS                    3.78      1     3.78     4.9*     .03
$(T)S                23.19     30      .77      ...     ...
F (films)             3.78      1     3.78      1.2     .32
TF                     .03      1      .03       .0     .92
$(T)F                91.19     30     3.04      ...     ...
SF                     .00      1      .00       .0    1.00
TSF                    .13      1      .13       .1     .74
$(T)SF               32.88     30     1.10      ...     ...

*p < .05.
**p < .01.

Table I2.--Results of the sessions (2) x films (2) ANOVA analysis made of the treatment group I quality ratings.

Source                  SS     df       MS        F       p
$ (subjects)         21.25     15     1.42      ...     ...
S (sessions)         12.25      1    12.25    16.3*     .00
$S                   11.25     15      .75      ...     ...
F (films)             1.56      1     1.56       .6     .46
$F                   40.94     15     2.73      ...     ...
SF                     .06      1      .06       .0     .85
$SF                  26.44     15     1.76      ...     ...

*p < .01.

APPENDIX J

THE ANOVA ANALYSES MADE OF THE SUBJECTS' RELEVANCE RATINGS

Table J1.--Results of the treatments (4) x sessions (2) ANOVA analysis made of the relevance ratings obtained from the O1F1 stimulus condition.

Source                  SS     df       MS        F       p
$ (subjects)           ...     ..      ...      ...     ...
T (treatments)        4.78      3     1.59       .3     .85
$(T)                343.69     60     5.73      ...     ...
S (sessions)         30.03      1    30.03    12.2*     .00
TS                    3.78      3     1.26       .5     .73
$(T)S               147.19     60     2.45      ...     ...

*p < .01.

Table J2.--Results of the treatments (4) x sessions (2) ANOVA analysis made of the relevance ratings obtained from the O3F2 stimulus condition.

Source                  SS     df       MS        F       p
$ (subjects)           ...     ..      ...      ...     ...
T (treatments)       43.84      3    14.62      2.2     .10
$(T)                403.38     60     6.72      ...     ...
S (sessions)         28.13      1    28.13     7.5*     .01
TS                   15.25      3     5.08      1.4     .26
$(T)S               226.63     60     3.78      ...     ...

*p = .01.

Table J3.--Results of the treatments (4) x sessions (2) x stimulus conditions (2) ANOVA analysis made of the relevance ratings elicited from the mediocre relevance condition, for the O3F1 and O3F2 stimulus conditions combined.

Source                  SS     df       MS        F       p
$ (subjects)           ...     ..      ...      ...     ...
T (treatments)       31.67      3    10.56       .9     .43
$(T)                676.06     60    11.27      ...     ...
S (sessions)         36.00      1    36.00     8.1*     .01
TS                   13.19      3     4.40      1.0     .40
$(T)S               265.81     60     4.43      ...     ...
C (conditions)       66.02      1    66.02    11.1*     .00
TC                   25.17      3     8.39      1.4     .25
$(T)C               357.81     60     5.96      ...     ...
SC                    2.25      1     2.25       .7     .40
TSC                   7.19      3     2.40       .8     .53
$(T)SC              191.56     60     3.19      ...     ...

*p ≤ .01.

Table J4.--Results of the treatments (4) x sessions (2) x stimulus conditions (2) ANOVA analysis made of the relevance ratings elicited from the high relevance condition, for the O1F1 and O2F2 stimulus conditions combined.

Source                  SS     df       MS        F       p
$ (subjects)           ...     ..      ...      ...     ...
T (treatments)        3.76      3     1.25       .2     .90
$(T)                374.33     60     6.24      ...     ...
S (sessions)         16.50      1    16.50     6.6*     .01
TS                   10.04      3     3.35      1.3     .27
$(T)S               149.70     60     2.50      ...     ...
C (conditions)         .88      1      .88       .3     .60
TC                    6.61      3     2.20       .7     .57
$(T)C               194.77     60     3.25      ...     ...
SC                   13.60      1    13.60     7.0*     .01
TSC                   7.14      3     2.38      1.2     .31
$(T)SC              116.52     60     1.94      ...     ...

*p ≤ .01.

Table J5.--Results of the treatments (4) x films (2) ANOVA analysis made of the subjects' relevance ratings elicited for the O3 objective during session one.

Source                  SS     df       MS        F       p
$ (subjects)           ...     ..      ...      ...     ...
T (treatments)       35.59      3    11.86      1.7     .19
$(T)                428.53     60     7.14      ...     ...
F (films)            46.32      1    46.32     8.9*     .00
TF                   27.90      3     9.30      1.8     .16
$(T)F               313.28     60     5.22      ...     ...
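The F ratios in Appendices H through J can be checked against the reported sums of squares and degrees of freedom. The sketch below does this in Python for Table H1. It assumes the conventional error terms for a design with one between-subjects factor and one repeated factor: treatments are tested against subjects within treatments, and sessions and the treatments x sessions interaction are tested against the subjects x sessions term. The source does not state its error terms, although the reported values are consistent with this assumption. All function and variable names are illustrative only.

```python
# Recompute the F ratios of Table H1 from the reported SS and df values.
# Assumption (not stated in the source): T is tested against $(T), and
# S and TS are tested against $(T)S, as in a standard mixed design.

def mean_square(ss, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return ss / df

def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = MS(effect) / MS(error)."""
    return mean_square(ss_effect, df_effect) / mean_square(ss_error, df_error)

# (SS, df) pairs transcribed from Table H1.
treatments = (32.15, 3)
subjects_within_treatments = (414.09, 60)   # assumed error term for T
sessions = (0.20, 1)
treatments_x_sessions = (22.96, 3)
subjects_x_sessions = (288.34, 60)          # assumed error term for S and TS

f_treatments = f_ratio(*treatments, *subjects_within_treatments)
f_sessions = f_ratio(*sessions, *subjects_x_sessions)
f_interaction = f_ratio(*treatments_x_sessions, *subjects_x_sessions)

# Round to one decimal place, the precision used in the F column.
print(round(f_treatments, 1), round(f_sessions, 1), round(f_interaction, 1))
# prints: 1.6 0.0 1.6
```

The printed values agree with the F column of Table H1 (1.6, .0, and 1.6). The same two functions can be applied to the other tables by substituting the corresponding SS and df entries.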
APPENDIX K

THE EXPERIMENTAL DESIGNS USED FOR THE RELEVANCE, QUALITY, AND BETTERNESS MEASURES

[Figure K1.--The experimental design for the relevance measure. Diagram not legible in this copy.]