ABSTRACT

STABILITY OF SEMANTIC MEANING SPACE AND CHANGE IN CONCEPT MEANING DURING TEACHER TRAINING

By Richard J. Stiggins

This study was designed to further explicate the nature of semantic meaning space as it relates to educational concepts presented during teacher training. It sought to test the potential of the semantic differential scaling procedure for systematically assessing a noncognitive outcome of teacher training. More specifically, research hypotheses were generated from four major questions:

1. What are the primary dimensions which characterize the semantic meaning space with respect to educational concepts?

2. What is the nature of the effect of teacher training on these dimensions? Do they remain constant from the beginning to the end of instruction?

3. Is there evidence that instruction influences the connotative meaning of individual concepts presented during teacher training?

4. Is there evidence that the instructor plays a role in these changes, if they occur?

Two additional questions addressed the potential role of the semantic differential in providing feedback of an evaluative nature for the planning of future instruction:

5. Are the connotative meanings of concepts and the changes in those meanings appropriate for teacher training?

6. Are there other relevant hypotheses which might be addressed by employing this scaling procedure?

The method devised for seeking answers to these questions was the administration of a pilot-tested semantic differential instrument to 252 undergraduate education majors during the first and last weeks of their first teacher training course. Eleven concepts were rated on 15 scales. The concepts were chosen to represent technical educational psychology terms, concepts related to the interpersonal demands of teaching, and concepts in no way related to education. The first two groups included terms systematically presented in the course, and the latter group was included as a control. The scales were selected to tap three dimensions of meaning: evaluative (favorable-unfavorable), potency (weak-powerful), and activity (active-passive).

The results indicated that more than three dimensions of meaning were tapped. Unlimited maximum likelihood factor analysis revealed that the most parsimonious explanation of scale interrelationship was a four-factor solution: evaluative, personal evaluative (pleasant-unpleasant), leniency (severe-lenient), and potency (weak-powerful, active-passive). These four latent constructs remained very stable over time. Multivariate analysis of variance of factor scores revealed that there were significant changes in the meanings of the instructional concepts within this stable frame of reference, but that there were no changes in the meanings of noninstructional terms. No differential effect of instructors was found. Further, it was speculated that the meanings and changes were appropriate for teachers and teacher training. However, additional refinement of the scaling procedure is required before other relevant hypotheses can be tested.
STABILITY OF SEMANTIC MEANING SPACE AND CHANGE IN CONCEPT MEANING DURING TEACHER TRAINING

By Richard J. Stiggins

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Personnel Services, and Educational Psychology

1972

ACKNOWLEDGMENTS

The selection, clarification, and execution of the research reported here was a process which was as true a representation of myself as it is possible for me to attain at this point in my professional development. I am, in turn, a reflection of interactions with many people who shared greatly in that development: Joe L. Byers, who shared and clarified so many things, including the completion of this dissertation; Robert L. Ebel, who shared the task of planning the course of my professional development; and Robert C. Craig, Frederick R. Ignatovich, and Willard Warrington, all of whom provided stimulus for growth. Their encouragement, support, criticism, questioning, patience, and acceptance are greatly appreciated. The nature of the contributions of each of these people was clearly apparent, even through the final steps of doctoral study. For this, I thank them.

I am also indebted to the staff and students of Education 200 for their valuable support and assistance.

Maintaining order in the face of total confusion can be a difficult and trying task. Only the wife of a doctoral candidate knows how difficult. Thank you, Nancy.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES

Chapter

I. INTRODUCTION
   Rationale for the Research
   Component of Teacher Training to Be Assessed
   Assessment Technique
   Context of the Assessment
   Questions to Be Addressed

II. REVIEW OF RELEVANT AND RELATED RESEARCH
   The Semantic Differential Technique
   The EPA Structure and the Language of Emotion
   Psychometric Qualities of the Semantic Differential Technique
   Applications of the Semantic Differential to Educational Training and Change
   Conclusion

III. METHODOLOGY
   Introduction
   Instrument Development Pilot Test
   Pre/Post Administration
   Analysis
   Conclusion

IV. RESULTS
   Introduction
   Scale Interrelationships and Factor Analysis
   Tests of Factor Structure Invariance
   Test of Change in Concept Meaning
   Instructor Role in Meaning Change
   Conclusion

V. DISCUSSION
   Introduction
   Scale Interrelationships and Factor Analysis
   The Question of Concept Meaning Change
   Instructor Differences in Change in Meaning Intensity
   The Question of Other Testable Hypotheses
   Conclusion

BIBLIOGRAPHY

APPENDICES
LIST OF TABLES

1. Test-Retest Reliabilities in Applications of the Semantic Differential
2. Validity Coefficients for Semantic Differential Scores (s) and Thurstone Scale Scores (t)
3. Results of the Chi Square Test of Goodness of Fit of the Three-Factor Model to the Variance-Covariance Matrix of Responses to Each Concept for Both Pretest and Posttest Observations
4. Rotated Factor Matrices for Four-Factor Solution of Pooled Correlation Matrices, Pre and Post
5. Results of All Statistical Tests Assessing the Degree of Factor Invariance Over Time When Comparing the Pooled (Over Concepts) Pretest and Posttest Factor Solution
6. Multivariate Analysis of Variance Results of Change in the Evaluative Dimension
7. Multivariate Analysis of Variance Results of Change in the Potency Dimension
8. Multivariate Analysis of Variance Results of Change in the Leniency Dimension
9. Multivariate Analysis of Variance Results of Change in the Personal Evaluative Dimension
10. Multivariate Analysis of Variance Results of Change in the Overall Meaning Intensity
11. Factor Score Means for Each of the Four Major Meaning Dimensions for Each Concept
12. Results of Manova of Instructor Differences in Pretest Meaning and Change in Meaning on the Three Dimensions Where a Significant Change Was Detected
13. Results of Analyses of Variance of Change in Meaning Over Instructors for Individual Concepts

LIST OF FIGURES

1. Example of an Application of Cattell's Scree Test
2. Example of a Concept in Semantic Meaning Space
3. Fit of the Factor Solutions for the Concept Myself as a Teacher
4. Combined Data Assessing Fit of Factor Models for the Concept Nonverbal Behavior
5. Fit of the Factor Solutions for the Concept Myself
6. Fit of the Factor Solutions for the Concept Questioning and Listening Skills
7. Fit of Factor Models for the Concept Behavioral Objectives
8. Fit of Factor Models for the Concept Reinforcement
9. Fit of Factor Solutions for the Concept Respondent Learning
10. Fit of Factor Solutions for the Concept Shaping
11. Fit of Factor Solutions for the Concept Physician
12. Fit of Factor Models for the Concept Religion
13. Fit of Factor Solutions for the Concept Marijuana
14. Fit of Factor Solutions for the Concept Behavioral Objectives Employing 12 of the 15 Scales
15. Fit of Factor Solutions for the Pretest and Posttest Results Pooled Over Concepts
16. Graphic Representation of Pretest and Posttest Meaning Assigned to IPL Concepts: Myself as a Teacher, Nonverbal Behavior, Questioning and Listening Skills, Myself
17. Graphic Representation of Meaning Assigned to Carrel Concepts: Behavioral Objectives, Reinforcement, Respondent Learning, Shaping
18. Graphic Representation of Pretest and Posttest Meaning Assigned to Noninstructional Concepts: Physician, Religion, Marijuana
19. Sample of Data Representing Role of Instructor in Meaning Change
LIST OF APPENDICES

A. SUPPLEMENTAL GENERAL INFORMATION
B. PILOT TEST SUPPLEMENT
C. PRE/POST ADMINISTRATION SUPPLEMENT
D. RESULTS SUPPLEMENT

CHAPTER I

INTRODUCTION

Among the many issues confronting educators in America, one of the most important is clearly a definition of the role which educational institutions are to play in our society. There are those who point to cognitive skills and the teaching and learning of subject matter competencies as the primary mission of these institutions, and there are those who would claim a primary role for affective goals and objectives. (For a discussion of these points of view, see Ebel, 1972.)

With the increasing call for accountability, with its many and varied definitions, there is a pervading need for evidence of success in academic ventures in the form of data. This call for accountability provides ample grounds for heated discussion between the two camps mentioned above, because those who support efforts in the area of affective growth are hard pressed, indeed, when confronted with the task of presenting evidence demonstrating the effectiveness of their endeavors. This is not the case, however, for those educators who back the cognitive mission. Achievement testing has developed into a sophisticated apparatus for assessing the outcome of student and instructor efforts. It is being pressed into use in national and state accountability assessments. To deny the critical role of the cognitive mission and its accompanying system of evaluation would be difficult to defend.

However, those who support the primacy of the cognitive endeavors hasten to admit that there are affective outcomes of any educational experience. One reason that these may be considered secondary by-products is their abstract nature and the difficulties in quantification which accompany evaluation of affective goals. Herein lies the problem of evidence demonstrating the success or failure of affective educational endeavors.

It is proposed here that this need not be the case in every instance of affective educational goals. Consider an example in the professional preparation of teachers. The traditional goal of training programs in education has been to produce professionals equipped with tools with which to perform the tasks demanded by the profession. Specific capabilities are taught to attain specific competencies because those experienced in the field have determined that these competencies are of assistance in developing an effective learning environment. In other words, those concerned with the professional preparation of teachers advocate the use of the pedagogical tools and skills included in the teacher training curriculum. This is, of course, the case in any professional preparation sequence. But, unlike other fields, when the teacher has completed the sequence, he proceeds into an academic situation where, aside from global restrictions and limits suggested by his professional judgment, he has great freedom to choose the tools and procedures to be applied. Each teacher is responsible for the learning environment in which students dwell in any given classroom. The determination of, or selection of, tools which will be applied is a critical process. But what is the process?
Is it enough to know that when a teacher is sent into a work situation, lacking experience, he knows what tools are at his disposal and how to use them? The answer to the latter question is clearly "no." Knowledge of behavior is a necessary but not a sufficient condition for the manifestation of the learned behavior when practicing autonomously in the classroom. In addition to knowledge, there is at least one other dimension that is critical to the manifestation of the desired behavior. That is the value that the teacher trainee places in each potential procedure. This may be described as a positive affective disposition toward the use of the required or desired skill or tool. For example, knowledge of the mechanics and theory of achievement testing will not necessarily lead to its use in the classroom. The knowledge must be supplemented by a motivation to employ this procedure. And so it is with each of the tools, concepts, and procedures the teacher takes from undergraduate and graduate training. Motivation, as well as knowledge, will determine what will be employed (and to what end) after the last final examination.

But when educators endeavor to evaluate the effectiveness of instruction in teacher training, the primary emphasis (indeed the only emphasis) is placed on assessment of cognitive outcomes. This leads directly back to the problem described at the outset, which cited the relatively more advanced state of the science of measuring cognitive outcomes when compared with the instrumentation and methodology available to assess the affective (motivational) concerns. The research reported here is an attempt to begin dealing more systematically with this affective component of teacher training.

In order to avoid confusion, there is an issue implied in an argument of this nature which must be clarified and discussed before proceeding. This clarification begins by stating that the above argument does not demonstrate a position favoring the primacy of affective outcomes. Cognitive outcomes are critical, and the teacher trainee should continue to be held accountable for cognitive growth in the same manner in which he has been traditionally. However, the responsibility for the development of the motivation or predisposition to employ pedagogical skills and concepts cannot and should not be left to the student alone.

Where, then, should this responsibility reside? The simple answer to this question is that the responsibility for the motivational predisposition lies primarily with the teacher, and for good reason.

Consider Glaser's (1962) conceptualization of instruction, which characterized the pedagogical task as planning instruction, carrying out teaching strategies, measuring achievement, and allowing the results to feed back into instructional planning and strategies. If the results aren't as desired in the cognitive realm, one (or both) of two occurrences has been manifested: either the instruction was inadequate or the student has failed in his responsibility to study. In either case, valuable information is provided by the testing. Now apply the same model to measurement of the motivational predisposition defined earlier. It was stated that knowledge is a necessary but not a sufficient condition for later manifestation of the desired behavior.
Consequently, assume for the moment that learning has occurred and that students know how and when to apply the tools, concepts, and procedures taught. If a measure is taken and it is determined that the students do not react favorably toward the concepts, do not see them as powerful, or do not see them as able to play an active role in the learning environment, to what may these reactions be attributed? Clearly one of two things has occurred. Either the concepts presented do not command these attributes in their own right, or the instruction has failed to demonstrate that the concepts are to be so valued. In either case, the major responsibility for this goes to the person who selected the concepts and/or planned the instruction. Again, in either case, valuable information has been gathered for the feedback loop and the planning of future instruction. It should be the inherent responsibility of the instructor to use such information to make adjustments. It is difficult to conceive of a dedicated instructor who would not use such information, but the problem appears in the fact that such measures are almost never taken. When instructional evaluations are made, they are generally evaluations of the instructor and not the content presented. Courses proceed for term after term with students passing exams with flying colors, much to the satisfaction of their instructors, and then, when it becomes time to apply the concepts, tools, and procedures which have been learned, they choose not to do so.

In short, what is advocated here is that the same model for the evaluation and reworking of instructional sequences be applied to both cognitive and affective concerns in teacher training, so that the training sequence will have maximum impact on the student both cognitively and affectively. A means for accomplishing this is developed here.

Rationale for the Research

One need not search far in the education literature, and more specifically the research literature on education, to find support for the type of research reported here. A brief summary is presented below in the form of comments on the need for the systematic and scholarly evaluation of educational endeavors and the need for research on teacher training.

The Need for Evaluative Research

The need for the development of instrumentation and methodology with which to carry out evaluation of educational programs is restated throughout the literature, particularly in such volumes as the American Educational Research Association Monograph Series on Curriculum Evaluation; the 1969 National Society for the Study of Education Yearbook edited by Ralph W. Tyler, Educational Evaluation: New Roles, New Means; and the Handbook on Formative and Summative Evaluation of Student Learning by Bloom, Hastings, and Madaus (1971). The current status of the concern for the scholarly advancement of evaluation as a discipline is demonstrated by the 1970 creation of a new division of the American Educational Research Association, Division H, Evaluation of Educational Programs.

Those who have been concerned with providing orientation and direction in the endeavor of exploring and developing new methodologies and instrumentation, particularly Stake (1967), Gagné (1967), Provus (1969), Stufflebeam (1968), and Scriven (1967), all repeat the same themes and admonitions. Evaluation should go beyond assessment of cognitive outcomes to a consideration of the antecedent variables, to the goals of a program, to the transactions which characterize
a particular program, and even more importantly, to the critical interactions among these variables. The nature of the research which will result from these types of considerations is applied research in a very true sense and can be of great assistance in the planning of instruction. After discussing the nature and role of such evaluative endeavors, Ahmann summarizes:

Curriculum developers and educators are tempted to de-emphasize evaluation because of the complex and sometimes ill-defined methodological problems present. . . . Evaluation is a secondary activity in the development of curricula, but still one which needs to receive a major share of attention from the principal developers. As Scriven points out, the stakes are high (1967, p. 89).

The Need for Research on Teacher Training

There can be little doubt that extensive research on teacher training must be high on the list of priorities of educational psychologists in the coming decade. Gage introduces his Handbook of Research on Teaching (1963) with an indication of the reason for this:

In recent decades, such research [on teaching] has lost touch with the behavioral sciences. It has not drawn enough nourishment from the theoretical and methodological developments in psychology, sociology and anthropology. Nor has it provided those disciplines with return stimulation, as it did in its earlier period. To remedy this condition--to bring research on teaching into more fruitful contact with the behavioral sciences--is the purpose of this Handbook (Preface).

Research on teaching can, of course, take place at many levels: at the point of selection of undergraduate teacher trainees, during undergraduate and graduate level training, at the time of selection to fill teaching positions, and while the teacher is working in the field. Strong arguments can be constructed for the importance of research at all levels, because of the role this inquiry could play in the learning environments provided by teachers. For the purpose of this research, however, it is necessary to briefly discuss the rationale for research at one of these levels, that of undergraduate training and the formative evaluation of particular programs therein.

Undergraduate teacher training programs are critical in the professional socialization of future educators, primarily because of the nature of the discipline. An effective teacher is, at best, very difficult to define. This lack of an acceptable criterion toward which to strive presents immense problems to those concerned with professional preparation of teachers. First, the selection of a potentially successful teacher in any systematic way is very difficult. Second, there are few criteria against which to judge curricular adequacy. And, finally, accurate judgment of a student's "successful" completion of a training program is very difficult.

Part of the reason for these difficulties may be that educators have concentrated too heavily on cognitive assessment at the expense of the truly critical affective assessment of teacher training. If the dependent variable being examined here is fruitful, at least the last two difficulties mentioned above will be able to be systematically addressed. New instrumentation will be available for the evaluation and determination of instructional adequacy, and this may lead to some insight into the realm of determination of success in a more clearly defined program.
Component of Teacher Training to Be Assessed

Having discussed the rationale for the evaluative research on teacher training programs, it is now appropriate to return to the specifics of the research reported. It has been argued that motivational and cognitive outcomes of teacher training are important for the production of effective teachers. The motivational component has been evaluated less effectively than has the cognitive. The procedure discussed below is aimed at an important first step in the remediation of this state of affairs.

Affect has been defined in numerous ways in education and psychology. Krathwohl, Bloom, and Masia (1964) in their Taxonomy of Educational Objectives (Handbook 2, Affective Domain) list such behaviors as awareness, willingness to receive, acquiescence, willingness and satisfaction in responding, etc. Psychological research in the affective area has been characterized by phenomenal fields (Combs and Snygg), needs (Murray), and values (Allport), and attempts have been made to measure interests (Strong) as well as global personality characteristics (MMPI, Rorschach, TAT).

Those who advocate abandonment of efforts to contend with affective components of education as described above find much support for these global concepts with high levels of abstraction and accompanying measurement difficulties. Little that is systematically helpful in solving evaluative problems in education can be derived from these conceptualizations. Gage (1963) points out that a more effective strategy might be:

Rather than seek criteria of effectiveness of teachers in many, varied facets of their role, we may have better success with criteria of effectiveness in small, specifically defined aspects of the role. Many scientific problems have eventually been solved by being analyzed into smaller problems, whose variables are less complex (p. 120).

If global definitions of affect have been ineffective contributors to the educational evaluation endeavors, then as Gage suggests, perhaps a measure of the specific affect described in the introduction can be of assistance. The affect described was concerned with the favorable or unfavorable attitude of future teachers toward the concepts and tools presented in training. In addition, it was defined as the degree to which the student is able to attribute power to the tools at his disposal and the extent to which these tools are able to play an active role in the learning environment if they are employed there.

In short, one may be concerned with two measures of outcomes of instruction in teacher training: cognitive and affective. This research is concerned with the affective. There are numerous directions in which to pursue affect. This research defines it as motivational predisposition to employ the knowledge which has been acquired. Motivation can be defined in numerous ways, but the one of interest here is the degree of favorability, power, and activity which the teacher trainee perceives in the concepts presented. It is hypothesized that if these meanings associated with the concepts can be measured, then it will be possible to correlate the measures with actual classroom behavior to determine the validity of this definition of motivation. It is the "if" clause above which is the primary interest of this research. That is, an attempt is made to operationalize a possible measure of motivational predisposition for later validation.
These possible indicators of "affect" can be, and are, directly influenced by instruction and should therefore be evaluated and should serve as stimulus for change in instruction if it is indicated. This level of specificity of definition of affect may provide a smaller, more visible target of the type recommended by Gage.

Assessment Technique

In our endeavor to measure these motivational predispositions, we can find assistance in the linguistic realm, specifically from the work of Osgood and his associates (1957) in the development of the semantic differential scaling technique. This technique was developed in order to provide a systematic procedure for assessing the connotative meaning of concepts or objects and is a combination of psychometric scaling and linguistic assessment. This combination yields an index of the particular affect being measured here: the motivational predisposition.

Very simply, this technique involves the use of bipolar adjectives, usually separated by a 7-point scale, which the respondent employs to modify a concept. The manner in which he uses each scale reveals the connotational meaning he ascribes to the concept. For example, if the subject is describing the meaning of the psychological concept reinforcement on a scale which has its ends anchored by the adjectives weak and powerful, and he sees the concept as very powerful, his responses would be very close to the powerful end of the scale. This same concept could be rated on numerous other adjective scales, and the ratings would reflect the connotative meaning that the concept reinforcement has for each respondent.

There is a fundamental distinction to be made between the meaning of a concept measured by the semantic differential and the meaning of that same concept as measured by an achievement test. This distinction is critical to the definition of affect being presented here. Any learned concept has a large domain of meaning associates which are the result of experiences with that concept. Part of that domain is its denotative meaning, and this part is a reflection of the substantive definitional meaning of the concept. For example, one who possesses knowledge of pedagogical skills as a result of teacher training possesses the denotative meaning of the skills. This speaks to the issue of how, where, and when to employ the tools and skills and is traditionally assessed by an achievement test. This denotative meaning is supplemented by additional associations made with the concepts which reflect the meaning of a word apart from its explicit description or definition. Concepts commonly trigger an emotional or affective reaction which is every bit as much a part of the meaning of a concept as its definition. It is this type of emotive association or meaning which is being tapped by the semantic differential scaling technique. In this study, only a part of that emotional or affective reaction is of interest. That subset of reactions consists of the aspects of connotative meaning which appear to portray the motivational predisposition to use the concepts at some later time. In other words, there is a need to know if the concepts taught in teacher training trigger favorable reactions, command connotations of power, and possess the qualities necessary to be active contributors in a learning environment.
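To make the rating format concrete, the sketch below encodes one respondent's ratings of the concept reinforcement on a few bipolar scales. It is a minimal illustration only: the scale names, the ratings, and the 1-7 coding (1 at the left-hand adjective, 7 at the right-hand adjective) are hypothetical assumptions, not data or conventions from this study.

```python
# A minimal sketch of one semantic differential response record.
# Scales and ratings are hypothetical; 1 means "extremely" at the
# first adjective of the pair, 7 means "extremely" at the second.

bipolar_scales = [("bad", "good"), ("weak", "powerful"), ("passive", "active")]

# Hypothetical ratings of the concept "reinforcement" by one respondent.
ratings = {("bad", "good"): 6,
           ("weak", "powerful"): 7,
           ("passive", "active"): 5}

def describe(concept, scores):
    """Report, for each scale, which pole the rating leans toward."""
    for (left, right), value in scores.items():
        pole = right if value > 4 else left if value < 4 else "neutral"
        print(f"{concept}: {left}-{right} = {value} (leans {pole})")

describe("reinforcement", ratings)
```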
If these can be measured, then later research can validate the assumption implicit in this argument: that these measures are in fact indicators of later classroom behavior. Such an assumption would appear to have some merit because of the validity studies done in areas other than teacher training, which demonstrate that measures of connotative meaning are related to behavior. However, discussion of these will be postponed until Chapter II, where the validity of the technique is discussed.

In their discussion of the Measurement of Meaning, Osgood, et al. (1957) present a model based on a classical conditioning paradigm which is designed to explain the manner in which connotative meanings become attached to a concept. A complete discussion of this model will appear in Chapter II. However, some of the major points are worthy of note here. Language and linguistic elements play a critical role in the attachment process, because language is the primary vehicle for the transmission of complex meaning. In each learner, there resides an elaborate system for decoding verbal input and integrating it into a structure of perceptions (cognitive and emotional) which has been created through experience. The purpose of instruction (i.e., teacher training) is to bring about new cognitive and affective (motivational) associations by using language and relying on an integration of these symbols with past experience. It follows from this that language can be profitably employed to assess the associations after instruction. For example, when one wishes to measure the nature and quality of subject matter connections, one uses language to establish a set of controlled conditions (test items) to take a "reading" of the associations which have been made. It is argued here that this same strategy can be used to assess motivational associations by establishing a set of controlled conditions to read out connotative meanings using language as the vehicle. This is the intent of the semantic differential technique.

As Osgood points out (1957, pp. 19-20), there are several levels at which one may make such an assessment. For example, it is possible to provide a stimulus and allow the respondent total freedom in his verbal response, as in a projective psychometric test. However, this is most difficult to interpret. He then proposes that limits be placed on the linguistic elements to be used by the respondent to describe the stimulus, leading to the semantic differential format described earlier. The respondent to an instrument employing this technique is simply asked to describe his associations with a concept on a limited number of descriptive scales which are selected to tap the meaning alternative of interest to the psychometrician. By so prescribing the elements of the description, the researcher is able to control for individual differences in vocabulary and general grammatical and verbal facility.

Such a technique would, therefore, seem to have the potential to supply useful information about the impact of instruction on students and the meanings they come to associate with concepts presented.
Context of the Assessment

Three points have been made thus far: there was a need to further explicate the process of becoming a teacher through evaluative or applied research; a primary phase of such research should be to supplement cognitive outcomes of teacher training with information concerning affective outcomes; and an assessment technique was available which has the potential for supplying such information. Given a teacher training context, a simple application of the technique would reveal the utility of gathering such information and the role such affective data could play in the planning of instruction.

Such a context was supplied by the initial course in the teacher training sequence at Michigan State University, which provides the student with a first look at the task and personal demands of teaching. The course is structured into two phases which run concurrently and which deal with the task demands as well as the personal demands of the profession. Task demands are defined as the knowledge and skills required to establish and maintain an effective learning environment (Henderson, et al., 1972). This is the substantive content presentation phase, where principles of planning instruction (i.e., reinforcement, shaping, respondent learning) are presented. Instruction on the nature and application of these tools is carried out in individual study carrels via cassette tapes, slide presentations, and instructional film loops.

The other phase of the course deals with the personal and interpersonal demands of teaching, such as the role that questioning and listening skills can play in a learning environment or the importance of nonverbal behavior in providing feedback to the teacher. These concepts are confronted in the context of an interpersonal process laboratory (IPL) of no more than 15 students under the supervision of a graduate teaching assistant. The students attempt to develop in themselves the capability of dealing effectively with the interpersonal relationships which will be demanded of them as teachers.

Emphasis in the IPLs is on self-growth, while in the study carrels it is on helping others to grow. Concepts, tools, and procedures which will be of assistance in practice are systematically presented, and it is within this context that the meanings associated with these concepts were assessed.

Questions to Be Addressed

According to the opening argument, from the viewpoint of those coordinating and planning instruction, such tools, concepts, and procedures as behavioral objectives, reinforcement, shaping, respondent learning, nonverbal behavior, and listening and questioning skills (among others) can play an important role in the development of an effective learning environment. But it is insufficient to know that students have mastered objectives which stipulate that they know how and when to use these learning facilitators. It is argued here that, as a result of instruction, they ought to ascribe value to them in the form of a positive disposition toward their use. In addition, students should perceive the learning facilitators as powerful and invest them with the capability of playing an active role in the learning environment.
In order to adequately assess the worth of the semantic differential (SD) technique as a potentially relevant, dependent measure in the study of this affect, answers to the following questions will be sought.

Is there a systematic set of dimensions which individuals consistently employ as a frame of reference on which to rate the connotative meaning of educational concepts? The answer to this question will be inferred from the answers to two more specific questions: Are the primary dimensions which characterize this frame of reference Evaluation, Potency, and Activity, as might be anticipated from prior SD research? Do these factors remain evident from the beginning of instruction to the end of instruction?

Does instruction have a systematic effect on the connotative meaning of the individual concepts taught? The answer to this question is conditioned to some extent on the stability of the frame of reference. This gives rise to two related questions: If there is instability in factor structure, what is the nature of the post-instruction structure in relation to that which existed prior to instruction? If there is stability, that is, if the same factors characterize the post-instruction responses as represent the pre-instruction responses, do meanings of concepts change within the frame of reference? This is the critical question, because the answer sheds light on the influence of instruction on the meanings of specific concepts taught.

There are a number of influences which could contribute to a systematic change in the connotative meanings of concepts. Among the most important of these is the instructor's role in the change, and it is here once again that a test of the utility of the SD technique becomes apparent. The test is, of course, the degree to which instructor differences can be detected, and this test will be carried out.

Another direction from which to approach the question of the relevance of this technique is to assess the degree to which it is able to supply useful information for the planning of future instruction. For example, a relevant contribution would be made if SD ratings revealed inappropriate meanings which are not influenced or changed to appropriate meanings by instruction. In this case, the appropriateness of the meanings would be defined by those who are responsible for planning the instruction. The results of this research will be interpreted with respect to this appropriateness issue.

The final assessment of the potential of this technique of measuring motivational predisposition is to seek other relevant hypotheses which might be addressed with such measurements. Some examples of these will be discussed in later chapters.

Each of these questions is systematically addressed in the research in order to assess the validity and utility of the SD in rating affective components of a teacher training experience.

CHAPTER II

REVIEW OF RELEVANT AND RELATED RESEARCH

There is a large body of research on the semantic differential (SD), both as a psychometric technique and as a dependent measure of meaning in theoretical and applied research. A small part of that research literature is reviewed here with the intent of being brief, concise, and yet including the essential components of prior thinking as it relates to the research reported here.
These essential components include a detailed discussion of the development of the technique from its objective (measuring meaning) through its final format, the relationship between the SD and other research on the language of emotional or affective reactions, and an extensive presentation on the psychometric qualities of the technique. From the general discussion of the technique, its development and nature, the presentation proceeds to a higher level of specificity with a review of applications of the semantic differential in the educational domain, and particularly teacher training, including a review of the limited research on changes in meaning due to an intervening treatment. The juxtaposition of these latter two reviews leads to the research design and methodology reported in the concluding chapters.

The Semantic Differential Technique

Chapter I included a concise discussion of the Osgood conceptualization of the role new experience and language play in the perceptions of the world of stimuli. This section will expand that discussion and add a brief review of the progression of research that led to the semantic differential technique. From this it will be possible to move to a definition of semantic meaning space and extract rationale for subspaces in meaning and the tailoring of the technique for specific applications. The conclusion of the section will be a more detailed presentation of the format generally employed when measuring the descriptive associations made with concepts via a semantic differential.

From among the various definitions of the term "meaning," Osgood chooses one which relates meaning to a "representational mediating process" (1957, pp. 5-9). This is a psycholinguistic conceptualization which tends to become more clearly understood when its elements are considered separately. The primary vehicle of communication is language, which is composed of linguistic elements. These elements, according to Osgood, have been created by man to communicate his thinking or inner states of consciousness. Linguistic elements used in combination therefore represent an operating system of thought in the organism. For example, the elements on this page are symbols which represent a certain pattern of ideas or concepts. Given these elements as stimuli, the reader is employing them as mediators in the process of integrating these thoughts with his previous experience; that is, comparing what is said here with what is already known, both of which are represented by language. This is a process of encoding and decoding linguistic elements. The process is not what Osgood would term meaning; rather, it is the combination of linguistic elements which are encoded to characterize a thought which is the manifest meaning given to a concept or idea. Consequently, it is this linguistic encoding of the words chosen to communicate a thought which he endeavors to assess systematically, on the assumption that it is possible to use this information to draw conclusions about the "states of" or "events in" the language user. In an attempt to avoid confusion about this conceptualization of meaning and the role it can play in psychological research, Osgood adds a cautious limitation:

Although it may be trivial in one sense to insist that all discriminable events in messages must ultimately be correlated with discriminable events in language users, this must be the case if we are to avoid mysticism in our interpretation of language behavior.
When a language user comes out with sequences of linguistic responses which are ordered both as to structural and semantic characteristics, we must assume that there is some ordered, selective system operating within the organism. Ultimately it is the job of the psycholinguist to make a science out of the correlations between message events and states of the organism. In our work on what we have been calling "meaning," we have mapped only a small region of this complex set of correlations, and that rather sketchily (p. 321).

The objective toward which Osgood and his associates intended to move, therefore, was to use linguistics to measure states of nature in individuals; or more specifically, the "connotative, emotive or metaphorical meaning" as opposed to the "denotative, designative, or referential meaning" (p. 321). The progression of their research endeavor led to the semantic differential technique representing these states.

There are, of course, a number of ways to measure such emotive reactions. Osgood (1957, pp. 10-18) reviews a number of these and comes to the conclusion that a psychometric scaling procedure is more appropriate. One alternative is physiological measures of emotional reactions: action potentials in striate musculature, salivary responses, and galvanic skin responses. Other methods of measuring meaning have come from learning studies (semantic generalization, transfer, and interference research), perceptual methods, and word association research. However, these are seen as being of dubious validity, and each involves cumbersome and tedious procedures. Instead, Osgood suggests that a simple though infrequently used scaling procedure is the most plausible option, because it controls for individual differences in vocabulary and verbal facility. Such a procedure can lead to valid inferences about the psychological state of the respondent.

Each step in this research sequence is discussed by Osgood, et al. (1957) in Measurement of Meaning but cannot be totally reproduced here. However, the essence of the progression is outlined below. Osgood found that, when an individual was asked his reactions to a previously encoded (familiar) stimulus concept, he typically responded with a series of adjectives, and that the variety of the adjectives employed differed between individuals and groups of society. He therefore concluded that adjectives would be the most efficient descriptors but that it was necessary to exercise more control over the measurement of the reactions. This control derived from the scaling work of Mosier (1941), which used a bipolar sequence separated by an 11-point scale designed to tap a favorable-unfavorable attitude. The limitation noted in this work was that the measure was unidimensional, while the adjectives employed previously had appeared to be multidimensional. Consequently, Osgood and his associates employed various sets of bipolar adjective pairs and coupled them with concepts to assess the manner in which undergraduates used the adjective pairs to modify the concepts. It became apparent that there were three primary dimensions on which individuals tended to rate the concepts.
These were an evaluative dimension (characterized by such bipolar pairs as good-bad), which assessed the favorable or unfavorable nature of the concept in the perceptions of the rater; a potency dimension (strong-weak), which reflected the perceived power of the concept; and an activity dimension assessing the perceived action of the stimulus concept (active-passive).

From this research two major points emerge which are relevant considerations in the research reported here. The first is the frame of reference commonly formed when individuals are asked to rate meanings of concepts; that is, the evaluative, potency, and activity dimensions. These three dimensions of emotive response have been revealed in other research on the language of emotion and will be discussed at greater length in the next section. The second major point that emerges is in the nature of the instrument itself and its potential for assessing linguistic associations by providing a controlled set of conditions within which the respondent must operate to make known his reactions to concepts and ideas. The potential is realized when one considers the number of combinations of concepts and descriptors it is possible to generate in one linguistic system. In fact, research to date has only tapped small pockets of information about these meanings and associations. Osgood and his associates began with the general case, assessing well-known concepts familiar to all respondents. The conclusions they have been able to draw from such data provide only the broad framework of the system used to encode ideas and reactions. This is at best very incomplete, by Osgood's own admission (see quotation above).

However, in advocating continued development and application of the technique, he stipulates that new, more specific domains be probed simply by selecting content-specific concepts and exploring the interrelationships with various descriptors. In 1961, R. G. Smith concluded:

. . . the dimensions of any special subject matter area must be individually determined even with areas as closely related as those of general speech to the theater arts, since there are both factor and scale variations in significant amounts. This necessitates, for any special area of investigation in which the semantic differential is to be used, a specific factor analysis to determine the important factors and the scales which measure them (p. 1).

The intent of the research reported here was to move in just such a direction in the realm of education: to assess the dimensionality and to seek to quantify meanings ascribed to instructionally relevant concepts, tools, and procedures, in order to use such information to advantage in planning instruction. Both Osgood (1957) and Heise (1969) present procedures to accomplish this, and these were applied and are discussed in Chapter III. For this reason the format of the instrumentation employed was that of a SD technique tailored to assess the domain of educational concepts. More specifically, the student was given a response sheet with a concept listed at the top and a series of bipolar adjectives, each separated by a traditional 7-point scale. The middle response option was labeled neutral, and the extremes in both directions were labeled slightly, very, and extremely, resulting in the following standard SD format:

BEHAVIORAL OBJECTIVES

         extremely  very  slightly  neutral  slightly  very  extremely
good        :        :       :         :        :        :       :      bad
strong      :        :       :         :        :        :       :      weak
active      :        :       :         :        :        :       :      passive
etc.
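As an illustration only, the short sketch below prints a response sheet in roughly this standard format. The concept and adjective pairs are taken from the example above; the exact wording and layout of the actual instrument used in this study may differ.

```python
# A minimal sketch that prints a semantic differential response sheet
# in roughly the standard format shown above. Layout details are
# illustrative, not a reproduction of the actual instrument.

quantifiers = ["extremely", "very", "slightly", "neutral",
               "slightly", "very", "extremely"]
pairs = [("good", "bad"), ("strong", "weak"), ("active", "passive")]

def print_form(concept):
    print(concept.upper())
    print(" " * 10 + "  ".join(quantifiers))
    for left, right in pairs:
        cells = "    ".join(":" for _ in quantifiers)
        print(f"{left:<10}{cells}    {right}")

print_form("behavioral objectives")
```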
The EPA Structure and the Language of Emotion

The relevance of the evaluative, potency, and activity dimensions to this research was defined in Chapter I in terms of the qualities a teacher should ascribe to the instructional concepts, tools, and procedures presented during teacher training. The question of the application of acquired knowledge is addressed directly by the degree to which instruction and training have been able to demonstrate that the procedures presented are good, powerful, and active contributors to an effective learning environment.

However, there is an additional facet to the relevance of this tri-dimension in assessing the qualities ascribed to specific concepts. In the previous section it was reported that Osgood and his associates found these to be the basic dimensions or frame of reference in the general language realm. In other research, as far back as the works of Wundt (1905), dimensions similar to these have been consistently reported. Wundt mentions pleasantness-unpleasantness, tension-relief, and excitement-quiet as the three dimensions of emotion. Schlosberg (1954) altered the terminology, but the dimensions remain essentially unchanged: pleasant-unpleasant, rejection-attraction, and activation-sleep. Nowlis and Nowlis (1956) termed the dimensions hedonic tone, social orientation, and level of activation. It was at this point that Osgood (1957) began to discuss the EPA structure, and this was followed by others: Block (1957): pleasantness, relevant interpersonal relatedness, and level of activation; Davitz (1969): hedonic tone, relatedness, and activation. There have been others who have cited various components of the three dimensions as being characteristic of man's communication in the affective domain. (See Davitz, 1969, p. 132 for a comprehensive listing.)

It becomes evident from this that if man is asked to describe his affective reactions regarding a familiar stimulus, he will probably describe it in all or some subset of three components: whether or not he reacts favorably, how powerful he perceives it to be, and its level of activity. This has even been shown to be true across language and cultural differences (Kumata and Schramm, 1956; Tanaka, Oyama, and Osgood, 1963; Michon, 1960; Suci, 1960). This has brought Osgood to the conclusion that there is a common factor which he terms "semantic space," which has led man to develop linguistic elements that focus on three primary areas and add additional subdimensions, and Miron (1969) to term the three dimensions "universal features of human semantic systems."

The need to define and tailor an instrument to tap these three dimensions should now be evident. Such a procedure is advisable because of the nature of the information sought in this particular study and because it is the common frame of reference in describing the types of linguistic associations which are of interest.
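Before turning to the psychometric qualities of the technique, a brief sketch of how such a tailored instrument is commonly scored: each dimension score is taken as the mean of the ratings on the scales assigned to that dimension. The scale-to-dimension assignment below is a hypothetical illustration, not the factor structure reported later in this study.

```python
# A sketch of dimension scoring: each E, P, A score is the mean of the
# ratings on the scales assigned to that dimension. The assignment
# shown is hypothetical, for illustration only.

from statistics import mean

dimension_scales = {
    "evaluative": ["good-bad", "pleasant-unpleasant"],
    "potency":    ["strong-weak", "severe-lenient"],
    "activity":   ["active-passive", "fast-slow"],
}

# Hypothetical ratings (1-7) of one concept by one respondent.
ratings = {"good-bad": 6, "pleasant-unpleasant": 4, "strong-weak": 7,
           "severe-lenient": 5, "active-passive": 6, "fast-slow": 5}

profile = {dim: mean(ratings[s] for s in scales)
           for dim, scales in dimension_scales.items()}
print(profile)  # one mean rating per dimension for this concept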
Psychometric Qualities of the Semantic Differential Technique

The intent of the argument throughout Chapter I and to this point in Chapter II has been to clearly demonstrate the relevance of the research carried out and the relevance of the semantic differential technique to that research. However, in addition to relevance, the SD technique of quantifying psychological variables has been shown to have the psychometric qualities demanded of scaling techniques. The qualities on which one traditionally judges the adequacy of a particular instrument have been listed as objectivity, reliability, sensitivity, validity of metric assumptions, comparability, validity, and utility (Remmers, 1963; Osgood, et al., 1957). Each of these is defined and discussed at length below to demonstrate the status of the semantic differential as a measurement tool.

Objectivity

"A method is objective to the extent that the operations of measurement and the means of arriving at conclusions can be made explicit" (Osgood, et al., 1957). The criterion of objectivity is that the instrument produces "verifiably reproducible data regardless of the rater" (Remmers, 1963, p. 330). The essence of the defense on this quality is that, given the same concepts and scales plus constant instructions, any two researchers must come up with the same response patterns. Osgood observes:

It may be argued that the data with which we deal in semantic measurement are essentially subjective--introspections about meanings on the part of subjects--and that all we have done is to objectify expressions of these subjective states. This is entirely true, but it is not a criticism of the method. Objectivity concerns the role of the observer, not the observed. Our procedures completely eliminate the idiosyncrasies of the investigator in arriving at the final index of meaning, and this is the essence of objectivity (pp. 125-126).

Reliability

The quality of reliability is defined as the ability of an instrument to yield the same variables over repeated measures within tolerable error (Remmers, 1963). There have been numerous studies reported dealing with the reliability of SD instrumentation in the test-retest sense. These are summarized in Table 1. Coefficients of this magnitude speak directly to the issue of reliability of the results of SD applications over repeated administrations.

Table 1.--Test-retest reliabilities in applications of the semantic differential. [The body of this table did not survive reproduction; it listed, for each study, the retest interval, the variable measured, and the reliability coefficient obtained.]

The issue of internal consistency is, of course, one which must be addressed with each new set of data. However, Oles and Bolvin (1971) report coefficient alpha reliabilities of individual scales on the evaluative dimension ranging from .86 to .92. Though this is frequently not dealt with in SD scaling, construction of such instrumentation requires internal consistency within each dimension of meaning measured. There will be a further discussion of this in Chapter III.
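As a worked illustration of the test-retest sense of reliability discussed above, the sketch below correlates two administrations of the same SD scale. The ratings are invented for illustration, and the Pearson correlation is the standard computation rather than anything specific to the studies cited.

```python
# A sketch of test-retest reliability: the Pearson correlation between
# two administrations of the same SD scale. Ratings are invented.

from statistics import correlation  # available in Python 3.10+

# Hypothetical ratings (1-7) by eight respondents, tested twice.
first_testing  = [6, 5, 7, 4, 6, 3, 5, 7]
second_testing = [6, 4, 7, 5, 6, 3, 4, 7]

r_tt = correlation(first_testing, second_testing)
print(round(r_tt, 2))  # a high value indicates temporal stability
```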
Sensitivity

In order to be considered sensitive, an instrument or measurement technique "should yield as fine distinctions as are typically made in communicating about the object of investigation" (Remmers, 1963, p. 330). This quality is clearly demonstrated in the semantic differential technique from two separate directions, depending on what is defined as the object of investigation. If the object is to employ linguistic elements to make generalizations about specific operating systems in the user of those elements (i.e., the speaker), an alternative to the semantic differential would be a projective test. In this case, as mentioned in Chapter I, the control offered by the SD for idiosyncrasies in vocabulary or differences in verbal ability makes it more sensitive to (or more capable of detecting) common aspects of mental operations across individuals.

If, on the other hand, one chooses to define the object of investigation as the measurement of affect, which is generally equivalent in the literature to defining it as an attitude measure, the technique must be shown to be as sensitive as other measures of attitude. Two studies have been carried out which address this issue.

Before discussing these, however, there is an important distinction to be made which speaks to the issue of the sensitivity of the semantic differential. The measurement of attitude has traditionally been defined as the scaling of a learned favorable or unfavorable predisposition toward an object (Allport, 1967). It must therefore be concluded that, if one is concerned with attitude as a measure of affect, he is concerned with only one of the dimensions which can be measured by the SD technique: the evaluative dimension. The other two primary dimensions of interest here, potency and activity, are critical components of affect which expand the definition of affect and the information which the technique can provide to those concerned with noncognitive outcomes of education. These dimensions are, therefore, claimed as added sensitivity for the semantic differential over narrower attitudinal definitions of affect.

However, even within the limited definition, if one compares the evaluative dimension of the semantic differential with other attitude scaling procedures, the criterion for ascribing the quality of sensitivity to the semantic differential is met. That is, it has been shown to be as sensitive as the Thurstone and Guttman procedures. Table 2 reports part of the data discussed by Osgood concerning the correlation between responses to a Thurstone procedure and the semantic differential technique for three attitude objects, each measured twice. Osgood reports that these six coefficients, once corrected for the unreliability of the instruments, exceed .90. In another study reported by Osgood (p. 194) the SD was compared to the Guttman procedure (N=28 on 14 items), and subjects were found to be rank ordered with respect to attitude with fairly high consistency (rho=.78, p<.01).

Table 2.--Validity coefficients for semantic differential scores (s) and Thurstone scale scores (t).*

Object                  r(s1,t1)    r(s2,t2)
The Church                 .74         .76
Capital Punishment         .81         .77
The Negro                  .82         .81

*After Osgood, et al., 1957, p. 194.

Consequently, it would seem that a microscope at least as powerful as those traditionally used can be focused on affect by employing the SD technique. Sensitivity is added to the extent that one is able to measure dimensions of affect outside the common attitude definition.

Validity of Metric Assumptions

As psychometricians set about the task of developing instruments which have potential applications as dependent variables in the realm of behavioral science research, they make certain assumptions about the variable of interest and the technique they use to measure it. It is essential that these be valid assumptions in order to operationalize the variable adequately, in such a manner as to allow inferences to be made about the state of the variable. The SD requires three such assumptions: the scales are assumed (1) to be equal interval scales, (2) to be anchored at each end by polar opposites, and (3) to pass through a common origin (Messick, 1967; Heise, 1969).
Messick tested the equal interval and common origin assumptions by means of a comparison of the bipolar scales with scales constructed by the method of equal appearing intervals. He came to the conclusion that, considering

. . . an approximate equality of corresponding interval lengths from scale to scale and a similar placement of origin across scales, it seems reasonable to conclude that the scale properties implied by the SD procedures have some basis other than mere assumptions (1967, p. 206).

Greatest assurance of the equality of intervals can be secured, according to Cliff (1969) and Howe (1962, 1966a, 1966b), by employing the adverbial quantifiers extremely, very, slightly, and neutral to define the points on the scale. These add credence to the assumption that the seven points separating the poles are equidistantly spaced (Heise, 1969).

The remaining metric assumption, bipolarity, has been examined by Green and Goldfried (1965) and Bentler (1969). The first study hypothesized that, if adjectives are indeed bipolar, there should be a negative correlation between them when they are rated individually on a scale. The test of this hypothesis was carried out and some bipolarity was indicated. A second criterion was employed to test this point further; it stipulated that if adjectives are indeed opposites, the factor loadings of their individual ratings should be equal in magnitude but opposite in sign. Again some bipolarity was revealed. However, neither test led Green and Goldfried to state conclusively that the assumption is valid. Bentler (1969) found approximate bipolarity in linguistic contrasts frequently used in SD research, as did Mordkoff (1963, 1965). It must be concluded from this that, at worst, one makes little error in most applications of the semantic differential in assuming bipolarity, equal interval properties, and a constant origin (Heise, 1969).
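Green and Goldfried's first criterion lends itself to a direct computational check: if two adjectives are true polar opposites, separate unipolar ratings of the same concepts on each should correlate strongly and negatively. A minimal sketch in Python (the ratings are simulated for illustration only):

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated unipolar ratings of 50 concepts: each concept is rated
    # separately on "good" (1-7) and on "bad" (1-7) by the same judges.
    evaluation = rng.normal(0.0, 1.5, size=50)          # underlying reaction
    good = np.clip(np.round(4 + evaluation + rng.normal(0, 0.7, 50)), 1, 7)
    bad  = np.clip(np.round(4 - evaluation + rng.normal(0, 0.7, 50)), 1, 7)

    # True polar opposites should yield a strong negative correlation.
    r = np.corrcoef(good, bad)[0, 1]
    print(f"correlation(good, bad) = {r:.2f}")   # near -1 supports bipolarity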
Comparability

According to Osgood (1957), comparability speaks to the issue of the range of situations over which the measurement technique is equally valid. As mentioned in the discussion of the relation of the EPA structure to the language of emotion, the technique of measuring affective reactions via a semantic meaning space appears to be equally applicable across languages and cultures outside that on which it was initially tested. In addition, Heise (1969) cites the consistency of scale interrelationships within factors. However, there has been some evidence of concept-scale interaction, in which different concepts cause the same scales to load on different factors when widely different populations are compared. If this discrepancy can be controlled by developing an instrument with stable scale loadings, then comparability within the domain of immediate interest (the domain within which the instrument development work was done), that is, validity across applications within that area, can be safely assumed.

Validity and Utility

Thus far, the two psychometric qualities which have not been discussed are validity and utility, and for good reason. There is substantial evidence in support of the psychometric stature of the semantic differential procedure on four of the first five counts, but reliability, validity, and utility must be demonstrated on each new application. Reliability has already been discussed. Utility is defined as the ability to efficiently yield information relevant to contemporary and practical issues. Chapter I presented the argument that the technique should have the potential to make such a contribution, and the purpose of this research is to attempt a verification of this capability in the realm of program evaluation in teacher training. Tied very closely to this issue is the conceptualization of validity, or the extent to which scores on an instrument "correlate with scores on some criterion of that which it is supposed to measure" (Osgood, 1957, p. 140). The technique must demonstrate validity to be useful when applied to current problems. First, it is necessary to discuss the criterion to which the semantic differential must be compared to be considered valid. It will then be possible to assess the validity and generalize to the utility.

In discussing the sensitivity of the technique, correlations were reported between the semantic differential evaluative scales and other attitude scaling procedures. When corrected to allow for the unreliability of the instruments, the validity coefficients exceeded .90. In a certain sense this demonstrates a high degree of validity, but it is a limited type of empirical validity (Brown, 1970), examining only part of the total instrument.
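The correction referred to here is the standard correction for attenuation, which estimates what the correlation between two measures would be if both were perfectly reliable. A minimal sketch in Python (the reliability values are illustrative assumptions, since the source reports only the corrected results):

    import math

    def correct_for_attenuation(r_xy, r_xx, r_yy):
        # Spearman's correction: observed r divided by the geometric
        # mean of the two instruments' reliabilities.
        return r_xy / math.sqrt(r_xx * r_yy)

    # Illustrative values: an observed SD-Thurstone correlation of .81 and
    # assumed reliabilities of .85 and .88 for the two instruments.
    print(round(correct_for_attenuation(0.81, 0.85, 0.88), 2))   # 0.94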
In order to truly speak to the issue of utility, this conceptualization must be expanded to include validity in a predictive sense. Such predictive studies have been carried out in numerous areas of investigation outside the educational domain and, to a lesser extent, within education.

In areas outside of education, the SD procedure has been tailored (i.e., scales and concepts selected) to predict movement toward mental health, employing independent ratings of clinical psychologists as the criterion (Dingman, Paulson, Eyman, and Miller, 1962; Endler, 1961). In addition, it has been shown to be predictive of the professional status of counselors (professional versus novice) in relation to their empathetic responses to clients (Greenberg, 1970), to predict the racial and ethnic status of students (McNeil, 1967), and to predict various other personality characteristics (Rentz, et al., 1968; Griggs, 1959). In these areas and others, specifically tailored instruments can be said to be valid measures of the variables tapped by the behavioral criteria.

Within the domain of education, there has been less work done attempting to assess the relationship between the SD procedure and other relevant psychological constructs. Spino (1959) and Kubiniec (1970) attempted to predict academic achievement on the basis of the connotative meaning ascribed to constructs, but both studies yielded inconclusive results. Cook (1959) was somewhat more successful in this endeavor, as he was able to demonstrate some increase in the ability to predict ACE scores by adding SD ratings on the concepts "myself as a student" and "the ideal student." There has been other research involving applications of the technique to educationally relevant problems, and these studies are reviewed in the next section, but they do not speak to the issue of validity per se. Perhaps part of the reason for this dearth of research is that there are few clearly defined and reliably measured constructs in the area of education (beyond academic achievement). In any case, the ultimate criterion to which any instrument purporting to measure a relevant construct in teacher training must be related is teacher behavior in the classroom. It is here that the full import of the validity-utility issue comes to the fore.

For example, if one group of students could be shown to ascribe favorable qualities as well as power and a high degree of activity to the concepts, tools, and procedures presented in teacher training, and another group unfavorable, weak, and passive qualities, then in order to demonstrate the validity and utility of the measurement the two groups should behave differently when constructing their learning environments in the classroom. The first group should immediately and extensively employ the tools and procedures presented, while the others should choose alternative procedures or at least delay implementation of the learned techniques. This is the desired and ultimate goal of the research presented here, but due to the state of knowledge in educational applications of the SD technique it is necessary to take some preliminary exploratory steps first. Among these are the development of adequate instrumentation and trial applications to more controlled problems, such as its sensitivity to changes due to instruction. In this way, the validity and utility of the semantic differential procedures in the educational realm will be assessed and developed.

Applications of the Semantic Differential to Educational Training and Change

Previous research relevant to educational pursuits involving the semantic differential can be found, but investigations in the area of teacher training are not numerous, nor do they supply much information about this particular pedagogical task. In 1964, Kerlinger suggested that the techniques developed by Osgood may have potential applications in teacher training:

A related possibility is the study of the attitudes and semantic spaces of teacher trainees. What effect does teacher training have on the educational semantic space of teacher trainees? What effect does actual teaching experience have on the semantic space of teachers? And, if a change takes place, do concomitant changes in educational attitudes take place? (p. 579)

These critical questions related to the measurement of connotative meaning have not as yet been systematically and thoroughly addressed, and they must be pursued prior to the ultimate validity test mentioned above.

The work that has been carried out that might be in some way related is exemplified by Hoffman (1967), who dealt only peripherally with the measurement of teacher attitudes, employing SD ratings on the way to developing methodological and psychometric procedures in the area of measuring connotative meaning. Feshbach and Beigel (1968) attempted to relate teachers' self-perceptions and their perceptions of an ideal child in order to draw some conclusions about the role of the degree of similarity between the personalities of teachers and pupils in determining the success of classroom interactions, learning, and achievement. Wittrock (1962) employed semantic differential procedures to assess the nature of the connotative meanings of "Public School Teachers" and "Public School Children" in an attempt to further explicate the nature of the educational semantic space. This was a follow-up to the study by Husek and Wittrock (1962) in the same domain. Educational meaning space was the subject of further research by Walberg in 1967. These investigators were attempting to define exactly what the meaning of "teacher" was and how it related to education. In a very real sense, however, these were only developmental explorations like the one being presented here.
In order to bring the technique to its full potential as seen by Kerlinger, the procedures have to be brought to bear on the question of change, and this has been done only twice in teacher training, under severely limited conditions. The procedures have supplied useful change data in the area of counseling and psychotherapy. Both Endler (1961) and Dingman, et al. (1967) used evaluative dimension ratings of self-concepts to demonstrate the effects of therapy in abnormal populations. Hartley (1968) investigated changes in perceptions of self and others on the evaluative, potency, and activity dimensions as a result of group interaction processes and was able to demonstrate significant changes. Otis and Barrett (1967) carried out research on the role of educational and vocational counseling in bringing about changes in semantic differential ratings. Hypotheses were tested on evaluative, potency, and activity perceptions, with complete success in predicting evaluative changes and very little success in revealing changes on the other two dimensions. Data on the change in ratings as the result of educational treatment or instruction are not nearly as extensive, particularly in the area of teacher training.

Some work has been carried out by Hoover and Schutz (1968) and by Walberg, Metzner, Todd, and Henry (1968). These came quite close to addressing the questions advanced by Kerlinger, but only in a limited manner. Hoover and Schutz measured changes in the favorable-unfavorable evaluative reactions of students regarding concepts which "represent on an indirect level the major value dimensions tapped by the course." This means in effect that the concepts were not specifically dealt with in the course, but were peripherally related to a particular value bias presented. This, along with the limitation of measuring only evaluative reactions, is the major inadequacy of the study. The researchers would have enhanced the external validity of the study by choosing more dimensions to tap on concepts actually presented in instruction. It is worthy of note, however, that there were statistically significant changes in attitude toward 10 of the 13 concepts rated.

Walberg, et al. (1968) succeeded in overcoming one of the limitations of the previous study by revealing changes on more than one dimension over a 14-week period of practice teaching. The concept in which changes were sought was "myself as a teacher," and the subjects were asked to rate how their students and their peers (teachers) would rate them. The purpose of the research was to measure the differential effects of practice teaching in lower class ghetto schools versus affluent suburban schools on the self-concepts of teachers. Differential changes were reported as a result of the experiences, thus supporting the role of the semantic differential in this type of research. The major limitation, from the point of view of the research being reported here, was that the changes dealt only with self-concept. There are other concepts which might have been added, such as "myself as a person," "child," "authority," and "education in general." These are all concepts addressed systematically in this type of practice teaching experience, and they would have added to the generalizability of the results.

In sum, the research applications of the SD technique which sought to map changes due to a teacher training experience have serious limitations when taken separately.
One (Hoover and Schutz) measures numerous peripheral concepts on a single dimension, while the other (Walberg, et al.) rates one concept on numerous dimensions. The research described here combines these two research designs to measure numerous directly relevant concepts (including self-concept) on more than the evaluative dimension.

Conclusion

A small portion of the research literature on the SD and its applications has been reviewed here with the intent of demonstrating the relationship between the technique and similar measurement methods in the domain of emotion, the adequacy of the technique in terms of psychometrics, and the past applications of the technique to teacher training experiences and the measurement of change therein. By coupling prior thinking on applications of the technique with the rationale developed in Chapter I, it has become apparent that one may systematically address the questions posed by Kerlinger, and an attempt has been made here to do so. This attempt revolved around assessing changes in semantic space and in the meanings of concepts as the result of an educational experience in which the concepts are systematically presented during instruction and the bipolar scales tap the evaluative dimension as well as the activity and potency factors in the students' reactions to these concepts. The research was carried out in the hope that the role of the semantic differential procedures in revealing the influence of instruction could be demonstrated, thus defining an additional dependent variable for the systematic and scholarly evaluation of educational programs.

CHAPTER III

METHODOLOGY

Introduction

At the conclusion of Chapter I, a series of questions was posed which, if answered, would supply some insight into the role the semantic differential could play in evaluative procedures and into the role of instruction in teacher training. Some of these questions spoke directly to the issue of change in semantic meaning space raised by Kerlinger in Chapter II:

1. Are the primary dimensions which characterize the frame of reference with respect to educationally relevant concepts dealt with in teacher training identifiable as evaluative, potency, and activity factors?

2. What is the nature of the effect of teacher training on these dimensions? Do they remain constant from the beginning of instruction to the end?

Other questions posed in Chapter I were intended to probe more deeply into the effect of a teacher training experience, going beyond the level of change in factor structure to change in the meanings of concepts and the role that instruction plays at that level:

3. Is there evidence that instruction influences the connotative meaning of educationally relevant concepts dealt with during a training experience?

4. If there are such changes, is there evidence that different instructors play differential roles in bringing them about?

Still other questions spoke directly to the role of the technique and its potential in program evaluation in the teacher training context. These cannot be addressed statistically, but they are nevertheless crucial to this investigation of a new methodology:

5. Are the connotative meanings of specific concepts and the changes in those meanings appropriate, given the objectives of instruction?

6. Are there other obviously relevant examples of hypotheses which might be addressed via this type of scaling procedure?
These questions have been restated here because they form the framework for the methodological procedures to be described, generating research and statistical hypotheses as well as tests for those hypotheses. By way of orientation, the chapter will begin with a description of an instrument development pilot test, including its purpose, sample, procedures, and results. The discussion will then turn to the application of the SD in order to seek answers to the questions above, describing the sample, instrumentation, design (including its limitations), procedures, and data reduction. Finally, the specific research and statistical hypotheses will be stated with their respective tests.

Instrument Development Pilot Test

Purpose

There were two major purposes for administering a pilot test of the semantic differential prior to seeking information about changes in semantic meaning. The first was to assess the feasibility of gathering and analyzing such data on a large scale. This involved such subproblems as determining a means of communicating the interest of the research without cueing responses, development of instructions for administration of the instrument to large groups, and development of the data processing capability needed to analyze the responses of a large number of students. The second purpose for the pilot was to take some preliminary steps toward the development of instrumentation for measuring the desired dimensions of connotative meaning. It is necessary to mention at this point, however, that this type of development cannot be completed in one step. But the first step was deemed necessary prior to a full-scale administration, primarily because of the desire to have scales which measure three specific dimensions (evaluative, potency, and activity).

Pilot Test Sample

For these reasons, a group of 120 undergraduate education majors at Michigan State University responded to a pilot semantic differential instrument (described below). The subjects were enrolled in Education 200 (which was described earlier as the first course in the teacher training sequence) during the Winter Term of 1972. Pilot instruments were given to a random sample of 200 of the 600 enrollees. Given the exploratory nature of this pilot, the relatively low response rate of 60 per cent still yielded useful data.

Procedures

The instrument to which these students responded was made up of 17 concepts chosen from three content domains within the course, and each concept was rated on 25 sets of bipolar adjectives chosen from the research literature as representative of the evaluative, potency, and activity dimensions. The poles were separated by a 7-point scale with the points labeled "extremely, very, slightly, neutral, slightly, very, extremely." The three content areas represent the three facets of Education 200, with concepts being chosen from the individual study carrels, which present the task demands of teaching; the small group experience, which deals with the personal demands of teaching; and the large group presentation, which aims at a gross emotional reaction to the field of education. The first group of concepts, therefore, consisted of technical educational psychology concepts, the second of interpersonal concepts, and the third of general educational concepts. These are listed along with the bipolar scales in Appendix A.

These elements were assembled, one concept per page, on an optical scan answer sheet, with a letter of introduction and instructions taken verbatim from Osgood (1957, pp. 82-84), to compose the test booklet (see Appendix B).
The booklets were distributed by the instructors during the last week of classes and were to be returned by the students one week later. When returned, the booklets were separated by concept. In the few cases where a concept was left unrated, that concept was eliminated. In a few cases a single scale was unrated, and a neutral response was entered so that the remainder of the ratings could be used.

Results of the Pilot Test

Regarding the first purpose, which questioned the feasibility of employing the SD scaling procedure, numerous inadequacies were noted. Personal interaction with respondents as booklets were returned revealed a generally negative reaction, due primarily to the inadequacy of the description of the purposes and relevance of the research presented in the letter. Additional contributions to the lack of enthusiasm on the part of the students came from unclear instructions and the length of time required to rate 17 concepts on 25 scales. Under these circumstances, it was difficult to maintain motivation at a sufficiently high level to insure thoroughly considered responses.

The data collection procedure involving the use of optical scanning for card punching was a very efficient means of preparing the data for analysis. However, the nature of the response sheet employed (see Appendix B) necessitated choosing a response from 1 to 7 on the scale at the left and transposing that response to an opscan field at the right. This was a tedious process and was seen as an inadequacy in the procedure. Therefore, the following decisions were made with respect to the alterations in procedures for developing the instrument for step 2 of the research:

1. It is very important to define the purposes of the instrument clearly and concisely for the respondent, since they may not be immediately apparent in the instrument itself.

2. The instructions taken verbatim from the SD research literature were totally inadequate. It was apparent that the instructions had to be rewritten to contribute to the student's understanding of the task and its purposes.

3. The instrument had to be reduced in length to maintain a sufficiently high motivation level throughout.

4. The development of a new response sheet was called for, to facilitate responding by allowing the student to mark a response within the scale field, i.e., between the adjective pairs.

Instrument Construction

The responses to the concepts on the 25 scales were summarized in a number of different ways to assure complete understanding of the qualitative and quantitative response patterns. This implies two complementary concerns. The first deals with the statistical nature of the data and calls for summaries of the central tendency and dispersion of responses, as well as analysis of the interrelationships of responses. The second concern is with the summary of the semantic or linguistic relations between the concepts and scales. Both of these concerns, one empirical and one rational, were seen as essential contributors to the decisions to be made concerning the nature of the final instrument. The quantitative analysis involved the computation of the means and standard deviations of each scale on each concept. Scales desirable for continuation to further research were those which had nonneutral mean values (Kerlinger, 1964, p. 520; Mitsos, 1961) and which showed wide variability (to allow for clear descriptions of factor structure).
Two separate manipulations of these data led to a fairly clear-cut decision about which scales should be maintained. First, for each concept the means and standard deviations of the scales were rank ordered, and the scales with the lowest sum of the two ranks were noted. Second, a mathematical combination was computed, combining the mean and standard deviation of each scale on a given concept by extracting the square root of the sum of the squared mean and squared standard deviation. These combination values were then ranked, and the scales with the highest ranks were noted. As might be expected, the same bipolar pairs of adjectives were noted in each of these two cases, and there was a high degree of agreement in these rankings across concepts.
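These two manipulations amount to a simple screening computation over the pilot ratings. A minimal sketch in Python (the ratings are simulated, and the ranking directions are assumptions based on the selection criteria stated above):

    import numpy as np

    rng = np.random.default_rng(1)
    ratings = rng.integers(1, 8, size=(120, 25)).astype(float)  # subjects x scales

    means = ratings.mean(axis=0)
    sds = ratings.std(axis=0, ddof=1)

    # First manipulation: rank the scales by polarization of the mean
    # (distance from the neutral point, 4) and by variability, then sum.
    rank_by_mean = np.argsort(np.argsort(-np.abs(means - 4)))
    rank_by_sd = np.argsort(np.argsort(-sds))
    rank_sum = rank_by_mean + rank_by_sd

    # Second manipulation: sqrt(mean^2 + sd^2) for each scale; the
    # highest values mark the scales to retain.
    combo = np.sqrt(means**2 + sds**2)

    for i in np.argsort(rank_sum)[:5]:
        print(f"scale {i}: mean={means[i]:.2f} sd={sds[i]:.2f} combo={combo[i]:.2f}")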
The response sheets were modified to allow for response between the poles. For a sample see Appendix C. The selection of scales was described in the pilot test section and resulted in the following bipolar scales representing the evaluative, potency, and activity dimensions: EVALUATIVE POTENCY ACTIVI TY (favorable to (weak to strong) (passive to active) unfavorable) unfair-fair worthless- relaxed-tense bad-good valuable passive-active negative-positive lenient-severe insensitive- unimportant-important weak-powerful sensitive uninteresting- gentle-violent still-moving interesting unpleasant-pleasant unenjoyable-enjoyable However, the factor loadings in the pilot test analysis were not clear cut. Consequently, the question Of identifiable dimensions representing the three listed above still existed and this led to a reassessment of the factor load- ings on the pretest posttest administration. The nature of this reassessment was a test of the fit of orthogonal «mm hr “NM ‘6 flx‘ I 58 dimension to the data and is described in detail in the hypotheses and results to follow. The concepts to be rated on each of these scales were chosen to represent two of the areas tapped in the pilot.* These were concepts dealt with in the interpersonal process laboratory experience and in the individual study carrels. l IPL CONCEPTS CARREL CONCEPTS MYSELF AS A TEACHER BEHAVIORAL OBJECTIVES NONVERBAL BEHAVIOR REINFORCEMENT QUESTIONING AND LISTENING SKILLS RESPONDENT LEARNING MYSELF SHAPING With the exception of reinforcement, each of these was tested in the pilot. The general educational concepts were not followed up because of the general and diffuse nature of the instruction which peripherally dealt with them. To carry these through would have resulted in the same type Of inadequacy seen in the Hoffman (1967) study, which rated concepts only tangentially related to instruction, leading to very tentative conclusions about the role of the instruc- tion. In addition to these concepts, three others were added which are completely unrelated to instruction. NONINSTRUCTIONAL CONCEPTS PHYSICIAN RELIGION MARIJUANA *For definition of concept, see Appendix A. 59 These were included to allow for some quasi—control in the hope that, in the absence of a control group of subjects, they might assist in making some tentative judgments about the role of instruction. This is admittedly a type Of con- trol which lacks rigor, especially due to the lack of inde- pendence manifested between the ratings of instructional and noninstructional concepts. However, since a control group was not feasible for this study and since this is an explora— tory type of research, these concepts were seen as a poten— tially useful addition. The conclusions drawn involving this control will be made carefully and with the full reali- zation that later research must necessarily allow for tighter control. By selecting these initial elements for SD research of this variety, it is clear that the criteria for such selection prescribed by Osgood have been met. The sug- gested criteria for stimulus selection (concepts) are that the concepts should be relevant to and representative of the area of research interest but with anticipated individ— ual differences, unitary in meaning (within acceptable limits), and familiar to all subjects. 
The scales (bipolar adjective pairs) should be relevant to the area of inves— tigation and representative of the desired factor structure (generally three for each factor), with an eye toward seman— tic stability and "they should be linear polar opposites and pass through the origin" (Osgood, 1957, pp. 77-81). 60 Two final comments are necessary to make the descrip— tion of the instrument complete and these refer to the limi- tation of length of the final task and the order of concepts and scales. Since this is an initial and an exploratory attempt to assess the utility of a methodology, the same concepts were used as stimuli for all subjects. It was determined that 17 concepts or 25 scales in the pilot test reduced motivation to unacceptably low levels. The combina- .- _ 1“; ~‘-_ 2‘ 7 tion of these two limitations made it necessary to assess a very limited number of concepts (11), thus raising a question concerning the generalizability of these results to other concepts. The 11 chosen were not selected by means of any random procedure to be systematically representative of any particular domain. Consequently, the reader should realize that such generalizations cannot be made. This limitation can only be removed by employing a large number of such stimulus concepts in ongoing research. This type of ongoing study has been carried out in the general application of the semantic summaries as the "Semantic Atlas" reported by Snider and Osgood (1969). However, in the specific field of educa- tion, this work is only beginning and must begin with limited investigations such as the one reported here. Finally, the order in which the concepts and scales would appear in the booklet was achieved by assigning num— bers to each and using a table of random numbers to generate the order. Then the direction of each scale was determined by beginning with the scales as they appeared earlier in 61 this section and randomly reversing them. These procedures are in accordance with standard research techniques and with the semantic differential methodology as described by Kane (1971). Design The design employed for assessing the capabilities of the semantic differential for detecting relevant change is a one group pretest-posttest procedure. This design, as described by Campbell and Stanley (1963), has some inherent weaknesses, some of which pose as threats to the internal validity, given the nature of the intended research. The first potential source of internal invalidity is the fact that events other than instruction (defined as treatment) may be plausible rival explanations of any change that takes place. There are two reasons why this is not a serious weakness here. First, systematic events that occur to all subjects in a large group are unlikely, and second, other experiences related to the concepts which are chosen specifically for relevance to education courses, in partic- ular Education 200, are highly unlikely since that is the first course taken in the teacher training program and the only course in education taken during the term. Another potential threat to internal validity is the possibility that the psychological process of interest (semantic Space) may vary systematically with changes in time or simply due to maturation. According to Campbell and 62 Stanley, this requires special attention in studies carried out over an extended period. Here, once again, there are two points which reduce this threat to internal validity. 
The first is derived from the developmental semantic differential research of Foster (1960), Matz (1963), and DiVesta (1966). These studies report some instability in semantic space in early education, followed by increasing stability through the teens. Therefore, as Darnall (1964) has concluded, there is every reason to believe that maturation plays a role in this type of research. However, it seems appropriate to conclude that the growth curve of this psychological characteristic, like so many others, levels off in the late teens. This has not been tested empirically beyond the middle teens, so maturation may not be totally discarded as a threat in this design. But when one combines the early developmental research with the facts that the duration of this study is just over two months and that the respondents were nearly all college sophomores and juniors, the threat is certainly minimized.

A third potential alternative explanation of change is a testing effect, which stipulates that merely responding to a semantic differential would serve as a stimulus for changing the characteristics of an individual's semantic space. This assertion has not been tested empirically and must therefore stand as a possible alternative explanation for any changes which do take place. However, the use of the group of noninstructional concepts clarified this point (discussed in Chapter V).

There are four additional threats to the internal validity of this research, only one of which may present problems. Differential pre-post measures due to instrument decay are impossible, given that identical instruments were employed at both times. Statistical regression is only a problem with the selection of extreme groups from the population of interest. Since the population of interest is college students majoring in education, there is no reason to suspect that this is an extreme group in terms of semantic ratings. Selection bias is controlled by measuring as many of the students of interest as possible. But this may give rise to a problem due to mortality, the loss of subjects who drop out of the course during the term. Further, with a sample of this size, the loss of subjects due to absenteeism and other administrative difficulties is an important consideration. Efforts have been made to minimize this threat by including only those who completed instruction (which requires attendance) and by measuring changes in as many of these as possible (84 per cent).

In general, it must be remembered that though the cause of change is important in this research, it is the instrumentation and the methodological model which are of primary interest. If the dependent variable indicates potential, it will later be applied to true experimental studies which precisely control these problems.

Procedures

The instrument described above was administered to students enrolled in Education 200 during the first week of classes of the Spring Term of 1972 and again during the last week of classes, thus allowing the intervening treatment to be carried out over a period of nine weeks. The first test booklets assembled and passed out included a cover letter explaining the intent of the research as follows:

As you proceed through the course, you will be exposed to numerous instructional concepts, tools and procedures which will, hopefully, be of assistance to you when you become a practicing teacher. It is important, therefore, that our instruction be effective in teaching the meaning and use of these tools and concepts.
In order to assess the worth of our teaching procedures, periodic measurements will be made in the form of tests to determine how much you have learned. That is, there will be an attempt made to determine whether or not you know how and when to use the instructional concepts and tools presented in the course. This is one method of determining if our instruction is as effective as it might be.

However, there is another influence which the instruction has on you as a student which is as important as how much you learn, and that is how you react emotionally to the concepts and tools as they are presented to you. This effect of instruction is almost never measured in our educational endeavors, but we feel that it could serve the useful purpose of providing information as we plan instruction for the future.

The survey which you are asked to respond to here is an initial attempt to measure some of these noncognitive outcomes of our instruction by asking you to describe your reactions to the concepts. It is an initial attempt because we plan to use your responses to adjust and refine these measurement techniques in order to develop a systematic means of measuring your reactions. For these reasons, we require your assistance.

The posttest booklets included basically the same description, with the adjustments in verb tense required to put the course in the past and a reminder that the students had responded to a similar instrument earlier in the term. For complete copies of both letters, the reader should refer to Appendix C.

The letter was followed by a set of instructions including a further explanation of the research and describing how to use the scales to modify the concepts. These were modifications of the Osgood instructions employed in the pilot test and, on the basis of student reaction, appeared to describe the task more adequately. The instruction sheet is reproduced in its entirety below, rather than in the appendix, to insure complete understanding of the task on the part of the reader. The optical scan response sheets employed in this procedure were also modified after the pilot, allowing the student to make his response on a scale located between the polar adjectives. For a sample answer sheet, the reader should refer to Appendix C.

The exact data collection procedures called for the pretest booklets to be given to the students at the first class meeting by the instructors and for the booklets to be returned within one week. Approximately 10 students returned the booklets after the deadline, and these were eliminated from the study; it was felt that these students might be too far into instruction and might therefore bias the pretest results. The posttest booklets were handed out directly to the students during the last week of instruction, and the students were asked to fill them out at that time under the supervision of the experimenter. This alteration in procedures was made to reduce the amount of class time taken for evaluation procedures.

INSTRUCTIONS

The procedure we have chosen to measure your reactions to the instructional concepts and tools presented in Education 200 is to have you judge each against a set of descriptive scales. Consider the terms or phrases which appear at the top of each page as concepts, tools, and procedures which you have at your disposal in your classroom when you start to teach, and as you respond, make your judgments on the basis of the potential you feel the tools have in your learning environment.

On the top of each page of this booklet you will find listed the different instructional concepts and tools to be judged, and beneath each will be a set of scales.
Please take time to think about and define each of the concepts in your own mind before you respond. Then rate each concept on each of the scales.

HERE IS HOW TO USE THE SCALES:

If you feel that the concept at the top of the page is very closely related to one end of the scale, for example, if you see NON-VERBAL BEHAVIOR as extremely active or extremely passive, respond as follows:

Active  |  2  3  4  5  6  7  Passive    or    Active  1  2  3  4  5  6  |  Passive

If you feel that the term at the top is quite closely related to one end of the scale, as for example in the case where you feel that BEHAVIORAL OBJECTIVES are very strong or very weak tools to use in a classroom, respond accordingly:

Strong  1  |  3  4  5  6  7  Weak    or    Strong  1  2  3  4  5  |  7  Weak

If, however, you see the tool as only slightly related to one end of the scale, for example, if you react to SHAPING as being only slightly positive or slightly negative, then you should respond:

Positive  1  2  |  4  5  6  7  Negative    or    Positive  1  2  3  4  |  6  7  Negative

Finally, if you see the term as being neither one nor the other, as in the case of the concept SHAPING being neither active nor passive, then make a neutral response:

Active  1  2  3  |  5  6  7  Passive

NOTE: Included among the instructional tools in this booklet are three terms which are unrelated to education. This is an intended part of the procedure, and these terms should be rated in the same manner as the others.

REMEMBER THESE THINGS AS YOU RESPOND: (1) Take a few seconds to think about the terms and the scales before responding. The entire booklet should take about twenty minutes to complete. (2) Make each response a separate and independent judgment. (3) It is your first considered impression that we want. Please don't respond carelessly, but don't labor or puzzle over any scale.

Analysis

Data Reduction

The data collection procedures described above yielded very thorough data, with every stimulus concept rated on all or most of the scales by the 252 respondents. In those few cases where a scale was left unmarked, a neutral response was entered; of the 82,160 data points, fewer than 100 were left blank. This data completion procedure was seen as necessary because of the critical role that the variance-covariance matrices would play in the analysis of the data, and computation of this type of summary data requires complete responses.

A machine card drop from the optical scan response sheets produced 22 data cards per subject, coded by concept, time of measurement, and instructor, each containing the responses for a given concept pre and post. These raw responses were summarized by computation of the means and the variance-covariance matrices for each concept at testing times one and two. With the large amount of data in this more manageable form, analysis was carried out.

Hypotheses and Tests: Factor Structure

The research hypotheses tested are derived directly from the questions posed earlier. These reflect concerns in the area of factor structure and changes therein:

Research Hypothesis #1: A three-factor model will fit the raw responses to each of the concepts rated on the 15 scales, and these factors will be conceptually identifiable as evaluative, potency, and activity.

Research Hypothesis #2: The frame of reference (factor structure) which characterizes or fits the interrelationships among concepts at the beginning of the term will fit the responses for the concepts at the end of the term.
The first research hypothesis was tested in two ways, one quite conservative and the other quite liberal. These two decision models both sought to identify the correct factor model (i.e., the number of factors) from the point of view of a statistically parsimonious explanation of the response interrelationships.

The first was an application of the unlimited maximum likelihood factor analysis (UMLFA) procedure developed by Joreskog (1965) and adapted for use on the CDC 6500 Computer System at Michigan State University by Schmidt and Scheifley (1970). According to this procedure:

A correlation matrix of order Po by Po is factor analyzed with Ko common, orthogonal factors, resulting [in a] maximum likelihood solution . . . represented by a factor matrix (of order Po by Ko) of factor loadings and a diagonal matrix (of order Po by Po) of unique variances. The factor loadings of variables represent the regression weights in the regression of the variables on the common factors (i.e., latent constructs) (Schmidt and Scheifley, 1970, p. 7).

The computational procedure calls for the program to perform a series of maximum likelihood solutions using Kaiser's (1958) varimax method; after each solution is completed, it is "examined from the point of view of fit using Lawley's (1963) chi-square test" (Schmidt and Scheifley, pp. 9-10). If the fit does not meet the probability level of a false rejection of a true null hypothesis that the model fits the data (a level specified by the experimenter), then another factor is extracted and the test is repeated. This continues until a model of K common orthogonal factors is found which adequately explains the interrelationships among the variables.

The primary reason for selecting this factor analytic procedure was that traditional principal components analysis assumes that all of the variance in responses is true variance (i.e., that the subjects constitute a population), while the UMLFA procedure does not make such an assumption. Rather, it assumes the subjects to be representative of a larger population and extracts an estimate of unique variance prior to beginning the factor analytic process. Therefore, the resulting estimates of factor loadings represent a more precise description of scale interrelationships, and a statistical test is available for assessing the fit of the various factor models (Morrison, 1967).

The second test of the appropriate factor model was the "scree test" described in detail by Cattell (1966). Though the statistical basis of this test has not been thoroughly described, it is a convenient test which characterizes the important common factors as those associated with the largest latent roots. According to Cattell, in a graphic representation of the latent roots, "the curve falls in a curvilinear fashion and then becomes absolutely straight in a 'scree' of small factor debris" (p. 206). The appropriate factor model is that characterized by the latent roots which depart from the scree line (i.e., the larger initial latent roots). For example, in Figure 1 the three-factor solution would be appropriate, and the latent roots beyond that would be considered unimportant.

Figure 1.--Example of an application of Cattell's scree test. [The figure plots the size of each latent root against the number of factors extracted; the first three roots stand above the straight scree line fitted to the remaining small roots.]
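The scree criterion can be made concrete with a minimal sketch in Python (the two-factor ratings are simulated, not the study data):

    import numpy as np

    rng = np.random.default_rng(2)

    # Simulate 252 subjects x 15 scales driven by two latent factors plus noise.
    loadings = rng.normal(0, 1, size=(15, 2))
    factors = rng.normal(0, 1, size=(252, 2))
    ratings = factors @ loadings.T + rng.normal(0, 1, size=(252, 15))

    # Latent roots (eigenvalues) of the scale correlation matrix, largest first.
    corr = np.corrcoef(ratings, rowvar=False)
    roots = np.sort(np.linalg.eigvalsh(corr))[::-1]

    # The scree criterion retains the roots that stand above the flat
    # "scree" of small, slowly declining values (here, the first two).
    for k, root in enumerate(roots, start=1):
        print(f"factor {k}: latent root = {root:.2f}")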
This hypoth- esis was tested and was examined at a number of levels. Cattell (1966) and Cattell and Baggley (1956) report numerous tests of factor structure similarity and suggest that the appropriate test depends on the research interest and the level of measurement which the researcher 71 \ firScree Line --_1-_-_. Size of latent root 1 2 34567 Number of factors extracted Figure 1.--Example of an application Of Cattel's Scree Test. 72 is willing to assume. For example, the researcher may only be interested in comparing factor profiles or in levels of the factor loadings. In addition, he may be willing to assume that factor loadings are nominal, ordinal, interval, or ratio scales. Depending on the level of interest, he may choose from among nominal scale comparisons which include the Sign Test or the S test (Cattell, 1966). Assuming an ordinal scale, he may compare factor loadings via a non- parametric test such as Spearman's Rho or Kendall's Tau. If an interval scale is appropriate, a product—moment cor— relation will test factor similarity making maximum use of the information contained in the data, as will Burt's coefficient of congruence, assuming a ratio scale. Each of these tests approaches the problem of com— paring factor loadings in the same manner--that is, by comparing loadings from a distinguishable factor from Matrix A with a similar column of loadings from Matrix B. In the research under consideration here, for example, an identifiable factor from the pretest matrix of factor load- ings was compared with a comparable factor in the posttest matrix. Each of the identifiable factors was compared with one:test from each of the levels of measurement.‘ The specific tests carried out to assess the degree Of factor invariance were: 73 1. Nominal Scale: Harris Test (Cattell, 1966) Tests which have been suggested when a nominal level of measurement is assumed are: (l) the nonparametric Sign Test (Cattell, 1966), which compares factors on the basis of the degree of agreement between signs of the factor loadings using an exact probability distribution; (2) the S test (Cattell, 1966), which dichotomizes the factor load— ings as salient (relatively high loadings) or hyperplane (low) and compares the same variable from two factor matrices via a 2x2 contingency table and a Chi Square test of inde- pendence; and (3) the Harris test, simply an improvement on the S test, which uses more information by categorizing loadings as positive salient, hyperplane, or negative salient. The same test of independence is applied. The latter of the three was chosen to test the factor invariance in the present research because Of its maximum use of nominal scale data. The critical value for the deter- mination of factor loading salience applied to the data was that value suggested by Nunnally (1967): 1.50. 2. Ordinal Scale: Spearman Rho (Seigle, 1956) ’,a— This nonparametric test assesses the degree of rela— tionship between factors by comparing ranks assigned to the Shoadings for the comparable factors from the matrices of iJIterest. Again, the test was Of no relationship between rallks. Where N is very small (less than 10) the sampling diSStribution of p is apprOpriate. In the case where N=11—30 74 (the present data compared 15 scales) the sampling distribu— tion of t with n-2 degrees of freedom can be applied. 3. Interval Scale: Produce—Moment Correlation Coefficient This is simply the sample correlation computed for pairs of comparable factors. 
These sample statistics can be tested (HO: no relationship) via the sampling distri- bution of r. 4. Ratio Scale: Coefficient of Congruence (Cattell, 1966) The primary reason for the computation of this index of factor invariance as stated by Cattell is that it coun— ters the major weakness of the product-moment correlation which is that it "takes no account of difference of levels of the two patterns" (p. 196). Burt's coefficient of con- gruence, rC, is computed as follows: ZXY rc ‘ zxzzyz Where X and Y are loadings of the same scales. Since these values are no deviations of loading as in the product— moment correlation, a high degree of congruence is only indicated when level and pattern of factor loadings are Very similar. Other tests which attack the problem from the same (tirection are Tucker's (1958) "coefficient of congruence" anti the Wrigly-Neuhaus "degree of factorial similarity" (15955). The computation of these indices is virtually identical to the Burt Coefficient. 75 The major weakness of this measure is that to date there is no test of significance available, because the metric assumptions have not been explored in this domain (Cattell, 1966, p. 196). 5. Analysis of Covariance Structure: (Schmidt and Schiefley, 1971) One additional test of factor stability was carried out which attacked the problem from a different direction. Rather than comparing each pair of comparable factor load- ings separately, this test compared the entire matrix of factor loadings on the pretest with the variance—covariance matrix of the posttest responses. This was the most rigorous of the tests of factor similarity employed to compare the pre with the posttest interrelationships. This procedure attempts, through an interactive process to find a fit between a full rotated maximum likelihood factor matrix from time 1 and a variance— covariance matrix derived from responses by the same sub— jects to the same scales at time 2. Having found the point of closest approximation, a Chi Square test Of goodness Of fit is carried out. The reason for five tests of factor stability over ‘tine:is best explained by Cattell: No completely satisfactory test of the goodness of fit . . . has been developed, but the available pro- cedures for matching factors and testing for signifi- cance Of the factor loadings may be applied (1966, p. 339). 76 In order to thoroughly test the hypothesis of factor sta- bility, one test from each of the progressively more strin- gent measurement levels and the analysis of covariance structure were carried out. By executing this progression, it becomes possible to make a decision about stability on the basis Of a full examination of the scale interrelation- ships. Dependent Measures as a Composite of Individual Scales The factor structure which best repreSented the data from the rational (psychological) and statistical points of view is thoroughly discussed as part of the results presented in Chapter IV. Briefly, however, the most parsimonious solution was one of four factors and this was found to be invariant (i.e., stable) over time.* Given that this set of latent variables characterized the student's frame of reference both before and after instruction, it was possible to proceed to the second phase of the research, that of working at changes in the response to the concepts within this set of variables. 
Dependent Measures as a Composite of Individual Scales

The factor structure which best represented the data from the rational (psychological) and statistical points of view is thoroughly discussed as part of the results presented in Chapter IV. Briefly, however, the most parsimonious solution was one of four factors, and this was found to be invariant (i.e., stable) over time.* Given that this set of latent variables characterized the student's frame of reference both before and after instruction, it was possible to proceed to the second phase of the research, that of looking at changes in the responses to the concepts within this set of variables.

*It will be necessary to give some indication of the results here to facilitate discussion in this chapter and to allow for clear presentation of the complete results in Chapter IV.

Before discussing the analysis of the data with respect to the questions and research hypotheses about meaning change, however, there are two critical preliminary points to be clarified. Both deal with the dependent measure employed in the following analysis. The first is the operational or statistical definition of "meaning," and the second is the manner in which the data were prepared to reflect this definition.

The operational definition of meaning was derived directly from the bipolar scales and from the formulation of meaning as originally put forth by Osgood:

The point in space which serves us as an operational definition of meaning has two essential properties--direction from the origin and distance from the origin. We may identify these properties with the quality and intensity of meaning respectively. The direction from the origin depends on the alternative polar terms selected, and the distance depends on the extremeness of the scale position checked (1957, p. 26).

A simple example will make Osgood's operational definition of meaning perfectly clear. In this example, assume that one subject has been asked to rate one concept (i.e., atomic bomb) on three completely uncorrelated (orthogonal) bipolar scales, where the unmarked middle position is neutral, and that S responded as follows:

bad:     X :   :   :   :   :   :   : good
weak:      :   :   :   :   :   : X : powerful
passive:   :   :   :   :   :   : X : active

By placing these responses on three perpendicular (orthogonal) axes in three-dimensional Euclidean space it is possible to graphically represent the meaning of this concept for S in the subject's semantic meaning space (see Figure 2).

[Figure 2.--Example of a concept in semantic meaning space: the concept Atomic Bomb located relative to the bad-good, weak-powerful, and passive-active axes.]

In this case the quality and intensity of each reaction is projected as a vector in three-dimensional space extending from a common origin in a direction selected by S and reflecting the distance specified by S. By assigning quantitative descriptors to the direction and distance options available to S, it becomes possible to mathematically represent and manipulate the responses. Consequently, a change in the meaning of a concept, which was of interest in this research, might have been reflected as a change in either the quality or intensity of the response (or both). It is the nature of such a change which was the object of the analysis described below.
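The arithmetic behind this operational definition is elementary, as the following sketch shows. The ratings are hypothetical stand-ins matching the example above, coded -3 to +3 around the neutral midpoint of each seven-point scale.

```python
import numpy as np

# Hypothetical ratings matching the example: bad-good = -3,
# weak-powerful = +2, passive-active = +3.
rating = np.array([-3.0, 2.0, 3.0])

# Quality of meaning: direction from the origin (a unit vector).
direction = rating / np.linalg.norm(rating)

# Intensity of meaning: distance from the origin.
intensity = np.linalg.norm(rating)

print("direction:", np.round(direction, 3))
print("intensity:", round(float(intensity), 3))
```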
There are a number of different procedures available to reflect the change and examine its qualities. One is to attempt to interpret changes on individual scales of each concept. With a small number of scales and a few concepts, as in the example above, such a procedure might be most enlightening. However, with a large number of scales (i.e., 15 in this study) and numerous concepts (11), such a comparison of mean changes on each bipolar scale would be tedious at best and most difficult to interpret. Consequently, as in virtually all SD research, this was not considered a plausible analytic alternative.

A more plausible alternative, which is frequently chosen by SD researchers, is to attempt to find an equitable means of summarizing the individual responses into some psychologically meaningful--yet more manageable--form. This is frequently done by assessing the interrelationships of the scales, averaging over those which seem to be eliciting the same information (high correlations), and separating those groups of items which appear to have little in common (low correlations). The analytic procedure of choice in this instance is factor analysis, in which the correlation matrix is rotated to orthogonal factors maximizing the distinction between high and low intercorrelations. The equitable data reduction is then carried out by averaging over those bipolar scales which demonstrate the highest factor loadings on a given orthogonal factor. These scores then have the advantage of being more manageable, but there is a condition placed on their use. This is an acceptable procedure when, and only when, the researcher is able to make rational sense out of the high and low factor loadings; is able to explain, in psychological terms, why the scales should be averaged over and what the averaged composite score means.

Such a procedure has been extensively applied in SD research. Clear examples can be found in the work of Jakobovits and Osgood (1967), Miron (1961), Walbert, et al. (1968), Feshback and Beigel (1968), and Hoover and Schutz (1968). It is not uncommon to find researchers relying on the extensively tested EPA structure of semantic differential research to form these simple averaged composite scores on the basis of an a priori selection of scales to represent this latent structure (Hartley, 1968).

To offset the advantage of easier interpretation of these composite scores, there is a major disadvantage in the procedure, which arises from the assumption implicit in the computational and analytic steps carried out. In discussing this assumption, it is critical to bear in mind that raw scores with the highest factor loadings on each factor are averaged. This assumes that the intercorrelations of the scales and the factors they represent are unity; that is, perfectly correlated. Further, it assumes that the intercorrelations between scales and the factors which they don't represent are zero. This is tantamount to performing a complete and detailed analysis of scale interrelationships and then discarding 75 per cent of the information gained. It is at the very least a sacrifice of precision, but beyond that it is the assumption that there is a set of latent parameters which totally and perfectly explains the response variance and that the researcher has discovered these and tapped them with his scales. It is safe to conclude that this is simply never true. The extent to which the scales are not perfectly correlated with the latent factor they represent will be reflected in the internal inconsistency (unreliability) of the composite scores. In addition, the extent to which the correlation of scales with other factors is not zero will be reflected in the lack of independence of the composite scores across factors. And yet research such as that carried out by Jakobovits and Osgood (1967), Miron (1961), and Osgood (1957, p. 87) operates under such an assumption, and this seems to be the rule in semantic differential research rather than the exception.
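The conventional unit-weighting procedure, and the amount of information it discards, can be made concrete with a short sketch. All values here are randomly generated or invented stand-ins; with 15 scales and four factors, keeping only the largest loading in each row retains 15 of the 60 computed loadings, which is the 75 per cent loss noted above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 252 subjects rating one concept on 15 scales.
X = rng.integers(1, 8, size=(252, 15)).astype(float)

# Hypothetical rotated loadings (15 scales x 4 factors), built with a
# simple block structure purely for the sake of the illustration.
L = np.full((15, 4), 0.10)
L[0:4, 0] = .70      # scales marking factor I
L[4:6, 1] = .65      # factor II
L[6:9, 2] = .60      # factor III
L[9:12, 3] = .55     # factor IV
L[12:15, 0] = .40    # scales with a less clear placement

# Conventional composite: treat the highest loading in each row as 1
# and all others as 0, then average the selected raw scales.
assign = np.argmax(np.abs(L), axis=1)              # scale -> factor
composites = np.column_stack(
    [X[:, assign == f].mean(axis=1) for f in range(4)]
)

# The implicit unit-weight matrix keeps 15 of the 60 computed
# loadings -- the 75 per cent loss of information noted above.
W_unit = np.zeros_like(L)
W_unit[np.arange(15), assign] = 1.0
print("loadings retained:", int(W_unit.sum()), "of", L.size)
```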
The errors which might be made with such a procedure can be avoided, and maximum precision and interpretability maintained, by using all of the information gained by the factor analysis in the computation of the composite scores. Rather than examining the factor loadings and changing the highest to 1 and the lowest to 0, the factor loadings as they are computed can be used as regression weights to allow each scale to make its due contribution to each factor (Thurstone, 1947). If the factor loading matrix is rotated to orthogonal factors, the standardized raw scores can be multiplied by their respective regression weights (loadings) and summed within factors to yield completely independent factor scores. In addition, it is possible that a scale might contribute valuable information to more than one factor and do so for sound psychological reasons. Such a bifactor item (Nunnally, 1967) would be represented by moderately high factor loadings on two factors. If the loadings are altered to 1 or 0, the total contribution of that item to the meaningful psychological interpretation of the data will be sacrificed.

Consequently, in order to avoid the misinterpretations which might result from assuming that a factor structure or set of latent constructs perfectly explains the data, and to avoid the equally hazardous errors in interpretation which might result from analyzing individual scales, composite scores were generated using the rotated four-factor solution factor loadings as regression weights, thus allowing for ease of interpretation, maximum internal consistency, and total independence of factor scores.

The computational formula for such scores in the case where traditional principal components factor analysis has been carried out is given in matrix form by Thurstone (1947, p. 68):

$$F_{(N \times q)} = X_{(N \times k)} \; S_{(k \times q)}$$

where F is the factor score matrix, X is the raw score matrix, S is the factor structure matrix, N = number of subjects, q = number of factors, and k = number of items (or scales). This formula is reproduced in an alternative form by Cattell (1966):

$$Z_f = Z_v V_{fe}$$

where $Z_f$ is the matrix of factor standard scores, $Z_v$ is the matrix of variable standard scores, and $V_{fe}$ is the factor estimation matrix.

This same type of composite factor score is possible in the case where maximum likelihood factor analysis has been applied to the problem of assessing scale interrelationships. The fundamental difference between the two factor analytic procedures must be taken into consideration, however, in the estimation of factor scores. This is reflected in the equation for estimation of factor scores for the maximum likelihood case reported by Harman (1960):

$$Y = X S^{-1} A$$

where Y is the matrix of factor standard scores, X is the matrix of variable scores, $S^{-1}$ is the inverse of the variance-covariance matrix, and A is the matrix of factor loadings. The first two components, Y and X, are the same matrices as those represented in Thurstone's and Cattell's formulae. However, $S^{-1}$ is included by Harman to correct the score matrix in accordance with the estimate of error partialled out at the beginning of the maximum likelihood procedure. The reader will recall that classical factor analysis operates under the assumption that the data represent a population and contain all true variance. The maximum likelihood procedure, on the other hand, is an estimation procedure computed on a sample of finite size. As $N \rightarrow \infty$, the two procedures become identical and no correction for error is required in the latter procedure. However, in the present case, the correction was applied. An equivalent form of the Harman equation is presented by Morrison (1967):

$$Y = X \hat{\Psi}^{-1} \hat{\Lambda} \left( I + \hat{\Lambda}' \hat{\Psi}^{-1} \hat{\Lambda} \right)^{-1}$$

where X is the same as above, $\hat{\Lambda}$ is the matrix of factor loadings, $\hat{\Psi}$ is the diagonal matrix of unique variances, and I is an identity matrix. For the present purposes, the Harman formulation was applied simply because of its conceptual similarity to the more familiar classical analysis formula and because of the ease of computation once the inverse of the variance-covariance matrix is computed.
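A minimal sketch of the Harman computation, using randomly generated stand-ins for the standardized responses and the rotated loading matrix, is given below.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical inputs: standardized responses of 252 subjects on 15
# scales, and a rotated maximum likelihood loading matrix (15 x 4).
X = rng.standard_normal((252, 15))
A = rng.uniform(-0.3, 0.8, size=(15, 4))

# Sample variance-covariance matrix of the scales.
S = np.cov(X, rowvar=False)

# Harman's estimate of the factor standard scores: Y = X S^{-1} A.
Y = X @ np.linalg.inv(S) @ A
print(Y.shape)   # (252, 4): one score per subject on each factor
```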
The result of these computations for the present data, which included the rotated four-factor estimation matrix, was four orthogonal factor standard scores representing four conceptually clear latent constructs for each subject on each concept, pretest and posttest. Because these scores were in standard score form ($\bar{X} = 0$, $\sigma = 1$) and were, therefore, not easily interpreted in terms of the original seven-point scale, a grand mean was added to each score (placing the zero mean back at its original location), and the standardization which resulted in $\sigma = 1$ (i.e., dividing by the raw score $\sigma$) was reversed by multiplying by the original raw score standard deviation. As a result, there were four factor scores pre and post on each concept which approximated the original metric. This is critical because the original metric contains a valuable bit of information: the location of a neutral point against which to measure the intensity and quality of the original response.

The operational definition of meaning is, therefore, the same as it was in the simple example presented at the outset of this section, except that the three bipolar scales used in that example to describe "atomic bomb" become four orthogonal factor scores which describe educational concepts. The same geometrical interpretation would apply if it were possible to picture four dimensions. The change in meaning of any concept can be easily derived from this, simply by computing the difference between comparable factor scores pre and post. This value for each factor served as the operational definition of change in meaning.

An additional index which can be computed from semantic differential data is labeled as an indicator of how meaningful a concept is (Osgood, 1957) and is a composite of the factor scores. This again can be clearly represented in terms of a geometrical interpretation. Again, since the four-dimensional case cannot be pictured, return to the simple example of the atomic bomb rated on three scales (refer back to Figure 2). If the distance marked off on each axis represents how meaningful the concept is to the person on that scale, by reflecting intensity of meaning as compared with a neutral point of no meaning for that concept on that scale, then the distance from the point in three-dimensional space where the concept is placed to the origin reflects the overall intensity of that meaning on all scales. In three space that distance is computed according to the standard formula for the distance between two points, where one of the points is (0, 0, 0):

$$D = \sqrt{a^2 + b^2 + c^2}$$

where a, b, and c are the vector lengths on each axis. This can be easily generalized to the four space of the present data by

$$D = \sqrt{a^2 + b^2 + c^2 + d^2}$$

which yields an overall index of the intensity of meaning of a concept. This value was computed for each concept and subject both for the pretest and posttest data. A change in the intensity of the meaning of the 11 concepts was then operationally defined as the difference between pre and post.
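The sketch below illustrates both operational definitions for a single hypothetical subject and concept: the intensity index D in four space, and the change in intensity as the posttest-pretest difference. The factor scores shown are invented for illustration.

```python
import numpy as np

# Hypothetical factor scores (evaluative, personal evaluative,
# leniency, potency) for one subject rating one concept.
pre  = np.array([0.8, 0.4, 1.1, 0.6])
post = np.array([1.5, 0.5, 1.3, 1.2])

# Overall intensity of meaning: distance from the neutral origin in
# the four-dimensional meaning space, D = sqrt(a^2+b^2+c^2+d^2).
D_pre  = np.sqrt(np.sum(pre ** 2))
D_post = np.sqrt(np.sum(post ** 2))

# Change in meaning intensity, operationally defined as post - pre.
print("change in intensity:", round(float(D_post - D_pre), 3))
```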
The foregoing discussion has presented the conceptual and computational aspects of an operational definition of meaning as Osgood originally conceptualized it, with some alterations in the mathematical basis. The resulting dependent measures were 10 scores for each subject: four factor scores and a distance measure, both pre and post. With the data in this form it was possible to systematically address the questions and research hypotheses concerning change in concept meaning.

Hypotheses and Tests: Meaning Change

Two of the questions stated in the introduction to this chapter form the basis for research hypotheses concerning the changes which take place in the meanings of instructional and noninstructional concepts while a teacher proceeds through undergraduate preparation or, in this case, the first course in the professional training sequence. Analysis of such changes is, of course, conditioned on having variable measures which are comparable both at the beginning and at the end of instruction. This implies stable factor structure. The stability of the factor structure in the data reported here will be described in detail in the next chapter. Briefly, however, a very high degree of stability was found from pretest to posttest observations. It was possible, therefore, to proceed to the examination of changes of concept meaning within that invariant structure. This examination revolved around two research hypotheses, with the second testable only given an affirming result on the first.

Research Hypothesis #3

There will be a statistically significant change in the meaning of instructionally relevant concepts from pretest to posttest which will not be reproduced in the noninstructional concepts.

The test implied in this hypothesis was carried out separately for each of the orthogonal factor scores and the distance measure. Each was a one-factor, one-level design with the change in factor score (post minus pre) for each of the 11 concepts as multiple dependent measures on the same subject. The statistical test of the null hypothesis

$$H_0\colon \left[ \mu_1 \; \mu_2 \; \cdots \; \mu_{11} \right]' = \left[ 0 \; 0 \; \cdots \; 0 \right]'$$

that is, that the vector of mean changes does not differ from a vector of zeroes, was carried out with a multivariate analysis of variance. Evidence that a significant change had in fact taken place would be a multivariate F-ratio which is improbable assuming the null hypothesis. Given a significant overall change, the contribution of the changes in individual concepts (i.e., instructional and noninstructional) to the overall change is assessed primarily by means of a step down F and the individual univariate F-ratios, which test the corresponding univariate null hypotheses of no mean change for each concept. Given a significant change in meaning on any one factor, it becomes possible to look more deeply for the influences which brought about such a change. This led to the second research hypothesis concerning change.

Research Hypothesis #4

The instructor with whom the student was associated played a critical role in the changes in meaning which were manifested, and this role will be demonstrated by differences in the changes for different instructors.

Once again, the results of the overall change are presented in Chapter IV. But briefly, there were significant changes in meaning for instructional, but not noninstructional, concepts in three of the four factors and the distance measure. Therefore, an analysis of instructor influence was indicated.
This was carried out by means of a one-way fixed effects multivariate analysis of variance with instructor as the independent variable and three dependent variables per subject. The three were the average changes in (1) IPL concepts, (2) carrel concepts, and (3) noninstructional concepts. The null hypothesis was

$$H_0\colon \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \cdots = \boldsymbol{\mu}_{19}$$

where each $\boldsymbol{\mu}_j$ is the vector of the three mean changes for the students of instructor j; if it is rejected, there is evidence of differential changes across instructors. Once again the step down and univariate F's provide information as to the location of the most important contributions to the differences. This analysis, given a significant main effect, can be followed by post hoc tests to determine more precisely the nature of the instructor influence.

Conclusion

These four research hypotheses were subjected to tests based on the designs described. They represent all but two of the questions posed earlier. The remaining questions will remain unconsidered until the discussion of results is presented, since neither requires a statistical analysis to determine the answer. The answer to the first can be implied from the results, but must ultimately be answered by those responsible for instruction: Are the meaning changes and the final meanings assigned to concepts after instruction appropriate for education majors given the course objectives? The second unanswered question, which speaks to the issue of the psychometric generalizability of the SD technique, calls on the creative act of generating additional applications: Are there other relevant hypotheses on which this technique might be brought to bear? Before presenting this discussion, however, the results of the tests of factor structure, factor invariance, and meaning change will be presented.

CHAPTER IV

RESULTS

Introduction

By way of orientation, this chapter will present results in the same order as the hypotheses were presented in the previous chapter. The final decision as to the appropriate factor structure was made as a result of a progression of factor analyses summarizing the data in various ways. These will be described in sequence. The decision concerning the invariance or stability of the factor structure over time was also made as a result of a series of ever more stringent statistical tests. These will be presented in sequence. The tests of hypotheses concerning meaning change will be presented via tables of summary data and MANOVA tables. These will be concise presentations followed by a concise summary of the various decisions.

Scale Interrelationships and Factor Analysis

A complete exploration of the interrelationships of the scales began with a separate maximum likelihood factor analysis of the responses to each concept, both pre and post. The purpose of each separate analysis was to test the hypothesis that a three-factor solution was the best explanation of the responses. Table 3 reports the results of these tests, and they clearly demonstrate that the research hypothesis of the three-factor solution is inappropriate. However, the analysis need not stop at that point, because these analyses contain information concerning the factor model which is most appropriate. Note in Figures 3-13 that in nearly every case the five- or six-factor model explains enough of the response variance to yield a good fit using Lawley's Chi Square test.
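The goodness-of-fit statistic behind Table 3 and Figures 3-13 can be written compactly. The function below is a schematic of the likelihood-ratio test of fit for a k-factor maximum likelihood solution, following common textbook conventions (including Bartlett's correction factor); its inputs are hypothetical, and it is not the routine used in the original analysis.

```python
import numpy as np
from scipy import stats

def lawley_chi2(S, Lam, Psi, n):
    """Likelihood-ratio test of fit for a k-factor ML solution.

    S   : p x p sample covariance (or correlation) matrix
    Lam : p x k matrix of fitted factor loadings
    Psi : length-p vector of fitted unique variances
    n   : number of observations
    """
    p, k = Lam.shape
    Sigma = Lam @ Lam.T + np.diag(Psi)              # implied covariance
    stat = (np.log(np.linalg.det(Sigma)) - np.log(np.linalg.det(S))
            + np.trace(S @ np.linalg.inv(Sigma)) - p)
    stat *= n - 1 - (2 * p + 5) / 6 - 2 * k / 3     # Bartlett's correction
    df = int(((p - k) ** 2 - (p + k)) / 2)
    return stat, df, stats.chi2.sf(stat, df)

# For the p = 15 scales and k = 3 factors of Table 3,
# df = ((15 - 3)**2 - (15 + 3)) / 2 = 63, matching the table note.
```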
Table 3.--Results of the Chi Square test of goodness of fit of the three-factor model to the variance-covariance matrix of responses to each concept for both pretest and posttest observations.(a)

Concept                              Pretest x2    Posttest x2
MYSELF AS A TEACHER                     93.5          81.8(b)
NONVERBAL BEHAVIOR                     154.4         136.9
QUESTIONING AND LISTENING SKILLS        99.4         106.5
MYSELF                                 152.9         166.7
BEHAVIORAL OBJECTIVES                  134.2         170.2
REINFORCEMENT                          135.7         152.1
RESPONDENT LEARNING                    219.2         144.5
SHAPING                                236.3         139.6
PHYSICIAN                              132.1          78.1(b)
RELIGION                               157.7         261.5

(a) The null hypothesis that the model fits the data is rejected at x2 >= 82.22, where a = .05 and df = 63.
(b) Three-factor model fits.

It was concluded at this point that if these five or six factors were clear from a rational (psychological) standpoint, they could serve as the basis for the tests of factor structure invariance.

[Figure 3.--Fit of the factor solutions for the concept Myself as a Teacher: size of the latent roots associated with each factor for the Scree Test (best fit designated S on the abscissa); Chi Square associated with each model for the test of fit (best fit designated G on the abscissa).]

[Figure 4.--Combined data assessing fit of factor models for the concept Nonverbal Behavior; Scree Test best fit designated S, Chi Square best fit designated G on the abscissa.]

[Figure 5.--Fit of the factor solutions for the concept Myself; best fit according to the Scree Test designated S, best fit by the Lawley x2 criterion designated G on the abscissa.]

[Figure 6.--Fit of the factor solutions for the concept Questioning and Listening Skills; Scree Test best fit designated S, Chi Square best fit designated G.]

[Figure 7.--Fit of factor models for the concept Behavioral Objectives; Scree Test best fit designated S, Chi Square best fit designated G on the abscissa.]

[Figure 8.--Fit of factor models for the concept Reinforcement; Scree Test best fit designated S, Chi Square best fit designated G.]

[Figure 9.--Fit of factor solutions for the concept Respondent Learning; best fit via the Scree Test designated S; best fit via the x2 test designated Gpre for pretest and Gpost for posttest results.]
[Figure 10.--Fit of factor solutions for the concept Shaping; Scree Test best fit designated S on the abscissa, Chi Square best fit designated G.]

[Figure 11.--Fit of factor solutions for the concept Physician; best fit using the Scree Test and Chi Square test designated S, Gpre, and Gpost respectively on the abscissa.]

[Figure 12.--Fit of factor models for the concept Religion; S and G on the abscissa indicate best fit by the Scree and x2 tests of goodness of fit respectively.]

[Figure 13.--Fit of factor solutions for the concept Marijuana; S and G on the abscissa indicate best fit by the Scree and goodness of fit tests.]

However, examination of the factor loadings of the various models for the concepts revealed little that made psychological sense around the five- or six-factor levels of analysis. At about the four-factor level there were indications of what seemed to be a common thread of factor structure but, beyond that, single scales started separating themselves out as factors. If these had been the same scales over numerous concepts, it would have been feasible to acknowledge these as inefficient yet important factors and to let them stand. However, this was not the case, since different scales seemed to spin off for different concepts and at different times. Consequently, the fifth or sixth factor and beyond were of little use in obtaining a simple and parsimonious explanation of the data.

A further exploration of the data (see Figures 3 through 13) by means of a scree test, or graphing the latent roots and noting the largest of these, as suggested by Cattell (1966), demonstrated that in most cases there were four large latent roots and the remainder were unimportant. This further supported the hypothesis (initiated by looking at the factor loadings) that a four-factor solution was the best explanation.

The disagreement between these two tests indicated that it would be advisable to go back and look very closely at the factor loadings to determine the exact nature of the statistically unclear latent constructs.* On close inspection, it was apparent that, in nearly all of the four-factor solutions (16 out of 22), it was possible to identify a factor characterized by large loadings on these scales: worthless-valuable, bad-good, negative-positive, and unimportant-important. This makes immediate sense in terms of Osgood's traditional evaluative factor. It was also apparent that the scales unpleasant-pleasant and unenjoyable-enjoyable were highly intercorrelated and were being separated as a second factor of an evaluative sort, but one reflecting more of a personal dimension as opposed to the "other" oriented traditional Osgood evaluative dimension described above. A third group of scales which loaded highly together were severe-lenient, tense-relaxed, and violent-gentle.

*Appendix D contains the complete factor loading matrices for each of the 11 concepts (pre and post) for the four-factor solution and the solution of best fit by Lawley's criterion.
For clarity of discussion this was termed a leniency dimension, though it bore some similarity to a factor Wittrock (1964) and Husek and Wittrock (1963) termed a tenacity dimension of meaning. Finally, the three scales weak-powerful, passive-active, and still-moving were apparently tapping a fourth latent construct revolving around the potency of the concept being rated. The remaining three scales, as yet unaccounted for, were unfair-fair, insensitive-sensitive, and uninteresting-interesting. There were some indications that these also fit into the pattern of interrelationships described above: unfair-fair into the other-oriented evaluative dimension; insensitive-sensitive into the leniency dimension; and uninteresting-interesting into the personal evaluative factor. However, these were by no means conclusive indications.

The analysis to this point made two things quite apparent: (1) There was a high degree of consistency in the statistical results, though they were unclear from a criterion point of view as to which factor model was most appropriate. (2) There was a relatively high degree of rational consistency and interpretability in the four-factor solution. Consequently, in search of a means of filtering out the noise that was inhibiting a clear reception of the relevant latent constructs, two additional data manipulations were carried out.

The first was to take one concept and eliminate the three scales (listed above) which had no clear place in the apparent psychological structure. The intent was not to test the possibility of totally eliminating these variables from the analysis, because this would have been an undue sacrifice of a great deal of data. Rather, the purpose was to take a closer look at the factor structure with some noise removed, in the hope this would yield a clearer outline of the structural elements. This served the purpose well, as reported in Figure 14: the results of the factor analysis of the 12x12 variance-covariance matrix demonstrated that, by both statistical criteria (the scree test of the size of the latent roots and the x2 test of goodness of fit), the four-factor solution was correct. Examination of the factor loadings also supported the psychological interpretation presented above.

[Figure 14.--Fit of factor solutions for the concept Behavioral Objectives employing 12 of the 15 scales: size of the latent roots for the Scree Test (best fit designated S on the abscissa); Chi Square for each model (best fit designated G).]

The next manipulation performed was to pool the responses over concepts on the pretest and on the posttest and to reanalyze the pooled variance-covariance matrices. This was done for three reasons: (1) it was most difficult to deal with the idiosyncracies of 22 separate factor analyses considering 9 to 10 factor models for each; (2) there was hope that clarity would be gained through errors canceling themselves out over a larger number of responses; and (3) there was a high degree of consistency in the factor patterns across concepts. The results of these two factor analyses are reported in Figure 15, along with the factor pattern matrices in Table 4.
These data provided some support for the four-factor model, some additional insights, and some disappointments. In short, they provided some good news and some bad news.

It was disappointing to find that the errors did not cancel out--they accumulated, thus increasing the discrepancy between the two statistical tests. The four-factor solution was appropriate, once again, with the scree test as the criterion. However, by summing over 2,772 rather than 252 responses, what was accumulating was not noise which would be canceled out, but true variance.

[Figure 15.--Fit of factor solutions for the pretest and posttest results pooled over concepts: size of the latent roots for the Scree Test (best fit designated S); Chi Square for each model (best fit designated G).]

[Table 4.--Rotated factor matrices for the four-factor solution of the pooled correlation matrices, pretest and posttest: loadings of the 15 bipolar scales on factors I-IV, with proportions of variance explained.]

Thus a larger number of factors was required to explain enough of the variance in responses for a statistical fit by the x2 criterion to be achieved. This led, of course, to a loss rather than a gain in clarity. However (now the good news), it is interesting to note that an average of 77 per cent of the common variance is explained by the four-factor solution (see Table 4). If one proceeds beyond this point to the eight-factor solution, it becomes apparent that, in order to find an adequate fit, one must explain approximately 89 per cent* of the common variance. It can be concluded from this that the additional four factors extracted are each explaining a relatively inefficient 2 per cent of the variance, while each of the first four is explaining a minimum of twice that amount. The x2 test must therefore be considered a very conservative test of fit. In addition to this, a close examination of the factor pattern matrices (Table 4) by the reader will reveal the nature of the scale interrelationships with respect to the psychological model presented earlier.

*This figure was arrived at from the results in Figure 15. The amount of common variance explained can be computed by summing the latent roots up to a certain factor model and dividing by the total sum of latent roots. For the four-factor model in Table 4 this proportion is .783 for the pretest and .757 for the posttest. In the eight-factor case, where a "good fit" is achieved, the proportion is .89.
Note prior to this examination, however, that the random reversal of scales which was described in the instrument development section of Chapter III has sometimes rendered scales representing the same factor opposite in direction, thus resulting in factor loadings with opposite signs. The scales are listed in that table as they appeared in the instrument.

Having carried out this extensive analysis of the interrelationships of the bipolar scales through numerous factor analyses, the decision was made to proceed to the test of factor structure stability with the four-factor solution, where the factors were interpreted to represent the following latent constructs or dimensions of meaning of the concepts to the respondent:

Evaluative. Characteristic scales: worthless-valuable, bad-good, negative-positive, unimportant-important. This dimension represents the respondent's favorable or unfavorable reaction to the concept in terms of how it affects others.

Personal Evaluative. Characteristic scales: unenjoyable-enjoyable, unpleasant-pleasant, (uninteresting-interesting). This dimension reflects the respondent's favorable or unfavorable reaction in terms of his own personal values, or from the point of view of how the concept affects him.

Leniency. Characteristic scales: severe-lenient, tense-relaxed, violent-gentle, (insensitive-sensitive), (unfair-fair). These scales reflect the meaning of the concept from a flexibility point of view, or on an open vs. closed dimension.

Potency. Characteristic scales: active-passive, still-moving, weak-powerful. This factor represents a reflection of the manifest or observable power and activity potential of each concept rated.

Evidence in support of this decision can be drawn from three sources. The most important of these is the consistency and intuitive appeal of the factors from a rational point of view; that is, they seem to make sense. In addition, they consistently satisfy one of the two statistical criteria for the appropriate factor model, though this was the more liberal of the two. Finally, one must consider their ability to consistently explain a large proportion of the common variance. The evidence counter to this decision comes primarily from the failure of this model to explain a sufficient amount of the variance to be considered appropriate from the point of view of the second mathematical criterion. This is a serious limitation, since this was established as the criterion of choice at the outset.

Overall, however, the weight of evidence would appear to support this decision, given that the decision is made with the full realization of the limitations of the data. The nature of the data and its limitations are not perfectly clear and could yield erroneous conclusions. The temptation to overgeneralize or to place too much store in these factors is to be avoided, and cautious interpretation is called for. With this in mind, the next step in the analysis was carried out--that of assessing the stability of the factor structure over time.
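The two statistical criteria used throughout this progression--the scree inspection of the latent roots and the proportion of common variance explained--can both be computed from the eigenvalues of the scale correlation matrix, as the following sketch illustrates. The data here are randomly generated stand-ins, not the study's responses.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 15 x 15 scale correlation matrix, computed here from
# randomly generated data purely for illustration.
X = rng.standard_normal((252, 15))
R = np.corrcoef(X, rowvar=False)

# Latent roots (eigenvalues), largest first, as graphed in a scree test.
roots = np.sort(np.linalg.eigvalsh(R))[::-1]

# Cumulative proportion of variance explained by the first m factors:
# the sum of the first m latent roots over the total sum.
cum_prop = np.cumsum(roots) / np.sum(roots)
for m in (3, 4, 8):
    print(f"{m} factors explain {cum_prop[m - 1]:.3f} of the variance")
```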
Tests of Factor Structure Invariance

The results of four of the five progressively more stringent tests of factor structure stability of the four-dimension model are reported in Table 5. Each of these tests employed progressively more of the information contained in the factor loadings of the overall pretest and posttest factor pattern matrices reported in Table 4 to assess the degree of stability of comparable factor loadings for each factor. The nominal scale test, based only on the sign of the factor loadings and the trichotomization of the actual factor loading value, demonstrates a consistent rejection of the null hypothesis that the factor loadings are independent. The ordinal scale test, based on the ranking of factor loadings according to size or salience, also reveals total rejection of the hypothesis of no relationship between the loadings of comparable factors. Assuming that factor loadings have the quality of equal intervals among values, it is apparent that a very high degree of relationship exists between comparable factors, as seen in the product-moment correlation coefficients. The ratio scale test, as defined by Burt, also shows strong relationships among factors pre and post. However, a statistical basis for this decision apparatus (based on a sampling distribution of known qualities) does not exist. But such a test would, at least in this case, be an unnecessary tour de force, given the high degree of factor stability reflected in the other tests.

[Table 5.--Results of all statistical tests assessing the degree of factor invariance over time when comparing the pooled (over concepts) pretest and posttest factor solutions; one test is reported for each assumed level of measurement (nominal, ordinal, interval, ratio).]

The results of the fifth and final test of the invariance of factor structure, the analysis of covariance structure (ANCOVST), could not be completed due to the repeated failure of the computational package employed. The primary advantage of this test would have been its ability to consider the entire factor model from the pretest data and compare it with the variance-covariance matrix of the posttest data from the point of view of fit. In the ANCOVST case, the factor model from one set of data is compared with the criterion variance-covariance matrix from a separate set of data to assess the fit.

Even in the absence of any concrete data, however, it is possible to draw some tentative conclusions about what the results would have been. In most cases where the pretest four-factor model for any concept or for the pooled data had been compared with its posttest variance-covariance counterpart, it is safe to conclude that a lack of fit would have been demonstrated.
But, just as with the factor analyses themselves, the reason for the lack of fit would have been the minor, undefinable, and undiscoverable additional factors which have tended to prevent a clear definition of scale interrelationships throughout this analysis. The essence of this argument is that a specified factor structure which fails to explain a sufficient amount of its own variance to be considered an adequate fit can hardly be expected to explain enough of the variance of a separate and independent set of observations to be considered an adequate fit. In this case, the reason for the lack of fit would once again have been the high degree of precision required by the statistical test.

Therefore, based on the results of the four tests of invariance that were carried out, and tempered by the expected results of the fifth test had it been completed, it was decided that the four-factor solution was stable, or that the frame of reference within which students rated the concepts remained constant over time. This, of course, led to a comparison of comparable variables at the beginning and at the termination of instruction to determine the nature of changing meaning within this constant frame of reference.

Test of Change in Concept Meaning

The factor standard scores were computed and transformed so as to approximate the original scale, as described in the previous chapter. The results of these computations were ten scores for each subject on each concept: four orthogonal factor scores and a composite measure of meaning intensity, for both pretest and posttest observations. These can then be transformed by subtracting pre from post to indicate change. All results will appear in this form, without further consideration of individual scales.

According to the first hypothesis, which assessed the change in meaning of individual concepts, the following decisions are possible for each factor: either there is an overall change in mean for each factor (tested separately) or there is not; if there is a significant overall change, either an individual concept contributes to this or it does not. The results of each of these factor score change tests appear in Tables 6 through 10. It is apparent from these tables that there was a significant overall change in three of the four factors and that changes in individual concepts can, therefore, be considered for these three.

Table 6, representing changes on the evaluative dimension, reveals that the vector of mean differences differs significantly from a vector of zeroes, thus reflecting significant change in that dimension (F = 28.4324, p < .000). The column of mean changes shows that the greatest changes took place for the most part in the carrel concepts, with Shaping, Respondent Learning, and Behavioral Objectives showing the most extensive changes. Note that the changes have positive signs, indicating a change in meaning from less favorable to more favorable. The contributions of each concept to the significant overall change are represented in the step down F-ratios. Reading up from the bottom of that column, it is apparent that the first significant F is associated with the concept Shaping. All F-ratios above that point must be considered significant and the three below insignificant. This indicates that the instructionally relevant concepts did, in fact, change in meaning, while the noninstructional concepts remained relatively unchanged in evaluative meaning. However, there were two instructional concepts which changed little in meaning, as demonstrated by their small univariate and step down F's. These were Questioning and Listening Skills and Reinforcement.
[Table 6.--Multivariate analysis of variance results of change in the evaluative dimension: mean change, univariate and step down F-ratios with probabilities for each of the 11 concepts; overall change F = 28.4324.]

[Table 7.--Multivariate analysis of variance results of change in the potency dimension: mean change, univariate and step down F-ratios with probabilities for each of the 11 concepts; overall change F = 8.2354.]

[Table 8.--Multivariate analysis of variance results of change in the leniency dimension: mean change, univariate and step down F-ratios with probabilities for each of the 11 concepts.]

[Table 9.--Multivariate analysis of variance results of change in the personal evaluative dimension: mean change, univariate and step down F-ratios with probabilities for each of the 11 concepts.]

[Table 10.--Multivariate analysis of variance results of change in the overall meaning intensity: mean change, univariate and step down F-ratios with probabilities for each of the 11 concepts.]
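Before turning to the individual dimensions, the single-group multivariate test underlying Tables 6 through 10 can be sketched as follows. The change scores here are randomly generated stand-ins; the null hypothesis that the vector of mean changes is zero is expressed through Hotelling's T2, which is equivalent to the single-sample multivariate analysis of variance used, with univariate follow-ups for each concept.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical change scores (posttest minus pretest) on one meaning
# dimension: 252 subjects x 11 concepts, random stand-ins only.
D = rng.standard_normal((252, 11)) + 0.15

n, p = D.shape
dbar = D.mean(axis=0)              # vector of mean changes
S = np.cov(D, rowvar=False)        # covariance of the change scores

# Hotelling's T2 test of H0: the mean-change vector is all zeroes.
T2 = n * dbar @ np.linalg.inv(S) @ dbar
F = T2 * (n - p) / (p * (n - 1))   # exact F transformation
p_val = stats.f.sf(F, p, n - p)
print(f"multivariate F({p}, {n - p}) = {F:.3f}, p = {p_val:.4f}")

# Univariate follow-ups, one per concept (H0: mean change = 0).
t_stats, p_uni = stats.ttest_1samp(D, 0.0)
```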
Table 7 presents the results of the analysis of change in the second orthogonal dimension, potency. Once again there is a significant overall change (F = 8.2354, p < .0001) and all changes have positive signs, reflecting a change from less potent to more potent. The high degree of similarity to the change patterns in the evaluative dimension is also reflected in the large changes in the carrel concepts relative to the others and in the significant change in the instructional concepts but not in the noninstructional concepts.

This same pattern is again reproduced in Table 8, which reports the results of the change analysis for the leniency dimension, but there are some important differences to be noted. The vector of mean differences does depart significantly from zero, but fewer individual concepts make important contributions to this difference. Once again, reading up from the bottom of the step down F's, it is apparent that the noninstructional concepts did not change in meaning. However, in this case it is also apparent that only two of the four carrel concepts (Shaping and Respondent Learning) and two of the four IPL concepts (Myself as a Teacher and Questioning and Listening Skills) contributed significantly to the overall change. The column of mean changes in this case reflects changes to a meaning which is more "lenient" in nature, since positive signs predominate. The two negative values reveal relatively minor changes toward a more "severe" meaning.

In Table 9 the results of the analysis of change in the personal evaluative dimension are reported. In this case there was no significant departure from zero; thus no change took place here. This means that the concepts were not seen as being any more enjoyable or pleasant after instruction than they were before instruction.

Finally, in Table 10 the results of the analysis of change in overall meaning intensity are reported. Given significant changes in three of the four orthogonal dimensions of meaning reported above, this analysis was academic, because this score is a composite of the others. The primary reason for the execution of this phase was to determine where the largest changes in overall "meaningfulness" occurred. The reader will recall that, as a result of the mathematical computation of this value, a positive change would reflect movement away from the origin of meaning where the axes intersect, thus indicating a greater intensity of meaning, or more meaningfulness. Examination of the column of mean changes reveals that nearly all of the instructional concepts were seen as more meaningful after instruction, while the three noninstructional concepts showed little change, and two changed in a less meaningful direction. Further examination of this column reveals that two of the eight instructional concepts tended to change little in meaning intensity. These were the concepts Questioning and Listening Skills and Reinforcement. The largest change in overall meaning was in the concept Shaping, followed by Respondent Learning.
The remaining concepts, in order of magnitude of change from highest to lowest, were Behavioral Objectives, Nonverbal Behavior, Myself as a Teacher, and Myself. It is apparent that the largest changes occurred in the carrel, or more technical, concepts.

The results of this analysis of change in meaning provide a fairly clear sketch of the alterations in meaning that occurred during the period when instruction was being carried out. However, the analysis as it has been described up to this point provides only part of the complete picture which can be drawn from the data. It says little about the actual meaning ascribed to the concepts by the respondents. In order to complete the picture, mean factor scores on each of the four dimensions of meaning for each concept were computed for pretest and posttest responses. These means are reported in Table 11. In order to facilitate interpretation of these values, a three-dimensional Euclidean space was constructed to represent only the three dimensions in which a significant change in meaning was detected. It was then possible to place each of the concepts as points in this space. This graphic representation appears in Figures 16 through 18.

[Table 11.--Factor score means for each of the four meaning dimensions for each concept, pretest and posttest.]

The IPL, carrel, and noninstructional concepts are reported on separate sets of axes for clarity. It is apparent from these figures that each concept from the instructional categories is seen as favorable, lenient (gentle), and potent, and that these reactions tend to intensify during instruction. In addition, there is an apparent cluster of instructionally relevant concepts along a value of just less than one on the leniency scale. This cluster includes the following concepts: Nonverbal Behavior, Questioning and Listening Skills, Behavioral Objectives, Reinforcement, Respondent Learning, and Shaping. A second cluster appears further out on the leniency scale, which includes Myself and Myself as a Teacher. The noninstructional concepts have a large variance in meaning between them, as might be expected because of differences in their nature. The single departure from the lenient-favorable-potent response pattern is seen in the noninstructional concept Marijuana, which is portrayed as unfavorable (bad) in meaning.
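The placement of concepts in Figures 16 through 18 follows directly from the mean factor scores of Table 11. A minimal sketch, using invented means for a single concept on the three dimensions that changed, shows how the pre-to-post displacement and the intensification of meaning are obtained.

```python
import numpy as np

# Hypothetical mean factor scores for one instructional concept on the
# three dimensions that changed (evaluative, leniency, potency).
pre  = np.array([1.0, 0.9, 1.2])
post = np.array([1.7, 1.0, 1.6])

# Each concept is a point in the three-dimensional space of Figures
# 16-18; the pre-to-post shift is simply the displacement vector.
shift = post - pre
print("displacement:", shift)
print("intensification:",
      round(float(np.linalg.norm(post) - np.linalg.norm(pre)), 3))
```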
Instructor Role in Meaning Change

The final research hypothesis to be tested by statistical means anticipated that there would be differential changes in meaning depending on the instructor with whom the student was associated during the course. In order to test this hypothesis, two analyses of different portions of the data were carried out.

[Figure 16.--Graphic representation of pretest and posttest meaning assigned to IPL concepts (1. myself as a teacher; 2. nonverbal behavior; 3. questioning and listening skills; 4. myself) on the favorable, lenient, and potent axes.]

[Figure 17.--Graphic representation of pretest and posttest meaning assigned to carrel concepts (1. behavioral objectives; 2. reinforcement; 3. respondent learning; 4. shaping).]

[Figure 18.--Graphic representation of pretest and posttest meaning assigned to noninstructional concepts (1. physician; 2. religion; 3. marijuana).]

First, in order to establish the base line on which to plot the changes by instructor, differences between instructors on the pretest responses were examined. In this analysis, instructor, with 19 levels, was the independent variable, and there were three dependent measures: the average rating of carrel concepts for each student, the average rating of IPL concepts, and the average rating of noninstructional concepts. A one-way multivariate analysis of variance was carried out for each orthogonal dimension of meaning where a significant overall change in meaning had been detected. The results of this test, reported in the left hand columns of Table 12, indicate that there were no differences in initial ratings of the concepts by students studying with different instructors.

In order to assess differential change in meaning, a similar analysis was carried out with instructor as the independent variable, but with the three dependent measures redefined as the difference between the average carrel, IPL, and noninstructional ratings for each subject on each of the three orthogonal dimensions where significant change had occurred. Once again there was a negative result, as demonstrated in Table 12. This was a firm indication that there were no differences in the amount or direction of change in meaning for students studying with different instructors. The mean pretest ratings and changes for different instructors can be used to clearly represent the nature of this result.
Table 12.--Results of MANOVA of instructor differences in pretest meaning and change in meaning on the three dimensions where a significant change was detected.(a)

Evaluative Dimension
  Pretest: Multivariate F = 1.3248, p < .0643
                                 Univariate F    p <    Step-down F    p <
    IPL Concepts                     1.0274     .4297      1.0274     .4297
    Carrel Concepts                  1.9881     .0112      1.8386     .1221
    Noninstructional Concepts        1.3084     .1833      1.1324     .3217
  Change: Multivariate F = 1.0180, p < .4418
    IPL Concepts                      .6556     .8522       .6556     .8522
    Carrel Concepts                  1.4406     .1139      1.4229     .1217
    Noninstructional Concepts        1.0577     .3968       .9935     .4681

Leniency Dimension
  Pretest: Multivariate F = .9290, p < .6207
    IPL Concepts                     1.0961     .3570      1.0961     .3570
    Carrel Concepts                  1.0356     .4207       .8288     .6654
    Noninstructional Concepts         .9880     .4744       .8725     .6126
  Change: Multivariate F = .9265, p < .6257
    IPL Concepts                      .4398     .9779       .4398     .9779
    Carrel Concepts                  1.1185     .3349      1.1098     .2698
    Noninstructional Concepts        1.0288     .4281      1.1675     .2897

Potency Dimension
  Pretest: Multivariate F = 1.0457, p < .3891
    IPL Concepts                      .8950     .5852       .8950     .5852
    Carrel Concepts                   .7549     .7513       .8964     .5835
    Noninstructional Concepts        1.2844     .1989      1.3580     .1542
  Change: Multivariate F = 1.0366, p < .4062
    IPL Concepts                     1.1179     .3355      1.1179     .3355
    Carrel Concepts                   .8861     .5960      1.0751     .3785
    Noninstructional Concepts        1.0810     .3740       .9271     .5464

(a) Dependent variables were averages over the carrel, IPL, and noninstructional concepts for each subject.

A sample of these means has been plotted in Figure 19.

[Figure 19.--Sample of data representing the role of the instructor in meaning change. Five of the 19 instructors, selected at random, are graphed with respect to the average rating of their students on the IPL and carrel concepts on the evaluative dimension. The plot is not recoverable from the scan.]

Two major points can be made from this representation. First, this randomly selected group of 5 of the 19 instructors began the term with students who had very similar reactions to the concepts on the evaluative dimension, as demonstrated by the narrow ranges of means (1.05 to 1.37 and .91 to 1.42). Second, the parallel nature of the lines indicating change reflects the high degree of similarity in the changes which took place among students associated with different instructors. Though this figure represents only a subset of the data, the results reported in Table 12 indicate that graphic representations of the remaining data would be very similar to that pictured above.
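The summary behind a figure of this kind amounts to per-instructor pretest means and mean change scores. The following is a minimal sketch in Python on simulated data with hypothetical column names; roughly equal mean changes across instructors produce the parallel lines described above.

import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 252
df = pd.DataFrame({
    "instructor": rng.integers(1, 20, size=n),
    "ipl_pre":  rng.normal(1.2, 0.4, size=n),
    "ipl_post": rng.normal(1.5, 0.4, size=n),
})
df["ipl_change"] = df["ipl_post"] - df["ipl_pre"]

# Per-instructor pretest means and mean changes, the quantities plotted
# pre-to-post for each instructor in Figure 19.
print(df.groupby("instructor")[["ipl_pre", "ipl_change"]].mean().round(2))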
Given this negative result, it was decided that the data might be more closely scrutinized from the point of view of instructor differences by looking at these differences on individual concept ratings. The prior analysis was based on a composite (mean) of all of a subject's responses to the carrel concepts, for example, not his individual responses to each concept in that category. It must be recognized at this point that the ideal follow-up to the negative findings of the prior analysis would have been to seek instructor differences in the change in meaning of each individual concept on each of the meaning dimensions. However, since this would have required 44 separate analyses (4 dimensions by 11 concepts), the decision was made to assess the degree of variation among instructors further on concepts where differences would be most clear, while still attempting to represent each concept category and meaning dimension. For this reason, nine ANOVAs of change scores were carried out with instructor as the independent variable and changes in the meanings of the concepts and dimensions listed in Table 13 as the dependent variables. Once again, however, there were no differences among instructors, as indicated by the F-ratios and their probabilities of occurrence.

Table 13.--Results of analyses of variance of change in meaning over instructors for individual concepts (not averaged within concept category).

Dimension     Concept                              Univariate F    p <
Evaluative    Nonverbal Behavior                       1.2993     .1891
              Shaping                                  1.4718     .1012
              Physician                                1.6773     .0443
Leniency      Questioning and Listening Skills          .6844     .8253
              Shaping                                   .5998     .8981
              Religion                                  .5143     .9501
Potency       Myself as a Teacher                      1.2228     .2436
              Behavioral Objectives                     .6311     .8735
              Physician                                1.0715     .3822

Conclusion

The major results of this attempt to clarify the nature of the alterations in connotative meaning which take place during the period of teacher training defined above can be summarized in terms of the three major research hypotheses tested. A three-factor solution to the SD ratings gathered is not the most appropriate factorial model; a fourth factor must be added to gain a parsimonious explanation of the data. The factors are not the traditional evaluative, potency, and activity dimensions of meaning which have characterized SD research, though there are indications of part of that structure. The four-factor model demonstrated a high degree of stability over time, where the interval between measures was about nine weeks. However, there were significant changes in concept meaning within this stable meaning framework. The largest of these changes were associated with the more technical concepts among the 11 rated, and the smallest were associated with the noninstructional concepts. For the most part, changes in the noninstructional concepts were chance occurrences. Finally, it was apparent that the instructors played little differential role in the changes observed among their respective students. Given these results, some discussion and speculation seem warranted. This will be presented in the fifth and final chapter and will revolve around the statistical tests which sought answers to three of the questions posed, as well as the questions which did not lead directly to research hypotheses.

CHAPTER V

DISCUSSION

Introduction

The common thread throughout this dissertation has been the set of questions of interest which were posed at the outset. Once again these can serve as the framework for discussion, but since the analysis is complete and the results are now available, it is possible to formulate direct answers to each. This will comprise much of the discussion in this chapter. Then, in order to bring all of the threads of discussion together near the starting point, some conclusions will be drawn concerning the potential of semantic differential research in the systematic and scholarly evaluation of instruction in teacher training.

Scale Interrelationships and Factor Analysis

The purpose of the series of algebraic manipulations termed factor analysis is to examine and draw conclusions about the interrelationships among a large number of variables as concisely and simply as possible.
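The raw material of such an analysis is the matrix of scale intercorrelations. A minimal sketch in Python, on simulated ratings rather than the study's data, shows the reduction at issue; the eigenvalues computed here are also the values examined by the scree test discussed later in this chapter.

import numpy as np

rng = np.random.default_rng(3)
n_subjects, n_scales = 252, 14
ratings = rng.integers(1, 8, size=(n_subjects, n_scales)).astype(float)

# Intercorrelations among the bipolar scales ...
R = np.corrcoef(ratings, rowvar=False)

# ... summarized by the eigenvalues of R. For real ratings, a few large
# eigenvalues followed by a flat "scree" suggest how many factors are
# worth retaining; for this random stand-in all values hover near 1.
eigenvalues = np.linalg.eigvalsh(R)[::-1]
print(np.round(eigenvalues, 2))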
Though many procedures have been developed for performing factor analysis (see Cattell, 1966), each is aimed at the major purpose stated above. More specifically, according to Thurstone (1947) and Nunnally (1967), there are two primary uses of factor analysis: (1) to test hypotheses about the existence of certain latent constructs (factors) and/or (2) to reduce data into an interpretable number of distinguishable primary factors. The "and/or" conjunction in the last sentence is critical in discussing the results of this research, because it was necessary to use both of these in combination.

The first of the two primary uses assumes that the researcher has sufficient reason to expect that a certain latent structure should be manifested in his data. The actual structure may then be compared with the anticipated structure to test the truth or falsity of the hypothesis. This was precisely the case in the research presented here. It was hypothesized that a three-factor structure (evaluative, potency, and activity) would explain the responses, a prediction based on extensive semantic differential research and on careful scale selection to tap those dimensions. However, the latent structure which generally characterizes responses to concepts from the general domain of meaning did not seem to apply to the rating of educational concepts. Therefore, the answer to the first question posed, "Does the EPA structure fit?", was no. There was another, more adequate explanation to be discovered.

Before proceeding further in the discussion of the scale interrelationships, some comments about the hypothesis of EPA structure seem warranted. In the original research which reports this structure, Osgood and his research associates (discussed in Chapter II) never imply that these three dimensions are the entire basis for an individual's reaction to concepts being rated on bipolar scales. The EPA dimensions were considered the primary dimensions, and the researchers were willing to acknowledge that there were other, less important dimensions which their efforts had not been able to define clearly. The "primary" nature of these dimensions can be stated in statistical terms, and doing so will serve to clarify the point being made here. As Heise (1969) points out in his review of the methodological SD research, the evaluative dimension generally accounts for around 30 per cent of the common variance in responses, and the other two dimensions (potency and activity) tend to account for around 15 per cent each. The remaining 40 per cent is explained by minor, undefined dimensions orthogonal to these three. Yet in the research reported here, an attempt was made to prove that these dimensions accounted for nearly all of the common variance. Though this attempt was not stated outright, it was implied in the statistical test used (UMFLA), which, for the number of subjects used, requires that approximately 90 per cent of the common variance be explained to satisfy the criterion of an adequate fit of the factor model. In all fairness to the original researchers, this was an unnecessarily strict criterion. However, the failure of the three-factor EPA model to fit did manifest itself in rational as well as statistical terms, and it also led to some interesting results as the assessment of the nature of the interrelationships proceeded to more complex factor models. It was at this point, then, that the second use of factor analysis came into play.
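The chi-square criterion of fit referred to above can be sketched as follows: a likelihood-ratio test of the hypothesis that k factors suffice to reproduce the observed correlations. This illustration uses the third-party Python package factor_analyzer on simulated data; it is not Jöreskog's UMFLA program, and the Bartlett-corrected statistic shown is the standard textbook form, assumed here to approximate the criterion the program applied.

import numpy as np
from scipy.stats import chi2
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(4)
n, p, k = 252, 14, 4                 # subjects, scales, factors under test
X = rng.normal(size=(n, p))          # stand-in for standardized ratings

fa = FactorAnalyzer(n_factors=k, rotation=None, method="ml")
fa.fit(X)
loadings = fa.loadings_                          # p x k loading matrix
Sigma = loadings @ loadings.T + np.diag(fa.get_uniquenesses())
S = np.corrcoef(X, rowvar=False)                 # observed correlations

# Likelihood-ratio statistic for "k factors suffice" (Lawley/Bartlett form).
f_min = (np.log(np.linalg.det(Sigma)) - np.log(np.linalg.det(S))
         + np.trace(S @ np.linalg.inv(Sigma)) - p)
statistic = (n - 1 - (2 * p + 5) / 6 - 2 * k / 3) * f_min
df = ((p - k) ** 2 - (p + k)) / 2
print(statistic, chi2.sf(statistic, df))         # large p-value = adequate fit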
The question of interest then became: "Is there a distinguishable set of constructs which adequately and parsimoniously explains the scale interrelationships?" It might be argued at this point that, once the initial hypothesis was found to be inadequate, further examination of the correlations is unnecessary. Given a negative result for the three-factor hypothesis, it would seem that one of three courses of action is warranted: (1) stop the analysis and report the inadequacy of the hypothesized latent structure, if that is the only hypothesis of interest; (2) leave the question of a parsimonious reduction of the data unanswered and proceed to test other relevant hypotheses on the basis of raw score data; or (3) attempt to search beyond the hypothesized structure for an alternative, yet equally plausible, latent structure.

The first choice was unacceptable here because the most important hypotheses were to be tested after the decision as to the most appropriate latent structure (i.e., the tests of concept meaning change). The second was equally unacceptable because of the great difficulty in data manipulation and interpretation involved in explaining the large number of raw responses. Therefore, because of the need for a simple structure for the reduction of the data, and in order to understand the nature of the results clearly, the third option was chosen. This turned out to be a fruitful course, because the result was a four-factor latent network which possessed fair statistical integrity and a high degree of psychological interpretability.

From a psychological point of view, there was a high degree of similarity between the four factors at the beginning and at the end of the instructional period; that is, the same scales loaded together on the pretest as on the posttest. Statistical tests of this stability will be discussed below. The first factor was most easily identified. It was clearly an evaluative factor of the type frequently found in SD research, reflecting a favorable-unfavorable reaction to the concepts rated. The second factor was equally clear in meaning. It was comprised of the scales enjoyable-unenjoyable and pleasant-unpleasant and was named the personal evaluative dimension. It is interesting to note that conceptually this factor is quite similar in definition to the evaluative factor, but statistically it was quite independent. This is a clear indication that the respondents were registering two distinct reactions to the concepts. That is, whether or not they found a concept personally pleasing had little to do with their judgments of the concept's worth. Thus, for example, the typical student's reaction to the use of behavioral objectives, in terms of how enjoyable and pleasant they would be to him personally, had little to do with the value and importance he placed on the concept.

The remaining two dimensions of meaning were quite clear conceptually but were difficult to name. They were a combination and redivision of Osgood's potency and activity dimensions, with both potency and activity scales combined in each of the new factors. For example, the scale weak-powerful, which is usually the primary scale on Osgood's potency dimension, was highly related to active-passive, which is usually the primary scale on his activity dimension. This was an unanticipated result and is very difficult to explain.
The most satisfying explanation, from the author's point of view, comes from a comparison of this dimension with the fourth dimension, leniency, and the scales which characterize it (lenient-severe, tense-relaxed, violent-gentle, etc.). These scales seem to reflect the more latent, or less observable, characteristics of the concepts, while the power and activity scales tend to reflect the more outward, observable, or manifest characteristics of the concepts. These are clearly meaningful and distinct psychological reactions on the part of the respondents to the concepts rated. The name which provided the most meaningful reference for the factor reflecting latent characteristics was taken from its first scale, severe-lenient; it was called the leniency dimension. The naming of the power and activity factor was more difficult, but the name potency was chosen as a combination of the nuances of meaning found in its scales.

The primary limitation in the evaluation of the scale interrelationships is found in the statistical criterion for the definition of the most adequate factor structure. This was discussed in detail in Chapter IV, but it is worthy of further discussion here. There were two statistical criteria--one liberal and one very conservative. These were the scree test and the chi-square criterion of the UMFLA procedure developed by Jöreskog. The reader will recall that the lack of precision in the data gave rise to factor solutions using UMFLA which went beyond rational or psychological interpretability (in terms of the number of latent constructs) in order to find a statistical fit. The more liberal scree test, on the other hand, gave rise to clear psychological interpretability, but it fell short from a statistical point of view because it lacks clear statistical antecedents. The decision was made to proceed with the four-factor solution, with the realization that future research must clarify this issue of the best factor explanation. This is simply a question of acknowledging the value of these four dimensions in terms of their relevance to the task of assessing student ratings of instructional concepts in teacher education, and then verifying them by attempting to compose and try new scales which more clearly tap the dimensions of interest. This would, hopefully, tend to clarify the factor structure and eliminate the problems experienced here with the statistical criterion.

This limitation caused by unexplainable variance in the data led to a second problem in the assessment of the stability of the factor structure from pretest to posttest.
Little that is of additional interest can be said about the stability question beyond the fact that there was a very high degree of consistency. That is, the responses that students made to the concepts or the manner in which they used the scales to describe their reaction were very similar across concepts and across testing times. The Question of Concept Meaning Change The function of the scales used to assign connotative meaning to concepts did not change in their basic nature or structure as demonstrated by the stability of the factor structure, but the meaning of instructionally related 147 concepts did change in intensity and definition. In addi- tion, there is some indication that the course instruction played a role in this change because of the relatively minor changes that took place in the intensity of the reactions to the noninstructional concepts. Here again, however, there is a limitation mentioned in earlier chapters which must be restated. The degree of control over the variables of interest which is gained by using control concepts rather than a control group of subjects must be considered less ade— quate than if a control group had been employed. For this reason, a true cause and effect relationship between instruc— tion and concept meaning change must remain to be proven by additional experimental manipulation of the variables. However, for purposes of the following discussion and in the absence of an immediately apparent rival explanation of the cause of change, it is going to be assumed that the course of instruction played at least some part in bringing about the change in meaning intensity which has been demonstrated. With this assumption in mind, then, it becomes pos- sible to proceed to a discussion of the nature of the changes that occurred and the reasons for their occurrence, and beyond that to a discussion of the questions of the approp— riateness of the meanings and the changes therein for teacher education. The reader will recall that the largest changes in the overall intensity of meaning were demonstrated in the carrel concepts Behavioral Objectives, Respondent Learning, 148 and Shaping. These changes can be clearly interpreted in a manner which makes psychological sense when one considers the amount of prior experience or prior meaningful associa- tions which the students had with these concepts. Concepts such as Myself, Myself as a Teacher, Nonverbal Behavior, Questioning and Listening Skills, and Reinforcement are all familiar terms with which the respondents had had at least some prior experience. The three technical concepts men- tioned above, on the other hand, were probably very new to most of the respondents and were, therefore, more easily altered through instructional experience. This is not to say that the more familiar terms did not change, because for the most part they did. But what is being suggested here is that there is an inverse relationship between familiarity with a concept and the amount of change in connotative mean— ing which can be achieved in a person's reactions to that concept if the amount of energy expended (instruction) in bringing about that change is held constant. This may be considered rather clear evidence of a point made by Osgood in his development of the semantic differential as an index of meaning. In the case of a lack of familiarity, a reSponse near the neutral point in terms of connotative meaning scales is elicited. As familiarity increases, the intensity of the meaning associations increases. 
The direction of the change depends on the nature of the experience, but as one moves further out on the scale, it becomes more difficult to bring about change. This might be interpreted as a statistical 149 artifact resulting from the attenuation of responses at the limits of the scale. This probably contributed to the effect being discussed, but the amount of the contribution remains an unknown. It is true that if a person has made a response of four or neutral to a scale, the extent to which he may change this response at some later time (in either direction) is maximized. But, the fact remains that he has made that response for some psychological reason and, if he alters it at the time of the later rating, he alters it because of some change in his psychological interpretation between the scale and the concept which he is rating. Regardless of the limitations which the researcher places on the respondent in terms of the range of possible responses, it must be assumed that, if there is a desire to make infer- ences from the scales to the psychological operations of the respondent, there is a direct connection between the scale and the psychological function being measured. It is also interesting to note with respect to the meaning of the self-concepts and of the various instruc- tional tools and procedures, that the two clusters seem to differ a great deal on the leniency dimension and to a lesser degree on the potency and evaluative dimensions. These are not vast differences and should probably not be pressed too far, but it is interesting to speculate on the nature of a classroom or learning environment constructed by a person who sees the tools he uses as being more valuable and less lenient than himself as a person. This would appear to be a 150 wholesome situation indeed, because if these were reversed and the tools were seen as less valuable and more lenient than the self, the tools would probably never be employed. The location of the noninstructional concepts in the three-dimensional meaning space is of interest, because they tend to add credibility to the scaling procedure. First, the location of the concept Marijuana in the unfavorable domain discounts the potential argument that there is a favorable overall halo effect being demonstrated in the responses. Second, the wide differences in the location of the three concepts Physician, Religion, and Marijuana suggests very different meanings, as there surely must be. There was suggestion made above concerning the approp— riateness of the meanings assigned to the concepts with respect to teaching and teacher training. This was one of the questions posed at the outset which didn't imply any statistical test and which must ultimately by answered by those responsible for the planning of teacher training instruction. However, some discussion and conclusions seem warranted at this point. Certainly, since all of the edu- cationally relevant tools, concepts, and procedures were seen as favorable, lenient, and potent at the outset, students could be considered to be predisposed to use them in the class- room. But above and beyond this, the changes which took place during instruction rendered the concepts even more favorable, potent, and somewhat more lenient, thus appearing to increase the predisposition for their later use. It would appear that 151 the associations with these few concepts were appropriate for the teaching profession and that the instructional sequences served the purpose for which they were intended. 
Instructor Differences in Change in Meaning Intensity

Perhaps the most puzzling result, from the point of view of formulating a sensible explanation, is that of no differential effect among the 19 instructors. The changes noted in the intensity of the respondents' ratings occurred regardless of the instructor with whom the student was associated. No differential effect of instructor on the carrel concepts and the noninstructional concepts might have been expected, because these concepts are not systematically presented in the IPL, where the instructor makes his contributions. But to find that there were no differences in the IPL concepts either, where it is assumed that the instructor has the greatest influence, was unexpected.

One can, of course, only speculate as to the reasons for this result. It is possible, for example, that the instructor simply does not make a contribution which is clearly and identifiably his to the change which takes place in the students during the ten weeks of IPL meetings. It is equally possible that he does have an effect, but that it occurs in ways, and is reflected in changes, not tapped by this instrument. Third, it may be true that the instructors influence these reactions differentially, that the instrument measures the correct change, and that all instructors are operating at the same high level of instructional ability. Each of these possible explanations has some merit.

The first implies that there is a general, diffuse increase in awareness concerning the meaning of education and teaching which is reflected in the results, and in which the individual instructor plays only a uniform part. This is a distinct possibility because of the short duration of the course (ten weeks) and because of the large amount of information concerning the task demands and the personal demands of teaching which is presented to the student during those ten weeks. It might be worth noting parenthetically that the hypothesis of a diffuse increase in awareness would have been testable by adding a fourth category of concepts related to education but not presented in this particular course. The nature of the change in these concepts would have been most interesting, because a change in these noninstructional but educational concepts would tend to support the general-increase-in-awareness hypothesis.

The second explanation suggested above is based on the extremely limited scope, or breadth, of the instrument employed. The reader will recall that the scales were selected to represent the traditional EPA structure (see Appendix A) and that the concepts were selected simply as examples of the types of concepts presented in the course. It is very possible that, with only four IPL concepts selected from the large number of possible concepts, a sampling error is simply being demonstrated here. From a different
In a research report currently in preparation by the Education 200 Research Council (Stiggins and Byers, 1972), it is reported that the students perceived a very high level of performance on the part of the instructors. Students were asked to rate their instructors at midterm and at the end of the term on their performance with respect to facilitating group interaction, displaying the IPL objective, and helping the students to reach those objectives. On a 7-point scale with 7 being the best possible rating in terms of performance, all instructors scored consistently at the 6 level. Though this does not necessarily lead to any conclusions about the concept mean— ing changes reported here, it does reflect a level of per— formance in the Opinions of the students. It should be made clear that this research merely lends support to the third suggestion as a possible explanation of no instructor dif— ferences in change. It is not intended to prove that an 154 unusually high level of performance was in fact the case. It is also possible, of course, that some combi- nation of the suggested reasons was Operating to bring about these results. The correct reason or combination of reasons for no differential instructor effect must await further examination. Further examination is warranted, because it was suggested in the first chapter that one of the merits of this type of instructional evaluation would be its abil— ity to supply feedback to individual instructors in such a way as to be of value in the reformulation of instruction. Such a procedure could be carried out with the results pre— sented here, but, with the lack of surety that the results reflect something that is, in fact, part of the instructor's behavior or instructional procedures, the value of the feed— back would be questionable. The Question of Other Testable Hypotheses It was also suggested at the outset that the merit of this type of instructional evaluation would be judged first on its performance in this task and then on its poten- tial to contribute answers to other questions of interest. Given the demonstrated ability of the SD procedure employed here to detect changes in the intensity of concept meaning tempered by the methodological limitations suggested through— out this chapter, it would seem that the procedure was able to supply answers to most of the questions posed at the outset. 155 A few other hypotheses which might be examined by this procedure come to mind. Some of these are worthy of detailed discussion while others need only be suggested briefly. However, each could serve a further testing ground to the semantic differential in the role of providing feed— back about instruction. First, it was suggested at the outset that the ulti— ( mate test of the validity of this procedure would be its ability to predict teacher behavior. Given the generally positive nature of the results reported here, an interesting 2, research design for this test would be as follows: Begin with a group of students who are undergoing some type of professional preparation to become teachers. Gather from these students achievement test data (M1 the concepts being presented both at the beginning and at the end of the course and supplement this with SD ratings of the concepts. This procedure would be most revealing if the concepts presented were instructional tools and procedures such as those used in this study. 
The correlations among the achievement test items and the SD ratings would be interesting enough, but even more revealing would be post-course observations of classroom behavior, with special focus on the tools and procedures taught in the course. Using frequency of use of the tools and procedures as the criterion variable, it would be most interesting to determine which of the two predictor variables was more efficient. Further, it would be revealing to know the efficiency of these two variables in combination for predicting teacher behavior, as compared with other predictors. The ultimate result of this test would determine the validity, or the true value, of this type of rating in teacher training. The further use of the procedure to test other hypotheses and to serve other evaluative functions would depend on a positive result in this test.

If such a result were achieved and a high degree of predictability were revealed, the generalizability of the procedure would be endless. Other concepts and procedures could be assessed from the various courses that compose a teacher training sequence. The role of teacher training, for example, might be closely scrutinized, or various modes and methods of instruction in the teacher training sequence might be compared.

Throughout these types of evaluations, further scales and dimensions of meaning might be assessed and, if a consensus on relevant scales is reached, an attempt might be launched to construct a semantic atlas of educational concepts like that developed in the general linguistic domain by Snider and Osgood (1969).

The primary interest in the research reported and discussed here has been the effect of instruction on the methodological aspects of teaching (i.e., on the tools and procedures of teaching). There is an entirely different realm of interest which might serve as a valuable supplement to the concern expressed here, and that is the use of SD procedures for a closer look at the interpersonal demands of teaching. This implies the use of bipolar scales to rate people. This use of the semantic differential is currently being tried in the course where the data for this dissertation were gathered, and the results promise to be most revealing.

But the reader must bear in mind that most of these research possibilities are still in the distant future. The immediate research priorities are these: (1) further data collection and refinement of the scales and dimensions of the SD instrumentation from a psychometric point of view, (2) a series of more tightly controlled experimental administrations to determine the nature of any cause-and-effect relationships, and (3) a test of the true validity of this procedure employing teacher behavior as the criterion.

Conclusion

It was stated at the outset that the semantic differential procedure for the scaling of the psychological reactions termed the connotative meaning of concepts might have potential value for the systematic and scholarly evaluation of instructional sequences. To begin measuring this potential, an attempt was made to measure changes in the meaning of concepts presented in a course which contributed to the professional preparation of teachers. It was stated further that some indication of the potential would be gained by attempting to find the answers to a series of questions regarding the role of instruction.
It was determined that the traditional dimensionality of semantic meaning space was inadequate, but that a plausible alternative did exist. A series of statistical tests verified that the basic nature of the semantic meaning space was not altered by instruction, but that the intensity of reactions to the concepts rated was altered significantly. The meanings ascribed to these concepts were seen as appropriate for teaching, and the changes in those meanings as appropriate for teacher training. From this research, the course for future investigations seems quite clear. These were all favorable outcomes. However, limitations were noted which remain to be accounted for in future study of the role of SD procedures in course evaluation.

Consequently, it must be concluded that, though there is much more work to be done, there is evidence in support of the initial statement that the semantic differential procedure for the scaling of connotative meaning can, in fact, serve a useful purpose in the systematic and scholarly evaluation of instruction. The measurement technique would seem to have survived its initial test. The next research steps are clearly drawn. The validity of the measurements, using classroom behavior as the criterion, must now be determined.

BIBLIOGRAPHY

Ahmann, J. S. "Aspects of Curriculum Evaluation: A Synopsis." Perspectives in Curriculum Evaluation. Edited by R. Tyler, R. Gagné, and M. Scriven. New York: Rand-McNally, 1967.

Allport, G. W. "Attitudes." Readings in Attitude Theory and Measurement. Edited by M. Fishbein. New York: John Wiley and Sons, 1967.

Ausubel, D. P. Educational Psychology: A Cognitive View. New York: Holt, Rinehart and Winston, 1968.

Barrett, G. V., and Otis, J. L. "The Semantic Differential as a Measure of Changes in Meaning in Educational and Vocational Counseling." Psychological Reports, XX (1967), 335-338.

Bentler, P. M. "Semantic Space Is (Approximately) Bipolar." Journal of Psychology, LXXI (1969), 33-34.

Block, J. "Studies in the Phenomenology of Emotion." Journal of Abnormal and Social Psychology, LIV (1957), 358-363.

Bloom, B. S.; Hastings, J. T.; and Madaus, G. F. Handbook of Formative and Summative Evaluation. New York: McGraw-Hill, 1971.

Brown, F. G. Principles of Educational and Psychological Testing. Hinsdale, Illinois: Dryden, 1970.

Burt, C. L. "Factor Analysis and Canonical Correlation." British Journal of Psychology (Stat. Sec.), I (1948).

Campbell, D. T., and Stanley, J. C. Experimental and Quasi-Experimental Designs for Research. New York: Rand-McNally, 1963.

Cattell, R. B. Handbook of Multivariate Experimental Psychology. New York: Rand-McNally, 1966.

Cattell, R. B., and Baggaley, A. R. "The Objective Measurement of Attitude Motivation: Development and Evaluation of Principles and Devices." Journal of Personality, XXIV (1956), 401-423.

Cliff, N. "Adverbs as Multipliers." Psychological Review, LXVI (1959), 27-44.

Cook, D. R. "A Study of the Relationship of the Meaning of Selected Concepts to Achievement and Ability." Unpublished doctoral dissertation, Indiana University, 1959.

Darnell, D. "A Validity Study of the Semantic Differential." Unpublished doctoral dissertation, Michigan State University, 1964.

Davitz, J. R. The Language of Emotion. New York: Academic Press, 1969.
Dingman, H. F.; Paulson, M. J.; Eyman, R. K.; and Miller, C. R. "The Semantic Differential as a Tool for Measuring Progress in Therapy." Psychological Reports, XXVI (1962), 271-279.

DiVesta, F. J. "A Developmental Study of the Semantic Differential." Journal of Verbal Learning and Verbal Behavior, V (1966), 249-259.

DiVesta, F. J., and Dick, W. "The Test-Retest Reliability of Children's Ratings on the Semantic Differential." Educational and Psychological Measurement, XXVI (1966), 605-616.

Ebel, R. L. "What Are Schools For?" Phi Delta Kappan, LIV (1972), 3-7.

Endler, N. S. "Changes in Meaning During Psychotherapy as Measured by the Semantic Differential." Journal of Counseling Psychology, VIII (1961), 105-111.

Ervin, S. M., and Foster, G. "The Development of Children's Terms." Journal of Abnormal and Social Psychology, LXI (1960), 271-275.

Feshbach, N. D., and Beigel, A. "A Note on the Use of the Semantic Differential in Measuring Teacher Personality and Values." Educational and Psychological Measurement, XXVIII (1968), 923-929.

Flavell, J. The Developmental Psychology of Jean Piaget. New York: Van Nostrand, 1962.

Gage, N. L., ed. Handbook of Research on Teaching. New York: Rand-McNally, 1963.

Gagné, R. M. "Curriculum Research and the Promotion of Learning." Perspectives in Curriculum Evaluation. Edited by R. Tyler, R. Gagné, and M. Scriven. New York: Rand-McNally, 1967.

Glaser, R. "Psychology and Instructional Technology." Training Research and Education. Edited by R. Glaser. Pittsburgh: University of Pittsburgh Press, 1962.

Green, R. F., and Goldfried, M. R. "On the Bipolarity of Semantic Space." Psychological Monographs, LXXIX (1965), 6 (Whole No. 599).

Greenberg, B. S.; Bowes, J.; and Kagan, N. "Dimensions of Empathetic Judgment of Clients by Counselors." Journal of Counseling Psychology, XVI (1969), 303-306.

Griggs, A. E. "A Validity Study of the Semantic Differential Technique." Journal of Clinical Psychology, XV (1959), 179-181.

Hartley, J. E. "A Semantic Differential Scale for Assessing Group Process Change." Journal of Clinical Psychology, XXIV (1968), 74.

Heise, D. R. "Some Methodological Issues in Semantic Differential Research." Psychological Bulletin, LXXII (1969), 406-422.

Henderson, J. E. "Individual and the School: Ed. 200 Course Outline and Description." East Lansing, Michigan: Michigan State University, 1972. (Mimeographed.)

Henderson, J. E.; Willard, S. M.; Barnes, H. L.; and Prawat, R. S., eds. Individual and the School. East Lansing, Michigan: Michigan State University, 1972.

Hoffman, J. E. "An Analysis of Concept Clusters in Semantic Interconcept Space." American Journal of Psychology, LXXX (1967), 345-354.

Hoover, K. H., and Schutz, R. E. "Student Attitude Change in an Introductory Education Course." Journal of Educational Research, LXI (1968), 300-303.

Howe, E. S. "Associative Structure of Quantifiers." Journal of Verbal Learning and Verbal Behavior, V (1966a), 156-162.

________. "Probabilistic Adverbial Qualifications of Adjectives." Journal of Verbal Learning and Verbal Behavior, I (1962), 225-242.

________. "Verb Tense, Negatives, and Other Determinants of the Intensity of Evaluative Meaning." Journal of Verbal Learning and Verbal Behavior, V (1966b), 147-155.

Husek, T. R., and Wittrock, M. C. "The Dimensions of Attitude Toward Teachers as Measured by the Semantic Differential." Journal of Educational Psychology, LIII (1963), 209-213.

Jöreskog, K. G. UMFLA: A Computer Program for Unrestricted Maximum Likelihood Factor Analysis. Research Memorandum RM-66-20. Princeton, New Jersey: Educational Testing Service, 1965.
"The Varimax Criterion for Analytic Rotation in Factor Analysis." Psychometrika, XXIII (1958), 187-200. Kane, R. B. "Minimizing Order Effects in the Semantic Differential." Educational and Psychological Measurement, XXXI (1971), 137-144. Kerlinger, F. N. Foundations of Behavioral Research. New York: Holt, Rinehart and Winston, 1965. Krathwohl, D. R.; Bloom, B. S.; and Masia, B. B. Taxonomy of Educational Objectives: A Classification of Educational Goals. Handbook 2: Affective Domain. NéwSYork: McKay, 1964. Kubinec, C. M. "The Relative Efficacy of Various Dimensions of the Self-Concept in Predicting Academic Achieve— ment." American Educational Research Journal, VII (1970), 321-336. Kumata, H., and Schramm, W. "A Pilot Study of Cross— Cultural Methodology." Public Opinion Quarterly, XX (1956), 229-238. Lawley, D. N., and Maxwell, A. E. Factor Analysis as a Statistical Method. London: Butterworth, 1963. Maltz, H. E. "Ontogenic Change in the Meaning of Concepts /* as Measured by the Semantic Differential." Child Development, XXXIV (1963), 667-674. 164 McNeil, K. "Multivariate Relationships Between the Semantic Space of Various Subcultures and Various Personality Variables." Unpublished Doctoral dissertation, University of Texas (Austin), 1967. Messick, S. J. "Metric Properties of the Semantic Differ- ential." Educational and Psychological Measurement, XVII (1957), 200-206. Michon, J. A. “An Application of Osgood's 'Semantic Differential' Technique." Acta Psychologica, XVII, (1960), 377-391. L/ Miron, M. S. "The Influence of Instruction Modification on Test—Retest Reliabilities of the Semantic Differen— tial." Educational and Psychological Measurement, XXI (1961), 883-893. . "What Is it That Is Being Differentiated by the Semantic Differential?“ Journal of Abnormal and Social Psychology, XII (1969), 1894193. j Mitsos, S. B. "Personal Constructs and the Semantic Dif- ./ ferential." Journal of Abnormal and Social Psychology, LXII (1961), 433—434. Mordkoff, A. M. "An Empirical Test of the Functional Auton— ymy of Semantic Differential Scales." Journal of / Verbal Learning and Verbal Behavior, II (1963), 504-508. . "Functional vs. Nominal Autonymy in Semantic Dif— v/” ferential Scales." Psychological Reports, XVI (1965), 691-692. Morrison, D. F. Multivariate Statistical Methods. New York: McGraw-Hill, 1967. Mosier, C. I. "The Psychometric Study of Meaning." Journal / /’ of Social Psychology, XIII (1941), 123—140. Norman, W. T. "Stability Characteristics of the Semantic /, Differential." Journal of Psychology, LXXII (1959), “ 581-584. Nowlis, W. T., and Nowlis, H. H. "The Description and the Analysis of Moods." Annals of the New York Academy of Science, LXV (1965), 345-355. Nunnally, J. C. Psychometric Theory. New York: McGraw- Hill, 1967. 165 Oles, H. J., and Bolvin, J. O. "The Reliability and Use— ability of a Semantic Differential Attitude Scale With Third Through Fifth Grade Students." A paper presented at the National Council on Measurement in Education Annual Meeting, Chicago, 1972. Osgood, C. E.; Suci, G. T.; and Tannenbaum, P. H. The Measurement of Meaning. Urbana: University of Illinois Press, 1957. Provus, M. "Evaluation of Ongoing Programs in the Public School System." Educational Evaluation: New Roles, New Means. Edited by R. Tyler. NSSE Yearbook, f 1969, pp. 242-283. E Remmers, H. H. "Rating Methods in Research on Teaching." Handbook of Research on Teaching. Edited by N. Gage. New York: Rand-McNally, 1963. Rentz, R. R.; Fears, E. B.; and White, W. F. 
"Personality Correlates of Group Structure: A Canonical Corre- lation Analysis." Journal of Psychology, LXX (1968), 163-167. Rowan, T. C. "Some Developments in Multidimensional Scaling Applied to Semantic Relationships." Unpublished Doctoral dissertation, University of Illinois, 1954. Schlosberg, H. "Three Dimensions of Emotion." Psychologi— cal Review, LXI (1954), 81—88. Schmidt, W. H., and Scheifley, V. Computer Programs for Maximum Likelihood Factor Analysis. Center for Urban Affairs Research Report No. 2. East Lansing, Michigan: Michigan State University, October, 1971. Scriven, M. "The Methodology of Evaluation." Perspectives in Curriculum Evaluation. Edited by R. Tyler, R. Gagné, and M. Scriven. New York: Rand-McNally, 1967. Seigel, S. Nonparametric Methods for the Behavioral Sciences. New York: McGraw—Hill, 1956. Smith, R. G. “A Semantic Differential for Theater Concepts." Speech Monograph, XXVIII (1961), 1-8. Snider, J. G., and Osgood, C. E. Semantic Differential Technique: A Sourcebook. Chicago: Aldine, 1969. Spino, W. D. "Semantic Differential Patterns of Selected College Freshmen as a Basis for Academic Differen« tiation." Unpublished Doctoral dissertation, Indiana University, 1969. 166 Stake, R. E. "The Countenance of Educational Evaluation." Teachers' College Record, LXVIII (1967), 523—540. Stiggins, R. J., and Byers, J. L. "Annual Report of the Evaluation of Education 200." East Lansing, Michigan: College of Education, Michigan State University, 1972. (Mimeograph in preparation.) Stufflebeam, D. L. "Evaluation as Enlightenment for Deci— sion Making." Columbus: Ohio State Evaluation Center, 1968. (Mimeographed.) Suci, G. J. "A Comparison of Semantic Structures in American Southwest Culture Groups." Journal of Abnormal and Social Psychology, LXI (1960), 25—30. Tanaka, Y.; Oyama, T.; and Osgood, C. E. "A Cross—Culture: Cross-Concept Study of the Generality of Semantic Spaces." Journal of Verbal Learning and Verbal Behavior, II (1963), 392-405. Tannenbaum, P. H. "Attitudes Toward Source and Concept as a Factor in Attitude Change Through Communication." Unpublished Doctoral dissertation, University of Illinois, 1953. Tucker, L. R. "An Interbattery Method of Factor Analysis." Psychometrika, XXIII (1958), 111—137. Tyler, R. Education Evaluation: New Roles, New Means. National Society for the Study of Education Yearbook, 1969. Walberg, H. J. "Dynamics of Self—Conception During Teacher Training." Unpublished Doctoral dissertation, University of Chicago, 1964. Walberg, H. J.; Metzner, 8.; Todd, R. M.; and Henry, P. M. "Effects of Tutoring and Practice Teaching on Self- Concept and Attitudes in Educational Studies." Journal of Teacher Education, XIX (1968), 283—291. Wiley, D. E.; Schmidt, W. H.; and Bramble, W. J. "A Class of Covariance Structure Models." Journal of the American Statistical Association (in press). Wittrock, M. C. "The Connotative Meaning of Concepts: Teachers and Children." California Journal of Educational Research, XV (1964), 60-67. fu— ' H 167 Wrigley, C., and Neuhans, J. O. "The Matching of Two Sets of Factors." American Psychologist, X (1955), 418-419. Wundt, W. Grundriss der Psychologia. 7th rev. ed. Leipzig: Englemann, 1905. APPENDICES 168 - . A. rm._.:__‘--._.-a}.. -1 i _ -4 APPENDIX A SUPPLEMENTAL GENERAL INFORMATION 1. Concept Definitions 2. 
2. Research Basis for Selection of Scales

DEFINITIONS OF CONCEPTS

IPL CONCEPTS

Myself as a Teacher--(Chosen on the basis of research by Walberg, 1957.) "Self-concept in teacher trainees as they imagine themselves in the role of teachers" (p. 84).

Nonverbal Behavior--(Stipulated under objective #3 for the small group experience in the Ed. 200 Course Outline, p. 3.) "The student will be able to recognize and interpret, through description and explanation, diverse modes of nonverbal behavior, i.e., hands, face, arms, etc."

Questioning and Listening Skills--(Ed. 200 Course Outline, objectives #1 and #2, small group experience, pp. 2-3.) "The student will be able to not only restate what has been said but to relate the feelings and intended meaning of the speaker to the speaker's satisfaction." "The student will be able to seek further information or clarification without cueing a particular response."

Myself--Self-concept independent of any specific role other than the individual as a person.

CARREL CONCEPTS (Definitions Taken From the Ed. 200 Handbook, Winter, 1972)

Behavioral Objectives--"A behavioral objective is a specific statement of what the teacher expects the student to do after completing a prescribed unit of instruction. The critical attributes of a behavioral objective are: intended behavior must be observable, achievable, and relevant, . . . specifying the following: terminal behavior, conditions for intended behavior, [and] criteria . . ." (p. 214).

Reinforcement--"Positive reinforcement is the presentation of a rewarding stimulus following a response; the presentation of the reward (stimulus) is made contingent upon the occurrence of a specific behavior (response). The result of the stimulus presentation is that it increases the probability that the response will be repeated" (p. 328).
umzuo Nucmm voommo voommo hmauumm GOEoHom coommo voommo doaoHom xoouuuwz GMENudom w mahoo Hfiozoz uxoouuuwz poommo woommo uxoouuuaz Hflmzoz Nuw>wuo¢ Noamuom HHOZOS Hflmzoz Hflmzoz mumawucwlmuofimm SOHqummm mmmmloouonmq Haaumnmca>oza m>wuwmcmmcfllm>fiuwmcmm« voxmamulomcoac Eamonoanmuwoxm m>wmmmmlm>wuo¢c momamcoseoonmavflca msoummcmclmmmm maucmmlusmH0w> msflumoumucwcalmcaumoumucHa Humamlmuumq oumowamvtvmmmsm unmwcoalmum>mma Hawum3omlxmm3a wmauum: «nucmm Nuanom w Hm>oom coommo nucmm voommo poommo m>wumsam>m manm>0ncm::IMHDMSOHQM¢ mewcanuwmm« mmmanuuozlmunmsmm>a m>wummmclm>wuwmoma ucmmmodmcslucmmmmamu ucmuhomawczlucmuuomaHa fiwnivoowc mamom .muscmooum ummuumom\umoumum may oucw so powuumu ma mmonu mmumowvcw a .mamom on» How ucwEmomHm sawmcwafic m pmcwaumump wamsow>mum no: on) AmvuwSOHmwmmH mnu mo mamamxm cm can Ham» uonm way How ammono mmHmom on» mume 3o~mn manmu 0:9 APPENDIX B PILOT TEST SUPPLEMENT 1. Cover Letter 2. Instructions 3. Sample Response Sheet 173 174 MICHIGAN STATE UNIVERSITY m1 LANSING . MICHIGAN 48823 COLLEGE OF EDUCATION - BRICKSON HALL FROM: Dr. Judith E. Henderson Dr. Joe L. Byers Coordinators, Education 200 In our attempt to improve the student's experience in ED 200 we have constructed this instrument to assess the influence of the experience on the student. This instrument, like the Student Instructional Rating_Rsport which you filled out in your last small group meeting, is designed to pro- vide feedback so that instruction may be altered to Optimize its role in assisting you in your preparation to become a professional educator. In order to assist us in this endeavor and not consume too much of your class time, we ask you to complete this form outside of class and return it to the secretary in Room 238 Erickson Hall (ED 200 testing room) by Wednesday of final exam week (March 15). When you return the form to the secretary, be sure that she checks your name off on a list of students. Please do this at your earliest possible convenience so that your input can be sued in planning instruction for the Spring term. INSTRUCTIONS The purpose of this study is to measure the meanings of certain things to various people by having them judge them.against a series of descriptive scales. As you respond, please make your judgements on the basis of what these things mean to you. On each page of this booklet, you will find different concepts to be judged and beneath each will be a set of scales. You are to rate the concepts on each of the scales. Here is how you are to use the scales: If you feel that the concept at the top of the page is very closely related to one end of the scale, you should place your reaponse in the box at the right corresponding to the numbers 1 or 7, as follows: fair:l:2:3:4:_5_:_6__:_7__:unfair ..... ' 233333 or ~. r1 fairzlz2:3:4:5:6:7:unfair.....{iEEEQEE If you feel that the concept is quite closely related to one or the other end of the scale (but not extremely), you should respond as follows: strong:1:2:3:4:5:6:7:weak.....$IEEEEE or strong:1:2:3:4:5:6:7:weak.....Lr1-‘J [2] [31 E E: I E If the concept seems only slightly related to one side as opposed to the other side (but not really neutral), then you should respond as follows: O O O O O O I O r. n [—1 m or active. 1 . 2 . 3 . 4 . S . 6 . 7 .passive.1h F5 ll {5 £5 Eh 35 active: 1: 2:3 :4 :5 :6 : 7 :passive..:1] E E a l [6] E73 The direction toward which you respond, of course, depends upon which of the two ends of the scale seems most characteristic of the concept you're judging. 
If you consider the concept to be neutral on the scale, both sides of the scale equally associated with the concept, or if the scale is completely irrelevant or unrelated to the concept, select number 4 as your response:

active : 1 : 2 : 3 : 4 : 5 : 6 : 7 : passive   [check the box marked 4]

IMPORTANT: (1) Be sure you check every scale for every concept--do not omit any. (2) Never put more than one response on a single scale.

Sometimes you may feel as though you have seen the same scale before in rating a given concept. This will not be the case, so do not look back and forth through the items. Do not try to remember how you checked similar items earlier. Make each item a separate and independent judgment. Work at a fairly high rate of speed. Do not worry or puzzle over individual scales. It is your first impression, the immediate "feelings" about the items, that we want. On the other hand, please do not be careless, because we want your true impressions.

MAKE ALL RESPONSES IN PENCIL. THANK YOU FOR YOUR COOPERATION.

QUESTIONING SKILLS

[Sample response sheet for the concept Questioning Skills; each scale carried seven response boxes, which are not recoverable from the scan. The left-hand anchors listed were: violent, unpleasant, lenient, intimate, excitable, bad, unenjoyable, light, passive, still, sharp, interesting, positive, sensitive, unique, rugged, fair, weak, worthless, safe, tense, fast, unimportant, and labored; the legible right-hand anchors include gentle, pleasant, severe, remote, enjoyable, heavy, active, moving, dull, uninteresting, negative, insensitive, commonplace, delicate, unfair, strong, valuable, dangerous, relaxed, and slow.]

APPENDIX C

PRE/POST ADMINISTRATION SUPPLEMENT

1. Cover Letter
2. Sample Response Sheet

Your name __________  Student Number __________

TO: All students enrolled in Education 200, Spring Term, 1972
SUBJECT: Evaluation of Instruction

As you may know, one of the goals of Education 200 is to continually evaluate and revise our instruction so that it will supply maximum service to the student as he or she prepares to become a professional educator. As a student enrolled in Education 200, you must play a critical role in this endeavor by providing feedback about the instruction from time to time during the term. For example, as you proceed through the course, you will be exposed to numerous instructional concepts, tools, and procedures which will, hopefully, be of assistance to you when you become a practicing teacher. It is important, therefore, that our instruction be effective in teaching the meaning and use of these tools and concepts. In order to assess the worth of our teaching procedures, periodic measurements will be made in the form of tests to determine how much you have learned. That is, an attempt will be made to determine whether or not you know how and when to use the instructional concepts and tools presented in the course. This is one method of determining if our instruction is as effective as it might be. However, there is another influence which the instruction has on you as a student which is as important as how much you learn, and that is how you react emotionally to the concepts and tools as they are presented to you. This effect of instruction is almost never measured in our educational endeavors, but we feel that it could serve the useful purpose of providing information as we plan instruction for the future.
The survey which you are asked to respond to here is an initial attempt to measure some of these non-cognitive outcomes of our instruction by asking you to describe your reactions to the concepts. It is an initial attempt because we plan to use your responses to adjust and refine these measurement techniques in order to develop a systematic means of measuring your reactions. For these reasons, we require your assistance. Please read the instructions on the next page and respond accordingly. If you have any questions or need further clarification as to the intents of this survey, please do not hesitate to seek further assistance in Room 238 of Erickson Hall, or phone Rick Stiggins or Rob Brann at 353-8765.

Please return the completed form to Room 238 Erickson Hall by Tuesday, April 11.

Judith E. Henderson and Joe L. Byers
Co-Coordinators of Education 200

Richard J. Stiggins
Education 200 Evaluation

[Sample response sheet, pretest/posttest form, reproduced twice in the source. The machine-scorable sheet provides a grid for coding the student number, names the concept to be rated (here BEHAVIORAL OBJECTIVES), and presents the fifteen bipolar scales, each with seven response boxes: valuable-worthless, lenient-severe, fair-unfair, good-bad, relaxed-tense, powerful-weak, negative-positive, gentle-violent, unimportant-important, passive-active, sensitive-insensitive, interesting-uninteresting, unpleasant-pleasant, still-moving, unenjoyable-enjoyable. The boxed layout is not reproducible in this copy.]

APPENDIX D

RESULTS SUPPLEMENT

1. Factor pattern matrices for each concept (Note: all matrices report varimax rotated solutions)
2. Raw scale means and standard deviations for each concept
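The note to item 1 above says that every matrix in this supplement is a varimax rotated solution obtained from unrestricted maximum likelihood factoring. For readers who wish to reproduce such a rotation, the sketch below implements the standard Kaiser varimax criterion in Python with NumPy; it is offered as a reader's aid under modern tools, not as the routine used in the original analysis, and the 15 x 4 example matrix is random because the thesis matrices themselves are not recoverable here.

import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonally rotate a p x k loading matrix to the varimax
    criterion (Kaiser, 1958). Returns the rotated loadings."""
    p, k = loadings.shape
    R = np.eye(k)                          # accumulated rotation
    var = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        # Gradient of the varimax criterion with respect to R.
        G = loadings.T @ (L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0)))
        u, s, vt = np.linalg.svd(G)
        R = u @ vt
        new_var = s.sum()
        if new_var < var * (1.0 + tol):
            break                          # criterion no longer improving
        var = new_var
    return loadings @ R

# Example with an arbitrary unrotated 15-scale x 4-factor matrix
# (random here; a stand-in for the unrecoverable thesis tables).
rng = np.random.default_rng(0)
rotated = varimax(rng.normal(size=(15, 4)))

Rotation of this kind leaves the statistical fit of the solution unchanged; it only redistributes the loadings so that each scale loads strongly on as few factors as possible, which is what makes the rotated patterns in these tables interpretable.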
[Factor pattern matrices. Each page of this section is headed "4-FACTOR SOLUTION AND SOLUTION OF BEST FIT" and reports, for one concept on either the pretest or the posttest, the varimax rotated factor pattern of the four-factor solution and of the best-fitting unrestricted maximum likelihood solution across the fifteen scales, together with the chi-square statistic and significance level for each solution. Pages on which the two solutions coincide carry the note "4-Factor Solution and Solution of Adequate Statistical Fit Are Identical." Pretest and posttest matrices are given for each of the eleven concepts; those identifiable in this copy include BEHAVIORAL OBJECTIVES, QUESTIONING AND LISTENING SKILLS, NONVERBAL BEHAVIOR, REINFORCEMENT, RESPONDENT LEARNING, and PHYSICIAN. The pages were reproduced in rotated orientation, and the numerical entries cannot be recovered from this copy.

Raw scale means and standard deviations for each concept: the section heading survives, but the tables themselves cannot be recovered from this copy.]
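The second section of the supplement is a straightforward tabulation: for every concept, the mean and standard deviation of each raw 1-7 scale rating across respondents, on the pretest and on the posttest. A minimal pandas sketch of that tabulation follows; the data frame, its column names, and its values are hypothetical stand-ins, since the original tables are not recoverable.

import pandas as pd

# Hypothetical long-format ratings: one row per respondent x concept x scale.
ratings = pd.DataFrame({
    "respondent": [1, 1, 2, 2],
    "concept":    ["BEHAVIORAL OBJECTIVES"] * 4,
    "scale":      ["valuable-worthless", "fair-unfair"] * 2,
    "score":      [6, 7, 5, 4],           # raw 1-7 responses
})

# Raw scale means and standard deviations for each concept,
# mirroring the layout described for Appendix D, section 2.
summary = (ratings
           .groupby(["concept", "scale"])["score"]
           .agg(mean="mean", sd="std")
           .round(2))
print(summary)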