A COMPARISON OF THE EFFECTIVENESS OF USING SLIDES AND NON-VISUALS AS TEST INSTRUMENTS FOR DESIGN UNDERSTANDINGS

Thesis for the Degree of M. A.
MICHIGAN STATE UNIVERSITY
Faye L. Brasington
1966

ABSTRACT

A COMPARISON OF THE EFFECTIVENESS OF USING SLIDES AND NON-VISUALS AS TEST INSTRUMENTS FOR DESIGN UNDERSTANDINGS

by Faye L. Brasington

Matrix: Design for Living is a course designed to provide students with basic design understandings. Slide examples and illustrations are employed. Testing is accomplished through objective, machine scorable examinations. This research involves comparing the effectiveness of slide items and verbal items in the examinations. The hypothesis states that slide items will more effectively measure a higher level of intellectual ability than will verbal items, and also that slide items will be more discriminating.

Working in connection with an Educational Development Project, a study group wrote, revised and selected sixty slide items and sixty verbal items to be used as the final examination for TRA 140, Spring Term, 1966. The majority of these items were paired in subject matter and difficulty. All of the items had been pretested Winter Term, 1966. By means of item analysis, items on the pretest and final test were given an index of discrimination and index of difficulty by the Scoring Office of the Evaluation Services, Michigan State University. Each item was assigned a classification level of intellectual ability according to the Taxonomy of Educational Objectives. Flanagan's index of discrimination, not affected by difficulty, was also computed for each item. The study group determined validity and the Kuder-Richardson method determined reliability. The comprehensive picture came from item analysis of total verbal items, total slide items, and the classification levels within the verbal and slide items.
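The item statistics named above can be illustrated with a short computational sketch. It is not the Scoring Office's procedure, which is not spelled out here. It assumes common textbook definitions: the index of difficulty as the proportion of students answering an item correctly (some offices report the proportion answering incorrectly instead), the index of discrimination as the difference in that proportion between the top and bottom 27 per cent of students ranked by total score, and Kuder-Richardson formula 20 for reliability. The 27 per cent split, the use of population variance, the function names, and the small response matrix are all illustrative assumptions. Flanagan's index is not shown.

```python
from typing import List

# Rows are students, columns are items; 1 = correct, 0 = incorrect.
Responses = List[List[int]]


def difficulty(responses: Responses, item: int) -> float:
    """Proportion of students answering the item correctly (assumed definition)."""
    return sum(row[item] for row in responses) / len(responses)


def discrimination(responses: Responses, item: int, fraction: float = 0.27) -> float:
    """Upper-group minus lower-group proportion correct (assumed 27% tails)."""
    ranked = sorted(responses, key=sum, reverse=True)  # highest total score first
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    p_upper = sum(row[item] for row in upper) / k
    p_lower = sum(row[item] for row in lower) / k
    return p_upper - p_lower


def kr20(responses: Responses) -> float:
    """Kuder-Richardson formula 20 for dichotomously scored items."""
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    # Population variance of total scores (an assumption; some texts use n - 1).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    pq_sum = 0.0
    for i in range(n_items):
        p = difficulty(responses, i)
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


# Hypothetical data: six students, four items (not taken from the thesis).
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
```

On this toy matrix, `kr20(sample)` comes to roughly 0.71, and the second item separates the top two students from the bottom two completely, so its discrimination index is 1.0.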
Computation of correlation coefficients permitted comparison of reading scores with total verbal and total slide scores, and of CQT scores with total verbal and total slide scores. Student interviews and a student questionnaire revealed attitudes toward slide and verbal items.

The results showed the total slide items more difficult than the total verbal items, but less discriminating and less reliable. At the lower classification levels, verbal items were more discriminating and more reliable, but less difficult than slide items; however, at the highest level of intellectual ability, the opposite was true.

The questionnaire indicated that students believed the slide items more difficult than verbal items, but definitely worthwhile in the testing program. The highest correlation coefficient, .671, was between the slide items and the CQT scores.

A COMPARISON OF THE EFFECTIVENESS OF USING SLIDES AND NON-VISUALS AS TEST INSTRUMENTS FOR DESIGN UNDERSTANDINGS

BY
Faye L. Brasington

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
MASTER OF ARTS
Department of Textiles, Clothing and Related Arts
1966

ACKNOWLEDGMENTS

I wish to express my sincere appreciation to Mrs. Lorraine Gross and Dr. Mary Alice Burmester for their guidance in directing this research and assistance in writing and reviewing test items.

I also gratefully acknowledge the thoughtful suggestions and understanding help given by Dr. Mary Gephart, and the assistance given by Dr. Elinor Nugent, Dr. Gertrude Nygren and Mr. Robert Bullard.

To Miss Louisa Starr for her help in writing items, and Miss Bette Guiliani for her assistance with statistical procedures, I am indebted.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF ILLUSTRATIONS

Chapter
I. INTRODUCTION
  Justification
  Focus of the Study
II. REVIEW OF LITERATURE
  Comparison of Visual and Verbal Test Items
  Related Tests
  Related Research
III. METHODOLOGY
  Construction of the Test
  Classification Levels of Items
  Administration of the Test
  Data Analysis of Final Items
  Determination of the Quality of the Test
  Factor Analysis of Possible Influence on Test Results
IV. FINDINGS AND INTERPRETATIONS
  Comparison of Total Verbal and Slide Items
  Comparison of Slide Items and Verbal Items Concerning Specific Levels of Classification of Intellectual Abilities
  Other Factors of Possible Influence on the Test Results
V. SUMMARY AND RECOMMENDATIONS
  Summary
  Recommendations for Further Research
BIBLIOGRAPHY
APPENDICES

LIST OF TABLES

Table
1. Summary Data Item Analysis for Indices of Difficulty and Discrimination and Reliability for the TRA 140 Final Examination, Spring Term, 1966
2. Summary Data Item Analysis for Each Classification Level of Items for the TRA 140 Final Examination, Spring, 1966
3. Correlation Coefficients of the Individual Total Verbal Item Scores and Total Slide Item Scores Obtained by Students in TRA 140 with Their Reading Scores and CQT Scores

LIST OF ILLUSTRATIONS

Figure
1. Slide Item Illustration for the Final Examination of TRA 140, Spring Term, 1966
2. Slide Item Illustration Using a Slide Pair for the Final Examination of TRA 140, Spring Term, 1966
CHAPTER I

INTRODUCTION

The development of discernment and criticism in relation to art objects and general design is important. Modern society consists largely of consumers rather than producers of art. The design judgments and choices made are a part of our daily living.

    Design is an integral part of every man's environment. The design quality of this environment and the satisfaction it affords the individual depend upon his aesthetic sensitivity. As a basis for aesthetic sensitivity he needs to develop an awareness, creative understanding and appreciation of design, to acquire a core of knowledge concerning design, to achieve competence in making design judgments, and to establish a personal design philosophy.1

1 Course Outline for TRA 140: Matrix: Design for Living, Textiles, Clothing and Related Arts Department, Michigan State University, 1966 (in the files of the Department).

The student needs to understand the various principles, concepts and elements in design. Also, the student should develop an evaluative ability in applying the standards of design, acquire an interest in, and an appreciation of, good composition, and increase his understanding of the use of different media in the design of his environment. If these understandings and abilities are the objectives of the course, an evaluation of whether or not students have achieved these objectives is needed.

The Textiles, Clothing and Related Arts Department offers Matrix: Design for Living, TRA 140, as a basic core course in the College of Home Economics, Michigan State University. The large section lecture course incorporates slides for illustration and discussion purposes. The objectives of the course are to develop in the students:

1. Awareness of the nature of design and its manifestation in life's matrix.
2. Knowledge about the design elements, principles and concepts.
3. Some competence in utilizing the basic design elements, principles and concepts in solving design problems and making design decisions.
4. Formulation of a value system and philosophy related to design.1

1 Ibid.

Student achievement of the second and third objectives is measured by objective tests, while the first and fourth objectives are measured by student essays.

During the past academic year the testing program of TRA 140 has been the subject of an Educational Development Program.2 The project produced new testing methods and new test items and a revision of existing test items. This research is an outgrowth of the EDP project which is concerned with comparing testing methods.

2 Educational Development Program, Provost's Office, Michigan State University.

The objective tests developed from new and improved items must prove valid and reliable for effectiveness; "Further research is needed on the development and refinement of tests of aesthetic judgment, especially in regard to the measurement of judgment in specific situations."1 Elfreda Samuels, in her study concerning the construction of a test of design judgment, states:

    The need then seems to be for an instrument geared to comprehension of contemporary art education, . . . because it needs to be devised to test the types of art activities found in the average art classes of today.2

1 Marilyn Joan Horn, "The Ability of College Students to Apply Principles in Concrete and Abstract Situations and Its Relation to Art Interest" (unpublished Master's thesis, Cornell University, 1953), p. 149.
2 Elfreda C. Samuels, "The Construction of a Test of Design Judgment" (unpublished Master's thesis, Boston University, 1955), p. 7.

Justification

Evaluation, inevitable in education, normally takes the form of testing.3 The testing of large numbers of students in the elusive areas of aesthetics and discrimination necessitates a search for valid new methods of evaluation in these domains.1

3 Paul Dressel, Evaluation in Higher Education (Boston: Houghton-Mifflin Company, 1961), p. 160.
1 Project Proposal: Course Development of TRA 140--Matrix: Design for Living, Textiles, Clothing and Related Arts Department, Michigan State University (in the files of the Department).

Dressel states that conventional evaluation procedures dependent on words alone are inappropriate in attempting to measure intangible reactions.2 Many art and design educators use primarily written tests based on the instructor's lecture material. They believe "objective" evaluation to be improper in the judgment of art. However, when there are large numbers of students in a class, subjective methods of evaluation become a practical impossibility.

Researchers have for a long time been developing measuring instruments for objective means of evaluation in the area of aesthetic judgment and appreciation.

    Educator, artist, and layman alike hold measurement in the fine arts to be a controversial issue with no scientific basis or truth on either side, or on any of the many sides of the problem.3

    The issue which seems basic to all the objections raised against scientific measurement in art stems from the idea that objectivity must necessarily involve an absolute standard, that such a standard measures conformity only and is therefore in contradiction to the true meaning of art.4

2 Dressel, op. cit., p. 160.
3 Julius Heller, "Changes in Art Judgment Resulting from Courses in Art Appreciation" (unpublished Doctoral dissertation, University of Southern California, 1948), p. 1.
4 Peter A. Carmichael, "The Phantom of Critical Objectivity," Journal of Aesthetics, Vol. 9 (September, 1950), p. 13.

Such a standard could inhibit our individual responses to art and design.
It would seem, however, that reliable design judgments could be made concerning the elements, principles and concepts of design and that construction of a basic set of standards for evaluating the use of design principles and elements should be possible. There also must be criteria by which to evaluate the function of a form or object and the techniques and materials used.

The visual method of evaluating students' design understandings has produced a controversy among educators. Munro states a criticism of tests using paired pictures selected by a group of experts:

    The usual effect of such tests is to penalize all deviation from adult, conventional norms of taste in that particular environment, since the student who prefers the "right" examples gets a high grade. The relativity of aesthetic values is ignored, no allowance being made for legitimate differences in taste and style, or for the fact that different art forms may be desirable under different circumstances.1

In a study concerning visual testing procedures, Curtis and Knopp reported that this method of test administration can yield a greater coverage of test content in a unit of time than can the normal mode of test presentation.2

1 Thomas Munro, "Aesthetics as Science: Its Development in America," Journal of Aesthetics, Vol. 9 (March, 1951), p. 180.
2 H. A. Curtis and Russell Knopp, "Experimental Analyses of Various Modes of Item Presentation on the Scores and Factorial Content of Tests Administered by Visual and Audio-Visual Means: A Program of Studies Basic to Television Testing," Department of Educational Research and Testing, School of Education, Florida State University, National Defense Education Act of 1958, pp. 78-79.

Gropper noted the importance of employing testing procedures closely related to teaching methods.1 Because design theory is taught to the students in large sections of TRA 140 through the use of slides, the course committee believes that knowledge gained by this teaching method should be evaluated by using the same type of stimulus for testing.

For many areas of education, including art, Benjamin2 lists several reasons for testing with visuals:

1. Dependence upon reading as a sole means of providing test stimuli is reduced.
2. Various parts of questions can be presented almost simultaneously, without the necessity for verbal buildups or descriptions.
3. It is easier to see relationships among various parts of data in questions.
4. Pictorial or graphic representations of things, events, or situations can be fairly lifelike, making it easier for students to see relationships between the posed problem and actual application.
5. Variety is provided through pictorial, recorded, or dramatic elements in testing procedures, improving student attitudes toward testing.
6. Some students believe that evaluation situations which are not completely verbal are easier; the belief that they are better able to demonstrate their ability on such a test heightens morale.
7. Aspects of objectives which cannot be measured at all by strictly verbal means may be measurable by employing visual materials.

1 George L. Gropper, "Learning from Visuals," Audio-Visual Communications Review, Department of Audio-Visual Instruction, Washington, D.C., Vol. 14, No. 1 (Spring, 1966), p. 47.
2 Harold Benjamin, Audio-Visual Instruction Materials and Methods (New York: McGraw-Hill Book Company, 1959), p. 420.

Little is known, however, about the use of slides for testing. An experimental study should help determine the value of slides as a medium for testing students' attainment of the course objectives of TRA 140.

Focus of the Study

This research seeks to determine the relative effectiveness of verbal and slide test items in the evaluation of students' attainments of the course objectives of TRA 140.
To compare verbal and slide test items, it was necessary to construct an objective test of design understandings and judgments composed of both verbal and slide items for use as an evaluative device in TRA 140. The assumptions and hypothesis guiding the research are as follows:

Assumptions:
1. Both slide and verbal questions can be formulated covering the same basic course objectives in TRA 140.
2. The classification level tested by each item can be determined according to Bloom.1

Hypothesis: Slide questions will provide opportunities to effectively measure a higher level of intellectual skills and abilities, that of qualitative judgments, than will verbal items. Therefore, the use of visuals should prove to be a more discriminating procedure.

1 Bloom, Krathwohl and others, Taxonomy of Educational Objectives, Handbook I: Cognitive Domain (New York: Longmans, Green and Co., 1956).

CHAPTER II

REVIEW OF LITERATURE

This chapter reviews the theoretical and pertinent literature and research pertaining to the problem of visual and verbal testing methods. Areas included are: (a) a comparison of visual and verbal test items, (b) related tests, and (c) related research.

Comparison of Visual and Verbal Test Items

As a result of testing by means of both visual and verbal items, Gropper discovered that, provided a visual lesson is suitably programmed, the student can answer both pictorially and verbally stated questions about conceptual phenomena.1 The conditions permitting this suitable programming to occur appear to be those which facilitate discriminations about similarities and dissimilarities in visual situations.2 Instructional settings which provide these conditions can aid in the understanding and subsequent practice of generalized responses.1

1 George L. Gropper, "Why Is a Picture Worth a Thousand Words?" Audio-Visual Communications Review, Department of Audio-Visual Instruction, Washington, D.C., Vol. 11, No. 14 (July-August, 1963), p. 85.
2 Ibid.
1 Ibid.

Gropper discovered a statistically significant interaction between intelligence quotient and mode of stimulus presentation as measured by verbal test items only. While above average students profited more from verbal presentation, below average students benefited more from the visual presentation.2

A general expectation is that the greater the similarity between the learning situation and the testing situation, the greater would be the degree of transfer. As a result of measurement, Gropper found that the solely visual instruction led to superior performance on the visual test items, and the verbal lesson proved a more effective instructional experience for the relatively more abstract verbal items.3 "While the verbal lesson did lead to successful performance on the verbal test items, it did not prove to be superior to the visual lesson in this regard."4

2 Gropper, "Learning from Visuals," op. cit., p. 45.
3 Ibid., p. 46.
4 Ibid.

Experience with concrete visual examples in the visual lesson allowed for successful transfer either to concrete visual items or to abstract verbal test items, while concept acquisition based on a programmed verbal lesson appeared to have facilitated transfer less readily to the visual criterion test than to the verbal test. Gropper concluded on the basis of this difference in findings for visual and verbal lessons that, for transfer to occur, similarity between learning situation and testing situation may be less important when learning is based on visual materials.1

Gropper found by achievement testing "(a) that nonsignificant differences in total test scores between visual and verbal treatments were obtained and (b) that relatively high achievement levels were obtained for both treatments."2 Total test scores showed no differences in the relative effectiveness of the visual and verbal presentations.
However, differences in the relative effectiveness of the visual and verbal presentations were revealed by separate analysis based on scores on either visual or verbal test items.3

1 Ibid., p. 47.
2 Ibid., p. 44.
3 Ibid.

Related Tests

The McAdory Art Test of art appreciation was published in 1929. It contains pictures of 72 works of art which cover a wide variety of contemporary art forms, ranging from pictures of furniture and other functional objects to works of art in museums. Four versions of each art work are given, differing in shape, arrangement, shading and use of color. The person being tested is to rank the four versions in terms of his preferences. Its dependence on contemporary art values of 1929 produced a primary weakness of the test.1 The test was validated by 100 judges ranging from department store workers to competent lay critics and art producers. Meier writes in the Mental Measurements Yearbook:

    . . . save for the possibility that time may outmode some of the prevailing standards on which both the scoring norms and the consensus were based, the test represents a definite achievement in providing a test of general art appreciation.2

The Meier Art Judgment Test uses the altered-version type of item for measuring art appreciation. It differs from the McAdory in that only one alternate version is given for each art work, and the examples concern relatively timeless art masterpieces.3 Meier believes that a work of art can be judged on the basis of its organization through an understanding of the functioning of principles basic in all art.4 Each example in his test contains some principle or principles which have been singled out for manipulation in one version, so that the two versions presented are nearly identical, but with one having the functioning of the principle impaired. The test was originally known as the Meier-Seashore Art Judgment Test, published in 1929. Revised in 1949, it became the Meier Art Judgment Test.

1 Jum C. Nunally, Educational Measurement and Evaluation (New York: McGraw-Hill Book Company, Inc., 1964), p. 298.
2 Oscar K. Buros (ed.), The Nineteen Forty Mental Measurements Yearbook (New Jersey: Mental Measurements Yearbook, 1941), p. 146.
3 Nunally, loc. cit.
4 Norman Charles Meier, The Meier Art Tests, Examiner's Manual, Bureau of Educational Research and Service (Iowa City: State University of Iowa, 1942), p. 7.

The Graves Design Judgment Test measures certain components of aptitude for the appreciation or production of art structure. The test measures the degree to which a subject perceives and responds to the basic principles of aesthetic order--unity, dominance, variety, balance, continuity, symmetry, proportion, and rhythm.1 The items consist entirely of abstract designs in an attempt to be as removed as possible from traditional and contemporary art values. Each item consists of two or three versions of the same basic design, the altered version or versions being constructed to violate a basic aesthetic principle. In a review of this test, Nunally states that the test is a useful measure but adds that only a small amount of empirical work has been done with the instrument.2

1 Maitland Graves, Design Judgment Test Manual (New York: The Psychological Corporation, 1948).
2 Nunally, op. cit., p. 300.

The Crow Picture Interpretation Test was published in 1926; its purposes were: (a) to measure the ability of students to look at pictures and give aesthetic and thoughtful interpretations of them; (b) to create a wider interest in the study of good pictures in the public schools; (c) to aid teachers in understanding the difficulties of students in looking at pictures; (d) to enable teachers to measure progress of students in picture interpretation by determining standards for the various grades; (e) to enable both teachers and students to see more in pictures and get greater pleasure from them.1

The test consists of a booklet of questions and answers, and an envelope of eight copies of masterpieces. The questions are concerned with the pupil's interpretation of details, aesthetic responses to details, the meaning and beauty of the picture, and also points of contact between the pupil's experiences and the experience interpreted in the picture.2

1 Alfred S. Lewerenz, "A Critical Analysis of the Elemental Abilities Required in Art Education with a View to Possible Objective Measurement" (unpublished Master's thesis, University of Southern California, Los Angeles, 1927), p. 36.
2 Ibid., p. 7.

Lewerenz's test in the Fundamental Abilities of Visual Art was constructed to enable teachers to measure students' capacities and skills. It is an easily administered and scored group test. The test has nine forms:

1. Recognition of color.
2. Observation of light and shade.
3. Visual memory of proportion.
4. Originality in line drawing.
5. Recognition of proportion.
6, 7, and 8. Analysis of perspective.
9. Knowledge of subject matter.

The test was validated on the basis of what was taught in art courses in the Los Angeles schools. The student was to choose the best of four bowls, cornices, curves, composition of landscapes, or other design examples, and, in addition, he was to make ten original line drawings.
In the Nineteen Forty Mental Measurements Yearbook, Faulkner says this about the Lewerenz tests:

    It is of little value to those who believe that art is an integrated activity rather than a series of separate skills, nor is it of great value to those who believe that an approach to art through such general and abstract art elements as light and shade, color, and proportion is less desirable than through such specific fields as architecture, industrial art, and the like. Thus its value is highly dependent on one's philosophy and psychology of art.1

1 Buros, op. cit., p. 149.

A Test for Art Appreciation by Karwoski and Christensen, published in 1926, includes 28 questions of three different forms. One form is the comparison of two examples, one good and the other poor. Five reasons are provided for choice under the paired pictures; the subject is to choose one reason. In the second form the subject judges a single picture and checks one of the five reasons for the preference. The third form is concerned with selecting the best of four examples of similar subjects. The authors believe art appreciation can be tested by forcing the subject to give an opinion of why one art form is preferred over another. The pictures are in the areas of painting, architecture, sculpture, industrial arts, abstract design, and color.1 A revised version in 1933 included the areas of automotive, flatware, furniture, and costume design.

Related Research

Johnson constructed and evaluated a test designed to determine the degree of intellectual and aesthetic response to painting. He was concerned with reactions to content, composition, color, line, form, and tone. The test consisted of 140 verbal items cast in multiple-choice form, each referring to one of seven pictures selected for visualization of the factors being tested. The test proved a reliable measure of the concepts being measured and a valid instrument of the aptitude of art appreciation in a verbalized situation.2

1 Theodore Karwoski and Erwin Christensen, "A Test for Art Appreciation," Journal of Educational Psychology, Vol. 17 (March, 1926), pp. 187-194.
2 Dana D. Johnson, "The Construction and Evaluation of a Test of Aesthetic Reactions and Understandings--Paintings" (unpublished Master's thesis, School of Education, Boston University, 1954).

Heller conducted an investigation to evaluate art judgment in courses of art appreciation at the university level with contemporary evaluative materials. He wished to discover to what extent art judgment can be measured at the university level, and to what extent it can be changed by instruction. A test instrument in art judgment composed of thirty-nine pairs of pictures was constructed. The pictures were selected according to the availability of the items pictured, their suitability for projection, and the actual situations in life which demand certain judgments to be made. The test form was one based on pairs of pictures which could be projected simultaneously on a large screen. The student was directed to choose the picture he preferred. Heller felt that his investigation indicated that art judgments can be measured, and that they can be changed by instruction.1

1 Heller, op. cit.

Horn was concerned with the relationships of abstract art principles, and their application in specific areas of design. She hypothesized that an understanding of the principle or abstraction does not assure an understanding of its application to the concrete or specific situation. In addition to support for her hypothesis, she also found that, conversely, an understanding of the specific does not lead to the development of an abstract art concept which can be applied equally well in a number of situations. Two visual art tests and an art interest inventory were given to 275 college students.
One of the visual tests used was the Graves Design Judgment Test for measuring application of design principles to abstract forms. The other test was a combination of the Brief Form of the McAdory Art Test with some plates from other selected art tests. The latter was designed to measure the application of design principles to concrete forms. Horn found that the ability to make art judgments can be improved with training; however, training appears to be more successful in increasing knowledge and understanding of art principles in regard to abstract forms than to concrete forms in the specific areas of design.1

Samuels constructed a test of design judgment for junior high school pupils in Framingham, Massachusetts. One purpose of the test was to measure the students' abilities to exercise good art judgment in the field of design. The test included 22 abstract designs, each illustrating one or more basic art principles and the use of art elements. The test was thought to be valid, but was very low in reliability. It was felt that the test should be lengthened and administered to a higher age group. Each test item consisted of two designs, one better than the other. The student was to select the better of the two.2

1 Horn, op. cit.
2 Samuels, op. cit.

CHAPTER III

METHODOLOGY

Construction of the Test

Development and sources of the items

When the experimental design for this study was developed, it was assumed that both verbal and slide items could be formulated covering the same basic understandings taught in TRA 140. The first step was to devise machine scorable items to test student understanding of all areas of design theory included in the course. The subject matter areas included in this course are:

A. Design elements
   1. Form
   2. Color
   3. Texture
B. Design principles
   1. Balance
   2. Emphasis
   3. Proportion and scale
   4. Rhythm
   5. Harmony--unity
C. Design concepts
   1. Criteria for design judgment
   2. Design and culture
   3. Timelessness and obsolescence
   4. Materials and techniques
   5. Form and function
   6. Enrichment
   7. Originality, expression and beauty

A study group1 was formed to improve test items and methods for teaching TRA 140. This group was responsible for writing, reviewing, revising, and rejecting test items for the course.

1 Mrs. Lorraine Gross, instructor of TRA 140, Dr. Mary Alice Burmester, representative for the Educational Development Project, Miss Louise Starr, graduate teaching assistant for TRA 140, and the writer.

In order to compare verbal items and slide items, it was necessary to construct items in pairs, each pair containing one item of each type. In writing paired slide and verbal items, the general procedure was to determine the kinds of questions which could be used to test the different course objectives. The next step was to select pictures or objects for photographing about which an item or items could be formulated concerning the course objective or subject matter area under consideration. All slide possibilities were presented to the study group for discussion and analysis. If a slide item was accepted by the group, a verbal item was written which would be parallel to the particular slide item in subject matter and difficulty.

The majority of slide items were cast in key form, the answer being selected from a key list of possible answers. The key for each paired verbal item was generally the same as that for its corresponding slide item. An illustration of such paired items is:

Verbal item: "Yellow and orange."
Slide item (see Figure 1, page 22): a picture of a yellow and orange object.

The answer for each item was to be selected from the following key:

1. Monochromatic
2. Complementary
3. Analogous
4. Triadic
5. None of the above

Occasionally the paired slide and verbal items were not cast in the same form. This occurred with some slide items which required the student to make a judgment. An example of such a pair is:

Verbal item: Which of the following most violates the principle of emphasis?

1. Pale yellow wall with a red-orange and blue chair.
2. Green walls with white cabinets.
3. Large black stars on a white floor in a bathroom.
4. Bright pillows on a white couch.

Slide item (see Figure 2, page 23).

Figure 1.--Slide item illustration for the final examination of TRA 140, spring term, 1966. (Note that the colors in this reproduction are not comparable to the slide used, which showed a much brighter yellow and orange, and an almost non-existent blue.)

Table 2. Summary data item analysis for each classification level of items for the TRA 140 final examination, Spring, 1966. [Table not reproduced.]

... applied.1 Thirteen verbal items and twenty-three slide items fell within this level. Again the slide items were more difficult, but less discriminating and less reliable than the verbal items, although the differences were not as great as at the 2.00 level.

Classification Level 6.00--involves judgments concerning the value of material and methods for specific purposes; the use of a standard of appraisal.2 The results at this level are opposite to those at levels 2.00 and 3.00. The twenty slide items concerned were slightly less difficult than the sixteen verbal items, but the slide items were more discriminating and considerably more reliable. The hypothesis for this research is here supported in finding a higher discrimination with the slide items at this higher classification level. It was also found that at this level the writing of slide items was less difficult than the writing of verbal items. The students agreed in the questionnaire that, were slide questions eliminated, it would not be possible to test all information taught in TRA 140 (see Appendix B).
The writer feels that this attitude would be reflected to the greatest extent at this classification level.

1Ibid., p. 205.
2Ibid., p. 207.

Other Factors of Possible Influence on the Test Results

Reading scores

The correlation coefficients indicating the relationship of reading scores with verbal and slide items are .343 and .397, respectively, as can be seen in Table 3. Although a positive relationship is present in each case, the coefficients are relatively low. There is no significant difference between the two correlations, and thus it cannot be predicted that a student with a high score on the reading test will obtain a higher score on either the verbal or slide items.

Table 3. Correlation coefficients of the individual total verbal item scores and total slide item scores obtained by students in TRA 140 with their reading scores and CQT scores

   Variables                              Correlation Coefficients
   Verbal scores and reading scores       .343
   Slide scores and reading scores        .397
   Verbal scores and CQT scores           .392
   Slide scores and CQT scores            .671

CQT scores

The correlation coefficients between CQT scores and verbal and slide scores are .392 and .671, respectively. This would indicate that a student obtaining a high score on the CQT would be more apt to obtain a consistently high score on the slide items than on the verbal items. The hypothesis is in agreement with this result, especially at the 6.00 classification level, as it was anticipated that slide items would more effectively measure learning of a higher ability level.

Student questionnaire

Certain attitudes expressed by the students in answer to the questionnaire have been included in this thesis where pertinent. However, additional results of interest may be noted.

Although the data analysis results indicated that verbal items are preferable except at the highest classification level, student responses on the questionnaire would indicate that slide items do indeed occupy a significant role.
One such response was that, were slide items eliminated, it would not be possible to test all information taught in TRA 140. In addition, nearly all of the students felt that the expectation of slide items on an examination caused them to be more attentive in class. They did, however, feel that there may be more than one correct answer to a slide question, and that personal opinion was a factor. These attitudes, although not universal, do perhaps indicate that slide items should be used at more than just the highest classification level. They also show that great care must be taken in the structuring of slide questions.

CHAPTER V

SUMMARY AND RECOMMENDATIONS

Summary

One method of evaluation used for the TRA 140 course at Michigan State University is objective examination. Because the course is taught with the aid of slides, it has been of interest to know whether objective testing is more effective with the use of slide items. Therefore, the purpose of this research was to determine whether slide items or verbal items are more effective as testing tools for the TRA 140 course. The hypothesis that slide items would more effectively measure a higher level of intellectual skill and would be more discriminating than verbal items was supported in that slide items were more discriminating at the highest classification level.

A study group was formed in connection with an Educational Development Project. The group wrote, analyzed and revised test items. The items were pretested, and many were eliminated so that sixty verbal items and sixty slide items remained on the final examination. The majority of these items were paired in subject matter and difficulty.

Validity of the examination was based on the face validity of the items, and also on the results of the item analysis. The Kuder-Richardson method of inter-item consistency was used as a measure of reliability.
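The item statistics named in this summary can be illustrated with a minimal sketch. It rests on several assumptions not stated in the text: each item is scored 1 (correct) or 0 (incorrect); the difficulty statistic is taken as the proportion answering correctly, so that a lower value means a harder item; discrimination is taken as the simple difference between upper and lower scoring groups (not the Flanagan index, and not necessarily the index used by the Scoring Office); and reliability is computed with Kuder-Richardson formula 20, since the text does not say which Kuder-Richardson formula was applied. All function names and the sample data are hypothetical.

```python
from statistics import pvariance


def item_difficulty(responses):
    """Proportion of students answering each item correctly (assumed definition)."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_students for j in range(n_items)]


def upper_lower_discrimination(responses, fraction=0.27):
    """Proportion correct in the top-scoring group minus that in the bottom group.

    The 27 percent group size is an assumption made for this illustration.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    n_items = len(responses[0])
    return [
        sum(row[j] for row in upper) / k - sum(row[j] for row in lower) / k
        for j in range(n_items)
    ]


def kr20(responses):
    """Kuder-Richardson formula 20 for items scored 0 or 1."""
    k = len(responses[0])
    p = item_difficulty(responses)
    sum_pq = sum(pi * (1 - pi) for pi in p)
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_pq / total_variance)


# Hypothetical data: six students (rows) by four items (columns); not thesis data.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print("difficulty:    ", item_difficulty(sample))
    print("discrimination:", upper_lower_discrimination(sample))
    print("KR-20:         ", round(kr20(sample), 3))
```

Applied separately to the sixty verbal items and the sixty slide items, statistics of this kind would permit the comparison of the two item types described in this chapter; the figures reported in the thesis come from the Scoring Office, not from this sketch.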
An item by item analysis including difficulty and discrimination was used for interpreting the results of the examination. In addition, the Flanagan method was used to compute another index of discrimination for each item. The results were tabulated so that information was available on the total verbal items, the total slide items, and the verbal and slide items at each classification level.

Because the effect of some other factors was anticipated, reading scores and CQT scores were obtained for each student taking the examination. A coefficient of correlation was then computed between each of these scores and the total verbal scores and the total slide scores for each student.

Twelve interviews with students who had taken TRA 140 the preceding term were used as a basis for the preparation of a questionnaire answered by the students taking the course Spring Term, 1966. The questionnaire was designed to determine student attitudes toward verbal and slide items.

The findings revealed that the total slide items were more difficult for the student than the total verbal items; however, the verbal items were slightly more discriminating and more reliable than the slide items. Therefore, it would appear preferable to use verbal items whenever possible. An analysis of the items at each classification level indicated that verbal items are probably better to use at the lower classification levels because they were more reliable and more discriminating than slide items. However, these results were reversed at the highest classification level, and thus slide items appear preferable at this level.

Because the questionnaire revealed strong student attitudes that slide items occupy a position of value, slide items should possibly be used at more than just the highest classification level.

The highest correlation coefficient was between the CQT scores and the total slide scores.
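For reference, a coefficient of correlation of the kind reported here can be sketched as follows. This is an illustration only: it assumes the Pearson product-moment coefficient (the text does not name the formula used), and the function name and the scores shown are hypothetical, not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length, at least 2 each")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five students; not data from the thesis.
aptitude_scores = [1, 2, 3, 4, 5]
slide_scores = [2, 1, 4, 3, 5]

if __name__ == "__main__":
    print(round(pearson_r(aptitude_scores, slide_scores), 3))
```

In the study, one such coefficient would be computed for each of the four pairings of reading or CQT scores with total verbal or total slide scores.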
The indication is that those students obtaining the higher CQT scores should score above the others on the slide items. Thus, to this extent, slide items give the advantage to the students with greater ability.

Recommendations for Further Research

Repeat the testing using two matched groups of students and administering the slide items to one group and the verbal items to the other group.

Administer the total examination to two groups, alternating the sequence of verbal and slide items.

Conduct further research to discover which students excel on verbal items and which students excel on slide items and the causes for the excellence.

BIBLIOGRAPHY

Anastasi, Anne. Psychological Testing. New York: Macmillan Book Co., 1954.
Bean, Kenneth L. Construction of Educational and Personnel Tests. New York: McGraw-Hill Book Company, Inc., 1953.
Benjamin, Harold. A V Instruction Materials and Methods. New York: McGraw-Hill Book Company, Inc., 1959.
Bloom, Krathwohl, et al. Taxonomy of Educational Objectives, Handbook I: Cognitive Domain. New York: Longmans, Green and Co., 1956.
Buros, Oscar K., Editor. The Nineteen Forty Mental Measurements Yearbook. New Jersey: Mental Measurements Yearbook, 1941.
Carmichael, Peter A. "The Phantom of Critical Objectivity," Journal of Aesthetics, Vol. 9, Sept., 1950.
College Qualification Tests Manual. New York: The Psychological Corporation, 1957.
Course Outline, for TRA 140: Matrix: Design for Living, Textiles, Clothing and Related Arts Department, Michigan State University, 1966 (in the files of the Department).
Curtis, H. A., and Knopp, Russell. "Experimental Analyses of Various Modes of Item Presentation on the Scores and Factorial Content of Tests Administered by Visual and Audio-Visual Means: A Program of Studies Basic to Television Testing." Department of Educational Research and Testing, School of Education, Florida State University, National Defense Education Act of 1958.
Dressel, Paul. Evaluation in Higher Education.
Boston: Houghton-Mifflin Company, 1961.
Educational Development Program, Provost's Office, Michigan State University.
Flanagan, John C. "General Considerations in the Selection of Test Items and a Short Method of Estimating the Distribution," The Journal of Educational Psychology, Dec., 1939.
Graves, Maitland. Design Judgment Test Manual. New York: The Psychological Corporation, 1948.
Gropper, George L. "Learning from Visuals," A V Communications Review, Department of Audio-Visual Instruction, Washington, D.C., Vol. 14, No. 1, Spring, 1966.
Gropper, George L. "Why Is a Picture Worth a Thousand Words?" A V Communications Review, Department of Audio-Visual Instruction, Washington, D.C., Vol. 11, No. 14, July-August, 1963.
Heller, Julius. "Changes in Art Judgment Resulting from Courses in Art Appreciation," Dissertation, University of Southern California, 1948.
Horn, Marilyn Joan. "The Ability of College Students to Apply Principles in Concrete and Abstract Situations and Its Relation to Art Interest," Thesis, Cornell University, 1953.
Johnson, Dana D. "The Construction and Evaluation of a Test of Aesthetic Reactions and Understandings--Paintings," Thesis, School of Education, Boston University, 1954.
Karwoski, Theodore, and Christensen, Erwin. "A Test for Art Appreciation," Journal of Educational Psychology, Vol. 17, March, 1926, 187-194.
Lewerenz, Alfred S. "A Critical Analysis of the Elemental Abilities Required in Art Education with a View to Possible Objective Measurement," Thesis, University of Southern California, Los Angeles, California, 1927.
Meier, Norman Charles. The Meier Art Tests, Examiner's Manual. Iowa City: Bureau of Educational Research and Service, State University of Iowa, 1942.
Munro, Thomas. "Aesthetics as Science: Its Development in America," Journal of Aesthetics, Vol. 9, March, 1951.
Nunally, Jum C. Educational Measurement and Evaluation. New York: McGraw-Hill Book Company, Inc., 1964.
Project Proposal: Course Development of TRA 140--Matrix: Design for Living, Textiles, Clothing and Related Arts Department, Michigan State University (in the files of the Department).
Reading Test Form A62. Office of Evaluation Services, Michigan State University.
Samuels, Elfreda C. "The Construction of a Test of Design Judgment," Thesis, Boston University, 1955.
Thorndike, Robert, and Hagen, Elizabeth. Measurement and Evaluation in Psychology and Education. New York: John Wiley and Sons, 1961.
University of Minnesota Classroom Teaching Bulletin.
Wood, Dorothy Adkins. Test Construction, Development and Interpretation of Achievement Tests. Columbus, Ohio: Charles E. Merrill Books, Inc., 1960.

APPENDICES

APPENDIX A

INTERVIEW SCHEDULE

(Questions Presented to Students in Interviews)

1. Were slide questions easier or harder than verbal questions? Why?
2. Did you feel that the slide questions were objectively selected?
3. Did you feel able to make a definite choice when asked to choose which was the better of a pair of slide examples?

APPENDIX B

Name                    Student No.

Key:
   1. Strongly agreed
   2. Agreed
   3. Disagreed
   4. Strongly disagreed

1. I think that slide questions are harder than verbal questions.
2. When judging paired slides, I find it difficult to choose which one is better.
3. It is difficult for me to picture what is being asked in a verbal question.
4. I like slide questions better because I can actually see what the question is asking.
5. I think that there may be more than one correct answer to a slide question because people see things differently.
6. I think that slide questions are ambiguous.
7. Slide questions are of value to me because I feel that it is more important that I can see the design qualities than to talk about them abstractly.
8. I think that verbal questions are harder than slide questions.
9. I feel that slide questions are expressive of the instructor's values and opinions.
10. I find that I must study differently for an exam when I know that slide questions will be asked.
11. I find that expecting slide questions on an exam causes me to be more attentive to the slides shown in lectures.
12. If slide questions were eliminated, I think that it would still be possible to test all information taught in TRA 140.
13. I feel that the correct answers to the slide questions are based on design criteria and would be selected by other people trained in design.