THE DEVELOPMENT OF TEST ITEMS FOR THE INTEGRATED SCIENCE PROCESSES: FORMULATING HYPOTHESES AND DEFINING OPERATIONALLY

Thesis for the Degree of Ph.D.
MICHIGAN STATE UNIVERSITY
DARREL W. FYFFE
1971

This is to certify that the thesis entitled THE DEVELOPMENT OF TEST ITEMS FOR THE INTEGRATED SCIENCE PROCESSES: FORMULATING HYPOTHESES AND DEFINING OPERATIONALLY presented by Darrel Wayne Fyffe has been accepted towards fulfillment of the requirements for the Ph.D. degree. Major professor. Date: October 25, 1971

ABSTRACT

THE DEVELOPMENT OF TEST ITEMS FOR THE INTEGRATED SCIENCE PROCESSES: FORMULATING HYPOTHESES AND DEFINING OPERATIONALLY

By Darrel Wayne Fyffe

Problem

Instructional innovation and implementation to develop process skills have become a concern of many classroom teachers. A problem associated with this developing concern is that present evaluation techniques and instruments are inadequate to assess the acquisition of process skills. This study was an attempt to develop group test items which can be shown to measure process skill acquisition in two integrated processes.

Literature

A review of the literature points out that the development of methods of teaching and evaluating the process skills has recently come into greater demand. The dramatic increase in our accumulated knowledge, and the lack of evidence that process acquisition is a by-product of content-oriented education, have convinced many science educators that emphasis should be placed on the development of science processes.
Several new elementary science programs are available to help teach process skills, but inadequate evaluation instruments are available to measure the acquisition of these skills.

Procedure

Thirty-six multiple-choice test items over two integrated processes, Formulating Hypotheses and Defining Operationally, were developed using the behavioral objectives of Science - A Process Approach as a basis. These items were assembled into a 79-item group test in cooperation with Richard W. Robison. Simultaneously, individual testing of 56 students, using the Individual Competency Measures from Science - A Process Approach, was completed. Each student, who had studied from Science - A Process Approach materials the previous year at Kinawa Middle School, Okemos, Michigan, was tested for a total of two hours. The group test was then administered to those same students. The scores on the Individual Competency Measures served as the external criterion measure for selection of the upper and lower 27 per cent categories necessary to calculate group test item discrimination indices. Analysis of data from the responses included determining which items had a high (.20 or greater) index of item discrimination.

Results

Ten items for the process of Formulating Hypotheses and eleven items for the process of Defining Operationally had an index of item discrimination of .20 or greater. Because an external criterion was used, the items with that index of item discrimination are considered satisfactory discriminators. Those items, with indices of item discrimination of .20 or greater, were then considered as a subtest of the 36 items. Student answer sheets were scored using only those items. From the two sets of scores, Individual Competency Measures and subtests from the group test items, Pearson product-moment correlation coefficients were computed.
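The item-selection procedure summarized above — ranking students by an external criterion, forming upper and lower 27 per cent groups, computing a discrimination index for each item, and then correlating subtest scores with criterion scores — can be sketched in a few lines of Python. This is a hypothetical illustration, not the author's own computation; the function names and the toy data are invented for the example.

```python
# Sketch of the item analysis described above. Hypothetical illustration:
# names and data are invented; the thesis reports only the method.

def discrimination_index(criterion_scores, item_correct, fraction=0.27):
    """Proportion correct in the upper criterion group minus the
    proportion correct in the lower group. The groups are chosen by the
    EXTERNAL criterion (Individual Competency Measure scores), not by
    total score on the test itself."""
    n = len(criterion_scores)
    k = max(1, round(n * fraction))
    # Rank student indices by criterion score, lowest first.
    order = sorted(range(n), key=lambda i: criterion_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Toy data: criterion scores for ten students and one item's responses.
icm = [12, 15, 9, 20, 18, 7, 14, 16, 11, 19]   # criterion (ICM) scores
item = [0, 1, 0, 1, 1, 0, 1, 1, 0, 1]          # 1 = answered correctly
d = discrimination_index(icm, item)
keep = d >= 0.20   # the acceptance threshold used in the study
```

An item whose index falls below .20 would be dropped; the surviving items form the subtest, whose scores are then correlated with the criterion scores via `pearson_r`.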
It was found that Individual Competency Measure scores for the two integrated processes are correlated, at the .001 level of significance, with their representative items from the group test.

Conclusion

A study of the results leads to the conclusion that it is possible to measure the extent of acquisition of skills in the integrated science processes, Formulating Hypotheses and Defining Operationally, if selection of test items is based upon reference to a criterion measure.

Recommendations

In view of the study, it is recommended that:

1. Measurement of acquisition of skills in the Integrated Processes should be based upon a criterion measure to determine validity. The Individual Competency Measures from Science - A Process Approach can serve as the criterion measure.

2. The procedure employed in this study to develop, administer, and evaluate test items is recommended. Although this procedure is time consuming, it is well worth the effort to obtain criterion-related validity.

3. Research should be initiated which will provide additional sources of test items and data from which to select appropriate written tests of process skill acquisition.

The procedures described in this study should prove valuable in the development of test items and instruments for objectives other than those of the processes. The use of a criterion score as a basis for choosing items should be adaptable to many fields.

THE DEVELOPMENT OF TEST ITEMS FOR THE INTEGRATED SCIENCE PROCESSES: FORMULATING HYPOTHESES AND DEFINING OPERATIONALLY

By Darrel Wayne Fyffe

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

College of Education

1971

© Copyright by DARREL WAYNE FYFFE 1972

ACKNOWLEDGEMENTS

The constant interest and perceptive criticism of my advisor, Dr. Glenn D. Berkheimer, are gratefully acknowledged. Through his effort and insight many advances were accomplished in the research.
Additionally, the assistance of Drs. Charles A. Blackman, Jack B. Kinsinger, and, especially, Robert L. Ebel in their capacity as members of my advisory committee is appreciated. A special note of thanks is extended to Richard Wayne Robison, my partner in the research, and his advisor, Dr. Richard J. McLeod, for their devotion of extensive time to the study.

The staff of the Okemos Kinawa Middle School was most cooperative in accepting this research effort at their school. Without their help and the excellent student interest the study could not have been completed.

Most importantly, my wife, Jean, and our children, Sharon and James, are due the greatest "thank you." A study of this type requires many sacrifices on their part. My wife also served as an ever-available sounding board for ideas; the contribution of her effort to the study is immeasurable.

ii

TABLE OF CONTENTS

ACKNOWLEDGMENTS . . . ii
LIST OF TABLES . . . v
LIST OF APPENDICES . . . vi

CHAPTER
I. THE PROBLEM . . . 1
   Background . . . 1
   Need . . . 4
   Purpose . . . 5
   Design of the Study . . . 5
   Nature of Materials . . . 6
   Level of Discrimination for Accepting or Rejecting Items . . . 7
   Assumptions and Limitations of the Study . . . 7
   Overview of the Thesis . . . 8
II. LITERATURE REVIEW . . . 9
   Trends in Science Education . . . 9
   Summary of Trends . . . 17
   Definition of Processes . . . 18
   Summary of Definitions . . . 22
   Problems in the Evaluation of Processes . . . 22
   Recent Efforts in Evaluating Process Skills . . . 24
   Summary of Review of the Literature . . . 28
III. PROCEDURES . . . 30
   Design of the Study . . . 30
   Selection of the Students and School . . . 31
   Selection of Processes and Competency Measures . . . 33
   Development of Test Items . . . 34
   Administration of Competency Measures . . . 37
   Administration of Test Items . . . 39
   Data Collection and Analysis . . . 40
   Summary . . . 41
IV. ANALYSIS OF RESULTS . . . 43
   The Problem . . . 43
   Level of Discrimination for Accepting or Rejecting Items . . . 43
   Individual Competency Measure Analysis . . . 43
   Test Item Analysis . . . 47
   Correlation of Individual and Group Scores . . . 53
   Summary . . . 60
V. SUMMARY AND CONCLUSIONS . . . 61
   Summary of Findings . . . 61
   Conclusion . . . 62
   Recommendations . . . 63
   Problems and Implications for Further Research . . . 63
   Implications for the Classroom Teacher . . . 65
   Implications for Science Education . . . 66
BIBLIOGRAPHY . . . 68
APPENDICES . . . 71

LIST OF TABLES

Table 1. The Percentage of Students Correctly Performing Each Task . . . 44
Table 2. Frequency and Percentile Rank of Total Scores for the Individual Competency Measures . . . 46
Table 3. Indices of Discrimination for the Formulating Hypotheses Items . . . 49
Table 4. Indices of Discrimination for the Defining Operationally Items . . . 51
Table 5. Value of Pearson Product-Moment Correlation Coefficient Required for Specific Levels of Significance . . . 56
Table 6. Correlation of Individual Competency Measure Scores and Group Test Scores for Each of the Four Processes . . . 57
Table 7. Correlation and Significance of Formulating Hypotheses Individual Competency Measure Scores with Scores from the Group Test Items for Formulating Hypotheses . . . 58
Table 8. Correlation and Significance of Defining Operationally Individual Competency Measure Scores with Scores from the Group Test Items for Defining Operationally . . . 58
Table 9. Correlation of Individual Competency Measure Scores and Group Test Scores . . . 59

LIST OF APPENDICES

Appendix A. Sample Record Sheets and Work Sheets for Individual Competency Measures . . . 71
Appendix B. Group Test Items with Photographs and Script to Represent Slide Presentation . . . 79
Appendix C. Item Analysis Data for Formulating Hypotheses Items . . . 97
Appendix D. Item Analysis Data for Defining Operationally Items . . . 106

CHAPTER I

THE PROBLEM

Background

During the past decade the elementary education curriculum has been in constant change. One trend within the field of science education has been that of placing increasing emphasis upon the acquisition of strategies of inquiry or problem solving plans.1 A problem solving plan is thought of as a "highly individualized" sequence of "mental skills."2 These skills are sometimes known as processes and involve the ability to receive information, interpret data, develop conclusions, and propose solutions to problems. Although the role and choice of these processes has not been well defined, processes have come to be recognized as instructional concerns of the classroom teacher. Consequently, many suggestions have been published regarding ways in which children may be taught to develop processes. In fact, processes have become the focal point for several curricular projects in the natural and social sciences.3

1 Richard W. Burns and Gary D. Brooks, "What Are Educational Processes?" The Science Teacher, XXXVII (February, 1970), 27-28.

2 Ibid.

3 Terry Borton, "What's Left When School's Forgotten?" Saturday Review, LIII (April 18, 1970), 69-71, 79.

The use of a process approach has been incorporated as the means of teaching skills in the social sciences in Man: A Course of Study4 and Materials and Activities for Teachers and Children.5
In the natural sciences, one course that has been widely used is Science - A Process Approach, written under the auspices of the American Association for the Advancement of Science, in which "the process method of science teaching"6 is employed in organizing the entire program. The intent of the process method is to present problems to elementary students in such a manner that process skills are needed to find solutions. That is, the use of mental skills is required to state the problem, acquire data, develop conclusions, and present a solution to that problem. The organizational scheme of Science - A Process Approach is based upon a sequential, hierarchical development of the processes. Students are introduced to the processes by science activities which require each student to use particular processes to solve the problems. The problems have been carefully sequenced so that the necessary skills are developed through the student laboratory exercises. Each process has associated with it several skills, in varying degrees of difficulty, which are organized in a hierarchy.

4 J. S. Bruner and P. B. Dow, Man: A Course of Study: A Description of an Elementary Social Studies Curriculum, (Cambridge, Massachusetts: Education Development Center, 1967).

5 F. H. Kresse, Materials and Activities for Teachers and Children: A Project to Develop and Evaluate Multi-Media Kits for Elementary Schools, Volumes I and II, (Washington, D. C.: U.S.O.E. Final Report, Project #5-0710, 1968).

6 American Association for the Advancement of Science, "The Process Method of Science Teaching," Grade Teacher, LXXXIII (January, 1966), 59-61, 113.

As was stated earlier, the trend in science education has been toward emphasizing the acquisition of process skills. A problem related to that trend is that, as yet, teachers do not have a time-economical evaluation procedure available to determine the extent to which their students have acquired the desired process skills.
In anticipation of, and as a partial solution to, this concern, Competency Measures were developed for Science - A Process Approach. The Competency Measures, as one procedure to evaluate process skill acquisition, were available in two forms when this study was initiated: 1) Individual, for Kindergarten through grade six, and 2) Group, for grades four through six. Both forms consist of:

...tasks intended to assess the achievement of the objectives for each exercise.... In the ... competency measure each required task is described for the tester so that he can observe (the desired) behavior.7

7 American Association for the Advancement of Science, An Evaluation Model and Its Application, Second Report, (Washington, D. C., 1968), pp. 9-10.

The Individual Competency Measures are time-inefficient as evaluative devices, when used in the manner recommended, because they are intended to be administered to one student at a time. This procedure requires the teacher to interview each student successively after each learning activity. To conserve time, the use of the Individual Competency Measures in experimental classrooms was limited to administration, after each exercise, to a random sample of three students in each classroom. That procedure provided data on those three students and, by collecting information from hundreds of classrooms, for the Individual Competency Measure. However, it did not prove adequate as an evaluative procedure for each student in an entire class. Even with the best student-teacher ratio, in ordinary classroom use, the individual testing requires more time than is usually available. The Group Competency Measures, as an alternative, have many of the same limitations. They are intended for use with: small groups of three to six students, students who have knowledge of the terminology of Science - A Process Approach, and classes immediately following completion of a particular exercise from Science - A Process Approach.
These restrictions are serious enough that the use of the Group Competency Measures will be confined to those classes using a major portion of the program and with sufficient personnel to adequately test all students.

Need

If we accept the position that acquisition of process skills is one objective of education, then it follows that a need exists for time-economical evaluative means to determine the extent of the acquisition of those process skills. Certain characteristics of present measurement techniques, which will be treated in more detail in the following chapter, must be improved if measurement instruments are to be developed which will be of value to researchers in science education and, eventually, to science teachers. For maximum utilization, a measurement instrument should meet three conditions: it must be presented in a language and style suited to the particular age group of interest, it must be supported by reference to some criterion measure, and it must be designed for class-size groups of children. Perhaps the most difficult of the three conditions is supporting the instrument by reference to some criterion measure. A criterion measure must be chosen for which the majority of opinion and evidence is one of confidence and reliability. The validity of such an instrument should not be seriously impugned by expert opinion in the field of science education.

Purpose

The purpose of this study is to develop test items for use with seventh grade students to measure the acquisition of two integrated science processes. The procedure to accomplish this purpose includes writing, reviewing, administering, and analyzing test items. The processes selected are Formulating Hypotheses and Defining Operationally, which have been defined in terms of behavioral objectives and measured by use of Competency Measures in Science - A Process Approach.
It is from these behavioral objectives and Individual Competency Measures that a valid set of criterion measures will be selected for the study. By comparing the analysis data for each test item to the data received from the respective Individual Competency Measures, a selection of items which adequately test the process skills may be made. In conclusion, the problem for this study is the development, administration, analysis, and publication of written test items which measure the extent of acquisition of skills in two integrated processes, among seventh grade students, with the use of Individual Competency Measure scores as the criterion for acceptance.

Design of the Study

The design chosen for this study is a combination of two familiar designs: the One-Shot Case Study and the One-Group Pretest-Posttest.8

8 Donald T. Campbell and Julian C. Stanley, Experimental and Quasi-Experimental Designs for Research, (Chicago: Rand McNally, 1963), pp. 6-12.

The combination refers to the One-Shot Case Study technique, in which the observation of subjects follows the experimental treatment. The pretest-posttest situation involves two observation periods. In this study, the Individual Competency Measures will serve as a pretest while the instrument developed in this study will be the posttest. The experimental treatment is taken to be the prior instruction of students using Science - A Process Approach materials. Hence, symbolically, the design is: X O1 O2.

Nature of Materials

Early in the development of this study several decisions concerning the final materials were made. First, the items would be prepared in multiple-choice form to limit the range of answers and facilitate scoring. Second, where appropriate, visible presentations (color slides) and audible presentations (spoken script) would precede the item. Third, attempts would be made to confine the terminology and situations to the normal seventh grade level.
Multiple-choice test items will be developed which require the student being examined to use the prerequisite skills and to satisfy the objectives established by Science - A Process Approach. The items will be examined by faculty and graduate students in science education, with suggested revisions being incorporated and the resulting items then being administered to a small group of seventh grade students. Information from this preliminary testing will be used as a basis for further revision of items which can later be included in a group test form. The administration of the items to the larger, experimental group of seventh grade students should be accomplished on the days following administration of Individual Competency Measures.

Level of Discrimination for Accepting or Rejecting Items

Each test item developed by this study will require the student to demonstrate performance of a process skill in such a manner that when item analysis data are calculated, using the external criterion of scores on related Individual Competency Measures as the basis for selection of the upper and lower groups, an index of item discrimination of .20 or greater will result.

Assumptions and Limitations of the Study

Certain assumptions are made in conducting this study. First, based on information to be elaborated upon in the next chapter, the Individual Competency Measures of Science - A Process Approach are valid measures of a student's ability to use the respective process skills. Second, those students who have studied Science - A Process Approach in the school year prior to the one in which this study is conducted will have had sufficient experience with the vocabulary and manipulative skills employed in the Individual Competency Measures to qualify them as suitable subjects. Third, the choice of the index of item discrimination, from the statistical tests available, as the basis of evaluating the effectiveness of the test items is appropriate for this study.
Fourth, no appreciable amount of learning will take place as a result of the administration of the Individual Competency Measures prior to the student response to written test items. Fifth, the items developed by this study were designed to measure the acquisition of process skills and not for diagnostic purposes.

Limitations upon interpretation of the results of this study should be recognized: the items will be tested only with seventh grade students who have received some exposure to Science - A Process Approach, and no data are available from other standardized sources concerning the level of ability of the students tested.

Overview of the Thesis

This chapter has presented the background, need, purpose, problem, design, materials, discrimination requirements, limitations, and assumptions of this study. Chapter Two contains the background and reference material pertaining to the trends in science education, definition of processes and process education, problems in the evaluation of processes, and recent efforts in evaluating process skills. Chapter Three contains a detailed description of the procedures used in carrying out this study, including the selection of the sample students, administration of the Individual Competency Measures, development of the test items, and procedures for the analysis of data. Chapter Four includes the results and interpretation of the analysis of data. Chapter Five presents the summary, conclusions, and implications for further research derived from this study.

CHAPTER II

LITERATURE REVIEW

Any attempt to develop test instruments to evaluate the acquisition of process skills should include consideration of the past objectives, attempts, and successes of teaching and evaluating those skills. One of the most recent evaluative procedures, Individual and/or Group Competency Measures for Science - A Process Approach, was described in Chapter I as being time-inefficient.
A more complete analysis of the literature of process evaluation is contained in this chapter. Although the chapter includes a review of the basis for process education developments as related to science process skill acquisition, the emphasis is on evaluation. The major sections of the chapter are: Trends in Science Education, Definition of Processes, Summary of Definitions, Summary of Trends, Problems in the Evaluation of Processes, and Recent Efforts in Evaluating Process Skills.

Trends in Science Education

Throughout the history of American education, numerous trends supporting one person or theory have developed. Each trend evolves or becomes discredited, after a period of time, so that few persons continue to embrace the view or theory. This shift of short-lived movements within the natural sciences for the past century of science instruction is recorded by Hurd and Gallagher.1

1 Paul DeHart Hurd and James Joseph Gallagher, New Directions in Elementary Science Teaching, (Belmont, California: Wadsworth Publishing, 1969), pp. 21-32.

They describe science trends, beginning in 1850, when the influence of Pestalozzi and the object lessons took on importance. Object lessons, while they did not continue in popularity, served to enhance implementation of more formal science instruction. Classroom instructional materials and procedures were revised, especially at the primary level, throughout the 1800's. In one particularly progressive book, Howe2 published a comprehensive science program for grades one through nine. He listed: areas of study, teaching strategy, number of lessons, materials required, storage suggestions, and times of the year for the greatest success. The lessons had been developed over a period of years of teaching. An examination of the book by Howe reveals that an understanding and description of questioning styles, sequential lesson outlines, experimental materials, and spiral curricula were contained. Unfortunately, little evidence of a continuing influence is noted from Howe's work.3 This is tragic since much of what Howe developed was the forerunner of recent curriculum projects. Especially appealing is Howe's list of Guiding Principles, one stating that "mental powers must have exercise to grow."4 This relates to process education and evaluation because processes are mental powers or skills. Near the turn of the century the changing emphasis to nature study, as the science in schools, was stressed in a report.

2 Edward G. Howe, Systematic Science Teaching, (New York: D. Appleton, 1894), pp. 1-336.

3 Hurd, op. cit., p. 24.

4 Howe, op. cit., p. 1.

The report
Unfortunately, little evidence of a continuing influence is noted from Howe's work.3 This is tragic since much of what Howe developed was the fore-runner of recent curriculum projects. ESpecially appealing is Howe's list of Guiding Principles, one stating that "mental powers must have exercise to grow."4 This relates to process education and evaluation because processes are mental powers or skills. Near the turn of the century the changing emphasis to nature study, as the science in schools, was stressed in a report. The report 2Edward G. Howe, Systematic Science Teaching, (D. Appleton, New York, 1894), pp. 1-336. 3Hurd, op. cit., p. 24. 4Howe, op. cit., p. l. 11 influenced many educators when it appeared as the Third Yearbook of the National Society for the Scientific Study of Education, Part II, Nature Study in 1904. The yearbook is written as if the authors were placed in the position of defending their view that the objective of nature study is to have: ...the pupil investigate phenomena and things for the purpose of degermining their relation. Nothing is studied in isolation. ‘ The teaching of science was again modified after Craig6 pub— lished a listing and analysis of the questions children ask. He found that the range of interest was great. The resulting change was directed toward a general science approach and this pattern, as a goal, has con- tinued until the present time.7 This accumulating trend was reported in 1932 when the National Society for the Study of Education published its Thirty—first Yearbook, A Program for Teaching Science. This publi- cation was concerned with the problems presented by the past years of nature-study programs because nature-study was seen as an emphasis upon fact rather than principle. As an alternative the book presents an outline of subjects to be taught, recommendations for the sequence of 5Wilbur S. 
Jackman, The Third Yearbook of the National Society for the Scientific Study of Education, Part II, Nature-Study, (University of Chicago Press, 1904), p. 12.

6 Gerald S. Craig, Certain Techniques Used in Developing a Course of Study in Science for the Horace Mann Elementary School, (New York: Teachers College, Columbia University, 1927), pp. 1-73.

7 Education Development Center, Goals for the Correlation of Elementary Science and Mathematics, (Boston: Houghton Mifflin Company, 1969), p. 15.

materials within the school, and plans for the preparation of teachers.8 Throughout this yearbook the point is made that the integrated program of science teaching can best be organized about the conceptual themes of science. This approach and organizing pattern have continued in many science programs until the present. Smith has described the Thirty-first Yearbook as one which placed emphasis upon the major generalizations of science as objectives in instruction.9 The Yearbook also went on record in support of elementary science rather than nature study and, as a result, contributed to the rapid advancement of science instruction at the elementary school level. The report advocated basing the selection of science content on personal and social criteria, thus simultaneously conforming to and augmenting the changes in educational thinking which were developing in that direction.10 In 1947 the National Society for the Study of Education, prompted by the technological and theoretical advances made in science before and during World War II, again devoted the Yearbook to science instruction. The editor expressed the need for a yearbook when he reported that:

Instruction in science must take cognizance of the social impact of developments produced by science. . . . Science instruction has not only a great potential contribution to

8 Guy M.
Whipple, ed., A Program for Teaching Science, Thirty-first Yearbook, National Society for the Study of Education, (University of Chicago Press, 1932), pp. 1-364.

9 Herbert A. Smith, "Educational Research Related to Science Instruction for the Elementary and Junior High School: A Review and Commentary," Journal of Research in Science Teaching, Vol. I, No. 3, (1963), p. 206.

10 Ibid, p. 207.

make but also a responsibility to help develop in our youth the qualities of mind and the attitudes that will be of greatest usefulness to them in meeting the pressing social and economic problems that face the world.11

That Yearbook on school science instruction forcibly made known the need for instruction in the instrumental skills. The writers include in these instrumental skills the ability to: "(1) read science content with understanding, (2) make observations of events; and (3) perform the various science activities."12 Although these skills were described, a more pervasive issue to science educators was noted on the same page. The teaching of the scientific method was seen as the central concern of science instruction. The improvement of planning, with its reliance upon the basics of the scientific method, was seen as a great contribution to the individual student and the society.13 Newer ideas were developed in a more recent report, the Fifty-ninth Yearbook of the National Society for the Study of Education. The authors present opinions which emphasize the increasing dependence of modern societies upon the developments of science. This Yearbook maintains that the necessary understandings of citizens within the society must become concerns of science educators. To implement this emphasis the Yearbook presented a description of and a call for the teaching of the following processes of inquiry that are characteristic of science:

1. Reading and interpreting science writings
2. Locating authoritative sources of science information

11 Nelson B.
Henry, ed., Science Education in American Schools, National Society for the Study of Education, Forty-sixth Yearbook, (University of Chicago Press, 1947), p. 1.

12 Ibid, p. 62.

13 Ibid, p. 62.

3. Performing suitable experiments for testing ideas
4. Using the tools and techniques of science
5. Recognizing the pertinency and adequacy of data
6. Making valid inferences and predictions from data
7. Recognizing and evaluating assumptions underlying techniques and processes used in solving problems
8. Expressing ideas qualitatively and quantitatively
9. Using the knowledge of science for responsible social action
10. Seeking new relationships and ideas from known facts and concepts.14

In the following year The Central Purpose of American Education described some of the processes which would empower one to use logic. Those processes included recalling, classifying, generalizing, comparing, evaluating, analyzing, synthesizing, deducing, and inferring.15 With the benefit of these mental processes it was felt that man might be better equipped to solve his problems. The same publication points out that these processes are the intent of earlier writings of 1918 and 1938 which stressed teaching the fundamental processes. Associated with acquiring these processes is the more complete ability to achieve the other traditional educational objectives.16 A procedure for teachers to follow in achieving these objectives was formulated to assist in enabling the student to achieve these successes: selecting problems which are within his grasp, providing clues and cues to their solution, suggesting alternative ways to think

14 Nelson B. Henry, ed., Rethinking Science Education, National Society for the Study of Education, Fifty-ninth Yearbook, (University of Chicago Press, 1960), p. 37.

15 National Education Association, The Central Purpose of American Education, (Washington, D. C., 1961), p. 5.

16 Ibid, p. 5.
about them, and assessing continuously the progress of the pupil and the degree of difficulty of the problems before him.17

A problem exists because the implementation of the above modes of teaching is hampered by past experiences of teachers and the public. This past experience, for the most part, may be summarized by saying that science instruction has been organized almost entirely around a selected body of content and has thus neglected processes and strategies for obtaining and working with information.18 This discrepancy between objectives and tradition is difficult to reconcile. Problems are created and solutions need to be reached using the best available knowledge. A solution, of sorts, was presented when the National Education Association released its statement that:

...the school must help the pupil grasp some of the main methods - the strategies of inquiry - by which man has sought to extend his knowledge and understanding of the world.... Educators, working with experts in the various disciplines, should choose content on the basis of its appropriateness for developing in pupils of various ages understanding of the various strategies of inquiry.19

Beginning in 1961 an effort was begun toward the eventual development of science materials which could fulfill these expectations. The American Association for the Advancement of Science published a feasibility study written by teachers and scientists on the preparation of elementary and junior high school science materials. This led to

17Ibid, p. 17. 18Hurd and Gallagher, op. cit., p. 13. 19National Education Association, op. cit., p. 19.

the appointment of the American Association for the Advancement of Science Commission on Science Education in the Spring of 1962. The Commission held conferences and writing sessions to determine the extent and direction of need in elementary science. Since then the development, testing, and publishing of materials has been continuing.
The writing has been done mainly in summer conferences with the materials then being taught by trial teachers in fourteen centers. Feedback from teachers was collected and used during the next writing conference to improve the materials. The entire program was developed in this manner. This program, entitled Science - A Process Approach, is presently designed for Kindergarten through grade six.

Science - A Process Approach is not the only elementary science program which has developed during these years, e.g. Elementary Science Study and Science Curriculum Improvement Study. However, Science - A Process Approach has made more elaborate evaluative plans in that it "attempts to appraise the children's progress more systematically and in greater detail than do the others."20

Science - A Process Approach consists of a sequence of exercises for each grade level, K to 6, which are designed and equipped so that students are usually involved in activities which require manipulation of objects. The exercises are not sequentially related in content; however, the unifying theme is that students are given experiences which are intended to develop the process skills.

20Robert Karplus and Herbert D. Thier, A New Look at Elementary School Science, (Rand McNally, Chicago, 1967), p. 8.

Science - A Process Approach has carefully stipulated and defined the processes which are involved. These include eight basic processes:

...observing, using space/time relationships, classifying, using number, measuring, communicating, predicting, and inferring.21

Building upon the basic eight processes are the integrated processes:

...formulating hypotheses, defining operationally, controlling variables, interpreting data, and experimenting.22

The distinction can be made between the basic and integrated processes that the basic are simple, beginning at the earliest grade level, while the integrated are complex combinations of the basic processes and are not introduced until later grades.
This topic will be discussed in more detail in the next section of this chapter.

Summary of Trends

Elementary school science has ranged from being nonexistent through periods including nature study, experimentation, content attainment, and process acquisition. During the years following 1960, the emphasis has shifted toward development of process skills.

The processes have gone by many names; however, they refer to means of seeking, dealing with, and describing information. Implementation of procedures to teach and evaluate acquisition of processes in the science curriculum has most completely been accomplished in Science - A Process Approach.

21American Association for the Advancement of Science, Science - A Process Approach Commentary for Teachers, (AAAS Misc. Pub. 68-7, 1968), p. 33. 22Ibid, p. 157.

One of the criticisms directed toward process teaching has dealt with the problem that confirmation or evaluation of acquisition has been an extremely subjective decision. To overcome this criticism teachers must become more knowledgeable in recognizing process skills. One step in correcting this deficiency is the development of more precise definitions of processes.

Definition of Processes

Recent attempts to define the mental operations involved in science education may be traced to some introductory remarks in the Fifty-ninth Yearbook of the National Society for the Study of Education. At one point the Yearbook refers to "the nature of" this area to be critical thinking which includes:

1. Recognizing and defining a problem.
2. Clarifying the problem by making appropriate definitions, distinguishing between facts and assumptions, and collecting and organizing relevant information.
3. Formulating possible explanations or solutions (hypotheses).
4. Selecting one or more promising hypotheses for testing and verification.
5. Stating tentative conclusions.23
Some clarification of those ideas is presented in later sections of the Yearbook, although specification of the behaviors associated with the processes is not accomplished. Therefore, teaching and evaluating the skills are not made more feasible.

A more complete definition of some selected processes is made by the American Association for the Advancement of Science in The

23Nelson B. Henry, ed., Rethinking Science Education, National Society for the Study of Education, Fifty-ninth Yearbook, (University of Chicago Press, 1960), pp. 42-3.

Psychological Bases of Science - A Process Approach. The process approach to teaching is described as:

...having children learn generalizable process skills which are behaviorally specific, but which carry the promise of broad transferability across many subject matters. The process approach also rejects the notion of a highly generalizable 'creative ability' as a unitary trait. Instead, it adopts the idea that novel thought can be encouraged in relation to each of the processes of science - observation, inference, communication, measurement, and so on. The point of view is that if transferable intellectual processes are to be developed in the child for application to continued learning in sciences, these intellectual skills must be separately identified, and learned, and otherwise nurtured in a highly systematic manner.24

In 1968 the Third Experimental Edition of the Science - A Process Approach Commentary for Teachers was published. This contains the behaviorally defined objectives related to the thirteen processes used in Science - A Process Approach. Those objectives, for the two processes of interest in this study, are reproduced here:

FORMULATING HYPOTHESES OBJECTIVES

1. CONSTRUCT a hypothesis that is a generalization of observations or that is a generalized explanation.
2. CONSTRUCT and DEMONSTRATE a test of a hypothesis.
3. DISTINGUISH between observations that support a hypothesis and those that do not.
4.
CONSTRUCT a revision of a hypothesis based on observations that were made to test the hypothesis.

DEFINING OPERATIONALLY OBJECTIVES

1. DISTINGUISH between operational definitions and nonoperational definitions of the same thing.

24American Association for the Advancement of Science, The Psychological Bases of Science - A Process Approach, (AAAS Misc. Pub. 65-68, 1965), p. 4. 25American Association for the Advancement of Science, Commentary for Teachers, op. cit., p. 159.

2. IDENTIFY variables or words for which an operational definition is needed, given a hypothesis, inference, question, graph, or table of data.
3. CONSTRUCT an operational definition which adequately describes a procedure, object, or property of an object in the context in which it is used.26

Using varying terminology, several authors have discussed ideas similar to the processes described above. These works may be of value in understanding the approach of Science - A Process Approach. Gagne, whose work has been a great influence on Science - A Process Approach, has referred to processes as "learned capabilities,"27 and also as "intellectual activities."28 Bruner has referred to the processes both as "strategies"29 and later as "skills."30 In an analysis of these documents Cole31 postulates that the processes are equivalent to the "operations" of Guilford32 and the "logical operations" of Piaget.33

26Ibid, p. 166. 27Robert M. Gagne, "Contributions to Human Development," Psychological Review, LXXV, No. 3, 177-191. 28Robert M. Gagne, "Learning Hierarchies," Presidential Address, Division 15, American Psychological Association, (San Francisco, August, 1968). 29Jerome S. Bruner, et al., A Study of Thinking, (John Wiley & Sons, Inc., London, 1956), pp. 1-330. 30Jerome S. Bruner, Toward A Theory of Instruction, (W. W. Norton & Co., New York, 1967), pp. 1-176. 31Henry P. Cole, "Process Curricula and Creativity Development," Journal of Creative Behavior, Vol. III, No. 4, (Fall 1969), p.
244. 32J. P. Guilford, The Nature of Human Intelligence, (McGraw Hill, New York, 1967), pp. 1-538. 33Jean Piaget, Six Psychological Studies, (Random House, New York, 1967), pp. 1-169.

The impact of these persons and their ideas about processes on education is potentially great. One writer concludes that:

Process education recognized that the first and foremost objectives of any curriculum should concern those intellectual skills which the learner needs if he is to acquire, organize, generate and utilize in a productive manner the information central to a discipline. First priority should be given to these intellectual skills or processes which have communality to many different academic and practical realms.34

Instruction to foster development of the process skills requires materials and methods particularly suited to the situation. Past efforts show that, in theory, educators have:

...always recognized the need for attention to the development of creative and adaptive behaviors in children, (while) instructional practice has been almost exclusively concerned with the transmission of knowledge and information to the child.35

Some other writers, working independently of Science - A Process Approach, have also described processes. One of the most concise articles on the subject states that:

...processes are learned, transformational entities which are used in learning and problem solving regardless of the methods used and subject matter being considered.36

Burns and Brooks conclude that one must distinguish between the stated objectives and the approach. Processes are objectives or goals of education and are not, properly, "a method." There may exist many methods by which process skills may be acquired.

34Cole, op. cit., p. 244. 35Ibid, p. 253. 36Burns, op. cit., p. 28.
Summary of Definitions

The processes, for the purposes of this study, will be considered to be the behavioral manifestations of the requisite learning and problem solving skills which are exhibited by persons when placed in the situation of confronting a problem.

Problems in the Evaluation of Processes

Prior to developing test items for inclusion in this research, a review of the literature was undertaken. This review, as described later, shows that no formal program of research has undertaken the problem of evaluating attainment of integrated process skills. The review did bring to light several of the problems associated with endeavors of this sort. Most of the problem areas noted have been considered quite carefully in terms of this study. Attempts have been made to control each of these problem areas.

Among the problem areas are two that are described by Ebel, who stated that items written for the higher mental processes have failed because of one of two reasons:

One is that they may be quite difficult and thus call for more than ordinary examinees are capable of delivering... Another is that they may involve fairly complex situations, which require many words to describe and may present the examinee with problems of comprehension and interpretation which may be irrelevant to the main purpose of the examination.37

This study has attempted to deal with these problems by providing visual and oral information about complex situations and using only the vocabulary that students could be expected to know.

37Robert L. Ebel, Measuring Educational Achievement, (Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1965), p. 52.

Ebel points out, also, the hazard of having difficulty in distinguishing between different mental abilities and processes.38 Possibly that problem of differentiation has been solved in this study since the processes are distinguished from each other in terms of the behavioral objectives of Science - A Process Approach.
This point will be discussed more fully in Chapter III.

Another hazard of developing test items is that one might attribute more value to student success on items than can rightfully be claimed as a part of their process education. Borton39 makes this point and reminds us that testing programs have been mistakenly interpreted in the past. One aspect of this is familiarity with special terminology. This need is not evident within the materials in this study so that students without a background will not be hindered.

The greatest hazard in testing for the processes is that of not developing valid items. To overcome this problem a decision was made to utilize the objectives of the Individual Competency Measures as an adequate criterion upon which the performance on the test items may be evaluated. Horrocks and Schoonover describe the behavioral requirements of a criterion as: 1) true outcomes of the construct in question, 2) observable, 3) measurable in some quantitative fashion, 4) readily definable, and 5) agreed upon by the individuals concerned with establishing the behavior as a criterion.40 Because Science - A Process Approach has become so widely used, these conditions seem also to be met by the use of the Individual Competency Measures as the criterion.

38Ibid, p. 53. 39Terry Borton, "What's Left When School's Forgotten?" Saturday Review, LIII (April 18, 1970), 79. 40John E. Horrocks and Thelma I. Schoonover, Measurement for Teachers, (Charles E. Merrill Publishing Company, Columbus, Ohio, 1968), p. 70.

Related to this hazard are the problems associated with development of testing instruments for any particular objective. To develop a valid instrument one should refrain from certain widely-followed practices. First, selection of items based upon a "correlation with the total score"41 should be avoided because the entire test may prove to be an invalid instrument.
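The contrast between the criticized internal practice and the external-criterion approach adopted in this study can be shown in a short computation. The following sketch (written in Python purely for convenience; the function name, the five-item response matrix, and the criterion scores are all invented for illustration) correlates each item's responses once with the total test score and once with an external criterion score:

```python
# Hypothetical illustration: selecting items by correlation with the
# total test score (the practice criticized above) versus correlation
# with an external criterion score.  All data are invented.

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# rows = students; columns = 0/1 responses to five items
responses = [
    [1, 1, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
]
criterion = [12, 15, 6, 18, 5]            # external criterion scores
totals = [sum(row) for row in responses]  # internal total scores

r_totals, r_criterions = [], []
for j in range(5):
    item = [row[j] for row in responses]
    r_totals.append(pearson_r(item, totals))          # internal (criticized)
    r_criterions.append(pearson_r(item, criterion))   # external criterion
```

The point of the sketch is only structural: an item can correlate well with the total score of an invalid test while correlating poorly with an independent criterion, which is why the external correlation is the safer basis for selection.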
Secondly, Buros has mentioned the inappropriateness of selecting items from those which have "an increasing percentage of students in successive grades"42 who correctly respond. He points out that the researcher cannot eliminate the possibility that maturation is a confounding variable in such a decision.

Keeping these hazards, criticisms and problems in mind, let us turn to some of the recent efforts in evaluation of process skill acquisition.

Recent Efforts in Evaluating Process Skills

In the last few years two dissertations have been written which deal with process tests. The first, by Tannenbaum,43 developed an instrument to measure eight processes at grades seven, eight, and nine. The instrument, Form II, has ninety-six multiple choice questions which were administered to 3,673 students in a carefully controlled sample.

41Oscar K. Buros, "Criticisms of Commonly Used Methods of Validating Achievement Test Items," Proceedings of the 1948 Invitational Conference on Testing Problems, (Educational Testing Service, 1949), p. 18. 42Ibid, p. 18. 43Robert S. Tannenbaum, The Development of the Test of Science Processes, Unpublished dissertation, (Teachers College, Columbia University, 1968), 201 pp.

Tannenbaum has an extensive analysis of the data collected and reports subscores on the eight subtests. The reliabilities reported for the subscores are low in certain cases, .30 to .82. The major deficiency in Tannenbaum's study, however, is that no attempt was made to determine the criterion-related validity of the items before the items were tested and normed. The difficulty of that undertaking is described when it is reported that:

The criterion-related validity of the Test of Science Processes is extremely difficult to investigate because this is the first attempt to assess this age level students' ability to use science processes.
Although it may seem slightly irreverent to make the comparison, the problem is somewhat akin to the problems involved in validating the first IQ test at the beginning of this century.44

The task admittedly is difficult. However, procedures exist to accomplish at least a start in that direction. Tannenbaum has demonstrated a tremendous amount of effort and talent in the overall scholarly nature of his study. The two earlier criticisms of testing by Buros are, however, quite evident in Tannenbaum's study. Basically, it might be mentioned that both these criticized practices are exhibited in all evaluation attempts which do not ascertain and utilize an external criterion. This study will eliminate both these practices by using the score upon the criterion measures, Individual Competency Measures, as the basis for correlational studies to select items. In all fairness, another practice criticized by Buros is that of not using some degree of "subjective procedures for evaluation"45 and selection of test items. This practice is accomplished more successfully and consistently by Tannenbaum.

44Ibid, p. 108. 45Buros, op. cit., p. 18.

The second dissertation, by Beard,46 is a description of the development of tests for assessing the processes of measuring and classifying in grades one, two, and three. Again in this study, the validation of items is done by content validity, i.e., it is decided by a panel of judges. Also, decisions were made to retain or reject entire forms of tests on the basis of reliability measures based on total test scores.

Both of these researchers have failed to utilize a criterion upon which the success of the test or test items might be judged. It should be recognized that:

...the difficulty with testing usage centers very much on the determination of an adequate criterion which is independent of the testing instrument. Far too often the users of tests and even researchers have come to accept test scores as an ultimate criterion.
This fault in test construction and validation exists for most of the testing procedures and instruments which are in common usage today.

The Sixth Mental Measurements Yearbook48 is an extensive compendium of descriptions of the testing instruments which are currently in print, as well as containing analyses and critiques of the tests. Several of the authors of the critiques evidence some concern about the

46Jean Beard, Group Achievement Tests Developed for Two Basic Processes of AAAS Science - A Process Approach. Unpublished dissertation, (Oregon State University, 1970), pp. 1-149. 47James R. Barclay, Controversial Issues in Testing, (Houghton Mifflin Co., Boston, 1968), p. 60. 48Oscar K. Buros, ed., The Sixth Mental Measurements Yearbook, (The Gryphon Press, Highland Park, New Jersey, 1965), pp. 1-1714.

validity of particular tests. Bryan states that in the Stanford Achievement Test, 1964 edition, the authors determined the content validity by "examining appropriate courses of study and textbooks."49 The result is that the tests obstruct revision of curricula. The authors of the Metropolitan Achievement Tests selected items on:

...the ability of an item to distinguish between students at successive grade levels.50

Again the criticism of Buros is ignored. Another reviewer maintains that the tests reflect what the authors think the "curriculum was" because the:

...authors reviewed expert pronouncements concerning the goals of elementary education, current research on the nature of essential skills, such as reading and the work-study skills, representative courses of study, and several widely-used textbook series in the various branches.51

After studying these, the written items are assumed to be measuring those abilities. Another popular test, the California Achievement Tests, 1957 Edition with 1963 norms, reports coefficients of correlation which are uniformly high on sub-tests with similar sub-tests chosen from the Stanford and Metropolitan tests.
This indicates that: "the skills sections from those batteries and the California Achievement Tests may be tapping similar skills."52 As a final review we find that the Gray-Votaw-Rogers tests' source of materials and sampling techniques, to select items from those possible, are not reported clearly.

49Miriam M. Bryan, in Buros, Ibid, p. 121. 50Paul L. Dressel, in Buros, Ibid, p. 58. 51Henry S. Dyer, in Buros, Ibid, p. 59. 52Jack C. Merwin, in Buros, Ibid, p. 18.

Page concludes that this "characteristic remains a defect of this battery as well as of most other batteries available."53 Implied is the thought that little is done to insure validity of test items in other ways.

Summary of Review of the Literature

The successes and failures of past efforts in process skill evaluation, appropriately, have had an effect upon the development of this study.

First, the problem of adequately defining and observing the processes has been treated by use of the behavioral objectives and exercises which have been developed and implemented by Science - A Process Approach. Literally hundreds of man-hours and the instruction of thousands of elementary school children combine to make this the most widely accepted standard in process education and evaluation in elementary science.

Second, the procedures used in the selection of test items have been improved over those used by the writers of the major standardized tests of educational achievement. A criterion measure using a behavioral test was employed. The decision about the validity of each item, or groups of items, can be determined by data analysis.

Third, this study is intended as the first in a series of studies to develop process items and, eventually, process tests. The development of test items in this study is viewed, therefore, as part of a larger project.

53Ellis Batten Page, in Buros, Ibid, p. 40.
In conclusion, the development of methods of teaching and evaluating the process skills has recently come into greater demand. The dramatic increase in our accumulated knowledge and the lack of evidence that process acquisition is a by-product of content-oriented education has convinced many science educators that emphasis should be placed on the development of science processes. Several new elementary science programs are available to help teach process skills, but inadequate evaluation instruments are available to measure the acquisition of these skills.

CHAPTER III

PROCEDURES

Based upon the need for evaluation of the acquisition of integrated process skills, a plan was arranged to develop test items. A cooperative agreement was made with Richard W. Robison who was interested in a similar study over two different integrated processes.

Design of the Study

Earlier the design of this study was described symbolically as: X O1 O2. This refers to the sequence of the three activities which are the essential elements of this study. The X represents the experimental treatment, the prior experience with Science - A Process Approach that each student received. O1 is the first observation period, which is the administration of the criterion measures, the Individual Competency Measures. These were administered to each student as the beginning point of this study, the X treatment having been accomplished independently of and prior to the initiation of this study. O2 is the second observation period in which the group test of four processes was administered to the students. This was the last phase of interaction with the students in this study. After this time the analysis and reporting of data were undertaken.

Since this study had been intended as a means of developing test items for classroom use, the design of the study required testing in classrooms.
Robison and the author decided upon the design above, in which the Individual Competency Measures would first be administered and then the group test of four processes. Several factors were involved in this decision: the Individual Competency Measures are so lengthy that it was deemed inadvisable to administer them at two different times, staggering testing with two groups would not eliminate the confounding variable of maturation, and other designs for testing would produce results from a smaller sample.

Certain other decisions and actions were anticipated early in the study and the order of events was:

1. Selection of students and school
2. Selection of processes and competency measures
3. Development of test items
4. Administration of competency measures
5. Administration of test items
6. Data collection and analysis
7. Preparation of study report.

Certain of the above were under consideration simultaneously; however, none could be considered final until the previous decisions were complete. The seven decisions are described in the following sections.

Selection of Students and School

A decision was made early in the study to concentrate the efforts on developing test items for only one grade level. The basis for this decision was the elimination of as many variables as possible. At the age levels of subjects being considered, the range of their vocabularies was an important consideration. Another consideration was the difference in ability of various age levels to respond to similarly presented testing situations. In this study, then, the testing vocabulary and procedures could be directed at one specific age of student.

Next, a decision was reached that the school chosen should be one in which the students would have had some exposure to Science - A Process Approach. A sample had to be selected in which the criterion measures, the Science - A Process Approach Individual Competency Measures, were valid.
Students were required, therefore, who were familiar with the terminology and techniques of that science program and those measures. A search was begun in Michigan to find a school where the conditions could be met. Several were mentioned and considered. The students in the seventh grade in the Okemos, Michigan, School District met the requirements. The Okemos schools had been using selected portions of Science - A Process Approach, Parts Four and Five, in grade six science classes.

Okemos is a rapidly growing, suburban, bedroom community located three miles from East Lansing and the Michigan State University campus. The community is considered to be upper middle class socioeconomically. Educationally, the community contains a large number of the University faculty and married graduate students. The school system is well equipped and has adequate financial support.

Students who had completed at least one year of Science - A Process Approach were chosen from the students in the Kinawa Middle School of the Okemos, Michigan, Public Schools. The proximity of this school, coupled with the science background, created an ideal situation for the testing program required for test item development.

The school principal stated that the seventh grade students had been assigned to one of two clusters (about 75 students) without regard to ability or background. A random assignment was not necessitated by the choice of design and analysis. Therefore, either of the clusters was appropriate for this study sample. Permission was obtained from the school principal to plan to work with one science teacher and his three seventh grade science classes. The three classes met in morning sections and together comprised one cluster of about 75 students, all of whom experienced the same teachers during the morning. A working agreement was reached with the teachers of science, English, and social studies to permit periodic work with one or more students from their classes.
The science teacher developed a list of those students who had attended the Okemos schools the previous year. Some students had moved into the school district during that school year and did not meet the requirements. However, there were 59 students who had the required background. These students were tested using the measures and design described later.

Selection of Processes and Competency Measures

Discussions with Robison produced the agreement that he would consider the integrated processes of Controlling Variables and Interpreting Data. The process of Experimenting, it was decided, was not to be evaluated because it "encompasses and uses all of the processes."1 Also, Formulating Models was not used because of the lack of Competency Measures for that process.

This study was to concentrate on the two remaining integrated processes, Formulating Hypotheses and Defining Operationally. Since evaluation was to be conducted in grade seven, the more difficult Competency Measures, associated with the exercises in Part Six, were chosen to represent the processes.

1American Association for the Advancement of Science, Science - A Process Approach Commentary for Teachers, (AAAS Misc. Pub. 68-7, 1968), p. 163.

This use of Individual Competency Measures is not part of the intended use. Each, originally, was written as an evaluative device to follow one teaching/learning exercise. Choosing a representative sample of measures for each process was based upon certain desires: (1) the measures should be representative of the activities for that process, (2) the measures should contain tasks representing all the skills for that process, (3) the measures should not be so narrowly tied to one exercise that success is determined merely by recall of having performed the exercise, and (4) the measures should utilize materials and equipment that are relatively simple to obtain, work with, and maintain.
A canvass of the entire set of Individual Competency Measures available preceded selection of those to represent each of the integrated processes. Formulating Hypotheses was measured by use of the Individual Competency Measures accompanying activities entitled Formulating Hypotheses 2 and Formulating Hypotheses 4. The first of these contains four tasks and the second contains ten. The number of tasks represents the situations or questions with which the student is confronted.

Defining Operationally was measured using the Individual Competency Measures identified as Defining Operationally 4, Defining Operationally 5, and Defining Operationally 6. The first of these consists of three tasks while the second and third each contain five. Recording forms and student worksheets for the tasks are presented in Appendix A.

Development of Test Items

Test items were developed to require student behavior as described by the objectives of Science - A Process Approach. Ideas for specific items were obtained from materials such as science textbooks and laboratory experiences used in grades five, six and seven. Using this procedure, Robison and the author prepared items for possible inclusion in a group test.

These items for a group test were continually examined for their appropriateness for the study, especially for their content validity. This review of items on the basis of content validity should not be confused with examining the items to determine the knowledge of science content required to choose a correct answer. Rather, content validity, here, refers to whether the item does require or permit a student to use and demonstrate the essential skills as outlined in Science - A Process Approach.

The review activity included analysis of the items by both researchers, as well as by some faculty members and graduate students from the Science and Mathematics Teaching Center, Michigan State University.
Items were considered for inclusion on the basis of: evidence of content validity to reviewers and appropriateness for seventh grade students. Items were revised or rejected until the final group of items was collected.

The procedure followed in the review of items was to provide each reviewer with a list of the objectives for the four integrated processes at the same time that proposed test items were available. Each test item was then identified as measuring one objective or skill for a particular process. The reviewer then had two considerations to decide: (1) does the item require the use of the specific process skill identified? and (2) does the item present enough information that a skillful seventh grade student can respond correctly?

Many of the items written for the two processes of interest were pre-tested on two seventh grade students. Plans were made to pre-test items on six students from the Lansing, Michigan, schools on a Saturday morning, but because parental plans were changed, only two arrived. Nevertheless, the reaction of those two students to items was immeasurably valuable in revising those items and as a reference in preparing future items.

The items were designed around two formats. Certain items were written to be supplemented by a visual clue, 35 mm slides, and an oral clue, a spoken script (see Appendix B). The completed items, photographs, and script used with each student are presented in Appendix B. This collection consists of 36 multiple-choice items. Responses were recorded by students on mimeographed answer sheets. Among this group of 36 items, 14 items were accompanied by the slides and script. The remaining 22 items were presented with only the information in the test booklet and were placed in the latter part of the group test. Of the collection of 36 items, Formulating Hypotheses was represented by 18 items and Defining Operationally by 18 items.
Listed below are the objectives for each of the two processes, accompanied by the numbers of items which were intended to assess each objective.

FORMULATING HYPOTHESES OBJECTIVES

1. CONSTRUCT a hypothesis that is a generalization of observations or that is a generalized explanation.
   items: 1, 7, 15, 42, 52, 56, 61
2. CONSTRUCT and DEMONSTRATE a test of a hypothesis.
   items: 2, 20, 51, 53, 57
3. DISTINGUISH between observations that support a hypothesis and those that do not.
   items: 8, 9, 16, 59
4. CONSTRUCT a revision of a hypothesis based on observations that were made to test the hypothesis.
   items: 10, 79

DEFINING OPERATIONALLY OBJECTIVES

1. DISTINGUISH between operational definitions and nonoperational definitions of the same thing.
   items: 18, 22, 41, 54, 62, 76
2. IDENTIFY variables or words for which an operational definition is needed, given a hypothesis, inference, question, graph, or table of data.
   items: 19, 21, 55, 58, 60
3. CONSTRUCT an operational definition which adequately describes a procedure, object, or property of an object in the context in which it is used.
   items: 14, 17, 43, 69, 70, 71, 72

Administration of Competency Measures

Each Individual Competency Measure is a series of tasks which the student must perform to demonstrate a correct response. Science - A Process Approach techniques call for a checksheet to record student responses to each task or question. The nature of the testing situation is that the teacher administers the measure to one student at a time. Each correct response was to receive a score of one point.

The testing required that a place be found where interruptions and distractions could be held to a minimum. After some introduction to and inspection of the Kinawa Middle School, a place was chosen for meeting and testing the students which was in a faculty hallway that connected the science rooms with storage areas, conference rooms, and the professional library.
Facilities were provided so that the individual testing of students could be accomplished in that hallway and in one conference room. The materials required for testing were stored in a small office. To reduce confusion, enough materials were collected for most Individual Competency Measures so that, if necessary, both researchers could be testing the same process, with different students, during one period.

Both of the researchers familiarized themselves with all the Competency Measures prior to beginning testing. When the responses were thought questionable, the scoring was checked at a later time with the researcher of concern to assure that all scoring was consistent. The responses and some supplementary notes were marked on a Record Sheet for each student (Appendix A).

The scheduling of the cluster of 75 students required that all testing be done in the morning. The students were available the second, third, and fourth periods, where each period was defined as 50 minutes plus five minutes to change classrooms. At the beginning of each period, one student was drawn from the science classroom to be tested individually by each of the researchers.

The individual testing was begun on April 13, and concluded on May 29, 1970. Both researchers were able to test at least four mornings each week during these three class periods. Individual students were tested for only one period at one sitting. About 48 students were able to finish completely with the Competency Measures for all four processes within two periods of testing. Eight students took ten to fifteen minutes longer while the remaining three required a full third period. No Individual Competency Measure was begun unless time was available to finish. Students were not informed of the score that they received, nor were they told correct responses. The teachers were not informed of individual results and no record was left with the administration.
Administration of Test Items

The group test was administered to each of the three classes during their science period on June 3 and 4, 1970. All students were asked to respond to the items even though some had not been tested individually. Each of the three classes reached item 30 on the first day and finished the test on the next day. Each student was provided with a pencil, an answer sheet, and a test booklet. The answer sheet was marked by students by placing an X through the number of the correct response. Because of absences, there were only 56 students who completed the group test who had been among the 59 students tested individually.

During the group testing Robison read the script (Appendix B) while advancing the slides. This author handled the directions, controlled the lighting, and observed the class for signs to indicate movement to another slide. The classroom teacher permitted us complete control of the classes during most of each period. In this way only items using slides were administered as a paced test. During that part of the test, though, the slides were only advanced when all students appeared to have answered the items pertaining to that slide.

Data Collection and Analysis

The responses to both the Individual Competency Measures and the group test items were submitted to the Michigan State University Test Scoring Office to analyze and describe the related data from the two evaluation procedures. It should be noted that several responses and interrelated sets of responses were available due to the design of this study. First, responses were available for each of the processes which had been tested by use of Individual Competency Measures. The second observation in the study involved student responses to test items. Particular items have earlier been identified as being directly related to each objective of the two integrated processes. The use of a pretest-posttest design implies some comparison of data for the two tests.
To hasten this computational work, each student was assigned a two-digit identifying number. Then the item responses of each student were transferred to IBM H94060 answer sheets, which consist of 92 five-choice items plus 40 ten-choice items. The student's scores on the Individual Competency Measures for each of the four processes were coded in the spaces for ten-choice items.

The Michigan State University Test Scoring Office then analyzed the 56 papers by reporting item analysis data for all items, using as the criterion score the student's scores on the Individual Competency Measures. This meant that separate item analyses were computed and reported for each item, one using each of the process scores as the criterion. Among the collection of data received from this first analysis, only those items are reported on which the analysis is based upon the related process Individual Competency Measure.

After the results of that analysis were known, the answer sheets were again scored on each of twenty-four scales, using only those items which had shown an index of item discrimination of .20 or greater. These twenty-four scales were identified as those items relating to one, or a combination of, particular objectives for an integrated process. The 24 scales are described in greater detail and by name in Chapter IV.

The student responses to the Individual Competency Measures were also recorded on IBM answer sheets and twenty-one scores were obtained for scales that will be described in the following chapter. These 21 scores and the previously described 24 scores were then compared and analyzed by use of the Pearson product-moment coefficient of correlation. The correlation data will be included in Chapter IV.

Summary

A group of seventh grade students who had studied materials selected from Science - A Process Approach during the sixth grade school year was examined.
Each student was individually assessed using five Individual Competency Measures corresponding to two of the integrated processes. A group test was developed and administered in which each question required the ability to perform one of the process objectives. Student responses to group test items and on individual tests were recorded on machine scored answer sheets.

Analyses of data were performed. The analyses took two forms: (1) an item analysis of the responses to each item, and (2) a comparison of the scores on selected sets of test items. The first analysis gives data which allows decisions to be made concerning the success or failure of each item to test the desired objective. The second shows whether statistical relationships exist between scores on selected sets of items. A more complete description of these analyses and the results of them are contained in the following chapter.

CHAPTER IV

ANALYSIS OF RESULTS

The Problem

The problem of this study was to develop test items, for use with seventh grade students, which measure the extent of acquisition of skills in two integrated processes. The two processes, Formulating Hypotheses and Defining Operationally, have been defined by behavioral objectives and measured with Individual Competency Measures, from Science - A Process Approach, to serve as a criterion measure upon which to judge the success of test items. A total of 36 test items have been written and administered to a sample of 56 seventh-grade students. Statistical analysis of the resulting data is presented in this chapter.
Level of Discrimination for Accepting or Rejecting Items

Each test item developed by this study will require the student to demonstrate performance of a process skill in such a manner that item analysis data which are calculated, using the external criterion of scores on related Individual Competency Measures as the basis for selection of the upper and lower groups, will yield an index of item discrimination of .20 or greater.

Individual Competency Measure Analysis

During administration of the Individual Competency Measures, a record of student responses was made on Record Sheets like those in Appendix A. The number of correct responses was later scored and transferred to machine scored sheets by assigning a value of one point for each correct response.

The responses for tasks on the Individual Competency Measures were analyzed. Table 1 shows the results for the 56 students by presenting the percentage of students who responded correctly to each task of the five Individual Competency Measures.

TABLE 1
THE PERCENTAGE OF STUDENTS CORRECTLY PERFORMING EACH TASK

Competency Measure            Task Number    Percent Correct
Formulating Hypotheses 2           1               98
                                   2               80
                                   3               80
                                   4               88
Formulating Hypotheses 4           1               93
                                   2               97
                                   3               88
                                   4               92
                                   5               98
                                   6               85
                                   7               78
                                   8               97
                                   9               90
                                  10               86
Defining Operationally 4           1               68
                                   2               29
                                   3               71
Defining Operationally 5           1               97
                                   2               78
                                   3               81
                                   4               83
                                   5               85
Defining Operationally 6           1               68
                                   2                7
                                   3               86
                                   4               10
                                   5               85

Examination of Table 1 reveals that only three tasks were completed correctly by less than 68 per cent of the students. Those three were quite difficult in comparison with the other tasks. The thought occurs, from observation of students, that these items were not correctly responded to because of lack of student experience with the materials and/or techniques.
The activities which preceded each Individual Competency Measure were not necessarily part of the background which the Okemos students had acquired in their studies from Science - A Process Approach during their sixth grade school year. In fact, Defining Operationally 4 - Task 2 requires the student to draw a line that is eleven degrees to the right of a given line. Only 29 per cent were able to perform this task. Many who could not stated that they had no idea of how to begin. The activity for which this Individual Competency Measure was designed had not been part of their previous work.

Defining Operationally 6 - Tasks 2 and 4 require the student to, in turn, draw a graph and use that graph. These tasks were missed most often because the directions given do not demand, suggest, or even imply that a graph is required. Students were capable of performing tasks 1, 3, and 5 without doing the other two.

It may be noted that on all the tasks for Formulating Hypotheses, the mean per cent correct was 89. No item was answered correctly by less than 78 per cent and two tasks were performed correctly by 98 per cent. Both these Individual Competency Measures had been answered by high percentages of students in the experimental classrooms reported in An Evaluation Model and Its Application.1 These tasks were performed correctly by nearly 90 per cent of the students in experimental classrooms.

As a further indication of the performance of students, the frequency and percentile of total scores on the Individual Competency Measures for the two processes are reported in Table 2.
TABLE 2
FREQUENCY AND PERCENTILE RANK OF TOTAL SCORES FOR THE INDIVIDUAL COMPETENCY MEASURES

Formulating Hypotheses 2 and 4

Score    Frequency    Percentile
  8          3             3
  9          1             6
 10          3             9
 11         10            20
 12          6            34
 13         12            49
 14         24            80

Defining Operationally 4, 5, and 6

Score    Frequency    Percentile
  2          1             1
  3          1             3
  4          2             5
  5          3             9
  6          4            15
  7          5            23
  8         10            36
  9         10            53
 10         13            72
 11          7            89
 12          3            97

1American Association for the Advancement of Science, An Evaluation Model and Its Application, Second Report, (Washington, D. C., 1968).

It should be noted in the above table that the distribution of the Formulating Hypotheses scores is skewed greatly toward the higher values. The mean of those values is 12.49 while the standard deviation is 1.88. For Defining Operationally the distribution is also skewed toward the higher values, with a mean of 8.48 and a standard deviation of 2.25.

Test Item Analysis

The complete data analysis for the test items is presented in the Appendices. Appendix C contains the data for Formulating Hypotheses items. Appendix D contains the data for Defining Operationally items. Selected portions of those data will best be understandable when displayed in the form of tables, for clarification and comparison, later in this chapter.

The item analysis data in each appendix contain three kinds of information for each of the items. First, a count is made of the number of students choosing each response. This information is broken into three categories: the upper 27 per cent, the middle 46 per cent, and the lower 27 per cent of the students based upon their Individual Competency Measure score for that one process. The Individual Competency Measure score, as was stated earlier, is the criterion score for analysis of the test items.

Second, the item difficulty is given. This value, as recorded, is the percentage of students in the total group who incorrectly answered that item. For example, if 42 students among the 56 tested responded correctly, then the 14 incorrect responses correspond to 25 per cent.
The item difficulty in that case would be recorded as 25. An item with a higher item difficulty index was incorrectly answered by a greater percentage of the students.

Third, the item discrimination is reported as a percentage value. This is calculated by subtracting the percentage of correct responses among students who were in the lower 27 per cent, on the Individual Competency Measure, from the percentage of students who were in the upper 27 per cent and had answered correctly. As an example, let us say that 9 of the 15 students in the lower group and 12 of the 15 in the upper group correctly responded. The discrimination is .60 subtracted from .80, or .20.

This third bit of information regarding each item is the most valuable for this study. The review of literature in Chapter II pointed out the need to rely upon criterion scores in selecting items for inclusion on tests. The index of item discrimination seems to present the data required to make a decision concerning each item.

It was necessary to consider the level of discrimination which would be desirable for acceptance of items. Ebel2 suggested that a level of .20 or greater be accepted for this study since the manner of analysis depended upon a criterion score. Further, a suggestion was made to attempt a revision of those items for which the discrimination was less than .20 and above .10. Below .10 the item might need to be rejected, depending upon its structure.

In most discussions of item discrimination a value of .40 or higher is considered desirable. The usual practice, though, is for the upper and lower groups to be defined by the total score on the test. This study used the Individual Competency Measures as a criterion for separating the groups. Therefore, the responses to any test item had no effect on the choice of upper and lower groups. For these reasons, this study will consider an index of item discrimination of .20 or greater satisfactory.

2Robert L. Ebel, personal discussion.
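The difficulty and discrimination computations just described can be sketched in a few lines of code. This is an illustrative reconstruction, not the Test Scoring Office's actual program; the group sizes and counts are the worked examples from the text, while the response vectors themselves are invented for the demonstration:

```python
# Item difficulty and discrimination as defined in the text. The upper
# and lower 27 per cent groups come from an EXTERNAL criterion (the
# Individual Competency Measure scores), not from the test itself.

def item_difficulty(responses):
    """Per cent of the total group answering the item incorrectly."""
    incorrect = sum(1 for r in responses if not r)
    return round(100 * incorrect / len(responses))

def item_discrimination(upper, lower):
    """Proportion correct in the upper group minus the lower group."""
    return round(sum(upper) / len(upper) - sum(lower) / len(lower), 2)

# Worked example from the text: 12 of 15 upper-group students and
# 9 of 15 lower-group students respond correctly (1 = correct).
upper = [1] * 12 + [0] * 3
lower = [1] * 9 + [0] * 6
print(item_discrimination(upper, lower))  # .80 - .60 = 0.2

# Difficulty example from the text: 42 correct among 56 students.
responses = [1] * 42 + [0] * 14
print(item_difficulty(responses))  # 14 of 56 incorrect = 25
```

An item passes the study's acceptance criterion when this discrimination value is .20 or greater.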
Table 3 lists the indices of discrimination for Formulating Hypotheses items. The complete item analysis data for Formulating Hypotheses items is contained in Appendix C. The items with discriminations of below .20 have been studied after the testing with students, which afforded an opportunity to hear some student questions and comments about particular items. The testing, coupled with the chance to review the item analysis data, indicates that certain changes could be beneficial.

TABLE 3
INDICES OF DISCRIMINATION FOR THE FORMULATING HYPOTHESES ITEMS

Item Number    Index of Discrimination
     1              .00
     2              .20
     7              .20 Negative
     8              .26
     9              .13
    10              .06
    15              .20
    16              .27
    20              .13
    42              .33
    51              .20
    52              .26
    53              .07 Negative
    56              .20
    57              .13
    59              .46
    61              .00
    79              .46

Only those revisions which give reason for being more successful are reported. The changes suggested are as follows:

Item 16, even though it has a discrimination of .27, should have response 3 changed from "both" to "either." The complete item would then read:

16. If both bottles were examined two hours later and neither are frozen, this suggests that:
    1. neither contains water.
    2. more time is needed.
    3. either of the above are true.
    4. neither of the above are true.

For the process of Formulating Hypotheses, items 1, 7, 9, 10, 20, 53, 57, and 61 each have an index of discrimination of less than .20 and should therefore be revised or omitted. The original test items are included in Appendix B. Revisions suggested by this testing and some background in test construction suggest the following changes may be worthwhile:

Item 20, the first sentence of the stem of the item should be reworded so that the item reads:

20. Richard expressed an idea that all glass bottles, containing frozen water, will break. He tests this by using five bottles. A is left empty, B is one-fourth full, C is half full, D is three-fourths full, and E is full. All are placed in a freezer. If the idea is to be supported, which bottles must break?
    1. E
    2. D and E
    3. A, D, and E
    4. B, C, D, and E
    5. A, B, C, D, and E

Item 53, responses 1 through 4 should be revised to read:

53. Judy wishes to perform an experiment to test the idea that the slimy secretion on a snail's foot protects it from injury on sharp surfaces. She should:
    1. force a snail to move across broken glass.
    2. cut a snail's foot with a knife.
    3. wash the secretion off a snail's foot and then force that snail to move across broken glass.
    4. observe and touch the secretion using a magnifier.
    5. do both 1 and 3 of the above.

Item 57, the second sentence of the stem should be reworded to read:

57. Robert believes that all plant stems grow toward their source of light. Which of the following experimental results does not support or reject his belief? A plant grown:
    1. in a window facing the sun will lean toward the window.
    2. inside, under lights, will lean toward that light.
    3. outside, in the sunlight, will grow without leaning.
    4. inside, in total darkness, will grow without leaning.

Items 1, 7, 9, 10, and 61 do not lend themselves to revision based on the evidence from this study.
While some changes may be undertaken, there seems little promise that the discrimination will improve. Therefore, those items should be omitted.

For the items written for the process of Defining Operationally, the following table summarizes the discrimination indices:

TABLE 4
INDICES OF DISCRIMINATION FOR THE DEFINING OPERATIONALLY ITEMS

Item Number    Index of Discrimination
    14              .33
    17              .00
    18              .00
    19              .40
    21              .13
    22              .07
    41              .53
    43              .00
    54              .40
    55              .40
    58              .54
    60              .27
    62              .40
    69              .27
    70              .27 Negative
    71              .20 Negative
    72              .46
    76              .27

For the process of Defining Operationally, items 17, 18, 21, 22, 43, 70, and 71 each have an index of discrimination of less than .20 and should therefore be revised. The complete item analysis data for the Defining Operationally items is included as Appendix D. Careful consideration of those items based upon the analysis and testing experience suggests the revision of the items as follows:

Item 21, the stem and response 1 should be made to read:

21. Based only on the slides shown, state a rule for the freezing of alcohol.
    1. Alcohol will freeze at -20°F.
    2. Water will freeze and alcohol will remain liquid at 5°F.
    3. Alcohol will freeze and water will remain liquid at 5°F.
    4. Alcohol will not freeze at any temperature.

Item 43, responses 1 and 2 are revised as follows:

43. A jar of water at 100 degrees Centigrade is allowed to cool. It is at 40 degrees Centigrade when:
    1. no boiling is occurring.
    2. it feels cold to the hand.
    3. a saccharin tablet dissolves in it in 25 seconds.
    4. all of the above are true.

Item 71, the item stem should be reworded to read:

71. Which of the following definitions of a mountain is most easily agreed upon by two persons when they are standing upon an unknown mountain? A mountain is a land mass that:
    1. projects above its surroundings.
    2. is higher than a hill.
    3. has an altitude of 5,000 feet or more.
    4. requires considerable work and time to climb.

Items 17, 18, 22, and 70 do not lend themselves to revision based on the evidence from this study. While some changes may be undertaken, there seems little promise that the discrimination will be improved. Therefore, the items should be omitted.

Correlation of Individual and Group Scores

One of the purposes of this study was to develop group test items that would measure the same skills as the Individual Competency Measures. In using the scores from the Individual Competency Measures as the criterion measure in calculating the index of discrimination, one has valid evidence concerning the extent to which this objective was accomplished. Further evidence of this, however, would result from the calculation of correlation coefficients between the Individual Competency Measure scores and the scores of the cluster of items designed to measure the same integrated process. It can be argued that a high, significant correlation would indicate that the two measures are measuring the same process skills.

Pearson product-moment correlation coefficients were computed for each pair of a series of 45 variables. The 45 variables were identified as being the scores on all possible subsets of tasks, from the Individual Competency Measures, and from the group test items.

Twenty-four of the variables were represented by scores on the group test items. One variable was assigned to each objective measured by the satisfactorily discriminating items.
Another variable was then assigned to each possible combination of those first variables. The other 21 variables were measured as the number of correct responses to the Individual Competency Measures or the objectives related to each task. Using the machine scored answer sheets, then, on the Bowling Green State University IBM 360 computer, a matrix of all the correlation coefficients was obtained.

Interpretation of those values is aided if identification of the variables is made in some simple, systematic manner. The first series of variables are scores on test items, with discriminations of .20 or greater, which had been written for the Formulating Hypotheses objectives.

Variable Number    Objective Numbers    Test Item Numbers
      1            1                    15,42,52,56
      2            2                    2,51
      3            3                    8,16,59
      4            4                    79
      5            1 and 2              2,15,42,51,52,56
      6            1 and 3              8,15,16,42,52,56,59
      7            1 and 4              15,42,52,56,79
      8            2 and 3              2,8,16,51,59
      9            2 and 4              2,51,79
     10            3 and 4              8,16,59,79
     11            1, 2 and 3           2,8,15,16,42,51,52,56,59
     12            1, 2 and 4           2,15,42,51,52,56,79
     13            1, 3 and 4           8,15,16,42,52,56,59,79
     14            2, 3 and 4           2,8,16,51,59,79
     15            1, 2, 3 and 4        2,8,15,16,42,51,52,56,59,79

The second series of variables are scores on test items, with discriminations of .20 or greater, which had been written for the Defining Operationally objectives.

Variable Number    Objective Numbers    Test Item Numbers
     16            1                    41,54,62,76
     17            2                    19,55,58,60
     18            3                    14,69,72
     19            1 and 2              19,41,54,55,58,60,62,76
     20            1 and 3              14,41,54,62,69,72,76
     21            2 and 3              14,19,55,58,60,69,72
     22            1, 2 and 3           14,19,41,54,55,58,60,62,69,72,76

Two variables were defined as the scores on test items prepared for another study.1

Variable Number    Process                  Objectives
     23            Controlling Variables    All
     24            Interpreting Data        All

The Individual Competency Measures and combinations of them are listed, below, as separate variables.

Variable Number    Process Name              Measure Number
     25            Formulating Hypotheses    2
     26            Formulating Hypotheses    4
     27            Formulating Hypotheses    2 and 4
     28            Defining Operationally    4
     29            Defining Operationally    5
     30            Defining Operationally    6
     31            Defining Operationally    4, 5 and 6
     32            Controlling Variables     All
     33            Interpreting Data         All

Twelve other variables were identified as being those tasks from Individual Competency Measures which are grouped below by the particular objectives for which they are written.

Variable Number    Process Name              Objective Numbers
     34            Formulating Hypotheses    1
     35            Formulating Hypotheses    2
     36            Formulating Hypotheses    3
     37            Formulating Hypotheses    1 and 2
     38            Formulating Hypotheses    1 and 3
     39            Formulating Hypotheses    2 and 3
     40            Defining Operationally    1
     41            Defining Operationally    2
     42            Defining Operationally    3
     43            Defining Operationally    1 and 2
     44            Defining Operationally    1 and 3
     45            Defining Operationally    2 and 3

1Richard Wayne Robison, "The Development of Items Which Assess the Processes of Controlling Variables and Interpreting Data," (unpublished Doctor's dissertation, Michigan State University, East Lansing, 1970).

Using the above 45 variables, the computer programming prepared a print-out of approximately one thousand different correlation coefficients. Interpretation of those values requires knowledge of the level of significance attributed to the Pearson product-moment correlation coefficient. A table of statistics yields the data below.2

TABLE 5
VALUE OF PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT REQUIRED FOR SPECIFIC LEVELS OF SIGNIFICANCE (N=56, df=54)

Level of Significance    Correlation Required
        .05                     .265
        .02                     .311
        .01                     .343
        .001                    .430

From Table 5 it is obvious that correlation coefficients equal to or greater than .430 are significant at the .001 level. The data from Table 6 indicate that all correlations are well beyond this value.

In calculating the correlation coefficients, only those items that have an index of discrimination of .20 or greater were used. This was done because only the items which were not defective and met our

2N. M. Downie and R. W. Heath, Basic Statistical Methods, (New York, Harper & Row, Second Edition, 1965), p. 306.
criteria are of interest.

TABLE 6
CORRELATION OF INDIVIDUAL COMPETENCY MEASURE SCORES AND GROUP TEST SCORES FOR EACH OF THE FOUR PROCESSES

Process                     Correlation    Significance
Formulating Hypotheses         .535            .001
Defining Operationally         .565            .001
Controlling Variables          .705            .001
Interpreting Data              .660            .001

Using only those items that have an index of discrimination of .20 or greater does introduce a systematic bias which tends to inflate the correlation coefficient values. Evidence has been introduced to show that it is possible to write test items which have an index of discrimination of .20 or greater. Subsequent research in testing of the processes will require arranging these items into test instruments. The correlation coefficients, even though slightly inflated by the systematic bias, are offered as additional evidence that group process tests can be written to measure the same process skills as the Individual Competency Measures.

Another series of correlation values are of interest which relate the Individual Competency Measure scores to those of the variables, mentioned earlier (identified by number), of the same process items from the group test.

TABLE 7
CORRELATION AND SIGNIFICANCE OF FORMULATING HYPOTHESES INDIVIDUAL COMPETENCY MEASURE SCORES WITH THE SCORES FROM THE GROUP TEST ITEMS FOR FORMULATING HYPOTHESES

Variable    Correlation    Significance
    1          .370            .05
    2          .259            .05
    3          .424            .01
    4          .400            .01
    5          .422            .01
    6          .459            .001
    7          .461            .001
    8          .456            .001
    9          .405            .01
   10          .515            .001
   11          .490            .001
   12          .488            .001
   13          .514            .001
   14          .522            .001
   15          .535            .001

As expected, the highest correlation is received when using all the group test items which had a discrimination of .20 or greater for the process of Formulating Hypotheses.
TABLE 8
CORRELATION AND SIGNIFICANCE OF DEFINING OPERATIONALLY INDIVIDUAL COMPETENCY MEASURE SCORES WITH THE SCORES FROM THE GROUP TEST ITEMS FOR DEFINING OPERATIONALLY

Variable    Correlation    Significance
   16          .495            .001
   17          .414            .01
   18          .464            .001
   19          .523            .001
   20          .546            .001
   21          .512            .001
   22          .565            .001

The above values point out that only those items for objective 2 of Defining Operationally, from the group test items, do not have a correlation significant at the .001 level with the Individual Competency Measure. Again, the highest correlation is noted when all the items for Defining Operationally are used.

It can be further noted that each of the two integrated processes of this study is significantly correlated with each of the three others tested in the research on some of the possible relationships. This is evident from the data of Table 9, in which only two correlations are not significant at the .001 level. The Individual Competency Measure scores for Formulating Hypotheses are not correlated, at the .001 level of significance, with the group test scores of Defining Operationally and Interpreting Data. Both those correlations, though, are higher than the correlation required at the .01 level of significance.

TABLE 9
CORRELATION OF INDIVIDUAL COMPETENCY MEASURE SCORES AND GROUP TEST SCORES

                            Individual Competency Measures      Group Test Items
                            FH3    DO     CV     ID             FH     DO     CV     ID
Individual       FH         1
Competency       DO         .430   1
Measures         CV         .540   .573   1
                 ID         .583   .536   .601   1
Group            FH         .535   .434   .656   .525           1
Test             DO         .425   .565   .664   .537           .662   1
Items            CV         .598   .483   .705   .503           .618   .786   1
                 ID         .414   .502   .703   .660           .561   .728   .779   1

3FH = Formulating Hypotheses, DO = Defining Operationally, CV = Controlling Variables, and ID = Interpreting Data

Another correlation coefficient is of interest when reviewing the hundreds remaining.
Variable 36 (Objective 3 of Formulating Hypotheses Individual Competency Measures) is correlated with only two of the other 44 variables at a significant level, Variables 24 and 33, both significant at the .05 level. It is interesting that the objective was tested by only one task on Formulating Hypotheses 2 and seems more related to Interpreting Data.

Summary

Item analysis data for the eighteen questions developed for each of the two processes are reported. Ten items were found to have a satisfactory index of discrimination for Formulating Hypotheses, while eleven items were satisfactory for Defining Operationally. Certain items which had unsatisfactory discriminating power were revised.

Correlations for selected pairs of the 45 variables from the Individual Competency Measures and group test items have been reported. The relationship between group test item scores and their respective Individual Competency Measure scores is, in all cases, greater than zero at the .01 level of significance.

CHAPTER V

SUMMARY AND CONCLUSIONS

Instructional implementation to aid development of process skills has become a concern of many classroom teachers. A problem associated with this developing concern is that present evaluation techniques are inadequate to assess the acquisition of process skills. This study was an attempt to develop group test items which can be shown to measure process skills.

The procedures which have been adopted are unlike those of many of the widely used standardized tests in that the success of individual test items was determined by comparison with the student responses on an independent, external criterion. Individual Competency Measures selected from Science - A Process Approach were administered to the sample of seventh grade students and served as the criterion for analysis of the data.
Each student was tested individually for approximately two hours and rated on having attained the objectives for each of four processes, two of which are discussed in the study. Group test items over those two processes, Formulating Hypotheses and Defining Operationally, were developed using the behavioral objectives of Science - A Process Approach as a basis. These items were assembled into a 79-item, multiple-choice test in cooperation with Richard W. Robison. The group test was administered to the same sample of students who had previously been individually tested.

Summary of Findings

Item analysis data were obtained relating each item of the group test of four processes to the scores on the Individual Competency Measures. Ten items for the process of Formulating Hypotheses and eleven items for the process of Defining Operationally had an index of item discrimination that was .20 or greater. The index of item discrimination of .20, since it is based in this study upon an external criterion, is sufficiently high to indicate a direct, positive relationship between the two testing techniques.

In an attempt to further define the nature of this relationship, the student answer sheets were later scored for each of the processes, using only those items with a discrimination of .20 or greater, and a significant correlation was found between the individual and group test scores on both of the processes. The correlation coefficients have been reported in Chapter IV. A brief summary of those values and the statements of significance about them indicates that the Individual Competency Measure scores for the two integrated processes, Formulating Hypotheses and Defining Operationally, are correlated at the .001 level of significance with their representative test item scores from the group test of four processes.
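The product-moment coefficients summarized above follow the standard Pearson formula. A minimal sketch, written in modern Python purely for illustration (it is not part of the original study):

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two parallel
    lists of scores (e.g. group-test subtest scores and Individual
    Competency Measure scores)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

The coefficient ranges from -1 to 1, with 0 indicating no linear relationship between the two sets of scores.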
Conclusion

Study of the findings and interpretation of the experience of performing this study indicate that it is possible to measure the extent of acquisition of skills in the integrated science processes, Formulating Hypotheses and Defining Operationally, if selection of test items is based upon reference to a criterion measure.

Recommendations

The review of literature, the research experience, and the data from this study suggest that the following recommendations be proposed:

1. Measurement of acquisition of skills in the Integrated Processes should be based upon a criterion measure to determine validity. The Individual Competency Measures from Science - A Process Approach can serve as the criterion measure.

2. The procedure employed in this study to develop, administer, and evaluate test items is recommended. Although this procedure is time consuming, it is well worth the effort to obtain criterion-related validity.

3. Research should be initiated which will provide additional sources of test items and data from which to select appropriate written tests of process skill acquisition.

4. The development of instruments and procedures to measure mental skill acquisition should proceed with a substantial, balanced research effort. This program should be begun immediately, with adequate funding and competent personnel, to assure the educational interests that aspects of mental ability other than command of knowledge and general intelligence will be measurable.

5. Diagnostic instruments for classroom use should be developed which would help teachers identify the level of process skill development of each child.

Problems and Implications for Further Research

To implement the above recommendations, efforts to develop a more extensive battery of test items over these and related process skills should be undertaken. The extent of development of the test items in this study indicates that satisfactory evaluation instruments are possible.
The test items produced in this study and others, from similar research in the near future, need to be assembled into a group process skill test for which normative data can be obtained. The procedures that should be involved in that undertaking include:

1. Preliminary testing of more items for particular processes, as this study has reported.

2. Selection of those items from the complete battery available which: (a) have been shown to produce an index of item discrimination of .20 or greater based upon a performance test criterion score, and (b) lend support to an attempt to assemble a test which is balanced well in terms of the stated objectives.

3. Selection of a sample of students balanced nearly the same as the population on factors such as age, sex, educational background, presence of overt attempts to teach process development, sociological background, and teacher-related variables.

4. Development, printing, and distribution of test booklets and testing instructions for mass testing of the sample on nearly simultaneous dates.

5. Collection of data relating student success on the test with factors such as age, sex, background in process education, reading ability, success in school, sociological factors, and teacher-related factors.

6. Publication of results, including final forms of the test, the normative data for future reference, and information needed by teachers desiring use of the finished instrument.

The procedures described in this study should prove valuable in the development of test items and instruments for objectives other than those of the processes. The use of a criterion score as a basis for choosing items should be adaptable to many fields. Attempts at future development of test items and/or instruments in fields formerly thought to be untestable should be formulated when criterion scores are available. One example might be to test the influence of reading skills on written test item success.
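Step 2 of the six-step procedure above amounts to a simple filter over the item battery. A hypothetical sketch in Python (the function name and the tuple format are illustrative assumptions, not anything specified in the study):

```python
def select_items(items, threshold=0.20):
    """items: list of (item_id, objective, discrimination) tuples.
    Retain items whose criterion-based index of discrimination meets
    the threshold, grouped by the objective each item was written to
    test, so the assembled test can be balanced across objectives."""
    retained = {}
    for item_id, objective, discrimination in items:
        if discrimination >= threshold:
            retained.setdefault(objective, []).append(item_id)
    return retained
```

Grouping by objective makes the balance check of part (b) straightforward: an objective with no retained items needs new items written and pretested.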
Items might be identified which do not require high reading skills by using a reading skill score as a criterion and selecting for retention those items which have an index of item discrimination near zero.

A major problem with the above studies, as suggested, is that few, if any, recognized or universally accepted goals in performance terms exist for many areas of human development and achievement. Therefore, a set of behavioral objectives might be developed to determine the extent of agreement upon the rationale and experience of the suggested behaviors as goals. Achievement of those goals can then be used as the criterion measure in an attempt to develop a more effective evaluative means.

The desire to develop test items, when an acceptable set of behaviors is available, can be justified as a means of increasing the efficiency of evaluation. Observation of subjects to determine whether they exhibit the desired and specified behaviors is an extremely time-consuming task requiring highly trained and specialized personnel. An instrument consisting of written test items, with evidence of criterion-related validity, would reduce the manpower needs for evaluation and free those individuals for other important work.

Implications for the Classroom Teacher

Testing programs of the type reported in this study should become more prevalent and frequent in the ensuing years. The decision of a teacher to rely upon data from an instrument of this type for evaluation should be preceded by much thought. The teacher should ascertain, in advance, whether the instrument is based upon the same objectives as those undergirding the classroom instruction.

At this time the classroom teacher should be aware that the possibility of developing test items over objectives such as the integrated process skills does exist. Teachers should become more adept at selecting, using, and interpreting tests based upon the means of development and documentation.
The purposes of using an instrument are to provide more evaluative and diagnostic information than the teacher can obtain by other means.

Implications for Science Education

This research report, describing the procedures and results of the evaluation of acquisition of the integrated process skills, should heighten interest in teaching the process skills. Now it is more likely that evaluation instruments will soon be available. Persons who are responsible for science curriculum development, selection, and implementation may, with more confidence, include activities and pupil experiences which are oriented toward process development. This confidence should be supported by increased efforts to research process skill acquisition. The research efforts of science educators would be well employed, for an extended period of time, if concentrated on process acquisition.

The present thinking of testing specialists, that a test item does test the intended objective because a panel of educators agrees to it, should be modified. A more refined procedure for determining the validity of an item, reference to a criterion measure, is now available, and that procedure should become more widely employed in research of the future.

Consequently, the persons responsible for the preparation of elementary and secondary level teachers should begin more extensive efforts to acquaint the candidates with the problems and promise of process education. The key to the classroom experience of children is the success with which the teacher can develop classroom experiences which are built upon his experience, attitudes, and objectives. Efforts should begin, now, to implement process skill acquisition activities into the professional preparation of all teachers.

BIBLIOGRAPHY

Books

American Association for the Advancement of Science. An Evaluation Model and Its Application. Second Report. Washington, D. C., 1968.

________. The Psychological Bases of Science - A Process Approach.
AAAS Misc. Publication 65-68, 1965.

________. Science - A Process Approach Commentary for Teachers. AAAS Misc. Publication 68-7, 1968.

________. Science - A Process Approach, Part Six. Fourth Experimental Edition. AAAS Misc. Publication 67-11, 1967.

Barclay, James R. Controversial Issues in Testing. Boston: Houghton Mifflin Co., 1968.

Bruner, J. S., and P. B. Dow. Man: A Course of Study: A Description of an Elementary Social Studies Curriculum. Cambridge: Education Development Center, 1967.

Bruner, Jerome S., et al. A Study of Thinking. London: John Wiley & Sons, Inc., 1956.

Bruner, Jerome S. Toward a Theory of Instruction. New York: W. W. Norton & Co., 1967.

Buros, Oscar K. "Criticisms of Commonly Used Methods of Validating Achievement Test Items," Proceedings of the 1948 Invitational Conference on Testing Problems. Educational Testing Service, 1949.

________, ed. The Sixth Mental Measurements Yearbook. Highland Park, New Jersey: The Gryphon Press, 1965.

Campbell, Donald T., and Julian C. Stanley. Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally, 1963.

Craig, Gerald S. Certain Techniques Used in Developing a Course of Study in Science for the Horace Mann Elementary School. New York: Teachers College, Columbia University, 1927.

Downie, N. M., and R. W. Heath. Basic Statistical Methods. Second Edition. New York: Harper & Row, 1965.

Ebel, Robert L. Measuring Educational Achievement. Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1965.

Education Development Center. Goals for the Correlation of Elementary Science and Mathematics. Boston: Houghton Mifflin Co., 1969.

Guilford, J. P. The Nature of Human Intelligence. New York: McGraw-Hill, 1967.

Henry, Nelson B. (ed.). Rethinking Science Education. National Society for the Study of Education, Fifty-ninth Yearbook. University of Chicago Press, 1960.

________ (ed.). Science Education in American Schools. National Society for the Study of Education, Forty-sixth Yearbook. University of Chicago Press, 1947.
Horrocks, John E., and Thelma I. Schoonover. Measurement for Teachers. Columbus, Ohio: Charles E. Merrill Publishing Company, 1968.

Howe, Edward G. Systematic Science Teaching. New York: D. Appleton, 1894.

Hurd, Paul DeHart, and James Joseph Gallagher. New Directions in Elementary Science Teaching. Belmont, California: Wadsworth Publishing, 1969.

Jackman, Wilbur S. The Third Yearbook of the National Society for the Scientific Study of Education, Part II: Nature-Study. Chicago: University of Chicago Press, 1904.

Karplus, Robert, and Herbert D. Thier. A New Look at Elementary School Science. Chicago: Rand McNally, 1967.

National Education Association. The Central Purpose of American Education. Washington, D. C., 1961.

Piaget, Jean. Six Psychological Studies. New York: Random House, 1967.

Whipple, Guy M. (ed.). A Program for Teaching Science. National Society for the Study of Education, Thirty-first Yearbook. Chicago: University of Chicago Press, 1932.

Periodicals

American Association for the Advancement of Science. "The Process Method of Science Teaching." Grade Teacher, LXXXIII (January, 1966), 59-61, 113.

Borton, Terry. "What's Left When School's Forgotten?" Saturday Review, LIII (April 18, 1970), 69-71, 79.

Burns, Richard W., and Gary D. Brooks. "What Are Educational Processes?" The Science Teacher, XXXVII (February, 1970), 27-28.

Cole, Henry P. "Process Curricula and Creativity Development." Journal of Creative Behavior, III, No. 4 (Fall, 1969), 244.

Gagne, Robert M. "Contributions to Human Development." Psychological Review, LXXV, No. 3, 177-191.

________. "Learning Hierarchies." Presidential Address, Division 15, American Psychological Association, San Francisco, August, 1968.

Smith, Herbert A. "Educational Research Related to Science Instruction for the Elementary and Junior High School: A Review and Commentary." Journal of Research in Science Teaching, I, No. 3 (1963), 206.

Unpublished Works

Beard, Jean.
"Group Achievement Tests Developed for Two Basic Processes of AAAS Science - A Process Approach." Unpublished dissertation, Oregon State University, 1970.

Kresse, F. H. "Materials and Activities for Teachers and Children." A project to develop and evaluate multi-media kits for elementary schools, Volumes I and II. Washington, D. C.: U.S.O.E. Final Report, Project #5-0710, 1968.

Robison, Richard Wayne. "The Development of Items Which Assess the Processes of Controlling Variables and Interpreting Data." Unpublished dissertation, Michigan State University, 1970.

Tannenbaum, Robert S. "The Development of the Test of Science Processes." Unpublished dissertation, Teachers College, Columbia University, 1968.

APPENDICES

APPENDIX A

SAMPLE RECORD SHEETS AND WORK SHEETS FOR INDIVIDUAL COMPETENCY MEASURES

RECORD SHEET
INDIVIDUAL COMPETENCY MEASURES

Name ______________________________

Process                        Score
Formulating Hypotheses
  FH-2                         _____
  FH-4                         _____
                     TOTAL     _____
Defining Operationally
  DO-4                         _____
  DO-5                         _____
  DO-6                         _____
                     TOTAL     _____

Formulating Hypotheses 2
THE PUSH-ROD BOX
INDIVIDUAL COMPETENCY MEASURE

Task 1: The child constructs a hypothesis that could explain what is going on inside the box. For example: The strings are attached to a coiled spring inside the box, and there is a metal object on the end of each string that stops it from being pulled out of the box.

Task 2: The child makes any inference that can be tested. For example: If I pull one string out as far as it will go, I should be able to pull the other string out until it is stopped by the metal object attached to it.

Task 3: The pupil carries out the test.

Task 4: He constructs a hypothesis that takes account of the observation that in both boxes the two strings can be pulled out at the same time, but that in one case more force has to be exerted to do this.

Formulating Hypotheses 4
TASTERS AND NONTASTERS
INDIVIDUAL COMPETENCY MEASURE

Task 1: The child says not very many or maybe none.
Task 2: The child says they are more apt to be boys.

Task 3: The child states a hypothesis such as: Color blindness does not occur very frequently, or something equivalent.

Task 4: The child states that color blindness is more frequent in males than in females, or something equivalent.

Task 5: The child answers yes.

Task 6: The child answers no.

Task 7: The child answers no.

Task 8: The child answers no or says it does not tell him anything about children of color-blind parents.

Task 9: The child answers yes, indicating that all of them will be color-blind according to the table.

Task 10: The child answers that if the mother is color-blind, then all her sons will be color-blind, or something equivalent.

Defining Operationally 4
DETERMINING THE DIRECTION OF TRUE NORTH
INDIVIDUAL COMPETENCY MEASURE

Task 1: The child draws the shortest line from O to the curve and labels it N.

Task 2: The child measures and constructs an angle to the right of his line, ON, which is within 2 degrees of the required 110 degrees (by your measurement).

Task 3: The child says or writes a statement similar to the following: The direction of true north is the direction of the shadow of a vertical object at a time halfway between the times of rising and setting of the sun.

Defining Operationally 5
USING OPERATIONAL DEFINITIONS OF PARTS OF LIVING THINGS
INDIVIDUAL COMPETENCY MEASURE

Task 1: The child constructs an operational definition that describes the observable characteristics of the peanut and the use to which it can be put. For example: an object which is light brown in color with a smooth but uneven surface, approximately __ cm in length and __ cm in width. When shaken it rattles. It contains one, two, or three small ellipsoidal brown objects that can be eaten.

Task 2: The child labels the Centruroides with an A.

Task 3: The child identifies at least three observable characteristics of animal A.

Task 4: The child labels the Argiope with a B.

Task 5: The child identifies at least three observable characteristics of animal B.
Defining Operationally 6
MASS
INDIVIDUAL COMPETENCY MEASURE

Task 1: The child demonstrates an acceptable procedure in which he counts and records the number of vibrations with several different standard masses.

Task 2: The child constructs a graph of the data he has collected and recorded.

Task 3: The child places the cube on the vibrator and counts and records the number of vibrations.

Task 4: The child uses the graph to determine the mass of the cube.

Task 5: The child states an operational definition such as: The mass of an object is equal to the number of grams which give the same number of up and down motions as the object that is placed in the pan.

WORKSHEET

O

The direction of true north is ______________________________

DEFINITIONS

1. Argiope: body ellipsoidal; body not pointed; eight legs; each leg is longer than 3 cm; head is shaped like a shield.

2. Limulus: body pointed at tail end; point is straight; more than eight legs and claws; one small hook at the end of each leg.

3. Boophilus: body ellipsoidal; body not pointed; eight short legs (less than 1 cm long); head is not shaped like a shield.

4. Centruroides: pointed at tail end; point is curved; more than eight legs and claws; each leg and claw is less than 3 cm long; two small hooks at the end of each leg.

Table 1

                                                           Sons    Daughters
When both father and mother are color-blind, color
blindness in the children occurs as follows:                all       all
When mother is color-blind and father is not color-
blind, color blindness in the children is as follows:       all      none
When father is color-blind and mother is not color-
blind, color blindness in the children is as follows:      0.25      0.25
When neither father nor mother is color-blind, color
blindness in the children is as follows:                   0.25      none

APPENDIX B

GROUP TEST ITEMS WITH PHOTOGRAPHS AND SCRIPT TO REPRESENT SLIDE PRESENTATION

by Darrel W. Fyffe

DIRECTIONS

Please do not make any marks in this test booklet.
When you are instructed on the test to WAIT, please look toward the front of the room so that one can see when more information may be given to you. Listen carefully to the instructions that will be read. Do not return to an earlier question to change an answer after more information has been given.

The test items developed in this study follow. Also included is the script that was read to the students during administration of the group test items. The photographs are representative of the color slides which were projected. Directions for the narrator are underlined, while those for students are capitalized and underlined.

Some members of a science class wanted to see how far over the edge of a table they could balance a group of sticks. Show slide 1. Each stick is twelve units long. Show slide 2. The balancing of each stick would look something like this. Show slide 3. Turn to page one and answer questions one and two.

1. If one stick is placed on top of the table and extended until it almost falls, where will the end of the stick reach? The number will be slightly less than:
   1. 1
   2. 6
   3. 9
   4. 12
   5. cannot tell

2. If you could choose the next slide so that the answer to the above question would be known, which of the following would you choose?
   1. a picture of a stick balancing
   2. another picture of the size of the stick
   3. a diagram of the science laws explaining it
   4. any of the above would be satisfactory
   5. none of the above would be satisfactory

STOP. WAIT FOR INSTRUCTIONS.

The picture on the screen shows two pieces of posterboard marked A and B. Show slide 9. Each has a paper clip attached by what appears to be a white string. The two frames are now shown where A is in the same position but B has been turned upside down. Show slide 10. Turn to page three and answer question seven.

7. Based on this experiment, you might suppose that:
   1. string B is held straight by a fine thread tied to the paper clip.
   2.
String A will remain straight when the frame is turned over.
   3. string A will fall when the frame is turned over.
   4. all of the above are true.

STOP. WAIT FOR INSTRUCTIONS.

This slide shows frame A, also, turned upside down just like frame B. Show slide 11. Turn the page and answer both questions.

8. This additional information allows one to observe that string:
   1. A is not held straight by a fine thread.
   2. B is made of a different material than string A.
   3. A is made of a stiff wire that is now bent.
   4. B is made of a stiff wire that is straight.

9. What evidence is there that suggests that frame B is different from frame A?
   1. Frame A is upside down.
   2. Frame B is not upside down.
   3. Paper clip A fell.
   4. Paper clip B did not fall.

STOP. WAIT FOR INSTRUCTIONS.

This picture shows both frames to be upside down as before. Show slide 12. However, both strings are now lying at the bottom. Turn to page five and answer question ten.

10. Based on the evidence from the slides, which of the following are possible explanations?
   1. String B was held straight by a fine thread that has been cut.
   2. String B was held straight by a magnet that has been removed.
   3. String B was made of stiff wire that had been painted white.
   4. All of the above appear to be possible from evidence in the slides.

GO ON TO THE NEXT QUESTION.

[Graph: THE BENDING OF A LONG STICK, showing height from floor (cm) plotted against weight (grams)]

14. Based on this experiment, an object that has a weight of 100 grams is one that:
   1. has 100 g. marked on it.
   2. will bend the stick to 71.2 centimeters from the floor.
   3. has been measured on a scale.
   4. is made of metal.
   5. is all of the above.

STOP. WAIT FOR INSTRUCTIONS.

The two bottles are filled. Show slide 22. Each is now placed in a plastic bag... Show slide 23... and then in a freezer that is set at 5 degrees Fahrenheit. Show slide 24. Turn to page seven, answer questions 15, 16, and 17, and then wait for the next slide.

15. Water freezes at temperatures of 32°F or below. Therefore one might expect the liquid in bottles X and Y to be frozen one day later because:
   1. water freezes at any temperature of 32°F or below.
   2. the liquids in the bottles look like water.
   3. one day is long enough to freeze water.
   4. all of the above are true.

16. If both bottles were examined two hours later and neither is frozen, this suggests that:
   1. neither contains water.
   2. more time is needed.
   3. both of the above are true.
   4. neither of the above are true.

17. If a jar of liquid freezes when left in a freezer for three hours or more it is one of the Cronons. Therefore:
   1. liquid X is a Cronon.
   2. liquid Y is a Cronon.
   3. both liquids are Cronons.
   4. none of the above are true.

The time is now one day later. Show slide 25. The bottles are removed and you can see the results. One bottle is broken and its liquid is frozen. Answer questions 18 and 19.

18. You now have evidence that the two liquids are:
   1. the same.
   2. different.
   3. Cronons.
   4. water.

19. If another jar of water is left in a freezer for three hours and is found to be frozen, we know that:
   1. the freezer was at a temperature of 0°F.
   2. the water was cold before freezing.
   3. water is one of the Cronons.
   4. all of the above must be true.

STOP. WAIT FOR THE NEXT SLIDE.

This slide shows the liquids that were in each bottle. Show slide 26. They are sitting beside the same bottle into which they were poured. Turn to page eight and answer questions 20, 21, and 22.

20. Richard expressed an idea that all glass bottles will break when the liquid they contain freezes. He tests this by using five bottles. A is left empty, B is one-fourth full, C is half full, D is three-fourths full, and E is full. All are placed in a freezer. If the idea is to be supported, which bottles must break?
   1. E
   2. D and E
   3. A, C, and E
   4. B, C, D, and E
   5. A, B, C, D, and E

21. State a rule for deciding which liquids will freeze:
   1. Alcohol will freeze at 0°F.
   2.
Water will freeze and alcohol will remain liquid at 5°F.
   3. Alcohol will freeze and water will remain liquid at 5°F.
   4. Alcohol will not freeze at any temperature.

22. Based on this experiment, alcohol may be defined as a liquid that:
   1. freezes at a temperature below 5°F.
   2. will not freeze at 5°F.
   3. will not freeze at any temperature.
   4. will do both 1 and 2.

STOP. WAIT FOR INSTRUCTIONS.

Using the following data table, answer the following questions.

ROD    MATERIAL    LENGTH    TYPE      TIME
A      metal       2 cm      solid     5 secs
B      metal       2 cm      hollow    10 secs
C      metal       8 cm      solid     5 secs
D      metal       8 cm      hollow    10 secs
E      plastic     2 cm      solid     5 secs
F      plastic     2 cm      hollow    10 secs
G      plastic     8 cm      solid     5 secs
H      plastic     8 cm      hollow    10 secs

41. A solid rod is one which:
   1. is metal.
   2. is 8 centimeters long.
   3. rolls down the incline in 5 seconds.
   4. does or is all of the above.

42. Evidence has been recorded to suggest that the rolling time for hollow rods:
   1. is ten seconds in all cases.
   2. does not change with the material.
   3. does not change with the length.
   4. is all of the above.

STOP. WAIT FOR INSTRUCTIONS.

The following information is used for questions 43 to 50. A group of students dissolved saccharin tablets in one cup of water. They measured the time required for one tablet to dissolve in water of different temperatures. The results of this experiment are listed in the data table below and pictured in the graph.

TEMPERATURE OF WATER        TIME TO DISSOLVE
(degrees Centigrade)        (seconds)
10 degrees                  60
30 degrees                  30
60 degrees                  20
80 degrees                  15

[Graph: time to dissolve (seconds) plotted against temperature of water (degrees Centigrade)]

43. A jar of water at 100 degrees Centigrade is allowed to cool. It is at 40 degrees Centigrade when:
   1. no steam is visible.
   2. it feels warm to the hand.
   3. a saccharin tablet dissolves in it in 25 seconds.
   4. all of the above are true.

51. Jean watches a bull fight and states that bulls charge red objects. To test that idea she should use a bull and:
   1.
no matador; red objects and objects of other colors placed about the ring.
   2. a matador standing and holding various colored objects for a short time each.
   3. a moving matador waving a cape that is red on one side and green on the other.
   4. a moving matador waving several capes of many different colors for a short time each.

52. When a quart of water and a quart of alcohol are mixed, the volume of the mixture is less than two quarts. Which of the following is not a possible explanation for this observation?
   1. Alcohol evaporates quickly.
   2. Alcohol and water expand when mixed.
   3. Liquids have space between their molecules.
   4. Liquids cool and contract when mixed.

53. Judy wishes to perform an experiment to test the idea that the slimy secretion on a snail's foot protects it from injury on sharp surfaces. She should:
   1. get a snail to move across broken glass.
   2. try to cut a snail's foot with a knife.
   3. wash the secretion off a snail's foot and then have that snail move across broken glass.
   4. consult a good book about snails.
   5. do both 1 and 3 of the above.

The following information is used for questions 54 and 55. Two stars which orbit each other are called twin stars. One particular star, Algol, has a star orbiting it every three days blocking its light. This star has never been seen by anyone and does not produce light.

54. This star is called a twin star because it:
   1. looks like Algol.
   2. orbits Algol.
   3. passes in front of Algol.
   4. blocks the light from Algol.

55. A star that does not produce light is called a dark star. Which of the following is a dark star?
   1. Algol.
   2. The twin of Algol.
   3. A star that appears dark in the night sky.
   4. A star that consists of dark rock.

56. A paper cup is partially filled with water and is heated by a candle placed under it. Although the flame is near the cup, it does not burn the cup. One could explain this by saying that the:
   1. cup may have become water soaked.
   2. cup may be made of special paper.
   3. water may be leaking through the bottom.
   4. water may absorb the heat.
   5. above statements are all possible.

57. Robert believes that all plant stems grow toward their source of light. Which of the following experiments does not support his belief? A plant grown:
   1. in a window facing the sun will lean toward the window.
   2. inside, under lights, will lean toward that light.
   3. outside, in the sunlight, will grow without leaning.
   4. inside, in total darkness, will grow without leaning.

58. Work is done when an object is moved. In which of the following is no work being done?
   1. a tow truck pulling a car.
   2. a boy throwing a baseball.
   3. a man holding a heavy book.
   4. a man walking along a road.

59. All stiff objects can be bent a small amount when any force is applied. This idea can be accepted until:
   1. people no longer believe it.
   2. someone finds several objects that will bend.
   3. someone says it is not true.
   4. someone finds an object that will not bend.

60. Which word in the statement, "I expect you to correctly solve all your math homework soon," requires a clearer explanation?
   1. math
   2. solve
   3. soon
   4. you

61. An experiment is performed in which the stem and leaves of a tomato plant are removed and grafted onto a tobacco plant root. Also, the tobacco plant stem and leaves are grafted onto the tomato plant root. The plants are examined after several weeks and nicotine is found in the tomato leaves. One might predict that this is observed because the nicotine is produced in the:
   1. tobacco leaves.
   2. tobacco stem.
   3. tobacco roots.
   4. tomato leaves.

62. Mary has a thermometer in her bedroom. Her thermometer is best described as a (an):
   1. mercury-filled tube.
   2. device to measure temperature.
   3. expensive gift from a friend.
   4. indoor thermometer.

69. You are given a block of wood and a container of an unknown liquid. To determine whether the wood will float in that liquid you could:
   1.
find the specific gravity of the wood, if it is greater than 1.00 the wood will sink. 2. put the block of wood in the unknown liquid and watch it. 3. put the block of wood in other liquids and watch them. 4. put blocks of wood, but not the one given you, in your unknown liquid. Each of the following statements refers to an experiment. Which tells most clearly what to do and what to observe. 1. Add 5 cubic centimeters of sodium hydroxide to 50 cubic centimeters of grape juice. 2. Add sodium hydroxide to grape juice, the juice will change color. 3. An increase in the hydroxide concentration will cause color variations in indicators. 4. Grape juice contains chemical indicators that have definite colors. Which of the following definitions of a mountain is least likely to be disagreed upon by two different persons? 1. projects above its surroundings. 2. is higher than a hill. 3. has an altitude of 5000 feet or more. 4 requires considerable work and time to climb. 72. 96 A girl removed a lid from a jar by prying with the blade Of a table knife. From that Operation you might say a knife is a: l. sterling silver object with one sharp edge and a decorated handle. 2. stainless steel Object about eight inches long with a thin blade. 3. metal Object that can be used as a lever to Open jars. 4. form of an inclined plan that reduces the force needed to cut. Use the following contour map to answer the following three questions: 76. 79. A A person goes uphill when they more to a higher elevation. Which Of the following is uphill? l. B to A 2. B to D 3. C to A 4. D to B Plant cells have protoplasm that is either a liquid or a soft jelly. Several years after that idea was accepted a scientist found one type of green plant which had hard protoplasm. The idea should be changed to say that protoplasm of plant cells is: ' 1. liquid, hard or a soft jelly. 2. the non-living material Of the cell. 3. yellow in color. 4. none of the above. 
APPENDIX C

ITEM ANALYSIS DATA FOR FORMULATING HYPOTHESES ITEMS

Item 1     The Correct Option is 2

                    Item Response Pattern
               1     2     3     4     5   Omit  Total
  Upper 27%    1     7     4     1     1     1     15
  Middle 46%   0    15     3     1     6     1     26
  Lower 27%    1     7     4     1     1     1     15
  Total        2    29    11     3     8     3     56

  Item Statistics:  Index of Difficulty 48   Index of Discrimination .00

Item 2     The Correct Option is 1

                    Item Response Pattern
               1     2     3     4     5   Omit  Total
  Upper 27%    6     1     3     1     3     1     15
  Middle 46%   9     0     7     3     6     1     26
  Lower 27%    3     3     1     5     2     1     15
  Total       18     4    11     9    11     3     56

  Item Statistics:  Index of Difficulty 68   Index of Discrimination .20

Item 7     The Correct Option is 4

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    8     5     1     0     1     15
  Middle 46%  14     8     2     1     1     26
  Lower 27%    5     4     2     3     1     15
  Total       27    17     5     4     3     56

  Item Statistics:  Index of Difficulty 92   Index of Discrimination .20 Negative

Item 8     The Correct Option is 1

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%   11     1     0     2     1     15
  Middle 46%  20     2     2     1     1     26
  Lower 27%    7     2     1     4     1     15
  Total       38     5     3     7     3     56

  Item Statistics:  Index of Difficulty 32   Index of Discrimination .26

Item 9     The Correct Option is 4

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    0     0     7     7     1     15
  Middle 46%   0     0    13    11     2     26
  Lower 27%    0     3     6     5     1     15
  Total        0     3    26    23     4     56

  Item Statistics:  Index of Difficulty 58   Index of Discrimination .13

Item 10    The Correct Option is 4

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    1     4     0     8     2     15
  Middle 46%   1     9     0    13     3     26
  Lower 27%    3     3     1     7     1     15
  Total        5    16     1    28     6     56

  Item Statistics:  Index of Difficulty 51   Index of Discrimination .06

Item 15    The Correct Option is 4

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    1     3     0    10     1     15
  Middle 46%   6     2     2    15     1     26
  Lower 27%    5     1     1     7     1     15
  Total       12     6     3    32     3     56

  Item Statistics:  Index of Difficulty 42   Index of Discrimination .20

Item 16    The Correct Option is 3

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    1     6     7     0     1     15
  Middle 46%   2    10     8     5     1     26
  Lower 27%    2     7     3     1     2     15
  Total        5    23    18     6     4     56

  Item Statistics:  Index of Difficulty 68   Index of Discrimination .27

Item 20    The Correct Option is 4

                    Item Response Pattern
               1     2     3     4     5   Omit  Total
  Upper 27%    5     1     1     5     2     1     15
  Middle 46%   9     0     0     8     8     1     26
  Lower 27%    5     4     1     3     1     1     15
  Total       19     5     2    16    11     3     56

  Item Statistics:  Index of Difficulty 72   Index of Discrimination .13

Item 42    The Correct Option is 4

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    4     3     0     8     0     15
  Middle 46%  12     3     2     9     0     26
  Lower 27%    5     2     2     3     3     15
  Total       21     8     4    20     3     56

  Item Statistics:  Index of Difficulty 64   Index of Discrimination .33

Item 51    The Correct Option is 1

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    8     3     1     3     0     15
  Middle 46%  17     0     3     6     0     26
  Lower 27%    5     1     5     1     3     15
  Total       30     4     9    10     3     56

  Item Statistics:  Index of Difficulty 46   Index of Discrimination .20

Item 52    The Correct Option is 2

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    3     8     1     3     0     15
  Middle 46%   2    19     2     1     2     26
  Lower 27%    4     4     2     2     3     15
  Total        9    31     5     6     5     56

  Item Statistics:  Index of Difficulty 45   Index of Discrimination .26

Item 53    The Correct Option is 5

                    Item Response Pattern
               1     2     3     4     5   Omit  Total
  Upper 27%    1     1     2     3     8     0     15
  Middle 46%   1     1     3     4    17     0     26
  Lower 27%    0     1     2     0     9     3     15
  Total        2     3     7     7    34     3     56

  Item Statistics:  Index of Difficulty 40   Index of Discrimination .07 Negative

Item 56    The Correct Option is 5

                    Item Response Pattern
               1     2     3     4     5   Omit  Total
  Upper 27%    0     2     0     2    11     0     15
  Middle 46%   1     0     0     4    21     0     26
  Lower 27%    2     0     1     1     8     3     15
  Total        3     2     1     7    40     3     56

  Item Statistics:  Index of Difficulty 29   Index of Discrimination .20

Item 57    The Correct Option is 4

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    2     1     4     8     0     15
  Middle 46%   5     3     8    10     0     26
  Lower 27%    1     2     3     6     3     15
  Total        8     6    15    24     3     56

  Item Statistics:  Index of Difficulty 57   Index of Discrimination .13
Item 59    The Correct Option is 4

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    0     0     1    14     0     15
  Middle 46%   1     1     1    23     0     26
  Lower 27%    0     4     1     7     3     15
  Total        1     5     3    44     3     56

  Item Statistics:  Index of Difficulty 21   Index of Discrimination .46

Item 61    The Correct Option is 3

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    1     5     7     2     0     15
  Middle 46%   7     5    12     1     1     26
  Lower 27%    3     2     7     0     3     15
  Total       11    12    26     3     4     56

  Item Statistics:  Index of Difficulty 53   Index of Discrimination .00

Item 79    The Correct Option is 1

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%   11     2     0     2     0     15
  Middle 46%  13     0     2     9     2     26
  Lower 27%    4     5     1     2     3     15
  Total       28     7     3    13     5     56

  Item Statistics:  Index of Difficulty 50   Index of Discrimination .46

APPENDIX D

ITEM ANALYSIS DATA FOR DEFINING OPERATIONALLY ITEMS

Item 14    The Correct Option is 2

                    Item Response Pattern
               1     2     3     4     5   Omit  Total
  Upper 27%    0     8     1     0     4     2     15
  Middle 46%   0    18     1     0     6     1     26
  Lower 27%    3     3     3     1     2     3     15
  Total        3    29     5     1    12     6     56

  Item Statistics:  Index of Difficulty 48   Index of Discrimination .33

Item 17    The Correct Option is 2

                    Item Response Pattern
               1     2     4     5   Omit  Total
  Upper 27%    0     0     8     6     1     15
  Middle 46%   0     6     6    11     3     26
  Lower 27%    0     0    11     3     1     15
  Total        0     6    25    20     5     56

  Item Statistics:  Index of Difficulty 90   Index of Discrimination .00

Item 18    The Correct Option is 2

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    0    14     0     0     1     15
  Middle 46%   0    24     1     0     1     26
  Lower 27%    0    14     0     0     1     15
  Total        0    52     1     0     3     56

  Item Statistics:  Index of Difficulty 7    Index of Discrimination .00

Item 19    The Correct Option is 3

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    0     1    11     2     1     15
  Middle 46%   1     4    13     6     2     26
  Lower 27%    0     2     5     6     2     15
  Total        1     7    29    14     5     56

  Item Statistics:  Index of Difficulty 49   Index of Discrimination .40

Item 21    The Correct Option is 2

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    0    11     1     2     1     15
  Middle 46%   2    15     3     5     1     26
  Lower 27%    0     9     3     2     1     15
  Total        2    35     7     9     3     56

  Item Statistics:  Index of Difficulty 38   Index of Discrimination .13

Item 22    The Correct Option is 1

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    1     5     2     6     1     15
  Middle 46%   4    12     4     3     3     26
  Lower 27%    0     7     5     2     1     15
  Total        5    24    11    11     5     56

  Item Statistics:  Index of Difficulty 92   Index of Discrimination .07

Item 41    The Correct Option is 3

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    1     1    11     2     0     15
  Middle 46%   3     0    12     9     2     26
  Lower 27%    4     1     3     4     3     15
  Total        8     2    26    15     5     56

  Item Statistics:  Index of Difficulty 54   Index of Discrimination .53

Item 43    The Correct Option is 3

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    2     3     6     4     0     15
  Middle 46%   1     2    14     7     2     26
  Lower 27%    2     3     6     3     1     15
  Total        5     8    26    14     3     56

  Item Statistics:  Index of Difficulty 53   Index of Discrimination .00

Item 54    The Correct Option is 2

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    0    12     1     2     0     15
  Middle 46%   1    15     6     2     2     26
  Lower 27%    4     6     2     2     1     15
  Total        5    33     9     6     3     56

  Item Statistics:  Index of Difficulty 41   Index of Discrimination .40

Item 55    The Correct Option is 2

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    4    11     0     0     0     15
  Middle 46%   7    16     1     0     2     26
  Lower 27%    3     5     4     2     1     15
  Total       14    32     5     2     3     56

  Item Statistics:  Index of Difficulty 43   Index of Discrimination .40

Item 58    The Correct Option is 3

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    1     0    13     1     0     15
  Middle 46%   1     1    14     7     3     26
  Lower 27%    4     2     5     3     1     15
  Total        6     3    32    11     4     56

  Item Statistics:  Index of Difficulty 43   Index of Discrimination .54

Item 60    The Correct Option is 3

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    1     7     7     0     0     15
  Middle 46%   2     9    10     2     3     26
  Lower 27%    3     6     3     2     1     15
  Total        6    22    20     4     4     56

  Item Statistics:  Index of Difficulty 64   Index of Discrimination .27

Item 62    The Correct Option is 2

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    2    12     0     1     0     15
  Middle 46%   1    15     0     8     2     26
  Lower 27%    4     6     1     3     1     15
  Total        7    33     1    12     3     56

  Item Statistics:  Index of Difficulty 41   Index of Discrimination .40

Item 69    The Correct Option is 2

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    1    13     1     0     0     15
  Middle 46%   0    18     4     2     2     26
  Lower 27%    0     9     3     2     1     15
  Total        1    40     8     4     3     56

  Item Statistics:  Index of Difficulty 28   Index of Discrimination

Item 70    The Correct Option is 2

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    6     5     2     2     0     15
  Middle 46%  11     6     5     2     2     26
  Lower 27%    2     9     2     1     1     15
  Total       19    20     9     5     3     56

  Item Statistics:  Index of Difficulty 64   Index of Discrimination .27 Negative

Item 71    The Correct Option is 3

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    9     4     1     1     0     15
  Middle 46%  10     6     5     3     2     26
  Lower 27%    3     4     4     3     1     15
  Total       22    14    10     7     3     56

  Item Statistics:  Index of Difficulty 82   Index of Discrimination .20 Negative

Item 72    The Correct Option is 3

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%    0     1    11     3     0     15
  Middle 46%   1     3    14     6     2     26
  Lower 27%    4     2     4     4     1     15
  Total        5     6    29    13     3     56

  Item Statistics:  Index of Difficulty 48   Index of Discrimination .46

Item 76    The Correct Option is 1

                    Item Response Pattern
               1     2     3     4   Omit  Total
  Upper 27%   10     0     4     0     1     15
  Middle 46%  16     1     5     2     2     26
  Lower 27%    6     2     2     4     1     15
  Total       32     3    11     6     4     56

  Item Statistics:  Index of Difficulty 43   Index of Discrimination .27
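The indices reported in these appendices appear to follow the standard upper/lower-group conventions: the Index of Difficulty corresponds to the percentage of all 56 students who did not choose the correct option, and the Index of Discrimination to the difference between the numbers of correct responses in the upper and lower 27 per cent groups, divided by the group size of 15. As a minimal illustrative sketch of that computation (modern code, not part of the original study; the formulas are assumed from their consistency with the tabled values):

```python
# Illustrative sketch only (not from the thesis): upper/lower 27 per cent
# item analysis consistent with the tables above. Assumed conventions:
#   difficulty     = percent of ALL examinees NOT choosing the correct option
#   discrimination = (upper-group correct - lower-group correct) / group size

def item_statistics(upper, middle, lower, correct_option):
    """Each group maps an option number (or 'omit') to a response count."""
    n_total = sum(upper.values()) + sum(middle.values()) + sum(lower.values())
    correct_total = (upper.get(correct_option, 0)
                     + middle.get(correct_option, 0)
                     + lower.get(correct_option, 0))
    # Percentage of all examinees who did NOT answer the item correctly.
    difficulty = round(100 * (n_total - correct_total) / n_total)
    # Difference in correct responses between the two criterion groups,
    # expressed as a proportion of the (equal) group size.
    discrimination = (upper.get(correct_option, 0)
                      - lower.get(correct_option, 0)) / sum(upper.values())
    return difficulty, discrimination

# Item 2 of Appendix C (correct option 1):
upper  = {1: 6, 2: 1, 3: 3, 4: 1, 5: 3, 'omit': 1}   # n = 15
middle = {1: 9, 2: 0, 3: 7, 4: 3, 5: 6, 'omit': 1}   # n = 26
lower  = {1: 3, 2: 3, 3: 1, 4: 5, 5: 2, 'omit': 1}   # n = 15

difficulty, discrimination = item_statistics(upper, middle, lower, 1)
# difficulty -> 68, discrimination -> 0.20, matching the Item 2 table
```

Run against the Item 2 data above, this reproduces the listed values of 68 and .20. The rounding of the published discrimination indices is not perfectly uniform across items (e.g., 4/15 appears as both .26 and .27), so last-digit differences from this sketch can occur.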