COMPARISON AMONG TEACHERS IN KHON KAEN, THAILAND TO DETERMINE THEIR TESTING NEEDS

A Dissertation for the Degree of Ph.D.
MICHIGAN STATE UNIVERSITY
Sor-Wasna Pravalpruk
1974

This is to certify that the thesis entitled COMPARISON AMONG TEACHERS IN KHON KAEN, THAILAND TO DETERMINE THEIR TESTING NEEDS presented by Sor-Wasna Pravalpruk has been accepted towards fulfillment of the requirements for the Ph.D. degree in Education.

ABSTRACT

COMPARISON AMONG TEACHERS IN KHON KAEN, THAILAND TO DETERMINE THEIR TESTING NEEDS

By Sor-Wasna Pravalpruk

In-service programs in educational measurement, with emphasis on test construction procedures (i.e., test planning and curriculum analysis, test item editing, item analysis, and reporting test scores), have been given to teachers in various parts of Thailand by the Ministry of Education and the Test Bureau at the College of Education, Prasarnmitr. It was the purpose of this study to find out which group of teachers in Khon Kaen province actually needs in-service training and in which areas of measurement the need is greatest. This study also yields some follow-up information on the effects of the earlier in-service programs.

Two types of needs, actual and perceived, were investigated. A Likert-type questionnaire was constructed to measure the perceived needs. The items were a random sample of the tasks usually involved in the process of test construction. The respondent was asked to rate each task from 1 to 5, where 1 meant most needed and 5 not needed at all. The first edition of the questionnaire had eight items for each of the four areas of in-service training, for a total of thirty-two items.
The actual needs were measured by a random sampling of test items measuring knowledge of educational measurement, similar to those items used in introductory measurement courses at Michigan State University. There were thirty-two items, eight under each of the four areas of the in-service program. Both questionnaire and test were tried out, and the reliabilities were found to be .87 and .41 respectively.

The revised editions of the questionnaire and test had twenty-eight items and twenty items respectively. They were sent, together with the demographic data sheet and an introduction letter, to 400 teachers who were randomly selected from eight strata in Khon Kaen province. The stratification was based on three variables: location of school (within or outside the university area); level of teacher education (holders of the teacher certificate or lower, or holders of the higher teacher certificate); and teaching experience (less or more experience, using eight years as the dividing line). Responses were received from teachers forming an equal sample size for each of the eight strata. Responses were coded and data cards were produced. All canned programs and FORTRAN programs were run on the CDC 6500 at the Computer Center at Michigan State University.

The descriptive information from the demographic data sheet indicated that the teacher-to-student ratio in the sample was lower than the ratio in the population (1:31 and 1:33 respectively). The correlation between total actual needs and total perceived needs was very low, i.e., .004. The correlation of the number of teachers in the respondent's school with perceived needs was low and negative (-.069); with actual needs it was .160. The test of homogeneity of the variance-covariance matrix was not significant for the four subscales of the actual needs but was significant for the perceived needs.
The univariate repeated measures analysis was applied to the actual needs and the multivariate test to the perceived needs. The three-way factorial design (2x2x2) with four repeated measures was analyzed by the FINN program. The design was crossed and balanced.

The analysis of the actual needs showed that the interaction among location of schools, level of teacher education, teaching experience, and the subject matter in measurement was significant. Descriptive follow-up showed that the less experienced teachers holding only the teacher certificate who worked within the university area and the more experienced teachers holding the higher teacher certificate who also worked within the university area had the most contrasting profiles. The four-way interaction was explained by the significance of the location by teacher education by teacher experience interaction on the contrast variable of Test Planning and Reporting Test Scores. The three-way interaction was not significant when Test Planning was analyzed individually, but the interaction was significant for Reporting Test Scores alone.

In the presence of the four-way interaction, the subject matter by level of teacher education by experience interaction, the subject matter by experience interaction, the subject matter by location interaction, the subject matter main effect, the location by experience interaction, and the teacher education main effect were significant at α = .05. All interactions were presented in graph form.

On perceived needs, only three null hypotheses were rejected. The subject matter by location interaction was studied along with the subject matter main effect. For teachers in the university area, the more experienced teachers had perceived needs ranging from highest to lowest in the following order: Item Analysis; Reporting Test Scores; Test Planning; and Item Editing.
But for those outside the university area, the perceived need for Item Editing ranked second, followed by Reporting Test Scores and Test Planning. In general, Item Analysis had the highest perceived needs.

The other null hypothesis rejected concerned the location of schools by teaching experience interaction. The interaction was disordinal: less experienced teachers within the university area had lower perceived needs than the more experienced teachers within the university area, but less experienced teachers outside the university area had higher perceived needs than more experienced teachers outside the university area. The difference in perceived needs between locations (inside and outside the university area) was smaller for the less experienced teachers than for the more experienced teachers.

COMPARISON AMONG TEACHERS IN KHON KAEN, THAILAND TO DETERMINE THEIR TESTING NEEDS

By Sor-Wasna Pravalpruk

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Personnel Services, and Educational Psychology

1974

ACKNOWLEDGEMENTS

I wish to express my sincere appreciation to my Advisor and Chairman of my study, Professor William A. Mehrens, for his assistance and suggestions throughout my program of study and during the preparation of this dissertation.

To all the members of my doctoral committee, I am grateful for their contribution of encouragement and insights. I thank Dr. Archibald B. Shaw, who was kind enough to spend many hours correcting my English, and Drs. Maryellen McSweeney and Willard Warrington for their help in the development and improvement of the study.

Appreciation is also conveyed to Nitra Somswasdi, my sister, for her help in distributing and collecting the questionnaire, and to Dr. Kowit Pravalpruk, my husband, for his help in analyzing the data.
His constant support and his encouragement to persist with my study were very much appreciated.

I acknowledge with appreciation the support of the Thai Government, which allowed me to pursue my doctoral study. Special thanks are also extended to Terry Nafisi-Movaghar, who patiently typed this dissertation.

Finally, for their encouragement, I wish to extend my gratitude to my parents, my brothers and sisters, and to Puttaporn, my daughter.

TABLE OF CONTENTS

Chapter

I The Problem
    Introduction
    Need for the Study
    Thai Educational System
    Purpose of the Study
    Hypotheses
    Definition of Terms
    Overview

II Review of the Literature
    The Importance of Classroom Testing
    The Qualifications of Teachers as Evaluators
    In-Service Programs in Testing and Evaluation
    Summary

III Procedure
    Population
    Sampling Procedure
    Instrument
    Data Collection
    Design
    Analysis
    Summary

IV Analyses and Results
    Descriptive Data
    Homogeneity of Variance-Covariance Matrices
    Repeated Measure Analysis on Actual Needs
    Repeated Measure Analysis on Perceived Needs
    Additional Analysis
    Summary

V Summary and Conclusions
    Summary and Discussion
    Conclusions and Implications
    Recommendations for Further Study

Appendix A The Instrument (First Edition)
Appendix B The Instrument (Final Edition)
Appendix C Pairwise Profile Analysis on Actual Needs
Appendix D Multivariate Analysis of Variance on Four Subscales on Actual Needs
Bibliography

LIST OF TABLES

Table

3.1 Total Number of Teachers in Khon Kaen Classified by Location of School, Class of Teacher Certificate and Years of Teaching Experience
3.2 Design of the Three Independent Variables: Location of Schools, Levels of Teacher Education, and Teaching Experience
4.1 Averages and Standard Deviations of the Number of Teachers and Number of Students in Sampled Schools Classified According to Eight Groups of Teachers
4.2 Inter-Correlations among Number of Teachers, Number of Students, Perceived Needs, and Actual Needs
4.3 Variance-Covariance and Correlation Matrices of Four Test Scores
4.4 Variance-Covariance and Correlation Matrices of Four Perceived Needs Scores
4.5 Means and Standard Deviations of Four Test Scores (Actual Needs)
4.6 Univariate Repeated Measured Analysis on Test Scores
4.7 Means and Standard Deviations of Four Perceived Needs Scores
4.8 Univariate and Multivariate Repeated Measured Analysis on Perceived Needs
4.9 Comparison on Perceived and Actual Needs between Teachers Who Attended and Did Not Attend the In-Service Program
4.10 Comparison on Perceived and Actual Needs between Teachers Who Were Favorable and Not Favorable to the National Testing Policy
4.11 Comparison on Perceived and Actual Needs between Male and Female Teachers
4.12 Comparison on Perceived and Actual Needs among Teachers Who Taught Different Grade Levels

LIST OF FIGURES

Figure

I Graph Presentation of Cell Means for Subject Matter, Teacher Education, and Experience (Actual Needs)
II Graph Presentation of Cell Means for Subject Matter and Teaching Experience (Actual Needs)
III Graph Presentation of Cell Means for Subject Matter and Location of Schools (Actual Needs)
IV Graph Presentation of Cell Means for Location of Schools and Teaching Experience (Actual Needs)
V Graph Presentation of Cell Means for Location of Schools and Subject Matter (Perceived Needs)
VI Graph Presentation of Cell Means for Location of Schools and Experience of Teachers (Perceived Needs)

CHAPTER I

THE PROBLEM

Introduction

Man has always been concerned with measurement and evaluation. From the earliest beginnings of society, men have measured the abilities of other men and have recognized the existence of differences in the abilities possessed by different individuals. As human society has grown in size and complexity, the recognition of individual differences has increasingly been reflected in the structure of societies.

In school, the teacher occupies a central role in the testing and evaluation process. The teacher must make judgments, measure, appraise and report. To do this he must use tests, examinations, inventories and other types of measuring instruments and devices. He must know how to construct such instruments for his own purposes, and how to use and interpret the results of others. He must know something about the analysis and interpretation of scores on tests.

Classroom testing not only provides an expedient method for grading students, it also gives the student a far more valid indication of what the teacher considers important than do statements made in the textbook or by the teacher.
Because it provides the basis for grades, the test indicates unmistakably to the student what should be learned or what is expected. Tests can also provide the teacher with needed information on precisely what students are learning and what they are not learning.

Need for the Study

Tests have become an integral part of our academic life. Teachers evaluate students from the time they are in kindergarten until they leave high school or college. Although decisions are often made on the basis of a student's scores, little, if any, thought is given to the qualifications or skill of the teacher as an evaluator. It is essential that we recognize the role that test results may play in the life of the student.

Teachers occupy a central role in the testing and evaluation process. Teacher-made tests are frequently the major basis for evaluating the progress of students. They may be better than good commercially prepared achievement tests, in the sense that they may be more relevant to a teacher's particular objectives and pupils. Unfortunately, many teacher-made tests do not serve their function optimally because very often teachers have not been given adequate preparation in this area of competence.

Two surveys done about 20 years ago (Noll, 1955; Allen, 1956) were in substantial agreement in showing that an introductory course in measurement is not generally required by State Departments of Education for a teaching certificate. Most institutions preparing teachers offer an introductory course in educational measurement, but comparatively few require it for a recommendation for a teaching certificate. More recent studies by Goslin (1967) and Roeder (1972) lead to the same conclusion: the problem of preparing teachers in measurement and evaluation is real, clear, and substantial. Many teachers do not seem to be adequately prepared in this respect, and apparently they recognize their inadequacy.
If we accept testing and the use of test results as a part of the teachers' role, efforts need to be made to make sure that this part of their role is not carried out in a context of naivety about tests and what they measure. The role of teachers in testing is too important to be left to chance.

In many systems, in-service courses are offered to teachers from time to time. A varying number of sessions is held according to the other demands being made on teachers. This type of program might help school personnel become more familiar with the theory and principles of measurement and evaluation.

Thai Educational System

The educational system in Thailand is quite different from that of the United States. Historically, the Thai educational system has had a strong connection with Buddhism, the main religion in Thailand. The wat (Buddhist temple) was the only place available for youngsters to be educated formally, and the Buddhist monks were the only teachers. The subject matter was Buddhist philosophy, history, and theology. Vocational education was either taught by a person skilled in that particular job or passed along within the family. Only comparatively few youngsters, and only boys, were given an opportunity to be educated in schools. Education for most was just on-the-job training or part of living in the family occupation. Most of the boys learned to do as their fathers did, and the girls learned only the skills of housework and home crafts. Therefore, testing for most was in the form of tests of performance, and evaluation was just "it is the time for a birdie to fly alone."

In 1871 the first school was established in the old palace in Bangkok, the capital city of Thailand. King Rama V (Chulalongkorn) supported the school, and formal education became the affair of the royal family and royal cousins. Going to school was aimed towards becoming a royal officer in one of the civil services.
Learning was to read, to write, and to have the social graces needed to get along in the King's services. Schooling proved to be such a success and so useful to the development of the kingdom that the King proclaimed that schools be established throughout the country within a few years. By 1887 there was at least one school in each province. The curriculum was basically the three R's. A British-type school was implemented. From the beginning and down to today, youngsters go to school three quarters of the year. There is no school during the summer. There are tests at the end of each quarter, but the final examination at the end of the school year is the basis for promotion to the next grade.

Early examinations were either in the form of class recital and individual recital, or in essay form when the recital examination was too time consuming. When the Ministry of Education was founded in 1892, a nation-wide system of examinations was put into practice. For a long time, all tests were in essay form.

In 1954 the objective test item was introduced. The initial effort was credited to M.L. Tuy Chumsai. With very little knowledge and understanding of the principles involved, teachers began to use objective tests. The lack of knowledge led to the production of bad objective test items.

The first textbook on educational measurement in the Thai language was written by Chaval Paeratakul and published in 1963. Two years later, a nation-wide in-service program in educational measurement was proposed and carried out by Chaval Paeratakul and the staff of the Test Bureau in the College of Education at Prasarnmitr in an attempt to improve the quality of teacher-made tests. During this period, the Test Bureau constructed four standardized tests for grades 4 and 7, two in the subject Thai Language and two in Arithmetic. The tests and their norms were published in 1971.
In 1965 in-service workshops were scheduled for 5 days and covered these topics: purposes of measurement; curriculum analysis as a blueprint for test construction; types of test items; item analysis; scores and norms; and reporting the test results to parents and authorities. The morning sessions included lectures by specialists from the College of Education and the Department of Teacher Training. The afternoon sessions were practicums. Teachers were assigned to one of the small groups working on blueprints, writing test items, performing item analysis, and reporting standard scores. In the past five years, the training has emphasized test construction and writing test items according to Bloom's Taxonomy.

The quality of the test is critical because the evaluation that follows can slow down a student's progress through school by one year. If a student fails the final examination at the end of the school year, he has to stay in the same grade for another year, with an unfavorable attitude and the label "repeater."

Since education in Thailand is in effect a selective system, fewer than 7 percent of the first graders will finish early secondary school (grade 10). Entrance examinations are used to select students who want to go on to upper secondary school (grades 11 and 12). Another entrance examination at the end of grade 12 is used for admission to universities. Good tests are needed in the selection process so that educational wastage is kept to a minimum.

In school, a student is mostly recognized and rewarded by teachers, parents, friends, and community for his cognitive abilities. Those cognitive abilities are measured by teacher-made tests. Again the quality of the test is crucial. Tremendous pressure is on a student whenever he is going to take a test. Sometimes his entire future depends upon one or two hours of his performance on the test. What will happen if one is not feeling well when the time comes to take the test?
In 1962 the Department of Teacher Training implemented a new curriculum in both the two-year and four-year programs of teacher education. The new curriculum added one course in educational measurement and evaluation to the four-year program. This course covered how to construct an objective test, from planning to interpretation of the test score. The first graduates of the new curriculum were assigned to schools in 1966. It was hoped that these new teachers would help other teachers in the same school learn how to construct objective tests. However, seniority is a very real barrier to communication between old teachers and new teachers. If the seniority barrier assumption is true, there will be a difference between the new teachers' needs and the older teachers' needs for an in-service program in testing. It is of interest to investigate whether the older teachers need more testing services.

Because of the centralized system of education in Thailand, information and policy are diffused from the Ministry of Education to the Provincial educational officers and then to the teachers. Within one province the central amphur receives the information first-hand. Other amphurs in the province get it later, depending on how far they are from the central amphur. In Khon Kaen, there is a university which is located in the central amphur. These two facts lead to the conclusion that teachers in the central amphur should have had more help and could have acquired knowledge about educational measurement more easily than those in other amphurs. Therefore, they should have less need for in-service courses than teachers who are not in the university area.

The level of the teachers' certificate furnishes another way to classify teacher groups. The holders of higher certificates took more courses in their four-year program and should have more knowledge about educational measurement.
Teachers who hold the certificate gained in a two-year program, it is hypothesized, may have more need for help than do the holders of higher teacher certificates.

As a result of this study, one should be able to conclude a) which group of teachers in Khon Kaen Province has the most urgent need, and b) what area of service is most needed. It is hypothesized that teachers who hold the Teacher Certificate or lower, who are not in the university area, and who are older teachers (have taught more than 8 years) have the highest need, and that writing objective items will be the most needed area of service.

Purpose of the Study

The purpose of this study is to identify the areas of instruction in measurement which are needed most by the teachers in the Thai province of Khon Kaen, and also to compare teachers who are grouped according to the amount of their teaching experience, the location of their schools, and the kind of teaching certificate they hold. Actual and perceived needs in four fundamental areas will be measured: planning test construction or curriculum analysis, test item editing, item analysis, and reporting test scores.

Hypotheses

The major hypotheses to be examined are as follows:

1. Teachers who are in the university area may have fewer actual and perceived needs than teachers who are farther from the university in Khon Kaen Province.

2. Teachers who hold higher teacher certificates may have fewer actual and perceived needs than teachers who hold the teacher certificate in Khon Kaen Province.

3. Older teachers (more teaching experience) are expected to have more actual and perceived needs than newer teachers in Khon Kaen Province.

4. No difference is expected in actual and perceived needs among the four subject matters covered in the study.

Besides these four hypotheses about main effects, all interactions will be tested.
These interaction effects are determined in two parts: the interactions among the three variables of location, level of teacher education, and level of experience; and the interactions between those three variables and the repeated measure variable, the four subject matter areas covered in this study.

Definition of Terms

Actual need is defined as a lack of knowledge in a sub-area of tests and measurement, as disclosed by responses to groups of related test items that are designed to test desirable knowledge of measurement and evaluation.

Perceived need is defined as the feeling of lacking desirable knowledge in tests and measurement, as measured by a Likert-type questionnaire about tasks in measurement and evaluation.

Amphur is a political subdivision of a province. There are 13 amphurs in Khon Kaen Province.

University Area is the Central Amphur of Khon Kaen Province. The university is Khon Kaen University.

Far From University Area describes all amphurs in Khon Kaen Province except the Central Amphur. It is also called outside university area.

Certificate teacher is a teacher who graduated from a two-year teacher training curriculum, or its equivalent.

Higher certificate teacher is a teacher who graduated from a four-year teacher training curriculum, or its equivalent.

New teacher is a teacher who has been teaching for less than eight academic years.

Old teacher is a teacher who has been teaching for eight or more academic years.

Overview

The study is organized into five chapters, followed by appendices. In Chapter II, the literature relevant to the general problem and related areas is reviewed. The design of the study, the pilot study, the instrumentation, and the methods of analysis are discussed in Chapter III. The research data and results of the study are presented in Chapter IV. The final chapter contains a summary of the study, the conclusions, a discussion of the findings, and implications for future research.
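As an illustration only, the operational definitions in the Definition of Terms section can be sketched in a few lines of Python. This is a minimal sketch, not the procedure used in the study: the `Teacher` record, its field names, and the label strings are hypothetical, and the only values taken from the text are the eight-year dividing line and the three two-level classification variables.

```python
from dataclasses import dataclass
from itertools import product

# From the Definition of Terms: fewer than eight academic years = "new" teacher.
EXPERIENCE_CUTOFF_YEARS = 8


@dataclass(frozen=True)
class Teacher:
    """Hypothetical record; field names are assumptions for illustration."""
    in_central_amphur: bool         # school located in the Central Amphur
    holds_higher_certificate: bool  # four-year curriculum or its equivalent
    years_teaching: int             # completed academic years of teaching


def stratum(teacher):
    """Return the (location, education, experience) cell for one teacher."""
    location = ("university area" if teacher.in_central_amphur
                else "outside university area")
    education = ("higher certificate" if teacher.holds_higher_certificate
                 else "certificate")
    experience = ("new" if teacher.years_teaching < EXPERIENCE_CUTOFF_YEARS
                  else "old")
    return (location, education, experience)


# Crossing three two-level variables gives the 2 x 2 x 2 = 8 groups of teachers.
ALL_STRATA = list(product(
    ("university area", "outside university area"),
    ("certificate", "higher certificate"),
    ("new", "old"),
))
```

For example, `stratum(Teacher(True, False, 3))` falls in the cell for less experienced certificate teachers within the university area, and a teacher with exactly eight years of teaching is classified as "old" under the definition above.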
CHAPTER II

REVIEW OF THE LITERATURE

The intent of this chapter is to review studies that are related to the problem as described in Chapter I. The review is divided into three sections: the importance of classroom testing; the qualifications of teachers as evaluators; and in-service programs in testing and evaluation.

The Importance of Classroom Testing

It is generally agreed that evaluation of pupil progress is a major aspect of the teacher's job. A good picture of where the pupil is and of how he is progressing is fundamental both to effective teaching and to effective learning.

Classroom testing may be used in the form of quizzes, midterm and final examinations, and as general guideposts for teaching. Tests may be given at various times during the school year in order to stimulate learning through review and to determine status and progress in learning. As Mehrens and Lehmann (1973, p. 167) state: "Teacher-made tests are frequently the major basis for evaluating the students' progress in school."

Teacher-made tests, by definition, are constructed by a classroom teacher for use in his particular classes under conditions of his choosing. Classroom achievement tests, if properly prepared, are tailor-made for a certain class of pupils taught in a certain manner. They can be related to the kind of educational objectives involved in the class and also to the relative emphasis each received. They also can be shaped to fit the particular characteristics of the pupils who compose the class. The teacher is in the best position to know the exact nature of the characteristics of his class. Further, since the appropriateness of any achievement test item is determined by the educational objectives of the class, the teacher is in an excellent position to build test items that measure suitable areas of the pupils' achievements and that are of suitable difficulty.
Commercially prepared achievement tests may be used to obtain some of the information needed by teachers, and they may even be used to motivate students. But it is unusual for standardized tests to be administered more than once a year. Teacher-made tests are better in the sense that they are more relevant to a teacher's particular objectives and to his class. As Mehrens and Lehmann (1973, p. 202) say, "Classroom tests, despite some of their limitations, will never be replaced because they (a) tend to be more relevant, (b) can be tailored to fit a teacher's particular instructional objectives, and (c) can be adapted better to fit the needs and abilities of the students than can commercially published tests." Also Ebel (1962, p. 24) states:

    Even if good ready-made tests were generally available, a case could still be made for teacher-prepared tests; the chief reason being that the process of test development can help the teacher define his objectives. This process can result in tests that are more highly relevant than any external tests are likely to be. It can make the process of measuring educational achievement an integral part of the whole process of instruction as it should be.

Schwartz and Tiedeman (1957, p. 110) say that teacher-made tests are valuable because they

1. Assess the extent and degree of student progress with reference to specific classroom activities.
2. Motivate students.
3. Permit the teacher to ascertain individual strengths and weaknesses while the pupil is studying a particular subject-matter area.
4. Provide information for reporting purposes.
5. Provide immediate feedback for the teacher, who can then decide whether a particular unit or concept needs reteaching and/or whether individual pupils are in need of remedial instruction.
6. Provide for continuous evaluation.

A good classroom test must be valid, reliable, and objective.
It must possess good discriminative power, and it must be characterized by questions with a range of difficulty suitable to the abilities of all the students in the class. Bell (1972, p. 8) says: "A teacher who gives tests and uses them in the classroom without ever knowing whether and to what degree such qualities are present is flying blind."

Teacher-constructed tests should be designed to measure areas of strengths and weaknesses, and thus enable the teacher to plan for a better program of organization, preparation, or presentation, for the purpose of reteaching.

Furthermore, Thorndike and Hagen (1969, pp. 30-33) suggest that the functions of teacher-made tests are:

a) Motivation. "Tests that are well constructed and effectively used can motivate students to develop good study habits, to correct errors, and to direct their activities toward the achievement of desired goals." (p. 31)

b) Diagnosis and instruction. "The items on which an individual fails or on which many members of a class group fail can serve to identify points needing further study." (p. 31)

c) Defining teaching objectives. "We may know a teacher by the tests he makes. They tell what he is truly valuing in his pupils, even though he himself does not know it, and they influence profoundly what his students will learn." (p. 32)

d) Differentiation and certification of pupils. "The teacher inevitably has a responsibility for certifying pupils' accomplishments to higher levels of the educational enterprise and to the world outside the school." (p. 33)

The authorities seem to agree that in constructing a test, the teacher must be concerned with the same factors as are the author and publisher of a standardized achievement test: validity, reliability, and objectivity. Evaluation of pupil achievement, in their view, is one of the teacher's important responsibilities. It is important that the teacher's device be well thought out and well made.
For any type of written test, it is desirable to have a definite plan in advance of preparing the test items. The development of such a plan requires an analysis of the outcomes one is trying to achieve in the teaching of a particular course or unit and of the significant segments of content through which those objectives are to be realized. In addition, the plan should include the allocation of test items among the content areas and objectives, the types of items to be used, the total number of items in the test, and specifications for the spread of item difficulties.

In Thailand, more emphasis is placed on written examinations, for which teachers have to depend on the tests they have made individually. Evaluation at the end of the school year is based almost entirely on the teacher-made tests (Paeratakul, 1963, in Thai edition). Only the performance during the five days of the last week of the school year is the basis for determining which students will go to the next grade or repeat the same grade in the next school year. So there is much more stress on teacher-made tests in Thailand than in the United States.

The Qualifications of Teachers As Evaluators

Teachers occupy a central role in the testing and evaluation process. Teacher-made tests are frequently the major basis for evaluating the progress of students. Many teacher-made tests do not serve their function optimally because of time pressures on teachers and, more important, because test-writing is not generally a part of the professional training of teachers (Grobman, 1971).

In 1955 Victor Noll reported a study of requirements in measurement by state departments of education for teaching certificates and by 80 institutions preparing teachers in all parts of the country. The sampling included large publicly supported institutions, large privately supported institutions, teachers colleges, and liberal arts colleges. A similar survey was reported in 1956 by Allen.
The results in her study were based on questionnaire returns from 288 higher institutions with programs for the preparation of teachers, an 88 percent return. The findings of the two surveys are in substantial agreement. They showed that an introductory course in measurement was not generally required by state departments of education for a teaching certificate. Most institutions preparing teachers, they found, offer an introductory course in educational measurement but comparatively few require it for a recommendation for a teaching certificate.

Another study done under the Russell Sage Foundation in 1967 by Goslin states that

    ...with respect to formal training in test and measurement techniques (courses taken in college, attendance at clinics on standardized testing, and the like), the data indicate that less than 40 percent of all teachers have had more than minimal exposure (one course) to such training, and that a sizable proportion of teachers have never had a course in measurement techniques or attended a clinic at which testing was discussed. Data also indicate that teacher involvement in testing is both high and of potentially great influence on the educational system and the children within it. If we accept testing and the use of test results as a part of the teacher role, efforts need to be made to make sure that this part of their role is not carried out by chance. (1967, p. 127)

A more recent study was conducted by Roeder (1972), who surveyed the qualifications or skills of the teacher as an evaluator. The 940 elementary teacher-training institutions, located in every state and the District of Columbia, were mailed a one-page questionnaire. A total of 916 responses (97.4 percent) were received.
The data indicate that 57.7 percent of the institutions which were surveyed, or 496 institutions, did not require their prospective elementary teachers to complete a course in evaluation; 12.1 percent (104) required nothing more than a one or two semester hour course; 17.8 percent (158) required a three semester hour course; and only 1.4 percent (12) required four or more semester hours of course work in evaluation. Roeder concludes that the majority of teachers receive only a minimal exposure to the complex world of evaluation. That is, most of today's elementary teachers are not prepared to make or use tests.

Grobman (1967, 1971) sees testing as an integral part of teaching. She found that many teachers testified that the improvement of their test building skills has resulted in the improvement of their teaching, since it has made them more sensitive to their teaching aims and the implementation of these. She claims that few teachers have had an opportunity to acquire a sufficient background in evaluation and testing to enable them to measure adequately the skills and understanding they wish to encourage in their students.

Ebel (1961, p. 68) states that a teacher who is competent in educational measurement should:

(1) Know the educational uses, as well as the limitations, of educational tests.
(2) Know the criteria by which the quality of a test should be judged and how to secure evidence relating to these criteria.
(3) Know how to plan a test and write the test questions to be included in it.
(4) Know how to select a standardized test that will be effective in a particular situation.
(5) Know how to administer a test properly, efficiently, and fairly.
(6) Know how to interpret test scores correctly and fully, but with recognition of their limitations.
Further, in the 62nd Yearbook of the National Society for the Study of Education (1963), Hagen and Lindberg express a belief that classroom teachers: (1) must have a basic understanding of the evaluative process, including an understanding of the basic concepts of reliability, standard error of measurement, and validity as essential; (2) should know the general types of tests available and be aware of their uses, strengths, and limitations; and (3) should be able to combine available test data with other records and to interpret them as they relate to individual children in their classes.

Chaval Paeratakul (1963), the first person from Thailand who graduated in Tests and Measurements in the United States, along with other professionals in this area, agreed that teachers in Thailand urgently needed training in tests and measurements. Other evidence to support this idea was the change in the curricula of teacher training education programs in 1962 and the nationwide in-service program from 1965 to 1966. Teachers in Thailand have just begun to know what objective tests are all about. They definitely need more help in getting acquainted with these subject matters.

In-Service Programs in Testing and Evaluation

From the studies reviewed in the last section, it is clear that the problem of competence of teachers in educational measurement and evaluation is real and substantial. They lead to the conclusion that many teachers do not seem to be adequately prepared in this respect and apparently they recognize their inadequacy. It is of the utmost importance to educational progress that the competence of teachers to measure educational attainments be improved.
Ebel (1961) says that far more harm, perhaps ten times as much harm, is currently being done to student learning as a result of the shortcomings of the classroom tests by which a student's educational efforts are largely stimulated, directed, and evaluated, than is being done by all the faults of external testing programs. He suggests in-service programs as a means for improving the competence of teachers in measurement on the basis of experience with a variety of in-service programs intended to help teachers solve measurement problems. These programs, variously referred to as conferences, seminars, institutes, or workshops, have ranged from an afternoon lecture to a three-day preschool program with several follow-up meetings later in the year. Some of these programs were sponsored by a single school system and involved the teachers of that school in all subject areas and at all levels. Others reflected the interest of a single professional group, such as engineers or nurses.

Goslin (1967, p. 129) suggests the publication of a short booklet designed explicitly to deal with the problems faced by teachers in their role as testers, as a supplement to "more formal training in testing" for prospective teachers.

Hagen and Lindberg (1963) claim that each school system must take responsibility for providing orientation and in-service training in testing because it is unrealistic to assume that teachers can learn all that they need to know about testing during their pre-service courses.

There was a panel discussion about the improvement of the competence of classroom teachers in measurement and evaluation held by the National Council on Measurements Used in Education at the 1959 annual meeting. The discussion centered on the questions of why in-service training is a problem, and what can be done to improve in-service training at the local level. As an example one local in-service program was presented in specific steps:

1) Preparation of local bulletins.
A series of bulletins, supplementing the published materials concerning the tests in use in the county, have been written tying in the testing program with the local program of instruction.
2) Use of school test coordinators. These test coordinators met regularly, especially before and after a scheduled testing program, to discuss problems involved in administering, scoring, and interpreting the test results.
3) Extension courses.
4) Faculty workshops. A considerable number of faculty workshops, varying in length from one to four or five sessions, have been held.
5) Development of community interest in the measurement program, through a judicious use of local newspaper publicity, talks to parent-teacher associations.
6) Use of local norms.
7) Provision of adequate physical facilities.
8) Use of demonstrations, lectures, etc., in the district's In-Service Training Center.
9) TV Workshop on testing. (Durost, pp. 32-33)

Rather than the in-service program, McSwain (1959) focused on the role of the college faculty in the teacher education program. He said that during the period of student teaching each student-teacher should learn, from teacher and faculty advisor, about these areas: (1) a functional concept of evaluation and measurement in creative teaching and purposeful self-education; (2) techniques used in evaluating the behavioral responses of pupils; and (3) criteria to be applied in selecting and using educational measures to improve teacher diagnosis and effective procedures in preparing and interpreting cumulative child-accounting records. And, he added, the degree of competence in applied measurement developed by each student during this professional internship depends basically upon the leadership of the faculty counselor.

In the same panel discussion, Charles Stevenson (1959) talked about measurement in the teaching-learning situation.
He stated that the in-service program should cover: (1) common understanding of what is to be measured among teachers, supervisors, coordinators, and superintendents; (2) the type of evaluative instruments which will be adequate in a particular situation; (3) clarification of how the results of measurement can be used to improve the teaching-learning situation; and (4) methods and procedures to be employed in improving these techniques of measurement.

The last presentation in this panel, by Margaret Stevenson, was a discussion of the role of the classroom teacher in school testing programs. She suggested that after the school system has formulated its philosophy on testing; after representatives of all levels of school personnel have cooperated in developing the program; after the testing program has been given a definite place in the school curriculum, then the major hurdles have been passed, and the school system is ready to address itself to the more direct problem of improving classroom teacher competence in measurement and evaluation. That is, improving classroom teacher competence in measurement and evaluation must not be treated as an isolated question. It must be viewed in context, against the background of the numerous problems involved in establishing an adequate, democratically structured testing program in the school curriculum.

The importance of in-service education for all educational personnel is recognized throughout the literature of the teaching profession (Ebel, Ed., 1969, p. 645). The rapid expansion of knowledge, which has been reported extensively over the past several years, and its effect on changing methods and in developing technology utilized in the classroom are major factors in making programs of in-service education necessary.

A statewide research study in Tennessee about in-service education was conducted by Brimm and Tollett (1974).
The conclusions of the study can be summarized as follows: (1) The primary purpose of in-service education is to upgrade the teacher's classroom performance. (2) If continuing professional growth is to be taken seriously, administrators and teachers must pool their knowledge and resources and seek to make in-service programs more responsive to the needs and interests of practicing classroom teachers.

The two chief weaknesses of in-service programs as Ebel (1961) sees them are: (1) they are too brief. While an hour or two a year spent in considering measurement problems under the guidance of a specialist is far better than nothing at all, it is unreasonable to suppose that satisfying, enduring progress in solving the manifold problems of educational measurement, or in developing the requisite knowledge, understanding, and skills, can be made in so short a time; and (2) they involve too much talking and too little doing. For the cultivation of a practical art like educational measurement, sound pedagogy requires a mingling of theory and practice.

Ebel presents the ideal program of in-service training for improving the competence of teachers in measurement as follows:

    Suppose that a school administrator and his staff have decided to focus attention for a year on the improvement of classroom testing. Suppose they engage a specialist in educational testing to meet with them five times during the year, at intervals of six weeks or so, for a day or two. Participation in the initial program might well be limited to five, six, or seven groups of four to six teachers each. The goal of each group would be to make, to use and to analyze a quality test in a subject which all members of the particular group were teaching. Examples of subject areas in which these tests might be developed are: fourth grade mathematics; sixth grade geography; eighth grade English; or high school history, chemistry, or economics.
    The first meeting of each participating group would be devoted to a description of the entire project, with special consideration of the first step - the preparation of specifications for the test to be developed. Sample specifications would be presented for study and analysis. Between the first and second sessions each teacher group would work out the specifications for its test. These could be reviewed at the second meeting, and work on item writing would be launched. The third meeting could be devoted to item review and test assembly, the fourth to test administration and analysis, and the fifth to a review of the test developed and of the entire project as a learning experience. (Ebel, 1961, pp. 70-71)

A program like this, Ebel says, would produce not only a handful of excellent tests but also a sizable group of teachers whose competence in measurement was vastly improved and highly respectable.

One of the most effective in-service programs in educational measurement in Thailand was carried out by Chaval Paeratakul in 1965-66. The purpose of the program was to improve the quality of the teacher-made tests. It was a five-day workshop. The material covered included: purposes of measurement; curriculum analysis as a blueprint for test construction; types of test items; item analysis; scores and norms; and reporting the test result to parents and authorities. The morning sessions were lectures by authorities. The afternoon sessions were practicums. Teachers were assigned to one of the small groups working on blueprints, writing test items, performing item analysis, and reporting standard scores. The training had emphasized test construction and writing test items according to Bloom's Taxonomy.

Summary

In most of the literature, educational measurement and evaluation are viewed as essential and constructive (Ebel, 1972; Schultz, 1971; Sharp, 1966; Jessel, 1972; and Mehrens, 1967).
Occasionally some suggest that education could go on perfectly well, perhaps much better than it has in the past, if tests and testing were abolished. This may imply that the tests these critics have seen or have used themselves are not good. No doubt some bad tests have actually been worse than no tests at all.

Test scores are powerful educational tools, but they are only tools. They will not give direct, complete answers to the practical educational questions. They will not point unequivocally to a specific course of action in a given set of circumstances. They may indicate that something needs to be done. They may provide data that will help in deciding what to do. But they will not make the decision. Several writers (Mehrens, 1967; Trump, 1967; Holt, 1971; McMorris et al., 1972; and Dyer, 1962) suggest that the limitations and underlying principles of tests must be recognized if one wants to use tests as good measuring tools in education.

The qualifications of teachers as evaluators are widely reported as less than satisfactory for most teachers. Studies on competence of teachers in measurement and evaluation conclude that most teachers do not have adequate knowledge in this area (Goodman, 1971; Noll, 1955; Allen, 1956; Goslin, 1967; and Roeder, 1972). These studies also suggest that more publications and service programs need to be operated to improve the situation.

The authorities conclude that in-service or workshop programs are needed to improve the competence of teachers in measurement. The purpose of the program is primarily to help teachers solve measurement problems. Much time, effort, and money needs to be spent to initiate or improve programs of services and assistance to teachers. The justification is that this is an important method of raising the standards of education.
These help programs may be as little as a short booklet (Goslin, 1967), or they may be conferences, seminars, or workshops which range from one afternoon to a five-session workshop (Huddleston, 1959; Ebel, 1961). The contents to be covered are the basic knowledge in educational measurement and evaluation that teachers should know, such as the limitations of educational tests, the properties of tests, ways of planning and constructing tests to use for one's own purposes, test administering, and how to interpret test results.

Teacher-made tests in Thailand are much more important to effective teaching and to effective learning than in the United States since standardized tests are not available yet. Also, more stress is put on the final examination than in the United States. The teacher-made final examination carries at least seventy-five percent of the weight in determining "pass" or "fail" for the students. The qualifications of Thai teachers as evaluators are much worse than in the United States. Also, the need for improving the competence of teachers in Thailand is very much greater than in the United States.

CHAPTER III

PROCEDURE

This study can be classified as a comparative study. Questionnaires, sent through the mail or delivered personally, were used to collect the data. This chapter provides a description of the population, sampling procedure, instrument, data collection, and plan for data analysis. Findings in the study are presented in Chapter IV and conclusions are given in Chapter V.

Population

Geographically, Khon Kaen province is in the northeastern part of Thailand, about 200 miles from Bangkok by the "Friendship" (Mitraparp) highway. In area, the province is 14,076 square kilometers. The province is divided into thirteen amphurs and these thirteen amphurs have 116 tambons (subdivisions of amphur). The population is 1,136,939 people, the majority of whom are farmers. Educationally, there are 1047 schools, one technical institute, and one university.
Of the 1047 schools, thirty-five offer a high school program either alone or in combination with the elementary and secondary levels in the same school. Forty-three of the 1047 schools are schools for adults, which have both elementary and secondary levels.

The population of interest is teachers in these 1047 schools. There are 5,913 teachers and 198,246 students. The average ratio of teacher to student is 1:33. The breakdown of teachers into eight groups is presented in Table 3.1. The largest group (1534) is composed of teachers who are not in the central amphur, who have been teaching more than eight years and who hold the teacher certificate or lower. It is estimated that more than ninety percent of these teachers are in elementary schools or teach in elementary grades. Most of the big schools (more than 1000 students) are in the university area, so the number of teachers in the central amphur is the largest among the thirteen amphurs. The smallest group (210) is composed of teachers who are in the university area, who hold at least the higher teacher certificate, and who have been teaching more than eight years.

Of the 5,913 teachers, sixteen hold master's degrees and 349 have at least the baccalaureate degree. Six percent of all teachers (365) hold at least baccalaureate degrees and most of them are teaching in the university area. Eighty-nine percent of the teachers (5249) teach at the elementary level (grades 1 to 7) and seventy-one percent (4212) in the lower elementary level (grades 1 to 4). Of the total, 844 teachers are in fifty-eight private schools, mostly in the university area. All others are in public schools, either in provincial municipal sections or under the Departments of Elementary and Secondary Education.

Sampling Procedure

A list of all the teachers in Khon Kaen province was provided by the Ministry of Education.
Along with the names, the location of schools, ages of teachers, years of teaching experience, and levels of teacher education were given. The teachers were classified into eight groups, according to location of schools, [...]

[Table 3.1, the breakdown of teachers into eight groups by location, level of teacher education, and experience, is not legible here.]

[...]

Sources     DF      MS       F        P
[...]
ME           3     3.252    2.991    <.05
MLT          3     1.679    1.545    >.05
MLE          3     0.704    0.648    >.05
MTE          3     4.545    4.181    <.01
MLTE         3     4.069    3.743    <.05
RM:LTE     936     1.087

F(3, 1000) at .05 = 2.61
F(3, 1000) at .01 = 3.8

[...] p < .05);
- the subject matter by experience interaction (F = 2.991, p < .05);
- the subject matter by teacher education by experience interaction (F = 4.181, p < .01);
- and, the subject matter by location by teacher education by experience interaction (F = 3.743, p < .05).

The other nine hypotheses were not rejected, at a 95 percent level of confidence.

The significance of the four-way interaction supported the difference of the profiles among the eight groups of teachers, discussed earlier (under table of means and standard deviations on the actual needs). A follow-up analysis was performed using the same computer program, but the variables were transformed to pairwise comparisons between Test Planning and Reporting Test Scores, Item Editing and Reporting Test Scores, and Item Analysis and Reporting Test Scores. The three-way interaction among location, teacher education, and experience was computed for each transformed variable. The univariate F ratio was significant only for the pairwise comparison of Test Planning and Reporting Test Scores (F = 11.0774 with 1 and 312 degrees of freedom, p < .001, see Appendix B). The other two comparisons were not significant.
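The univariate F ratios in the within-subject part of the analysis of actual needs can be re-derived by dividing each mean square by the error mean square (RM:LTE) and comparing the result with the critical values quoted for 3 and 1000 degrees of freedom. The Python sketch below is offered only as an arithmetic check under that assumption. The function and variable names are hypothetical, and only the rows that are legible in the table are used.

```python
# Arithmetic check of the within-subject F ratios for actual needs.
# Assumption: each F is the effect mean square divided by the error
# mean square (RM:LTE). All names here are illustrative only.

MS_ERROR = 1.087  # RM:LTE, 936 degrees of freedom

# Mean squares for the rows that are legible in the table.
MEAN_SQUARES = {
    "ME": 3.252,
    "MLT": 1.679,
    "MLE": 0.704,
    "MTE": 4.545,
    "MLTE": 4.069,
}

# Critical values of F with 3 and 1000 degrees of freedom,
# as quoted beneath the table.
F_CRIT_05 = 2.61
F_CRIT_01 = 3.8


def f_ratio(ms_effect, ms_error=MS_ERROR):
    """Return the F ratio for one effect."""
    return ms_effect / ms_error


def significance(f_value):
    """Classify an F ratio against the two quoted critical values."""
    if f_value >= F_CRIT_01:
        return "<.01"
    if f_value >= F_CRIT_05:
        return "<.05"
    return ">.05"


RESULTS = {
    source: (f_ratio(ms), significance(f_ratio(ms)))
    for source, ms in MEAN_SQUARES.items()
}

if __name__ == "__main__":
    for source, (f_value, label) in RESULTS.items():
        print(f"{source:5s} F = {f_value:.3f}   p {label}")
```

Because the reported mean squares are rounded, a recomputed ratio can differ from the tabled F in the third decimal place.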
On examining all eight profiles, the most contrasting profiles were from the less experienced higher teacher certificate holders within the central amphur (university area) and the more experienced higher teacher certificate holders within the central amphur (university area). The first group had the lowest mean on Test Planning but the highest mean on Reporting Test Scores. On the contrary, the latter group tied for the highest mean on Test Planning but was the lowest on Reporting Test Scores. These reversed profiles suggested that the interaction was not ordinal.

Another follow-up was done to test the three-way interaction among location, teacher education, and experience on the Test Planning and the Reporting Test Scores variables as well as the other two variables. The multivariate F ratio was significant at α = .05 (F = 3.618 with 4 and 309 degrees of freedom, p < .0068, see Appendix D) and the univariate F ratio indicated that only the three-way interaction on the Reporting Test Scores variable was significant at α = .0049. Although there was no significant difference in scores for Test Planning, Item Editing, and Item Analysis among the eight groups of teachers, there were some differences among them on the Reporting Test Scores variable, which seems to account for the four-way interaction results.

Since the four-way interaction was significant, a general profile analysis was not made and the following discussion is made on the condition of the presence of these interactions.

The interaction among the subject matter, level of teacher education, and experience was significant at α = .05 (F = 4.181, p < .01). It was concluded that the profiles of the four groups of teachers, regardless of the location of their schools, were different, as can be seen in Figure 1.
The differences on each of the four subscales of the actual needs between the less experienced and more experienced teachers, across teacher certificate holders and higher teacher certificate holders, were quite substantial. On the Test Planning, Item Editing, and Item Analysis variables, the less experienced groups were higher than those with more experience within the groups of teacher certificate holders. This relationship was reversed within the groups of higher teacher certificate holders. The pattern of differences was not reversed for the Reporting Test Scores variable. The differences, rather, were a matter of size. The difference between less experienced and more experienced teachers on the Reporting Test Scores variable was .035 for the teacher certificate holders and .650 for the higher teacher certificate holders. A difference in size, together with a difference in direction, was also found on the Item Analysis variable (i.e., .115 within the teacher certificate holders and -.365 within the higher teacher certificate holders). Again the analysis of individual variables revealed that the interaction between teacher education and experience was significant only on the Reporting Test Scores variable at α = .0125 (F = 6.895 with 1 and 312 degrees of freedom, p < .0091).

Figure 2 presents the profiles of less experienced teachers and more experienced teachers to explain the interaction between subject matter and experience. The differences between the less experienced teachers and the more experienced teachers on the four subscales were -0.025, 0.055, -0.125, and 0.342 for Test Planning, Item Editing, Item Analysis, and Reporting Test Scores respectively.

The significance of the interaction between the subject matter and location of school is illustrated by the graph in Figure 3.
The differences between teachers in the central amphur and those outside the central amphur on Test Planning, Item Editing, Item Analysis, and Reporting Test Scores were -0.025, -0.220, 0.093, and 0.068 respectively. The interaction was disordinal with respect to location of schools.

FIGURE 1
GRAPH PRESENTATION OF CELL MEANS FOR SUBJECT MATTER, TEACHER EDUCATION, AND EXPERIENCE (ACTUAL NEEDS)

[Graph: mean score (vertical axis, 1.00 to 2.50) by subscale (PL, IE, IA, RE), with one line for each of the four groups tabulated below.]

                      Test        Item       Item       Reporting
                      Planning    Editing    Analysis   Test Scores
                      (PL)        (IE)       (IA)       (RE)
Cert.        New      2.175       2.265      1.605      1.225
Cert.        Old      2.075       2.105      1.490      1.190
High Cert.   New      2.150       2.455      1.690      1.665
High Cert.   Old      2.300       2.505      2.055      1.015

FIGURE 2
GRAPH PRESENTATION OF CELL MEANS FOR SUBJECT MATTER AND TEACHING EXPERIENCE (ACTUAL NEEDS)

[Graph: mean score (vertical axis, 1.00 to 2.50) by subscale (PL, IE, IA, RE), with one line for new teachers and one for old teachers.]

         PL       IE       IA       RE
New      2.163    2.360    1.648    1.445
Old      2.188    2.305    1.773    1.103

FIGURE 3
GRAPH PRESENTATION OF CELL MEANS FOR SUBJECT MATTER AND LOCATION OF SCHOOLS (ACTUAL NEEDS)

[Graph: mean score (vertical axis, 1.00 to 2.50) by subscale (PL, IE, IA, RE), with one line for the university area and one for the non-university area.]

                               PL       IE       IA       RE
University area                2.163    2.223    1.848    1.308
Not in the university area     2.188    2.443    1.573    1.240

FIGURE 4
GRAPH PRESENTATION OF CELL MEANS FOR LOCATIONS OF SCHOOLS AND TEACHING EXPERIENCE (ACTUAL NEEDS)

[Graph: mean total score (vertical axis, 7.00 to 8.00) for new and old teachers, with one line for the university area and one for the non-university area.]

                        New      Old
University area         7.375    7.688
Non-university area     7.838    7.025

PL = Test Planning; IE = Item Editing; IA = Item Analysis; RE = Reporting Test Scores
New = Teachers who have been teaching for less than 8 years
Old = Teachers who have been teaching 8 years or more
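The cell means tabulated with Figures 1 and 2 are linked by simple averaging: with equal cell sizes, each profile in Figure 2 is the unweighted mean of the two corresponding rows of Figure 1. The Python sketch below, with hypothetical names, reproduces the experience differences quoted in the text and flags the subscales on which the direction of the difference reverses between the two levels of teacher education. The equal-cell-size assumption rests on the equal sample sizes reported for the eight strata.

```python
# Sketch: relate the Figure 1 cell means to the Figure 2 profiles.
# Assumption: cell sizes are equal, so marginal means are unweighted
# averages of the cell means. All names are illustrative only.

SUBSCALES = ("PL", "IE", "IA", "RE")

# (teacher education, experience) -> means on PL, IE, IA, RE (Figure 1).
CELL_MEANS = {
    ("cert", "new"): (2.175, 2.265, 1.605, 1.225),
    ("cert", "old"): (2.075, 2.105, 1.490, 1.190),
    ("high", "new"): (2.150, 2.455, 1.690, 1.665),
    ("high", "old"): (2.300, 2.505, 2.055, 1.015),
}


def experience_profile(experience):
    """Average the cell means over teacher education for one experience level."""
    rows = [means for (_, exp), means in CELL_MEANS.items() if exp == experience]
    return tuple(sum(column) / len(rows) for column in zip(*rows))


def experience_difference(education):
    """New minus old on each subscale within one level of teacher education."""
    new = CELL_MEANS[(education, "new")]
    old = CELL_MEANS[(education, "old")]
    return dict(zip(SUBSCALES, (n - o for n, o in zip(new, old))))


NEW_PROFILE = experience_profile("new")
OLD_PROFILE = experience_profile("old")
MARGINAL_DIFF = dict(
    zip(SUBSCALES, (n - o for n, o in zip(NEW_PROFILE, OLD_PROFILE)))
)

CERT_DIFF = experience_difference("cert")
HIGH_DIFF = experience_difference("high")

# A subscale "reverses" when the two differences have opposite signs.
REVERSED = [s for s in SUBSCALES if CERT_DIFF[s] * HIGH_DIFF[s] < 0]

if __name__ == "__main__":
    for s in SUBSCALES:
        print(f"{s}: new - old = {MARGINAL_DIFF[s]:+.3f}")
    print("Direction reverses on:", ", ".join(REVERSED))
```

Under these assumptions the reversal appears on Test Planning, Item Editing, and Item Analysis but not on Reporting Test Scores, which agrees with the description of the profiles given in the text.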
On the between-subject part, a significant interaction between location of school and experience was observed. This interaction was explained by the reversed differences in the total actual needs (Fig. 4). The difference between less experienced and more experienced teachers within the central amphur was -0.313 (means of 7.375 and 7.688) while the same difference for teachers outside the central amphur was 0.813 (means of 7.838 and 7.025).

The significance of the teacher education main effect was also observed (F = 8.855, p < .003). The mean for teacher certificate holders was smaller than the mean for higher teacher certificate holders (7.056 and 7.906 respectively).

Repeated Measure Analysis on Perceived Needs

The results of the data analysis on perceived needs are presented in the same pattern as in the previous section. First the summary data are discussed, then the tests of hypotheses are presented, followed by the graphic presentation of any interactions that were significant.

All summary data, means and standard deviations, both of subscale scores and total perceived needs, are presented in Table 4.7. The general profile illustrates all group means on Test Planning (20.816), Item Editing (20.906), Item Analysis (22.606), and Reporting Test Scores (20.897). The mean was highest on Item Analysis, while the other three means were about the same.

TABLE 4.7
MEANS AND STANDARD DEVIATIONS OF FOUR PERCEIVED NEEDS SCORES

Location     Teacher        Experience            PL        IE        IA        RE        Total
             Education
University   Cert.          New          Mean     20.90     20.53     23.05     21.08     85.550
                                         S.D.      5.07      5.43      4.27      5.00     15.671
University   Cert.          Old          Mean     21.93     21.40     23.68     21.73     88.725
                                         S.D.     [illegible]
University   Higher Cert.   New          Mean     20.23     19.73     22.78     21.10     83.825
                                         S.D.     [illegible]
University   Higher Cert.   Old          Mean     21.60     21.20     23.03     21.28     87.100
                                         S.D.      3.59      4.18      4.22      5.82     14.604
Outside      Cert.          New          Mean     [illegible]
                                         S.D.      5.23      4.73      4.33      4.49     15.100
Outside      Cert.          Old          Mean     19.83     19.40     20.93     18.95     79.100
                                         S.D.     [illegible]
Outside      Higher Cert.   New          Mean     21.30     21.68     23.18     21.45     87.600
                                         S.D.     [illegible]
Outside      Higher Cert.   Old          Mean     20.10     21.93     22.25     19.95     84.225
                                         S.D.      4.484     4.22      4.96      5.28     14.591
All Groups                               Mean     20.810    20.906    22.606    20.897
                                         S.D.      4.793     4.563     4.688     5.693

n = 40 in each group.
PL = Test Planning; IE = Item Editing; IA = Item Analysis; RE = Reporting Test Scores
New = Teachers who have been teaching less than eight years
Old = Teachers who have been teaching eight years or more

Within the central amphur (university area), the four groups had almost the same profile. There were small decreases from Test Planning to Item Editing and increases from Item Editing to Item Analysis. The means for Reporting Test Scores dropped to about the same as Test Planning. On the other hand, for those teachers outside the central amphur (outside university area), the means for three groups (all except experienced teacher certificate holders) increased from Test Planning to Item Editing and increased again to Item Analysis. Finally, for the Reporting Test Scores variable, the means decreased to about the same as the first two subscales. For experienced teacher certificate holders outside the central amphur the profile was the same as for those groups within the central amphur (university area).

Contradictory information between the total actual needs and the total perceived needs was found for the more experienced teacher certificate holders outside the central amphur (outside university area). These teachers had the lowest mean perceived needs of all eight groups, but they also had the lowest actual needs score, which indicates that they had the highest need. The group that reported the highest perceived needs was the more experienced teacher certificate holders in the central amphur (mean of 88.725).

The tests of fifteen hypotheses concerning the perceived needs were performed by the multivariate repeated measure analysis.
The between-subject part, i.e., hypotheses about location, teacher education, experience, and interactions among them, was tested by the univariate F ratio while the within-subject part, i.e., hypotheses about the subject matter and interactions with it, was tested by the multivariate F ratio.

TABLE 4.8
UNIVARIATE AND MULTIVARIATE REPEATED MEASURES ANALYSIS ON PERCEIVED NEEDS

Sources                   DF         Univariate                    Multivariate
                                     MS         F        P         F         P
Location (L)              1           92.450    1.468    .227
Teacher Education (T)     1           17.112    0.272    .603
Experience (E)            1           15.313    0.243    .622
LT                        1          135.200    2.146    .144
LE                        1          336.200    5.337    .022
TE                        1           13.613    0.216    .642
LTE                       1           12.013    0.191    .663
R:LTE                     312         62.997
Subject Matter (M)        3, 310                                   24.349    .0001
ML                        3, 310                                    3.351    .0194
MT                        3, 310                                    0.458    .7120
ME                        3, 310                                    1.115    .3433
MLT                       3, 310                                    0.763    .5155
MLE                       3, 310                                    0.414    .7434
MTE                       3, 310                                    1.082    .3571
MLTE                      3, 310                                    0.552    .6474

FIGURE 5
GRAPH PRESENTATION OF CELL MEANS FOR LOCATION OF SCHOOLS AND SUBJECT MATTER (PERCEIVED NEEDS)

[Line graph: mean perceived needs (vertical axis, 20.00 to 23.00) plotted across PL, IE, IA, and RE, with one line for the university area and one for the outside university area.]

                           PL        IE        IA        RE
University area            21.165    20.715    23.135    21.298
Outside university area    20.470    21.103    22.085    20.500

PL = Test Planning; IE = Item Editing; IA = Item Analysis; RE = Reporting Test Scores

FIGURE 6
GRAPH PRESENTATION OF CELL MEANS FOR LOCATION OF SCHOOLS AND EXPERIENCE OF TEACHERS (PERCEIVED NEEDS)

[Line graph: mean total perceived needs (vertical axis, 81.00 to 87.00) plotted for New and Old teachers, with one line for the university area and one for the outside university area.]

                           New       Old
University area            84.688    87.913
Outside university area    86.638    81.663

All univariate F and multivariate F ratios are presented in Table 4.8.
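As a check on the univariate column of Table 4.8, each between-subject F ratio should equal the effect mean square divided by the error mean square (R:LTE, 62.997 on 312 degrees of freedom). The short Python sketch below recomputes the ratios from the tabled mean squares. The names `MS_ERROR`, `mean_squares`, and `f_ratio` are illustrative assumptions and are not part of the original analysis.

```python
# Mean squares for the between-subject effects, copied from Table 4.8.
MS_ERROR = 62.997  # R:LTE, 312 degrees of freedom

mean_squares = {
    "L": 92.450,    # location
    "T": 17.112,    # teacher education
    "E": 15.313,    # experience
    "LT": 135.200,
    "LE": 336.200,
    "TE": 13.613,
    "LTE": 12.013,
}


def f_ratio(ms_effect, ms_error=MS_ERROR):
    """Univariate F ratio: effect mean square over error mean square."""
    return ms_effect / ms_error


# Round to three decimals, the precision used in the table.
f_ratios = {name: round(f_ratio(ms), 3) for name, ms in mean_squares.items()}

for name, value in f_ratios.items():
    print(f"{name:>3}: F = {value:.3f}")
```

Only the location by experience (LE) ratio is large enough to reach the .05 level with 1 and 312 degrees of freedom, which agrees with the P column of the table.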
A rejection of the null hypotheses was indicated, at α = .05, on three of the fifteen hypotheses: the interaction between location of schools and experience (F = 5.337, with 1 and 312 degrees of freedom, p < .022); the subject matter main effect (multivariate F = 24.349, with 3 and 310 degrees of freedom, p < .0001); and the subject matter by location of school interaction (multivariate F = 3.351, with 3 and 310 degrees of freedom, p < .0194).

Since the interaction between location and subject matter was significant, the general profile was not interpreted. One profile was made for all teachers in the central amphur (university area) and another for those outside the central amphur (outside university area). Figure 5 presents these two profiles. For teachers in the central amphur (university area), the means vary from 21.165 for Test Planning to 20.715 for Item Editing to 23.135 for Item Analysis and finally to 21.298 for Reporting Test Scores. For teachers outside the university area the mean scores were 20.470, 21.103, 22.085, and 20.500. The largest difference between the groups within and those outside the university area was found on the Item Analysis subscale, with a value of 1.05.

The interaction between location of schools and experience was explained by comparing, on the total perceived needs score, the difference between less experienced and more experienced teachers within the central amphur with the same difference for teachers outside the central amphur. The mean for less experienced teachers in the university area was 84.688 and for more experienced teachers 87.913, a difference of -3.225. For teachers outside the university area the mean for the less experienced group was 86.638 and for the more experienced group 81.663, a difference of 4.975. The type of interaction was disordinal (Fig. 6).
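The reversal described above can be verified from the four cell means shown with Figure 6. The sketch below is plain arithmetic on the reported means, not a significance test; the names `cell_means`, `experience_gap`, and `is_disordinal` are hypothetical, and treating a sign change in the simple differences as the mark of a disordinal interaction is a simplifying assumption.

```python
# Total perceived needs cell means (location x experience), from Figure 6.
cell_means = {
    ("university", "new"): 84.688,
    ("university", "old"): 87.913,
    ("outside", "new"): 86.638,
    ("outside", "old"): 81.663,
}


def experience_gap(location):
    """Mean of less experienced minus mean of more experienced teachers."""
    gap = cell_means[(location, "new")] - cell_means[(location, "old")]
    return round(gap, 3)


def is_disordinal(gap_a, gap_b):
    """True when the two simple differences have opposite signs."""
    return gap_a * gap_b < 0


gap_university = experience_gap("university")  # -3.225
gap_outside = experience_gap("outside")        # 4.975

# Interaction contrast: the difference between the two differences.
interaction_contrast = round(gap_university - gap_outside, 3)

print(gap_university, gap_outside, interaction_contrast)
print("disordinal:", is_disordinal(gap_university, gap_outside))
```

The two gaps have opposite signs, so the lines for the two locations cross between the New and Old groups.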
Additional Analysis

Besides testing the hypotheses of interest, further analyses were done to compare actual and perceived needs among various groups of teachers, defined by the following independent variables: having attended the in-service program; sex; favoring a national testing program; and grade taught. The analyses were done individually on both total scores and subscale scores. Means of the groups, F ratios testing the difference between groups, and the probabilities of the F ratios are presented in Tables 4.9 to 4.12.

In Table 4.9, the comparison between teachers who had attended the in-service training program and those who had not attended was made. For total perceived needs, teachers who attended the in-service training had perceived themselves as having a greater need than those who had not attended the training program, but the amount of difference was not large enough to be significant at the α = .05 level. On all four subscales within the perceived needs, no significant differences were found. In the test of actual needs, the teachers who had not attended the in-service training program had higher need (indicated by lower scores from the test) than those who had attended the training program. Again, the amount of the difference was not large enough to show any significance.
However, by examining the subscales separately, the difference in Item Editing was significant at α = .05 in favor of those who had not attended the in-service training program. The mean for teachers who had attended the program was 2.63 and for those who had not attended it was 2.27. The F ratio was 4.25 with a probability less than .0402. The other three subscales showed no significant differences.

TABLE 4.9
COMPARISON ON PERCEIVED AND ACTUAL NEEDS BETWEEN TEACHERS WHO ATTENDED AND DID NOT ATTEND THE IN-SERVICE PROGRAM

Variables           Attended    Did Not Attend    F       P less than
                    n = 51      n = 269
                    (Mean)      (Mean)
Perceived Needs
  PL                20.71       20.84             0.03    .8588
  IE                21.39       20.81             0.68    .4110
  IA                22.35       22.65             0.18    .6758
  RE                21.94       20.70             2.04    .1541
  Total             86.39       85.00             0.32    .5694
Actual Needs
  PL                 2.25        2.15             0.35    .5545
  IE                 2.63        2.27             4.25    .0402
  IA                 1.78        1.69             0.25    .6203
  RE                 1.33        1.26             0.20    .6563
  Total              8.00        7.38             2.44    .1195

PL = Test Planning; IE = Item Editing; IA = Item Analysis; RE = Reporting Test Scores

TABLE 4.10
COMPARISON ON PERCEIVED AND ACTUAL NEEDS BETWEEN TEACHERS WHO WERE FAVORABLE AND NOT FAVORABLE TO THE NATIONAL TESTING POLICY

Variables           Favored     Not Favored       F       P less than
                    n = 115     n = 205
                    (Mean)      (Mean)
Perceived Needs
  PL                21.68       20.33             5.91    .0157
  IE                21.33       20.67             1.53    .2167
  IA                23.00       22.39             1.26    .2630
  RE                21.28       20.68             0.80    .3710
  Total             87.29       84.07             3.02    .0832
Actual Needs
  PL                 2.05        2.24             2.46    .1175
  IE                 2.43        2.27             1.58    .2095
  IA                 1.68        1.72             0.09    .7600
  RE                 1.45        1.07             5.13    .0242
  Total              7.62        7.40             0.49    .4828

PL = Test Planning; IE = Item Editing; IA = Item Analysis; RE = Reporting Test Scores

TABLE 4.11
COMPARISON ON PERCEIVED AND ACTUAL NEEDS BETWEEN MALE AND FEMALE TEACHERS

Variables           Male        Female            F       P less than
                    n = 141     n = 179
                    (Mean)      (Mean)
Perceived Needs
  PL                20.30       21.22             2.88    .0908
  IE                20.81       20.98             0.11    .7362
  IA                22.63       22.59             0.71    .9331
  RE                20.90       20.89             0.00    .9916
  Total             84.65       85.68             0.33    .5647
Actual Needs
  PL                 2.14        2.20             0.25    .6172
  IE                 2.37        2.30             0.32    .5710
  IA                 1.84        1.60             3.21    .0742
  RE                 1.33        1.22             0.83    .3641
  Total              7.69        7.32             1.60    .2063

PL = Test Planning; IE = Item Editing; IA = Item Analysis; RE = Reporting Test Scores

TABLE 4.12
COMPARISON ON PERCEIVED AND ACTUAL NEEDS AMONG TEACHERS WHO TAUGHT DIFFERENT GRADE LEVELS

Variables           G1          G2         G3          G4         F       P less than
                    n = 119     n = 90     n = 103     n = 8
                    (Mean)      (Mean)     (Mean)      (Mean)
Perceived Needs
  PL                21.08       21.59      19.91       19.87      2.25    .0830
  IE                20.85       21.56      20.61       18.25      1.65    .1784
  IA                22.50       23.19      22.17       23.25      0.83    .4789
  RE                20.27       22.11      20.58       20.62      1.97    .1185
  Total             84.70       88.44      83.27       82.00      1.91    .1284
Actual Needs
  PL                 2.11        2.29       2.16        2.12      0.52    .6665
  IE                 2.13        2.21       2.57        3.37      5.54    .0011
  IA                 1.47        1.87       1.83        1.87      2.39    .0689
  RE                 1.22        1.22       1.31        2.12      1.91    .1281
  Total              6.93        7.59       7.86        9.50      4.32    .0053

PL = Test Planning; IE = Item Editing; IA = Item Analysis; RE = Reporting Test Scores
G1 = Taught grades 1-4; G2 = Taught grades 5-7; G3 = Taught grades 8-10; G4 = Taught grade 11 and higher

The comparison between teachers who favored a national testing policy and those who did not is presented in Table 4.10. In both the total perceived needs and total actual needs, there were no significant differences between the two groups. However, when the four subscales within each need were analyzed, significant differences were found in the scores for Test Planning in the perceived needs and for Reporting Test Scores in the actual needs. The perceived needs in Test Planning for teachers who favored the national testing policy were higher than for those who did not (means were 21.68 and 20.33 respectively, F = 5.91 and p < .0157). Conversely, the actual needs in Reporting Test Scores were higher for teachers who did not favor the national testing policy than for those teachers who did (means were 1.07 and 1.45 respectively, F = 5.13 and p < .0242).

A similar comparison was made between male and female teachers. No significant differences were reported in any of the subscales or in the total scores for both perceived and actual needs.
A slight difference was found on the total perceived needs with the higher scores made by female teachers. Female teachers had lower scores in the actual needs, but these were not low enough to be significant. Group means on all needs, along with their associated F ratios and probabilities, are presented in Table 4.11.

When teachers were compared according to the grades taught, a highly significant difference was observed in the Item Editing actual needs. Those who taught in grades 11 and higher had the lowest actual needs (mean of 3.37) and next to the lowest were teachers who taught in grades 8 to 10. The actual needs in Item Editing increased as the level of grade taught decreased. Teachers who taught in grades 1 to 4 had maximum needs (mean of 2.13). The F ratio was 5.54, which was significant at α = .01 (see Table 4.12). This finding agrees with the previous hypothesis which indicated that teachers who had higher teacher education had less actual need. Since most of the holders of the teacher certificate or lower were teaching in the lower levels, it is to be expected that teachers in grades 1 to 4 should show the highest actual needs.

The pattern of differences among the four groups of teachers, classified by the grade level they taught, was also reflected in the scores for the total actual needs. Means were 6.93, 7.59, 7.86, and 9.50 for those who taught in grades 1 to 4, 5 to 7, 8 to 10, and 11 and higher, respectively. The F ratio was 4.32 with probability less than .0053.

Summary

The correlation between the actual and perceived needs was very low, and homogeneity of the variance-covariance matrix was accepted for the actual needs but rejected for the perceived needs. The decision was made that the analyses should be separated between actual and perceived needs.

The first descriptive discussion was on information from the demographic items. The data obtained tended to support the conclusion that the sample represented the population.
The univariate repeated measure analysis was employed to test hypotheses concerning the actual needs. The four-way interaction was significant. The general profile was not interpreted. Summary data indicated that the more experienced teacher certificate holders outside the university area had the highest actual needs (the lowest test scores). Among the subject matter areas, Reporting Test Scores was the most needed. Item Analysis was next. The teacher education main effect was also significant. The means indicated that the teacher certificate holders needed help more than the higher teacher certificate holders.

The multivariate repeated measures analysis was employed to test hypotheses concerning the perceived needs. The subject matter by location of school interaction was significant. Interpretations of profiles were made separately for the central amphur teachers and for those outside. Both groups had the most need in Item Analysis. Teachers in the central amphur were higher than those outside the university area on their perceived needs on the Item Analysis variable. In the university area, Item Editing was the least needed but in the outside group Test Planning was the least needed. A disordinal interaction was found for location of school by experience. Means of the total perceived needs indicated that the more experienced teacher certificate holders outside the university area had the least perceived needs, which was in contradiction with the findings on actual needs.

The hypotheses testing results were as follows:

- Actual needs

1) There was no difference in actual needs between teachers in and outside of the university area.
2) Teacher certificate holders had more actual needs than the higher teacher certificate holders.
3) There was no difference in actual needs between more and less experienced teachers.
4) There were some differences in actual needs among the four subject matters; Reporting Test Scores seemed to be the most needed (lowest mean).
5) There was no interaction between location of schools and level of teacher education.
6) There was a disordinal interaction between location of schools and teaching experience. Less experienced teachers in the university area had more actual needs than less experienced teachers outside the university area, but the more experienced teachers in the university area had less actual needs than the more experienced teachers outside the university area (Fig. 4).
7) There was no interaction between teaching experience and level of teacher education.
8) There was no three-way interaction among location of schools, teaching experience, and level of teacher education.
9) There was a disordinal interaction between subject matter and location of schools. On the Test Planning and Item Editing variables, the groups in and outside the university area had about the same needs. For Item Analysis, the university area group had more need than the outside university area group, but the effect was reversed in the Reporting Test Scores variable.
10) There was no interaction between subject matter and level of teacher education.
11) There was a disordinal interaction between subject matter and teaching experience. The less experienced teachers seemed to have less need in Reporting Test Scores.
12) There was no interaction among subject matter, location of schools, and level of teacher education.
13) There was no interaction among the subject matter, location of schools, and teaching experience.
14) There was a three-way interaction among the subject matter, teaching experience, and levels of teacher education (Fig. 1).
15) There was a four-way interaction among the subject matter, location of schools, teaching experience, and level of teacher certificate. The follow-up test showed that there was no difference in scores for Test Planning, Item Editing, and Item Analysis among the eight groups of teachers, but there were some differences among them on the Reporting Test Scores variable.
- Perceived needs

1) There was no difference in perceived needs between teachers in and outside the university area.
2) There was no difference in perceived needs between the two groups of teachers classified by teacher education levels.
3) There was no difference in perceived needs between more and less experienced teachers.
4) There were some differences in perceived needs in the four subject matters. The Item Analysis variable seemed to be the most needed while the other three were about the same.
5) There was no interaction between location of schools and level of teacher education.
6) There was a disordinal interaction between location of schools and teaching experience. Among the less experienced teachers, those in the university area had fewer needs than those outside the university area, but among the more experienced teachers, those outside the university area had fewer needs than those in the university area.
7) There was no interaction between teaching experience and level of teacher education.
8) There was no interaction among location of schools, teaching experience, and levels of teacher education.
9) There was an interaction between the subject matter and location of schools as presented in Figure 5. The university area group had more needs in the Test Planning, Item Analysis, and Reporting Test Scores variables but had less need in the Item Editing variable.
10) There was no interaction between the subject matter and level of teacher education.
11) There was no interaction between the subject matter and teaching experience.
12) There was no interaction among the subject matter, location of schools, and levels of teacher education.
13) There was no interaction among the subject matter, location of schools, and teaching experience.
14) There was no interaction among the subject matter, teaching experience, and level of teacher education.
15) There was no four-way interaction among the subject matter, location of schools, teaching experience, and levels of teacher education.

Analyses were also done to compare actual and perceived needs among various groups of teachers by the following independent variables: having attended the training program; sex; favoring a national testing program; and grade taught.

Among teachers who attended and did not attend the in-service program, the former group had perceived themselves as having a greater need but had lower actual needs (indicated by higher scores from the test) than those who had not attended the training program.

There were no significant differences on total needs between teachers who favored and did not favor the national testing policy. Neither were there any differences between the sexes.

There were differences among teachers who taught different grades. Teachers who taught in higher grades had lower needs than those who taught in lower grade levels.

CHAPTER V

SUMMARY AND CONCLUSIONS

In-service programs in educational measurement, with emphasis on test construction procedures, i.e., test planning and curriculum analysis, test item editing, item analysis, and reporting test scores, have been given to teachers in various parts of Thailand by the Ministry of Education and the Test Bureau at the College of Education, Prasarnmitr. Undoubtedly, the cost of this in-service program was high and the teachers who joined the program might not be those who actually needed the most help. It was the purpose of this study to find out which groups of teachers actually need in-service training and in which areas of measurement the need is the greatest. This study also yields some follow-up information on the effects of the earlier in-service programs. Two types of needs, actual and perceived, were investigated.
Summary and Discussion

The data of this study were based on two factors: a) the perceived needs, which were measures of the value judgment of teachers in the sample; and b) the actual needs, which were measures of their knowledge in measurement and evaluation. A Likert type questionnaire was constructed to measure the perceived needs. The items were a random sample of the tasks usually involved in the process of test construction. The respondent was asked to rate each task from 1 to 5, where 1 meant most needed and 5 not needed at all. The first edition of the questionnaire had eight items for each of the areas of in-service training for a total of thirty-two items. The actual needs were measured by a random sampling of test items measuring knowledge on educational measurement similar to those items used in basic measurement courses at Michigan State University. There were thirty-two items, eight each under the four areas of the in-service program. Both questionnaire and test were tried out and the reliabilities were found to be .87 and .41 respectively.

The revised editions of the questionnaire and test had twenty-eight items and twenty items respectively. They were sent together with a demographic data sheet and introduction letter to 400 teachers who were randomly selected from eight strata in Khon Kaen province. The stratification was based on three variables: location of school (university area or out of university area); level of teacher education (teacher certificate holders or higher teacher certificate holders); and teacher experience (less experience or more experience, using eight years as the dividing line). Responses were received from teachers forming an equal sample size for each of the eight strata.

The correlation between total actual needs and total perceived needs was very low, i.e., .004. This indicated that there was no relationship at all between the actual and the perceived needs.
Since most of the teachers had little knowledge about the subject matter, they felt a great need in every task. At a particular level of actual needs, there were some teachers who would like to know more about the subject matter and some who reported lower perceived needs. For the latter group, the perceived needs were compared with routine work in school and teachers seemed to decide that they had no time to pay attention to measurement problems. At the level of their present knowledge of the subject matter, they seemed to think that they could carry on test construction and measurement in general very well. The variation in perceived needs among those teachers who had the same level of actual needs caused the low correlation between the two needs.

Another way to explain the low correlation between perceived and actual needs is by the observation that the marginal distributions of both variables were skewed. Most of the teachers reported high perceived needs and low actual needs. The scatter diagram showed a tendency toward a curvilinear relationship between the two variables. Therefore, the Pearson product moment correlation yielded a low value.

In this study, the perceived needs were the teachers' value judgments about their needs in measurement and evaluation while the actual needs were what they knew in the subject matter areas. Since value judgments were different among those teachers who had the same levels of actual needs, this also contributed to the low correlation coefficient between the two needs.

The correlation between the number of teachers in the respondent's school and perceived needs was low and negative at -.069, and the correlation with actual needs was low at .160.
Since there was a high relationship between the number of teachers in school and the size of schools, the low correlation observed indicated that there was no relationship between the size of schools and the actual needs and between the size of schools and the perceived needs. The size of school was also related to the number of grades in school. Big schools generally were those that had grades 1 to 12. No matter how big the schools the teachers were in, their jobs were still determined by their levels of teacher education. Big schools included all levels of teachers, from those who had no teacher education at all to those who had master's degrees. The variation of teacher certificate in the big schools was greater than the variation in small schools.

The test of homogeneity of the variance-covariance matrix was not significant for the four subscales of the actual needs but was significant for the perceived needs. The univariate repeated measures analysis was applied to the actual needs and the multivariate test to the perceived needs. The three-way factorial design (2x2x2) with four repeated measures was analyzed by the FINN program. The design was crossed and balanced.

The analysis of the actual needs showed that the interaction among location of schools, level of teacher education, teaching experience, and the subject matter was significant. Descriptive follow-ups showed that the less experienced teachers holding only the teacher certificate who worked within the university area and the more experienced holders of the higher teacher certificate who also worked within the university area had the most contrasting profiles. The four-way interaction was explained by the significance of the location by teacher education by teacher experience interaction on the contrast variable of Test Planning and Reporting Test Scores.
The three-way interaction was not significant when Test Planning was analyzed individually, but the interaction was significant for Reporting Test Scores alone. In the presence of the four-way interaction, the subject matter by the level of teacher education by experience interaction, the subject matter by experience interaction, the subject matter by location interaction, the subject matter main effect, the location by experience interaction, and the teacher education main effect were significant at α = .05. All interactions were presented in graph form.

The F ratios for testing two hypotheses concerning the main effects were significant. They were the subject matter main effect and the teacher education main effect. On the general profile, actual needs in Reporting Test Scores was the highest need (lowest scores from the test) followed by Item Analysis and by Test Planning, with Item Editing the lowest. However, the general profile could not be made applicable to all groups of teachers because the four-way interaction was significant. The significance of the teacher education main effect was followed by the notion that the teacher certificate holders had more actual needs than the higher teacher certificate holders.

On perceived needs, only three null hypotheses were rejected. The subject matter by location interaction was studied along with the subject matter main effect. For teachers in the university area, the more experienced teachers had perceived needs ranging from highest to lowest in the following order: Item Analysis; Reporting Test Scores; Test Planning; and Item Editing. But for those outside the university area, the perceived need on Item Editing was second, followed by Reporting Test Scores and Test Planning. In general, Item Analysis had the highest scores in perceived needs.

The other null hypothesis rejected was the location of schools by teaching experience interaction.
The interaction was disordinal, which could be explained by the observation that less experienced teachers within the university area had lower perceived needs than the more experienced teachers within the university area, but less experienced teachers outside the university area had higher perceived needs than more experienced teachers outside the university area. The difference in perceived needs between the two locations was smaller for the less experienced teachers than for the more experienced teachers.

Further comparisons were made to find whether there was any difference among levels of the following independent variables: having attended the in-service program; sex; favoring a national testing program; and grade taught. On total perceived and actual needs, a difference was found on the grade taught variable for total actual needs only. It was indicated that the higher the grade taught, the higher the test scores and hence the lower the actual needs. The same observation was true for actual needs in Item Editing but not for the other three subject matters.

The score in actual needs for Item Editing also differed between teachers who had attended the in-service program and those who had not, even though no difference was found in the total needs. The means indicated that those who had attended had lower actual needs.

Between those who favored a national testing program and those who did not, differences were found on Test Planning perceived needs and Reporting Test Scores actual needs. The means showed that teachers who favored a national testing program had higher Test Planning perceived needs than those who did not. For Reporting Test Scores actual needs, teachers who favored national testing needed less.

There were no significant differences at all between male and female teachers.

Conclusions and Implications

The descriptive information supported the conclusion that the random sample was very similar to the total population.
This suggests that the bias due to the substitution of the samples in the outside-university area is minimal. The smaller teacher-student ratio for the sample as a whole resulted from the larger proportion of central amphur schools (university area) in the sample than in the total population. Most of the large schools, all of which were in the university area, had lower teacher-student ratios. Most of the schools outside the university area had much higher ratios because of the difficulty of getting teachers in the remote areas. Also, some of these remote schools are one-room schools.

Sixteen percent of the 320 teachers in the sample had experience in the in-service training program between 1961 and 1973. Seventy-one percent of these fifty-one teachers held at least the higher teacher certificate, and sixty-five percent were more experienced teachers, who had been teaching for at least eight years. The conclusion could be drawn that in those years the program of in-service training was given to those who had already had more teacher education. Because of the nature of the subject matter, with some mathematics involved, the holders of only the teacher certificate and those teachers who had not had any teacher education at all may have turned down the invitation to join the program.

A cross-tabulation between responses on attended/did not attend the in-service program and favored/did not favor a national testing policy revealed that of the fifty-one teachers who did attend the in-service program, thirteen favored a national testing policy and thirty-eight were unfavorable to it. About twenty-five percent (thirteen out of fifty-one) of the teachers who attended the in-service program favored a national testing policy. On the other hand, thirty-eight percent (102 out of 269) of the teachers who did not attend the in-service program reported favoring such a policy.
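The shares quoted in the two preceding paragraphs can be recomputed from the reported counts. The sketch below is only an arithmetic check under the assumption that the counts given in the text (51 attenders, 269 non-attenders, 13 and 102 favoring a national testing policy) are exact; the variable and function names are illustrative.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts reported in the text.
attended_total = 51
attended_favoring = 13
not_attended_total = 269
not_attended_favoring = 102

sample_size = attended_total + not_attended_total               # 320 teachers
favoring_total = attended_favoring + not_attended_favoring     # 115 teachers

share_attended = percent(attended_total, sample_size)
favor_among_attended = percent(attended_favoring, attended_total)
favor_among_not_attended = percent(not_attended_favoring, not_attended_total)

print(f"attended in-service training: {share_attended:.1f}%")
print(f"favoring, among attenders:     {favor_among_attended:.1f}%")
print(f"favoring, among non-attenders: {favor_among_not_attended:.1f}%")
```

The total of 115 teachers favoring the policy matches the group size shown in Table 4.10, and thirteen of fifty-one works out to roughly 25.5 percent.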
In the past, the in-service training program emphasized item editing in an attempt to improve the quality of the teacher-made test. The results of this attempt were reflected in the lower needs on both the actual and perceived needs tests than were found in other subject matter, i.e., lower than Item Analysis within perceived needs and lower than both Reporting Test Scores and Item Analysis in actual needs.

Because of the interaction of school location, teacher education, teacher experience, and subject matter, the needs for each group were different. Therefore, the subject matter should be arranged according to the needs of a majority of teachers in each training session. The results of this study seem to indicate that it would be appropriate to emphasize item analysis procedures in future in-service programs because it was the area with the highest perceived needs score and was next to the highest in the test of actual needs. Further, item analysis is the next step in improving tests after the items have been written and tried out.

In the Item Editing subject matter, teachers in the higher grades had less actual need than those who taught in lower grades. This may suggest that in the past, both in the in-service training program and in the teacher-training curriculum, the emphasis was on item editing, i.e., writing the test questions, more than on any other area of subject matter in measurement and evaluation.

In testing the difference between teachers who had attended and had not attended the in-service program, no significant difference was found between these two groups. But the data showed that there were some slight differences between them. In perceived needs, teachers who had attended the in-service training program felt more need than those who had not attended the program. This result might suggest that those who attended the program seemed to realize that there were several things that they must know about tests and measurement.
The data also indicated that teachers who had attended the training program had higher scores on the actual needs test (meaning that they had lower needs) than those who had not attended the program. And, again, the significant difference was found in the Item Editing subject matter.

One of the nationwide programs conducted by Chaval Paeratakul during 1965 included spending two days out of the five-day sessions on test item editing alone. The activities included theory in writing good test items with examples of good and less good items. Then each teacher had to write some test questions and criticize them item by item. Another piece of evidence to support the position that there is less need in Item Editing subject matter was that in Chaval Paeratakul's measurement textbook, the first book in the Thai language about measurement, the discussion of item editing takes four out of the nine chapters.

The study also suggests that there should be a compulsory program requiring the holders of teacher certificates and teachers with no formal teacher education to attend the training sessions, especially those who have been teaching for at least eight years outside the university area and new teachers within the university area. Special attempts should be made to give service to these two groups of teachers. The high actual needs for these two groups may be the result of a lack of interest, of heavy loads, or of routine teaching. Most of the teachers studied taught in the lower elementary grades (1 to 4), taught all subjects, and taught for about 5 hours a day. They apparently found no time to study or to pay much attention to other professional activities.
Recommendations for Further Study

Since the previous in-service training seemed to yield little benefit to the teacher certificate holders, perhaps because the work was too hard and the subject matter conceptually too difficult for them, any future in-service training should be studied to find out which group of teachers benefits the most from the training. Another attempt should be made to find suitable ways of presenting and interpreting the subject matter for those who have had no teacher education at all.

An experimental study should be done on the effect of the in-service training on actual measurement in the schools. Each year the Department of Education in Khon Kaen has about 120 positions open for graduates from the teacher training institutions. These new teachers might be randomly assigned to training and no-training groups. The no-training group would be sent directly to the assigned schools. The training group would first go through a training program and then be sent to the assigned schools. The follow-up study could then compare the quality of the tests these new teachers made. The efficiency of the training program might be evaluated from the proposed study.

Another experiment might be done to investigate the benefit from the training service. The gain in knowledge after the training should be studied to compare the gain made by holders of the teacher certificate with the gain made by holders of the higher teacher certificate. The quality of the tests they use should also be investigated to discover the extent to which the new knowledge they have acquired is put into practice, and under what circumstances the application is facilitated. If there is no improvement in the quality of the tests, the in-service training is in vain, and the effort and money put into the program are wasted.

The application of test results has been highly important in the Thai educational system.
One test result might determine a student's whole future. If the test is not valid and reliable, a capable student might be forced to drop out of school. In order to evaluate students' abilities on the basis of valid and reliable measures, sound national testing programs should be developed and put into practice. Standardized tests should be constructed and made available to all teachers. Funds should be provided by the Ministry of Education to construct national norms. A National Test Bureau should be established to be a center of testing services and to carry on all educational testing jobs. Programs of in-service training could be part of the function of this National Test Bureau.

Different programs should be established for different groups of teachers, depending on their needs and the levels of their education. For example, teachers with a minimum educational background should be provided with basic information about how to construct a test, how to write good items, and how to interpret test scores, since these subject matters do not require much knowledge of statistical procedures. Item analysis and transforming scores should be tasks done for them by personnel at the local educational offices.

Since a four-way interaction (subject matter by location of schools by level of teacher education by teaching experience) was found, and the needs of teachers in each group were different in some way, different programs of in-service training must be planned for particular groups of teachers, and these programs must be flexible. That is, before an in-service program is planned, it is desirable that a pilot study be conducted to find out the needs of teachers in the particular location and the level of their knowledge about the subject.
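The score transformations mentioned above as a task for the local educational offices can be illustrated with a short sketch. It is a minimal example under stated assumptions, not a procedure taken from this study: the function names and the sample scores are hypothetical, the standard deviation is computed over the whole class (population form), the T-score follows the common convention of a mean of 50 and a standard deviation of 10, and the percentile rank counts half of the tied scores as falling below.

```python
from statistics import mean, pstdev


def z_scores(raw):
    """Linear standard scores: (raw score - class mean) / class standard deviation."""
    m = mean(raw)
    s = pstdev(raw)  # population form; an assumption of this sketch
    if s == 0:
        raise ValueError("all scores are identical; z-scores are undefined")
    return [(x - m) / s for x in raw]


def t_scores(raw):
    """T-scores under the common convention T = 50 + 10z."""
    return [50 + 10 * z for z in z_scores(raw)]


def percentile_rank(raw, score):
    """Percentage of scores below `score`, counting half of the ties."""
    below = sum(1 for x in raw if x < score)
    ties = sum(1 for x in raw if x == score)
    return 100 * (below + 0.5 * ties) / len(raw)


# Hypothetical raw scores for a small class (illustrative only).
class_scores = [12, 15, 15, 18, 20]
for raw, z, t in zip(class_scores, z_scores(class_scores), t_scores(class_scores)):
    rank = percentile_rank(class_scores, raw)
    print(f"raw={raw:2d}  z={z:+.2f}  T={t:.1f}  percentile rank={rank:.0f}")
```

Because the z-score is a linear transformation, the converted scores keep the shape of the raw score distribution; only the mean and the spread change. A pupil whose z-score is zero has a raw score equal to the mean.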
Besides the standardization of tests and the administration of the in-service program, the National Test Bureau should provide an information service, either through the preparation and distribution of a teachers' handbook on test construction, or through a quarterly journal on educational measurement and evaluation, or preferably through both.

The College of Education at Khon Kaen University should add another dimension in responding to teachers' testing needs. As the closest resource, the university can provide intermediate help to all teachers in the province. The services can be as individualized as correspondence by mail, providing practical suggestions on an individual teacher's problems. The university test and research center should offer services such as consultation and information services. Graduate students majoring in Measurement and Evaluation should acquire most valuable experience through participation in this program.

A summer workshop is another service which the university can provide to teachers. Of course, content and structure should be drawn up carefully on the basis of the participants' characteristics and needs. The session should be about one to one and a half months long. Practical considerations would be more important than a theoretical orientation. Before the participating teachers leave the campus, they should be convinced that they can get help from the staff at any time on any measurement problem that they may have in the future. They should leave with the feeling that the university is trying to help all teachers, both participants and non-participants. The university is the place they can turn to when they would like to have their problems solved.

APPENDIX A

THE INSTRUMENT (FIRST EDITION)

PART I

Directions: You need not identify yourself, but please answer each of the questions as accurately and as honestly as you can.

Please mark X in the appropriate columns for each item. Do you think you need to know more about these tasks?
Response columns for each item: Urgent Need / Very Much Needed / Some Need / Little Need / Don't Need

1. Choosing the type of test to use
2. Determining the length of the test
3. Deciding what ability should be emphasized in the test
4. Determining the ratio among questions that call for memory, application, and reasoning
5. Determining the number of items for each topic
6. Administering the test
7. Setting the difficulty level
8. Arranging the items into the test
9. Writing good item stems (in multiple-choice items)
10. Writing reasonable alternatives (in multiple-choice items)
11. Writing good true-false items
12. Scoring essay tests
13. How to measure in the cognitive domain
14. How to measure in the affective domain
15. How to measure in the psycho-motor domain
16. Writing better items
17. Understanding the major uses and limitations of item analysis data
18. Selecting good items
19. Computing and understanding the item difficulty index
20. Computing and understanding the item discrimination index
21. Understanding the difference and relationship between difficulty and discrimination
22. Finding the validity of a test
23. Finding the reliability of a test
24. Revising items
25. Reporting and writing recommendations to the principal and superintendent
26. Choosing the type of test scores
27. Computing and understanding T-scores
28. Computing and understanding percentiles
29. Assigning grades to students
30. Drawing and interpreting profiles
31. Using norms
32. Distinguishing between norms and standard

PART II

Mark X in front of the best answer of each item.

1. In evaluating an achievement test for possible use in your school, the most important issue is whether the test
   a) is based upon a thorough survey of teaching practices over the whole country.
   b) matches in content the objectives of instruction in the school.
   c) has sufficiently high predictive validity.
   d) is convenient to administer and score.
   e) has sufficiently high reliability.

2. The most important concern in judging how well a test samples a content area is its
   a) completeness.
   b) correlation with a criterion.
   c) representativeness.
   d) variety.

3. What is the first decision made by the test constructor?
   a) Characteristics of students taking the test.
   b) Number of test items to use.
   c) Types of test items to use.
   d) Difficulty of the test.
   e) Purpose of the test.

4. In constructing a table of specifications for a classroom test, the number of test items allotted to each cell should be
   a) varied each term, so that students are unable to detect the amount of emphasis to be given to each area in the test.
   b) approximately equal, so that no particular objective will receive undue emphasis.
   c) balanced between knowledge and higher level learning outcomes.
   d) determined by the emphasis to be given during instruction.

5. Which of the following will not be accomplished by constructing a test blueprint?
   a) help to clarify instructional objectives.
   b) improve the validity of the test.
   c) relate evaluation to teaching.
   d) help to "balance" the test.
   e) save construction time.

6. Which one of the test attributes is most likely to be improved by use of a table of specifications?
   a) Difficulty
   b) Objectivity
   c) Reliability
   d) Sampling
   e) Usability

7. Which type of item format provides greatest content sampling in the shortest possible testing time?
   a) essay
   b) matching
   c) multiple-choice
   d) short-answer
   e) true-false

8. Which of the following is not a purpose of the classroom achievement test?
   a) Measuring attainment of instructional objectives.
   b) Diagnosing individual learning difficulties.
   c) Judging relative mastery of basic knowledge.
   d) Assessing relative curricular effectiveness.
   e) Assessing instructional effectiveness.

9. The best source of possible alternatives for a multiple-choice item is the
   a) explanations of poor students in the class.
   b) teacher's knowledge of the subject matter.
   c) questions of bright students in the class.
   d) textual material assigned to the class.

10. Multiple-choice items written in the form of questions
   a) assume complex reading skills.
   b) require no more detailed alternatives than do incomplete statements.
   c) have greater clarity than items phrased in incomplete statements.
   d) lend themselves to easier detection of construction flaws.

11. It is good practice in multiple-choice, single-correct answer test items to
   a) randomize the order of the answers, putting the correct alternative in each position an approximately equal number of times.
   b) put the correct answer last or next to last more often than in any other position.
   c) let chance take care of the order of alternatives within the item.
   d) arrange placement of the correct answers according to some predetermined scheme.

12. "None of the above" is an appropriate distractor for a multiple-choice item in cases where
   a) the other distractors are partially but not absolutely correct.
   b) the best answer is an absolutely correct answer.
   c) the number of possible responses is limited to two or three.
   d) a large variety of possible responses might be given.
   e) guessing is apt to be a serious problem.

13. Which, if any, of the following should characterize the stem of a good multiple-choice test item?
   a) It should be a complete direct question.
   b) It should be the subject portion of the proposition the item is intended to test.
   c) It should express or imply a specific question.
   d) It need not have any of the foregoing characteristics.

14. How should the question and the responses of a multiple-choice test item be changed to make the item easier?
   a) question more specific, responses more alike.
   b) question more specific, responses more diverse.
   c) question more general, responses more alike.
   d) question more general, responses more diverse.

15.
With respect to the multiple-choice test item, a good distracter is
   a) only an untrue statement.
   b) only an irrelevant statement.
   c) both untrue and irrelevant.
   d) either untrue or irrelevant.

16. Within a given item format, the items of a test should be arranged by
   a) subject-matter content.
   b) increasing level of complexity.
   c) level of reading difficulty.
   d) a random pattern.

17. If experts agree on the correct answer to a test item, the item is
   a) fair.
   b) objective.
   c) reliable.
   d) specific.

18. Fundamentally, item discrimination indices attempt to indicate
   a) the ability of items to distinguish between persons having different levels of the characteristic which the test measures.
   b) how well the items can be distinguished from one another within the test.
   c) the degree to which items from different tests can be discriminated.
   d) the ability of the item writer to develop items which are readable.

19. If three good students out of four answer a question correctly, what is its index of discrimination?
   a) 1.00
   b) .75
   c) .25
   d) Insufficient information is given to answer the question.

20. It is likely that item discrimination will be at a maximum when the item difficulty index approaches
   a) 1.00
   b) .75
   c) .50
   d) .25
   e) 0.00

Items 21 through 24 are based on the following data: A class of 200 students was divided at the median into a high and a low group. Their responses to five four-choice multiple-choice items are tabulated below. Correct responses are underlined (marked here with *).

                       Number of students marking
Item Number   Group   Response 1   Response 2   Response 3   Response 4
     1        High        80*          20            0            0
              Low         45*          30           20            5
     2        High         0            0          100*           0
              Low          0           15           85*           0
     3        High        35*          10           10           45
              Low         55*          20            0           25
     4        High        20            5           65*          10
              Low         20           25           35*          20
     5        High        30           30            0           40*
              Low          5           55            5           35*

Use the following KEY to answer each of the next four items:

KEY: a) Item 1  b) Item 2  c) Item 3  d) Item 4  e) Item 5

21. Which item discriminated best between high and low groups?

22. Which item has the highest index of difficulty?

23.
Which item has a difficulty index of .50?

24. Which item shows negative discrimination?

25. The most easily defensible use of tests in schools is for
   a) accreditation of schools.
   b) motivation of study.
   c) teacher evaluation.
   d) pupil guidance.

26. If a pupil had a score at the median of his class, we know he
   a) is at a score halfway between the lowest and highest score.
   b) has a score higher than those of half his classmates.
   c) is doing acceptable work.
   d) has a score at the mean.

27. Norms are least likely to remain current for which type of test?
   a) Achievement
   b) Aptitude
   c) Interest
   d) Projective

28. On test XYZ pupil A scored at the 80th percentile and pupil B scored at the 40th percentile. What statement may we make relative to their abilities?
   a) Pupil A has twice the measured ability of Pupil B.
   b) Pupil A's test score will be twice that of Pupil B.
   c) If Pupil A's score is superior to those of 70 pupils, B's score is superior to those of 35.
   d) Percentile ranks are meaningless in comparing the scores of A and B.
   e) We need to know more about the distribution of scores before we can compare the two percentiles.

29. A distribution of linear z-scores is always
   a) similar to the raw score distribution.
   b) a rectangular distribution.
   c) a normal distribution.
   d) a peaked distribution.
   e) a skewed distribution.

30. A person whose z-score was zero would have a raw score equal to
   a) zero.
   b) the mean.
   c) the median.
   d) none of these.

31. If one desires to give equal weight to several tests when combining scores, he will obtain the best results if he uses
   a) percentages.
   b) percentile scores.
   c) raw scores.
   d) standard scores.
   e) any of the above--it makes no difference.

32. Is it wise to base a pupil's grade on his achievement in relation to his ability? Why?
   a) Yes, because achievement depends on ability.
   b) Yes, because it gives every pupil an equal chance to make high marks.
   c) No, because the ability required in a particular course is difficult to assess properly.
   d) No, because to do so simply intensifies the competition for high marks.

APPENDIX B

THE INSTRUMENT (FINAL EDITION)

To my colleagues who teach in Khon Kaen Province schools,

I am studying for my doctor's degree at Michigan State University, East Lansing, Michigan. The last requirement is that I do a research project. Because I expect to return to the Faculty of Education at Khon Kaen University when I have finished, and because Khon Kaen is my home city, I am studying whether the teachers in our province need and want help in making and using classroom tests.

The Changwad Education Officer and the Amphur Education Officers are interested in helping in this study, and I am very grateful to them. Your Amphur Education Officer is distributing the questionnaires and will return them to me. But nobody except me will know just how you answered. When you have made your answers, put your questionnaire in the enclosed envelope, seal it, and hand it to your principal, who will start it on its way back to me. Do not sign your name on the envelope.

Answer each question as accurately and as honestly as you can. Your responses to this questionnaire will not be identified with you individually; therefore, do not worry about being wrong. If my study is to be acceptable and later become useful, it is important that you take this task seriously and that I get back all the questionnaires. It will take about thirty minutes for you to fill out the form.

Thank you very much for your help. The findings will probably form the basis for designing a course or workshop to be offered to all teachers.

Cordially yours,
Sor Wasna Pravalpruk

Directions: You need not identify yourself, but please fill in all information on this page. This information is for research purposes only. Answer each of the questions on this and the following pages as accurately and as honestly as you can.
PART I

Please fill in the blanks and check the appropriate categories.

Name of school
Amphur
What grade are you teaching?
What subjects are you teaching?
How many teachers are there in your school?
How many students are there in your school? (approximate)
Your sex: [ ] Male  [ ] Female
Level of teacher education: [ ] Cert. and lower  [ ] Higher cert. and higher
Years of teaching experience: [ ] Less than 8 years  [ ] Eight years and more
Did you attend the Measurement and Evaluation Workshop program during 1966-1968? [ ] Yes  [ ] No
Are you in favor of national testing? [ ] Yes  [ ] No

PART II

Please mark X in the appropriate columns for each item. Do you think you need to know more about these tasks?

Response columns for each item: Urgent Need / Very Much Needed / Some Need / Little Need / Don't Need

1. Choosing the type of test to use
2. Determining the length of the test
3. Deciding what ability should be emphasized in the test
4. Determining the number of items for each topic
5. Administering the test
6. Setting the difficulty level
7. Arranging the items into the test
8. Writing good item stems (in multiple-choice items)
9. Writing reasonable alternatives (in multiple-choice items)
10. Writing good true-false items
11. Scoring essay tests
12. How to measure in the cognitive domain
13. How to measure in the affective domain
14. How to measure in the psycho-motor domain
15. Understanding the major uses and limitations of item analysis data
16. Computing and understanding the item difficulty index
17. Computing and understanding the item discrimination index
18. Understanding the difference and relationship between difficulty and discrimination
19. Finding the validity of a test
20. Finding the reliability of a test
21. Revising items
22. Reporting and writing recommendations to the principal and superintendent
23. Choosing the type of test scores
24. Computing and understanding T-scores
25. Computing and understanding percentiles
26. Drawing and interpreting profiles
27. Using norms
28. Distinguishing between norms and standard

PART III

Mark X in front of the best answer of each item.

1. In evaluating an achievement test for possible use in your school, the most important issue is whether the test
   a) is based upon a thorough survey of teaching practices over the whole country.
   b) matches in content the objectives of instruction in the school.
   c) has sufficiently high predictive validity.
   d) has sufficiently high reliability.

2. The most important concern in judging how well a test samples a content area is its
   a) length.
   b) correlation with a criterion.
   c) representativeness.
   d) variety.

3. In constructing a table of specifications for a classroom test, the number of test items allotted to each cell should be
   a) varied each term, so that students are unable to detect the amount of emphasis to be given to each area in the test.
   b) approximately equal, so that no particular objective will receive undue emphasis.
   c) balanced between knowledge and higher level learning outcomes.
   d) determined by the emphasis to be given during instruction.

4. Which of the following will not be accomplished by constructing a test blueprint?
   a) improve the validity of the test.
   b) relate evaluation to teaching.
   c) help to "balance" the test.
   d) save construction time.

5. Which type of item format provides greatest content sampling in the shortest possible testing time?
   a) matching.
   b) multiple-choice.
   c) short-answer.
   d) true-false.

6. Multiple-choice items written in the form of questions
   a) assume complex reading skills.
   b) require no more detailed alternatives than do incomplete statements.
   c) have greater clarity than items phrased in incomplete statements.
   d) lend themselves to easier detection of construction flaws.

7. It is good practice in multiple-choice, single-correct answer test items to
   a) randomize the order of the answers, putting the correct alternative in each position an approximately equal number of times.
   b) put the correct answer last or next to last more often than in any other position.
   c) put the correct answer first more often than in any other position.
   d) arrange placement of the correct answers according to some predetermined scheme.

8. Which, if any, of the following should characterize the stem of a good multiple-choice test item?
   a) It should be the subject portion of the proposition the item is intended to test.
   b) It should express or imply a specific question.
   c) It need not have any of the foregoing characteristics.
   d) Both (a) and (b).

9. To make a multiple-choice item easier, we should make the
   a) question more specific, responses more alike.
   b) question more specific, responses more diverse.
   c) question more general, responses more alike.
   d) question more general, responses more diverse.

10. Within a given item format, the items of a test should be arranged by
   a) decreasing level of complexity.
   b) increasing level of complexity.
   c) level of reading difficulty.
   d) a random pattern.

11. Fundamentally, item discrimination indices attempt to indicate
   a) the ability of items to distinguish between persons having different levels of the characteristic which the test measures.
   b) how well the items can be distinguished from one another within the test.
   c) the degree to which items from different tests can be discriminated.
   d) the ability of the item writer to develop items which are readable.

12.
If three good students out of four answer a question correctly, what is its index of discrimination?
   a) 1.00
   b) .75
   c) .25
   d) Insufficient information is given to answer the question.

Items 13 through 15 are based on the following data: A class of 200 students was divided at the median into a high and a low group. Their responses to four four-choice multiple-choice items are tabulated below. Correct responses are underlined (marked here with *).

                       Number of students marking
Item Number   Group   Response 1   Response 2   Response 3   Response 4
     1        High        80*          20            0            0
              Low         45*          30           20            5
     2        High         0            0          100*           0
              Low          0           15           85*           0
     3        High        35*          10           10           45
              Low         55*          20            0           25
     4        High        20            5           65*          10
              Low         20           25           35*          20

Use the following KEY to answer each of the next three items:

KEY: a) Item 1  b) Item 2  c) Item 3  d) Item 4

13. Which item discriminated best between high and low groups?

14. Which item has a difficulty index of .50?

15. Which item shows negative discrimination?

16. The norms for which type of test need to be revised more often?
   a) Achievement
   b) Aptitude
   c) Interest
   d) Projective

17. On test XYZ pupil A scored at the 80th percentile and pupil B scored at the 40th percentile. What statement may we make relative to their abilities?
   a) Pupil A has twice the measured ability of Pupil B.
   b) Pupil A's test score will be twice that of Pupil B.
   c) If Pupil A's score is superior to those of 70 pupils, B's score is superior to those of 35.
   d) Percentile ranks are meaningless in comparing the scores of A and B.

18. A person whose z-score was zero would have a raw score equal to
   a) zero.
   b) the mean.
   c) the mode.
   d) none of these.

19. If one desires to give equal weight to several tests when combining scores, he will obtain the best results if he uses
   a) percentile scores.
   b) raw scores.
   c) standard scores.
   d) any of the above--it makes no difference.

20. Is it wise to base a pupil's grade on his achievement in relation to his ability? Why?
   a) Yes, because achievement depends on ability.
   b) Yes, because it gives every pupil an equal chance to make high marks.
   c) No, because the ability required in a particular course is difficult to assess properly.
   d) No, because to do so simply intensifies the competition for high marks.

[Thai-language version of the instrument; the Thai script is not recoverable from this copy.]
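The item-analysis questions in Part III rest on two indices that can be computed from a tally of the high and low groups. The sketch below is illustrative only and is not part of the instrument: it assumes the difficulty index is the proportion of all examinees answering correctly and the discrimination index is the difference between the proportions correct in the high and low groups, which are common conventions but are not spelled out in the instrument itself. The function name and the tallies are hypothetical.

```python
def item_indices(high_correct, low_correct, n_high, n_low):
    """Return (difficulty, discrimination) for a single test item.

    difficulty     = proportion of all examinees answering correctly
    discrimination = proportion correct in the high group
                     minus proportion correct in the low group
    Both definitions are assumptions of this sketch.
    """
    if n_high <= 0 or n_low <= 0:
        raise ValueError("both groups must contain at least one examinee")
    difficulty = (high_correct + low_correct) / (n_high + n_low)
    discrimination = high_correct / n_high - low_correct / n_low
    return difficulty, discrimination


# Hypothetical tallies for a class of 200 split at the median (100 per group):
# item name -> (correct in high group, correct in low group)
tallies = {"X": (70, 30), "Y": (40, 60), "Z": (50, 50)}

for name, (high, low) in tallies.items():
    p, d = item_indices(high, low, n_high=100, n_low=100)
    print(f"item {name}: difficulty={p:.2f}  discrimination={d:+.2f}")
```

A negative value, as for the hypothetical item Y, means that the low group outperformed the high group on that item. Under these definitions both group tallies are needed; the result for the high group alone does not determine the discrimination index.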
[Thai-language test items, continued; text illegible in this copy. Legible English glosses: Percentile, Mean, Mode, Raw Score, Standard Score, Grade.]

APPENDIX C

PAIRWISE PROFILE ANALYSIS ON ACTUAL NEEDS

Sources                                  df       MS        F   P less than

All Groups
  PL-RE                                   1   130.46   128.05    .0001
  IE-RE                                   1   178.45   168.16    .0001
  IA-RE                                   1    30.18    25.83    .0001

Location
  PL-RE                                   1     0.35     0.35    .557
  IE-RE                                   1     3.31     3.11    .079
  IA-RE                                   1     1.70     1.46    .229

Teacher Education
  PL-RE                                   1     0.04     0.04    .845
  IA-RE                                   1     1.50     1.28    .258

Teaching Experience
  PL-RE                                   1     5.44     5.44    .022
  IE-RE                                   1     3.31     3.11    .079
  IA-RE                                   1     8.79     7.52    .007

Location x Education
  IE-RE                                   1     0.76     0.71    .399
  IA-RE                                   1     1.14     0.97    .324

Location x Experience
  PL-RE                                   1     1.50     1.47    .226
  IE-RE                                   1     0.01     0.01    .939
  IA-RE                                   1     0.69     0.59    .443
Education x Experience
  PL-RE                                   1     7.44     7.30    .007
  IE-RE                                   1     6.80     6.41    .012
  IA-RE                                   1    11.82    10.12    .002

Location x Education x Experience
  PL-RE                                   1    11.29    11.07    .001
  IE-RE                                   1     1.06     1.00    .319
  IA-RE                                   1     0.98     0.84    .361

Within
  PL-RE                                 312    1.019
  IE-RE                                 312    1.061
  IA-RE                                 312    1.169

APPENDIX D

MULTIVARIATE ANALYSIS OF VARIANCE ON FOUR SUBSCALES ON ACTUAL NEEDS

Sources                                            df       MS        F   P less than

Location                                      4 & 309             2.258    .063

Teacher Education                             4 & 309             2.554    .039
  PL                                                1     0.80     0.73    .393
  IE                                                1     6.90     5.44    .020
  IA                                                1     8.45     5.84    .016
  RE                                                1     1.38     1.27    .261

Experience                                    4 & 309             2.549    .040
  PL                                                1     0.05     0.05    .831
  IE                                                1     0.25     0.20    .655
  IA                                                1     1.25     0.86    .353
  RE                                                1     9.45     8.69    .004

Location x Teacher Education                  4 & 309             1.219    .303

Location x Experience                         4 & 309             1.504    .201

Education x Experience                        4 & 309             3.065    .017
  PL                                                1     1.25     1.15    .286
  IE                                                1     0.90     0.71    .400
  IA                                                1     4.51     3.12    .078
  RE                                                1     7.50     6.89    .009

Location x Teacher Education x Experience     4 & 309             3.618    .007
  PL                                                1     3.20     2.93    .088
  IE                                                1     2.28     1.80    .181
  IA                                                1     2.45     1.67    .194
  RE                                                1     8.78     8.07    .005

Within
  PL                                              312    1.092
  IE                                              312    1.268
  IA                                              312    1.447
  RE                                              312    1.088

BIBLIOGRAPHY

Ahmann, J.S., and Glock, M.D. Evaluating Pupil Growth. Boston, Massachusetts: Allyn and Bacon, Inc., 1971.

Allen, Margaret E. "Status of Measurement Courses for Undergraduates in Teacher-Training Institutions." Thirteenth Yearbook National Council on Measurements Used in Education, 1956: 67-73.

Bell, R.D. "The Risky Business of Classroom Testing." Business Education Forum. 27 (October 1970): 8-10.

Black, H. They Shall Not Pass. New York: William Morrow & Co., Inc., 1963.

Bowman, A. "In-Service Training of Teachers in Measurement." The 18th Yearbook of the National Council on Measurement in Education. 1961: 43-49.

Brimm, L. and Tollett, D.J. "How do Teachers Feel About In-Service Education?" Educational Leadership. 31 (March 1974): 521-525.

Durost, N. "Problems in In-Service Training of Teachers in the Use of Measurement and Evaluation Techniques." The 16th Yearbook of the National Council on Measurement in Education. 1959: 31-33.

Dyer, S. "Testing's Nine Major Pitfalls." Senior Scholastic. 80 (March 14, 1962): 21T-23T.

Ebel, R.L. (Ed.) Encyclopedia of Educational Research. 4th ed., 1969.

Ebel, R.L. Essentials of Educational Measurement. Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1972.

Ebel, R.L. Essentials of Educational Measurement: Instructor's Manual. Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1972.

Ebel, R.L. "Improving the Competence of Teachers in Educational Measurement." Clearing House. 36 (October 1961): 67-71.

Ebel, R.L. "Measurement and the Teacher." Educational Leadership. 20 (October 1962): 20-24.

Ebel, R.L. "The Social Consequences of Educational Testing." School and Society. 92 (1964): 331-334.

Findley, W.G. (Ed.) "The Impact and Improvement of School Testing Program." (Article by Elizabeth Hagen and Lucille Lindberg) The 62nd Yearbook of the National Society for the Study of Education Part II, 1963: 232-253.

Findley, W.G., and Bryan, Miriam M. "Tests Like Handguns..." (From Education Daily, January 20, 1972). Phi Delta Kappan. 53 (March 1972): 442.

Gerberich, J.R. "The Development of Educational Testing." Theory Into Practice. 2 (October, 1963): 184-191.

Goslin, D.A. Teachers and Testing. Russell Sage Foundation, 1967.

Grobman, Hulda. "Classroom Testing and the Biology Teacher: An Annotated Bibliography." American Biology Teacher. 29 (April, 1967): 282-285.

Grobman, Hulda. "Classroom Testing in Biology: An Annotated Bibliography (II)." American Biology Teacher. 33 (February 1971): 86-90.

Heenan, D.K. and Chaval Paeratakul. Education in Thailand: Evaluation of Instruction in Thailand. East Lansing: Michigan State University, 1966.

Holt, J. "I Oppose Testing, Marking and Grading." Today's Education. 60 (March 1971): 28-31.

Jessel, J.C. "Focus on Issues: Should we Declare a Moratorium on Mental Testing." Contemporary Education. 44 (November 1972): 127.

LaSage, W. "The Hassle Over Standardized Testing." Instructor. 82 (March 1973): 45-56.

Link, F.R. "Teacher-Made Tests." National Education Association Journal. 52 (October 1963): 23-25.

McMorris, R.F. et al. "Effects of Violating Item Construction Principles." Journal of Educational Measurement. 9 (Winter 1972): 287-295.

McSwain, E.T. "The Role of the College Faculty in the Teacher Education Program." The 16th Yearbook of the National Council on Measurement in Education. 1959: 34-38.

Mehrens, W.A. "The Consequences of Misusing Test Results." National Elementary Principal. 47 (September 1967): 62-64.

Mehrens, W.A. and Lehmann, I.J. Measurement and Evaluation in Education and Psychology. New York: Holt, Rinehart and Winston, Inc., 1973.

Mehrens, W.A. and Lehmann, I.J. Measurement and Evaluation in Education and Psychology: Instructor's Manual. New York: Holt, Rinehart and Winston, Inc., 1973.

Noll, V.H. "Problems in the Pre-Service Preparation of Teachers in Measurement." The 18th Yearbook of the National Council on Measurement in Education. 1961: 35-42.

Noll, V.H. "Requirements in Educational Measurement for Prospective Teachers." School and Society. 82 (September 17, 1955): 88-90.

Roeder, H.H. "Are Today's Teachers Prepared to Use Tests?" Peabody Journal of Education. 49 (April 1972): 239-240.

Rose, H.A. and Elton, C.F. "Testing, on the Bias." Journal of College Student Personnel. 12 (September 1971): 362-364.

Rosner, B. "Substance of Training in Measurement and Evaluation in Pre-Service Preparation of Teachers." The 18th Yearbook of the National Council on Measurement in Education. 1961: 51-54.

Schutz, R.E. "The Role of Measurement in Education: Servant, Soulmate, Stoolpigeon, Statesman, Scapegoat, All of the Above, and/or None of the Above." Journal of Educational Measurement. 8 (Fall 1971): 141-146.

Schwartz, A.A., and Tiedeman, D.
Evaluating Student Progress in the Secondary School. New York: Longmans, 1957.

Sharp, Evelyn W. "How Teachers See Testing." Educational Leadership. 24 (November 1966): 141-143.

Stevens, C.A. "Measurement in the Teaching-Learning Situation." The 16th Yearbook of the National Council on Measurement in Education. 1959: 39-42.

Stevens, Margaret. "The Role of the Classroom Teacher in School Testing Programs." The 16th Yearbook of the National Council on Measurement in Education. 1959: 43-46.

Thorndike, R.L. and Hagen, Elizabeth. Measurement and Evaluation in Psychology and Education. New York: John Wiley and Sons, Inc., 1969.

Trump, J.L. "What's Wrong with Testing." Theory Into Practice. 2 (1963): 235-242.

Tylor, Jean L. "Of Tools and Fools." Phi Delta Kappan. 55 (December 1973): 285-286.

Wronski, S.P. and Raw Sawagdie-Panich. Education in Thailand: Secondary Education, Manpower and Education Planning in Thailand. East Lansing: Michigan State University, 1966.