THE EFFECTS OF COACHING TO IMPROVE SCORES ON VERBAL ANALOGY AND NUMBER SERIES TESTS

Thesis for the Degree of Ph.D.
MICHIGAN STATE UNIVERSITY
JOHN HENRY SCHWEITZER
1971

This is to certify that the thesis entitled THE EFFECTS OF COACHING TO IMPROVE SCORES ON VERBAL ANALOGY AND NUMBER SERIES TESTS presented by John Henry Schweitzer has been accepted towards fulfillment of the requirements for the Ph.D. degree in Education.

Major professor
Date: July 28, 1971

ABSTRACT

THE EFFECTS OF COACHING TO IMPROVE SCORES ON VERBAL ANALOGY AND NUMBER SERIES TESTS

By John Henry Schweitzer

This study was conducted to systematically investigate the effects of coaching to improve scores on two item types commonly used in reasoning tests, namely, verbal analogies and number series.
The coaching was evaluated with respect to its effect on number of items solved correctly, the speed of solution, the testee's attitude toward the items, his test-taking confidence, the accuracy of this confidence, and the testee's own subjective evaluation of the coaching. The effects of the coaching on items dissimilar to the coaching material and the effects of the coaching over time were investigated. Also examined were the influences of initial ability level and sex of the person being coached upon the effectiveness of the coaching. The influence of coaching on test validity was also investigated.

Eighty-eight college freshmen were randomly assigned to one of two treatment groups. One group received coaching in verbal analogy solution techniques; the other was coached in solving number series problems. There were actually two experiments, with each group serving as the control group for the other. The design for each study included the factors of sex and ability level based on pretest scores. At the time of the pretest the subjects had already been randomly assigned to one of the two major groups. Assignment to high and low ability level was based on the grand median of all subjects of both sexes in both groups, with scores at the grand median equally split between the high and low ability level within each sex in each treatment group. The post hoc blocking procedure resulted in unequal cell sizes, so subjects were randomly eliminated to obtain a balanced design. This procedure was followed in both studies, using verbal analogy pretest scores in one case and number series pretest scores in the other, and in each case the final outcome was nine subjects per cell for a total of 72 subjects. The 72 subjects were not the same in each study, although there was considerable overlap.
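The post hoc blocking and balancing procedure just described can be illustrated with a minimal sketch. It is not the procedure as actually carried out in the study: the record layout (`group`, `sex`, `pretest`), the function name `block_and_balance`, the seed, and the rule that an odd tied subject falls in the low block are all assumptions made for illustration.

```python
import random
from statistics import median


def block_and_balance(subjects, seed=0):
    """Median-split subjects into ability blocks, then trim cells to equal size.

    `subjects` is a list of dicts with the (hypothetical) keys
    'group', 'sex', and 'pretest'.
    """
    rng = random.Random(seed)
    grand_median = median(s["pretest"] for s in subjects)

    groups = sorted({s["group"] for s in subjects})
    sexes = sorted({s["sex"] for s in subjects})
    cells = {(g, x, a): [] for g in groups for x in sexes for a in ("high", "low")}
    ties = {(g, x): [] for g in groups for x in sexes}

    for s in subjects:
        key = (s["group"], s["sex"])
        if s["pretest"] > grand_median:
            cells[key + ("high",)].append(s)
        elif s["pretest"] < grand_median:
            cells[key + ("low",)].append(s)
        else:
            ties[key].append(s)

    # Scores exactly at the grand median are split between high and low
    # within each group-by-sex combination (odd one out goes to "low",
    # an assumption; the source does not say).
    for key, tied in ties.items():
        rng.shuffle(tied)
        half = len(tied) // 2
        cells[key + ("high",)].extend(tied[:half])
        cells[key + ("low",)].extend(tied[half:])

    # Randomly eliminate subjects so every cell matches the smallest cell.
    n = min(len(members) for members in cells.values())
    return {cell: rng.sample(members, n) for cell, members in cells.items()}
```

Run once with analogy pretest scores and once with series pretest scores, a routine of this kind would yield two balanced designs whose retained subjects overlap without being identical, as reported above.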
The series of dependent measures varied on three factors: time of posttest (immediate and delayed), type of item (similar and dissimilar to the coaching material), and testing condition (speeded and power). The three design factors and the three measures factors were all completely crossed. With two levels for each of the three measures factors, there was a total of eight repeated measures or subtests for each study.

Each of the subtests yielded four dependent variables: total score as measured by the number of correct items; test-taking attitude as measured by reaction to solving the items of the subtest; confidence as measured by the number of items estimated to be correct on the subtest; and accuracy of test-taking confidence as measured by the absolute difference between number estimated to be correct and actual number correct. In addition, speed of item solution was measured in two ways: by the number of items attempted on the subtests given under speeded conditions, and by the amount of time spent on the subtests given under power conditions. A final dependent variable was each subject's evaluation of the effectiveness of the coaching in helping to solve each of the four item types.

The data from the two experiments can be summarized as follows:

1. The group coached in number series solution techniques had a significantly higher mean score on all series items than the control group. The group coached in verbal analogies had a higher mean score on all analogies than the control group. This difference was not statistically significant. However, the F ratio was only .03 below the F required for significance.

2. Coaching on verbal analogies interacted with sex, item type, and test condition in the hypothesized direction on total mean analogy scores. Coaching in number series interacted with item type in the hypothesized direction on total series scores.
It interacted with time of posttest in an unhypothesized direction.

3. In both studies coaching interacted with sex on test-taking attitudes. Verbal analogy coaching improved the attitudes of males toward analogy tests, while number series coaching improved the attitudes of females toward series tests.

4. In the verbal analogies study, coaching increased confidence on analogy power tests but decreased confidence on the speeded tests. In the number series study coaching increased confidence on all series tests. It also interacted with sex and item type, resulting in greater confidence for females and greater confidence on number series problems.

5. In both studies coaching improved accuracy of test-taking confidence. In addition, coaching interacted with ability in the analogy study, resulting in greater accuracy for the high ability group. In the number series study coaching interacted with sex and item type, producing greater accuracy for females and on number series items.

6. Verbal analogy coaching increased analogy item solution time as measured both by number of items attempted on speeded tests and time spent on power tests. Significant interactions indicated this effect was strongest on the first posttest and for verbal analogies. Number series coaching decreased the time spent in solving series items as measured by the number of items attempted on speeded tests. This effect, which is the opposite of that hypothesized, was strongest on the delayed posttest and on number series items. Number series coaching also interacted with elapsed time until posttest on amount of time spent on series power tests.

7. The verbal analogy coaching group rated their coaching as most effective for verbal analogies, while the number series group evaluated their coaching as most effective for number series problems.
THE EFFECTS OF COACHING TO IMPROVE SCORES ON VERBAL ANALOGY AND NUMBER SERIES TESTS

By
John Henry Schweitzer

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY

Department of Counseling, Personnel Services and Educational Psychology

1971

ACKNOWLEDGEMENTS

I am grateful to the many persons who contributed to the successful completion of this study: to my chairman and friend, Dr. William Mehrens, whose unfailing faith in me gave me the confidence to persevere; to Dr. Robert Ebel, who provided me with the example of a distinguished teacher and scholar; to Dr. Arvo Juola, who first interested me in the general topic of my thesis; and to Dr. Norman Bell, who provided enthusiastic encouragement in the course of the study.

I am also grateful to the Center for Urban Affairs at Michigan State University and to its director, Dr. Robert Green, who generously supported the completion of the study, and to Dr. Larry Lezotte for his many creative suggestions.
To my mother, Eloise Schweitzer, I am grateful for teaching me the joy of problem-solving and the beauties of mathematics.

Finally, I want to acknowledge the educational example provided by my great granduncle, Dr. Paul Schweitzer, my grandfather, Dr. Henry Schweitzer, my father, Dr. Paul Schweitzer, and my brothers, Drs. Paul, Philip and Thomas Schweitzer.

TABLE OF CONTENTS

Chapter I    The Problem
    Need for the Study
    Purpose of the Study
    Objectives
    Definition of Coaching, Verbal Analogies and Number Series
    Rationale for Choice of Items
    Overview

Chapter II    Review of the Literature
    Introduction
    Historical View of Coaching
    Coaching and Ability
    Coaching and Sex
    The Effects of Coaching Over Time
    Similarity of the Coaching Material to the Test
    Coaching and Speed of Item Solution
    Coaching and Test-Taking Attitudes and Confidence
    Coaching and Test Validity
    Coaching and Test-Wiseness
    Summary

Chapter III    Design of the Study
    Sample
    Experimental Design
    Instrumentation
    Procedure
    Specific Coaching in Verbal Analogies
    Specific Coaching in Number Series
    Posttesting
    Data Preparation
    Hypotheses
    Statistical Analyses
    Summary

Chapter IV    Analyses and Results
    Coaching in Verbal Analogies
    Coaching in Number Series
    Summary
    Analogy Results
    Number Series Results

TABLE OF CONTENTS (cont'd.)
Chapter V    Summary and Conclusions
    Purpose
    Literature Review
    Results
    Discussion

Appendices
    Instruments
    Specific Coaching in Verbal Analogies
    Specific Coaching in Number Series
    Evaluation Form
    Tables of Means
    ANOVA Tables

Bibliography

LIST OF TABLES

Type and Number of Items, Time Limits, and Reliabilities of the Subtests of the Pretest, Immediate Posttest and Delayed Posttest
Correlations Between Analogy Test Scores and GPAs for the Coached Group in Verbal Analogies and the Control Group
Mean Evaluations by the Analogy Group of the Effectiveness of Their Coaching
Correlations Between Series Test Scores and GPAs for the Coached Group in Number Series and the Control Group
Mean Evaluations by the Series Group of the Effectiveness of Their Coaching
Summary of the Significant Main Effects and Interactions on All Dependent Variables for the Analogy and Series Experiments

LIST OF FIGURES

3.1 Experimental Design Showing the Independent Factors and the Measures Factors for the Analogy Study
4.1 Interaction of Analogy Coaching and Sex on Mean Analogy Scores
4.2 Interaction of Analogy Coaching and Item Type on Mean Analogy Scores
4.3 Interaction of Analogy Coaching and Degree of Test Speededness on Mean Analogy Scores
4.4 Interaction of Analogy Coaching and Sex on Attitudes Toward Analogy Tests
4.5 Interaction of Analogy Coaching and Degree of Test Speededness on Analogy Test-Taking Confidence
4.6 Interaction of Analogy Coaching and Ability on Accuracy of Analogy Test-Taking Confidence
4.7 Interaction of Analogy Coaching and Time of Posttest on Number of Speeded Analogy Items Attempted
4.8 Interaction of Analogy Coaching and Item Type on Number of Speeded Analogy Items Attempted
4.9 Interaction of Analogy Coaching and Time of Posttest on Time Spent on Analogy Power Tests
4.10 Interaction of Analogy Coaching and Item Type on Time Spent on Analogy Power Tests
4.11 Interaction of Number Series Coaching and Time of Posttest on Mean Series Scores
4.12 Interaction of Number Series Coaching and Item Type on Mean Series Scores
4.13 Interaction of Number Series Coaching and Sex on Attitudes Toward Series Tests
4.14 Interaction of Number Series Coaching and Sex on Series Test-Taking Confidence
4.15 Interaction of Number Series Coaching and Item Type on Series Test-Taking Confidence
4.16 Interaction of Number Series Coaching and Sex on Accuracy of Series Test-Taking Confidence
4.17 Interaction of Number Series Coaching and Item Type on Accuracy of Series Test-Taking Confidence
4.18 Interaction of Number Series Coaching and Time of Posttest on Number of Speeded Series Items Attempted
4.19 Interaction of Number Series Coaching and Item Type on Number of Speeded Items Attempted
4.20 Interaction of Number Series Coaching and Time of Posttest on Time Spent on Series Power Tests
4.21 Interaction of Type of Coaching and Item Type on Evaluations of the Effectiveness of the Coaching

CHAPTER I

THE PROBLEM

NEED FOR THE STUDY

Testing is an integral part of today's society. Everyone in this country is required to take tests beginning in childhood and continuing up through adulthood. These tests play a crucial role for the individual, especially in the areas of education and employment. The amount of formal education an individual can obtain, the kind of education he receives, the kind of occupation he is encouraged to undertake, the decision as to whether he is hired for a particular job, and very often, the decision to promote him are all directly dependent on how he performs on the multitude of tests which he must take.

Due to the great use of tests in decision-making in our society, tests and testing have come under attack from many quarters for many reasons. One assumption on which these attacks are based is that tests are unfair in some ways to certain individuals.
It is often argued, for example, that some people lack the test-taking skills that would allow them to reflect an accurate picture of their true ability or achievement level.

In the 1960's in the United States the rise of the civil rights movement and increased concern for disadvantaged sectors of the population led many to question the validity of tests for these groups. This questioning was based in part upon speculation that these groups lack proper test-taking skills and attitudes, and that, therefore, their obtained scores do not reflect their true aptitude and achievement. This has led to a renewed interest in coaching and its effect on test scores.

Also contributing to the reawakening of interest in improving test scores by special training has been the move toward performance contracting by American public schools. The performance which is guaranteed in many cases is usually measured by standardized tests. Large amounts of money ride on small differences in test scores. Therefore, the contractors are interested in the use of coaching to raise test scores, and school systems are concerned with the effect of coaching on the validity of the tests.

Still another contribution to the reawakening of interest in coaching to improve test scores has been the controversy sparked by Jensen (1969) dealing with the hereditability of intelligence. Obviously, if coaching in test-taking skills and item-solving techniques can significantly raise scores on intelligence tests, then environment and past experience must be important determinants of intelligence test scores.

In spite of this renewed interest in coaching, as Stanley (1970, p. 6) points out in discussing coaching to improve test scores, "not enough of this has been done yet in a rigorous way and reported." Connolly and Wantman (1964, p. 64) suggested that "It would be interesting to investigate the effect of specific training on analogies items."
To date, the only study in this area was conducted by Moore (1971). He found that coaching could improve analogy test performance. Millman and Pauk (1969, p. 72) state that "the verbal analogy items and the number and configuration series items seem to put the unsophisticated test taker at the greatest disadvantage." However, they do not present any evidence for this. Thus, it seems clear that there is a need for more studies which investigate the effect of coaching on these specific item types and also which look at other variables in relation to coaching.

Very few studies have focused on the effects of coaching on a particular type of item. Most coaching studies have used as a posttest measure a standardized test containing a number of different item types in different content areas. It may be that the specific effects of the coaching cannot be accurately evaluated by using such general measures.

In addition, a great many coaching studies have ignored important related variables that might either be affected by the coaching or have an effect on the coaching. Four variables that could influence the effectiveness of the coaching are the initial ability level of the person being coached, the sex of the person being coached, the length of time between coaching and testing, and the similarity of the coaching material to the test material. Variables that the coaching itself may influence besides the score on the test are the speed with which the individual solves the items on a power test, the number of items attempted on a speeded test, the attitudes of the person toward the test and the self-confidence of the individual in taking tests. Finally, there is a need for current empirical studies of the effect of coaching on test reliability and validity.
The inconclusive and often contradictory findings of the studies of coaching reported to date and the many variables related to coaching that have been uncontrolled or unexamined speak to the necessity of conducting more carefully controlled and designed experiments in this area.

PURPOSE OF THE STUDY

It is the purpose of this study to investigate the effects of specific coaching in the solution of two types of items, verbal analogies and number series. The coaching is evaluated with respect to its influence on the number of items solved correctly, the speed of solution, the testee's attitude toward solving the items, his self-confidence in his ability to solve the items, and the accuracy of his self-evaluation of success in solving the items. The possible transfer effects of the coaching to similar item types and the effect of the coaching over time are investigated. The study also examines the possible influences of the entering ability level and sex of the person being coached upon the success of the coaching. Finally, the study determines the effect of coaching on the concurrent validity of verbal analogy and number series tests.

OBJECTIVES

The major objectives of this study are to:

1. Determine the effect that coaching in item solution techniques has upon subsequent test performance and determine the relationship of such an effect to initial ability, sex, elapsed time until posttest, similarity of the test items to the coached items, and degree of speededness of the test.

2. Determine the effect of coaching on test-taking attitude and the relationship of such an effect to initial ability, sex, elapsed time until posttest, similarity of the test items to the coached items, and degree of speededness of the test.

3. Determine the effect of coaching upon the amount of test-taking confidence and determine the relationship of such an effect to initial ability, sex, elapsed time until posttest, similarity of the test items to the coached items, and degree of speededness of the test.

4. Determine the effect of coaching on accuracy of test-taking confidence and determine the relationship of this effect to initial ability, sex, elapsed time until posttest, similarity of the test items to the coached items, and the degree of speededness of the test.

5. Determine the effect of coaching on the number of items attempted on speeded tests and determine the relationship of this effect to initial ability, sex, elapsed time until posttest, and similarity of the test items to the coached items.

6. Determine the effect of coaching upon time spent on a power test and determine the relationship of such an effect to initial ability, sex, elapsed time until posttest, and similarity of the test items to the coached items.

7. Determine the effect of coaching upon the concurrent validity of the test.

DEFINITION OF COACHING, VERBAL ANALOGIES AND NUMBER SERIES

The term coaching has had different meanings throughout the history of aptitude testing. In the 1920s coaching often meant teaching the actual items contained on the test. The British in the 1950s used the term to mean familiarizing the testees with the kinds of items and materials contained in the test. The word coaching as used in this study means explaining to the subjects what a verbal analogy or number series item is, showing examples of the type of item, teaching rules that can be used to solve the item, and giving practice in applying those rules. The coaching does not include teaching solution techniques that would apply solely and specifically to items in the posttests, nor does it involve any reference or relation to actual items in the posttests.
The coaching was designed to increase scores on all verbal analogy tests and all number series tests, and was not pointed toward any specific tests.

A verbal analogy is defined as an item consisting of four words bearing some relationship to each other. Only the first two or three words are supplied. The task is to determine the relationship between the first two words and then pick a word or pair of words from a list of four or five alternatives so that the second pair of words bears the same relationship to each other as does the first pair.

A number series item is defined as a list of numbers which bear some relationship to each other so as to form a definite pattern or series. The task is to determine the pattern involved and then to pick from a list of alternatives the next number or next two numbers that continue the series.

RATIONALE FOR CHOICE OF ITEMS

This study investigates the effect of coaching upon the ability to solve two types of items, verbal analogies and number series. These two item types were selected for the study for a number of reasons. In the first place, as Millman and Pauk (1969) have pointed out, these two types of items leave the unsophisticated test taker at the greatest disadvantage and would, therefore, be most susceptible to coaching. Secondly, both types of items are found in a wide range of aptitude and intelligence tests used to predict scholastic success. Verbal analogies are used in the Miller Analogies Test, the Scholastic Aptitude Test, the Graduate Record Examination, the Medical College Admissions Test, the School and College Ability Tests, the Lorge-Thorndike Intelligence Tests, the Otis-Lennon Mental Ability Test, the Henmon-Nelson Tests of Mental Ability, and in many others.
Number series items are found in the Lorge-Thorndike Intelligence Tests, the Otis-Lennon Mental Ability Test, the Henmon-Nelson Tests of Mental Ability, the Analysis of Relationships, the Short Form Test of Academic Aptitude, the California Test of Mental Maturity, and in many other tests of intelligence and academic aptitude. A third reason for the choice of these particular item types is that they represent each side of the classical dichotomy between verbal and non-verbal abilities. A comparison of the effect of coaching on each type of item is made. The final reason for making this selection is that for each item type there exists a similar item type which can be used in investigating the breadth of influence of coaching in the solution of a specific item type. This study investigates the effect that coaching in verbal analogies has on the solution of analogies composed of geometric figures and the effect that coaching in number series items has on configuration series items.

OVERVIEW

In Chapter II most of the previous coaching studies reported in the literature are reviewed. Some coaching studies have considered only the effect of coaching on total test scores. The review in Chapter II discusses these, but it emphasizes those studies which, in addition to looking at total test scores, have considered other variables that could have an effect on or be affected by the coaching. The literature review is so organized as to focus on the uniqueness of this study.

Sample description and selection procedures, experimental design, hypotheses and statistical analyses are presented in Chapter III. In addition this chapter contains a section explaining the coaching procedures used in the study and a section describing the instruments used to test the effects of the coaching.
The results of the statistical tests are presented in Chapter IV, and in Chapter V the results are discussed and interpreted, implications are considered, and conclusions and summaries are given.

CHAPTER II

REVIEW OF THE LITERATURE

INTRODUCTION

This study, while not a replication of any previous research, is nevertheless based on the findings of other investigators. Although the literature is not saturated with studies investigating the effects of coaching upon test performance, there have been a rather substantial number of investigations in the area. In this chapter the pertinent studies are reviewed and critiqued in order to develop a rationale for the present study.

Previous coaching studies are examined under a number of different topics. In the first section of this chapter a brief history of coaching studies is presented. This is followed by two sections examining previous findings on the relationship of the ability and sex of the person being coached to the efficacy of the coaching. The next two sections deal with the effects of coaching over various lengths of time and the effects of coaching on tests of various degrees of similarity to the coaching material. These are followed by sections relating coaching to speed of item solution, test-taking attitudes and confidence, and test validity. The final section compares test-wiseness studies and coaching studies.

HISTORICAL VIEW OF COACHING

One of the earliest references to coaching in the literature was in an article by Thorndike in 1919. In this paper, Thorndike discussed how tests might be misused through coaching, and he presented possible remedies for the situation. At the time of publication of Thorndike's paper, the intelligence testing movement was only 14 years old. Before the development of the first Binet-Simon Scale in 1905, psychologists had concentrated mainly on measuring sensory discrimination and reaction time.
During the first two decades of this century, the various revisions of the Binet-Simon tests attracted wide attention among psychologists. However, interest in coaching to improve scores on the test seems to have been nonexistent during this period. With the onset of World War I, a group of American psychologists was faced with the problem of determining the general intellectual level of a million and a half recruits. The result was the first group intelligence test, known as the Army Alpha. After the war, the release of this test for civilian use gave a tremendous spurt to the growth of the testing movement. During the testing boom of the twenties, when objective tests were called "new-type" tests, numerous studies on the influence of coaching upon test scores began to appear in the literature.

The coachability of both group and individual intelligence tests was investigated by the early coaching experimenters. They were interested in the lasting effects of coaching over time, the comparative coachability of various types of intelligence tests, and the transfer effects of coaching to other intelligence tests. The findings of these investigators are presented later in the chapter.

Following the great spurt of coaching studies in the twenties, interest in this area began to wane. The next two decades produced very little in the way of coaching studies. However, in the 1950s the educational system in Great Britain began to use group intelligence tests to determine the educational future of 11-year-old students. Because of the importance of the decision for the individual and the school, both students and educators became interested in the efficacy of coaching to improve test scores. A spate of articles dealing with coaching again began to appear in the literature.
In the United States in the late 1950s and early sixties the increased demand for college acceptance led colleges to rely more heavily on standardized aptitude tests such as the Scholastic Aptitude Test. This in turn led to an interest in whether these tests were susceptible to coaching, and coaching studies again began to appear in the literature. The civil rights movement, increased concern for the disadvantaged, the issue of hereditability of intelligence and the continued proliferation of testing in our society have all combined to keep the issue alive.

Some recent investigations (e.g., Millman, 1966; Juola, 1969; Slakter, et al., 1970; Oakland, 1971) have focused on measuring test-wiseness or test-taking ability, and on determining its influence on aptitude and achievement test scores. Others (Roberts and Oppenheim, 1966; Millman and Pauk, 1969; Juola, 1969; Moore, 1971) have focused their attention on coaching or teaching people how to take tests.

COACHING AND ABILITY

Some of the first investigators of coaching on aptitude tests speculated that the coaching would actually increase the intelligence or ability of the person being coached, but this view is no longer very popular. The section of this chapter on coaching and validity contains a fuller discussion of the relation between coaching and true ability as measured by test performance. When the effect of ability on the outcome of coaching is examined, no real consensus is achieved. Many investigators have attempted to determine whether coaching is more effective for high or low ability test-takers, and they have reached varying conclusions.

In 1928 Casey trained a group of first grade pupils for eight hours to solve items similar to those on the Stanford-Binet, and following the training program the Stanford-Binet was administered to all the subjects.
The training group scored higher than did the control group, but the students in the training group with lower original mental ages made greater gains than did the higher mental age students.

Somewhat different findings were reported by Harter (1928). She trained a group of higher IQ subjects to recognize similarities in pairs of words and a group of lower IQ subjects to recognize differences in pairs of words. The higher IQ group performed significantly better than a control group, but there were no differences for the lower IQ group. This finding that the higher IQ group benefited most from training is not clear-cut, since the nature of the training and the tasks were different for the two groups.

Wiseman and Wrigley (1953) coached a group of British school children in methods of taking the Moray House Tests. The coaching technique was based on a published text containing information on how to take tests. They found that the effect of coaching was dependent on level of initial ability. The lower the initial IQ, the greater the gain due to the coaching.

James (1953) investigated a school district in Great Britain which introduced official coaching for all students before taking the national examinations. By comparing the distribution of IQ scores before and after the introduction of the official coaching program he found that the average gain of about five IQ points was evenly distributed across the range of IQs. Low, middle, and high ability students all benefited equally from coaching.

In a study of the effects of coaching on the Moray House Intelligence Tests, Dempster (1954) found pretest-posttest correlations of over .93 for the coached group. He interpreted this as indicating that the coaching was equally effective at all levels of ability.
Vernon (1954), at the conclusion of a symposium on the effects of coaching and practice in intelligence tests, stated that the evidence showed that bright children benefit more from practice and dull ones improve more from coaching.

Spielberger (1959) studied improvement on the Miller Analogies Test (MAT) due to practice. The correlation of -.50 between initial ability level and gain scores was undoubtedly due to regression toward the mean. Colver and Spielberger (1961) reported a correlation of .86 between initial and final scores on the MAT, indicating equal gain for all ability levels. Coladarci (1960) analyzed MAT score changes and found that middle ability persons gained more than high or low ability persons.

French (1955) reported a large scale study of the effectiveness of coaching to improve Scholastic Aptitude Test (SAT) scores conducted in three public schools. All the seniors in the schools took the SAT in September and were retested with another form of the test in March. Pupils in school A received no coaching; pupils in school B received a total of three hours of coaching in vocabulary and reading comprehension and one and one half hours of practice with sample SAT questions. In school C the pupils received 20 coaching sessions in solving both math and verbal exercises similar to those contained in the SAT. One of the findings of the study was that in the coached group greater gains were made by those of higher ability.

Schubert (1967) studied the effect of training on the performance on the Block Design Subtest of the Wechsler Intelligence Scale for Children. He found that the gain after training correlated positively with initial IQ.
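
Several of the studies above relate gain scores to initial scores, and the regression explanation offered for Spielberger's (1959) correlation can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the model (observed score = true score + independent error, with every subject given the same practice gain), the function names, and all numeric parameters are hypothetical and are not taken from any of the studies cited.

```python
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def expected_gain_correlation(true_sd, error_sd):
    """Correlation between first score and gain implied by the assumed model.

    cov(first, gain) = -error_sd**2, var(first) = true_sd**2 + error_sd**2,
    var(gain) = 2 * error_sd**2.
    """
    return -error_sd / (2.0 * (true_sd ** 2 + error_sd ** 2)) ** 0.5


def simulate_gain_correlation(n=5000, true_sd=10.0, error_sd=7.0,
                              practice_gain=5.0, seed=0):
    """Simulate two testings in which everyone truly gains the same amount.

    All parameter values are illustrative assumptions, not data.
    """
    rng = random.Random(seed)
    first_scores, gains = [], []
    for _ in range(n):
        true_score = rng.gauss(50.0, true_sd)
        first = true_score + rng.gauss(0.0, error_sd)
        second = true_score + practice_gain + rng.gauss(0.0, error_sd)
        first_scores.append(first)
        gains.append(second - first)
    return pearson(first_scores, gains)


if __name__ == "__main__":
    print(round(simulate_gain_correlation(), 2))
    print(round(expected_gain_correlation(10.0, 7.0), 2))
```

Under these assumptions every simulated subject improves by the same true amount, yet the correlation between initial score and gain is clearly negative (the formula gives about -.41 for the illustrative defaults). A negative correlation between initial level and gain is therefore not, by itself, evidence that low scorers benefit more.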
In summarizing these studies of the effect of coaching on test performance at different ability levels, it should be noted that three investigators found most improvement at high ability levels, one reported greatest improvement at the middle level, three concluded that the low ability levels improved the most, and three investigators found equal improvement at all ability levels. It is clear that no conclusive statement can be made regarding the relationship of entering ability level and the effectiveness of coaching.

COACHING AND SEX

The earliest studies of the effect of coaching did not control for the sex of the person being coached, and thus no comparison was made as to the relative effectiveness of coaching for boys and girls. In the 1950s the British psychologists who were investigating coaching in relation to their national exams were the first to study the possible interaction effects between coaching and the sex of the individual being coached.

In 1953, James reported on the effects of an official coaching program conducted by a school district in Great Britain. All children who were to take the national examinations received coaching, and the results of the testing were compared with the norms from the previous year. It was found that all students benefited from the coaching, but that girls gained more than boys.

In another study of coaching, Dempster (1954) specifically designed his coaching to improve scores on the Moray House Intelligence Tests. He found that in the two verbal tests the girls gained more than the boys, but on a non-verbal mental ability test, boys gained more than the girls. Dempster did not attempt to explain these mixed findings.

In a symposium designed to bring together all the known findings on the effects of coaching, Vernon (1954) concluded that at the time the relationship of coaching and sex had not yet been determined. Some studies showed girls profiting more from coaching and others showed the reverse.
In 1957, Heim and Watts conducted a study in which subjects were coached on methods of solving highly speeded spatial problems. They found that the coaching was more effective for girls than for boys.

Schubert (1967) conducted a study designed to evaluate the effect of training on performance on the WISC Block Design Subtest. He found equal improvement for both boys and girls.

One study in this country that took into account the sex of those receiving the coaching was conducted by French in 1955. Students in three public schools took one form of the SAT in September of their senior year and received another form in March. Students at school A received no coaching. At school B the students received four and one half hours of coaching in vocabulary, reading comprehension and sample verbal questions from the SAT. At school C the students received 20 periods of coaching using verbal and mathematical items. French found that at both school B and school C the girls gained more than did the boys on SAT verbal scores when compared with their counterparts at school A who received no coaching. For the SAT mathematical scores the effect of coaching with respect to sex was not so clear-cut. The gain scores of the boys and girls at school C, the only school to receive mathematical coaching, were compared to schools A and B. Compared to school A, the girls at school C seemed to benefit more from the mathematical coaching than did the boys. The effect was just the opposite when school C scores were compared to school B. When the scores were broken down both by sex and whether the students had been studying mathematics during the year, an interesting finding was observed.
In comparing the gain scores of students in school C to school A and school B, it was found that in both comparisons girls taking mathematics benefited much more from the coaching than did girls not taking mathematics, but that boys not taking mathematics benefited more than boys who were studying math. French did not speculate on the possible causes of these findings or on their significance.

It is interesting to compare French's findings to those of Dyer (1953). Dyer instituted a coaching program using verbal and mathematical materials corresponding to the two sections of the SAT at an independent preparation school for boys. In comparing the scores received by the boys who were coached to scores of the boys in a similar school who received no coaching, Dyer found that the verbal coaching had a minimal effect, but that the mathematical coaching was much more effective for boys who were not taking math courses in their senior year. This finding was similar to what was reported by French (1955).

It seems true that the conclusion reached by Vernon in 1954 that the relationship between coaching and sex had not yet been determined remains valid today.

THE EFFECTS OF COACHING OVER TIME

Some of the very first investigators of the effects of coaching upon test performance were interested in how well the results of the coaching would hold up over time. In 1928 Casey reported a study in which an experimental group of first grade children were coached on material similar to items on the Stanford-Binet. A control group was not given any coaching. The children were tested before the coaching, immediately after the coaching, and six weeks after the coaching was completed. The coached group showed greater gains than did the control group on the immediate posttest and this effect was not lost after six weeks.
In a similar experiment Davidson (1928) found that training on four tasks similar to the Stanford-Binet tests resulted in greater gains in these particular tasks for the coached group over the control. In a follow-up test three months afterwards it was found that the results were in evidence to a lesser degree. The effect of the training seemed to diminish over time.

In still another study involving coaching for the Stanford-Binet, Greene (1928) used three groups. One group was coached on the actual items from the Stanford-Binet, a second group received coaching on similar but not identical material, and the third group got no coaching at all. Posttesting took place immediately after the coaching, after three months, after a year, and after three years. The group trained on the actual items obtained the highest scores on the immediate posttest, and the similar group surpassed the control group. After three months the differences were still evident. After one year had elapsed there were only slight differences, and after three years the differences had disappeared entirely.

Dempster (1954) reported positive results from coaching to increase scores on the Moray House Intelligence Tests. In evaluating his results he reported that the effect of coaching soon wears off, but he provided no evidence for this and did not give any estimate of the time period in which the effect would disappear.

In summing up these studies on the effect of coaching over time, it would seem reasonable to expect that the effect would begin to diminish with the passage of time. All the available evidence indicates that this is true. However, no recent study of coaching has looked at this variable, and it has never been examined in relation to specific coaching for specific types of items.
SIMILARITY OF THE COACHING MATERIAL TO THE TEST

The amount of similarity between the material upon which one is coached, including any practice items, and the actual items making up the criterion test undoubtedly has an effect on the efficacy of the coaching. To look at the extreme cases, one would expect great improvement after coaching on the actual items contained in the posttest. On the other hand, there would be no reason to expect any increase in scores if the coaching material were completely dissimilar from that used in the posttest. The variable of similarity of the coaching material to the test material has been investigated in a number of studies but it is difficult to compare them because of the lack of any common definition of degree of similarity.

In a smaller investigation that grew out of a large scale coaching study, French (1955) examined the effect of coaching students on the identical questions that appeared on the SAT posttest. He found that coaching on identical items produced average increases 47 points higher on the verbal score and 15 points higher on the mathematical score than the average increases produced by coaching without the benefit of identical items.

In 1928 Greene reported the results of a study investigating coaching of seven-year-old children on the Stanford-Binet. She investigated three groups: the coached group was trained on the actual test material; the similar group was trained on material similar, but not identical, to the actual test material; the control group received no training at all. The coached group achieved markedly higher scores than the other two groups on an immediate posttest administration of the Stanford-Binet. The similar group had scores falling between the coached group and the control group.

Harter (1928) coached students on identifying similarities of pairs of words presented in a list.
The test was composed of pairs of words identical to the training list, pairs of words similar to the training words, and pairs of words bearing no direct similarity to the training list. She found that the effect of the training was greatest on the identical pairs, somewhat less on the similar pairs and least on the pairs having no similarity.

Davidson (1928) trained third grade children on tasks similar to the Stanford-Binet test material. She found some indication of a transfer of coaching effect from digit span to memory span for letters.

In the study reported in 1954, Dempster coached a group of boys and girls using materials specifically designed to raise scores on the Moray House Intelligence Tests. As a part of the study he administered verbal and nonverbal tests which were unlike the Moray House Tests. He found that overall, coaching on dissimilar material is less effective than coaching on similar material.

Roberts and Oppenheim (1966) administered coaching materials specifically designed to increase scores on the Preliminary Scholastic Aptitude Test (PSAT) to a group of educationally disadvantaged high school students in Tennessee. They also administered the Sequential Tests of Educational Progress (STEP) Level 3 Reading and Mathematics tests. The results of the coaching on the PSAT scores tended to be small but statistically significant. The authors did not report the scores on the STEP tests so it was impossible to evaluate the effect of the coaching for a specific test upon a different but similar test.

These studies seem to clearly indicate that coaching is most effective when the coaching material is most similar to the test material. As the coaching material becomes less similar to the test material, the effectiveness of the coaching generally tends to diminish.
COACHING AND SPEED OF ITEM SOLUTION

The effect of coaching on the speed with which the test-taker solves the items on the test has not yet been systematically investigated and reported in the literature. It certainly is an important variable in view of the rather speeded nature of most current aptitude and intelligence tests.

Two authors in the past decade have to some extent speculated on the role of speed in solving analogies. Willner (1964) investigated the reasoning processes that one undergoes in reaching the correct solution in a verbal analogy. He found that many persons merely choose the alternative most closely associated with the third word of the analogy, and this often leads to the correct answer. However, he went on to state that to correctly solve analogies, a subject must develop and maintain a complex set. He may be distracted from this set by an association, and this may be especially true in timed tests. Another author who studied speed of solving analogies was Moore (1966). He investigated the relationship between time spent in solving the four sample analogy problems in the Miller Analogies Test (MAT) booklet and the raw score received on the MAT. He found a correlation of -.40 between these variables and speculated that the relationship could be due to the fact that examinees taking longest to complete the sample items are those who have most difficulty in understanding the analogy solving procedure.

French (1955) has been the only researcher to date to report on the effects of coaching on speed. He used the number of items attempted on the SAT as a measure of the student's speed and found that coaching resulted in students working more slowly. He felt that this was a surprising finding and attributed it to the gain of knowledge and confidence which served to increase caution on the part of the coached students and cut down on guessing.
This finding suggests that coaching might have a differential effect on speed and power tests. If coaching increases the time spent per item, but also increases accuracy, it should have its most beneficial effect on a power test where there is enough time to attempt all the items. On a highly speeded test coaching should have a less positive or even a negative effect since the increased accuracy of the coached students could be offset by the greater number of items attempted on the part of the uncoached students.

COACHING AND TEST-TAKING ATTITUDES AND CONFIDENCE

Psychometricians often speak of the importance of test-taking attitudes and confidence in obtaining high scores on aptitude and achievement tests. Millman and Pauk (1969) implied that positive attitudes and a moderate amount of confidence can result in higher test scores. Anastasi (1969) stated that individuals with prior testing experience have certain advantages which include more self-confidence and better test-taking attitudes. In a study of test-taking attitudes among university and high school students, Cunningham (1966) found that positive test-taking attitudes are related to high achievement.

Even though it is widely accepted that attitudes and confidence are important variables in test-taking, none of the coaching studies to date have systematically investigated the effect of coaching in these areas. However, a number of investigators have speculated on the effect that coaching had on the attitudes of those receiving the coaching. For example, Casey (1928, p. 433) in a study of coaching for the Stanford-Binet reported that, "the training seems to have given the children something in the way of an attitude or interest that enables them to gain in mental age." Harter (1928) also felt that it was possible to impart to children through drill and coaching an attitude or method of approach to the problem which enabled them to perform better.
In 1938, Vernon found that a group of students gained eight IQ points after taking a four week testing course. He attributed the increase to greater test sophistication which he defined as including a better subjective attitude. Wiseman (1954) felt that test sophistication is the most important single element involved in improvement from practice or coaching.

In his large scale study on coaching high school students for the SAT, French (1955) implied that the coaching served to increase the amount of confidence the students had in taking tests. Finally, Oakland (1971) reported that teaching test-wiseness skills to preschool disadvantaged children seemed to give them more confidence and better attitudes toward testing, although no formal attempt was made to assess these variables.

In spite of all the speculation, no study to date has attempted to objectively measure the effect that coaching has on test-taking attitudes and confidence.

COACHING AND TEST VALIDITY

The issue of the effect of coaching upon the validity of the test is a complex one and can be approached from several different angles. Anastasi (1968) introduced the concept of breadth of influence in discussing the problem. She stated that coaching and education both tend to raise scores on a test. However, coaching might merely raise a test score "without appreciably affecting the behavior domain that the test is designed to predict" (Anastasi, 1968, p. 567). Anastasi argues that coaching in this sense tends to invalidate the test.

An alternative argument holds that some persons use highly sophisticated test-taking techniques which give them an advantage over their less sophisticated classmates. Certain persons or groups lack test-taking skills and therefore are unable to perform at the level indicated by their true aptitude and achievement.
Those holding this point of view argue that coaching in test-taking skills should actually increase the validity of a test by providing everyone with an opportunity to demonstrate his competence. Ebel (1965, p. 206) pointed out that "more error in measurement is likely to originate from students who have too little, rather than too much, skill in taking tests." Vernon (1954) stated that the reliability and validity of intelligence tests are lowered when some children have coaching and others do not. He concluded that the available evidence pointed to the fact that coaching and practice for all would make tests more valid and more reliable.

Gulliksen (1950) has suggested the development of tests which have intrinsic validity. He argues that use of such tests would eliminate the problem of coaching and test validity since coaching would improve both test performance and criterion performance, and in fact coaching would be recommended for everyone.

A review of the literature tends to support the view that coaching increases the validity of tests, although this is not because most tests have intrinsic validity. It appears that other factors are operating to produce the phenomenon. Richardson and Robinson (1921) found the correlation between the Army Alpha Test and college grades increased as the subjects had more practice in taking the test. Glick (1925) correlated scores on the Army Alpha Test with semester grades both before and after a coaching and practice session with the test. He found that the correlations were higher for four groups of students after the coaching. The four groups were college students, high school students, junior high school students, and seventh and eighth grade students. Although in no case was the increase in correlation statistically significant, he concluded that performance on an intelligence test after coaching is more predictive of school success than when no practice is given.
Ortar (1960) conducted a unique experiment in which she demonstrated that correlation of scores on a specially devised test with students' grades could be increased by coaching. She was concerned with assessing the educational aptitude of new immigrant children to Israel. These children came from different countries with markedly different social and cultural backgrounds. Most had not had any educational experience, and none of the available tests seemed suitable for measuring their learning potential. She therefore devised a new test utilizing tasks equally unfamiliar to all children. These tasks involved superimposed stencils from which a pattern was to be constructed. The test consisted of three parts. The first one was given and scored like any other test. In the second part a number of specially chosen items were administered and if the subject had difficulty with any one, the examiner explained the principles involved using the subject's own method of approach. The third part of the test, containing items using similar principles, was then administered. Ortar reported that scores on the third part of the test, after coaching, correlated substantially higher with grades in academic subjects than did scores on the first part before the coaching.

The test devised by Ortar actually includes a learning situation. It seems logical that children who could profit most from the coaching on the test would be the ones most likely to gain most from classroom instruction. This viewpoint could explain the higher correlations after coaching. The very qualities that enable one to benefit from the coaching (motivation, intelligence, and docility) are the same qualities that result in high academic achievement.

COACHING AND TEST-WISENESS

Recently in the literature there has appeared a series of test-wiseness studies which are related to the coaching studies of old, but which approach the problem from a different tack.
The concept of test-wiseness has been around for many years. It was first mentioned in print by Vernon in 1938 in an article on intelligence test sophistication. Since then many authors have written of test-sophistication or test-wiseness and the opposite concept of test naiveté or test blindness. Few writers have bothered to define the concept, although it generally means the ability to take tests and to score as high as or even higher than would be warranted by true aptitude or achievement. Lack of test-wiseness could result in a lower score than warranted.

Millman, et al., (1965) attempted to analyze, list, and categorize test-wiseness techniques. They recommended the developing of measures of test-wiseness. A number of investigators (Gibb, 1964; Millman, 1966; Juola, 1969; Slakter, et al., 1970) have attempted to develop and/or validate measures of test-wiseness. Others have attempted to teach test-wiseness principles (Wahlstrom and Boersma, 1968; Wahlstrom, 1968; Moore, et al., 1966). The teaching of test-wiseness principles in order to increase scores on aptitude or achievement tests is not much different from what was called coaching in earlier studies. Some current studies which speak of teaching test-wiseness are really classical coaching studies (Oakland, 1971; Moore, 1971). Those who study test-wiseness and those who are concerned with coaching are both really interested in different ends of the same continuum. Teaching test-wiseness usually refers to imparting techniques useful in a large variety of tests, and coaching generally refers to teaching test-taking techniques applicable only to a specific test or type of item.

SUMMARY

In this chapter the effects of coaching have been examined in relation to a series of important variables. No conclusions can be made as to the relationship of entering ability level and the effectiveness of coaching.
Three of the studies reviewed showed coaching to be most effective for high ability levels, three others found it most effective for low ability levels, and one reported greatest improvement at middle level. In addition, three studies showed no differential effects of coaching at different ability levels. The findings relating sex to coaching are equally inconclusive; some investigators have found girls most susceptible to coaching, others concluded boys were most coachable, and still others got mixed results or no difference. There is general agreement in the literature that the effects of coaching tend to diminish with the passage of time. There is also a consensus that coaching is most effective when the coaching material is very similar to the test and becomes less effective as the dissimilarity between the coaching material and the test increases.

No studies have been designed to specifically examine the effect of coaching on speed of taking tests. However, one study indicates that coaching may increase the time spent in solving items. Similarly, no study has examined the effect of coaching upon test-taking attitudes and confidence, although a number of authors have speculated that coaching improves attitudes and confidence.

Several investigations in the 1920s and one recent study in Israel found that coaching improved test validity as measured by the correlation of test scores with school success. There have been no recent empirical studies of the effect of coaching on test validity in this country.

Some recent investigators have been more interested in teaching and measuring test-wiseness. It is felt that there is no clear-cut distinction between the two kinds of studies.

CHAPTER III

DESIGN OF THE STUDY

This chapter contains descriptions of the sample, the experimental design, the instrumentation, the procedures followed in collecting and preparing the data for analysis, the research hypotheses, and the statistical analyses.
SAMPLE

The sample consisted of 88 freshman students at Michigan State University who had enrolled as first-time freshmen in the fall term of 1970. About 200 potential subjects were randomly selected from the enrollment list and were contacted by mail. They were asked to serve as paid subjects in an experiment to see whether test-taking skills could be taught. Approximately 130 students volunteered to participate. Of these, 88 were selected on the basis of their availability during the time the experiment was conducted.

The subjects had all taken the Scholastic Aptitude Tests and a battery of placement tests as part of the normal procedure for incoming freshmen at MSU. They undoubtedly had taken other aptitude and achievement tests during their high school period, so they could be described as generally sophisticated with respect to test-taking.

EXPERIMENTAL DESIGN

The 88 subjects were randomly assigned to one of two main groups, one group receiving coaching in verbal analogies and the other receiving coaching in number series. Each group was pretested and posttested with identical instruments containing both analogy subtests and series subtests. The study could really be viewed as two separate experiments, one examining the effects of coaching on verbal analogies and the other looking at the effects of coaching on number series. In the first experiment the group receiving analogy coaching was the treatment group and the other group was the control; the dependent variables were the scores on all the analogy subtests. In the second experiment the experimental and control groups were reversed and the dependent variables were the scores on the series subtests.

The experimental design for the analogies experiment is presented in Figure 3.1. The design is exactly the same for both experiments except that the experimental and control groups are interchanged.
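
The ability factor in this design comes from post hoc blocking: a split at the grand median of the pretest scores, applied within each sex and treatment group, followed by random elimination of subjects until every cell is the same size. A minimal Python sketch of a procedure of this kind is given here for illustration only. It is not the procedure as actually carried out in the study; the function name `block_and_balance`, the record fields, and the handling of an odd number of scores tied at the median are all assumptions.

```python
import random
from statistics import median


def block_and_balance(subjects, seed=0):
    """Median-split subjects into ability levels and equalize cell sizes.

    subjects: list of dicts with hypothetical keys
              'id', 'group', 'sex', and 'pretest'.
    Returns (grand_median, cells), where cells maps
    (group, sex, level) -> list of retained subjects.
    """
    rng = random.Random(seed)
    grand_median = median(s["pretest"] for s in subjects)

    groups = sorted({s["group"] for s in subjects})
    sexes = sorted({s["sex"] for s in subjects})
    cells = {(g, x, level): []
             for g in groups for x in sexes for level in ("high", "low")}
    tied = {(g, x): [] for g in groups for x in sexes}

    for s in subjects:
        key = (s["group"], s["sex"])
        if s["pretest"] > grand_median:
            cells[key + ("high",)].append(s)
        elif s["pretest"] < grand_median:
            cells[key + ("low",)].append(s)
        else:
            tied[key].append(s)

    # Split scores tied at the grand median evenly between high and low
    # within each sex-by-group combination. Assumption: when the number
    # of ties is odd, the extra subject goes to a randomly chosen level.
    for key, members in tied.items():
        rng.shuffle(members)
        half = len(members) // 2
        if len(members) % 2 == 1 and rng.random() < 0.5:
            half += 1
        cells[key + ("high",)].extend(members[:half])
        cells[key + ("low",)].extend(members[half:])

    # Randomly drop subjects so every cell matches the smallest cell.
    n_per_cell = min(len(members) for members in cells.values())
    balanced = {key: rng.sample(members, n_per_cell)
                for key, members in cells.items()}
    return grand_median, balanced
```

With two groups, two sexes, and two ability levels the sketch yields eight cells, so a smallest cell of nine subjects would leave 8 x 9 = 72 subjects in the balanced design.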
At the time of the pretest, the subjects had already been randomly assigned to one of the two major groups. The grand median for all 88 subjects on the pretest was determined and post hoc blocking was employed to develop the high and low ability groups. Assignment to high and low ability level was based on the grand median of all subjects of both sexes in both groups with scores falling at the grand median equally split between the high and low ability level within each sex in each treatment group. At the completion of this procedure it was found that the smallest cell contained nine subjects. Subjects were randomly eliminated from the other cells to obtain a balanced design with nine subjects per cell. An average of two subjects per cell, or a total of 16 subjects, were lost by this procedure. This procedure was repeated for the second experiment on the basis of a number series pretest, and again a balanced design with nine subjects per cell was obtained.

[Figure 3.1. Experimental Design Showing the Independent Factors of Group, Sex and Ability and the Measures Factors]

INSTRUMENTATION

The instruments used in this study consisted of four types of items: verbal analogies, figure analogies, number series, and figure series. The items were all selected from various aptitude, reasoning and intelligence tests suitable for the college freshman level. The selection of the verbal analogy items was conditioned upon their not containing words of very difficult vocabulary or specific factual information. Such analogies would not be susceptible to coaching and would, therefore, be of no use to this study.
The figure analogy items and the number and figure series items were selected only under the condition that they be of the appropriate difficulty level. The items selected were then randomly assigned to various subtests. An attempt was made to divide items from the same aptitude or intelligence tests equally across all of the newly developed subtests in order to keep the difficulty level and the content areas of the new tests roughly equivalent. Altogether 18 subtests were formed in this way.

The kind and number of items forming the subtests of the pretest and the immediate and delayed posttests are presented in Table 3.1 along with the time limits and the Hoyt internal consistency reliabilities for each subtest. The use of an internal consistency reliability resulted in artificially inflated reliabilities for the speeded tests. The 18 subtests were called forms and were lettered from A to R. Copies of the subtests, reduced in size by 35 percent, and the answer keys can be found in Appendix A.

TABLE 3.1

Type and Number of Items, Time Limits, and Reliabilities of the Subtests of the Pretest, Immediate Posttest and Delayed Posttest

                                                    Number    Time Limit  Hoyt
Test              Form  Type of Item                of Items  in Minutes  Reliability

Pretest           A     Verbal Analogies            16        8           .62
                  B     Number Series               16        8           .77

Immed. Posttest   C     Verbal Analogies - Power    16        12          .52
                  D     Verbal Analogies - Speed    16        5           .71
                  E     Figure Analogies - Power    10        7           .33
                  F     Figure Analogies - Speed    10        3           .47
                  G     Number Series - Power       16        12          .62
                  H     Number Series - Speed       16        5           .80
                  I     Figure Series - Power       10        7           .42
                  J     Figure Series - Speed       10        3           .52

Delayed Posttest  K     Verbal Analogies - Power    16        12          .56
                  L     Verbal Analogies - Speed    16        5           .69
                  M     Figure Analogies - Power    10        7           .30
                  N     Figure Analogies - Speed    16        3           .14
                  O     Number Series - Power       16        12          .70
                  P     Number Series - Speed       16        5           .80
                  Q     Figure Series - Power       10        7           .20
                  R     Figure Series - Speed       10        3           .57

The time limits for the subtests of the pretest were chosen to approximate the amount of time per item that is generally allowed on a typical aptitude test. A survey of the time limits on aptitude tests containing verbal analogies and number series indicated that for both item types the time limit per item is usually about a half minute. Therefore the time limit was set at eight minutes for each of the 16-item subtests of the pretest.

The time limits for the speed and power subtests of the posttests were determined after a limited tryout testing. Time limits were set so that under power conditions almost everyone would have a chance to attempt all the items, and under speeded conditions very few persons would reach the last item.

All the items for each subtest were contained on a single page. At the bottom of the page the following two questions were asked:

    How many items on this page do you think you answered correctly?

    How did you feel about answering the items on this page? (check one)

        Liked Very Much
        Liked to Some Extent
        Liked a Little
        Neutral
        Disliked a Little
        Disliked to Some Extent
        Disliked Very Much

In addition, on the power tests, the subjects were asked to:

    Write the number that is on the board when you finish all the items on this page.

This number represented the number of minutes which had elapsed since the beginning of that subtest.
This question was not asked on the subtests given under speeded conditions since for these tests it was expected that few, if any, students would complete all the items on the test.

Each of the subtests was preceded by a cover sheet containing examples of the type of item to follow so that the subjects would understand what was required of them. Examples of the cover sheets may be found in Appendix A.

PROCEDURE

The 44 students in the analogies training group reported to a large room in the center of campus at 7 p.m. on a Monday night early in the spring term. The number series group received their training the following night in the same room. When all had arrived, both forms of the pretest were distributed. After the administration of the pretest, the papers were collected and the training program began.

The coaching for each group lasted slightly over one hour. It consisted in each case of an explanation of what a verbal analogy or number series item is, a presentation of some sample items and the rules used to solve them, a demonstration of how the rules are used, practice in applying the rules, and discussion of the correct answers. The rationale for employing both coaching and practice was based on the obvious findings of previous investigators (Dempster, 1954; Vernon, 1954; Heim and Watts, 1957) that a combination of both coaching and practice is more effective than either alone.

Specific Coaching in Verbal Analogies

The analogies coaching material was developed from the chapter on verbal analogies in Millman and Pauk (1969) supplemented by instructional materials on analogies from Educational Testing Service (1965) and materials used by Moore (1971) in his study of coaching. The analogy coaching material was tried informally on a small group of students and was revised on the basis of the difficulties these students had with a sample test.
The two basic rules for solving verbal analogies presented to the analogy group were: (1) silently verbalize the relationship between the two words in the stem, and (2) substitute each successive pair of words into the verbalized relationship and select the pair that fits best into the relationship. The rationale behind verbalizing the relationship was to provide an aid in maintaining the "complex set" described by Willner (1964) as being necessary for solving verbal analogies. Ways to sharpen or broaden the relationship, if necessary, to determine the correct answer were also demonstrated. A more specific description of the analogy coaching can be found in Appendix B.

Specific Coaching in Number Series

Rules for solving number series items were initially adapted from Millman and Pauk (1969). An informal tryout of these rules with a group of students indicated the need for some further modification and the inclusion of some more practical hints for obtaining the correct answer. The basic rules for solving number series are similar to the rules for verbal analogies: (1) find the rule and (2) apply it. A set of recommendations in addition to the procedures described by Millman and Pauk was included in the number series coaching. Mistakes commonly made by students in the tryout section were described to the subjects and procedures for guarding against such mistakes were discussed. Appendix B contains a detailed description of the number series coaching.

In both coaching periods an overhead projector was used to present sample items to the whole group. It should be emphasized that the two coaching techniques were designed to improve performance on all verbal analogies, and on all number series, respectively. In no case was any reference made to any specific item contained on any of the posttests.

Posttesting

At the end of the coaching period the students took a short break and then returned to their seats for the first posttest.
This test contained verbal analogies, figure analogies, number series, and figure series items. There were two subtests for each item type, one given under power conditions and the other under speeded conditions. Thus, this posttest consisted of four analogy subtests and four series subtests. The total testing time for the first posttest was about one hour and ten minutes. At the end of the first posttest the students were reminded that they should return the following Monday night at the same time for the second posttest and were told that they would be paid at the end of the second testing program. The same procedure was followed for the 44 students in the number series training group except that they reported on two consecutive Tuesday nights.

At the end of the second posttest, the students were asked to fill out an evaluation form containing their reactions to the training that they had received. A copy of this form is contained in Appendix C. For each of the four types of items composing the posttests, the students were asked to check the category that best described their reactions to the training. The categories were: (1) definitely helped to solve that type of item, (2) probably helped to solve that type of item, (3) probably did not help to solve that type of item, and (4) definitely did not help to solve that type of item. The students were asked to be as honest as possible in evaluating their training, and they were told they did not have to identify themselves by writing their names on the evaluation form.

Data Preparation

The tests were scored by placing a scoring stencil over the subtest and counting the number of correct answers. For each individual on each subtest the following dependent variables were generated:

Number correct. This score was simply the total number of items answered correctly on the given subtest.

Attitude toward the test.
This variable was developed by taking the response to the question asking how the individual felt about the items on the page and assigning a 7 if he checked Liked Very Much, a 6 for Liked to Some Extent, and so on down to a 1 for Disliked Very Much.

Number estimated correct. This was the response to the question asking how many items on the page the individual thought he got correct, and was used as a measure of confidence in taking the test.

Accuracy of the estimate. This score represented the absolute difference between the number of items estimated to be correct and the actual number of correct items for the individual.

Number of items attempted. This score was useful only for the speeded tests since almost everyone attempted every item on the power tests. It represented the number of items on a given subtest for which an answer was given.

Time spent on power test. This score represented the amount of time spent working on the subtest and was obtained from the response to the question asking the subject to write down the number that was on the board when he finished all the items on the page.

Coaching evaluations. From the responses on the evaluation forms as to whether the coaching helped to solve each item type, four additional scores were developed. For each of the four item types a subject received a 4 if he checked Definitely Helped, 3 for Probably Helped, 2 for Probably Did Not Help, and 1 for Definitely Did Not Help.

These variables plus the other pertinent information such as the kind of coaching received, the sex of the individual, his ability level for each study based upon scores on the pretest, and his cumulative grade point average (GPA) at the end of winter term 1971 were all punched on IBM cards so that a computer could be used to carry out the statistical analyses.

HYPOTHESES

The major hypotheses of the study, each dealing with one of the dependent variables described in the procedure section, are presented here.
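Since each hypothesis is tied to one of the dependent variables defined under Data Preparation, a compact sketch of how those variables could be scored for a single subtest may help keep them apart. This is only an illustration: the function and field names (`score_subtest`, `answers`, `key`, and so on) are hypothetical, the middle labels of the seven-point scale are read from the response form, and the original scoring was done by hand with stencils and punched cards, not with code like this.

```python
# Seven-point attitude scale, ordered from 1 (lowest) to 7 (highest).
ATTITUDE_SCALE = [
    "Disliked Very Much",
    "Disliked to Some Extent",
    "Disliked a Little",
    "Neutral",
    "Liked a Little",
    "Liked to Some Extent",
    "Liked Very Much",
]

# Four-point coaching evaluation scale (4 = Definitely Helped).
EVALUATION_SCALE = {
    "Definitely Helped": 4,
    "Probably Helped": 3,
    "Probably Did Not Help": 2,
    "Definitely Did Not Help": 1,
}


def score_subtest(answers, key, estimated_correct, attitude_label):
    """Return the per-subtest dependent variables for one subject.

    answers: list of responses, with None for an item left blank
    key: list of correct responses (hypothetical answer key)
    """
    number_correct = sum(
        1 for a, k in zip(answers, key) if a is not None and a == k
    )
    number_attempted = sum(1 for a in answers if a is not None)
    return {
        "number_correct": number_correct,
        "attitude": ATTITUDE_SCALE.index(attitude_label) + 1,
        "estimated_correct": estimated_correct,
        # Lower values mean a more accurate confidence estimate.
        "accuracy": abs(estimated_correct - number_correct),
        "number_attempted": number_attempted,
    }


example = score_subtest(
    answers=["a", "b", None, "d"],
    key=["a", "c", "c", "d"],
    estimated_correct=3,
    attitude_label="Liked to Some Extent",
)
```

Note that the accuracy score is an absolute difference, so overestimates and underestimates of the same size count equally.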
Under the first major hypothesis are listed a number of subhypotheses describing expected interactions. Under the subsequent major hypotheses no subhypotheses for interactions are made since there is no empirical or theoretical evidence on which such hypotheses could be based. However, statistical tests for these interactions were applied, and the significant interactions are discussed and interpreted. The first eight hypotheses are the same for both the study investigating the effects of coaching on analogy test performance and the study investigating the effects of coaching on series test performance. Therefore, the hypotheses are presented only once.

1. The group coached in item solution techniques will perform better on subsequent tests as measured by total test score.

1a. There will be an interaction between coaching and initial ability level with the low ability level benefiting more from coaching than the upper ability level.

1b. There will be an interaction between coaching and the sex of the coached individuals.

1c. There will be an interaction between coaching and elapsed time until the posttest with the effects of coaching being greatest on the immediate posttest.

1d. There will be an interaction between coaching and the similarity of the test items to the coaching material with the effects of coaching being greatest for the similar items.

1e. There will be an interaction between coaching and degree of speededness of the test with the coaching being most effective on the power tests.

2. The group coached in item solution techniques will have more positive test-taking attitudes toward those items than the control group.

3. The coached group will have more test-taking confidence than the control group when confidence is measured by the number of items estimated to be correct on the tests.

4.
The coached group will exhibit greater accuracy of test-taking confidence than the control group as measured by the absolute difference between the number of items estimated to be correct and the actual number of correct items.

5. The coached group will attempt fewer items on the speeded tests than the control group.

6. The coached group will spend greater time on the power tests than the control group.

7. The correlations between test scores and GPA will be higher for the coached group than the control group.

8. The coached group will evaluate the coaching as being most effective for the item types similar to the coaching material and less effective for the other types of items.

In addition to the above hypotheses which were tested twice, once for the analogies experiment and once for the series experiment, the following hypothesis was tested once for all subjects.

9. There will be a disordinal interaction between the type of coaching received and the evaluations of the effectiveness of the coaching for the different item types with the verbal analogies group rating the coaching most effective for analogy items and the number series group rating the coaching most effective for series items.

STATISTICAL ANALYSES

A repeated measures analysis of variance with three design factors and three measures factors, all completely crossed, was used to analyze the data for the first four hypotheses. This analysis was used so that possible interactions between the coaching and the other design and measures factors could be tested. The three design factors were the treatment-control dimension, sex, and initial ability level. The three measures factors were the immediate-delayed posttest dimension, the item similarity factor, and the speed-power dimension. This analysis was employed to test the hypotheses dealing with the first four dependent variables for both the analogy and the number series studies.
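The crossed layout just described can be made concrete with a short sketch. It enumerates the eight design cells and the eight repeated measures for the analogy study, pairing each combination of measures-factor levels with a form letter from Table 3.1. The variable names are hypothetical, and treating verbal analogies as the "similar" level and figure analogies as the "dissimilar" level is an interpretation of the item similarity factor; the figure of nine subjects per cell comes from the abstract.

```python
from itertools import product

# Between-subjects (design) factors, two levels each.
DESIGN_FACTORS = {
    "group": ("coached", "control"),
    "sex": ("male", "female"),
    "ability": ("high", "low"),
}

# Within-subjects (measures) factors, two levels each.
MEASURES_FACTORS = {
    "posttest": ("immediate", "delayed"),
    "item_type": ("similar", "dissimilar"),
    "condition": ("power", "speed"),
}

SUBJECTS_PER_CELL = 9  # nine subjects per cell, as stated in the abstract

design_cells = list(product(*DESIGN_FACTORS.values()))
measure_cells = list(product(*MEASURES_FACTORS.values()))

# Analogy study: forms C-F (immediate) and K-N (delayed) from Table 3.1,
# listed in the same order that product() generates the measure cells.
analogy_forms = dict(zip(measure_cells, "CDEFKLMN"))

n_subjects = len(design_cells) * SUBJECTS_PER_CELL
n_observations = n_subjects * len(measure_cells)

for cell, form in analogy_forms.items():
    print(form, cell)
print("subjects:", n_subjects, "scores per variable:", n_observations)
```

Each subject therefore contributes eight scores per dependent variable, one for every form of the relevant item family, which is what allows coaching to be tested for interactions with time of posttest, item similarity, and speededness in a single analysis.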
Hypotheses 5 and 6 were tested by a repeated measures analysis of variance with the same three design factors and only two measures factors. The speed-power dimension was not included since Hypothesis 5 dealt only with speeded tests and Hypothesis 6 was only concerned with the power tests.

For Hypothesis 7 the differences in the values of the correlations were tested by changing the correlations into Zs through Fisher's Z transformation and testing the difference between the Zs for significance.

Hypothesis 8 for both experiments was tested using a repeated measures analysis of variance. Since this hypothesis dealt solely with the coached groups in each instance, there were only two design factors, sex and ability level. The single measures factor had four levels, one corresponding to each of the four types of items.

Hypothesis 9 was tested using all of the original 88 subjects in a repeated measures analysis of variance. This was possible since ability level was not included in the design for this hypothesis. Hypothesis 9 was concerned with the group coached in verbal analogies compared to the group coached in number series. The ability groups for the analogy experiment were based on analogy pretest scores while in the number series experiment ability levels were determined from number series pretest scores. Since the two ability level groupings were not the same, the two could not be compared directly and therefore ability level was dropped for this hypothesis. The data from all 88 original subjects were then included in this analysis.

In the repeated measures analyses of variance computed in these studies, all the design factors and measures factors were considered to be fixed. In order that the F ratio of the repeated measures analysis of variance actually follow an F distribution, the assumption must be met that the off-diagonal correlations of the matrix of repeated measures are equal.
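The degrees of freedom attached to these F tests, and the effect of a conservative adjustment when the equal-correlation assumption is in doubt, can be sketched as follows. The sketch assumes the lower-bound form of the Greenhouse and Geisser adjustment, in which both degrees of freedom are multiplied by 1/(k - 1) for a measures factor with k levels; the text does not say which form of the procedure was applied, so this is an assumption, and the function names are hypothetical.

```python
from fractions import Fraction


def repeated_factor_df(n_subjects, n_design_cells, k_levels):
    """Unadjusted df for an F test on a k-level measures factor."""
    numerator = k_levels - 1
    denominator = (k_levels - 1) * (n_subjects - n_design_cells)
    return numerator, denominator


def conservative_df(n_subjects, n_design_cells, k_levels):
    """df after the lower-bound adjustment (epsilon = 1 / (k - 1)).

    Assumption: this is the conservative form of the adjustment;
    the thesis does not specify which form was used.
    """
    epsilon = Fraction(1, k_levels - 1)
    numerator, denominator = repeated_factor_df(
        n_subjects, n_design_cells, k_levels
    )
    return int(numerator * epsilon), int(denominator * epsilon)


# Two-level measures factor, 72 subjects in 8 design cells.
two_level = repeated_factor_df(72, 8, 2)
two_level_adjusted = conservative_df(72, 8, 2)

# Four-level measures factor (Hypothesis 8): 36 coached subjects
# in 4 design cells (sex by ability level).
four_level = repeated_factor_df(36, 4, 4)
four_level_adjusted = conservative_df(36, 4, 4)
```

Under these assumptions the two-level case gives 1 and 64 degrees of freedom whether or not the adjustment is applied, while the four-level case for Hypothesis 8 moves from 3 and 96 to 1 and 32, so only the four-level tests become more demanding.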
This assumption of equal off-diagonal correlations is automatically met for every F test involving a single measures factor having only two levels since this results in a single off-diagonal correlation. Adjusting the degrees of freedom by the Greenhouse and Geisser (1959) procedure in these cases does not change the number of degrees of freedom in the numerator and the denominator. Hypotheses 8 and 9 involved a single measures factor with four levels and in this case the degrees of freedom were adjusted by the Greenhouse and Geisser procedure to account for possible spurious significance caused by unequal off-diagonal correlations.

SUMMARY

Actually, there were two studies described in this chapter: one designed to examine the effects of coaching in verbal analogy solution techniques on subsequent analogy test performance and the other designed to look at the effects of coaching in number series solution techniques on subsequent series test performance. Eighty-eight subjects were randomly assigned to two groups, one receiving coaching in verbal analogies and the other coaching in number series. In the analogy experiment, the number series coaching group served as the control, while in the number series experiment the analogy coaching group was the control. In addition to the experimental-control dimension, the design for each study included two other factors, sex and ability level based on pretest scores. The dependent measures varied on three factors: time of posttest (immediate and delayed), type of item (similar and dissimilar to the coaching material), and testing condition (speeded and power).

Examination of the effects of the coaching on the number of correct analogies or series items obtained on the posttests was the main purpose of each study. Also examined, however, were the effects of coaching on test-taking attitudes, test-taking confidence, accuracy of test-taking confidence, speed of item solution, and test validity.
Finally, the subjects' own evaluations of the effectiveness of the coaching were analyzed.

CHAPTER IV

ANALYSES AND RESULTS

The results of the two studies of coaching will be presented in this chapter. Although the two studies were exactly parallel in design, hypotheses and statistical analyses, they each examined the effect of coaching on a different type of item. Therefore, the analyses and results will be presented separately for each study.

For both studies, Hypotheses 1 through 6 and Hypothesis 8 were tested using a repeated measures analysis of variance (ANOVA). For all hypotheses the .05 alpha level with the appropriate degrees of freedom was used. In each study a total of seven dependent variables were examined. Another ANOVA compared the evaluations of the coaching of the two groups, resulting in a total of fifteen ANOVAs. The complete tables of means for all groups on all variables are presented in Appendix D, and all the ANOVA tables are presented in Appendix E. Hypothesis 7 in each study was tested by transforming the correlations to Fisher Zs.

COACHING IN VERBAL ANALOGIES

Hypothesis 1. The group coached in solving verbal analogies will perform better on subsequent analogy tests as measured by total test score.

The difference in means between the coached group (x̄ = 9.32) and the control group (x̄ = 8.78) was not found to be statistically significant (F = 3.97, df = 1,64). Therefore this hypothesis was not accepted although the mean difference of the two groups was in the hypothesized direction and the F ratio was less than .03 below the F value needed for significance at the .05 level.

Hypothesis 1a. There will be an interaction between coaching and initial ability level with the low ability level group benefiting more from coaching than the upper ability level.

The F ratio of .001 (df = 1,64) was not statistically significant. The data did not support the hypothesis of an interaction between coaching and ability level.

Hypothesis 1b.
There will be an interaction between coaching and the sex of the coached individuals.

This hypothesis was supported by the data (F = 4.28, df = 1,64). The significant interaction is graphically depicted in Figure 4.1. The coaching was quite effective for the males, but it did not have any effect on the females.

[Figure 4.1. Interaction of Analogy Coaching and Sex on Mean Analogy Scores (treatment and control lines plotted for males and females).]

Hypothesis 1c. There will be an interaction between coaching and elapsed time until the posttest with the effects of the coaching being greatest on the immediate posttest.

This hypothesis was not supported by the data (F = 1.90, df = 1,64).

Hypothesis 1d. There will be an interaction between coaching and the similarity of the test items to the coaching material with the effects of the coaching being greatest for the similar items.

The obtained F ratio of 5.36 (df = 1,64) was statistically significant. The difference between the coached group and the control group was greater for the verbal analogies and less for the figure analogies. This interaction is shown in Figure 4.2.

[Figure 4.2. Interaction of Analogy Coaching and Item Type on Mean Analogy Scores (treatment and control lines plotted for verbal analogies and figure analogies).]

Hypothesis 1e. There will be an interaction between coaching and the degree of speededness of the test with the coaching being most effective on the power tests.

The obtained F ratio of 7.87 (df = 1,64) was statistically significant, supporting the hypothesis of an interaction between coaching and degree of speededness of the test. Figure 4.3, which shows the interaction, indicates that the coached group performed at a higher level on the power tests, but there was no difference on the speeded tests.

[Figure 4.3. Interaction of Analogy Coaching and Degree of Speededness on Mean Analogy Scores (treatment and control lines).]

Hypothesis 2. The group coached in solving verbal analogies will have more positive test-taking attitudes toward analogy items.

The means of the two groups on attitudes toward analogy items were very similar (coached group x̄ = 5.05, control group x̄ = 4.83), and the F ratio was not significant (F = 1.26, df = 1,64). There was no significant difference in attitudes toward analogies between the coached group and the control group.

However, there was a significant interaction involving coaching. With respect to attitude toward analogies, coaching interacted significantly with the sex of the person being coached (F = 4.73, df = 1,64). Figure 4.4 depicts the disordinal interaction. For males the treatment resulted in more positive attitudes, but the effect was reversed for females. As a result of the coaching, females had slightly less positive attitudes than the control group.

[Figure 4.4. Interaction of Analogy Coaching and Sex on Attitudes Toward Analogy Tests (treatment and control lines plotted for males and females).]

Hypothesis 3. The coached group will have more test-taking confidence as measured by the number of items estimated to be correct on the analogy tests.

The means of the two groups were almost identical (coached group x̄ = 9.50, control group x̄ = 9.48). The difference was not significant (F = .00, df = 1,64).

However, there was a significant disordinal interaction between coaching and degree of speededness of the tests (F = 13.66, df = 1,64). As is shown in Figure 4.5, the coaching seems to have increased confidence on power tests while it produced less confidence on speeded tests.

[Figure 4.5. Interaction of Analogy Coaching and Degree of Test Speededness on Analogy Test-Taking Confidence (treatment and control lines plotted for power and speed tests).]

Hypothesis 4.
The coached group will exhibit greater accuracy of test-taking confidence as measured by the absolute difference between the number of items estimated to be correct and the actual number of correct items.

The group receiving coaching in verbal analogies had a lower mean score (x̄ = 1.53) than the control group (x̄ = 1.84). This significant difference (F = 4.30, df = 1,64) indicates that the coached group was more accurate in their test-taking confidence since low scores indicate greater accuracy.

There also was a significant interaction, shown in Figure 4.6, between coaching and ability levels with respect to accuracy of test-taking confidence (F = 5.12, df = 1,64). Coaching improved the accuracy of the high ability group, but it had little effect on those of low analogy ability.

[Figure 4.6. Interaction of Analogy Coaching and Ability on Accuracy of Analogy Test-Taking Confidence (treatment and control lines plotted for high and low ability).]

Hypothesis 5. The coached group will attempt fewer items on the speeded tests.

For the coached group the mean number of items attempted on the speeded tests was 10.72, while the mean for the uncoached group was 12.13. The F ratio of 28.59 (df = 1,64) indicated that this was a significant difference and the hypothesis was supported.

In addition, there were two significant interactions for the number of items attempted on speeded tests. Coaching interacted with the elapsed time until posttest (F = 11.15, df = 1,64) as indicated in Figure 4.7 and with the similarity of test items to the coaching material (F = 20.80, df = 1,64) as is indicated in Figure 4.8. Figure 4.7 shows that on the immediate posttest the coached group attempted about two fewer items per speeded test, but that on the delayed posttest the effect had begun to diminish and the difference between the two groups was less than one item per test.

[Figure 4.7. Interaction of Analogy Coaching and Time of Posttest on Number of Speeded Analogy Items Attempted (treatment and control lines plotted for immediate and delayed posttests).]

Examination of Figure 4.8 indicates that the effect of coaching on number of items attempted was greater on tests of verbal analogies than on tests of figure analogies.

[Figure 4.8. Interaction of Analogy Coaching and Item Type on Number of Speeded Analogy Items Attempted (treatment and control lines plotted for verbal analogies and figure analogies).]

Hypothesis 6. The coached group will spend greater time on the power tests.

The coached group spent an average of 5.56 minutes on the power tests and the control group averaged 4.16 minutes. This difference was statistically significant (F = 35.70, df = 1,64).

For this variable the coaching interacted with elapsed time until posttest (F = 13.57, df = 1,64) and also with similarity of test items to the coaching material (F = 27.93, df = 1,64). As is shown in Figure 4.9, the treatment group spent an average of almost two minutes more than the control group on each power test of the immediate posttest. On the delayed posttest the difference had narrowed to less than one minute per test.

[Figure 4.9. Interaction of Analogy Coaching and Time of Posttest on Time Spent on Analogy Power Tests (treatment and control lines plotted for immediate and delayed posttests).]

Figure 4.10 shows that the effect of the coaching in increasing time spent on power tests was greater on the verbal analogy tests than on the figure analogy tests.

[Figure 4.10. Interaction of Analogy Coaching and Item Type on Time Spent on Analogy Power Tests (treatment and control lines plotted for verbal analogies and figure analogies).]

Hypothesis 7.
The correlations between GPA and verbal and figure analogy test scores will be higher for the group coached in verbal analogies than for the control group.

Table 4.1 shows the correlations for both groups and the direction of the differences between the correlations for all the analogy subtests and for some combinations of the subtests. In only one case, the figure analogy subtest given under power conditions in the immediate posttest, was the difference between the correlations statistically significant, and this test involved a probably spurious negative correlation for the control group. The pattern of differences in the correlations for the two groups indicates higher correlations for the coached group in two-thirds of the cases, but this pattern is not consistent enough to conclude that coaching increased the correlations between the analogy tests and GPA.

Hypothesis 8. The coached group will evaluate the coaching as being most effective for the item types similar to the coaching material and less effective for the other types of items.

The data showed that there were significant differences in the evaluations of the different item types by the analogy coaching group (F = 24.59, df = 3,96). Table 4.2 shows that the analogy coaching was rated most effective for verbal analogies and least effective for series items.

No significant differences were found between high and low ability groups and males and females in their evaluations of the effectiveness of the coaching.

TABLE 4.1
Correlations between Analogy Test Scores and GPAs for the Group Coached in Verbal Analogies and the Control Group

                                                              Coaching   Control   Difference
Test               Item Type          Condition   Form        Group      Group     Direction
Immed. Posttest    Verbal Analogies   Power       C            .16        .40      -
                                      Speed       D            .33        .27      +
                   Figure Analogies   Power       E            .36       -.16      + *
                                      Speed       F            .13        .01      +
Delayed Posttest   Verbal Analogies   Power       K            .24        .27      -
                                      Speed       L            .33        .24      +
                   Figure Analogies   Power       M           -.20        .09      -
                                      Speed       N           -.04       -.20      +
All Analogies Tests                                            .34        .27      +
All Verbal Analogies                                           .36        .35      +
Verbal Analogies Power                                         .24        .39      -
Verbal Analogies Speed                                         .37        .28      +
Verbal Analogies Immediate Posttest                            .32        .37      -
Verbal Analogies Delayed Posttest                              .33        .28      +

* Sig. at .05

TABLE 4.2
Mean Evaluation by the Analogy Group of the Effectiveness of their Coaching

                          Type of Item
                          Verbal      Figure      Number    Figure
                          Analogies   Analogies   Series    Series
Mean Evaluation Score     3.4         2.7         2.4       2.4

COACHING IN NUMBER SERIES

Hypothesis 1. The group coached in solving number series will perform better on subsequent tests as measured by total test score.

The difference in means between the coached group (x̄ = 10.10) and the control group (x̄ = 8.86) was found to be statistically significant (F = 21.51, df = 1,64). Thus, this hypothesis was accepted and it was concluded that the coaching improved scores on series tests.

Hypothesis 1a. There will be an interaction between coaching and initial ability level with the low ability level group benefiting more from coaching than the upper ability level.

The F ratio of .162 (df = 1,64) was not significant. This hypothesis was not accepted and it was concluded that the coaching was equally effective for both the high and low ability groups.

Hypothesis 1b. There will be an interaction between coaching and the sex of the coached individuals.

This hypothesis was not supported by the data (F = 3.08, df = 1,64). It can be concluded that coaching on number series did not have a significantly different effect on males and females.

Hypothesis 1c. There will be an interaction between coaching and elapsed time until the posttest with the effects of coaching being greatest on the immediate posttest.

This hypothesis of a significant interaction was supported by the data (F = 5.46, df = 1,64).
The interaction is presented graphically in Figure 4.11. Contrary to what was hypothesized, the effects of the coaching were greater on the delayed posttest than on the immediate posttest. Possible interpretations of this surprising finding are given in Chapter V.

[Figure 4.11. Interaction of Number Series Coaching and Time of Posttest on Mean Series Scores (treatment and control lines plotted for immediate and delayed posttests).]

Hypothesis 1d. There will be an interaction between coaching and the similarity of the test items to the coaching material with the effects of the coaching being greatest for the similar items.

The obtained F ratio of 29.37 (df = 1,64) was statistically significant and the hypothesis was accepted. As is shown in Figure 4.12, the effect of the coaching on the number series items was quite strong, but there was no difference between the groups on the figure series items.

[Figure 4.12. Interaction of Number Series Coaching and Item Type on Mean Series Scores (treatment and control lines plotted for number series and figure series).]

Hypothesis 1e. There will be an interaction between coaching and the degree of speededness of the tests with the coaching being most effective on the power tests.

The F ratio of .19 (df = 1,64) was not significant. This hypothesis was not supported by the data and it is concluded that the coaching was equally effective for both speed and power tests.

Hypothesis 2. The group coached in solving number series will have more positive attitudes toward series items.

The means of the two groups (coached group x̄ = 5.21, control group x̄ = 4.94) differed only slightly and the difference was not significant (F = 1.50, df = 1,64). This hypothesis was not accepted.

There was a significant interaction (F = 8.15, df = 1,64) between coaching and sex of those being coached. This disordinal interaction is presented in Figure 4.13.
Coaching produced more positive attitudes toward series items for females who received coaching and it resulted in less favorable attitudes for males.

[Figure 4.13. Interaction of Number Series Coaching and Sex on Attitudes toward Series Tests (treatment and control lines plotted for males and females).]

Hypothesis 3. The coached group will have more test-taking confidence as measured by the number of items estimated to be correct on the series items.

The difference between the means of the coached group (x̄ = 10.17) and the control group (x̄ = 8.94) was found to be significant (F = 14.28, df = 1,64). The hypothesis that coaching in number series produces greater test-taking confidence was supported by the data.

There was a significant interaction (F = 4.63, df = 1,64) between coaching and sex of the person coached for the variable of test-taking confidence. Figure 4.14 indicates that although coaching had little effect on the test-taking confidence of male subjects, it did significantly increase the confidence of females.

[Figure 4.14. Interaction of Number Series Coaching and Sex on Series Test-Taking Confidence (treatment and control lines plotted for males and females).]

Another significant interaction (F = 18.67, df = 1,64) for this variable is presented in Figure 4.15. Coaching had a significant effect in raising confidence on number series items but it had only a minimal effect on confidence on figure series items.

[Figure 4.15. Interaction of Number Series Coaching and Item Type on Series Test-Taking Confidence (treatment and control lines plotted for number series and figure series).]

Hypothesis 4. The coached group will exhibit greater accuracy of test-taking confidence as measured by the absolute difference between the number of items estimated to be correct and the actual number of correct items.

This hypothesis was supported by the data (F = 4.66, df = 1,64).
The mean of the group coached in number series was 1.00, while the uncoached group had a mean of 1.32. Since low scores indicate greater accuracy it can be concluded that coaching significantly improved accuracy of test-taking confidence.

The interaction of coaching and sex of the person being coached was also significant (F = 4.07, df = 1,64). The graphic depiction of this interaction in Figure 4.16 shows that the coaching had little effect on the accuracy of the test-taking confidence of the males, but it significantly improved the accuracy of the females.

[Figure 4.16. Interaction of Number Series Coaching and Sex on Accuracy of Series Test-Taking Confidence (treatment and control lines plotted for males and females).]

Coaching made little change in the accuracy of test-taking confidence for figure series, but it had a strong effect on number series items. This significant interaction (F = 8.50, df = 1,64) is presented in Figure 4.17.