ABSTRACT

INTERACTING EFFECTS OF VARYING STEP SIZE AND FEEDBACK IN PROGRAMED INSTRUCTION

by John M. Gordon, Jr.

What seemed so simple and clear in the early history of the development of programed instruction has shown its true face, that of an extremely complex instance of meaningful verbal learning. The principles underlying linear programed instruction, developed by analogy from variables proven to be critical in operant conditioning, have not held. For example, findings concerning the importance of small incremental tasks (step size) and immediate reinforcement (feedback of the correct answer) have been contradictory. It was suggested that studies involving complex designs offering interactive information were needed to gain a better understanding of this phenomenon. This study developed a factorial experiment utilizing multiple dependent measures to investigate the hypothesized interaction between step size and feedback over differing achievement levels. In doing so, the study also assessed the adequacy of the students' subjective confidence level as a measure of step size.

Objectives

The study was designed to answer a series of questions concerning the relationship existing between the amount of information given before, and after, the overt response, and its effect upon multiple outcomes. More specifically, it was planned to test the following hypotheses:

In terms of the comprehension of concepts:

1. Providing knowledge of correct response does not increase the effectiveness of small step size programs for students at all achievement levels.
2.
Providing knowledge of correct response does increase the effectiveness of moderate step size programs at all achievement levels.
3. Providing moderately difficult frames with knowledge of correct response will be more effective than any other combination of step size and feedback for all achievement levels.
4. Providing moderately difficult frames with knowledge of correct response will reduce the boredom or "pall" effect among the upper and middle third achievement levels.

A selected portion of a published program covering static electricity and voltaic cells was field tested and revised twice to reach the minimal error rate conditions needed. During this phase, attempts were made to judge the usefulness of assessing the student's confidence in his frame response as an indicator of frame step size. The ratings proved partially helpful in frame revision but not acceptable in attempting to determine step size.

Three additional variations of the basic program were developed to serve as step size levels. The first contained one or two letter prompts for every response, thus representing the easiest version. Redundant and review frames plus key words within frames were systematically eliminated from the other two versions, resulting in two more difficult step size levels. Each of the total of four variations was duplicated with a form containing the correct response, and one without, to constitute the eight treatment materials.

At the same time, a general science achievement test was being developed using differentiating items from previous teacher-made tests. The students were ranked according to their scores on this test, divided into thirds and randomly assigned to the eight program variations. The assignment procedure called for randomizing within each succeeding group of eight students while proceeding down the ranked list within the three "levels".
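The assignment procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure as actually carried out: the function name, the numeric step size labels 1 to 4, the seed argument, and the handling of a final group of fewer than eight students are all hypothetical.

```python
import random
from itertools import product

# Assumed labels: four step size variations crossed with feedback / no feedback.
STEP_SIZES = [1, 2, 3, 4]
FEEDBACK = [True, False]
TREATMENTS = list(product(STEP_SIZES, FEEDBACK))  # the eight program variations


def assign_treatments(scores, seed=None):
    """Rank students by prior achievement score, split the ranked list into
    thirds, and randomize the eight treatments within each successive group
    of eight students inside each third.

    scores: dict mapping a student identifier to a test score.
    Returns a dict mapping student -> (level, (step_size, feedback)).
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get, reverse=True)
    cut = len(ranked) // 3
    levels = {
        "upper": ranked[:cut],
        "middle": ranked[cut:2 * cut],
        "lower": ranked[2 * cut:],
    }
    assignment = {}
    for level, students in levels.items():
        # Proceed down the ranked list eight students at a time.
        for start in range(0, len(students), 8):
            group = students[start:start + 8]
            order = TREATMENTS[:]
            rng.shuffle(order)  # fresh random order for every group of eight
            # A short final group simply receives the first few treatments
            # of its shuffled order (an assumption of this sketch).
            for student, treatment in zip(group, order):
                assignment[student] = (level, treatment)
    return assignment
```

Randomizing within successive groups of eight keeps the program variations balanced across the achievement range inside each third, which is what makes the layout a randomized-within-blocks arrangement.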
The result was a 4 x 2 (step size x feedback) factorial experiment in a randomized-within-blocks design. All available 7th grade science students, approximately 400, were assigned to programs. No sampling was undertaken.

Two separate criterion tests were developed, one involving knowledge and comprehension items; the other, application items. A set of affective rating scales was logically developed, and plans were made to take time estimates.

The time to complete the program and testing phase ranged from two to eight fifty-minute periods. Students finishing early were given remedial or enriching individual study.

Cell size was equated at 16 by random elimination for ease in statistical calculation. This operation brought the total to 24 x 16 or 384 subjects. Multivariate ratios were calculated for the pertinent cognitive sources of variance. Appropriate univariate F-tests and individual comparisons were made to substantiate or refute the theoretical hypotheses and questions. Three-way factorial univariate analyses of variance were calculated for the affective scales, time estimates and error rates. Intercorrelation matrices were computed for all the possible affective and cognitive combinations for groups taking each of the eight program variations.

The knowledge and application tests were given again in the ensuing fall, approximately five months after the program administration. Only knowledge scores were meaningful, as many of the application exams were not completed within the allotted school period time limit. A univariate factorial analysis of variance was computed on the knowledge loss scores.

The subjective confidence estimate added little information to the error rate in assessing step size. Estimates were either high or low. Conditions suggested that a response latency would be more informative.

Considering the overall effects on the combination of knowledge and application scores:

1. prior achievement level was a more important influence than was expected.
2. a small (in relation to achievement level effects) but significant interaction between step size and feedback was found. Taken as though they were independent, step size was, and feedback was not, a significant factor.
3. there were no step size or feedback by achievement level interactions.

Considering the independent effects upon knowledge scores:

1. prior achievement levels accounted for more variation than expected.
2. the step size x feedback interaction was significant, allowing the test between treatment combinations, which demonstrated that feedback was not of value to the two small step size variations and only became operative with the moderately difficult variation.
3. the overall knowledge and application step size effect (although not independent because of its interaction with feedback) was found within the knowledge criterion outcomes.

Considering the independent effects upon the application scores:

1. a significant variation remained due to prior achievement levels even when program knowledge-application score covariance was partialled out.
2. a significant step size x feedback interaction remained when the knowledge interaction was partialled out. The version of the most difficult variation was as effective as the easy variation with feedback.
3. neither step size nor feedback was an independently significant factor.

Considering the effect upon the affective ratings:

1. varying feedback accounted for differences between Interested-Bored ratings. Those receiving feedback rated themselves as being less bored.
2. the moderate step size variation with feedback tended to produce lower boredom ratings within the middle third achievement level (than the other versions), but the differences were neither large nor consistent.
3. lower third achievement level students rated Variation II most interesting.
4. varying step size produced differences between the Successful-Frustrated and Progress-No Progress ratings in the expected directions.

Considering the effects upon the over-the-summer knowledge loss:

1. only prior achievement demonstrated any effect upon retention loss: the lower third lost the least, but they had little to lose.

Considering the effects upon error rate and time to complete the programs:

1. all main factor effects and their interactions were significant influences upon error rate and time to complete the program.

Finally, there was no consistent indication of a relationship existing between any of the cognitive measures (prior achievement test scores, error rate, knowledge test scores or application test scores) and any of the affective ratings.

INTERACTING EFFECTS OF VARYING STEP SIZE AND FEEDBACK IN PROGRAMED INSTRUCTION

BY

JOHN M. GORDON, JR.

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

College of Education

1965

ACKNOWLEDGMENTS

The author wishes to acknowledge the Small Contract Program of the Cooperative Research Branch, Office of Education, Department of Health, Education, and Welfare for providing funds to carry out the research project.

Special thanks are extended to the students, faculty, and administration of West Junior High School for allowing the project to be interjected into their educational program. The author wishes to specifically recognize Mr. Glenn Burgett, Principal, Dr. David Schullert, Science Coordinator for the Lansing School System, Mr. Glen Baxter, Science Coordinator for West Junior High, and Mrs. Kathy Kolster, Mr. Lloyd MacPherson, Mr. Robert Bailey, and Mr. John Wedding, Science Teachers, for their cooperation.

To Dr. David R. Krathwohl goes much gratitude for sponsoring the project and providing the major impetus and support.
Other faculty members who were instrumental in seeing the project through to completion were Dr. Walter Stellwagen, Dr. William Stellwagen, Dr. Joseph Soups, and Dr. John Barson. Special thanks go to Dr. Barson for his encouragement.

Finally, the author wishes to acknowledge his wife, Rachel, who waited patiently for it to be completed.

TABLE OF CONTENTS

CHAPTER

I. PROBLEM
   Problem Perspective
   Rationale and Purpose of the Study
      Choosing the Classes of Learners
      Choosing the Classes of Objectives
   Theoretical Position and Hypothesis Development
   Related Purpose
   Overview of the Report

II. REVIEW OF THE LITERATURE
   Step Size Definition
   Effects of Step Size Variation
   Intelligence and Prior Achievement
   Effects of Feedback Variation
   Cognitive Outcomes
      Long-Term Retention
      Application Transfer
   Affective Outcomes
   Summary

III. PROCEDURES
   Sample
   Unique Role of the Four Teachers
   Content Selection
   Step Size Assessment
      Results of Confidence Rating Assessment
      Conclusion
   Development of Programs of Varying Step Sizes
   Prior Achievement Placement Exam
   Program Administration
   Knowledge and Application Criterion Tests
   Affective Questionnaire
   Rationale Underlying the Analyses of the Research Hypotheses and Questions
   Study Design and Rationale
   Summary

IV. RESULTS
   Major Variable Effects Upon Program Characteristics
      Error Rate
      Time to Complete Program
      Other Program Characteristics
   Cognitive Data
      Statistical [illegible]
      Knowledge-Application Score Correlations, Graphs, and Uniqueness
      Multivariate Test
      Appropriate Univariate Tests
      Knowledge Criterion Hypotheses
      Application Criterion Questions
      Retention [illegible] Question
   Affective Data
   Cognitive, Affective, and Cognitive-Affective Correlations
   Results Summary
   Discussion [illegible]
      Discussion of Problem Section Questions
      Operant Conditioning Position
      Contiguity Position
      Cognitive Theorists' Position
      Cognitive [illegible]
         Entry Repertoire
         Individual Accommodation and Pacing
         Task Difficulty
      Overall Conclusion as to Theoretical Rationale
      New Problems That Evolved
      Contingent Generalizations
      Conclusions Concerning the Role of Step Size and Feedback in Programing Strategy

V. SUMMARY, CONCLUSIONS, AND IMPLICATIONS FOR NEW RESEARCH
   Background
   Objectives
   Procedure
   Conclusions
   Implications for New Research

APPENDIX I
APPENDIX II
APPENDIX III
BIBLIOGRAPHY

LIST OF TABLES

TABLE 3.1  EXAMPLE FRAMES FROM THE FOUR STEP SIZE VARIATIONS
TABLE 4.1  THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE ERROR RATE DATA (N=[illegible])
TABLE 4.2  THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE TIME TO COMPLETION DATA (N=336)
TABLE 4.3  NUMBER OF RESPONSES, DENSITY, PROGRAMING RULES, MEAN ERROR RATES, AND MEAN SECONDS/RESPONSE FOR THE PROGRAM VERSIONS
TABLE 4.4  RELATION OF KNOWLEDGE AND APPLICATION TEST SCORES
TABLE 4.5  MEANS AND STANDARD DEVIATIONS FOR THE KNOWLEDGE AND APPLICATION SCORES
TABLE 4.6  SUMS, SUMS OF SQUARES AND CROSS PRODUCTS OF KNOWLEDGE (X) AND APPLICATION (Y) SCORES
TABLE 4.7  THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE MULTIVARIATE ANALYSIS OF THE COMBINED KNOWLEDGE AND APPLICATION SCORES (N=[illegible])
TABLE 4.8  THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE KNOWLEDGE SCORES (N=[illegible])
TABLE 4.9  THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE APPLICATION SCORES (N=[illegible])
TABLE 4.10  MAIN EFFECT AND INTERACTION MEANS FOR KNOWLEDGE AND APPLICATION TESTS
TABLE 4.11  ANALYSIS OF THE APPLICATION SCORE VARIANCE DUE TO THE ACHIEVEMENT LEVEL MAIN EFFECT THAT REMAINS WHEN KNOWLEDGE SCORES ARE COVARIED OUT
TABLE 4.12  ANALYSIS OF THE APPLICATION SCORE VARIANCE DUE TO THE STEP SIZE MAIN EFFECT THAT REMAINS WHEN THE KNOWLEDGE SCORES ARE COVARIED OUT
TABLE 4.13  ANALYSIS OF THE APPLICATION SCORE VARIANCE DUE TO THE STEP SIZE X FEEDBACK INTERACTION WHEN THE KNOWLEDGE SCORES ARE COVARIED OUT
TABLE 4.14  BETWEEN MEAN AND RESIDUAL CORRELATIONS BETWEEN KNOWLEDGE AND APPLICATION SCORES FOR THE SIGNIFICANT SOURCES OF VARIATION
TABLE 4.15  KNOWLEDGE SCORE MEANS FOR EACH OVERALL ACHIEVEMENT LEVEL AND TREATMENT GROUP (N=16 PER CELL)
TABLE 4.16  APPLICATION SCORE MEANS FOR EACH OVERALL ACHIEVEMENT LEVEL AND TREATMENT GROUP (N=16 PER CELL)
TABLE 4.17  THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE RETENTION LOSS SCORES (N=192)
TABLE 4.18  COMPILATION OF THE F-TABLE PROBABILITIES FOR THE SIGNIFICANT MAIN EFFECTS AND INTERACTIONS ON THE AFFECTIVE RATING DATA
TABLE 4.19  COMPARISON OF THE INTERESTED - BORED RATING MEANS FOR THE UPPER AND MIDDLE THIRD ACHIEVEMENT LEVEL GROUPS USING THE WITH FEEDBACK VERSIONS (N=14 PER CELL)
TABLE 4.20  COMPARISON OF THE INTERESTED - BORED RATING MEANS FOR THE LOWER THIRD ACHIEVEMENT LEVEL GROUPS USING ALL VERSIONS (N=14 PER CELL)
TABLE 4.21  CORRELATIONS BETWEEN ENTRY ACHIEVEMENT AND KNOWLEDGE AND APPLICATION SCORES FOR THE EIGHT PROGRAM GROUPS
TABLE 4.22  COMPILATION OF THE F-TABLE PROBABILITIES FOR THE SIGNIFICANT MAIN EFFECTS AND INTERACTIONS ON MAJOR CRITERION MEASURES

CHAPTER I

PROBLEM

The rationale for programed instruction is not universally agreed upon. Consider, for example, the following pairs of contrasting statements. The first of each pair, taken from one of the position papers presented by B. F. Skinner, argues for a particular facet of the technique that bears his name. The second, made by individuals keenly interested in the operations of linear programed instruction, discusses the effects of these facets.

In discussing the need for immediate feedback, Skinner stated that:

"Immediate feedback encourages a more careful reading of programmed material than is the case in studying a text, where the consequences of attention or inattention are so long deferred that they have little effect on reading skills. The behavior involved in observing or attending to detail - as in inspecting charts and models or listening closely to recorded speech - is efficiently shaped by the contingencies arranged by the machine. And when an immediate result is in the balance, a student will be more likely to learn how to marshall relevant material, to concentrate on specific features of a presentation, to reject irrelevant materials, to refuse the easy but wrong solution, and to tolerate indecision, all of which are involved in effective thinking" (Skinner, in Lumsdaine and Glaser, 1960, p. 154).

Glaser, after having watched students working through a program, was

". . . impressed that you can leave out, especially in low error rate programs, a lot of the information feedback for many frames, because performing is reinforcing itself and information feedback is often ignored by the student. Sometimes it is necessary to force them to look at it, but that has some instrumentation problems" (Glaser, 1963, p. 188).
Concerning small step size, Skinner wrote:

"A second requirement of a minimal teaching machine also distinguishes it from earlier versions. In acquiring complex behavior, the student must pass through a carefully designed sequence of steps, often of considerable length. Each step must be so small that it can always be taken, yet in taking it the student moves somewhat closer to fully competent behavior" (Skinner, in Lumsdaine and Glaser, 1960, p. 141).

Lipson, after taking these small steps, remarked:

"As I proceeded, turning pages and writing down answers, the novelty quickly wore off, and I found myself growing increasingly bored. The steps in the program were so minute that it became less and less necessary to think: the correct answers came with what amounted to dulling certainty" (Lipson, 1962, p. 1).

Finally, Skinner confidently concluded:

"The learning process is now much better understood. Much of what we know has come from studying the behavior of lower organisms, but the results hold surprisingly well for human subjects. The emphasis in this research has not been on proving or disproving theories but on discovering and controlling the variables of which learning is a function. This practical orientation has paid off, for a surprising degree of control has been achieved. By arranging appropriate 'contingencies of reinforcement', specific forms of behavior can be set up and brought under the control of specific classes of stimuli. The resulting behavior can be maintained in strength for long periods of time" (Skinner, in Lumsdaine and Glaser, 1960, p. 140).

Hilgard surveyed the ensuing research findings and countered:

"It has turned out that Skinner's confidence, while important in promoting programmed learning, has not been fully sustained. This follows in part because the analogies he uses are just that, and do not necessarily represent identities in the processes involved in Skinner boxes and programmed learning . . . .
I believe that the advances made in programmed learning have been based very little upon a strict application of learning theory, regardless of what devotees of the different theories may assert" (Hilgard, 1963, p. 136).

These problems concerning linear programed instruction can be thought of as part of a broader educational question, that of timing: "when to introduce information into complex, verbal learning to achieve optimal retention, and transfer" (Wittrock and Twelker, 1964, p. 10). Is it, as the linear programed instruction originators stress, most effective and efficient to provide a highly redundant discourse such that there is a high probability that the student will respond correctly, and then follow immediately with the correct response? Or should one provide a less redundant narrative such that the student can only offer a tentative response and then give him the correct answer? Would one strategy be more dependent upon the information within the correct answer than the other?

Moreover, are there different strategies of information introduction for particular classes of learners? It is conceivable that the bright student would flourish with less redundancy and more difficult discourse, whereas the slow student would be retarded.

Learning is also not without its affective counterpart. Are there concomitant affective states which accompany these information strategies that tend to enhance or interfere with the cognitive functioning? Would the less redundant narrative, being more difficult, produce a level of anxiety which would seriously disable certain classes of learners?

The already complex situation is further complicated by the multiple outcomes of instruction, long-term retention and application. Is there a single strategy, using the linear programed instruction framework, that is most effective in attaining both outcomes?
Finally, is there a single position, such as Skinner's, that can provide a rationale for generating solutions to the foregoing questions? Hilgard obviously feels that none of the major positions, that of Skinner, the contiguity theory of Guthrie, or cognitive theory, can account for the "advances made in programmed instruction." If so, is there an amalgam of these positions which would serve as a model for producing operating principles, for programers as producers, and students and teachers as users?

Rationale and Purpose of the Study

Lumsdaine views the problem in these terms and asks "whether a science of instructional programming, dealing with intermediate-level principles, needs to be developed as such - or whether implications for a more general learning and behavior theory can ultimately suffice as a foundation from which a technology of practical programing principles can be derived" (1964, p. 393). He continues to present a rationale for the development of these intermediate-level principles by reasoning that:

"The need for and probable character of such intermediate level principles in the development of a science of instruction rests in part on the proposition that, in view of the complexity of human learning and the diversity of human learning tasks, we can expect to find relatively few universal generalizations that hold for all classes of instructional objectives, all classes of learners, and all conditions of instruction. Rather what is likely to be most needed is a series of 'contingent' generalizations that take account of the interactions of variables. Experimentally, this position argues for factorial experiments in which two or more variables are studied in combination so that qualifications on a generalization can be determined, and we may validate contingent generalizations of the form: Under condition A, result one is obtained, whereas under condition B, result two is obtained" (p. 394).
The rationale for the present study followed this same line of reasoning and accepted the development of a series of contingent generalizations as its objective. The study proposed a factorial experiment which investigated the interactions between varying amounts of information prior to and after the overt response. Accordingly, these interacting effects were studied with different classes of instructional objectives and different classes of learners.

There might come a point, however, at which some creative individual will combine the principles which this and other research are suggesting into a set of higher-order generalizations - a general learning and behavior theory. As Schramm says, "perhaps this is the curve of progress we should expect - many small advances, resulting, over time, in the accumulation of insights to the size of a critical mass" (1965, p. 9).

Choosing the Classes of Learners

The choice of what classes of learners to sample, as in most educational research, was not really a choice at all. One accepts what is available. Fortunately, four seventh grade teachers in the Lansing schools were interested and willing to try out the experimental materials within their classes. Obtaining seventh graders as a target population was an unexpected surprise, for the achievement range within this grade is quite large. The wide range of achievement made it possible to survey effects of the factorial combinations over three fairly well-defined levels.

Choosing the Classes of Objectives

If we can expect a separate set of contingent generalizations for different educational objectives, then the choice of these objectives and their corresponding criterion performances was crucial. Too few studies have made an attempt to clarify these separate outcomes.

Bloom et al. (1956), in their analysis of these various objectives in the cognitive domain, provided a workable framework for separating outcomes.
The authors' classification system identified knowledge, comprehension, application, analysis, synthesis, and evaluation as possible categories. Only the first three objectives, knowledge, comprehension, and application, were judged by the teachers in our sample as applicable to the content to be conveyed to their classes. Since the program content was primarily concepts and principles rather than facts, we combined the knowledge and comprehension categories into one instrument. A second instrument would be prepared to assess the recognition and application of principles within new situations.

Settling upon multiple criterion tests opened the possibility of identifying contingent relations for both outcomes within the one experiment. It also offered to provide evidence regarding the supposed hierarchical relationship of these categories.

Other evidence within the literature pointed to the need for an assessment of long-term retention as a relevant educational objective. Certain of the step size x feedback variations of the program might possibly result in greater retention. It was, therefore, decided to readminister the criterion tests during the following fall, using the summer vacation as an interim period.

Krumboltz (1963) introduced more intriguing debate when he contended that program error rate was also a dependent variable. He maintained it was dependent upon the various inter- and intra-frame cue manipulations. What, then, is the relationship between error rate and the other criteria - knowledge and comprehension, application, and retention?

The common term for the decoding and encoding of content is acquisition. Is error rate a measure of program acquisition? On the one hand, the answer may be affirmative, in that program frames so closely resemble test items that it could be said that a student making very few errors in working through the program has acquired its content.
On the other hand, the answer may be negative, since some would claim that if a student really had acquired the content, he would be able to perform well on a separate knowledge criterion test, even though the test items were almost copies of certain, selected program frames. Still others might say that he really has not acquired the content until he could demonstrate that he had retained the material for some specified length of time. The investigator, consequently, chose to study all forms of acquisition, independently, and in relation to one another.

Theoretical Position and Hypothesis Development

The informational strategy of linear programed instruction calls for a repetitive cycle consisting of: presenting information, leaving one or more words out of context to be filled in by the learner, and immediately offering the missing words. This series is continuously repeated in a sequence which either follows a planned pattern, e.g., ruleg, conversational chaining, or a most logical sequence, that is, logical for the programer.

In this seemingly simple informational cycle are all the basic features of the learning process: felt need (to learn), perceived goal (the correct answer), an increase in tension (problem of generating the missing word or words), activity of the provisional try (writing in the perceived words), and feedback (both knowledge of the correct answer and relief of the tension or reward if correct).

The original basic strategy, borrowed from operant conditioning, called for enough prior information to assure a probability of less than one in ten that the learner would make an incorrect response. The learner's response must be written and the correct response must be shown immediately following the overt act. The words selected for the blanks or responses must be crucial to the information content of the program.
Not long after the advent of this strategy in program form, reports of discontent were heard: the frames were boring, writing the response seemed a hindrance, the correct response was being ignored. To explain these occurrences and offer a hypothetical solution, the following set of statements is advanced:

1. If the prior information is so redundant as to make the missing word obvious to the point of certainty, a) there is no increase in tension, hence no challenge, and boredom results; b) there is no need for the information within the feedback; and c) since there is no tension, the feedback loses its reward value as well.

2. If, however, the prior information were to be controlled so that there was a lack of certainty in selecting the response and an accompanying arousal of tension, a) the boredom would disappear; b) the feedback would partially maintain its information value; and c) the choice of a correct response would retain its rewarding function with its reduction of tension.

3. The prior information should not be so precise or so vague that there is no possibility, or only a small probability, that the student will discover the missing words. In such a case, a) too great a tension level is created, b) the information in the feedback is not seen as relevant, and c) its reward value is lost.

Thus, a program whose prior information presentations might evoke both response uncertainty and slight tension would result in:

1. More efficient content acquisition due to the reduction of prior information redundancy.
2. Assurance that the student completes the full cycle of the learning process, which would result in more effective immediate and long-term retention of the program content.
3. A greater probability of problem-solving processes, as yet undefined, being activated, which in turn might transfer to tasks where application of content principles is needed for solution.
4.
Maintenance of positive effect because of the continuous tension arousal and reduction Inherent in this version. The following research hypotheses and questions developed from these suppositions and related findings of other studies are suggested: In terms of the comprehension of concepts: Providing knowledge of correct response does not increase the effectiveness of small step size programs for students at all achievement levels. Providing knowledge of correct response does increase the effectiveness of moderate step size programs at all achievement levels. Providing moderately difficult frames with knowledge of correct response will be more effective than any other combination of step size and feedback for all achievement levels. Providing moderately difficult frames with knowledge of correct response will reduce the boredom or "pail" effect among the upper and middle third achievement levels. ID The following questions evolved from consideration of other instruc- tional problems that had not been a part of the original rationals. I. How adequate is the subjective confidence rating the learner gives to his frame response as an indicator of size of step? 2. What effect will programs of varying step size and feedback have upon the learner's ability to apply the concepts and principles within these programs to similar situations? 3. What effect will programs of varying step size and feedback have upon the learner's ability to retain the program concepts over a long Interval of time? 4. What relationship exists between prior achievement, program achievement, and the ability to apply the program concepts and principles to similar situations? 5. What effect will programs of varying stepslzs and feedback have upon the level of boredom among the below-average achievers? Re d e It was Initially hoped that the evidence resulting from the study would be applicable to the more common student-teacher instructional interaction. 
Upon further analysis, there arose a number of distinct differences between the two instructional patterns which make any such inferences quite strained. The following list attempts to relate these major distinctions:

    Student-Teacher Interaction              Linear Programed Instruction
1.  Teacher paced                            Student paced
2.  Few students respond                     All students respond
3.  More or less sporadic sequence           Empirically tested sequence
4.  Calls for listening comprehension        Calls for reading comprehension

As a result, the study restricts its remarks to situations which relate to the task and methods of linear programed instruction.

The discussion within Chapter II centers upon reviewing and relating the findings of other research projects to the theoretical variables and framework just presented. The discussion contains those studies which led to the original hypothesis formulation and also those pertinent studies which have been published since the time of the original proposal.

A further clarification of the methods used to put into operation the main variables, step size and feedback, plus the major procedural decisions, is reported in Chapter III. Specific manipulative controlling techniques are explained, as well as those in the basic design. A rationale is offered for the statistical analysis that follows in the next chapter.

The specific findings and decisions concerning the theoretical hypotheses and questions are reported in Chapter IV. Summarizing tables are used, wherever possible, to aid in the interpretation of the multiple outcomes. The relation of the results to the theoretical framework is discussed.

The report ends in Chapter V with a summary, conclusions, and implications for further research. Appendices and a bibliography follow. The appendices contain graphs of the two-way interactions of the dependent variables, a student by item confidence rating matrix, a sample affective questionnaire, and sample prior achievement, knowledge, and application tests.
CHAPTER II REVIEW OF THE LITERATURE

The review of the literature will concentrate upon those studies which led to the development of the original theoretical framework. Projects published subsequent to the initiation of the project are included and their relation to the original proposal is explained.

Step Size Definition

The important principle that mastery of a complex subject should be built up by fairly small steps seems no more than common sense. But when Skinner advocated it in conjunction with his other programming rules, the educational world took notice. It was some time before those who accepted these principles realized the inherent complexity in operationally determining the "size" of the steps toward task mastery. Simply calculating the per cent error rate for a program frame and checking this percentage against some pre-determined criterion was little help. It was the actual manipulation of frame content that caused the difficulties. As a result, "size of step" began to take on various definitions, for example, the number of words per frame, or the complexity or length of the response.

The meaning associated with error rate, that of degree of difficulty of giving the correct answer, was generally accepted. Yet, as Lumsdaine (1959) cogently pointed out, this notion of degree of difficulty gave step size a duality of meaning. Difficulty level was a dependent variable, dependent upon the cue manipulation within the frame and the learners on which the program was being validated. The error rate percentage, therefore, was only one indication of the complex interaction between frame content and learner repertoire.

Little progress was made in clarifying this issue. This led Deterline (1963) to state that "no one has yet satisfactorily defined step-size" (p. 6). Smith's solution to the problem was to abolish the term and begin an experimental analysis to "identify the behavioral components of that class of behavior now generally titled 'step-size'" (1963, p.
1). Earlier, Shay (1962) had developed a cumulative probability estimate for a given item based on the error rate of all the items in the program. His method, although mathematically and operationally correct, did not help explain the "class of behaviors" that the student performs in working through a sequence of items.

Jacobs (1963) suggested that someone explore the use of a subjective probability estimate of the difficulty of a given frame for a particular learner as a way of assessing this elusive step-size. To determine the reliability of this subjective measure, he surveyed the literature and concluded that "In general, ratings of difficulty, ratings of confidence, and measures of correctness correlated fairly highly in the expected directions" (p. 35-6). Johnson, in a discussion of subjective ratings, said "In all probability, ratings of the difficulty of judgments and the confidence of these judgments may be antonymous designations for the same variable" (1955, p. 168).

One objective of the present project was to determine a reliable method for measuring step-size by investigating a method of assessing the subjective confidence the learner has in his response to a given frame. The confidence measure, rather than a subjective probability, had been selected on an intuitive basis as being an easier judgment for seventh grade students to make.

Suppes (1964) suggested the use of response latency as a method for defining frame or task performance. He discovered a latency decrease over repeated presentations. Brooks (1964) found median response latency to covary with error rate and concluded, "Latency efficiently tells something about all frames -- not just those missed. Error points up excessively difficult frames; latency can indicate as well, those which are too easy. Latency can guide toward greater efficiency or uniformity of task loading, perhaps to arrangements of difficulty which improve student motivation.
A quantified measure, the variance of latencies, may index the degree of ambiguity. . . Error and latency data can be complementary aids in developing experimental materials" (p. 4-5).
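The two latency statistics named in the Brooks quotation, the median and the variance of response latencies, can be illustrated with a brief sketch. This is an illustration only and does not reproduce Brooks's procedure; the function name, the frame labels, and the latency values (in seconds) are hypothetical.

```python
import statistics
from typing import Dict, List, Tuple


def latency_summary(latencies: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Return (median, sample variance) of response latency for each frame."""
    summary = {}
    for frame, times in latencies.items():
        if len(times) < 2:
            continue  # sample variance needs at least two observations
        summary[frame] = (statistics.median(times), statistics.variance(times))
    return summary


# Hypothetical latencies in seconds for two frames.
sample_latencies = {
    "frame_a": [4.0, 5.0, 6.0],   # learners respond at a similar pace
    "frame_b": [2.0, 2.0, 10.0],  # one learner takes far longer than the rest
}
latency_stats = latency_summary(sample_latencies)
```

With these assumed values, frame_b has the shorter median latency but a much larger variance, the kind of spread that the quotation suggests may index ambiguity in a frame.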
Effects of Step-Size Variation

Although this research is primarily dedicated to the contingent relationship between step-size and feedback, it is important to review those studies reflecting either of the variables as an individual source of criterion variation.

With regard to step-size, the results have been contradictory. Coulson and Silberman (1960) and Campbell (1961) found students working with the small step version scoring significantly better than those using the large step version. Smith and Moore (1962), Briggs (1958), and Shay (1961) found no significant differences. To complicate matters, Goldbeck and Campbell (1962) and Evans, Glaser, and Homma (1960) both obtained results suggesting that some intermediate level may be optimal. All of these studies provided feedback in the form of the correct answer. The possibility of some optimal level above the required ten per cent error rate criterion provided one of the major ties between the literature and the project rationale.

Krumboltz (1963), in a series of clever and comprehensive experiments, attempted a systematic investigation of the effects of various methods of varying step size on error rate and criterion performance. His general thesis was "that [frame] difficulty level is a dependent variable, not an independent variable, and may vary directly or inversely with criterion performance depending on a number of independent variables which influence both difficulty level and criterion performance" (p. 1). This conception sheds a much different light upon the problem of step size variation. In other words, one must be concerned with what it is he is manipulating within the frame, the independent variables, and be aware that step size or difficulty level is an outcome just as are other criterion measures.
These manipulative variables, for want of a better classification, could be split into those that work within a frame (intra-frame) and those which manipulate entire frames with respect to the surrounding frames (inter-frame). One could also think in terms of adding or eliminating information, either intra- or inter-frame.

Krumboltz's basic attack seemed to be one of adding information intra-frame. Adding irrelevant information to already small step frames increased error rate, while adding clues and hints decreased error rate, as expected. Neither addition, however, affected criterion performance. The basic information, although disguised, remained. Increasing both error rate and criterion performance was accomplished by asking the students to discriminate between plausible alternatives in a multiple-choice type of program. These results were difficult to compare with findings from the other studies because the task was one of recognition rather than recall. The basic information was, however, supplemented by relevant information within the plausible alternatives. This relevant information probably provided more examples of concepts, repeated more facts, etc., all of which might be expected to increase both kinds of performance.

The present study developed one form similar to Krumboltz's cueing or prompting version in the attempt to construct an easier variation than the basic small step program. In making the program more difficult, both intra- and inter-frame redundancy was reduced by removing information, in direct contrast to Krumboltz's addition of irrelevant information, more complex steps, and less familiar terms.

Klaus (1964), while studying the relationship of step-size, error rate, and achievement, defined step-size without reference to learner behavior. Four components (response, cue, context, and enrichment) were used to differentiate frames.
Eight indices, four intra-frame and four inter-frame, were developed which employed these four components. A set of procedures, not explained, was used for manipulating frames in the two programs selected for the study. These program frames were analyzed according to a complicated normative system based upon the above indices. Since it was not fully explained, it was impossible to fit his method of information introduction into the overall attack as developed by Krumboltz.

The three step-size versions of both programs had over 700 frames, with 192 subjects participating in the study set up by Klaus. The subjects were divided into three levels by their scores on an intelligence test. Each student was assigned at random to one of the three step-size versions, which resulted in a balanced 3 x 3 (intelligence x step-size) factorial experiment. "No version produced an error rate of over 20%, even among low ability students" (p. 3). Step-size had, however, a significant effect upon the error rate recorded for both programs. Ability level was a significant variable with respect to error rate on one program and not on the other. Klaus gave no indication as to why this difference between programs might have been expected. Both ability level and step-size were significant factors in varying the time to complete the programs.

Klaus failed to find any step-size effects with respect to both proficiency and transfer tests. Ability level did, as one might expect, exert a significant effect upon both criteria. There was, however, no interaction between the two main factors. Step-size variation did not have a differential influence over ability levels. "In summary, the principle findings from the study are that step-size affects error rate, but does not affect achievement and that, when ability level is controlled, error rate and achievement are not significantly related" (p. 5).
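The general shape of a balanced ability-by-version design of this kind can be sketched as follows. Since Klaus's actual assignment procedure was not described, this is only an illustrative assumption: subjects are ranked on a hypothetical ability score, split into thirds, and dealt at random to three versions within each third. All names, labels, and scores are invented for the example.

```python
import random
from collections import Counter
from typing import Dict, Sequence, Tuple


def assign_balanced(
    scores: Dict[str, float],
    levels: Sequence[str] = ("low", "middle", "high"),
    versions: Sequence[str] = ("small", "moderate", "large"),
    seed: int = 0,
) -> Dict[str, Tuple[str, str]]:
    """Map each subject to an (ability level, step-size version) cell."""
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)  # lowest score first
    per_level = len(ranked) // len(levels)
    assignment = {}
    for index, level in enumerate(levels):
        start = index * per_level
        # the last level absorbs any remainder left by integer division
        end = start + per_level if index < len(levels) - 1 else len(ranked)
        group = ranked[start:end]
        rng.shuffle(group)  # random order within the ability level
        for position, subject in enumerate(group):
            assignment[subject] = (level, versions[position % len(versions)])
    return assignment


# Hypothetical scores for 18 subjects, giving two subjects per cell.
sample_scores = {f"s{i:02d}": 80 + 2 * i for i in range(18)}
sample_assignment = assign_balanced(sample_scores)
cell_counts = Counter(sample_assignment.values())
```

Cell sizes come out exactly equal only when the number of subjects in each level is a multiple of the number of versions; otherwise they differ by at most one.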
It was disappointing that Klaus did not specify his methods of increasing step-size so that one could compare his results within the framework developed by Krumboltz. Furthermore, the error rates for the three versions, although significantly different, did not vary greatly. For example, the mean error rate for the largest step-size version for one of the programs was only 9.4, ranging from 7.6 to 12.5 among ability levels. Also, the subjects were considerably above average, with a mean IQ of 117. Greater heterogeneity among the abilities of the learners would undoubtedly have increased error rates, thus making differential main effects and interactions possible.

The present study developed a greater step-size variation over a different age level and a more heterogeneous class of learners, seventh graders. Also, the effect of feedback variation was added.

Intelligence and Prior Achievement

In the early literature in linear programed instruction, Porter (1959), Ferster and Sapon (1958), and Shay (1961) found little relationship between aptitude or intelligence and program criterion achievement. Further studies (Reed and Hayman, 1962; Lambert, Miller, and Wiley, 1962) using longer and more difficult material produced contradictory results in which ability was the major determining factor. Carroll (1964) found aptitude to be highly related to criterion performance (r = .75) in a study where time to complete the program was held constant. Klaus (1964), as discussed earlier, reported consistent significant differences in criterion performance resulting from initial ability levels.

Eigen and Feldhusen (1963) made the most comprehensive study, by assessing the zero and first order partials for IQ, pre-test achievement, reading skills, criterion achievement, and transfer.
Their general finding was that "reading ability and IQ, while initially correlated with learning, were found to be less essential in accounting for post-test variance than general achievement level prior to work on the programs" (p. 385).

Gotkin and his workers (1964), in a study using seventh grade students quite similar to the present study, maintained that "individual differences in pre-test knowledge and cognitive capacity, dictate entirely different programs for different students in the case of most subject matter. . . Even with extensive revisions there is no guarantee that the slow students could attain terminal behavior, and the resulting program would no longer be the one that appealed to bright students" (p. 4). Gotkin found the culturally disadvantaged students unable to make use of the syntactical prompts, feedback, and redundant frames. He finally had to conclude, "For most of the children, our findings indicate that for seventh graders three years retarded in reading, the ability to learn abstract concepts from printed materials is limited" (p. 4). The solution, he felt, was to develop materials based upon Piaget's concept of developmental stages.

Feedback

Numerous methods of providing information after the learner's response have been labeled feedback. The term was borrowed from electrical engineers, who use it to describe any electrical impulse that serves to regulate the parent system. This lack of a sharply defined meaning for the term feedback has inevitably led to equivocal findings. Such diverse tasks as target shooting, concept formation, attitude change, and programed instruction fall within the domain of feedback studies and, therefore, share related results.

Since this study is concerned with programed instruction, an example of highly meaningful verbal learning, the literature review will only touch upon those related studies needed to provide a rationale for our analysis.
A distinction first has to be made between the separate, but theoretically dependent, affective and cognitive components of providing feedback. Thorndike's (1913) "law of effect" and Skinner's "reinforcement" (1954) both seemed to center on the reward or affective component of feedback. The effectiveness of a large number of studies, ranging from pigeons playing ping-pong to the learning of paired-associate lists, has been attributed in part to the giving of immediate reinforcement.

It was also recognized that providing knowledge of results was cognitive in nature. The word "knowledge" obviously referred to a capacity to clarify, correct, and confirm one's choice. The cognitive component made its way into the literature through the training methods research dealing with perceptual motor skills. Ammons (1956), summarizing this literature, found few exceptions to the generally accepted rule that knowledge of results facilitated performance. The tasks, however, were such that the learner's response was primarily a provisional try, a situation where learning the degree and direction of his missing the target was obviously invaluable. The congruity between these tasks and meaningful verbal learning has not been clarified.

Each of these areas, with the main impetus from Skinner, led to the generalization of the feedback principle to programed instruction, even though there was awareness of the task dissimilarity. Smith and Moore (1961), as well as Hough and Revsin (1963) and Feldhusen and Birt (1962), found no significant differences between programs which did and did not provide feedback. Furthermore, Krumboltz and Heisman (1962) and Lambert (1962) found no differences when they varied schedules of reinforcement or feedback throughout the program.
Hough and Revsin (1963) offered the explanation that "when students 'know' that their response is right and thus assumedly reinforce themselves, it would seem reasonable that further confirmation in the form of a reinforcement frame would be redundant" (p. 290).

Angeli and Lumsdaine (1959) had earlier demonstrated an interaction between the number of prompting trials and the form of feedback, or confirmation/correction panel, as they called it, in a paired-associate task. They varied the number of prompting trials, 1, 2, or 3 (which is analogous to manipulating step size), and the method in which feedback was given. The feedback variations were right-wrong, giving a correct and incorrect response, and providing only the correct response. They found a highly significant effect due to prompting trials, while the significant effect due to feedback variations occurred only in the case where no prompting trials were given. Feedback, in any form, became inoperative after two or more prompting trials. In other words, increasing the number of times the subjects experienced the pairs decreased the need for feedback. The extension of these results to redundant cueing within the context of a small step-size program, and its corresponding relationship to feedback, provides one of the major generalizations of the study.

Smith and Moore (1962) and Goldbeck and Campbell (1962) also conjectured that there may be a relationship between step size and feedback. Goldbeck et al., in reviewing their studies in Coulson (1962), maintained that: "There was too little challenge in the easy items for the overt responding and the formal feedback to become effective. When moderately difficult items were encountered, however, there may have been an increase in motivation and implicit activity associated with the effort for response, with a concomitant increase in the value of the feedback.
The increase in the value of the feedback for moderately difficult items appeared to occur whether the overt response was correct or incorrect during the learning trial" (p. 88).

The possibility that there might be an optimal point at which the step size and the accompanying feedback might function for the most effective learning in school subject matter is the basic question of the present study. The increased motivation resulting as a concomitant "side effect" is discussed in the section on affective outcomes.

Further support for the theoretical position was given by Ausubel (1963), who stated: ". . . feedback is not generally indispensable for learning, but, on both motivation-reinforcement and cognitive grounds, should facilitate the learning process, more so in the case of rote than of meaningful learning. However the research evidence tends to be equivocal, particularly in relation to programed instruction, because of the failure to control other relevant variables. Further compounding the difficulty of interpreting the effect of feedback on meaningful programed learning, is the fact that both low error rate and the possibility of implicit feedback reduce the facilitating potential of explicitly provided feedback" (p. 208).

Finally, Wittrock and Twelker hypothesized and supported the principle that feedback "enhances learning, retention, and transfer, when the information it contains is not greatly redundant" (1964, p. 10) in a problem-solving task involving deciphering ten transpositional cryptograms. They also found an interaction between the amount of direction given within the problem and the amount of feedback. They stated that: "It appears that non-redundant information in the form of knowledge of correct response added to a minimally directed situation enhanced learning while redundant knowledge of correct response added to an already prompted situation, did little to learning, retention, and transfer" (p. 17).
This study was reported after the initial proposal and represents an intermediate link from Angeli and Lumsdaine's paired-associate task finding to the proposition that this interaction is also apparent within the learning of school subject matter.

Other Cognitive Outcomes

Long Term Retention

A retention test of some form has been part of many studies in programed instruction. Strong (1963) summarized these by stating, "Most of the research suggests that programed instruction maintains its superiority over time but it is not superior in terms of percentage retention" (1963, p. 226). Krumboltz and Heisman (1963) demonstrated some contrary evidence, as they found greater retention occurring when students were forced to give overt responses over a two-week interval.

Application-Transfer

Gagne and Dick (1962) included a transfer test in a study attempting to "measure and define the nature of 'what was learned in a teaching-machine program on solving simple algebraic equations of the first order'" (1962, p. 10). They were attempting to demonstrate Gagne's contention that the success at a higher task level was primarily dependent upon the acquisition of specific and relevant lower order tasks. The students performed very poorly on the transfer test, scoring a mean of 2.07 out of a possible total of 50. The restricted amount of variance about the mean of 2.07 made the reported correlations between transfer scores and the other variables questionable. Wendt and Rust (1963) demonstrated the use of pictorial frames as beneficial for transfer to the real life situation.

Klaus (1964), studying the effects of varying step size over ability levels, reported no significant differences between transfer scores due to step-size. His step-size variations were not very large, however, with only one version having an overall mean error rate over 10 per cent. He did find significant differences resulting from the initial ability placement.
Unfortunately, correlations between ability, post-test, and transfer scores were not reported.

Eigen and Feldhusen, in a series of studies reported in DeCecco (1964), found evidence for Gagne's hypothesis that "achievement at one stage is a principal determiner of success in learning at a next higher stage of learning new material" (p. 384). In studying the interrelationships among reading ability, IQ, post-test acquisition, and transfer, they stated that "It is inferred from the consistently high correlations of acquisition and transfer scores that successful 'performance' in a new class of tasks for which the learner must make adaptations is primarily dependent upon mastery or achievement of a subordinate set of learning tasks as measured by the acquisition tests" (p. 384).

Affective Outcomes

"As I proceeded, turning pages and writing down answers, the novelty quickly wore off, and I found myself growing increasingly bored. The steps in the program were so minute that it became less and less necessary to think; the correct answers came with what amounted to dulling certainty" (Lipson, 1962, p. 2).

Affective reactions to programed instruction have been reported which vary from the ridiculous to the sublime (Naumann, 1962; Roth, 1963; Eigen, 1963). Among the former have been the recurring statements which refer to a form of boredom or monotony which is said to result from a highly redundant information rate. This deleterious side effect has been given the unenviable label of the "pall" effect.

Goldbeck suggested in an earlier quote that "when moderately difficult items were encountered, however, there would be an increase in motivation." McClelland, Atkinson, et al. (1953), in their classic study of achievement motivation, posited this theoretical explanation for the occurrence of positive and negative affect, which reads as though it were an attempt to define the situation stated by Goldbeck.

"If the expectation is of low probability [large step-size], then confirmation should produce negative affect as in 'fear of the strange.' If they are of moderate probability [moderate step-size] precise confirmation should produce pleasure as in reading a detective story or playing solitaire. If the expectations are of high probability [small step-size], precise confirmation produces boredom or indifference" (p. 87).

This explanation provides the theoretical rationale for the hypothesis of reduced boredom or "pall" effect from the moderate step-size versions of the programs for the above average and average groups. What will transpire with the below average group, which had little confirmation throughout their schooling, remains a question.

In a related study, not using programed materials, Chansky (1964) found that "attitudes students express about learning seem to be related to the acquisition phase of learning but not to the retention phase" (p. 99). Students, given four methods of instruction for learning a given task, rated those giving continuous feedback as most interesting and intermittent grading as more worry-provoking. The less provoking methods did not, however, result in better retention. Intermittent grading was most effective and efficient in terms of retention. The comparison to step-size and feedback variation as producing affective change, and its effect upon cognitive outcomes, hopefully, is obvious.

Eigen and Feldhusen, as reported in DeCecco (1964), found "that students' attitudes toward the program are not generally correlated with their success in learning or transferring from the program" (p. 385). No interpretations were given as to why the correlations were not discovered.

Summary

The definition and measurement of step-size has been a constant source of trouble for those advocating linear programed instruction.
The assessment of either frame or program error rate offered some information as to the relative difficulty, the most common meaning, but little understanding of the complex interaction between student and frame cues which produced the response. Lumsdaine, as early as 1959, promoted a concentrated attack upon the cue manipulation intra- and inter-frame to offer some evidence as to the best methods for creating small step size. Jacobs (1963) called for a better measurement, in the sense of giving more information than the dichotomous right-wrong. He suggested the use of response latency, difficulty ratings, and a confidence estimate. Suppes (1964) and Brooks (1964) experimented with response latency and found that it complemented the dichotomous error rate. The present study investigated the use of a confidence estimate.

Contradictory results were found regarding those studies dealing with step-size effects. This is not surprising when one considers the many ways in which step-size can be defined and varied. Krumboltz (1963) considered the error rate meaning of step size to be a dependent variable and conducted a number of experiments carefully manipulating frame characteristics. In most cases, he added different forms of information to the already low error rate program. This additional information generally influenced error rate but not criterion performance. In contrast, the present study both added and removed intra- and inter-frame information.

Klaus (1964) talked of a complex operation to manipulate step size without reference to the learner. Unfortunately, he did not specify exactly how the information was removed. He found, however, that his manipulations resulted in only small error rate differences. These differences were significant but had no effect upon criterion performance.

Early studies, dealing with a program having a small number of frames, revealed no interaction between program variables and student ability.
Later, Carroll (1964) controlled for time to complete the program and found a large criterion-ability correlation. Eigen and Feldhusen (1964), however, continued the attack and found both IQ and reading ability washing out when prior achievement level was correlated with criterion performance.

A study by Gotkin (1964), which discussed the unique problems of programing for culturally deprived seventh grade students, was reviewed because its sample was similar to that of the present study. His difficulties and conclusions mirrored those of the present study.

The concept of feedback has experienced much the same semantic confusion as has step size, mainly due to its dual meaning. Feedback inherently has both a cognitive, informative value and an affective reward value. Many studies purporting to make general statements concerning feedback, however, have not recognized the duality.

Studies emphasizing the cognitive aspect have centered upon difficult tasks, difficult enough that the student's first responses were termed "provisional tries." This situation is quite unlike the relative certainty one has in responding to small step program frames. Researchers have used this explanation when they began to find that withholding the correct response had no adverse effect upon criterion performance.

Angell and Lumsdaine (1959) found a prompting trial confirmation interaction in an experiment using a paired-associate task. The question as to whether a concomitant effect could be demonstrated within complex meaningful verbal learning was dropped by Goldbeck (1962).

Programed instruction's effectiveness in bringing about change in long-term as well as immediate retention of knowledge has yet to be firmly demonstrated. Only Krumboltz and Weisman (1963) have found some indication of greater retention with overt responding, within a two-week interval. The present study investigates the amount of retention over the typical summer vacation.
The ways to facilitate transfer or application of concepts and principles to new situations are also little understood. Eigen and Feldhusen (1964) did find evidence for Gagne's contention that prior achievement and program acquisition are correlated with transfer performance.

A boredom called the "pall" effect, occurring from the small-step monotony, has received much attention as an outcome of small step-size programs. Chansky (1964), however, found no deleterious criterion performance effects due to worry-provoking methods. Eigen and Feldhusen (1964) found similar results from a long-term study in programed instruction, that is, no correlation between attitude and performance. Goldbeck (1962) suggested that more difficult items might have a positive influence upon motivation. His hunch is backed up by a number of psychological studies, most vividly by McClelland and Atkinson (1953).

CHAPTER III

PROCEDURES

Overview

The chapter begins with a description of the sample and the research setting, followed by an elaboration of the role of the four science teachers. Next, the selection of program content and step-size estimation using the confidence ratings are discussed. The results and conclusions surrounding the use of the confidence ratings are then presented, which lead to the final development of the step-size variations. Criterion instrument development techniques and reliability estimates are given. Finally, the design and statistical analysis are explained.

Sample

Permission was granted and assistance given by the Lansing, Michigan school officials to carry out the study at West Junior High School of that city. Four seventh grade science teachers, in particular, expressed interest in the use of programed materials. They were especially concerned about meeting the needs of the large number of low-achieving students. A total of 16 classes, all but one belonging to these four teachers, were used in the study, bringing the total to over 500 students.
Although they constituted neither the total seventh grade population of 570 nor a bona fide "probability" sample, it was the teachers' judgment that those students taking part in the study were representative of the entire seventh grade.

West is the most central of the junior high schools of Lansing. It draws its students from all cultural, racial, and economic groups. A large segment of the children comes from a residential, upper-middle, and upper-class district populated mostly by professional people employed in managerial positions at Oldsmobile and White Motor Car Corporations. A similar segment of the children resides in a low- and lower-middle-class district. If anything might be considered abnormal about this particular school population, it is that it lacks the usual majority in the middle, the central peak of the normal curve. By dividing the overall group in thirds according to prior science achievement, it was possible to study each segment independently as well as the total group.

A reasonable estimate of the variability of these students on certain skills, such as reading, would be from second to eleventh grade levels. This wide variability was both a help and a hindrance. It was a help in the sense that it provided information on the program effects over all levels, and a hindrance in that developing instructional materials that were adequate for all students was nearly impossible.

Role of the Four Teachers

The four interested teachers were significantly involved in every major decision, except those concerning basic design and analysis. They received an honorarium for this work. As a group they made the major curricular decision, that of what the program content should be. Three were given portions of the basic program to revise while a fourth developed criterion test items.
Their combined judgment concerning the basic wording of the textual material, the student readiness for certain concepts, and whether or not the students had acquired prerequisite and program concepts was accepted. Most important were their assessments and aid in developing the directions for the program and criterion instruments. Each set of directions was tested before program administration to assure its comprehensibility, but it was the teachers' pre-judging that greatly facilitated the task.

What was thought to be a simple task, that of choosing the program content, was, in fact, a most difficult and important decision. Although every teacher was using the same textbook, each of their classes was at a different point in the curricular sequence. The final selection, a unit about static electricity and voltaic cells, was normally considered a topic to be covered in the following year. Fortunately, the prerequisite concepts, those under the heading of magnetism, were a part of the sixth and seventh grade content. In addition, static electricity was not covered by the seventh grade text, which eliminated much of the problem surrounding individual home study. Yet there remained the problem of readiness, that is, whether these students were conceptually ready for the abstract nature of the chosen content.

Only one nationally published program could be located which covered static electricity and voltaic cells. Two hundred and twelve frames within General Science: Sound, Sight, Electricity, and Communications, Vol. II, by Schaefer, Jeffries, Phillips, Harakas, and Glaser, distributed by Teaching Machines Corporation, dealt with these topics. The number of frames chosen was considered large enough to: 1) eliminate the short program complaints and 2) make it possible for the pall effect to occur. The immediate reaction of the teachers was that the program wording was too difficult for the low achiever.
That decision prompted a major revision centering on bolstering weak portions and removing difficult examples. Two complete revisions, each using representative students as a source of validation, were carried out to reach a satisfactory version. The first revision was tested upon a heterogeneous class of thirty students while the second was validated upon eight low achieving students. These students were excluded from the group who participated in the experimental phase of the project.

Step-Size Assessment

The adequacy of the confidence estimate as a more informative assessment of step size was tested during the revision phase. The first task was to develop a set of directions and a rating system which could be easily understood and performed even by the poorest reader. After a number of trials, mainly with low achieving students, the following graphic and verbal directions were settled upon.

    After writing your answer to each question on the separate sheet of paper, write down the number which describes how confident you are of your answer.

    0     1     2     3     4     5     6     7     8     9     10
    no idea     wild guess     good guess     pretty sure     certain

The first revision group, the heterogeneous class of thirty, was asked to work through the program and to respond both to the frame and with their confidence in that response. It was hoped that certain frames would receive clusters of confidence estimates around the middle, or five, rating. These frames would then be analyzed as to their unique cueing qualities and remain in the program. The cueing properties of the remaining frames were to be adjusted in line with these unique cueing methods. Frames receiving primarily high ratings would be made more difficult while frames receiving low ratings would be made easier.

A student by frame confidence rating matrix was generated to easily assess overall ratings. (See Appendix II) The students were ranked as to their science achievement by their instructor, while the frames were in the program order.
Each rating was recorded and circled if the response was incorrect.

Results of Confidence Rating Assessment

Although it is common in research studies to record results in a separate chapter, decisions concerning these ratings influenced the development of the step-size variations. To explain the methods finally used in manipulating step size it is necessary to discuss this phase's results here.

None of the expected clusters of ratings between three and seven appeared among any of the frames. The ratings tended to be high or low. The students were either confident they knew the correct answer, or were confident they did not know the answer. Less than ten per cent of the ratings fell into the "good guess" range. Upon further inquiry, it was learned that one bright boy rated a frame at five because he felt it was "ambiguous". Low achieving students tended to mark eights and nines in contrast to the better students, who rated mostly tens. It seemed that low students were not able to admit to "certain" even on many of the easiest items. This might have been expected, as theirs is a history of failure on tests which the programs greatly resemble.

Conclusion

This type of rating offered only limited information beyond the accumulation of errors and corresponding error rate. Therefore, the ratings proved of little use in determining the frame factors which might account for moderate difficulty levels. Nevertheless, the part confidence or certainty plays in meaningful learning is still an interesting problem that should be explored in further research.

While observing individual students, it was not uncommon to find one pondering some time on a frame, suddenly write down the answer, and rate his confidence in that answer as ten, "certain". This activity suggested that some method for determining response latency, as advocated by Brooks, Suppes, and Jacobs, seemed to be appropriate as a more differentiating measurement for step size.
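The student by frame rating matrix described above lends itself to a simple tabulation. The following Python sketch assumes each cell holds a confidence rating from 0 to 10 together with a flag for whether the response was correct. The data, the function name `summarize_frame`, and the three-to-seven "middle" band used as a default are illustrative assumptions, not the study's records or its actual procedure.

```python
from typing import Dict, List, Tuple

# One cell of the matrix: (confidence rating 0-10, response was correct).
Rating = Tuple[int, bool]


def summarize_frame(ratings: List[Rating], low: int = 3, high: int = 7) -> Dict[str, float]:
    """Return the share of middle-range ratings and the error rate for one frame."""
    n = len(ratings)
    middle = sum(1 for confidence, _ in ratings if low <= confidence <= high)
    errors = sum(1 for _, correct in ratings if not correct)
    return {"n": n, "mid_share": middle / n, "error_rate": errors / n}


# Hypothetical ratings for three frames and four students (not study data).
matrix: Dict[str, List[Rating]] = {
    "frame_1": [(10, True), (9, True), (10, True), (8, True)],
    "frame_2": [(0, False), (10, True), (1, False), (9, True)],
    "frame_3": [(5, True), (10, True), (2, False), (10, True)],
}

summary = {frame: summarize_frame(cells) for frame, cells in matrix.items()}
for frame, stats in summary.items():
    print(frame, stats)
```

A frame whose `mid_share` stays near zero while its `error_rate` varies would show the pattern reported here: ratings split between high and low, adding little beyond the error count itself.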
Since the school year was drawing to an end, it was agreed that to develop the latency method and run more trials would take the study into the summer. A more arbitrary attack on step size had to be taken.

An arbitrary but systematic way of manipulating step size was needed. First, the basic program revision was continued until the requirement of small step size, below ten per cent error rate, was met with the low achievers. More frames were added until a total of 278 with 330 separate responses were written. This final version was administered to eight low achievers and their combined error rate was twelve per cent. Twelve per cent was considered close to the ten per cent criterion, so one more effort was made. It was imperative that the original or basic program be within the accepted limits. Following Krumboltz's lead, it was decided to add the first one or two letters of the correct response within the frame blank. Different sections of frames were tested in short trial runs and found to be satisfactory for reducing errors.

Two more difficult versions were needed to sample the step size continuum. More than two would have taxed the design by increasing the number of cells and thereby reducing the number of students within each cell. It was imperative, however, to retain the basic information so that the programs were not teaching more or fewer facts, concepts, and principles. Krumboltz's attack was largely one of increasing the basic program by adding both irrelevant and redundant information. Therefore the opposite strategy, removal of words, was accepted. Any deleted words and frames had to be considered redundant or repetitive. What might be redundant and repetitive for one learner, however, might not be for another learner. Any removal of the words and frames considered redundant had to be somewhat arbitrary.

The decision was made to remove both inter- and intra-frame redundancy, thus insuring that step size would be measurably increased.
Inter-frame redundancy was reduced by eliminating both redundant frames within a group of frames that introduce new information and review frames that are far removed from the initial introduction of the information. Intra-frame redundancy was removed by eliminating selected words within the frame which serve to cue the response. Both techniques were designed to increase the difficulty of generating the correct response.

Each frame of the basic program was then classified by the experimenter as to whether it contained new information, was redundant, or was review. The strategy for developing the "moderately" difficult program, hypothesized to be both most effective and best able to demonstrate operative feedback, was to have one associated redundant frame and two related review frames removed, plus one contextually redundant word deleted from each remaining frame. The final, "most difficult" variation had two associated redundant frames and all review frames erased, plus two contextually redundant words deleted from each remaining frame. (See Table 3.1)

The underlying rationale was to reduce: 1) inter-frame redundancy to increase the pace, thus negating the possibility of the pall effect due to repetition, and 2) intra-frame redundancy to increase the individual task difficulty, thus adding to the possible tension arousal and subsequent relief. The moderate level was to maximize both difficulty and pace so as to best suit the majority of learners. The final variation was expected to be beyond the difficulty level and pace of most of the students, thus causing a frustration from little or no tension relief as well as cognitive confusion.

TABLE 3.1

EXAMPLE FRAMES FROM THE FOUR STEP SIZE VARIATIONS

Variation I. With Feedback

20. The zinc plate reacts with the solution and causes zinc atoms to leave the plate and go into the s______.
    solution

21. After these atoms in the solution leave electrons on the remaining plate, they have more protons than electrons. The atoms then have a p______ electrical charge.
    positive

Variation II. Without Feedback

20. The zinc plate reacts with the solution and causes zinc atoms to leave the plate and go into the ______.

21. After these atoms in the solution leave electrons on the remaining plate, they have more protons than electrons. The atoms then have a ______ electrical charge.

Variation III. With Feedback

14. The zinc plate reacts with the ______ and causes zinc atoms to leave the plate and go into the *______. After these atoms in the solution leave electrons on the remaining plate, they have more protons than ______. The atoms then have a *______ electrical charge.
    solution
    positive

Variation IV. Without Feedback

9. The zinc plate reacts with the ______ and causes ______ atoms to leave the plate and go into the *______. After these atoms in the solution leave electrons on the remaining plate, they have more ______ than ______. The atoms then have a *______ electrical charge.

* denotes the desired response

The number of ways feedback could be varied was also seriously limited by the number of cells in the factorial experiment. To insure adequate within-cell estimation, only two variations would be allowed. It was natural to choose the two extremes, with and without knowledge of correct response. Therefore each of the four step size variations was mimeographed twice, once with, and once without, the correct response. These combinations brought the total number of different versions to eight.

Prior Achievement Placement Exam

The theoretical hypotheses were generated in the expectation that students judged to be achieving in the upper, middle, and lower third of their class would be differentially affected by the varying step size and feedback program combinations. It was therefore necessary to administer a published or teacher-constructed test to determine in which third to place each student.
The evidence of Eigen and Feldhusen (1964), who found prior achievement to be correlated with criterion performance even when IQ and reading ability were partialled out, supported the choice of prior achievement rather than other ability measures.

There still remained the choice of whether to use a pre-test consisting of items testing the program content or one which assessed the overall progress to that point. The inherent difficulty arising from the use of a pre-test, that of differential pre-test x treatment interaction over levels, determined the choice of the overall progress test. A search through the published science achievement tests did not produce a test which corresponded with the teachers' curricular approach. There was concern over the lower achievers being able to answer some of the items. It was decided to sample equally from the best items of previous exams constructed by the four teachers plus extracting other items from the teacher's manuals of relevant texts. Forty-one items were chosen by the teachers which would not favor any class or level of achiever. The test was administered approximately two weeks prior to the program administration.

As expected, the test scores developed a wide variation, forming a platykurtic distribution which denoted the lack of the usually prominent middle group. It was therefore easy to rank the scores and divide the students into upper, middle, and lower thirds representing the desired levels of prior achievement.

Program Administration

The programs and corresponding criterion instruments were administered by the classroom teachers during the daily 50-minute periods in the last two weeks of May, 1964. Time to complete the instruments ranged from two to eight class periods before the slow students were finished. Eight students did not finish because of absence and inability to read. Students finishing early were given independent remedial or enriching study.
The teachers were given a standard set of directions to give to each class. These instructions included separate statements for those receiving each feedback form and special statements for those receiving the two more difficult forms. The teachers remarked that each class needed extra emphasis and repetition, especially the "modified" classes consisting of low achievers. The fact that everyone had different booklets was only distracting at first. The teachers also gave individual assistance when needed.

The teachers told their students that they were experimenting with a new kind of textbook to eliminate questions dealing with the nature of the task. It was not uncommon for these teachers to conduct experiments of this kind in their science classes. They also instructed the students to be prepared to answer test questions that would not, however, be graded, but would be part of the material included in the final exam. It was hoped that these conditions would produce maximum motivation without inducing outside study and other detrimental influences. There were no signs of student discontent with the task, which led two teachers, independently, to remark that "it was the quietest the room had been all year."

The teachers also kept records of the amount of time needed by each student to work through the program. These time estimates became an integral part of the data analysis.

As soon as the student finished the program, he rated himself on the continua representing the extremes of the selected adjective pairs. Upon completing the questionnaire, he was given either or both of the criterion tests depending upon the time remaining.

In October of the next school year, the school officials allowed one period, fifty minutes, for retesting the students, now eighth graders. Both criterion tests were administered but many of the students failed to complete the application portion. Only the knowledge scores were used in the data analysis.
The cognitive criterion instruments were patterned after the knowledge, comprehension, and application categories as set forth in Handbook I of the Taxonomy of Educational Objectives. Thirty different facts and concepts were culled from the program. Each member of the research team developed a set of representative items calling for either retrieval of the fact or comprehension of the concept. Items judged to be most discriminating by the consensus of the teachers were each given a weight of one to form a thirty item completion exam. (See Appendix III) An estimate of the reliability of the test, given by the Kuder-Richardson Formula 21, was .87.

The application items (See Appendix III) were developed from experiments and practical problem situations whose solutions required a knowledge of the conceptual meaning and skill in applying the principles within the program. The student was asked to name the principle involved and the correct application of that principle. Eighteen items were constructed, each worth two points, one for correct elaboration of the principle and one for the correct application. It was felt that the large achievement levels effect found in the analysis of the application score variance provided good evidence of the test's internal consistency.

Affective Questionnaire

The affective questionnaire consisted of six adjective pairs similar to those in the Semantic Differential (see Appendix III). The numbers, one to nine, were added because the low achievers had difficulty in comprehending the directions. Equal interval assumptions were made to simplify the computations.

The adjective pairs were chosen in an attempt to cover the possible relevant facets of the affective response.
The investigator searched through the literature on attitudinal response to programs and through Roget's Thesaurus before selecting the following pairs: difficult-easy, alert-careless, rewarded-punished, progress-no progress, successful-frustrated, and interested-bored. The final pair, interested-bored, was used to test the hypotheses concerning the reduction of pall effect.

Since the design called for each student to study from only one of the eight forms, he obviously was not able to make any comparative judgments among programs. The affective data were single stimulus data and therefore much weaker than if the students had been able to contrast each of the forms.

Although the specific research hypotheses dealt with individual criterion measures independently, it was possible, because of the multiple scores on each student, to consider the use of multivariate analysis of variance. The multivariate technique enables one to make decisions regarding a combination of dependent measures as if they were a single measure. The completion of these tests offered an overall view of the variance dispersion. Since it was only feasible to compute the multivariate ratios for a problem having two dependent measures on a desk calculator, only the prime variables, knowledge and application, were analyzed in this fashion.

Multivariate analysis of variance considers the dependency of the criterion test scores in making an overall test of the differences obtained from the main effects and interactions. One is not allowed, for example, to make independent probability judgments upon the main effects of feedback on application scores or knowledge scores without regard for their possible covariations.

The assumptions of this omnibus test are multivariate normality and homogeneity of variances and covariances. Oddly enough, both of the tests for the assumptions are more complex than the between means test.
Fortunately, wide deviations within the data, which would cause rejection of the assumptions, were easy to spot when calculating the variances and covariances.

The resultant variance-covariance ratios, W's, are distributed as Chi-square for a certain number of degrees of freedom. One obtains eight W's corresponding to the three main effects, three two-way interactions, one three-way interaction, and the overall between means effects.

Hypothesis 1, stated in the negative, needed a reversal, that is, the without feedback versions being more effective, to be logically substantiated. A priori hypotheses #2, l5, and ti were to be tested by specific series of t-tests within levels plus an over levels test.

The research questions called for an after-the-fact analysis of the variance components of the dependent measure associated with each question. The major technique was a three-way factorial breakdown with step size, feedback, and achievement level as the main effects along with their interactions. Some affective-cognitive correlations were also under study to gauge the possible relationships.

The scores on the placement test were ranked and divided into three groups to represent the three prior achievement levels. Starting with the top eight scores, each student was randomly assigned through use of the random numbers table to one of the eight program variations. Each succeeding group of eight ranked papers was similarly assigned until all the students were placed. The result was a two-way (4 x 2) factorial experiment with a randomized within blocks design, having sixty blocks each containing eight students. The two factors represented the four step size variations and two types of feedback while the blocks were derived from the prior achievement scores.

After the samples were selected it was decided to collapse the "blocks" into three levels for the statistical analysis in order to study the overall levels effect and the various main effect by level interactions.
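The assignment procedure just described (rank the placement scores, then randomly assign each successive group of eight students to the eight program forms) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a seeded pseudo-random shuffle stands in for the printed random numbers table, and the student identifiers, scores, and the function name `assign_within_blocks` are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Form = Tuple[str, str]  # (step size variation, feedback condition)

# Four step size variations crossed with two feedback conditions: eight forms.
FORMS: List[Form] = [
    (step, feedback)
    for step in ("I", "II", "III", "IV")
    for feedback in ("with", "without")
]


def assign_within_blocks(scores: Dict[str, int], forms: List[Form], seed: int = 0) -> Dict[str, Form]:
    """Rank students by score and randomly assign the forms within each block."""
    rng = random.Random(seed)  # assumption: PRNG replaces the random numbers table
    ranked = sorted(scores, key=scores.get, reverse=True)
    assignment: Dict[str, Form] = {}
    for start in range(0, len(ranked), len(forms)):
        block = ranked[start:start + len(forms)]  # next group of ranked papers
        shuffled = forms[:]
        rng.shuffle(shuffled)
        for student, form in zip(block, shuffled):
            assignment[student] = form
    return assignment


# Hypothetical placement scores for sixteen students (two blocks of eight).
scores = {f"s{i:02d}": 41 - i for i in range(16)}
assignment = assign_within_blocks(scores, FORMS, seed=1964)
for student in sorted(assignment):
    print(student, assignment[student])
```

Because every complete block receives each form exactly once, the forms stay balanced across the whole range of prior achievement, which is what allows the blocks to be collapsed into levels afterward.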
The optimal statistical treatment would have been to retain the sixty blocks, which would remove the maximum amount of "true" variance. The alternative procedure, however, was thought to be defensible on the basis that it would result in a more conservative estimate of error variance than would have been produced from a sampling plan based upon a randomization within levels design. Assuring that each of the eight program combinations be represented in each block, within the three levels, would have the effect of increasing the within group or error variance over that which would have resulted from the simpler randomization within levels.

Summary

The manner in which the student sample and cooperating school officials were chosen was discussed initially. Next, the role and delegated responsibilities of the four teachers in selecting the content, advising on wording, etc. were reviewed. Procedures for assessing and analyzing the subjective confidence ratings followed, concluding with the statement that the ratings offered little information beyond the dichotomous error rate. Most of the ratings were either high or low while desired or middle ratings were said to be associated with ambiguous items. More work needs to be done to explain the function of certainty and confidence in problem solving. Different kinds of information pointed to the response latency measure as the most promising attack upon step size definition.

The resulting strategies for information reduction to increase step size were elaborated upon and examples presented. Criterion instrument development and program administration details were explained. Lastly, the rationale for the design and hypothesis testing was offered.

CHAPTER IV

RESULTS

Overview

The efficiency of the factorial experiment with multiple outcomes is clearly demonstrated by the length and complexity of the following chapter and corresponding appendix.
The major form of statistical analysis is the three-way factorial source of variation breakdown of the three main factors and their interactions. The significant effects are starred in the tables, mentioned in the text, and summarized in a final table at the end of the chapter. The overall effects might be best understood by glancing at the summary table first.

The effects of the major variables upon the program measures are presented first to maintain some continuity with the actual collection and analysis. The multivariate and acceptable univariate analyses of the cognitive variables, as a group, follow. The theoretical hypotheses and questions are taken up in their original order. The variable effects upon the affective ratings, as a group, are presented next, with the appropriate hypotheses and questions being answered. An exploratory look at the more significant relationships between and among the cognitive and affective data precedes the summary of the findings.

Error Rate

The error rates have to be interpreted in light of the failure of some students to heed the directions which called for making a provisional try before looking at the correct answer. The major effect of this response peeking was to make the lower third estimates too low. (See Table 4.3) The overall effect upon the source of variance breakdown, as shown in Table 4.1, was difficult to discern.

The overall strength of the effect of the main variables in influencing error rate showed that every main effect and interaction produced a significant variation. Interaction plots are found in Appendix I, Graph #5. If the program frames were to be considered the first "learning trial", then each main variable, entry achievement level, step size, and feedback, and all of their interactions have a pronounced effect on the outcome.

Every main effect and interaction, except for the three-way, had a significant effect upon the time to complete the program (Table 4.2).
Interaction plots are found in Appendix I, Graph II. The program manipulations all resulted in varying the students' pace through the program information.

Other Program Characteristics

Other program characteristics such as density, mean seconds per response, mean total minutes to complete the program, as well as error rate, are depicted in Table 4.3. Program density is the ratio formed by the number of different responses over the total number of responses. Density measures give a quick assessment of the amount of redundancy, and, therefore, difficulty, of the program. They are unique in that they are independent of the performance of the students.

Variations I and II remained at the same density ratio, .21, because no frames were removed. Reducing inter-frame redundancy on Variations III and IV raised the density ratio to .33 and .43, respectively. A total of 217 (330-113) frames were removed, 21 (70-49) of them having different responses than the original group. These 21 items were judged to contain the same information as previous frames but called for a secondary response, usually an adjective.

A gross measure of response latency was also taken, that of the mean number of seconds taken per response. This estimate was calculated from the mean minutes to complete the program, also shown. The mean seconds per response estimates demonstrate an interesting phenomenon. The with-feedback versions of Variations III and IV were almost equal, 39 to 40, while the without feedback versions differed greatly, 43 to 59. The presence of feedback considerably decreased the number of seconds before responding on Variation IV. It is conjectured that the presence of feedback made the students "give up" faster as the anxiety of wondering about the response increased. The similarity between Variations III and IV, 39 and 40 mean seconds, seems to point to a kind of tolerance limit.
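The density ratio defined above is simple to compute, and the counts quoted in the text (70 different responses out of 330, and 49 out of 113) reproduce the reported .21 and .43. The Python sketch below shows the arithmetic. The seconds-per-response helper assumes the estimate is mean total minutes times 60 divided by the number of responses, which the text implies but does not state, and the 75-minute figure is hypothetical rather than a value from Table 4.3.

```python
def density(different_responses: int, total_responses: int) -> float:
    """Ratio of different responses to total responses in a program."""
    return different_responses / total_responses


def seconds_per_response(mean_minutes: float, total_responses: int) -> float:
    """Gross latency estimate; assumes minutes * 60 / responses."""
    return mean_minutes * 60.0 / total_responses


basic = density(70, 330)           # Variations I and II
most_difficult = density(49, 113)  # Variation IV
print(f"{basic:.2f} {most_difficult:.2f}")  # prints "0.21 0.43"

# Hypothetical mean completion time of 75 minutes for a 113-response form.
latency = seconds_per_response(75, 113)
print(round(latency, 1))
```

Because `density` needs only response counts, it can be evaluated before a program is ever administered, whereas the latency estimate depends on observed completion times.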
The without-feedback group of Variation IV took, on the average, almost 20 seconds more per response. They were willing to work longer to find the solution. This extra time could have been the major factor in the increase of application scores to be exhibited in the next section.

TABLE 4.1
THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE ERROR RATE DATA (N=336)

Source of Variation                            Sum of Squares   d.f.   Mean Square   F
Achievement Levels                                 16,515          2       8,257      120.19***
Step Size                                          55,475          3      18,491      269.17***
Feedback                                           46,008          1      46,008      669.69***
Feedback x Step Size                               20,194          3       6,731       97.98***
Achievement Levels x Step Size                     15,710          6       2,618       38.11***
Achievement Levels x Feedback                      15,289          2       7,644      111.27***
Achievement Levels x Feedback x Step Size           3,857          6         642        9.35**
Within Cells                                       21,504        313          68

*** p < .001    ** p < .01

TABLE 4.2
THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE TIME TO COMPLETION DATA (N=336)

Source of Variation                            Sum of Squares   d.f.   Mean Square   F
Achievement Levels                                 31,576          2      15,788       17.2***
Step Size                                          96,509          3      32,169       35.1***
Feedback                                           42,525          1      42,525       46.5***
Feedback x Step Size                               13,979          3       4,494        4.9**
Achievement Levels x Step Size                     20,632          6       3,455        3.7**
Achievement Levels x Feedback                       7,380          2       3,690        4.0*
Achievement Levels x Feedback x Step Size           5,378          6         896
Within Cells                                      285,342        312         914

*** p < .001    ** p < .01    * p < .05

[Table 4.3, giving the number of responses, density, error rate, mean total minutes, and mean seconds per response for the program variations, is illegible in the scanned source.]

There was a marked similarity in the between-variation increments of the density, response latency, and error rate measures. The inability to reflect intra-frame redundancy makes the density ratio a grosser measure, but the fact that it can be calculated before the program is administered recommends it highly. The expected response peeking done by the with-feedback groups seriously hinders interpretation of the error rate data. The error rates of the without-feedback versions, where there was no chance of peeking, are a more reliable measure but do not reflect the "normal situation"; that is, normal programs contain feedback. The gross response latency measure probably most clearly represents the student reaction to the difficulty of the programs. This conclusion agrees with the earlier work with the confidence rating.

With multiple dependent measures, the obvious statistical test to run would have been one grand multivariate test covering the overall variable effects on the multiple criterion measures: error rate, completion time, the seven affective ratings, the knowledge and comprehension scores, the application scores, and the retention loss scores. This omnibus test takes account of the natural covariance between criterion measures, since they are measures on the same subjects. It was impossible, at the time, to carry out the complete, overall test because of the lack of a suitable computer program. However, the two major criterion measures, the knowledge and application scores, were put to the multivariate test. Kendall (1961) provided the model for the two-measure case, which could be done in a reasonable amount of time using an electric calculator. The decisions reached within the overall multivariate tests are similar to the more common univariate ones, except that they consider all the criterion measures as one combination.
For example, we wish to make major programing decisions on the basis of the overall effects upon both objectives, knowledge and application, as though they were a single score. In the present study the overall test was used to demonstrate its relation to the individual tests, which represented the decisions needed to answer the research hypotheses. The individual hypothesis tests could have been made independently of the overall tests because of their a priori status. The general logic from the multivariate to these individual tests happened to be consistent and was reported in this manner. Of interest in multivariate tests is the relationship between dependent variables among the treatment groups. This analysis follows.

Knowledge-Application Score Correlation Graphs and Uniqueness. The overall Pearson product moment correlation coefficient between knowledge and application scores was .76. The coefficient remained quite stable over program variations (see Table 4.4).

TABLE 4.4
RELATION OF KNOWLEDGE AND APPLICATION TEST SCORES

Step Size Variation        I            II           III          IV
Feedback Version         w    w/o     w    w/o     w    w/o     w    w/o
Correlation             .76   .73    .65   .70    .66   .74    .81   .80
n                        42    40     45    39     44    38     46    44

The resulting scattergrams were unique in that their points concentrated in the lower right diagonal half (see Appendix I, Graph 12). Although the phenomenon may have been a function of the methods with which the tests were scaled, the unique nature of the knowledge-application relationship, that of knowledge retrieval being a prerequisite subskill for the more complex application items, is a possible explanation. A student would logically not be able to apply any more knowledge than he could recall; thus the application scores would necessarily be relatively lower than the knowledge scores. This artifact does not affect the interpretation of the product moment correlations, but the lack of homoscedasticity might influence some covariance measures discussed later.
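The overall coefficient of .76 can be checked against the grand totals reported in Table 4.6 (sum of X = 6,191; sum of X squared = 117,039; sum of Y = 4,573; sum of Y squared = 76,153; sum of XY = 88,497). The following Python sketch applies the usual raw-score formula for the product moment correlation. It assumes N = 384 (24 groups of 16), and the function name `pearson_r_from_sums` is illustrative, not taken from the study.

```python
import math


def pearson_r_from_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Product moment correlation computed from raw sums and sums of squares."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Grand totals from Table 4.6; N = 384 is assumed (24 groups x 16 students).
r = pearson_r_from_sums(384, 6191, 117039, 4573, 76153, 88497)
print(round(r, 2))  # prints 0.76
```

This is the kind of computation that was practical on an electric calculator, since it needs only five running totals per group.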
Multivariate Test. As an exploratory device, the multivariate test is analogous to using the overall F-test to decide whether it is appropriate to make individual comparisons. Therefore, a significant multivariate W allows the experimenter to continue his analysis by making univariate F-ratio comparisons. A significant F-ratio then enables one to calculate multiple and individual comparisons. Both multivariate normality and homogeneity of variances and covariances were assumed. The tests for both are quite complex. Observation of the score distributions and of the sums of squares and cross products demonstrated no irregularities (see Tables 4.5 and 4.6). Table 4.7 shows that the prior division into achievement levels accounted for overwhelming differences (χ² = 323.75 for 4 d.f., p < .001).

TABLE 4.5
MEANS AND STANDARD DEVIATIONS FOR THE KNOWLEDGE AND APPLICATION SCORES

                                 KNOWLEDGE                APPLICATION
Achievement Level               Standard                  Standard
and Group (n=16)       Mean     Deviation        Mean     Deviation
Upper Third
  I-1                  23.06      4.29           19.68      3.89
    2                  22.68      4.09           16.37      5.45
    3                  22.18      4.94           18.75      5.56
    4                  23.43      5.09           19.81      6.35
    5                  22.25      3.38           17.18      5.84
    6                  18.68      4.24           16.50      8.03
    7                  20.93      5.27           18.75      7.12
    8                  21.43      4.05           19.18      6.33
Middle Third
  II-1                 18.25      3.61           11.31      6.79
    2                  19.81      3.53           14.87      5.30
    3                  18.00      3.35           12.68      6.05
    4                  18.37      3.61           11.62      5.61
    5                  16.37      5.51            9.75      5.53
    6                  14.06      5.64           11.43      5.88
    7                  12.81      5.54            8.00      5.30
    8                  14.25      5.07           13.56      5.38
Lower Third
  III-1                10.43      4.50            7.62      5.15
    2                  12.00      4.52            5.56      4.40
    3                  10.81      3.76            5.68      4.27
    4                  12.00      4.52            7.06      3.89
    5                  10.00      5.15            7.06      6.16
    6                   8.87      4.18            4.06      3.86
    7                   9.00      3.41            3.87      4.24
    8                   7.88      4.73            5.37      4.48

1 = Variation I with feedback        5 = Variation III with feedback
2 = Variation I without feedback     6 = Variation III without feedback
3 = Variation II with feedback       7 = Variation IV with feedback
4 = Variation II without feedback    8 = Variation IV without feedback

TABLE 4.6
SUMS, SUMS OF SQUARES AND CROSS PRODUCTS OF KNOWLEDGE (X) AND APPLICATION (Y) SCORES

Group (n=16)           ΣX        ΣX²       ΣY        ΣY²       ΣXY
Upper Third
  I-1                  369      8,805      315      6,429     7,449
    2                  363      8,487      262      4,736     6,113
    3                  355      8,243      300      6,088     6,910
    4                  375      9,177      317      6,885     7,611
    5                  356      8,092      275      5,239     6,274
    6                  299      5,857      264      4,968     5,044
    7                  335      7,431      300      6,386     6,692
    8                  343      7,599      307      6,491     6,899
  Subtotal           2,795     63,691    2,340     47,222    52,994
Middle Third
  II-1                 281      5,131      181      2,739     3,402
    2                  317      6,467      238      3,962     4,808
    3                  288      5,352      203      3,125     3,720
    4                  294      5,602      186      2,634     3,506
    5                  262      4,746      156      1,980     2,787
    6                  225      3,641      183      2,611     2,929
    7                  205      3,087      128      1,446     1,880
    8                  228      3,634      217      3,377     3,417
  Subtotal           2,100     37,660    1,492     21,874    26,449
Lower Third
  III-1                167      2,067      122      1,328     1,407
    2                  192      2,610       89        785     1,306
    3                  173      2,083       91        791     1,061
    4                  192      2,610      113      1,025     1,573
    5                  160      1,998      113      1,367     1,408
    6                  142      1,522       65        487       780
    7                  144      1,470       62        510       678
    8                  126      1,328       86        764       841
  Subtotal           1,296     15,688      741      7,057     9,054
Grand Totals         6,191    117,039    4,573     76,153    88,497

1 = Variation I with feedback        5 = Variation III with feedback
2 = Variation I without feedback     6 = Variation III without feedback
3 = Variation II with feedback       7 = Variation IV with feedback
4 = Variation II without feedback    8 = Variation IV without feedback

TABLE 4.7
THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE MULTIVARIATE ANALYSIS OF THE COMBINED KNOWLEDGE AND APPLICATION SCORES (N = 384)

Source of Variation                           W       d.f. = p(k-1)              χ²*       P
Between Means                                .36      2(24-1) = 46             378.03    < .001
Achievement Levels                           .423     2(3-1) = 4               323.75    < .001
Step Size                                    .895     2(4-1) = 6                45.14    < .001
Feedback                                     .99      2(2-1) = 2                  .57    > .05
Feedback x Step Size                         .952     2(4-1)(2-1) = 6           19.61    < .01
Achievement Level x Step Size                .974     2(3-1)(4-1) = 12           9.62    > .05
Achievement Level x Feedback                 .979     2(3-1)(2-1) = 4            7.77    > .05
Feedback x Step Size x Achievement Level     .99      2(4-1)(2-1)(3-1) = 12       .57    > .05

* -n log W as χ² is only an approximation to the W distribution.

The combination of entry repertoire, intelligence, and reading ability that makes up science achievement in this case accounted for the major portion (approximately 90 per cent) of the between-means variation.
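The decision rule behind Table 4.7 can be illustrated by comparing each χ² with the .05 critical value for its degrees of freedom, where d.f. is the number of criterion measures (p = 2) times the product of (levels - 1) for the factors involved. The Python sketch below is an illustration, not the author's procedure. The critical values are standard tabled values for 2, 4, 6, and 12 d.f., and the helper name `effect_df` is hypothetical.

```python
# Tabled chi-square critical values at the .05 level, keyed by degrees of freedom.
CHI2_CRIT_05 = {2: 5.991, 4: 9.488, 6: 12.592, 12: 21.026}


def effect_df(p, *levels):
    """Degrees of freedom: p criterion measures times the product of (levels - 1)."""
    df = p
    for k in levels:
        df *= k - 1
    return df


# Chi-square values as reported in Table 4.7 (p = 2 criterion measures;
# 3 achievement levels, 4 step sizes, 2 feedback conditions).
effects = {
    "Achievement Levels": (323.75, effect_df(2, 3)),
    "Step Size": (45.14, effect_df(2, 4)),
    "Feedback": (0.57, effect_df(2, 2)),
    "Feedback x Step Size": (19.61, effect_df(2, 4, 2)),
    "Achievement Level x Step Size": (9.62, effect_df(2, 3, 4)),
    "Achievement Level x Feedback": (7.77, effect_df(2, 3, 2)),
    "Feedback x Step Size x Achievement Level": (0.57, effect_df(2, 4, 2, 3)),
}

significant = [
    name for name, (chi2, df) in effects.items() if chi2 > CHI2_CRIT_05[df]
]
print(significant)
# prints ['Achievement Levels', 'Step Size', 'Feedback x Step Size']
```

Only the achievement level effect, the step size effect, and the feedback x step size interaction exceed their critical values, which agrees with the effects the text treats as significant.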
The feedback x step size interaction also produced significant differences (χ² = 19.61 for 6 d.f., p < .01), thus allowing us to accept the statement that providing knowledge of correct response has differential value for programs of varying step size when considering both knowledge and application test scores, in combination, as a criterion. The main effects, step size and feedback, with their interaction significant, cannot justifiably be considered as independent sources of variation. It is interesting to contrast, however, the overall significance of the step size variable and the negligible overall amount of dispersion attributable to feedback. Not obtaining even a chance difference due to feedback would in some instances be a sign of mismanagement of variable control. Having arbitrarily taken the extremes, with and without, as examples of the possible feedback dimensions dispels any suspicion regarding experimental variable manipulation.

As was mentioned earlier, the finding of significant multivariate W's due to achievement level and the feedback x step size interaction allows the experimenter to continue with their corresponding univariate F-tests. The complete three-way factorial breakdown of the sums of squares for both knowledge and application scores, available in Tables 4.8 and 4.9, offers more clues to the distribution of variance, although only the single main effect and single interaction can be rightfully considered. Table 4.10 exhibits the main effect and interaction means. Graphs of the three two-way interactions are found in Appendix I, Graphs #1 and #2.

TABLE 4.8
THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE KNOWLEDGE SCORES (N=384)

Source of Variation                            Sum of Squares   d.f.   Mean Square   F
Achievement Levels                                8,792.82         2     4,396.41     218.40***
Step Size                                           783.61         3       261.60      12.99**
Feedback                                              0.00         1         0.00
Feedback x Step Size                                185.03         3        61.67       3.06*
Achievement Levels x Step Size                      129.47         6        21.57       1.07
Achievement Levels x Feedback                        16.19         2         8.09        .40
Achievement Levels x Feedback x Step Size            67.94         6        11.32        .56
Within Cells                                      7,250.19       360        20.13

*** p < .001    ** p < .01    * p < .05

TABLE 4.9
THREE-WAY FACTORIAL SOURCE OF VARIATION TABLE FOR THE APPLICATION SCORES (N=384)

Source of Variation                            Sum of Squares   d.f.   Mean Square   F
Achievement Levels                                9,999.75         2     4,999.87     167.49***
Step Size                                           187.52         3        62.50       2.09
Feedback                                              9.69         1         9.69        .29
Feedback x Step Size                                164.78         3        54.92       1.83
Achievement Levels x Step Size                      139.19         6        23.19        .77
Achievement Levels x Feedback                       202.50         2       101.25       3.39*
Achievement Levels x Feedback x Step Size           244.18         6        40.69       1.36
Within Cells                                     10,746.19       360        29.85

*** p < .001    * p < .05

TABLE 4.10
MAIN EFFECT AND INTERACTION MEANS FOR KNOWLEDGE AND APPLICATION TESTS

KNOWLEDGE
Achievement Levels                  UT 21.83    MT 16.40    LT 10.12
Step Size                           I 17.59     II 17.46    III 15.04    IV 14.38
Feedback                            W 16.11     W/O 16.12

Feedback x Step Size                  I        II       III      IV
  W                                 17.00    17.00    16.18    14.23
  W/O                               18.14    17.91    13.85    14.52

Step Size x Achievement Levels        I        II       III      IV
  UT                                23.86    23.00    20.63    21.18
  MT                                18.84    18.33    15.34    13.64
  LT                                11.31    11.50     9.51     8.50

Feedback x Achievement Levels         UT       MT       LT
  W                                 22.07    16.16    10.05
  W/O                               21.56    16.60    10.17

APPLICATION
Achievement Levels                  UT 18.28    MT 11.65    LT 5.78
Step Size                           I 12.57     II 12.60    III 11.00    IV 11.45
Feedback                            W 11.75     W/O 12.06

Feedback x Step Size                  I        II       III      IV
  W                                 12.85    12.37    11.31    10.19
  W/O                               12.25    12.81    10.65    12.70

Step Size x Achievement Levels        I        II       III      IV
  UT                                18.00    19.25    16.82    18.96
  MT                                13.07    12.15    10.58    10.76
  LT                                (illegible in source)

Feedback x Achievement Levels         UT       MT       LT
  W                                 (illegible in source)
  W/O                               17.96    12.85     5.51

UT = Upper Third, MT = Middle Third, LT = Lower Third
I = Prompted Version, II = Basic Version, III = Moderate Version, IV = Difficult Version
W = With Feedback, W/O = Without Feedback

Entry achievement level had almost equally strong an effect on the variation of both types of criterion behaviors (knowledge F = 218.4 at 2 and 360 d.f., p < .001; application F = 167.49 at 2 and 360 d.f., p < .001). Even when the variance common to both knowledge and application scores is covaried out, a significant amount of variance still remains (F = 12.09 at 2 and 359 d.f., p < .001; see Table 4.11).
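The univariate F-ratios quoted above follow from the sums of squares and degrees of freedom in Table 4.8: each mean square is a sum of squares divided by its d.f., and F is the effect mean square over the within-cells mean square. The short Python sketch below recomputes two of the knowledge-score ratios; the function name `f_ratio` is illustrative. Small differences from the tabled values are expected, since the table appears to divide by the rounded mean square 20.13.

```python
def f_ratio(ss_effect, df_effect, ss_within, df_within):
    """F = (effect sum of squares / d.f.) / (within-cells sum of squares / d.f.)."""
    ms_effect = ss_effect / df_effect
    ms_within = ss_within / df_within
    return ms_effect / ms_within


# Knowledge-score values as reported in Table 4.8.
SS_WITHIN, DF_WITHIN = 7250.19, 360

f_achievement = f_ratio(8792.82, 2, SS_WITHIN, DF_WITHIN)
f_feedback_by_step = f_ratio(185.03, 3, SS_WITHIN, DF_WITHIN)

print(round(f_achievement, 1), round(f_feedback_by_step, 2))  # prints 218.3 3.06
```

The same function applied to the application-score table reproduces its ratios in the same way.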
TABLE 4.11
ANALYSIS OF THE APPLICATION SCORE VARIANCE DUE TO THE ACHIEVEMENT LEVEL MAIN EFFECT THAT REMAINS WHEN ...

[The body of Table 4.11, the tables that follow it, and the plots of the two-way interactions (step size x achievement levels, feedback x achievement levels, and feedback x step size) are illegible in the scanned source.]