THE POTENTIAL OF THE FREE RECALL TASK AS A MEANS OF ASSESSING STUDENT ACQUISITION OF COURSE RELATED ORGANIZATION

Thesis for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
WALTER SHAWVER BROWN
1972

This is to certify that the thesis entitled THE POTENTIAL OF THE FREE RECALL TASK AS A MEANS OF ASSESSING STUDENT ACQUISITION OF COURSE RELATED ORGANIZATION, presented by WALTER S. BROWN, has been accepted towards fulfillment of the requirements for the Ph. D. degree in MEASUREMENT, EVALUATION, AND RESEARCH DESIGN.

ABSTRACT

THE POTENTIAL OF THE FREE RECALL TASK AS A MEANS OF ASSESSING STUDENT ACQUISITION OF COURSE RELATED ORGANIZATION

BY
Walter Shawver Brown

Rationale of the Inquiry

This field study builds on Johnson's research (1967), in which a traditional verbal learning technique is used to examine the organization personalized by students, and this organization's relationship to formal curricular structure. Rather than using free association as Johnson has done, however, this study uses free recall (FR).

Primary Objectives

1. To examine the generalizability of laboratory findings vis-a-vis the free recall paradigm to the less constrained environment of an ongoing instructional program.

2. To extend the conceptualization of the associative cluster to include not only segmental clustering (taxonomic), but also suprasegmental clustering (subject-imposed coalition of within-unit segmental clusters, as opposed to between-unit clustering).

3. To develop the basic data to support the use of free recall as a mode of formative evaluation, both for individual students and for instructional material refinement.

Materials

A free recall task was constructed using 49 key terms drawn from four units of material presented in individualized carrel programs in a basic teacher education course.
All subjects were tested in groups of approximately fifteen subjects each. The free recall task was presented in booklet form, with four study trials and four recall trials. The students received 50 seconds for each study trial and 150 seconds for each recall trial. All study lists were composed of the 49 key terms in random order.

Design and Analysis

Different groups of subjects were randomly assigned to one of five testing times. The first testing time occurred before any students had studied any of the four units in the carrels. The second testing time occurred at the close of instruction for unit I. Testing time 3 occurred at the close of instruction for unit II, etc. A total of 350 subjects, with 70 per testing time, comprised the final samples.

Four dependent measures were available for each student. From the free recall task, a recall measure, a segmental clustering measure and a suprasegmental clustering measure were extracted for each subject for each of the four trials and four units nested within each of the trials. The fourth measure for each subject was the mastery test score achieved for each of the four units studied during the quarter.

Three major analyses were undertaken. The first, or theoretical, analysis examined the learning curves over trials in dependent measure performance. This analysis consisted of a series of multivariate analyses of variance using orthonormalized transformations and orthogonal polynomials.

The second analysis was an instructional analysis examining predicted changes over testing times in within-unit dependent measure performance. This analysis consisted of a series of multivariate analyses of variance using simple contrasts.

The third analysis was also instructional and examined the nature of the relationship between unit dependent measure performance and unit mastery test performance. This analysis used Pearson product moment correlation coefficients.
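The trend decomposition named in the first (theoretical) analysis can be illustrated with a minimal sketch. It assumes four equally spaced trials and uses the standard orthogonal polynomial weights for four points; the function names and the trial means are hypothetical and are not taken from the thesis data, and the sketch covers only the contrast arithmetic, not the multivariate tests themselves.

```python
import math

# Standard orthogonal polynomial weights for four equally spaced points.
RAW_WEIGHTS = {
    "constant": [1, 1, 1, 1],
    "linear": [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic": [-1, 3, -3, 1],
}


def orthonormal_contrasts():
    """Scale each weight vector to unit length (orthonormalization)."""
    scaled = {}
    for name, weights in RAW_WEIGHTS.items():
        norm = math.sqrt(sum(w * w for w in weights))
        scaled[name] = [w / norm for w in weights]
    return scaled


def trend_components(trial_means):
    """Project four trial means onto the constant, linear, quadratic and cubic trends."""
    if len(trial_means) != 4:
        raise ValueError("exactly four trial means are required")
    return {
        name: sum(w * m for w, m in zip(weights, trial_means))
        for name, weights in orthonormal_contrasts().items()
    }


# Hypothetical mean proportions recalled on trials 1 to 4 (illustration only).
means = [0.20, 0.35, 0.45, 0.50]
components = trend_components(means)
for name, value in components.items():
    print(f"{name:9s} {value: .4f}")
```

For these hypothetical means, which rise by smaller amounts on each trial, the linear component is positive and the quadratic component is negative. That sign pattern is what a decelerating learning curve looks like under this decomposition.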
Hypotheses

Theoretical: The free recall dependent measures will increase over trials.

Instructional-MANOVA: Within-unit dependent measures will peak at that testing time in which the unit in question was studied.

Instructional-Correlations: Before unit instruction there will be no significant correlations and after unit instruction there will be significant positive correlations between unit dependent measure performance and unit mastery test performance.

Results

Theoretical

The dependent measures do increase over trials, though the suprasegmental clustering measure is redundant to the segmental clustering measure. The nature of the recall and segmental clustering curves is complex, with significant constant, linear, quadratic and cubic trends. The recall curves decelerate (negative quadratic components) and the segmental clustering curves accelerate (positive quadratic components) over time.

The major difference between the curves at different testing times was the constant term or y-intercept, indicating an increase in trial 1 performance. The curves were, otherwise, highly similar within measures.

Instructional-MANOVA

The hypothesis was only partially supported for two out of the four units. In general the measures did not peak as predicted.

Instructional-Correlations

The correlational data evidenced primarily insignificant correlations. There was some evidence that at preinstruction, student differences in entry behavior did correlate with later mastery test performances. The post-instructional correlations were generally not significant and positive as predicted.

Conclusions

On the basis of this field study, it could be concluded that the practical application of the free recall task to the classroom setting is not possible. The loss of control that is necessitated by "in vivo" research destroys the viability of the free recall task.
It is encouraging, however, from a theoretical point of view, that the learning curves observed were consistent within measures and were not unlike those previously encountered in laboratory research.

Finally, it would appear that the suprasegmental clustering measure, while redundant to the segmental measure in this study, might be of value in research with materials or designs that make segmental clustering difficult to use.

1. P. E. Johnson, Some psychological aspects of subject matter structure, J. Ed. Psychol., 1967, 58:2, 75-83.

THE POTENTIAL OF THE FREE RECALL TASK AS A MEANS OF ASSESSING STUDENT ACQUISITION OF COURSE RELATED ORGANIZATION

BY
Walter Shawver Brown

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Department of Educational Psychology
1972

DEDICATION

My thanks to those who have supported, guided and nurtured my growth during my doctoral studies. Uppermost among these are the following:

My wife Janet and my parents Jack and Wanda
Dr. Joe Byers
Dr. Judy Henderson
Dr. Maryellen McSweeney
Dr. Gordon Wood

Perhaps the best way to repay all of them is to try to teach as I have been taught--with patience, understanding and caring.

TABLE OF CONTENTS

DEDICATION
LIST OF TABLES
LIST OF FIGURES

Chapter

I. THE PROBLEM
   Introduction
   Commentary
   Purpose of the Study
   Theory
   Commentary

II. REVIEW OF THE LITERATURE
   Overview
   Historical Antecedents
   Commentary
   The Multi-Trial Free Recall Task
   Commentary
   Verbal Organization and Academic Performance
   Commentary

III. DESIGN AND METHODOLOGY
   Overview
   Sample
   Commentary
   Materials
      Word List Characteristics
      Instrument
   Design
      Introduction
      Testing Times
      Units
      Trials
      Unit of Analysis
      Completion Index
   Analyses
   Research Hypotheses
      Theoretical Hypothesis
      Instructional Hypothesis

IV. PRESENTATION OF THE DATA
   Introduction
   The Completion Index
   Theoretical Analysis
   Instructional Analysis
   Correlations Between Dependent Measures and Mastery Test Performance
      Overview and Hypotheses
      Process of Analysis
      Correlations Between Dependent Variables Performance and Mastery Test Performance

V. DISCUSSION AND IMPLICATIONS
   Overview
   The Theoretical Analysis: Change Over Trials
   The Instructional Analyses: Change Within Units Over Time
   The Instructional Hypotheses: Correlations Between Unit Dependent Measure Performance and Unit Mastery Test Scores
   General Implications and Recommendations for Future Research

APPENDICES
BIBLIOGRAPHY

LIST OF TABLES

Table
1. Unit I segmental clusters
2. Maximum possible mastery test score by unit
3. Report of segmental and suprasegmental cluster scores obtained for each of the units by four different raters
4. Word list for each unit broken into segmental clusters
5. Maximum possible values for the dependent measures (recall, segmental clustering and suprasegmental clustering) for individual units and the total across units
6. Summary of independent and dependent variables
7. A time-line illustration and explanation of the independent variable of testing times (fall quarter)
8. "Ideal" and "obtained" completion indices by testing time
9.
Mean proportions of possible recall achieved over four units and standard deviations by trials and testing time
10. Mean proportion of possible segmental clustering achieved over four units and standard deviations by trials and testing times
11. Mean proportion of possible suprasegmental clustering achieved over four units and standard deviations by trials and testing times
12. Step-down F tests and p values for three dependent measures and orthogonal polynomials for trials over each testing time
13. Independent tests for significance of linear, quadratic and cubic trends within the three dependent measures
14. Estimated coefficients for the learning curves over trials, based on the recall data
15. Estimated coefficients for the learning curves over trials, based on the segmental clustering data
16. The mean proportion of possible recall achieved over four trials by testing times and units
17. The mean proportion of possible segmental clustering achieved over four trials by testing times and units
18. The mean proportion of possible suprasegmental clustering achieved over four trials by testing time and units
19. The step-down "F" ratios and "p" values for the pair-wise contrasts within the three dependent variables resulting from the interaction test of the overall MANOVA
20. The step-down "F" ratios and "p" values for the dependent measures in units 3 and 4 respectively, resulting from two independent significant multivariate tests of equivalence of mean vectors in a one-factor MANOVA
21. Summary of the multivariate tests of equality of mean vectors for the unit 3 ordered contrasts for the dependent measure of recall
22.
Summary of the multivariate tests of equality of mean vectors for the unit 4 ordered contrasts for the dependent measure of recall
23. The completion statuses, completion status codes, sample size and critical values for the correlation coefficients at p < .10, to be used in the instructional correlational analysis
24. Matrices of correlations between unit dependent variable performance and unit mastery test performance by completion status

LIST OF FIGURES

Figure
1. Data matrix--overall
2. Proportion of possible recall achieved over four units by trials and testing times
3. Proportion of possible segmental clustering achieved over 4 units by testing times and trials
4. Proportion of possible unit recall achieved over four trials by testing times
5. Proportion of possible suprasegmental clustering achieved over four units by trials and testing times
6. Proportion of possible unit segmental clustering achieved over four trials by testing times
7. Proportion of possible suprasegmental clustering over four trials by testing times

CHAPTER I

THE PROBLEM

Introduction

As evidenced by the publication of books such as Neisser's Cognitive Psychology (1967), Miller, Galanter and Pribram's Plans and the Structure of Behavior (1960), and Saltz's The Cognitive Bases of Human Learning (1971), psychologists are becoming more interested in human cognitive functioning and processing. The modes of cognitive organization available to man are of particular interest. Bruner (1956, 1960), Gagne (1970), Bloom (1956) and others have described methods of elucidating the structure inherent to the subject matter areas.
The findings and hypotheses being promulgated by such men as those mentioned are being translated by educators into numerous specific instructional programs designed to maximize student learning of verbal materials. Examples of such programs would be the SCIS and the AAAS science programs (Belanger, 1969; Smith, 1969), the SMSG and IPI mathematics programs (Romberg, 1969; Heimer, 1969), and the Bereiter-Engelmann beginning reading program (Bereiter and Engelmann, 1966). More generally, models for the nurturance of optimal learning, such as Bloom's mastery learning conceptualization (Bloom, 1968), demand explicit analysis of the content and objectives, and flexible implementation of the curriculum, to the end that the majority of the students may be educationally successful.

The success or failure of such attempts to optimize the students' personalization of structure can be assessed only by examining the final outcomes of instruction--student learning and cognitive organization.

The organization imposed on verbal input has become a central concern of yet another academic population within the past fifty years, specifically, the verbal learning psychologists. In ever increasing numbers, they have produced a voluminous literature base in a relatively short period of time. While this tremendous growth has produced a number of experimental paradigms and variables of import, the overwhelming majority of the work has been at a very basic level. Such research, while exciting, and most sensible perhaps, given the relative infancy of the field (Kuhn, 1970), has been challenged as irrelevant to the "real world." Indeed, in a recent review of the state of the art (Tulving and Madigan, 1970), only ten percent (10%) of the research reviewed was seen as being truly relevant to the field itself, in terms of advancing the boundaries of understanding.
It is apparent that if much of the inquiry seems irrelevant to those intimately involved in the field, the potential value of the research would be particularly difficult for those outside the microcosm of verbal learning to discern.

This irrelevance may, however, be more apparent than real or necessary. To clarify, let us take an example from both fields in relation to organization. The educator may directly attack the structure of knowledge and extract or make concrete that structure, through task analysis and behavioral objective development. All of the curricula mentioned earlier rely heavily on such direct procedures. The verbal learning psychologist, however, must examine indirectly the ways that individuals process the verbal information that they collect every day from peers, parents, teachers, tachistoscopes, etc.

The necessity of indirectly studying complex behavior has created a number of ingenious tasks that allow one to gain insight into cognitive processing. Free association, paired-associate learning, serial recall and free recall each approach the problem of how man structures his verbal world, and each does so in a slightly different way. The free recall task, for instance, presents subjects with a set of words or terms that are semantically related in some way, e.g., associatively or taxonomically (Cofer, 1965). Most often, several conceptual clusters or groups of words are identifiable within the total list. When the words are presented in random order, people invariably begin to impose an organization on the list. They report words in chunks or clusters that give the researcher information about the associations and superordinate concepts that the individuals have internalized. This task has repeatedly shown itself valuable in exposing pieces and parts of the effective organization a person brings to the task and to the terms within it (Adams, 1967).
It is a basic assumption of this inquiry that the dilemma of the curriculum developer/teacher, that of finding new ways of examining student learning and organization, and that of the verbal learning psychologist, that of proving the relevance of his research, may be concomitantly served and at least partially solved by the extension of some of the traditional verbal learning techniques and understandings to the tasks set before the educator. Further, it seems that the free recall task would be an especially fruitful technique with which to begin this foray, since it does address itself uniquely to the problem of organization of verbal materials--a primary concern of the educator.

Commentary

Obviously, this movement from the laboratory to the classroom will not be done without some costs. A major trade-off is the manipulability of the research environment. Traditional experimental designs often demand a degree of control which is impossible to obtain in contexts other than the laboratory. This does not make research in the field any less necessary or attractive; it simply requires developing and trying new research techniques and models. Such then is the context of this investigation. The next section will discuss in a more detailed manner the purpose of the study.

Purpose of the Study

The purpose of this field study is to determine the efficacy of the free recall task as a means of analyzing the structural outcomes of a course of study. The structure of particular interest is that imposed before and after instruction, on a list of key terms drawn from the materials studied in a basic course in teacher education. The words have been chosen because they form clusters based on the course objectives and because they occur contiguously within the textual materials (Henderson et al., 1971, and unpublished carrel materials).
An instructor would hope that the organization that he "teaches" would display a high match with that organization that the student would learn, i.e., impose on key terminology upon exiting the course.

Of further interest is the free recall task's relationship to the more traditional testing modes used in student evaluation. The four unit exams consist of multiple choice questions and short answer essay questions based on the unit objectives. Since both the test items and the free recall task involve the same key terms, concepts and principles, it is expected that a positive relationship will obtain between the level of performance achieved on the two tasks.

Should this study indicate a positive relationship between the dependent measures and more traditional testing modes, it is possible that the free recall task could provide valuable evaluative input to a developing program. Its utility would be in giving feedback to material producers about the success of their materials in teaching desired organizational structure.

Theory

Time has traditionally been treated as an independent variable in our educational system. Also, the normal curve has been used as a model for the expected outcomes of student learning. Bloom, in developing his model for mastery learning (1968), has taken both of these desiderata to task. First, he insists that time should be seen as a dependent variable, the length of time allowed on learning a task being decided not by administrative exigencies, but rather by the amount of time needed by the student to reach mastery. Second, in regard to the "arbitrary" imposition of the normal curve on student outcomes, he opines that 95% of the student population can and should achieve mastery of the material under consideration if given the appropriate modes of instruction and sufficient time.
Indeed, it is suggested that brandishing records of student achievement which approximate the normal curve is more proof of poor teaching than of the existence of poor learners (Bloom, 1968).

Of particular interest in this investigation is the concern voiced for new tools of formative evaluation. Formative evaluation has two purposes (Bloom, 1968, 1971): first, it is designed to check the student's progress through a unit of material. If mastery is evident, then another unit may be entered. If, on the other hand, the unit objectives are not mastered, the evaluation must serve as a diagnostic tool, aiding the instructor in prescribing appropriate additional experiences. The second purpose of formative evaluation is only secondarily concerned with an individual student's progress. It, rather, emphasizes the further refinement of the program of instruction. If, for example, an individual or a group of students does not reach mastery, the evaluation tool should help identify those concepts and principles that were inadequately taught. Assuming that the objectives are specified, the problem is then how best to change instruction to inculcate the desired cognitive organization. From a practical point of view, then, this thesis is concerned with testing the efficacy of the free recall task as an additional formative evaluation tool for the latter purpose--that of identifying weak areas in the instructional materials.

From a theoretical point of view, this represents an extension of free recall, a traditional verbal learning paradigm, from the laboratory to the world of the classroom.
While research emphasis has been placed for some time on the effects of the relative meaningfulness of verbal inputs, meaningfulness generally remains restricted to a word's socio-cultural linguistic frequency or familiarity and its associative relationships to other words, i.e., the "Minnesota Norms" (Russell and Jenkins, 1954), and Deese's indices of "Interitem Associative Strength" and "Associative Meaning" (Deese, 1959, 1962). Further, while the psychological mechanisms involved in prose learning have been examined by many from varying points of view and with varying techniques (Rothkopf, 1968; Rothkopf and Coke, 1968, 1970; Frase, 1969, 1971; Ausubel, Stager and Gaite, 1968, 1969), each has carefully manipulated the stimulus materials to maximize the likelihood of producing information gain in a laboratory setting. In this study, while the materials have been carefully constructed to maximize the clarity and the learning of the materials, it has, nonetheless, been done from an instructional point of view, not for the purposes of isolating and manipulating learning processes for experimentation. Thus, if we find that the laboratory results are replicable in the instructional setting and applicable to instructional materials, the techniques and theory involved may be seen to have great potency.

In this investigation, ordinary English words are being used, making them relatively high on a traditional scale of meaningfulness. However, rather than being concerned solely with the examination of pre-existing language habits or verbal organization, we are concerned with the changes in language habits that obtain as a result of instruction. Many of the terms may be familiar to the students in contexts other than that imposed by the study of educational psychology. Thus in some cases we may be extinguishing pre-established associations that are extraneous to the conceptual matrix that the course itself develops for the words.
Particularly germane to this thesis is the work of Paul Johnson (1964, 1965, 1967) in free association. He used free association (FA) tasks in the context of regular classes, e.g., physics, and found that the constructs of the material were reflected in FA performance as the course progressed. Further, performance on the FA task also correlated with degree of course involvement and general course performance as measured by more traditional means. His work will be discussed in greater detail in the next chapter.

Commentary

It should now be clear that this thesis has hypotheses and subsequent analyses of two basic types: instructional and theoretical. The instructional concerns are based upon the ability of the free recall task to expose subject organization of verbal material. The theoretical questions will center on the replicability of laboratory findings in the instructional setting and on the extension and/or refinement of the present methodologies.

CHAPTER II

REVIEW OF THE LITERATURE

Overview

The intent of this chapter is to review the literature considered relevant to this thesis as briefly described in Chapter I. The review is broken into three subtopics, each followed by a commentary which will summarize the major contributions of the preceding section, and/or discuss its implications and introduce the next sub-section. The three subtopics are, in order of discussion: historical antecedents, the multi-trial free recall task, and verbal organization and academic performance.

Historical Antecedents

The free association and the free recall tasks are both primary tools used by verbal learning psychologists to study learning processes. Of these tasks, the first to attract research interest was free association. It relies totally on the previous verbal experience of the subject, in that he is given a stimulus word (e.g., white) and is asked to write or say the first word or words that come to mind (e.g., black).
It has served as the basis for an inquiry into the nature of the relationships that words have to each other and has spurred a debate, still ongoing, as to whether the association can effectively operationally define the kernel upon which all higher processes rest. Norms have been established which identify the associates of common or high frequency words in the English language (Russell and Jenkins, 1954; Kent-Rosanoff, 1910; Thorndike and Lorge, 1944), which reflect the responses of native speakers. Other norms have been established for special populations such as schizophrenics (Bleuler, 1951). Beyond the common associations that individuals of a given identifiable population will produce, of course, the uniqueness of individual experiences is reflected by largely idiosyncratic associations that are common, possibly to many individuals, but hardly common in the sense of the Minnesota norms.

While simple pairwise associations are the most obvious relationships studied with the free association paradigm, a great deal more has been done. For example, if one asks for more than the first word that comes to mind, any number of levels of associations are possible. Much theorizing about the nature of such structure of associative hierarchies (Deese, 1965; Cofer, 1965; Cofer and Foley, 1942) has been undertaken. Research in concept formation, for example, is basically a search for the mechanisms that allow words or objects to become related and labeled categorically. Some psychologists have added intervening variables such as the mediating link or "verbal mediation" to explain more complex behavior (Osgood, 1956, etc.). Others have called upon imagery (Paivio, 1969), and still others prefer straight SR generalization or chaining (Skinner, 1957; Staats and Staats, 1963).
Regardless of the descriptive preference of the psychologist involved, the search is the same: How does one discover or better understand the processes by which human beings form relationships, simple and complex; how do they achieve the ability to conceptualize, impose and use subordinate and superordinate structuring to communicate and solve problems?

Given the obvious presence of more complex behavior patterns such as associative hierarchies and general concept formation capabilities, the free recall task has become popular as an investigative tool in the past twenty years. The task is again relatively simple in nature, yet it taps more complex and sophisticated processing. A subject is given a list of words to study either visually or aurally and is then required to recall as many words as possible in whatever order they occur to him. Much work is done with lists which have experimenter-selected conceptual categories; i.e., categories that would be known to a native speaker (colors, furniture, vehicles, etc.)--words and word classes that are high frequency in the English language. If the list contains, for example, twenty words (four concepts with five exemplars each) and is presented to the subject with the words in random order, the subjects reliably begin to impose previous language habit structure on the words. In fact, if given sufficient trials, most individuals when asked to recall the randomly ordered words will be able to report the words in clusters which reflect the conceptual constraints that the experimenter has imposed--i.e., all words from concept A would be recalled together, all words from B together, etc.

Such grouping or clustering in accordance with experimenter-defined constraints or concepts has been joined by the study of subjective organization, wherein the experimenter need not be concerned with imposing organization on the list prior to presentation (Tulving, 1962).
Rather, the subject is given several trials to learn the list and the experimenter then examines the resultant protocols to discover any consistent patterns of recall over trials. Invariably, the Ss do establish groups or clusters of words that seem to facilitate the memory and recall of the items on the list. It can be seen that this task also elucidates or makes concrete verbal organization patterns that an individual has acquired through experience.

Commentary

It is hoped that through this rather cursory review, the point is established that the free association and free recall tasks are intimately related, historically and conceptually. Each can help expose verbal structure or organization and each is amenable to diverse variable manipulations, a necessity for a fruitful research paradigm. It is not surprising, therefore, that while some task variables and outcomes are unique to each, there are marked similarities between them. The next section of this review will concentrate on the particular task of primary importance to this research--the multi-trial free recall task.

The Multi-Trial Free Recall Task

Bousfield and Cohen (1953) first reported examination of subject categorical organization or clustering in a multi-trial free recall paradigm. They presented their subjects with a list of 60 nouns which contained four different categories and 15 exemplars each. The categories and examples were equated in terms of the Lorge-Thorndike word count (Thorndike and Lorge, 1944), which, as mentioned earlier, is an estimate of the frequency per million words in ordinary English text. The four categories which they chose were: animals, names, professions and vegetables. The 60 words were presented at a three-second rate to five independent groups of subjects. Each group received a different number of trials or reinforcements on the random stimulus list, ranging from one to five presentations, each followed by an immediate recall.
It was noted that, in terms of recall only (strictly defined as the total number of words present in the stimulus material recalled after a trial), the mean total recall increased from 23.9 words for one reinforcement or trial to 37.9 words after five. This learning over trials (learning here referring to the increase in mean total of words recalled) has been consistently replicated in the literature (Murdock, 1960; Tulving, 1962, 1964). Total recall, then, will be a significant response measure or dependent variable in this thesis. It is expected that recall will increase over trials.

As in his work with single trial free recall (1953), Bousfield defined clustering for this experiment in terms of repetitions, a repetition being a sequence in recall of two or more items from one of the four categories of the stimulus list. The number of repetitions was defined as the number of items from a category recalled together, minus one. Bousfield then calculated the number of repetitions that could be expected on the basis of chance alone, and found that his data suggested that a significant tendency exists to reorganize the randomly presented words into their four conceptual categories. In the multi-trial experiment, then, he found that not only did mean recall increase over trials, but that clustering also increased steadily from one to five reinforcements, with five trials having about twice the level of repetitions as one.

Generally speaking, then, this measure can be obtained in conditions wherein the total set of "X" words or items is constructed by the experimenter such that it consists of two or more mutually exclusive subsets of items. Each item within a given subset must be assumed on some logical or empirical grounds to be more "similar" to other items within the subset than it is to any other items in other subsets.
Such subsets have been defined in terms of belongingness of the words or items to a conceptual category (Bousfield, 1953), associative relations among words (Jenkins and Russell, 1952), parts of speech (Cofer and Bruce, 1959), etc. The basic finding that the degree of clustering increases over trials has been consistently replicated by the latter researchers and numerous others over the past two decades.

One major drawback of this conceptualization of subject organization of verbal material is that it reflects and examines only that organization that the experimenter imposes (Tulving, 1962, 1964, 1968). In other words, the subject may impose organization in terms of trial by trial contiguity of word or item output, yet if it is not one of the repetitions defined as "correct" by the experimenter, it is completely missed in the analysis of the subject output protocols. A second type of organization, therefore, has been labeled by Tulving (1962) as "subjective organization" (SO). Since it is defined in terms of the consistency of output order over trials for each subject, it does not necessarily require that the experimenter know in advance of the experiment which items are to be grouped together. It can, therefore, be used for any set of items.

Given the existence of two types of secondary organization (clustering and subjective organization), this thesis is designed to examine the clustering that occurs in response to a set of 49 words drawn from four units of material presented in a basic Teacher Education course. Two measures of organization are proposed, segmental clustering (Cs) and supra-segmental clustering (Css). The segmental clustering is traditional and directly related to research cited under clustering measures. Within each unit of material, a series of two to five word clusters (Cs) has been identified.
Each word within a segmental cluster is deemed more similar to other words within the cluster than it is to words in other segmental clusters within the same or different units.

The supra-segmental cluster, however, is a departure from traditional measures and can be thought of as one piece of that organization heretofore generally referred to as subjective organization. More specifically, the existence of a supra-segmental repetition or cluster (Css) is defined by the contiguity of two or more words from within the same unit of material, but not from the same segmental clusters. Thus, while much subjective organization will still be missed, a more concrete delineation of at least one type of organization previously typed as subjective organization is now possible, trial by trial rather than only over trials.

To further clarify the distinction between Cs and Css, let us examine some of the content of the Unit I list (Table 1).

TABLE 1.--Unit I segmental clusters.

A            B              C          D
teaching     design         means      variables
learning     instruction    givens     human
                            ends       environmental
                                       curricular

Within the list of Unit I words, there are four segmental clusters--A, B, C, D. If a subject in a test trial or output phase lists the words means, givens, and ends consecutively, two repetitions are evident: means-givens and givens-ends. However, a further organizational mode is possible, that of grouping words together within the unit but across segmental clusters. For example, a subject might report teaching, instruction and curricular together. From this investigator's point of view, this grouping would be referred to as a supra-segmental cluster and would also contain two repetitions: teaching-instruction and instruction-curricular. Note, therefore, that we have at our disposal two measures of clustering, each describing very distinct types of organization.
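The scoring of the two clustering measures can be illustrated with a short sketch. It assumes, as the Unit I examples above suggest, that repetitions are counted over adjacent pairs of recalled words: a pair drawn from the same segmental cluster counts toward Cs, and a pair drawn from the same unit but from different segmental clusters counts toward Css. The function name `score_unit`, and the rule that a word from outside the unit list simply breaks a run, are assumptions made for this illustration and are not taken from the original scoring procedure.

```python
# Unit I segmental clusters, as given in Table 1.
UNIT_I_CLUSTERS = {
    "A": ["teaching", "learning"],
    "B": ["design", "instruction"],
    "C": ["means", "givens", "ends"],
    "D": ["variables", "human", "environmental", "curricular"],
}


def score_unit(recalled, clusters):
    """Return (recall, Cs, Css) for one unit from an ordered recall protocol.

    Assumption of this sketch: repetitions are counted over adjacent pairs.
    """
    # Map every list word to the label of its segmental cluster.
    cluster_of = {w: label for label, words in clusters.items() for w in words}

    # Recall: number of distinct unit words that appear in the protocol.
    recall = len({w for w in recalled if w in cluster_of})

    cs = css = 0
    for first, second in zip(recalled, recalled[1:]):
        # A pair counts only if both words belong to this unit and differ.
        if first in cluster_of and second in cluster_of and first != second:
            if cluster_of[first] == cluster_of[second]:
                cs += 1   # segmental repetition
            else:
                css += 1  # supra-segmental repetition
    return recall, cs, css


# The two examples from the text: each yields two repetitions.
segmental_example = score_unit(["means", "givens", "ends"], UNIT_I_CLUSTERS)
supra_example = score_unit(["teaching", "instruction", "curricular"], UNIT_I_CLUSTERS)
```

Under these assumptions the first protocol scores two segmental repetitions and no supra-segmental ones, and the second scores the reverse, which matches the two examples in the text.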
Theoretically, then, one would expect that both clustering measures will increase over trials.

Commentary

At this point, three dependent measures have been identified: recall, segmental clustering and supra-segmental clustering. Each of these will be an integral part of the research to be done and each will serve as a dependent measure in both the theoretical and instructional analyses. The following section will discuss the work of Paul Johnson and its relationship to the thesis.

Verbal Organization and Academic Performance

As mentioned earlier, one of the prime goals of an instructional sequence is to change the behavior of the students involved. It is clear that students cannot memorize verbatim the entire content of a course of study in any permanent fashion. What must occur, then, is storage of the substantive essence of what is being taught: associations, concepts, principles, etc. To this end, the majority of what is read or heard by the student must, of necessity, be forgotten, because so much of communicative verbiage, by the very nature of our language's syntactic system, is redundant--it does not distinguish one thought from another, one concept from another, etc. When Ausubel, et al. (1969) speak of meaningful verbal learning and subsumptive processing, they are in fact discussing the process of dropping the redundant features of instructional communication and saving and organizing those features which are distinctive into previously existing hierarchies or into newly acquired hierarchies and relationships. Using such stored distinctive features, one can retain in long-term memory the essence of what has been taught and learned for future reference; i.e., for test taking as well as for problem solving in life situations.
If one were to rationally define what kinds of words in instructional materials would become "distinctive features" in cognitive storage, the result would probably be the key terminology defining the concepts and principles presented in a course of study.

Moving to physics as a specific subject matter, Johnson (1964) presented four groups of high school students a one response, free association task consisting of a list of 18 concepts in physics (e.g., volume, density, mass). The stimulus words were presented aurally one at a time. The groups were (a) students presently taking physics; (b) students having taken physics; (c) students planning to take physics; and (d) students not planning to take physics. He hypothesized that degree of present involvement in physics (group "A" being the most involved, group "D" being the least) would be positively related to performance on the dependent variables: the frequency with which the physics stimulus words were given as free associates of one another and Deese's (1959) inter-item associative strength measure.

He found that the associative measures were, indeed, related to the degree of involvement in physics. The mean number of appropriate responses to the stimulus words ranged from 8.07 for those taking physics to 2.11 for those not planning to take physics. The inter-item associative strength measures ranged from 44.3 for group "A" to 11.3 for group "D." In both cases, then, the ranking in group performance was A>B>C>D.

Having found that performance on the free association task exposes differences in verbal organization which are related to degree of course involvement, Johnson went a step further in his research in 1965. In this study (Johnson, 1965), he examined the possibility that success in taking an actual physics problem-solving test should correlate with performance on the free association task.
In other words, if a student has a good grasp of the organization or associative relationships between the physics words, he should be able to solve physics problems better than the student who has not internalized the appropriate relationships. He found, upon analysis of the data, that the performance of subjects in the problem-solving test was related to their performance on the free association test; subjects who produced a relatively large number of relevant associations solved more problems than subjects who produced relatively few such associations.

While the relationship between performance on a verbal learning task and performance on a traditional testing mode has not been investigated much beyond Johnson's work, the ideas which he investigated are enticing. Given the relationships discussed earlier between free association and free recall, there is no reason to expect that the cognitive organization tapped by free association cannot also be profitably tapped by the free recall task. Further, since the free recall task allows more complex systems of relationships to be exposed than does the free association task, it seems important to assess the ability of the free recall task to elucidate student-acquired organization and cognitive structure in an instructional setting.

Commentary

On the basis of the above, a set of instructional analyses will be done which will allow examination of the growth of cognitive organization over the span of an entire course of instruction, as well as examination of the relationship between performance on a free recall task embodying key terminology from the course and performance on traditional tests. To this end, each student's performance on the mastery test has been recorded. The tests contain both multiple choice and short answer questions. Four such scores have been recorded for each student with maximum values as follows:

TABLE 2.--Maximum possible mastery test score by unit.

Unit                       I     II    III   IV
Maximum score possible     27    33    29    22

The basic research relevant to this thesis has been presented. Specific research hypotheses will not be presented until the end of Chapter III, when the design will be more clearly delineated. The following general hypotheses are apropos at this time, however.

The totals (R, Cs, Css) will increase over trials and testing times.

Performance on the dependent measures will correlate positively with performance on traditional tests of knowledge of course materials.

CHAPTER III

DESIGN AND METHODOLOGY

Overview

The intent of this chapter is to discuss in greater detail the methodology and design of this thesis. The five sections are as follows: the sample, the materials, the design, the analyses, and the research hypotheses. Before going further it should be stressed that this research is not experimental in nature. Rather it is a developmental field study designed to observe free recall performance "in vivo." The ultimate intent, of course, is to identify new arenas for later experimental investigation.

Sample

Subjects were drawn from the first required course in MSU's teacher education sequence. It is a sophomore course and thus contained a majority of sophomores. Both males and females were included, though females predominated. The size of the potential sample was approximately 700; however, since testing was done only once in each small group section over the quarter, anyone who was absent on the day of testing was excluded from the study. Given the large number of students involved and the length of time (a full quarter) over which testing occurred, it is doubtful that there have been any systematic effects of absenteeism. After all the data were collected and the quarter was completed, a total sample of 382 tested subjects remained who had completed all course requirements by the end of the quarter.
From this number, a total of 32 subjects were randomly deleted from various testing times so as to provide 70 subjects per testing time. This imposition of equal cell sizes was necessitated by the mode of analysis, which will be discussed later.

Commentary

Because subjects had to complete the course in order to be included in the study, the results can be generalized most comfortably only to those students who do finish "on time."

Materials

Word List Characteristics

As has been briefly alluded to in previous chapters, the course in which this research is being undertaken is a basic course in teacher education. The instruction is broken into four major units: introductory, assessment, objectives and strategies. A word list was constructed by drawing key terminology from each of the four units. A total of 49 such words were chosen in the following fashion. The course's behavioral objectives were perused and the materials were then studied to identify those words which could be seen as being of primary importance. One further restriction on the choices was that they had to be members of a cluster of at least two words and that the words chosen could not easily be placed in more than one segmental cluster. The words chosen, then, form clusters ranging from two to four items.

Admittedly, this process involved some subjective judgments by the author. For instance, some words may appear in more than one unit, with one unit giving greater stress and detail in regard to their meaning and relationships. In such cases, that unit providing the greatest degree of stress and information in regard to the words in question was considered the "home base" for the concepts. It is possible, then, that different people extracting lists of key terminology could come up with variations in the unit placement of clusters.
To check on the degree to which this concern may be considered dangerous to the validity of the findings, three other people intimately involved in material and general course development were asked to break down the random list into unit and within unit cluster groupings. Each of the three has been involved in the writing and teaching of the course materials and in the production of mastery test items. The individuals were independently given the trial one randomized list from the subject testing materials (see Appendix A). They were asked first to break the words into four groups defining the four units and then to cluster the words within each of the units. The output of these people is summarized in the table below (Table 3). Each letter stands for a separate rater, "A" being the author.

TABLE 3.--Report of segmental and suprasegmental cluster scores obtained for each of the units by four different raters.

           Segmental Clustering       Suprasegmental Clustering
                   Unit                         Unit
Rater      I     II    III   IV       I     II    III   IV
A          7     5     10    9        10    7     16    12
B          7     4     10    8        10    6     12    12
C          7     5     10    9        10    7     12    12
D          7     5     9     9        10    7     12    12

The above data (Table 3) were then analyzed using Kendall's coefficient of concordance (Hays, 1963), or W. The resultant coefficients were W = .9562 for segmental clustering and W = .9062 for suprasegmental clustering. It is clear that with an upper limit for W of 1.0, both of the measures fared relatively well. Looking at the output summary table above (Table 3), the major area of disagreement is in suprasegmental clusters for Unit III. This, on reviewing the organization imposed by the other persons involved, is due to the placement of two two-word clusters in units different from those chosen by rater "A." In retrospect, their placement of these two clusters is perhaps the most defensible.
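The concordance computation can be sketched as follows. This illustration assumes that each rater's four unit scores are converted to ranks (tied scores sharing the average rank) and that W is computed from the rank sums without a correction for ties; the thesis does not spell out these details, and the helper names `average_ranks` and `kendalls_w` are hypothetical.

```python
def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for m raters (rows) by n objects (columns), no tie correction."""
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[j] for row in rank_rows) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Rows are raters A-D, columns are Units I-IV (Table 3).
segmental = [[7, 5, 10, 9], [7, 4, 10, 8], [7, 5, 10, 9], [7, 5, 9, 9]]
suprasegmental = [[10, 7, 16, 12], [10, 6, 12, 12], [10, 7, 12, 12], [10, 7, 12, 12]]

w_segmental = kendalls_w(segmental)            # about .956
w_suprasegmental = kendalls_w(suprasegmental)  # about .906
```

Under these assumptions the sketch gives values of about .956 and .906, in line with the coefficients reported above.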
(The two clusters in question are asterisked in the word list chart below.) The discrepancy will be discussed further in Chapter IV.

The following table (Table 4) will clarify which words were chosen from each unit, and what segmental clusters are identifiable within each unit.

TABLE 4.--Word list for each unit broken into segmental clusters (each row within a unit is one segmental cluster).

Unit I
    teaching, learning
    design, instruction
    means, givens, ends
    variables, human, environmental, curricular

Unit II
    I.Q., achievement, pretest
    Piaget, preoperational, formal
    Kohlberg, moral

Unit III
    behavioral*, non-behavioral
    complete, incomplete
    beyond-school, in-class
    socio-emotional*, intellectual
    conditions, terminal, criteria
    recognition, recall, convergent
    attending, valuing, committing

Unit IV
    strategies, respondent, operant, modeling, shaping
    reinforcers, positive, negative
    attitude, feelings
    classical, stimulus, response

The three dependent measures, recall (R), segmental clustering (Cs) and suprasegmental clustering (Css), were each defined on the basis of this list. For example, for Unit I there are 11 words that a subject could recall. The maximum recall score is therefore 11. For segmental clustering, the number of words listed consecutively by the subject in recall from any one experimenter-defined cluster is used as a base for computation of a total Cs score. From this total of "n" words, 1 is subtracted, producing the clustering score which summarizes the number of distinct pairs found within a given cluster. Looking once again at Unit I, then, if a subject wrote learning and teaching consecutively in recall, he would receive a Cs score of 1 (2 words minus 1); if he wrote human, variables, environmental and curricular, he would receive a Cs score for that cluster of 3 (4-1).
The total Cs score for Unit I would be 1 for the first cluster, plus 1 for the second cluster, plus 2 for the third cluster, plus 3 for the last cluster--a total possible segmental clustering score of 7 for Unit I.

Suprasegmental clustering assumes that words are recalled consecutively from within a unit but that they are not recalled pairwise from within the segmental clusters identified in Table 4. For example, if a subject recalls teaching and learning together, that constitutes a segmental cluster, but if he recalls learning and design together, that forms a suprasegmental cluster of two words. Again applying the formula (n-1) to this cluster, it has a score value of 1. The maximum Css score would be achieved if a subject recalled all 11 words from Unit I consecutively, and produced no segmental clusters. The score for this large chunk would then be 10 (11-1), which is the maximum possible value for Css in Unit I.

The table below summarizes the possible values, given the above word list, of recall (R), segmental clustering (Cs) and suprasegmental clustering (Css) for each of the four units and total.

TABLE 5.--Maximum possible values for the dependent measures (recall, segmental clustering and suprasegmental clustering) for individual units and the total across units.

         I     II    III   IV    Total
R        11    8     17    13    49
Cs       7     5     10    9     31
Css      10    7     16    12    45

For presentation to the subjects the word list was randomized once for each of the four study trials involved in the study, producing random lists A, B, C, and D. See Appendix A.

Instrument

The course has large numbers of students, so to facilitate data collection, test booklets were used and the students were tested in groups. Four free recall trials were deemed sufficient to guarantee exposure of the existence of course-related organization of the key terms and yet avoid problems of ceiling effects on any of the measures.
The booklet (see sample, Appendix A) has a cover sheet requesting the student's name, small group section number, section leader, date, and information regarding which units had been finished or started at the time of testing. This sheet is followed by four randomized word lists for study trials, each followed by a blank sheet on which the students record the words they remember during the test trials. The students are allowed one second per word for study trials (a total of 49 seconds) and three seconds per word for recall or test trials (a total of 147 seconds).

All students receive the same booklet, regardless of testing times. As mentioned in the previous section, each study trial is composed of a different randomly ordered word list. The following chart summarizes the student's study trial stimulus materials.

Trial           1    2    3    4
Random Order    A    B    C    D

Design

Introduction

This study includes the following independent and dependent variables:

Variables                                Levels

Independent
    Testing times                        5
    Units                                4
    Trials                               4
    Subjects                             350
    Completion index                     3

Dependent
    Recall (R)
    Segmental clustering (Cs)
    Suprasegmental clustering (Css)
    Mastery test scores (MS)

The dependent measures (R, Cs, and Css) were introduced in Chapter II and are described more fully on page 27. The dependent variable of mastery test scores was described on page 24. The independent or design variables will be discussed in this section.

The complete design is shown in Figure 1. Using this design as a base, two separate analyses will be undertaken. The first analysis is theoretical and will involve collapsing across units, leaving only the dependent measures (R, Cs, and Css) nested within trials. The second analysis is instructional and will involve collapsing across the variable of trials, leaving the dependent measures nested only within units.
Both, then, will be 5 x 4 designs with three dependent variables. Trials and units, respectively, will be considered as repeated measures. In succeeding subsections, the independent variables of interest will be discussed in more detail.

[Figure 1: diagram of the complete design]

Testing Times

In order to study the growth of organization over the entire course sequence, the testing instrument was given at five times during the quarter. The probes or testing times were spaced over the quarter such that as a unit reached completion in the carrels, a testing time was planned. While officially the emphasis in the carrels may have shifted from Unit II to Unit III, some students may still be working on Unit I or Unit II. This possibility, while perhaps not ideal from a strict experimental point of view, is necessitated because the course operates on a mastery model which allows students to move through the carrel programs at their own rate. Thus at any testing time, students will be spread in terms of which units are untouched, started or finished. The following table describes concretely the independent variable of testing times.

TABLE 7.--A time-line illustration and explanation of the independent variable of testing times (fall quarter).

Dates           Testing Time    Course status
9/29-9/30       I               Pre-instruction
10/11-10/14     II              Unit I nearing completion (Introductory)
10/18-10/21     III             Unit II nearing completion (Assessment)
10/25-10/27     IV              Unit III nearing completion (Objectives)
11/22-11/23     V               Unit IV nearing completion (Strategies)

At each testing time, exactly the same instrument was administered to 9 small group sections. A total of 45 small groups were randomly assigned to five testing times, thus allowing 9 independent groups per testing time. Each small group contained approximately 15 students, though absences on the testing day resulted in loss of subjects.

Units

As described earlier in the Materials section (page 27), there are four units of subject matter. Performance data on the dependent measures of recall, segmental clustering and suprasegmental clustering for each unit will be gathered at each testing time, thus allowing examination of changes in within-unit performance over the quarter. As the research is to be done in an ongoing instructional program, counterbalancing of the units is not possible and effects of order will not be testable.

Trials

As described earlier (page 33), four recall trials are given to each subject.

Unit of Analysis

Subjects are considered the unit of analysis in this thesis. While subjects are tested in preestablished small groups in Education 200, no threats to independence are in evidence. The small groups referred to are established for the purposes of personal growth and as such do not deal directly with the subject matter of intimate concern in this study. The substantive material on which this research relies is studied by all subjects in Education 200 in individualized study carrels via tapes, slides, film clips and programmed work books. There is no reason, therefore, to expect systematic influences on carrel performance to accrue as a result of involvement in any particular small group. Further, the assignment of the 45 small groups to a particular testing time was done through the use of a table of random numbers.
Completion Index

Each subject at testing must indicate for each of the four units whether it is untouched (0), started (1), or completed (2). Thus, a sample student may have completed Unit I (2), started Unit II (1) and not yet begun Units III and IV (0,0). Since each of the four units contains a different number of words to be "learned," each unit contributes a different weight to the total task. Thus, to further increase our precision, each of the values just discussed will be multiplied by the constant weight or percentage of total words contained in the unit in question. The weights for the units are as follows: Unit I (11/49 = .2245), Unit II (8/49 = .1633), Unit III (17/49 = .3469), Unit IV (13/49 = .2653). To exemplify how these weights will be used, let us assume that a student reports his completion status as 2,1,0,0; he has completed Unit I, started Unit II and has not yet started Units III and IV. If we sum the weighted unit completion values, in this case, 2 (.2245) + 1 (.1633) + 0 (.3469) + 0 (.2653), we produce .6123, an index of completion which describes in a relatively direct sense how far along in the substantive materials the subject has progressed. These indices can range from 0 to 2. If this index is determined for all subjects within a testing time and the mean completion index for the group produced, we have an overall indication of how far the group, as a whole, has progressed toward course completion at each testing time.

Though no attempt to align or transform the data on the basis of the completion index will be made, the indices will be used to provide a more realistic graphical portrayal of the variable of testing time. For example, if one were to graph the simple variable of testing times, the first reaction would be to space them equidistantly over the abscissa.
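The weighted completion index described above can be sketched in a few lines. The names `UNIT_WORDS` and `completion_index` are hypothetical, and the sketch uses the exact fractions (11/49 and so on) rather than the four-place rounded weights, so its result for the 2,1,0,0 example is about .612 rather than exactly .6123.

```python
# Number of list words contributed by each unit (49 in total).
UNIT_WORDS = {"I": 11, "II": 8, "III": 17, "IV": 13}
TOTAL_WORDS = sum(UNIT_WORDS.values())


def completion_index(status):
    """Weighted completion index for one subject.

    status maps each unit to 0 (untouched), 1 (started) or 2 (completed).
    """
    return sum(status[unit] * n / TOTAL_WORDS for unit, n in UNIT_WORDS.items())


# The worked example: Unit I completed, Unit II started, III and IV untouched.
example = completion_index({"I": 2, "II": 1, "III": 0, "IV": 0})

# Mean index for a (made-up) group of three subjects at one testing time.
group = [
    {"I": 2, "II": 1, "III": 0, "IV": 0},
    {"I": 2, "II": 2, "III": 0, "IV": 0},
    {"I": 1, "II": 0, "III": 0, "IV": 0},
]
group_mean = sum(completion_index(s) for s in group) / len(group)
```

The three-subject group is invented solely to show how a mean index per testing time would be formed; it does not correspond to any data in the study.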
It could be, however, that the rate of progress toward completion varies over the quarter such that the distance between testing times I and II should be much larger than the distances among times III, IV and V. If such absence of equidistance is the case, a more accurate graphical representation of the data will be obtained through the use of the completion index.

Analyses

Since units have varying numbers of words and clusters, the data must be transformed such that the dependent measures (R, Cs, Css) are on an equivalent scale. They have been transformed, therefore, into proportions by dividing the obtained score for the dependent measure by the maximum possible score for the unit and dependent measure in question.

The two basic analyses (instructional and theoretical) will be repeated measures, multivariate analyses of variance. The instructional analysis will be followed by post hoc trend analyses and by computations of correlation coefficients between performance on the dependent measures and mastery test scores.

Research Hypotheses

Theoretical Hypothesis

1. Recall, segmental clustering and suprasegmental clustering will increase over trials.

Instructional Hypotheses

2. a. Within unit, dependent measure totals will increase over testing times.

   b. (1) Unit 1 dependent measures will be significantly higher at testing time II than at testing times I, III, IV, and V.

      (2) Unit 2 dependent measures will be significantly higher at testing time III than at testing times I, II, IV and V.

      (3) Unit 3 dependent measures will be significantly higher at testing time IV than at testing times I, II, III and V.

      (4) Unit 4 dependent measures will be significantly higher at testing time V than at testing times I, II, III and IV.

At testing time I, or preinstruction, there will be no correlation between performance on the dependent measures and the mastery test scores.
If a unit has been completed at testing times II, III, IV or V, there will be a positive correlation between performance on the dependent measures and the mastery test scores.

CHAPTER IV

PRESENTATION OF THE DATA

Introduction

This chapter will summarize the outcomes of the research outlined in the previous chapters. This presentation will attend closely to describing these outcomes, not discussing their implications, as that is reserved for Chapter V. Four major presentations will ensue: the completion index, the multivariate analyses of the theoretical hypothesis, the multivariate analyses of the instructional hypotheses and the instructional correlational analysis.

The Completion Index

As described on page 39, a weighted completion index was computed for each subject. The mean completion indices for each testing time have been computed and are reported below in Table 8. Note that the "ideal" values are those weights for each testing time that would have been obtained if students had moved in a totally rigid fashion through the course material; the "obtained" values reflect the population's freedom to move through the materials at their own pace.

TABLE 8.--"Ideal" and "obtained" completion indices by testing time.

                              Testing Time
                   I        II       III      IV        V
Ideal values       0        .4490    .7756    1.4694    2.0000
Obtained values    .1032    .6198    .9239    1.1207    1.7630

Both ideal and obtained completion indices were plotted on the same graph. The curves were little different from each other. Further, the only reason for computing the completion indices was to more accurately describe the variable of testing times. It was felt that when the dependent measures were graphed as a function of time, the curves resulting from the use of testing times on the abscissa might look quite different from those obtained using the weighted completion indices. Such was not the case.
Several sets of data were graphed using both time variables, and the appearance of the curves was virtually identical. Therefore, since no advantage accrued through the use of the more complex weighted completion indices, the more parsimonious variable of equidistantly placed testing times will be used in graphing most of the data.

Theoretical Hypothesis

The hypothesis of interest in this analysis is: Recall, segmental clustering and suprasegmental clustering will increase over trials.

Before detailing the series of analyses that were undertaken to test the above hypothesis, it is useful to examine the means and standard deviations that were obtained from the samples. They are presented in Tables 9 (recall), 10 (segmental clustering) and 11 (suprasegmental clustering).

TABLE 9.--Mean proportions of possible recall achieved over four units and standard deviations by trials and testing time.

                         Testing Times
Trials          I       II      III     IV      V       Overall
1.  Mean       .226    .246    .266    .269    .275    .256
    SD*        .063    .081    .072    .077    .079    .075
2.  Mean       .290    .307    .313    .331    .327    .314
    SD         .080    .084    .082    .086    .088    .084
3.  Mean       .345    .354    .365    .400    .397    .372
    SD         .100    .094    .092    .106    .099    .098
4.  Mean       .364    .390    .399    .401    .419    .395
    SD         .107    .113    .096    .111    .117    .109

*SD represents standard deviation.

TABLE 10.--Mean proportion of possible segmental clustering achieved over four units and standard deviations by trials and testing times.

                         Testing Times
Trials          I       II      III     IV      V       Overall
1.  Mean       .042    .050    .057    .059    .051    .052
    SD*        .038    .045    .044    .050    .043    .044
2.  Mean       .050    .055    .051    .068    .060    .057
    SD         .044    .055    .046    .050    .045    .048
3.  Mean       .063    .074    .071    .096    .086    .080
    SD         .057    .055    .053    .076    .066    .062
4.  Mean       .068    .096    .093    .108    .116    .100
    SD         .054    .060    .058    .070    .069    .063

*SD represents standard deviation.

TABLE 11.--Mean proportion of possible suprasegmental clustering achieved over four units and standard deviations by trials and testing times.
                         Testing Times
Trials          I       II      III     IV      V       Overall
1.  Mean       .040    .044    .057    .056    .058    .051
    SD*        .034    .039    .041    .042    .043    .040
2.  Mean       .069    .064    .061    .076    .086    .071
    SD         .045    .049    .038    .047    .056    .047
3.  Mean       .091    .101    .088    .110    .116    .101
    SD         .065    .061    .044    .055    .057    .057
4.  Mean       .089    .102    .099    .116    .104    .102
    SD         .066    .070    .064    .068    .067    .067

*SD represents standard deviation.

Without concern at this time for significance, it would appear that the means generally increase over trials both within and across testing times for all dependent measures. Also, in most cases, the individual trial means tend to increase over testing times.

If the above observations were shown to be significant, they would suggest not only that the dependent measures increased significantly over trials (the hypothesis of interest), but also that the dependent measures increased significantly within trials over testing times; i.e., that trial 1 dependent measure performance at testing time V was significantly greater than trial 1 performance at testing time I.

The analysis of these data consisted of a series of multivariate analyses of variance (MANOVAs), using orthonormalized transformations and orthogonal polynomial contrasts to examine the nature of the learning curves over the four trials. The polynomial contrasts provide tests for four components: the constant term (the "y" intercept), the linear term (the slope of each curve), the quadratic term (first degree curvature or "x²") and the cubic term (second degree curvature or "x³").

The first analysis of this series was an overall test of equality of mean vectors with 70 subjects per cell. It was a two factor MANOVA; factor 1, testing times, had five levels and factor 2, trials, had four levels treated as repeated measures. The three dependent measures (recall, segmental clustering and suprasegmental clustering) were nested within trials.
The overall multivariate F for equality of mean vectors was 634.199 for 12 and 344 degrees of freedom with p < .0001. The interpretation of the main effect of trials was obscured, however, by a significant groups (testing times) by repeated measures (trials) interaction. The F ratio for this multivariate test of equality of mean vectors was 1.803 for 48 and 1288.6 degrees of freedom with p < .0008. This interaction suggests that change over the trials depends differentially upon the testing time in question.

To attempt further explication of the changes over trials, a series of one factor MANOVAs was undertaken. Each analysis examined the trends over the factor of trials (four levels) within a particular testing time. The data from these analyses are summarized in Tables 12 and 13.

Table 12 presents the first segment of the relevant results of these analyses. This table summarizes the overall tests within each testing time, allowing examination of the significance of the contributions of each of the dependent measures considered in the following order: recall, segmental clustering and suprasegmental clustering. Since there were 12 variables for each testing time, an alpha level of 0.005 was used to make decisions regarding significance of the variables' contributions. This would result in an "analysis-wise" error rate of approximately 0.06.

[TABLE 12: F tests and p-values for the constant, linear, quadratic and cubic components of recall, segmental clustering and suprasegmental clustering at each testing time, with the multivariate F test for equality of mean vectors. Table body not recoverable.]

Using the established alpha (0.005) and scanning the p-values associated with each F test from the bottom of Table 12 upwards, the conclusion was reached that at none of the testing times was the suprasegmental clustering measure contributing any unique information to the analysis of the data. In other words, once the variance explained by the recall and segmental clustering data had been taken into account, suprasegmental clustering added no further explanation of variance. Therefore, the suprasegmental clustering measure was dropped in all subsequent steps in the theoretical analysis.

Table 13 presents the second relevant segment of data obtained through the one factor MANOVAs. This table summarizes the within-measure tests of significance for the linear, quadratic and cubic trend components for recall and segmental clustering.
The constant term was eliminated in these analyses, as we were concerned here only with the shape of the learning curves, not with the y intercept of the various curves. Scanning Table 13 led to the conclusion that for both dependent measures there was a reliable cubic, quadratic and linear trend component at each testing time.

[TABLE 13: F tests and p-values for the linear, quadratic and cubic trend components of recall and segmental clustering at each testing time. Table body not recoverable.]

This outcome suggests that the learning curves over trials for all testing times have multiple components, each of which significantly increases the accuracy of the prediction of the line of best fit for the data. In other words, looking at the means over trials, within testing times, in Tables 9, 10 and 11, we can now safely state that the means do tend to increase significantly over trials, as was originally hypothesized, but that this increase is not strictly linear, since the significant quadratic and cubic components suggest a degree of curvature identifiable within each dependent measure over trials by testing times.
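The trend components discussed above can be illustrated numerically. The following is a minimal sketch, not the program used in the study: it assumes the standard orthogonal polynomial weights for four equally spaced levels, scaled to unit length, and takes the constant as the plain mean of the four trial means (both are assumptions, and the function names are hypothetical). Applied to the testing time I recall means from Table 9, it gives a constant of about .306, a linear component of about .105, a quadratic component of about -.02 and a cubic component of about -.006, which appears to agree with the testing time I row of Table 14.

```python
import math

# Standard orthogonal polynomial weights for four equally spaced levels.
# Assumption: the trend contrasts are scaled to unit length (orthonormal).
LINEAR = [-3, -1, 1, 3]
QUADRATIC = [1, -1, -1, 1]
CUBIC = [-1, 3, -3, 1]


def normalize(weights):
    """Scale contrast weights so that their squares sum to one."""
    length = math.sqrt(sum(w * w for w in weights))
    return [w / length for w in weights]


def trend_components(means):
    """Return constant, linear, quadratic and cubic components of four trial means."""
    def project(weights):
        return sum(w * m for w, m in zip(normalize(weights), means))

    return {
        # Assumption: the constant is the unweighted mean over trials.
        "constant": sum(means) / len(means),
        "linear": project(LINEAR),
        "quadratic": project(QUADRATIC),
        "cubic": project(CUBIC),
    }


# Recall means for trials 1-4 at testing time I (Table 9).
recall_time_1 = [0.226, 0.290, 0.345, 0.364]
components = trend_components(recall_time_1)

for name, value in components.items():
    print(f"{name:9s} {value: .4f}")
```

A negative quadratic component, as obtained here, corresponds to a curve that rises more slowly on later trials; swapping in another row of trial means from Tables 9 or 10 shows how the components change across testing times and measures.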
In order to further clarify the differences in the slopes of the learning curves within dependent measures and testing times, the estimated coefficients for each component were computed. Tables 14 and 15 present estimates of the constant, linear, quadratic and cubic components of the curves for recall and CS respectively. Each table of estimated coefficients is followed by the graph of the means of the data for that dependent measure: recall, Figure 2; segmental clustering, Figure 3. For the CSS table and figure, see Appendix C.

TABLE 14.--Estimated coefficients for the learning curves over trials, based on the recall data.*

Testing Times    Constant    Linear    Quadratic    Cubic
I                .306        .105      -.022        -.006
II               .324        .107      -.013         .001
III              .336        .101      -.006        -.005
IV               .350        .104      -.030        -.017
V                .353        .109      -.016        -.008

*Suprasegmental coefficients and graph are in Appendix C, pages 99 and 100.

[Figure 2: line graph; x-axis: Trials; y-axis: proportion of possible recall achieved; one curve per testing time.]

Figure 2.--Proportion of possible recall achieved over four units by trials and testing times.

TABLE 15.--Estimated coefficients for the learning curves over trials, based on the segmental clustering data.

Testing Times    Constant    Linear    Quadratic    Cubic
I                .055        .020      -.002        -.003
II               .069        .035       .008        -.002
III              .068        .029       .014        -.006
IV               .083        .039       .001        -.008
V                .078        .047       .011        -.001

[Figure 3: line graph; x-axis: Trials; y-axis: proportion of possible segmental clustering achieved; one curve per testing time (I-V).]

Figure 3.--Proportion of possible segmental clustering achieved over 4 units by testing times and trials.

The striking feature of the tables of estimated coefficients is the general similarity within each dependent measure. Although the various components are statistically significant, the differences are relatively minor. The major differences seem to occur in the estimated coefficients for the constant component, which represents the y axis intercept. In effect, it appears that the major difference between the curves obtained at the various testing times within measures is representative of a consistent increase in performance on the dependent measures at trial 1 over testing times. The minor differences in the linear, quadratic and cubic components within measures might suggest that the curves themselves are highly similar regardless of testing time, once the absolute differences at trial 1 have been equalized.

Also of note is the difference between the recall and segmental clustering measures in the nature of their non-linear components. All of the quadratic coefficients for the recall data are negative, suggesting decelerating curves, or curves that flatten out over trials. All of the quadratic coefficients for the segmental clustering data, except that for testing time I, are positive, suggesting accelerating curves. Glancing at the graphs of the recall and segmental clustering data with the coefficients in mind (Figures 2 and 3 respectively), the tendencies described are noticeable.

Instructional Analysis

The hypotheses of interest in this analysis are:

a. Within unit, dependent measure totals will increase over testing times.

b. (1) Unit 1 dependent measures will be significantly higher at testing time II than at testing times I, III, IV and V.

   (2) Unit 2 dependent measures will be significantly higher at testing time III than at testing times I, II, IV and V.

   (3) Unit 3 dependent measures will be significantly higher at testing time IV than at testing times I, II, III and V.

   (4) Unit 4 dependent measures will be significantly higher at testing time V than at testing times I, II, III and IV.
Before detailing the series of analyses that were undertaken to test the above hypotheses, it is useful to examine the within-unit and overall means that were obtained from the samples. The means are presented in Tables 16 (recall), 17 (segmental clustering) and 18 (suprasegmental clustering).

TABLE 16.--The mean proportion of possible recall achieved over four trials by testing times and units.

                            Units
Testing Times     1        2        3        4
I                .348     .403     .238     .300
II               .395*    .443     .250     .290
III              .375     .470*    .275     .295
IV               .388     .465     .303*    .310
V                .350     .445     .308     .363*

*The "b" hypotheses suggest that these means should be significantly higher than any other within-unit means.

TABLE 17.--The mean proportion of possible segmental clustering achieved over four trials by testing times and units.

                            Units
Testing Times     1        2        3        4
I                .090     .017     .048     .059
II               .126*    .028     .060     .058
III              .121     .029*    .062     .057
IV               .145     .029     .083*    .066
V                .094     .024     .083     .092*

*The "b" hypotheses suggest that these means should be significantly higher than any other within-unit means.

TABLE 18.--The mean proportion of possible suprasegmental clustering achieved over four trials by testing time and units.

                            Units
Testing Times     1        2        3        4
I                .075     .099     .057     .077
II               .101*    .115     .062     .064
III              .079     .132*    .071     .052
IV               .094     .137     .080*    .074
V                .075     .135     .085     .089*

*The "b" hypotheses suggest that these means should be significantly higher than any other within-unit means.

In regard to hypothesis a above, the means, if ranked by testing times, should appear as follows:

[Figure 4: line graph; x-axis: Testing Times I-V; y-axis: proportion of possible recall achieved; one curve per unit (1-4).]

Figure 4.--Proportion of possible unit recall achieved over four trials by testing times.*

*Graphs for segmental and suprasegmental clustering data are in Appendix D.
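As a purely descriptive check of the "b" hypotheses, the starred cells of Table 16 can be compared with the testing time at which each unit's recall mean is actually largest. The sketch below is an illustration only (the variable and function names are hypothetical) and says nothing about statistical significance. On the Table 16 means, units 1, 2 and 4 peak where hypothesized, while unit 3 is marginally higher at testing time V (.308) than at testing time IV (.303).

```python
# Mean proportion of possible recall by testing time (rows) and unit (columns 1-4),
# transcribed from Table 16.
TESTING_TIMES = ["I", "II", "III", "IV", "V"]
RECALL_MEANS = {
    "I":   [0.348, 0.403, 0.238, 0.300],
    "II":  [0.395, 0.443, 0.250, 0.290],
    "III": [0.375, 0.470, 0.275, 0.295],
    "IV":  [0.388, 0.465, 0.303, 0.310],
    "V":   [0.350, 0.445, 0.308, 0.363],
}

# Hypothesis b: each unit should peak at the testing time that closes its instruction.
HYPOTHESIZED_PEAK = {1: "II", 2: "III", 3: "IV", 4: "V"}


def observed_peak(unit):
    """Return the testing time with the largest mean for the given unit (1-4)."""
    return max(TESTING_TIMES, key=lambda time: RECALL_MEANS[time][unit - 1])


observed = {unit: observed_peak(unit) for unit in HYPOTHESIZED_PEAK}

for unit, expected in HYPOTHESIZED_PEAK.items():
    verdict = "as hypothesized" if observed[unit] == expected else "not as hypothesized"
    print(f"Unit {unit}: hypothesized {expected}, observed {observed[unit]} ({verdict})")
```

The same comparison can be repeated for Tables 17 and 18 by replacing the dictionary of means.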
It is at this point that the ordering becomes useful, because we know that the absolute difference between the recall means at testing times II and V is greater than the absolute difference between the means at testing times IV and V. Therefore it may be concluded that, for at least the recall data in unit 4, the performance peaked at testing time V and was significantly higher than that at any previous testing time, as was hypothesized.

The data and hypothesis tests relevant to the instructional multivariate hypotheses have been summarized. It can now be stated that the recall measure was the only one that included significant differences and, further, that only the recall data for units 3 and 4 evidence significant differences. For unit 3 and unit 4 recall data, then, there is a significant increase over testing times, and the unit 4 recall data do peak as expected at testing time V, when unit 4 has just been studied. From a general point of view, while some parts of the data did perform as expected, it can be said only that the multivariate instructional hypotheses were weakly supported. Further discussion of this set of analyses will be presented in Chapter V.

Correlations Between Dependent Variables and Mastery Test Performance

Overview and Hypotheses

The following section first outlines the process of analysis used to examine the hypothesized relationships between unit dependent measure performance (recall, R; segmental clustering, CS; suprasegmental clustering, CSS) and unit mastery test performance. Once the process of analysis is clarified, the data themselves will be presented and described.

3. a. At testing time I, or preinstruction, there will be no correlations between performance on the dependent measures and the mastery test scores.

   b. If a unit has been completed at testing times II, III, IV or V, there will be a positive correlation between performance on the dependent measures and the mastery test scores.
Process of Analysis

A series of Pearson product moment correlation matrices was first computed using all 350 subjects divided by testing time. Each matrix was therefore based on a population of 70 subjects. The resultant correlations were low, with only a small percentage evidencing significance.

It was then decided that this outcome might be an artifact of the wide range of completion statuses possible at any given testing time in a course wherein students may advance at their own rate (see page 35).

To reduce the possible effect of this artifact, the five pure completion statuses (0000, 2000, 2200, 2220, 2222) were used to extract five new populations across testing times. Recall that 0 represents a unit not yet started and 2 signifies a unit that has been finished. In effect then, all people with the status code 2200 (units 1 and 2 finished, units 3 and 4 untouched) were grouped together, regardless of the testing time.

These five new samples were then used as a basis for a second analysis of the relationship between dependent measures and the unit mastery test scores. This second analysis produced more differentiated correlation coefficients. That is, a greater number of significant correlations were produced, and the range of the coefficients increased.

The data to be presented and discussed, then, will be those produced by the second analysis. Table 23 summarizes the five completion statuses used in the analysis, the completion status code that will be used for identification purposes in the matrices to be presented, the number of subjects (N) in each of the five samples and the critical values for the correlation coefficients at p ≤ .10 for each of the five samples.

TABLE 23.--The completion statuses, completion status codes, sample size and critical values for the correlation coefficients at p ≤ .10, to be used in the instructional correlational analysis.

Completion    Completion
Status        Status Code     N      p ≤ .10
0000          0               57     .21
2000          2               20     .36
2200          4               20     .36
2220          6               19     .37
2222          8               22     .36

Had reliability information been available, the coefficients reported would have been corrected for attenuation. Assuming a relatively low reliability for the instruments, such correction would have increased the size of the correlations reported. Since reliability information was not available, the less conservative alpha level of p ≤ .10 has been used. This less conservative alpha is used with the awareness that the usual interpretation of "significance" of the correlations may be spurious. Therefore, the interpretation of the significance of the relationships will be purely descriptive in nature.

Correlations Between Dependent Variable Performance and Mastery Test Performance

TABLE 24.--Matrices of correlations between unit dependent variable performance and unit mastery test performance by completion status.

                          Completion Status Code
                      0       2       4       6       8
Recall (R)
  R1 x MTS1          .20     .11     .19    -.16     .14
  R2 x MTS2          .35*    .22     .16    -.28     .56*
  R3 x MTS3          .27*    .34     .32    -.23     .49*
  R4 x MTS4         -.01     .36*    .22     .26     .14
Segmental Clustering (CS)
  CS1 x MTS1         .22*    .05     .35    -.07     .06
  CS2 x MTS2         .03     .21    -.42*   -.09     .00
  CS3 x MTS3         .04     .12     .12    -.04     .17
  CS4 x MTS4        -.01     .39*    .16     .15     .12
Suprasegmental Clustering (CSS)
  CSS1 x MTS1        .19     .35    -.02    -.10    -.10
  CSS2 x MTS2        .22*    .48*   -.05    -.23     .37*
  CSS3 x MTS3        .36*    .40*    .32    -.38*    .48*
  CSS4 x MTS4        .05     .33     .30     .04     .18

*Indicates significance at p ≤ .10.
Key: MTS - Mastery Test Score. Subscripts represent the unit in question.

Two general comments are applicable to the data in Table 24. First, note that out of 60 correlations, only 15 are significant at p ≤ .10. This would suggest that the relationship between dependent variable performance and mastery test performance, in an overall sense, is a tenuous one.
Second, units 2 and 3 in the recall and suprasegmental clustering matrices contain eleven of the fifteen significant correlations. This suggests that unit 2 and 3 recall and suprasegmental clustering performances have, among all units and measures, the greatest tendency to be related to the appropriate mastery test performances. For this reason, only the unit 2 and unit 3 correlations for recall and suprasegmental clustering will be discussed further.

In regard to the first hypothesis (no correlation at status 0), it is apparent for both units (2 and 3) that preinstructional recall and suprasegmental clustering correlate significantly and positively with later mastery test performance. In effect, there are individual differences among the students before instruction in units 2 and 3 that are not entirely destroyed by instruction; i.e., those students who perform well on the unit 2 and 3 dependent variables of interest tend to do well on later mastery tests on those units, and vice versa.

In regard to the second hypothesis, after instruction, again for units 2 and 3 in the recall and suprasegmental clustering data, the only consistent relationship is that at status code 8. In these cases there is a positive and significant correlation. In general, however, the hypothesized post-instructional correlations are not in evidence. It would appear then that those individual differences which are exposed by free recall performance are not in any strong, systematic way related to individual differences in mastery test performance, as was hypothesized.

In summary, neither of the hypotheses for the correlational data is clearly supported. Because of the small sample sizes used and other design weaknesses to be discussed in Chapter V, no definitive conclusions should be drawn without replication.

CHAPTER V

DISCUSSION AND IMPLICATIONS

Overview

This chapter will discuss and consider the implications of the data presented in Chapter IV.
There are four major divisions within the chapter: the theoretical analysis; the instructional hypotheses, multivariate analyses; the instructional hypotheses, correlational analysis; and the general implications and recommendations for future research.

The Theoretical Analysis: Change over Trials

The hypothesis that the dependent measures would increase over trials was supported by the data. It does appear that the students' recall and clustering performance increases over trials within each testing time. The nature of the curves is complex, each reflecting significant cubic, quadratic, linear and constant components. In general, the large number of observations may have provided a test powerful enough to find significance in some components that appear little different, both numerically and graphically. This is particularly true within the linear and cubic components.

The major differences in the curves appear to have been within the constant and quadratic components. The significant constant component is of particular interest because it would tend to support the conclusion that trial 1 dependent measure performance increases over testing times in a significant fashion. From an instructional point of view, this is heartening, as it implies that the students at each succeeding testing time have become more familiar with the terminology. Remember, however, that these findings relate primarily to the recall measure, secondarily to segmental clustering and little, if at all, to suprasegmental clustering.

The quadratic component differences between recall and segmental clustering are also of interest. The negative quadratic coefficients for recall suggest a decelerating curve. This may imply that the recall would, after another few trials, flatten out and never rise again. There is a second possibility that will be discussed after the succeeding statements about segmental clustering.
The segmental clustering quadratic coefficients are primarily positive (excluding that for testing time I, which was negative), which implies an accelerating curve, one that does not appear to be peaking or flattening out. This is important because it suggests that the reporting of the clustering organization was accelerating over trials. Perhaps, if more trials had been given, the clustering would have increased in an absolute sense to a sufficient degree to change the nature of the recall curves. For instance, it is possible that if eight or ten trials were given, the recall would, instead of flattening out from trial three or four on, merely reach a plateau for a few trials before resuming a second increase in performance, or period of acceleration. One obvious suggestion, then, is to study performance over a longer series of trials.

A second feature of interest is the redundancy of the suprasegmental clustering measure. Once the effects of the recall and segmental clustering measures have been examined, the suprasegmental clustering measure adds no new information in regard to change over trials. Though the suprasegmental clustering measure is redundant and unnecessary when the segmental clustering measure is obtainable for a set of words or data, it is possible that, because the two clustering measures are defined differently, there may be experimental questions based on materials that are not amenable to the use of the segmental clustering measure. These materials may be more amenable to the use of the suprasegmental clustering measure and analysis. This question of the relative viability of the two clustering measures is one that appears worth further examination.

A final interesting feature is the low level of performance evidenced throughout the data. For example, looking at the overall means for the four trials in the recall measure (Table 9), the recall increases from 26% to 40% over four trials.
This would indicate an approximate increase from 12.7 (26%) words on trial 1 to 19.6 (40%) words on trial 4. Bousfield and Cohen (1953) found an increase over five trials with 60 words from a mean of 23.9 (40%) at trial 1 to a mean of 37.9 (63%) at trial 5. It was expected that, since the words used in this research had been so recently studied in relative depth, performance would at least equal, and perhaps exceed, the performance on words which had not necessarily been under recent examination, such as those in the research of Bousfield and others. These findings suggest that, for simple recall purposes, instructional terminology does not become equivalent to "high frequency" words such as colors, animals, professors, etc., with such short exposure.

Bousfield's (1953) clustering measure about doubled over the five trials, as did those in this study. Again, it is not the percentage of increase that is disappointing, but rather the absolute level of performance observed. In general, it appears that the words and their relationships studied in the materials for this research did not behave as high frequency words might have been expected to. This would suggest that long term exposure to the terminology, jargon, etc., inherent to various subject matter areas would be necessary for the words to become "high frequency" for the individual. In retrospect, this makes a good deal of common sense. It would be interesting to experiment with doctoral students or scholars who are concentrating on a particular field of study over several years to see what effects could be noted in key terminology drawn from their fields or, in other words, to examine how long it takes people to acquire the often talked about "structure of knowledge" for a given subject area.

Of course, alternative hypotheses to that above are possible for explaining the generally low level of performance. Perhaps the students were poorly motivated.
This is possible, but most verbal learning research is done with similar kinds of students, so it does not appear to be a strong possibility. Also, the low performance may be related to the strangeness of the task. This possibility could be examined by manipulating the number of trials, as mentioned earlier.

The analysis of the hypothesized change in dependent measure performance over trials has been presented. The ensuing section will discuss the hypotheses of within unit increase in dependent measure performance over testing times.

The Instructional Analyses: Change within Units Over Time

The means themselves, in the absence of statistical analysis, seemed to support the hypotheses of interest. True, the interactions were complex, with different units behaving differently at different testing times on different dependent measures, but the trends of the means themselves were seemingly supportive. The statistical analysis, however, suggested that neither clustering measure evidenced anything of particular significance and that the recall measures did so only in units 3 and 4.

The significance in units 3 and 4 makes one wonder why significance is lacking for recall in units 1 and 2. It is possible that the words chosen from units 1 and 2 were not as "crucial" in absolute terms as were those in units 3 and 4. On the other hand, if all could be defined as relatively equivalent in importance, perhaps the instruction for units 3 and 4 was more "successful" than that for the first two units. Either hypothesis, or both, could be true, and both would be difficult to "test." I might add, however, that in the judgment of some of the material developers, the last two units are more differentiated and better developed than the first two, and this feeling is supported to some extent by the fact that the first two units are presently undergoing major redevelopment by those involved in the course.
The lack of any evidence that the clustering measures were non-redundant with the recall measure was also unexpected. In effect, the clustering measures are of no significance in showing change within units over time. This suggests that the students did not pick up any significant degree of organization over the quarter that could be exposed by the free recall task. Certainly these findings would negate the possibility of using the free recall task to measure acquisition of organization that was treated in an incidental fashion by the course materials, as was the case here. To clarify, the students were not asked to memorize the key terminology, nor were they asked to form particular associations. The words were related contextually throughout the materials, but the way that the student stored the information he studied was not directly manipulated. If the structure of interest in this study was thought to be of crucial importance by the material developers, then the inclusion of specific objectives and experiences designed to teach that structure might make a difference. The results do imply, however, that students on the average do finish the course (recall that all subjects that did not finish the course were deleted from the samples), and appear not to have concerned themselves, on their own, with the relationships between the key terminology identified by the author. Also, since all students who finish the course must pass a series of mastery tests, the organization tapped by the free recall task does not seem to be a direct outcome of successfully studying the course materials to pass the tests given in the course. This relationship between the dependent measure performance and mastery test performance is examined further in the next section of Chapter V, when the correlational analysis is discussed.
The possibility that more direct tuition of the desired relationships is needed is partially supported by re-examination of Johnson's work in free association (1967, 1965, 1964). He did his research in the context of physics. The words he chose were related to each other in the form of equations, which incidentally must also be known to solve the physics problems that would comprise the evaluation instruments. The relationships among the various words chosen, then, were much more directly taught, and mastery of them was demanded of the students to complete the course successfully.

The discussion of the analysis of the hypotheses of change in within-unit dependent measure performance has been presented. The next section discusses the outcomes of the analysis of the hypothesized correlations between dependent measure performance and mastery test performance.

The Instructional Hypotheses: Correlations Between Unit Dependent Measure Performance and Unit Mastery Test Scores

First, each of the hypotheses was discussed only on the basis of the significant correlations to be found in the data. In general, units 1 and 4 supported the first hypothesis (no relationship between entry dependent variable performance and later mastery test performance) and did not support the second hypothesis (positive correlations after unit completion). Units 2 and 3, on the other hand, did not support the first and did support the second, particularly within the recall and suprasegmental clustering measures. In effect, then, unit recall and suprasegmental clustering measures that show significant positive correlations before instruction also tend to show significant positive correlations after instruction, and vice versa.

The explication of this phenomenon is difficult. Perhaps some students had previous experience with the terminology, etc., in units 2 and 3, thus starting the quarter with an advantage over others who had no such previous experience.
If this was the case, the instruction tended not to mitigate or wash out these early advantages, as those who started out ahead stayed ahead. It is possible that the post-instructional significant positive correlations describe the same sort of individual differences in retrospect. This explanation would leave two possibilities for the explanation of units 1 and 4. The first possibility is that the students had uniformly little advantage at the beginning and the instruction did not help or hinder students in any systematic way. The second possibility is that the students entered the units with the same variance of previous experience with the terminology, etc., but that the instructional materials helped the students without prior experience to overcome their initial disadvantage. One other possibility that has not yet been discussed is that, for units 2 and 3, the mastery tests more directly test the organization exposed by the free recall task, while the unit 1 and 4 tests do not.

Which of the above explanations is the most accurate, if any of them is, is impossible to say. Further research testing the effects of entry behavior, the effects of differently composed mastery examinations, and the effects of differently organized textual materials would perhaps help unravel the relatively obscure data outcomes. The use of larger sample sizes would also lend credibility to the findings, as the need to regroup subjects for the correlational analysis may have produced spurious results.

The general lack of significance and clear trends may suggest that the organization or learning needed to perform well on the free recall task is not closely related to or reflected by that necessary to do well on the mastery tests. Also, it is possible that the low correlations are an artifact of limited differences in performance on the mastery tests.
This is possible because mastery tests, by definition and design, are not usually of high difficulty and discrimination in classical measurement terms.

General Implications and Recommendations for Further Research

The free recall task, at this time, would not appear to be of any practical use in an instructional setting. It seems to be subject to confounding influences and complex interaction patterns, which provide anyone needing relatively clean information with results of little practical significance.

The conceptualization for this research appeared logical enough, but it seems that, at the very least, we tried to look at too much too soon. Laboratory research is done under closely controlled conditions, and it is perhaps naive to expect that so many controls could be successfully dropped at once. It would be wise in the future for researchers attempting to apply such laboratory techniques to an instructional setting to limit the scope of the research much more severely at first. In this study, for instance, group testing supplanted the more frequently used individual testing, and multiple units of material, testing times, and dependent variables were used. Perhaps one or two of these divergences at a time is more than any paradigm could withstand. Add to this the fact that students move at their own pace through the course and that the nature of the study materials and mastery tests is out of the control of the experimenter, and it is not at all surprising, in retrospect, that the outcomes were somewhat disappointing and unclear.

The new dependent variable of suprasegmental clustering still seems worthy of further investigation, if not in the instructional setting, then back in the laboratory. Certainly in the correlational analysis, suprasegmental clustering produced far more significant correlations than did segmental clustering.
This suggests the possibility that examination of student subjective organization, of which suprasegmental clustering as defined in this study (page 19) represents a piece, might account for more variance than did segmental clustering and suprasegmental clustering together. Also, suprasegmental clustering may be able to supplant segmental clustering in materials where that type of clustering is inappropriate.

While the nature of classroom learning was not greatly illuminated in this study, the study did make patently clear the complexity of the instructional setting and the probable need to develop new means of assessing and studying this type of learning. More such attempts at transfer from laboratory to classroom may be necessary before we know for sure whether laboratory techniques such as the free recall task have any practical significance outside of the laboratory setting, but this research would suggest that they do not.

APPENDICES

APPENDIX A

TESTING BOOKLET

Section:          Section Leader:          Date:
Name:          Student #:

Place an "X" in the box beside those units you have completely finished.

I have completed the following units:
[ ] Introductory
[ ] Assessment
[ ] Objectives
[ ] Strategies

I completed the last of the above units on the following date:

I have started but not completed the following units:
[ ] Introductory
[ ] Assessment
[ ] Objectives
[ ] Strategies

Random Order A

Kohlberg, teaching, negative, stimulus, behavioral, design, terminal, conditions, I.Q., recognition, curricular, learning, non-behavioral, modeling, environmental, ends, moral, valuing, beyond-school, means, pre-test, positive, strategies, respondent, human, criteria, operant, intellectual, socio-emotional, instruction, variables, Piaget, givens, convergent, response, complete, recall, incomplete, shaping, in-class, attitude, formal, feelings, attending, committing, preoperational, reinforcers, classical, achievement

Random Order B

respondent, attending, beyond-school, recognition, positive, design, Piaget, criteria, intellectual, ends, convergent, Kohlberg, operant, shaping, non-behavioral, formal, givens, terminal, human, response, recall, committing, valuing, variables, attitude, stimulus, incomplete, teaching, classical, preoperational, moral, environmental, reinforcers, learning, feelings, complete, I.Q., conditions, in-class, socio-emotional, means, pre-test, behavioral, strategies, achievement, modeling, instruction, negative, curricular

Random Order C

learning, means, I.Q., operant, Piaget, terminal, criteria, reinforcers, instruction, recall, intellectual, valuing, moral, behavioral, variables, shaping, Kohlberg, design, classical, feelings, strategies, convergent, socio-emotional, non-behavioral, ends, pre-test, negative, response, formal, curricular, environmental, attitude, preoperational, stimulus, committing, achievement, recognition, givens, attending, complete, in-class, positive, modeling, beyond-school, conditions, teaching, respondent, human, incomplete

Random Order D

means, classical, pre-test, response, givens, incomplete, valuing, committing, non-behavioral, attitude, socio-emotional, I.Q., modeling, beyond-school, reinforcers, intellectual, teaching, operant, recall, curricular, respondent, instruction, preoperational, moral, recognition, environmental, formal, negative, ends, positive, human, attending, stimulus, feelings, conditions, learning, achievement, in-class, terminal, shaping, strategies, behavioral, complete, variables, convergent, Piaget, criteria, Kohlberg, design

APPENDIX B

INSTRUCTIONS TO THE SUBJECTS

The following instructions were given verbatim to all subjects at all testing times.

I am involved in research with Education 200. The task I am going to ask you to perform is designed to give us feedback to help improve the course. Your performance will in no way affect your grade or your progress in the course. Would you please fill out the cover sheet of the booklet I am passing out? Please do not open the booklet until I ask you to.

--Pause while the students fill out the cover sheet.

The task may seem strange to you but I hope you will find it enjoyable. Please don't open the booklet until I finish giving the instructions. When you do open the booklet you will note that you have four pages that look like this (I hold up a sample word list). Each of these pages will have two columns of words. After each page like this, you will find a blank page. Okay. Now the task will go as follows. When I tell you to begin, you will have about one minute to study all of the words on the first page. When I tell you to stop, you should immediately turn to the following blank page and write down all of the words you remember from the study trial. You should write them down in whatever order they come to you. You will have about three minutes for each of these test trials. Finally, we will do four such trials, one right after the other. Now are there any questions?

--Pause for questions.
APPENDIX C

CSS COEFFICIENTS AND GRAPH FOR THE THEORETICAL ANALYSIS

APPENDIX C.--Estimated coefficients for the learning curves over trials, based on the suprasegmental clustering data.

Testing Time    Constant    Linear    Quadratic    Cubic
I                 .072       .039       -.016      -.004
II                .078       .048       -.009      -.012
III               .076       .034        .004      -.009
IV                .090       .048       -.007      -.101
V                 .091       .036       -.022      -.011

[Graph: proportion of possible CSS achieved (vertical axis) plotted against trials (horizontal axis).]

Figure 5.--Proportion of possible suprasegmental clustering achieved over four units by trials and testing times.

APPENDIX D

GRAPHS OF CS AND CSS DATA FOR THE INSTRUCTIONAL ANALYSIS

[Graph: one line per unit (units 1 to 4), plotted against testing times I to V.]

Figure 6.--Proportion of possible unit segmental clustering achieved over four trials by testing times.

[Graph: one line per unit (units 1 to 4), plotted against testing times I to V.]

Figure 7.--Proportion of possible suprasegmental clustering over four trials by testing times.

BIBLIOGRAPHY

Adams, J. A. Human memory. New York: McGraw-Hill, 1967.

Ausubel, D. P. A subsumption theory of meaningful verbal learning and retention. J. Gen. Psychol., 1962, 66, 213-224.

Ausubel, D. P., Stager, M., and Gaite, A. J. H. Proactive effects in meaningful verbal learning. J. Ed. Psychol., 1969, 60, 59-64.

Ausubel, D. P., Stager, M., and Gaite, A. J. H. Retroactive facilitation in meaningful verbal learning. J. Ed. Psychol., 1968, 59, 250-255.

Belanger, M. Learning studies in science education. Rev. Ed. Research, 1969, 39:4, 377-395.

Bereiter, C., and Engelmann, S. Teaching disadvantaged children in the preschool. Englewood Cliffs, N.J.: Prentice-Hall, 1966.

Bleuler, E. Organization and pathology of thought. (D. Rappaport, ed.). New York: Columbia University Press, 1951.

Bloom, B. S. Mastery learning. Evaluation comment I, no. 2. Los Angeles: Center for the Study of Instructional Programs, University of California, 1968.

Bloom, B. S. (ed.). Taxonomy of educational objectives: The classification of educational goals. Handbook 1. Cognitive domain. New York: McKay, 1956.

Bloom, B. S., Hastings, J. T., and Madaus, G. F. Handbook on formative and summative evaluation of student learning. New York: McGraw-Hill Book Company, 1971.

Bousfield, W. A. The occurrence of clustering in the recall of randomly arranged associates. J. Gen. Psychol., 1953, 49, 229-240.

Bousfield, W. A., and Cohen, B. H. The effects of reinforcement on the occurrence of clustering in the recall of randomly arranged associates. J. Psychol., 1953, 36, 67-81.

Briggs, G. E. Retroactive inhibition as a function of the degree of original and interpolated learning. J. Exp. Psychol., 1957, 53, 60-67.

Bruner, J. S., Goodnow, J. J., and Austin, G. A. A study of thinking. New York: John Wiley & Sons, Inc., 1956.

Bruner, Jerome S. The process of education. Cambridge, Mass.: Harvard University Press, 1960.

Cofer, C. N. On some factors in the organizational characteristics in free recall. Am. Psychologist, 1965, 20, 261-272.

Cofer, C. N., and Bruce, D. R. Form-class as the basis for clustering in the recall of non-associated words. J. Ver. Learning and Ver. Behavior, 1965, 4, 386-389.

Cofer, C. N., and Foley, J. P., Jr. Mediated generalization and the interpretation of verbal behavior: I. Prolegomena. Psychol. Rev., 1942, 49:6, 513-540.

Deese, J. Influence of inter-item associative strength upon immediate free recall. Psychol. Reports, 1959, 5, 305-312.

Deese, J. On the structure of associative meaning. Psychol. Rev., 1962, 69, 161-175.

Frase, L. T. Cybernetic control of memory while reading connected discourse. J. Ed. Psychol., 1969, 60, 49-55.

Frase, L. T. Effect of incentive variables and type of adjunct question upon text learning. J. Ed. Psychol., 1971, 62, 371-375.

Gagne, R. M. The conditions of learning. (2nd ed.) New York: Holt, Rinehart and Winston, Inc., 1970.

Hays, W. L. Statistics. New York: Holt, Rinehart and Winston, Inc., 1968.

Heimer, R. T. Conditions of learning in mathematics: Sequence theory development. Rev. Ed. Research, 1969, 39:4, 493-508.

Henderson, J. B., Willard, S. M., Barnes, H. L., and Prawat, R. S. The individual and the school. East Lansing, Mich.: Michigan State University Printing, 1971.

Jenkins, J. J., and Russell, W. A. Associative clustering during recall. J. Abnorm. and Soc. Psychol., 1952, 47, 818-821.

Johnson, P. E. Associative meaning of concepts in physics. J. Ed. Psychol., 1964, 55:2, 84-88.

Johnson, P. E. Word relatedness and problem solving in high school physics. J. Ed. Psychol., 1965, 56:4, 217-224.

Johnson, P. E. Some psychological aspects of subject matter structure. J. Ed. Psychol., 1967, 58:2, 75-83.

Kent, G. H., and Rosanoff, A. J. A study of association in insanity. Am. J. Insanity, 1910, 67, 37-96.

Kuhn, T. S. The structure of scientific revolutions. Chicago: University of Chicago Press, 1970.

Miller, G. A., Galanter, E., and Pribram, K. H. Plans and the structure of behavior. New York: Holt, Rinehart and Winston, Inc., 1960.

Murdock, B. B., Jr. The immediate retention of unrelated words. J. Exp. Psychol., 1960, 60, 222-234.

Neisser, U. Cognitive psychology. New York: Appleton-Century-Crofts, 1967.

Osgood, C. E. Behavior theory and the social sciences. Behav. Sci., 1956, 1:3, 167-185.

Paivio, A. Mental imagery in associative learning and memory. Psychol. Rev., 1969, 76, 241-263.

Romberg, T. A. Current research in mathematics education. Rev. Ed. Research, 1969, 39:4, 473-491.

Rothkopf, E. Z. Textual constraints as a function of repeated inspection. J. Ed. Psychol., 1968, 59, 20-25.

Rothkopf, E. Z., and Coke, E. U. The concept of mathemagenic activities. Rev. Ed. Research, 1970, 40, 325-336.

Rothkopf, E. Z., and Coke, E. U. Learning about added sentence fragments following repeated inspection of written discourse. J. Exp. Psychol., 1968, 78, 191-199.

Russell, W. A., and Jenkins, J. J. The complete Minnesota norms for responses to 100 words from the Kent-Rosanoff word association test. Technical Report No. 11, University of Minnesota, Contract N8onr-66216, Office of Naval Research, 1954.

Saltz, E. The cognitive bases of human memory. Homewood, Ill.: Dorsey, 1971.

Skinner, B. F. Verbal behavior. New York: Appleton-Century-Crofts, Inc., 1957.

Smith, H. A. Curriculum development and instructional materials. Rev. Ed. Research, 1969, 39:4, 397-413.

Staats, A. W., and Staats, C. K. Complex human behavior. New York: Holt, Rinehart and Winston, Inc., 1963.

Thorndike, E. L., and Lorge, I. The teacher's word book of 30,000 words. New York: Cambridge University Press, 1944.

Tulving, E. Intratrial and intertrial retention: Notes toward a theory of free recall verbal learning. Psychol. Rev., 1964, 71, 219-237.

Tulving, E. Subjective organization in free recall of "unrelated words." Psychol. Rev., 1962, 69, 344-354.

Tulving, E. Theoretical issues in free recall. In T. R. Dixon and D. L. Horton (eds.), Verbal behavior and general behavior theory. New York: Prentice-Hall, Inc., 1968.

Tulving, E., and Madigan, S. Memory and verbal learning. Annual Rev. Psychol., 1971, 21-84.