HYPOTHESIS SAMPLING AND INFORMATION PROCESSING IN CONCEPT IDENTIFICATION

Thesis for the Degree of M. A.
MICHIGAN STATE UNIVERSITY
DAVID JOHN DePALMA
1971

ABSTRACT

Many researchers have replicated the "outcome effect" in experiments in concept identification (Richter, 1965; Levine, 1966; Kornreich, 1968; DePalma, 1969; and others). Some of the methodological problems, however, have received little attention. DePalma tried to avoid some of these problems by using a modified version of Richter's design. The approach proved successful, but it too had some deficiencies. The present study extends this earlier investigation, and examines the effects of memory aid and frequency of experimental question on subjects' performance. The relationship between problem type, sequence of outcomes and performance is also investigated.

Subjects were asked the question, "How did you make that choice?" either after each trial or only once per four-trial problem. Half of each of these groups were allowed to use a "memory" aid, paper and pencil, while half were not. All subjects were given sixteen four-trial, four-dimensional problems, with each dimension (color, letter, size or position) correct (relevant) an equal number of times. The variables of question frequency and memory aid were controls in this study, since it was expected that neither one would have differential effects on performance.
It was predicted that:

1) Problem type and sequence of outcomes would be influential factors on performance;
2) Changes of hypothesis would occur after rights as well as after wrongs;
3) A new hypothesis would not always be consistent with the information the subject received on an error trial;
4) The subject would consider as hypotheses stimuli which had failed to pass a consistency check;
5) Hypothesis-sampling would occur with replacement;
6) The outcome effect would be replicated;
7) No one particular strategy (Win-Stay, Lose-Shift) would be used more frequently than any other: a) subjects would respond on the basis of one hypothesis while processing one or more hypotheses, b) a more important factor (than strategy) in processing would be how subjects used the information they received, especially "wrong" information.

Ninety-six undergraduates were individually tested with the experimenter giving outcomes, asking the experimental question, and recording the subjects' responses.

Analysis of question frequency, memory aid and problem type showed a significant effect of problem type. No other effects were significant. Size problems resulted in the lowest level of performance. This result replicates DePalma's earlier work, but requires further research for an explanation. In another analysis of variance, the main effects of sequence of outcome and problem type were significant. Of these, sequence proved to be more influential. The other predictions were confirmed.

From these data, it was observed that some subjects processed and recorded "wrong" information with "right" information. Other subjects only utilized a portion of the available input. The latter method of processing caused many difficulties, and usually did not lead to solution attainment by these subjects.
It was concluded that subjects who can use correct ("right") and incorrect ("wrong") information effectively will solve the problems despite the experimental conditions employed. However, subjects who have trouble processing information (especially "wrong") may be affected positively or negatively by the same methodology. The hypothesis-sampling and processing of such subjects should be more closely investigated in future studies.

HYPOTHESIS SAMPLING AND INFORMATION PROCESSING IN CONCEPT IDENTIFICATION

BY
David John DePalma

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
MASTER OF ARTS
Department of Psychology
1971

To my parents, who have made all this possible.

ACKNOWLEDGMENTS

I would like to express my gratitude to the members of my committee: Dr. Ellen Strommen, Dr. William Stellwagen, Dr. Gordon Wood and Dr. Donald Johnson for all their assistance in planning this research. Dr. Strommen particularly deserves my appreciation for the time, energy and encouragement she offered me during the entire project.

I am especially indebted to Dr. Martin Richter, Lehigh University, who introduced me to this research area, and without whose generous assistance this project would not have been realized.

The students who served as subjects should be thanked for their time and efforts.

I would also like to thank Charlotte Wright for her help with the data, and her advice and enthusiasm during the study.

TABLE OF CONTENTS

LIST OF TABLES
INTRODUCTION
METHOD
RESULTS
DISCUSSION
REFERENCES
APPENDIX
   A  EXPERIMENTAL DESIGN
   B  QUESTION MATRIX

LIST OF TABLES

Table
1. MEAN NUMBER OF CORRECT PROBLEMS AND VARIANCE FOR EACH PROBLEM TYPE BY GROUP CELL
2. SUMMARY OF ANALYSIS OF VARIANCE OF QUESTION X MEMORY AID X PROBLEM TYPE
3. PERCENTAGES OF CORRECT-CHOICE RESPONSE ON TRIAL FOUR FOR PROBLEM TYPE AND SEQUENCE OF OUTCOME
4. SUMMARY OF ANALYSIS OF VARIANCE OF PROBLEM TYPE X SEQUENCE OF OUTCOME
5. PERCENTAGES OF CORRECT-CHOICE RESPONSE ON TRIAL FOUR FOR QUESTION AND MEMORY GROUPS BY SEQUENCE OF OUTCOME
6. FREQUENCY OF SUBJECTS BY NUMBER OF CORRECT PROBLEMS
7. NUMBER OF SUBJECTS GIVING CORRECT-CHOICE RESPONSE ON TRIAL FOUR FOR EACH PROBLEM AND PROBLEM TYPE
8. PERCENTAGES OF CORRECTNESS AND USE OF HYPOTHESIS-SAMPLING STRATEGIES FOR PROBLEM TYPES

INTRODUCTION

Many theories of hypothesis sampling and information processing in concept identification have been proposed (Levine, 1966; Bower and Trabasso, 1964; Rogers and Haygood, 1968; and others). However, many of the assumptions of these theories appear questionable in light of recent research.

One of the problems with these theories is that they usually describe only good solvers, and the situation where subjects use information correctly. The shortcomings of these descriptions are twofold. First, few (if any) generalizations can be made in the application of these assumptions to the processing of poor solvers. Second, DePalma (1969) has shown that even the assumptions regarding good solvers are not completely accurate.

Some of the other problems with these theories involve the methodologies used to support them. These procedures have in some way been inappropriate, too highly structured or unsatisfactory to examine the broad range of questions involving the theories.

The present study was designed to test the assumptions concerning hypothesis sampling, avoid some of the methodological problems of past research, and investigate factors involved in subjects' processing of the information they received from the outcomes.
The conceptualization in human discrimination learning of subjects as information processors and analyzers is relatively recent in psychology. Researchers in concept identification characteristically utilize computer terminology for such description.

Levine (1966) proposed a theory (similar to Bruner, Goodnow and Austin's (1956) "focusing" strategy) in which he assumed that subjects remember (encode) all the logically correct cues after an outcome ("right" or "wrong"), store these "hypotheses," and then test these hypotheses on subsequent trials. This allows the subject to eliminate hypotheses until the one correct solution remains.

To test this theory, Levine proposed a method whereby the set of possible hypotheses was determined by the experimental situation and the one hypothesis which the subject was "holding" after each trial could be inferred. The outcomes, "right" or "wrong," were controlled so that the effects of outcome on retention or rejection of a hypothesis held could be analyzed. To obtain the necessary information, Levine devised the "blank" trial method, in which four blank (no outcome) trials are presented between the outcome trials. If the subject responded on the basis of a single hypothesis, that hypothesis manifested itself in a distinguishable sequence over the four blank trials.

From his studies, Levine found that subjects respond on the basis of a hypothesis until a "wrong" outcome is received, at which time they shift to another hypothesis. These data yielded evidence that the subjects hold several hypotheses at one time and eliminate several simultaneously. This led Levine to the formulation of his "focusing" strategy, and in a recent study (1969) he proposes:

a) the subject samples a subset of hypotheses,
b) then the subject takes one, a working hypothesis, from this subset as the basis for his response,
c) the subject uses the outcome, "right" or "wrong," to evaluate those hypotheses in the subset.
The emphasis here is on the subject's monitoring of more than one hypothesis at a time even though he uses only one hypothesis as the basis for his response.

Another hypothesis-testing model of concept identification is the Bower and Trabasso (1964) theory. The basic assumptions of this theory postulate:

1) a change of hypothesis occurs only after a "wrong" outcome, which infirms the hypothesis on which that response was based.
2) a new hypothesis is always consistent with information (about stimulus and response assignment) given on an error trial.
3) stimulus dimensions failing to pass consistency check are not considered possible hypotheses during the selection process.

Although these two theories have been the most noteworthy in the literature, a recent probabilistic model by Rogers and Haygood (1968) attempts to explain hypothesis-testing in concept identification as a process in which:

1) the subject discovers a working hypothesis by changing his hypothesis frequently until he is "right" more than 50% of the time.
2) after discovering a hypothesis which works better than chance, the subject adds amendments, until the solution hypothesis (minus irrelevant hypotheses) has been obtained.
3) the subject no longer changes his hypothesis, and continues to respond on the basis of this hypothesis (see also Falmagne, 1970).

There have been many studies criticizing the assumptions of these theories. Bower and Trabasso's contention that after an error trial, the subject resamples with replacement from the "hypothesis-pool," has been questioned. Restle (1962) provided the original proposition for such sampling, then Bower and Trabasso (1964, 1966) presented supporting evidence. Most recently, Merryman, Kaufmann, Brown and Dames (1968) concluded that the sampling-with-replacement theory could not be rejected. However, not all of the available evidence supports such a conclusion.
Levine (1966), Erickson (1968) and Nahinsky and Slaymaker (1969) have obtained strong support for the contrary proposal that after an error, sampling cannot occur with replacement, but instead occurs without replacement. One aim of the present study was to obtain further evidence relevant to this issue.

Another area of some dispute has been the effects of "right" and/or "wrong" on the information-processing of the subject. Rogers and Haygood (1968) found that for a block of errorless trials, the subject is just as likely to change his hypothesis as he is to keep it; and with at least one error, the subject is as likely to keep his hypothesis as he is to change it. The authors point out that the subjects could have changed hypotheses because of implication; that is, because of the experimental procedure. Unfortunately, Rogers and Haygood seem to dismiss this possibility as easily as they proposed it. They also found that subjects who take longer to respond make more errors, and change hypotheses more often than low-latency subjects.

Merryman, Kaufmann, Brown and Dames (1968) found that after six non-contingent trials of either "right" or "wrong," the "wrong"s had no effects on performance, while the "right"s produced a retarding effect on subsequent learning. From their data, they also decided to reject the idea that the subject keeps his hypothesis after a correct trial. Similarly, Nahinsky and Slaymaker (1969) and Dodd and Bourne (1969) found evidence that subjects change hypotheses not only after an error trial, but also after a correct trial.

However, not all the evidence substantiates these conclusions. Bourne, Dodd, Guy and Justesen (1968) observed earlier that although learning occurs on all trials, changes occur only after an error. Levine (1966) and Bower and Trabasso (1966) concur on this point, as mentioned earlier.
And more recently, Trabasso and Staudenmayer (1968) have obtained data which indicate that random reinforcement effects, that is, non-contingent feedback, are problem or dimension specific, especially if the subject is familiar with (knows) the stimulus dimensions.

In the present study these random reinforcement effects were avoided by using contingent outcomes. It was hoped that this procedure would provide more relevant (and more accurate) information regarding subjects' processing than non-contingent feedback.

Some other problems which have received relatively little attention are concerned with the methodology of the experiments themselves. In Levine's "blank" trial method, only one hypothesis is tapped on a given trial, although it has been shown that subjects hold several hypotheses simultaneously. Levine himself realized this, but he continues to use this method. This conflict between experimental procedure and the "observed" mode of information-processing is an important shortcoming of the methodology, not only with regard to the sampling of hypotheses, but also with respect to the effects of the outcomes.

Kornreich (1968) used two procedures to circumvent these trouble areas. In the first, he used a modified "blank" trial procedure and preprogrammed the outcomes. In the second phase of the study, the outcomes were dependent on the subject's responses. During the experiment, the subjects were faced with eight buttons on which all the possible hypotheses were written. Subjects were asked to indicate which hypotheses still could be correct after each of the outcomes. This procedure supposedly taps all the hypotheses held by the subject. Another group of subjects was run under Levine's "blank" trial method. No significant differences were found between procedures in effect on correct processing or selecting. However, there may not have been any differences because of the highly structured cue aid (the eight buttons).
The present study examined this problem more closely by providing some subjects with a completely unstructured cue aid, paper and pencil. These subjects were able to use the information they received on each trial in a manner more consistent with their own mode of processing (without the influence of the eight buttons).

Another methodological problem occurs in experiments such as that of Merryman, Kaufmann, Brown and Dames (1968) in which a group of non-contingent "right"s or "wrong"s is presented to the subject. It does not seem reasonable that the same mode of processing operates in this setting and in the situation where the subject receives contingent "right"s or "wrong"s (or a mixture of contingent outcomes). The effort to make the outcomes non-contingent fails in the former procedure, because the subject begins to use more information than he would under a contingent paradigm. That is, the subject notices that no matter what he says the outcome is the same. So he experiments with many possibilities, changing hypotheses frequently--probably more frequently than he would under the contingent situation (at least for "right"s). Thus, the changing of hypotheses and the retarding effects of the group of errorless trials are artifacts of the procedure, not true indications of "what is happening." Certainly, such random reinforcement results cannot provide prototypes for successful hypothesis-sampling theories in the mixed outcome condition.

In an effort to avoid these problems, and investigate some other aspects of the concept identification task, DePalma (1969) used the four-trial, four-dimensional discrimination problems of Richter (1965) with one important modification. A question designed to tap the hypotheses held by the subject (but without the cue aids of Kornreich) was asked by the experimenter once during each problem.
The question was purposely very vague--"How did you make that choice?"--and was asked only once per problem to keep interference with the subject's processing at a minimum. Otherwise, the problems might have become question-answering tasks. The question was asked after the outcome, because prior questioning might have shaped or interfered with the subject's processing. It was hoped that asking the question once per problem would not have a detrimental effect on the subject's processing ability.

The data indicated that the experimental question interfered no more than Kornreich's procedure had, if it interfered at all. The effect of wrongs on performance as observed by Richter (1965), Levine (1966) and Kornreich (1968) was replicated. That is, the probability of correct-choice response on trial four decreased as the number of errors on the first three trials (from 0 to 3) increased. This has been labelled the "outcome effect."

But further analysis of the data suggested that the effect was much more complex. The sequence of rights and wrongs and the problem type (color, letter, size and position) appeared to be related somehow to the probability of correct-choice response on trial four. This differed from the outcome effect because sequences which had the same number of wrongs on the first three trials had different probabilities of problem solution. For two of the problem types, o+o (wrong, right, wrong) was more detrimental than ooo (wrong, wrong, wrong)!

These results indicate the importance of sequential effects of the outcomes. The traditional outcome effect explanation fails to account for this. What seems so simple at first glance appears so only because of averaging of results--when each sequence is studied separately, the complexity is revealed.

The responses given by the subjects contingent upon the outcome on the previous trial were also analyzed.
It is interesting to note that being incorrect on the previous trial resulted in a fairly constant level (probability) of being correct on trial four of approximately 67%. Being correct, however, on the previous trial led to much higher percentages, averaging around 84%. It seems from these data that one of the effects of error on the previous trial is to make some subjects use more elements (hypotheses) at a time when they should be narrowing down the choices, not expanding them.

Of course, being correct on the first three trials increased the probability of the subject's being correct on trial four. The subjects who responded correctly on a given trial outperformed subjects who responded incorrectly on corresponding trials with regard to final solution attainment for every trial.

Thus, this experiment supported Levine's contention that "wrong"s affect problem-solving differently. However, the subjects did not code (attempt to remember) the stimulus cues as Levine says, before the outcome, but after it. Kornreich stated that after the outcome the subject simply encodes the correct stimulus cues. This corresponds to the Bower and Trabasso theory mentioned earlier. Logically, subjects should encode only the correct cue information, but DePalma observed that subjects will, in fact, encode "wrong" cue information instead of the correct stimulus cues. This ultimately led to incorrect response choice on trial four. The subject "knew" how to solve the problem, since he had solved some correctly, but sometimes he used the "wrong" information. The interference was not in the "focusing" strategy employed, but in the coding--either incorrect information, or the non-utilization of all the available information.

It also seems that subjects encode all the hypotheses or stimulus cues, but decide on the basis of only one. This result has been confirmed recently by Levine (1969).
Contrary to Levine's theory (and others), it was observed that sampling occurred with replacement, since hypotheses were frequently repeated during a problem.

Thus, the problem type and sequence of outcomes play an important part in the subject's processing and performance. However, it is possible that the subject's performance might be affected by the availability of "memory" aid, or by the frequency of the probes (experimental questions). The present study extends DePalma's (1969) study by examining the effects of memory aid and frequency of probes on performance. The relationship between problem type, sequence of outcomes and performance is investigated more closely.

METHOD

Studies such as Kornreich's (with their cue aids) were criticized for structuring the subject's responses so that the observed data are not true "tapping"s of the subject's hypotheses. In the present study we hope to remedy this by using a less structured methodology, allowing subjects to use paper and pencil while working the problems. This unstructured cue aid will permit the subject to use all or a portion of the information he receives from the outcomes according to his own method of processing.

If memory for previous responses is an important component in problem solution, subjects using pencil and paper should perform differently from subjects who have no cue aid. One group of subjects will be allowed to use paper and pencil to help them, while the other group will not.

The main purpose of the present study was to examine problem type and sequence of outcome in greater detail. From DePalma's (1969) data and the considerations reviewed above, we predict:

Hypothesis one: Problem type and sequence of outcome will be important factors for correct-choice response on trial four.

Hypothesis two: Changes of hypothesis will occur after "right"s as well as after "wrong"s.
Hypothesis three: A new hypothesis will not always be consistent with the information the subject receives on an error trial.

Hypothesis four: Stimuli which fail to pass consistency check (as determined by the experimenter) will be considered possible hypotheses during the selection process. This hypothesis and the previous one could be combined by stating that subjects will not always use hypotheses which are consistent with (logically follow) information they receive.

Hypotheses two through four are in disagreement with the Bower and Trabasso theory mentioned earlier. We will agree, however, that:

Hypothesis five: Hypothesis-sampling will occur with replacement.

Hypothesis six: As the number of "wrong"s on the first three trials increases from 0 to 3, the probability of correct-choice response on trial four will decrease. This will not be a simple relationship, however, if DePalma's sequential effects are replicated.

Hypothesis seven: No one particular strategy of processing (Win-Stay, Lose-Shift, for example) will be used more frequently than any other: a) subjects will respond on the basis of one hypothesis while processing one or more, b) a much more important concern will be how subjects use the information they receive, especially "wrong" information.

In DePalma's study it was observed that asking the experimental question once per problem was not detrimental to the subject's performance. And, it was assumed that asking the question after each trial would change the nature of the task. However, this assumption was not tested. In the present study both conditions are used, to see whether frequency of question influences performance on the task.

Subjects

Ninety-six undergraduates (76 females, 20 males) enrolled in an introductory psychology course at Michigan State University served as subjects.

Stimulus Cards and Problems

The discrimination problems consisted of sets of cards on which were drawn two stimuli about 1-1/2 inches apart.
The stimuli varied on four dimensions--color, letter, size and position. The colors and letters differed for each problem. Large letters were 1-1/2 inches, small letters one inch in height.

A problem was composed of four such cards, and the outcome, "right" or "wrong," was given after the subject's response to the card. The four cards formed a set with several properties. Each value of each dimension was combined exactly twice with the values of all the other dimensions. The set provided that, after the first outcome, four of the eight cues remained as logically possible solutions; after the second outcome, two remained; and after the third outcome, the solution was logically determined. This was true whether the outcomes were "right" or "wrong." Also, the subject had a 50% chance of choosing the correct stimulus on the first three cards.

Using the three cards (trials) #2, 3, and 4, it was possible to construct (for each problem) three combinations of the cards so that each card type was present on each trial over the three combinations. That is, these combinations were possible: 1,2,3,4; 1,3,4,2; and 1,4,2,3. This balanced for sequential effects across subjects and enabled the experimenter to make inferences that were not problem specific. The three combinations were labelled A, B, C.

Design and Procedure

The design was a simple 2 x 2 x 4 question x memory aid x problem type factorial design with repeated measures on the last factor. The two question conditions were: 1) question after each trial, and 2) question once per problem. These conditions existed to test the effects (facilitative or detrimental) of the experimental question. The two memory aid conditions were: 1) the paper and pencil group, and 2) no paper and pencil. These groups tested the effects of an unstructured cue aid on the subject's processing. Subjects in each of these groups were given color, letter, size and position problems (see Appendix A).
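The elimination property claimed for the four-card sets under Stimulus Cards and Problems (four cues remaining after the first outcome, two after the second, one after the third) can be illustrated with a short Python sketch. The binary encoding and the four specific cards below are hypothetical: they were chosen only so that the set satisfies the stated properties and are not the cards actually used in the study. The sketch assumes that either outcome tells the subject which stimulus on the card was correct.

```python
DIMENSIONS = ("color", "letter", "size", "position")

# Hypothetical encoding (an assumption, not the thesis's actual cards):
# each stimulus is a tuple of four binary values, one per dimension,
# with position 0 = left and 1 = right.
LEFT_STIMULI = [(0, 0, 0, 0), (0, 1, 1, 0), (1, 0, 1, 0), (1, 1, 0, 0)]


def complement(stimulus):
    """Return the other stimulus on the card (every value flipped)."""
    return tuple(1 - value for value in stimulus)


# A card is a (left stimulus, right stimulus) pair.
CARDS = [(left, complement(left)) for left in LEFT_STIMULI]


def cues(stimulus):
    """Return the set of (dimension, value) cues carried by one stimulus."""
    return set(zip(DIMENSIONS, stimulus))


def remaining_counts(cards, solution):
    """Count the logically possible cues left after each outcome."""
    possible = {(d, v) for d in DIMENSIONS for v in (0, 1)}
    counts = []
    for left, right in cards:
        # "Right" or "wrong" both reveal which stimulus was correct.
        correct = left if solution in cues(left) else right
        possible &= cues(correct)
        counts.append(len(possible))
    return counts


ALL_SOLUTIONS = [(d, v) for d in DIMENSIONS for v in (0, 1)]
COUNTS_BY_SOLUTION = {s: remaining_counts(CARDS, s) for s in ALL_SOLUTIONS}
```

Under these assumptions, every one of the eight possible solutions leaves four, then two, then one candidate cue, which is the property the text attributes to the card sets.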
These conditions called for two analyses of variance of the data. The first analyzed the effects of question, memory aid and problem type on performance (hypothesis given on trial four). The second analysis examined the effects of sequence of outcome (on the first three trials) and problem type on performance.

The question--"How did you make that choice?"--was asked in the question-once-per-problem condition according to a schedule determined by a 16x16 matrix of trials vs problem type (see Appendix B). This matrix provided for the experimental question to be asked after each trial across all subjects (N.B. not for each subject). Of course, no matrix was needed in the question-after-each-trial condition.

The deck types A, B, or C were assigned to the subject as he entered the experimental situation in the order--(A,B,C,A,B,C...). Thus, deck types and question trial were controlled across subjects.

Each subject was instructed and tested individually. The instructions were nearly identical to those used by Richter (1965). The difference was that the subject was given four practice problems--one of each problem type--to the criterion of correct solution. After practice, all subjects were given one more problem (which did not count in the experiment) and then the sixteen test problems. It should be noted here that no two problem types followed one another more than twice over the sixteen problems.

The subjects turned the cards over at their own speed. The subjects' responses to the experimental question were recorded, as were the outcomes given by the experimenter.

Instructions

The instructions as given to the subject were as follows:

"This is an experiment in problem-solving. We want to see how quickly you can solve some very simple problems."

"I will show you a card like this (show cX). Each card will have two different letters side by side, each of a different color and different size.
Each problem will consist of a series of cards with different combinations of the two letters, two colors, two sizes and two positions, (left and right)--like this...(show first three or four cards of cX)."

"For each card I want you to point with this pen to the one you think is correct, either this one or that one (demonstrate). Hold the pen in that position until I tell you whether you are right or wrong. Then you may turn to the next card and again point to the one you think is correct. After you have turned over a card you may not turn back to it. The idea, of course, is for you to try to get as many right as possible. (The paper-and-pencil subjects were instructed that they could use paper and pencil provided by the experimenter in the task)."

"In all these problems the solution is of the simplest kind; either the same letter, the same color, the same size or the same position will be correct throughout a single problem. In order to give you a better idea of the procedure and the kind of problems you will have, let's begin with the first practice problem. Are there any questions before we begin?"

After practice: "(As you can see yourself, you were getting them all right toward the end).* On all the cards of this problem, one element (color, letter, size or position) was always correct. All the other problems you will have will also have solutions as simple as these. For some problems the large letter (or the small) will always be correct; for some the one on the right (or left) will always be correct. Sometimes it will be one of the colors, and sometimes it will be one of the letters itself that will always be correct."

"However, the problems will be much shorter than the practice problems; there will only be four cards in each problem, while there were twelve in the practice problems. Thus, although the solutions are simple, you must solve the problems very fast in order to get as many right as possible. Also remember that once you have turned over a card you may not refer back to it again. Are there any questions?"

*The bracketed material was used only if the subject had actually gotten at least the last three trials correct. For those who did not, only the second half of the sentence was read, followed by a demonstration by the experimenter of the correct responses on the last four trials of the problem.

RESULTS

Since each subject was given four of each problem type, he could have had from 0 to 4 correct-choice responses on trial four for each type. The mean number of correct problems and the variance for each problem type and condition appear in Table 1.

TABLE 1

MEAN NUMBER OF CORRECT PROBLEMS AND VARIANCE FOR EACH PROBLEM TYPE BY GROUP CELL

                                        Problem Type
Question   Memory aid              Color   Letter   Size   Position

Q          NP          Mean         3.17    3.21    2.79    3.08
                       Variance      .58     .17     .65     .69
           PP          Mean         3.17    2.96    2.88    3.04
                       Variance     1.36     .99     .90     .74

QAE        NP          Mean         3.21    3.04    2.75    3.29
                       Variance     1.04     .56     .84     .56
           PP          Mean         3.42    3.17    2.92    3.29
                       Variance      .43    1.28    1.12     .39

Q = question once per problem
QAE = question after each trial
NP = no paper and pencil
PP = paper and pencil

The solution data were analyzed by a three-way question x memory aid x problem type analysis of variance with repeated measures over problem type (see Table 2). This analysis showed a significant main effect of problem type (F = 5.73, df = 3/276, p < .01). None of the other main effects or interactions was significant. The question frequency and presence of a "memory" aid had no effect on subjects' performance.

TABLE 2

SUMMARY OF ANALYSIS OF VARIANCE OF QUESTION X MEMORY AID X PROBLEM TYPE

Source of Variance                  Sums of Squares     DF       F       P

Between
   A  Question no.                        0.94            1      .86     NS
   B  Memory aid                          0.31            1      .28     NS
   AB                                     0.75            1      .69     NS
   S within groups (error)              100.56           92

Within
   C  Problem type                       11.34            3     5.73    <.01
   AC                                     0.84            3      .42     NS
   BC                                     1.09            3      .55     NS
   ABC                                    0.41            3      .21     NS
   C x S within groups (error)          182.57          276

Total                                   298.81          383
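As a reading aid for Table 2, the F ratios can be recomputed from the printed sums of squares and degrees of freedom. The Python sketch below assumes the usual repeated-measures layout, in which between-subjects effects are tested against the subjects-within-groups error term and within-subjects effects against the problem type by subjects error term; the variable and function names are illustrative only.

```python
# Sums of squares and degrees of freedom as printed in Table 2.
BETWEEN_EFFECTS = {
    "A question": (0.94, 1),
    "B memory aid": (0.31, 1),
    "AB": (0.75, 1),
}
BETWEEN_ERROR = (100.56, 92)

WITHIN_EFFECTS = {
    "C problem type": (11.34, 3),
    "AC": (0.84, 3),
    "BC": (1.09, 3),
    "ABC": (0.41, 3),
}
WITHIN_ERROR = (182.57, 276)


def mean_square(sum_of_squares, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


def f_ratios(effects, error):
    """F for each effect = effect mean square / error mean square."""
    ms_error = mean_square(*error)
    return {
        name: round(mean_square(ss, df) / ms_error, 2)
        for name, (ss, df) in effects.items()
    }


F_BETWEEN = f_ratios(BETWEEN_EFFECTS, BETWEEN_ERROR)
F_WITHIN = f_ratios(WITHIN_EFFECTS, WITHIN_ERROR)

# The component sums of squares should add up to the printed total.
TOTAL_SS = round(
    sum(ss for ss, _ in BETWEEN_EFFECTS.values())
    + sum(ss for ss, _ in WITHIN_EFFECTS.values())
    + BETWEEN_ERROR[0]
    + WITHIN_ERROR[0],
    2,
)
```

With these inputs the recomputed ratios match the printed values to two decimals, except that problem type comes out at about 5.71 rather than the reported 5.73; a difference of that size would be expected if the printed sums of squares were rounded.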
Evidently asking the question after each trial does not affect performance on the task, and factors other than memory are critical for correct solution. However, the paper and pencil group provided valuable information regarding subjects' processing. This will be discussed later.

Individual comparisons (Winer, pp. 65-69) showed that size problems differed significantly (F = 6.64, df = 1/276, p < .01 for the smallest difference) from the other problem types. That is, there were significantly fewer correct-choice responses on trial four (solutions) for the size problems than for any other type. This partially confirmed the hypothesis that problem type would be an influential factor in performance, and replicated DePalma's earlier results. Size problems again led to the lowest level of performance, while letter, position and color problems (in order of increasing performance) resulted in significantly better scores (see Table 7). Although this result was expected, it cannot be explained at this time.

In order to evaluate the effects of sequence of outcome on solution attainment, and to test hypotheses one through seven, the data were reorganized.

The percentages of solution attainment on trial four as a function of problem type and sequence of outcome are shown in Table 3. These percentages represent the proportion of the time a particular sequence of outcome and problem type resulted in a correct-choice response on trial four.

TABLE 3

PERCENTAGES OF CORRECT-CHOICE RESPONSE ON TRIAL FOUR FOR PROBLEM TYPE AND SEQUENCE OF OUTCOME

                       Problem Type
Sequence     Color    Letter    Size    Position

ooo            61       64       53        71
+oo            71       70       53        77
o+o            69       58       61        63
oo+            77       85       80        70
++o            86       89       84        90
+o+            88       78       78        84
o++            84       86       67        74
+++           100       98      100       100

+ = right, o = wrong (outcomes on trials one through three)

Since the outcomes in this study were not preprogrammed but were contingent upon the subjects' choices on each trial, the number of subjects in each sequence of outcome by problem type cell (see Table 3) could not be controlled.
Therefore, there is unequal subject representation in the data.

Despite the unequal subject representation in each cell (and sometimes repeated representation), the sequence of outcome data were analyzed by a two-way problem type x sequence of outcome analysis of variance (see Table 4). A harmonic mean of 46.51 was used according to Winer (pp. 242-243). This analysis yielded significant main effects of problem type (F = 2.94, df = 3/1504, p < .05) and of sequence of outcome (F = 19.13, df = 7/1504, p < .01). The interaction was not significant.

Using F-tests (Winer, p. 244), it was found that the variation of the simple effects of sequence of outcome was non-zero (significant at p < .01) at all four levels of problem type. However, the variance of the effects of problem type was non-zero (significant at p < .05) only at sequence +oo. Problem type was not significant at the other seven sequences. (Note: the numerical value of the significant-levels F for problem type was equal to the F from the analysis of variance).

Although this analysis of variance was not the "usual" type because of subject representation, individual comparisons from the data in Table 3 were nearly identical to similar tests (F-ratios) using the data from Table 4.

TABLE 4

SUMMARY OF ANALYSIS OF VARIANCE OF PROBLEM TYPE X SEQUENCE OF OUTCOME

Source of Variance           Sums of Squares      DF       F        P

A  Problem type                    1.40             3     2.94    <.05
B  Sequence                       21.39             7    19.13    <.01
AB                                 3.26            21     1.00     NS
Within cell (error)              245.31          1504
Total                            271.36          1535

Table 5 resummarizes the data of Table 3 to show the percentages of correct-choice response following each sequence of outcome by question and memory groups. Totals for the question-once-per-problem and the question-after-each-trial groups, and a grand total for all conditions by sequence are included.
These data show the traditional outcome effect; that is, as the number of rights increases from 0 correct (sequence ooo) to 3 correct (sequence +++), there is an increase in the probability of being correct on trial four. These data are consistent with Kornreich's (1968) and DePalma's (1969) studies. If the grand totals for these percentages across the question and memory groups are used in making individual comparisons between sequences, some significant differences are obtained. Sequence oo+ was significantly (p < .01) greater than +oo and o+o. Sequence ++o differed significantly (p < .05) from o++. And +++ was significantly different (p < .01) from all the other sequences. Subjects who were correct on trial one performed better (p < .01) on problems than incorrect subjects.

TABLE 5

PERCENTAGES OF CORRECT-CHOICE RESPONSE ON TRIAL FOUR FOR QUESTION AND MEMORY GROUPS BY SEQUENCE OF OUTCOME

                        Groups                      Totals
                QAE            Q
Sequence      PP    NP      PP    NP         QAE     Q     Grand

ooo           67    62      57    61          65     59      62
+oo           66    70      70    62          68     67      67
o+o           66    55      67    63          60     65      63
oo+           82    82      72    77          82     74      78
++o           94    91      92    74          92     82      88
+o+           90    76      76    86          83     81      82
o++           81    76      74    82          79     79      79
+++          100    98     100   100          99    100      99

QAE = question after each trial
Q = question once per problem
PP = paper and pencil
NP = no paper and pencil

TABLE 6

FREQUENCY OF SUBJECTS BY NUMBER OF CORRECT PROBLEMS

Number correct         QAE            Q

      6                 -             1
      7                 -             -
      8                 -             1
      9                 5             3
     10                 4             4
     11                 7            10
     12                 9             7
     13                 4            11
     14                 8             4
     15                10             5
     16                 1             2

Proportion correct   600/768 = 78%   581/768 = 76%
Total                1181/1536 = 77%
Means                  12.5          12.1
Grand mean             12.3

In the experimental task 16 problems were given, so it was possible for each subject to get from 0 to 16 problems correct. Table 6 shows the frequency of subjects for each number of correct problems. The lowest number of correct problems in the experiment was 6. There were 48 subjects in each question condition, so a total of 768 (48 x 16) problems were given to the subjects in each group. The proportions and percentages of correct problems, and the means, are also given.
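The summary values of Table 6 follow directly from the frequency distribution, and a short sketch makes the arithmetic explicit. The function and variable names are illustrative; the frequencies are those listed in Table 6, and 16 problems per subject is taken from the text.

```python
# Frequency of subjects by number of correct problems (Table 6).
qae = {9: 5, 10: 4, 11: 7, 12: 9, 13: 4, 14: 8, 15: 10, 16: 1}
q = {6: 1, 8: 1, 9: 3, 10: 4, 11: 10, 12: 7, 13: 11, 14: 4, 15: 5, 16: 2}
PROBLEMS_PER_SUBJECT = 16


def summarize(freq):
    """Return (subjects, problems correct, proportion correct, mean per subject)."""
    n_subjects = sum(freq.values())
    n_correct = sum(score * count for score, count in freq.items())
    n_problems = n_subjects * PROBLEMS_PER_SUBJECT
    return n_subjects, n_correct, n_correct / n_problems, n_correct / n_subjects


for label, freq in (("QAE", qae), ("Q", q)):
    n, correct, proportion, mean = summarize(freq)
    print(f"{label}: N = {n}, correct = {correct}, "
          f"proportion = {proportion:.0%}, mean = {mean:.1f}")

total_correct = summarize(qae)[1] + summarize(q)[1]
print(f"Total: {total_correct} of {2 * 48 * PROBLEMS_PER_SUBJECT} "
      f"= {total_correct / (2 * 48 * PROBLEMS_PER_SUBJECT):.0%}")
```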
These values correspond closely to those of earlier studies and indicate that there was no more interference in processing for subjects in these conditions than for those in Kornreich's (1968) or DePalma's (1969) studies. Whatever the effects of the experimental question were, they were not distinguishable from the procedural effects of these other studies. However, during the experiment some subjects (question-after-each-trial) said they felt they were aided by responding aloud after each trial. Other subjects felt they had a "hard time doing the task" because of the questioning, or that they weren't always able to give a "reason" for their choice. As we shall see, such statements are more indicative of the subject's manner of processing than they are of the effects of the experimental question.

Table 7 shows the number of subjects giving a correct-choice response on trial four as a function of problem number and problem type. This table shows that the best performance was on problem number 6 (color), while the worst occurred on problem 3 (size). As expected, there was no improvement over problems.

(Note: if the answer the subject gave is compared with the stimulus to which he pointed, an interesting result is obtained. The subject does not always give the correct reason for his choice, even though he may point to the correct stimulus. If we examine the frequencies of such occurrences we find:

                        Color    Letter    Size    Position

QAE (768 problems)        7        22       11        12
Q (192 problems)          8         5        5        11
Total                    15        27       16        23

or 81 problems on which the subject pointed to the correct stimulus but gave an incorrect reason for the selection! This number would probably have been larger, but we did not always receive verbal responses on trial four in the question-once-per-problem condition, so there was no way of obtaining these numbers).
TABLE 7

NUMBER OF SUBJECTS GIVING CORRECT-CHOICE RESPONSE ON TRIAL FOUR FOR EACH PROBLEM AND PROBLEM TYPE

Problem #     Type     Number of subjects

    1          L              77
    2          L              78
    3          S              56
    4          P              79
    5          C              70
    6          C              85
    7          L              70
    8          C              74
    9          S              63
   10          S              80
   11          L              72
   12          P              69
   13          P              76
   14          S              69
   15          C              82
   16          P              81

Problem types = Color, Letter, Size and Position

The sequence of outcomes and the subjects' verbal responses to the experimental question (in the question-after-each-trial group) were also utilized to investigate subjects' hypothesis-sampling strategies. Table 8 shows the percentages of correct response on trial four and usage of hypothesis-sampling strategies according to problem types. These data indicate that Win-Stay, Lose-Shift and Other were most frequently used, and most often correct, with Other being the "best strategy" overall.

TABLE 8

PERCENTAGES OF CORRECTNESS AND USE OF HYPOTHESIS-SAMPLING STRATEGIES FOR PROBLEM TYPES

                                      Strategy
                  WSLF       WFLS       WFLF       WSLS       OTHER
Problem types    %c  %u     %c  %u     %c  %u     %c  %u     %c  %u

Color            88  40     75   2     72  15     73   6     82  47
Letter           82  35     46   6     73  21    100   5     78  42
Size             75  35     50   1     64  23     81   6     67  45
Position         80  39     75   2     79  20    100   5     85  42
Totals           82  37     57   3     72  20     88   5     78  44

WFLS, WFLF and WSLS combined: %c = 72, %u = 19

W = win (right)
L = lose (wrong)
S = stay (keep hypothesis)
F = shift (change hypothesis)
%c = percent correct
%u = percent of time used

DISCUSSION

The solution analysis shows that of question, memory aid and problem type, only the last had a significant effect on subjects' performance. There were no differences between the performances of the question-after-each-trial and the question-once-per-problem subjects, or between the no-paper and the paper-and-pencil groups.

Size problems were shown to lead to much lower performance than any of the other problem types. This supports the hypothesis that problem type is an influential factor in performance, but the result cannot be explained at this time.
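The size deficit can be read off Table 7 by summing the per-problem counts within each problem type. The sketch below does this tally; the names are illustrative, and the figure of 96 subjects is taken from the abstract. On these counts size problems have the lowest total, with letter, position and color following in increasing order, and the overall total equals the 1181 correct problems reported in Table 6.

```python
from collections import defaultdict

# (problem number, problem type, subjects correct on trial four) from Table 7.
table7 = [
    (1, "L", 77), (2, "L", 78), (3, "S", 56), (4, "P", 79),
    (5, "C", 70), (6, "C", 85), (7, "L", 70), (8, "C", 74),
    (9, "S", 63), (10, "S", 80), (11, "L", 72), (12, "P", 69),
    (13, "P", 76), (14, "S", 69), (15, "C", 82), (16, "P", 81),
]
N_SUBJECTS = 96  # ninety-six undergraduates were tested

# Sum the number of correct subjects over the four problems of each type.
totals = defaultdict(int)
for _, problem_type, n_correct in table7:
    totals[problem_type] += n_correct

# List the types from lowest to highest total.
for problem_type in sorted(totals, key=totals.get):
    mean_per_subject = totals[problem_type] / N_SUBJECTS
    print(f"{problem_type}: total = {totals[problem_type]}, "
          f"mean per subject = {mean_per_subject:.2f}")
```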
From the responses the subjects gave to the experimental question, it was observed that (see Table 8):

1) Subjects do change hypotheses after "right"s as well as after "wrong"s.

2) Subjects don't always pick hypotheses consistent with information received from an error trial.

3) Subjects may think they are being consistent with prior information when, in fact, they aren't. Thus, subjects do consider hypotheses which fail to pass a consistency check.

4) Subjects give the same hypothesis frequently during a problem--even after having been "wrong" with it on a previous trial. Therefore, the subjects sample with replacement from the hypothesis-pool during the selection process.

These observations support predictions two through five.

In the sequence of outcome analysis of variance, both sequence of outcome and problem type were shown to have significant effects on performance. Since the interaction was not significant, these main effects are presumably additive.

From the individual F-tests (Winer, p. 244), it was found that the variation of the simple effects of B (sequence) was non-zero at all levels of A (problem type). This was expected, because as the number of "right"s on the first three trials increases, the probability of correct-choice response on trial four increases. Thus, the variation among the effects of the sequence should be quite high.

The variance of the effects of problem type, however, was non-zero only at level B2 (sequence +oo). This can be explained by the relatively low level of performance for the size problems at this sequence. There was a difference of 24 percentage points between the value for the size problem (at +oo) and the highest value at +oo. All of the variance related to factor A can be accounted for by size problems (see Tables 3 and 4). Thus, the relationship of sequence of outcome to performance seems to be quite complex.

The traditional outcome effect can be observed in Table 5.
As the number of rights increases from 0 (in ooo) to 3 (in +++) there is a corresponding increase in the probability of correct-choice response on trial four. However, the individual comparisons among sequences reveal a complex relationship between sequence of outcomes, number of rights, and performance. It seems that for two or more rights on the first three trials, being right on trial one is more important (with respect to performance) than being correct on trials three or two, respectively. For one right, being correct on trial three leads to better performance than being correct on trials one or two. So for two or more rights, primacy is more influential than recency; and for one right, recency is more important than primacy.

It is also possible that the outcome effect is actually the result of the number of transformations the subject must make during the problem. For every "wrong" outcome, the subject must "transform" this information in terms of "right" information; that is, he must determine what the "wrong" information means in terms of "right" information. If we arrange the sequences in order of decreasing number of transformations we would have: ooo, o+o, +oo, oo+, +o+, o++, ++o, and +++. It is also assumed here that sequences o+o and +o+ will be more difficult than the other sequences with the same numbers of rights. In both sequences, there is an interruption in the consistency of information from one trial to the next. In the other six sequences at least two similar outcomes (similar information) follow one another. The data from this study do not quite satisfy this prediction. Instead we find, in order of increasing probability of solution: ooo, o+o, +oo, oo+, o++, +o+, ++o and +++. So sequences o++ and +o+ have exchanged positions. Perhaps the facilitative effect of primacy was stronger than the detrimental effect of the transformation-interruption.
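The comparison between the predicted and observed orderings can be reproduced from the grand totals of Table 5. In the sketch below "o" stands for a wrong outcome and "+" for a right one; the predicted list is the eight-sequence order given in the text, and the variable names are illustrative. Sorting the sequences by grand-total percentage shows that only the fifth and sixth positions (+o+ and o++) are exchanged, and averaging by number of rights shows the outcome effect itself.

```python
# Grand-total percentages of correct-choice response on trial four (Table 5).
grand_total = {
    "ooo": 62, "+oo": 67, "o+o": 63, "oo+": 78,
    "++o": 88, "+o+": 82, "o++": 79, "+++": 99,
}

# Order predicted from the number of transformations, with the
# "interrupted" sequences (o+o, +o+) placed first within their group.
predicted = ["ooo", "o+o", "+oo", "oo+", "+o+", "o++", "++o", "+++"]

# Observed order: sequences sorted by increasing percentage correct.
observed = sorted(grand_total, key=grand_total.get)

# Positions (0-based) at which prediction and observation disagree.
mismatches = [i for i, (p, o) in enumerate(zip(predicted, observed)) if p != o]

# Outcome effect: mean percentage correct by number of rights on trials 1-3.
by_rights = {}
for sequence, percent in grand_total.items():
    by_rights.setdefault(sequence.count("+"), []).append(percent)
mean_by_rights = {k: sum(v) / len(v) for k, v in sorted(by_rights.items())}

print("observed order:", observed)
print("mismatched positions:", mismatches)
print("mean percent by number of rights:", mean_by_rights)
```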
Although we have discussed the effect of sequence on performance, we have not included the relationship of sequence of outcome to the subject's processing. By examining the outcomes and the responses given by the subjects in the question-after-each-trial group, we were able to observe different percentages of correctness (how often the response on trial four was correct) and use (proportion of the time a particular strategy was employed on the problems for all subjects) (see Table 8). From this table it is clear that the best single strategy is WSLF (Win-Stay, Lose-Shift). This strategy was used on 37% of the problems, and when used resulted in a correct trial-four response 82% of the time. The remaining strategies combined were used by subjects on only 19% of the problems, and were correct 72% of the time. The remaining workable strategy was actually not one of these (Win-X, Lose-X) types, but a "combination." That is, subjects classified under the Other strategy did not use a specifiable strategy on those problems. This "strategy" was 78% correct and utilized 44% of the time. So overall, it was the most effective "strategy."

These data indicate that it doesn't matter particularly if the subject has a consistent strategy or not, but rather that he uses the information he receives correctly. This statement was verified by some observations of the paper-and-pencil subjects. After a wrong outcome, some of these subjects actually wrote down the incorrect (pertaining to "wrong" outcome) information. The incidence of this phenomenon varied from subject to subject (and sometimes from problem to problem in a single subject!). Such subjects found it very difficult to solve the problems, especially if they wrote down the incorrect information and on the following trials treated this information as correct information. Of course, such processing cannot lead to correct solution of the problem.
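The difference between transforming "wrong" information and recording it as if it were "right" can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the specific letters and colors, the premise that the two stimuli on a card take opposite values on every dimension, and the function names. It is not a model of how any subject in the study actually processed the cards.

```python
# Hypothetical stimulus values; the study's actual letters and colors are not given here.
DIMENSIONS = {
    "letter": ("X", "T"),
    "color": ("black", "white"),
    "size": ("large", "small"),
    "position": ("left", "right"),
}


def complement(stimulus):
    """The other stimulus on the card, assumed opposite on every dimension."""
    return {dim: (vals[1] if stimulus[dim] == vals[0] else vals[0])
            for dim, vals in DIMENSIONS.items()}


def supported_hypotheses(chosen, outcome, transform_wrong=True):
    """Hypotheses (dimension, value) that one trial leaves standing."""
    if outcome == "right" or not transform_wrong:
        source = chosen                 # values of the chosen stimulus
    else:
        source = complement(chosen)     # "wrong" transformed into "right" information
    return set(source.items())


def process(trials, transform_wrong=True):
    """Intersect the full hypothesis pool with the information from each trial."""
    pool = {(dim, val) for dim, vals in DIMENSIONS.items() for val in vals}
    for chosen, outcome in trials:
        pool &= supported_hypotheses(chosen, outcome, transform_wrong)
    return pool


# Three trials of a problem whose (hypothetical) solution is "large".
trials = [
    ({"letter": "X", "color": "black", "size": "small", "position": "left"}, "wrong"),
    ({"letter": "T", "color": "black", "size": "large", "position": "left"}, "right"),
    ({"letter": "X", "color": "white", "size": "large", "position": "right"}, "right"),
]

print(process(trials))                          # "wrong" information transformed
print(process(trials, transform_wrong=False))   # "wrong" information treated as correct
```

With the transformation applied, the pool narrows to the single hypothesis ("size", "large") after three trials. When the values from the error trial are kept as if they were correct, the pool becomes empty, which corresponds to the observation that such processing cannot lead to solution.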
Other subjects only recorded some of the information they received from the outcome. These subjects would then need more than four trials to solve a problem, so they were unsuccessful in attaining problem solution. Another interesting observation was that subjects wrote down (processed) more than one hypothesis at a time, yet they gave only one hypothesis as the reason for their choice (in response to the question). This confirmed Levine's earlier work. It is assumed that the data obtained from this paper-and-pencil group are representative of subjects' mental processing.

It seems reasonable to conclude that subjects who can use correct ("right") and incorrect ("wrong") information effectively will solve the problems despite the experimental conditions (question and memory). Subjects who have trouble processing information (especially "wrong") for one of the reasons described (or any other) may be affected positively or negatively by the same methodology. The processing and hypothesis-sampling of such subjects merit further study.

REFERENCES

Andrews, O., Levinthal, C., and Fishbein, H. The organization of hypothesis-testing behavior in concept-identification tasks. American Journal of Psychology, 1969, 82, 523-530.

Bourne, L. and Guy, D. Learning conceptual rules II: The role of right and wrong instances. Journal of Experimental Psychology, 1968, 77, 488-494.

Bourne, L., Dodd, D., Guy, D., and Justesen, D. Response contingent ITS in concept identification. Journal of Experimental Psychology, 1968, 76, 601-608.

Bower, G. and Trabasso, T. Concept identification. In R. C. Atkinson (ed.) Studies in Mathematical Psychology. Stanford: Stanford University Press,

DePalma, D. Unpublished senior thesis, Lehigh University, 1969.

Dodd, D. and Bourne, L. Test of some assumptions of a hypothesis-testing model of concept identification. Journal of Experimental Psychology, 1969, 80, 69-72.

Dominowski, R. Role of memory in concept learning. Psychological Bulletin, 1965, 63, 271-280.

Erickson, J. Hypothesis sampling in concept identification. Journal of Experimental Psychology, 1968, 76, 12-18.

Falmagne, R. Construction of a hypothesis model for concept identification. Journal of Mathematical Psychology, 1970, 7(1), 60-96.

Falmagne, R. A direct investigation of hypothesis-making behavior in concept identification. Psychonomic Science, 1968, 12(6), 335-336.

Kenoyer, C. and Phillips, J. Some direct tests of concept identification models. Psychonomic Science, 1968, 13(4), 237-238.

Kornreich, L. B. Strategy selection and information processing in human discrimination learning. Journal of Educational Psychology, 1968, 59, 438-448.

Levine, M. Hypothesis behavior by humans during discrimination learning. Journal of Experimental Psychology, 1966, 71, 331-338.

Levine, M. Latency-choice discrepancy in concept learning. Journal of Experimental Psychology, 1969, 82, 1-3.

Merryman, C., Kaufmann, B., Brown, E. and Dames, J. Effects of "rights" and "wrongs" on concept identification. Journal of Experimental Psychology, 1968, 76, 116-119.

Nahinsky, J. and Slaymaker, F. Sampling without replacement and information processing following a correct response in concept identification. Journal of Experimental Psychology, 1969, 80, 475-482.

Nunnally, J. Psychometric Theory. New York: McGraw-Hill Inc., 1967.

Restle, F. The selection of strategies in cue learning. Psychological Review, 1962, 69, 329-343.

Richter, M. Memory, choice and stimulus sequence in human discrimination learning. Unpublished doctoral dissertation, Indiana University, 1965.

Rogers, S. and Haygood, R. Hypothesis behavior in a concept-learning task with probabilistic feedback. Journal of Experimental Psychology, 1968, 76,

Rourke, D. and Trabasso, T. Hypothesis sampling and prior experience. Proceedings: 76th Annual Convention APA, 1968, 47-48.

Trabasso, T. and Staudenmayer, H. Random reinforcement in concept identification. Journal of Experimental Psychology, 1968, 77, 447-452.

Trabasso, T. and Bower, G. Presolution dimensional shifts in concept identification: A test of the sampling with replacement axiom in all-or-none models. Journal of Mathematical Psychology, 1966, 3, 163-173.

Trabasso, T. and Bower, G. Memory in concept identification. Psychonomic Science, 1964, 1, 133-134.

Winer, B. J. Statistical Principles in Experimental Design. New York: McGraw-Hill Book Co., 1962.

APPENDIX A

EXPERIMENTAL DESIGN

Question      Memory     Number of              Problem Type
frequency     aid        subjects       Color   Letter   Size   Position

Q             NP            24
              PP            24
QAE           NP            24
              PP            24

APPENDIX B

QUESTION MATRIX

[Matrix of asterisks, with problem types (S P C L, repeated) as columns and the trial after which the question was asked (1 through 4, repeated) as rows; the individual cell positions are not recoverable.]

NOTE: This matrix represents problem types (size, position, color and letter) vs the trial after which the question is asked.