PRODUCTION AND JUDGMENT PROCESSES IN A WORD PROBLEM

Thesis for the Degree of Ph.D.
MICHIGAN STATE UNIVERSITY
BRADLEY A. BREMER

This is to certify that the thesis entitled Production and Judgment Processes in a Word Problem, presented by Bradley A. Bremer, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Psychology.

ABSTRACT

PRODUCTION AND JUDGMENT PROCESSES IN A WORD PROBLEM

by Bradley A. Bremer

Problem solving is frequently analyzed into production and judgment processes. In word problems, solution word frequency strongly influences the production process. Few such word problems involve the production of a number of potential solutions. The present study involves a problem which does, and it was designed to determine whether Ss would alter their production when instructed to emit high or low frequency solutions. The relative influence of the production and judgment processes on the final solution was also investigated.

Each S solved 16 problems, each consisting of two consonants. Ss were instructed to think of solutions consisting of four-letter words which began with the first letter and ended with the second. After the solution was given, production data were collected by means of a recognition test consisting of all potential solutions to that problem found in the Thorndike-Lorge (T-L) tables. Ss were instructed to check their earlier productions. Five groups of 20 Ss each were used. Group F was instructed to elicit the most common solution possible, Group I the most uncommon. Group KR received uncommon instructions and received knowledge of results after every problem. Group C-1 received uncommon instructions, but no recognition test. Finally, Group C-2 did not receive any frequency instructions. T-L frequencies were determined for each production. Frequencies above 50 were designated high frequency solutions, those of 5 or less low frequency solutions.

Median frequencies of final solutions were significantly higher for Group F than for any of the remaining groups. Group F also elicited significantly more high frequency and significantly fewer low frequency solutions. The total number of productions, the number of low frequency productions and the number of high frequency productions did not vary among groups. The total number of productions did increase over blocks of trials, and this could largely be accounted for by an increase in low frequency productions. Ss successfully evaluated their own productions and chose a "good" solution from those produced. Median frequencies of final solutions were much higher than median frequencies of all other productions in Group F and much lower in Groups I and KR.

It was concluded that Ss did not alter their production in a facilitatory fashion. However, they produced enough potential solutions of varying frequencies to allow for problem solution under either set of instructions. The judgment phase was critical to solution. These results can be accounted for within the framework of the spew hypothesis, but alternative explanations are possible. A more direct test would require data on production order.

PRODUCTION AND JUDGMENT PROCESSES IN A WORD PROBLEM

By
Bradley A. Bremer

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Department of Psychology
1968

ACKNOWLEDGMENTS

The author wishes to express his gratitude to his major professor, Dr. Donald M. Johnson, for his support and guidance in the planning and execution of the research and the development of this manuscript.
The author also extends thanks to the members of his guidance committee, Dr. Paul Bakan, Dr. James Phillips, and Dr. Gordon Wood, for their advice and criticism.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF APPENDICES

Chapter
I. INTRODUCTION
    Basic Analysis
    Production
    Judgment
    Purpose
II. METHODS
    Subjects
    Problems
    Recognition Test
    Post-experimental Questionnaire
    Conditions and Instructions
    Procedure
III. RESULTS
    Treatment of the Data
    Analysis of Final Solutions
    Analysis of Total Production
    Within Groups Comparisons of Final Solutions and Intervening Productions
    Post-experimental Questionnaire
IV. DISCUSSION
V. SUMMARY
BIBLIOGRAPHY
APPENDICES

LIST OF TABLES

Table
1. Summary of groups and procedures
2. Means and standard deviations of the number of high and low frequency final solutions in four groups
3. Summary of analysis of variance of number of high frequency final solutions
4. Summary of analysis of variance of number of low frequency final solutions
5. Means and standard deviations of the number of productions and the number of high and low frequency productions in four groups
6. Summary of analysis of variance of number of high frequency responses
7. Summary of analysis of variance of number of low frequency responses
8. Summary of analysis of variance of total number of responses
9. Medians of final solutions and intervening productions

LIST OF APPENDICES

Appendix
A. INSTRUCTIONS
B. DISTRIBUTION OF RESPONSES TO QUESTIONS #1 AND #3 ON THE POST-EXPERIMENTAL QUESTIONNAIRE

INTRODUCTION

Basic Analysis

In many problem solving tasks it is possible to differentiate between production and judgment processes. Such formulations assume that the subject produces a series of possible solutions and then judges the quality of each in relation to the problem requirements. Davis (1966) also differentiates between problems in which S must make an overt response in order to test the production and those in which the judgment is made covertly. The latter category includes a number of verbal problems characterized by one-word solutions. In solving anagrams, for example, the S presumably searches for, or produces, potential solutions until one is judged acceptable. In such problems, factors which influence the production and judgment processes can be investigated independently, and the relative importance of each phase in arriving at a solution can also be evaluated.

Production

In the production phase of verbal problem solving, frequency of words in the language has proven to be a powerful variable. Ss are more likely to think of words with which they have had more experience. Underwood and Schultz (1960) have labeled this tendency the "spew hypothesis." Specifically, this hypothesis states that high frequency words, i.e. those which S has encountered frequently, have a higher probability of being evoked and also tend to be evoked earlier. The spew hypothesis can be considered a specific case of response hierarchy theory (Duncan, 1966b; Maltzman, 1955).
This more general formulation states that responses vary in habit strength and thus form a divergent hierarchy. In any situation, responses are elicited in order of their position on the hierarchy. Hierarchies may be inferred from association norms, bigram frequencies, etc. The spew hypothesis is a restricted case referring only to hierarchies based on word frequency norms.

Anagram studies provide the most extensive source of support for the spew hypothesis. Mayzner and Tresselt (1958) were the first to report an inverse relationship between solution word frequency and solving time. They explained that high frequency words "will probably be high on the S's response repertoire and therefore possess a greater potential for evocation as an early implicit response . . . " The stability of the relationship has been demonstrated by several subsequent studies (Johnson, 1966).

Additional support for the spew hypothesis is derived from a study of ambiguous anagrams by Johnson and Van Monfrans (1965). When Ss were asked to give more than one solution to an anagram, the solution with the highest T-L frequency tended to be evoked first.

Duncan (1966a) devised a test of the spew hypothesis which utilized a task designed to "minimize the effects of letter order and to maximize the frequency of the word as a whole." Each problem consisted of two consonants. In the first experiment subjects were instructed to think of a five-letter word which began and ended with the letters specified. The frequency of all potential solutions was determined by the T-L general count (Thorndike and Lorge, 1944). As hypothesized, significantly more high frequency than low frequency words were given. In a second experiment letter combinations were used which had only two possible solutions. Duncan correctly predicted that the higher frequency member of the pair would be given more often. A further restriction was employed in a third experiment.
Only the first letter of the solution was given, but the category of the solution was specified, e.g. trees. With two potential solutions, the higher frequency word was given more often. When the subjects were given the higher frequency word as an example, the number of low frequency responses did increase, but the number of high frequency responses did not increase when the low frequency word was used as an example.

The data cited above provide evidence that solution word frequency is an important variable in solving word problems and support the spew hypothesis. When Ss are searching for solutions they are more likely to produce high than low frequency responses. However, this general tendency may not be appropriate for many problems. Many problems are problems because their solutions are uncommon. Therefore, it would often be to the S's advantage to reverse this general tendency and emit more uncommon responses, and to do so earlier in the production process.

Most experimental attempts to increase the production of uncommon responses have gone under the heading of "originality training." These investigations employ various techniques to evoke a greater number of original responses from Ss. While the originality training studies have not dealt directly with word frequency, they have used closely related measures. One aspect of originality, uniqueness, is operationalized in terms of statistical frequency. Uniqueness is usually determined by the number of times the response occurred within the experimental group itself or has occurred in prior groups. This type of criterion is used in Guilford's Unusual Uses Test and Quick Response Test (Wilson, et al., 1953).

"Brainstorming" is one example of a method designed to increase the production of uncommon responses (Parnes and Meadows, 1960). The essential feature of this method is a relaxation of critical evaluation during problem solving.
Another method used to facilitate unique responses is Maltzman's originality training (Maltzman, et al., 1958; 1960). The main feature is a training session consisting of repeated free associations to the same stimulus words. Some success in increasing the number of infrequent productions has been achieved through both of these methods.

The elicitation of low frequency responses has also been facilitated by the use of instructions to be original. Maltzman, et al. (1958) increased the number of uncommon associates to a stimulus word by presenting instructions for originality just prior to the test session. The effect also transferred to the Unusual Uses Test. Also working within the framework of Maltzman's technique, Rosenbaum, et al. (1964) presented the originality instructions before the training task. The number of uncommon associates on the subsequent test task was increased, but the effect did not transfer to the Unusual Uses Test. Gerlach, et al. (1964) hypothesized that creativity would be improved when the instructions included cues "concerning criteria for scoring the responses." Therefore, they devised "criteria-cued directions" which contained such cues. The effect of these instructions was compared to five other kinds of instructions, including brainstorming. The criteria-cued directions produced more responses rated unique than any other method. Apparently instructions to be original are effective, especially when they specify the characteristics of the desired response.

It seems appropriate to generalize from the uniqueness studies and ask whether Ss can place restrictions on the "spew" when solving word problems. Under certain conditions Ss may be able to limit their production of implicit responses to low frequency words, or at least increase the relative proportion of such words produced. The literature appears devoid of studies relating directly to this question. However, Underwood (1966, p.
590) has suggested that such a restriction of "spew" might occur in anagram solving if a series of anagrams with low frequency solutions were presented to the same Ss. In this situation " . . . it seems reasonable to expect that spew would become more and more restricted as each anagram is solved. . . . " Underwood's hypothesis is based on a study of bigram frequency by Dominowski and Duncan (1964). They found an interaction between anagram bigram frequency and solution word bigram frequency. Performance was best when both types of frequency were high or both were low. In the case of low frequency bigrams, Underwood suggested that the low frequency anagram may restrict S's production to words composed of low frequency bigrams. Underwood generalized from bigram frequency to word frequency and predicted that Ss may be able to restrict the "spew" until low frequency anagrams were solved as quickly as high frequency anagrams. This reasoning may be expanded to include other word problems. The central question is: Can Ss restrict the "spew" and limit production to low frequency responses, or is the spew resistant to such alterations?

Judgment

The role of the judgment process in most word problems is to determine whether or not the solution which the S has produced is correct. In anagram studies, for example, the S has to determine whether his production is a word. The same is true for Duncan's (1966a) task. While the judgmental task is probably relatively simple in such problems, Johnson, Lynch and Ramsey (1967) have shown that word frequency influences the judgment process as well as the production process. In one experiment Ss were able to determine whether a letter arrangement following a word was the same as the word or different more easily when the words were common than when they were uncommon.
In a second experiment, Ss were able to evaluate the correctness of three-letter starts to five-letter anagrams better when the solutions were common words.

In the uniqueness studies judgment plays a somewhat different role. There is no response which is obviously best or correct. The task is more "open ended." The S may be able to produce a variety of potential solutions. The quality of his final solution depends heavily on his ability to choose the best one. This type of task has not been used in studies where word frequency is a variable. It could be introduced in a problem in which there was a series of possible one-word solutions that varied in frequency and the S was induced to evoke a low frequency solution. In such a task, the judgment phase is essential to achieving a good solution.

There is evidence that Ss are capable of making judgments of frequency. Attneave (1953) demonstrated that Ss can judge the frequency of use of individual letters. Underwood and Schultz (1966) have shown that Ss can also estimate the frequency of letter bigrams. Both of these findings were supported by Mayzner and Tresselt (1962). Howes (1954) asked Harvard students to rank words "in order of their frequency among Harvard undergraduates." Rank order correlations with the Lorge Magazine Count were .71 for 25 relatively rare words and .87 for 15 words covering a wider range of frequencies.

Purpose

The current study was designed to investigate the process by which Ss arrive at a low frequency response to a word problem. The task was chosen with three considerations in mind. 1) The problem had to have several potential solutions of varying frequency. 2) No single solution was obviously correct or best. 3) The S had to attempt to produce the most uncommon solution he could. A modified version of Duncan's (1966a) task was chosen. Ss were instructed to produce the most uncommon four-letter word which began and ended with specified consonants.
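The task just described can be made concrete with a short sketch. It assumes a hypothetical word list whose counts are placeholders invented for illustration (they are not Thorndike-Lorge values), and the function names are likewise hypothetical. The first function lists the four-letter words that begin and end with the specified consonants; the second picks the least common of them, which is what the low frequency instructions ask of the S.

```python
# Hypothetical word list: word -> placeholder frequency count.
# These counts are illustrative only; they are NOT Thorndike-Lorge values.
WORD_FREQ = {
    "bead": 20,
    "bend": 40,
    "bard": 3,
    "band": 60,
    "bird": 90,
    "back": 95,   # ends in K, so it is not a B-D candidate
    "board": 80,  # five letters, so it is excluded
}


def candidates(first, last, word_freq, length=4):
    """Return words of the given length that start with `first` and end with `last`."""
    first, last = first.lower(), last.lower()
    return sorted(
        w for w in word_freq
        if len(w) == length and w[0] == first and w[-1] == last
    )


def most_uncommon(first, last, word_freq):
    """Return the lowest-frequency candidate, or None if there is no candidate."""
    pool = candidates(first, last, word_freq)
    return min(pool, key=lambda w: word_freq[w]) if pool else None


if __name__ == "__main__":
    print(candidates("B", "D", WORD_FREQ))
    print(most_uncommon("B", "D", WORD_FREQ))
```

The sketch separates the two phases discussed in the text: `candidates` stands in for production, and `most_uncommon` for a judgment that selects among whatever was produced.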
The first major purpose of the study was to determine whether Ss could voluntarily restrict the range of their production on the frequency dimension, i.e. whether they could increase the rate of production of high or low frequency words at will. When the S has been instructed to think of a very uncommon solution, several production models may be considered. First, the S may begin by producing common responses and then continue to go down through the frequency hierarchy at a normal rate until he produces a response which he judges to be sufficiently rare. In this case the S has virtually no control of the production process. He proceeds mechanically through his repertoire until he gets to uncommon responses. A strict response hierarchy explanation seems to coincide with this model.

A second possibility is that the S begins at the top of the frequency hierarchy, by producing common responses, and then accelerates the rate at which he moves through the hierarchy to the rarer responses. This alternative can be considered a case of partial control over the production phase.

Finally, it may be possible for the S to "jump into" the response hierarchy at any point, including the low frequency end. If so, he can eliminate production of high frequency responses and restrict his production to uncommon responses when desired. In this alternative, the S has rather complete control of the production phase.

It is rather difficult to differentiate between the latter two models. Both predict a reduction of high frequency responses. It should be possible, however, to distinguish both of them from the first model, since it predicts no change in production under low frequency instructions. The present study is designed to determine whether the S can alter his production when instructed to do so.

The second major purpose of the study was to evaluate the relative importance of the production and judgment processes in this type of problem.
As indicated above, a successful or adaptive production change implies that the S can restrict his production to low frequency responses, or at least increase the proportion of such responses evoked. The judgment phase is considered successful if the S can choose a "good" response from those produced. The S who can choose successfully from a number of productions could give a low frequency solution without altering his production. Of course, it is possible for both processes to be operating in the same problem. The S could restrict his production and then choose the best response from this limited group. The experiment was designed to determine whether one or both of these processes influence the final solution to the problem.

In addition to the two major purposes stated above, there were two secondary purposes. First, practice effects were investigated. Several investigators have looked for practice effects within a series of anagrams. In a recent [...] knowledge of results was provided to one experimental group to determine whether this powerful variable would have an effect on a series of problems.

METHODS

Subjects

The subjects were 100 introductory psychology students who participated as part of a class requirement. Subjects were assigned to one of five groups in the order of their appearance, with the restriction that the number of males and females be equal across groups. Each of the resulting groups was composed of seven males and 13 females.

Problems

Each of the 16 problems used in the experiment consisted of two consonants. These consonant combinations were the initial and final letters of a subset of four-letter words from the T-L (Thorndike and Lorge, 1944) general word count. The 16 letter combinations were: B-D (for which the relevant words were bead, bend, bard, etc.), B-K, B-T, C-P, C-T, D-T, L-D, L-K, L-T, M-T, P-K, R-T, S-D, S-N, S-P and S-R. The subsets of T-L words which began and ended with these letter pairs ranged in number from 10 to 16 words.
In each subset, between 2 and 6 words had a T-L frequency of 50 or higher and an equal number had a T-L frequency of 5 or less. There were a total of 207 words for all 16 problems and a mean of 12.94 words per problem. The median T-L frequencies of the subsets ranged from 3 to over 50. The median of the medians was 15.5 and the median of all 207 words was 15.

A separate 3 by 5 stimulus card was prepared for each combination. The two relevant consonants were printed on the card with a black felt pen and separated by two blank spaces (e.g. D--K).

All 16 problems were presented to a pilot group in order to determine approximately how many solutions would be produced for each problem. Ss were run in a group and the problems were presented verbally. They were instructed to give as many solutions as they could which began with the first letter and ended with the second. With a 90 second solving period, the mean number of solutions per problem ranged from 3.42 to 5.96 (N = 24). The level of difficulty of the problems appeared suitable for the purpose of the investigation.

Recognition Test

An individual recognition test was devised for each problem for the purpose of determining which potential solutions Ss had produced during the solving period. It consisted of a small mimeographed form listing all words from the T-L tables which were potential solutions for that problem. The words were listed in a single column in alphabetical order. Each word was followed by a blank space. The tests were administered after each problem to determine which words S had thought of during the solving period.

Post-experimental Questionnaire

A typewritten questionnaire consisting of four questions was prepared and administered to all Ss. The questions were designed to provide "self report" information about the S's approach to the problem and the S's confidence in the technique employed to collect data on implicit behavior.
The questions follow:

1) After each problem you were asked to check the words you had thought of earlier. How confident are you that you were able to make an accurate identification of those words you thought of while you were solving the problem? very confident__, fairly confident__, (?)__, fairly unconfident__, very unconfident__.
2) How much emphasis did you place on the check list while you were solving the problem?
3) Did you use any particular method or technique to solve the problems?
4) Do you have any other comments about the task?

Conditions and Instructions

Five groups, three experimental and two control, were used in the experiment. The same problems were given to Ss in all groups, but instructions concerning the type of solution varied. A summary of the groups is found below. Complete instructions for each group can be found in Appendix A.

The three experimental groups were instructed to think of four-letter words which began and ended with the consonants composing the problem. Foreign words and proper nouns were disallowed. Further instructions specified the frequency in the language of the solution word to be produced. Examples of high and low frequency solutions to the T-K problem were given. In Group F (frequent solution instructions) the Ss were instructed to think of the most frequently used word they could which met the other requirements. In Group I (infrequent solution instructions) opposite instructions were given. Ss were told to think of the most infrequently used word they could within the limitations of the task. Ss in Group KR (knowledge of results) received the same instructions as those in Group I. In addition, after each solution had been emitted, the S was told how common or uncommon his response was according to the T-L tables. In all three experimental groups the appropriate recognition test was administered subsequent to each problem.
Group C-1 (control one) received infrequent solution instructions, but was not administered the recognition tests. Its purpose was to determine whether the recognition tests had any influence on solution frequency of subsequent problems. Finally, Group C-2 (control two) was simply told to think of as many solutions as possible to the problems. At the end of the solving time, the subjects were instructed to check the items on the recognition test that they had thought of while solving. The main purpose of this group was to provide a baseline for the three experimental groups on the recognition task. A summary of the groups and procedures is presented in Table 1.

Table 1. Summary of groups and procedures.

Group   Type of frequency   Final solution   Recognition test   Knowledge of results
        instructions        elicited         administered       obtained
F       frequent            yes              yes                no
I       infrequent          yes              yes                no
KR      infrequent          yes              yes                yes
C-1     infrequent          yes              no                 no
C-2     none                no               yes                no

Procedure

The problems were administered to each subject individually with S facing E across an office desk. A plywood shield on the desk prevented S from seeing E's materials, but did not obstruct S's view of E or vice versa.

After each S had received instructions appropriate to his group and had indicated that he understood the instructions, the first problem was presented. Both verbal and visual presentations were used. The E held up the stimulus card for approximately five seconds. During this time E also read the problem letters aloud (e.g. "the problem is D-K"). The S was then allowed 90 seconds to solve the problem. Any S who attempted to give his solution before the 90 seconds were up was instructed to take the full amount of time for solving. At the end of that period the S was asked to give his solution verbally. This response was then recorded as the S's final solution (FS) to that problem.

Immediately after the FS had been elicited, E handed S the recognition test appropriate to that problem.
S was instructed to check only those items which he had thought of while working on the problem. If S indicated that he was uncertain of his ability to recognize his earlier productions, he was instructed to "do the best you can." If the question arose, S was told to ignore any words he had thought of which did not appear on the list. The incidence of the latter two occurrences was rather low. The items checked on the recognition test included the FS plus other words S had implicitly produced during the solving period. The latter were designated as intervening productions (IPs). Total production (TP) for the problem then consisted of the FS plus the IPs.

For Groups F, I and C-2 the problem was completed when S returned the recognition test to E. In Group KR one step was added. Ss in this group were informed of the frequency of their responses. This information was determined from the T-L tables. Since no recognition test was used in Group C-1, the problem was considered complete when the S verbalized his solution. In all groups, the next problem was presented without delay upon the completion of the former.

A total of 18 problems were presented to each S. The first two problems were practice trials and were not included in the analysis. The same two problems, D-K and S-M, were used for all Ss. The Ss were not aware that these were practice problems, since the 16 experimental problems followed immediately and the sequence was uninterrupted. The S was not told how many problems were to be solved. Upon completion of the final problem the post-experimental questionnaire was presented to each S. The time required for the entire session was approximately 35 to 50 minutes.

Twenty different problem orders were used in the experiment. Ss in each group were numbered in the order of their appearance. One problem sequence was used for the first S in all groups, another for the second S in all groups, etc. The sequences were determined in the following manner.
The 16 problems were randomly divided into two equal groups. One group, consisting of problems B-D, B-T, C-T, L-D, P-K, R-T, S-N and S-R, was presented first to odd-numbered Ss and last to even-numbered Ss. The other group, consisting of problems B-K, D-T, C-P, L-K, L-T, M-T, S-D and S-P, was presented first to even-numbered Ss and last to odd-numbered Ss. [...] The problems in each half were randomly ordered. Thus the problem orders were balanced to allow for first and second half comparisons.

RESULTS

Treatment of the Data

The T-L frequency was determined for every response used in the following analysis. Frequencies were easily assigned to all IPs because Ss were limited to the words on the check list. However, some Ss elicited FSs which were not included in the T-L tables. These exceptions fell into three categories. 1) Unlisted words. These FSs were words according to Webster's New Collegiate Dictionary, but were not found in the T-L tables. It was assumed that these words were too uncommon to appear in the tables. Therefore, they were assigned a frequency of less than one per million and were included in the analysis. 2) Non-words. These FSs were neither T-L nor dictionary entries. They were considered invalid responses and were not included in the analysis. 3) Blanks. In a few cases, Ss were unable to produce any solution to the problem.

The number of blanks and non-words was relatively small and fairly evenly divided among groups. The 17 non-words and 6 blanks were distributed as follows: 5 in Group F, 8 in Group I, 6 in Group KR, and 4 in Group C-1. Since each of the 20 Ss within each group attempted to solve 16 problems, there were 320 potential FSs per group. That meant the number of FSs for which a frequency could be determined was 315, 312, 314, and 316 respectively in Groups F, I, KR, and C-1.

For purposes of analysis, all usable responses were divided into three categories on the basis of their T-L frequencies.
The first category, high frequency productions, included all responses with a T-L rating of A or AA (50 or higher). Responses with a T-L rating of five or less were designated as low frequency productions. The third category consisted of the remaining productions in the middle frequency range.

The 16 experimental problems were divided into two equal parts. The data from the first eight problems administered to a S were combined and are designated as Block I. The corresponding data from the final eight problems are referred to as Block II. Problem orders had been counterbalanced to allow for comparisons between the two blocks.

Analysis of Final Solutions

Ss in four groups, F, I, KR and C-1, elicited FSs to the problems. The T-L frequency of the FSs reflected the Ss' ability to solve the problems. This ability was an assumption essential to the remaining analyses. The ability would be reflected by a tendency to produce high frequency FSs in Group F and an opposite tendency to produce low frequency solutions in Groups I, KR, and C-1.

Since the T-L tables do not specify frequencies greater than 50, and many FSs fell into this category, the mean frequencies of the solutions could not be calculated. It was possible, however, to determine the median T-L frequency of each S's FSs. Medians of these medians indicate a sizable difference between Group F and the other three groups, all of which received low frequency instructions. The median of the medians exceeded 100 in Group F, while in Groups I, KR, and C-1 the medians of the medians were 8, 4.5 and 10.25 respectively. There was no overlap between Group F and the other groups. The lowest median in Group F was 43.5, while the highest in the other three groups was 29.5. All comparisons between groups were evaluated by means of the Composite-Rank Test (Guilford, 1965). For each difference, the smaller sum of ranks was compared to the significant R value at the .05 level in the table devised by Wilcoxon for samples of equal size.
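The scoring just described (three frequency categories, and a median of each S's median final-solution frequency) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the open-ended T-L ratings A and AA are coded here by their lower bounds of 50 and 100 per million, and the function names and sample data are hypothetical.

```python
from statistics import median

# Assumption: the open-ended ratings are coded by their lower bounds.
RATING_FLOOR = {"A": 50, "AA": 100}


def to_number(freq):
    """Map a T-L entry (a count, or the rating 'A' or 'AA') to a number."""
    return RATING_FLOOR.get(freq, freq)


def category(freq):
    """Classify a response as 'high' (50 or more), 'low' (5 or less) or 'middle'."""
    value = to_number(freq)
    if value >= 50:
        return "high"
    if value <= 5:
        return "low"
    return "middle"


def median_of_medians(solutions_by_subject):
    """Median across subjects of each subject's median final-solution frequency."""
    per_subject = [
        median(to_number(f) for f in solutions)
        for solutions in solutions_by_subject
    ]
    return median(per_subject)


if __name__ == "__main__":
    # Made-up final solutions for three hypothetical subjects.
    group = [["AA", "A", 30], [12, "AA", "AA"], [4, 60, "A"]]
    print([category(f) for f in group[0]])
    print(median_of_medians(group))
```

Because a median depends only on the ordering of the values, it remains usable when the upper end of the scale is recorded as a rating rather than a count, which is why medians rather than means are reported in the text.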
The Composite-Rank analysis showed that Group F differed significantly from all of the other groups, but none of the differences among the other groups was significant.

The number of high frequency FSs was analyzed to further evaluate the Ss' ability to solve the problem. Means and standard deviations of the number of high frequency responses are presented in Section A of Table 2. Separate means are given for Block I, Block II and the entire session. All standard deviations are shown in parentheses. Hartley's F-max Test of Heterogeneity of Variance (Walker and Lev, 1953) was performed on the data and the assumption of homogeneity appears tenable (Fmax = 1.51, 19 df).

Table 2. Means and standard deviations of the number of high and low frequency final solutions in four groups.
A. Number of high frequency final solutions.
   Block I    (1.08)
   Block II   1.45 (1.25)
   Total
B. Number of low frequency final solutions.

A 2 X 4 factorial analysis of variance was performed on the data to evaluate the effects of conditions, practice and their interaction. The analysis was "mixed" since the conditions effect involved an inter-subject comparison and the comparison between blocks of trials was intra-subject. The analysis corresponds to Lindquist's Type I design (Lindquist, 1953; p. 267). The results of this analysis are summarized in Table 3. The F ratio for conditions was significant at the .01 level. Both practice and interaction F ratios failed to reach statistical significance.

Table 3. Summary of analysis of variance of number of high frequency final solutions.

Source               df    Mean Square    F         P
Between Subjects
  Conditions          3                   116.26    .01
  Error between Ss   76
Within Subjects
  Blocks              1                     .08     n.s.
  B X C               3                    1.42     n.s.
  Error within Ss    76
Total               159

The significant F ratio for conditions indicates that there were significant differences in the number of high frequency FSs produced by the different groups. Duncan's range test was used to determine which of the specific differences were significant.
The analysis indicated that Group F produced a significantly larger number of high frequency solutions than any of the other three groups. None of the differences among the remaining three groups was significant. Inspection of Table 2 indicates that the magnitude of the differences is rather large. The mean number of high frequency FSs is more than 4 times greater in Group F than in the next group.

The number of low frequency FSs was analyzed in the same manner as the number of high frequency FSs. The relevant means and standard deviations are found in Table 2, Section B. The Fmax Test indicated that the variances were heterogeneous (Fmax = 9.68, 19 df). In spite of the fact that the assumption of homogeneity was not satisfied, an analysis of variance was performed to assess the effects of conditions and practice. Edwards (1960, p. 132) states, ". . . since the F test is very insensitive to nonnormality and since with equal n's it is also insensitive to variance inequalities, it would be best to accept the fact that it can be used safely under most conditions."

The same 2 X 4 factorial analysis of variance was repeated on the low frequency data. Table 4 summarizes this analysis. Again, the practice and interaction effects were not significant while the F ratio for conditions was significant at the .01 level. Duncan's range test was used to determine which of the separate group means differed significantly. The results were the opposite of those found for high frequency FSs. The analysis indicated that significantly fewer low frequency FSs had been evoked in Group F than in any of the other groups. Only one other difference was significant: Group KR produced more low frequency FSs than Group C-1.

Table 4. Summary of analysis of variance of number of low frequency final solutions.
Source               P
Between Subjects
  Conditions         .01
  Error between Ss
Within Subjects
  Blocks             n.s.
  B X C              n.s.
  Error within Ss
Total

The difference between the 3 groups which received low frequency instructions and Group F is rather large. The smallest of the means for these 3 groups is over 6 times as great as the mean for Group F. This difference, together with the differences in means on the high frequency FSs and the differences in median frequencies, provides substantial evidence that the Ss were able to solve the basic problem and emit the kind of solution requested by the instructions.

Analysis of Total Production

Data on the Ss' TP during the problem solving period were obtained from Groups F, I, KR and C-2 by means of the recognition test. Differences in the T-L frequencies in the TP were taken as evidence of an alteration in the Ss' production process. Group C-2 provided a production baseline since its production was not influenced by frequency instructions. Expected production changes in Group F would take the form of relatively high frequency responses, while Ss in Groups I and KR would produce relatively fewer common responses and more uncommon responses.

The median T-L frequency was derived for every S's TP. Then, for each group, a median of these medians was determined. These medians of medians follow: Group F, 31; Group I, 24; Group KR, 24.75; and Group C-2, 32. The Kruskal-Wallis H-Test was applied to the data (H = 5.83, 3 df). The null hypothesis of no differences among groups could not be rejected. Thus these data do not provide any evidence of higher or lower frequency production in any of the 4 groups.

To further evaluate the extent of the production changes, the mean number of high frequency productions and the mean number of low frequency productions were calculated for each group. These means and corresponding standard deviations are presented in Sections B and C of Table 5.
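
The Kruskal-Wallis result reported above (H = 5.83, 3 df) can be checked against the chi-square distribution, which is the usual large-sample reference distribution for H with one fewer degree of freedom than the number of groups. The sketch below is illustrative only: the function name is hypothetical, the closed form it uses holds for 3 df specifically, and it assumes the chi-square approximation is what the comparison rested on.

```python
from math import erfc, exp, pi, sqrt


def chi_square_sf_3df(x):
    """Upper-tail probability of a chi-square variate with 3 df.

    Closed form valid for 3 df only:
    P(X > x) = erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)
    """
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


h_reported = 5.83      # H from the text: 4 groups, hence 3 df
critical_05 = 7.815    # tabled .05 critical value of chi-square with 3 df

p_value = chi_square_sf_3df(h_reported)    # roughly .12
reject_null = h_reported > critical_05     # H falls short of the critical value
```

Under these assumptions the reported H lies below the .05 critical value, which agrees with the statement that the null hypothesis could not be rejected.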
The data on the number of high frequency productions were subjected to the Fmax Test to evaluate the hypothesis of heterogeneity of variance. The resulting ratio indicates that homogeneity may be assumed (Fmax = 1.87, 19 df). The same 2 X 4 "mixed" factorial analysis of variance used on the FSs was applied to the TP data. A summary of this analysis for high frequency TP is presented in Table 6. None of the F ratios in the analysis was significant. There was no evidence that any group produced more high frequency responses than any other group. Also, there was no increase or reduction in the number of high frequency productions from Block I to Block II.

Table 5. Means and standard deviations of the number of productions and the number of high and low frequency productions in four groups.
A. Total number of productions.
   Block I    40  39.45   18) (11.90) (12.87)
   Block II   .65  40.90  39.85   .65) (11.90) (14.75)
   Total      73.05  80.35  77.25   (21.33) (26.41)
B. Number of high frequency productions.
C. Number of low frequency productions.

Table 6. Summary of analysis of variance of number of high frequency responses.

Source               P
Between Subjects
  Conditions         n.s.
  Error between Ss
Within Subjects
  Blocks             n.s.
  B X C              n.s.
  Error within Ss
Total

The same analysis was applied to the number of low frequency responses. Again, the Fmax ratio indicated that the assumption of homogeneity was not violated (Fmax = 2.44, 19 df). The analysis of variance is summarized in Table 7. The conditions effect was not significant, indicating there were no differences in the mean number of low frequency productions among groups. The practice effect was significant, however. The number of low frequency productions increased from Block I to Block II. The interaction effect was not significant.

The analysis of high and low frequency productions supports the findings based on medians. The high and low frequency instructions and knowledge of results did not significantly affect the frequency of the responses produced in the problem situation.

Table 7.
Summary of analysis of variance of number of low frequency responses.

Source               P
Between Subjects
  Conditions         n.s.
  Error between Ss
Within Subjects
  Blocks             .01
  B X C              n.s.
  Error within Ss
Total

Means and standard deviations of the total number of productions (disregarding frequencies) can be found in Table 5, Section A. The Fmax Test was applied to these data and the resulting ratio indicated that the hypothesis of heterogeneity of variance is unsupported (Fmax = 1.61, 19 df). The analysis of variance used in the prior sections was repeated on these data. Table 8 summarizes the analysis. The conditions effect was not significant, indicating that the TP did not vary across groups. The Conditions X Blocks interaction was also non-significant. The F ratio for the practice effect indicates that significantly more productions were elicited in Block II than in Block I. This difference corresponds to the increase in low frequency productions in Block II.

Table 8. Summary of analysis of variance of total number of responses.

Source               P
Between Subjects
  Conditions         n.s.
  Error between Ss
Within Subjects
  Blocks             .01
  B X C              n.s.
  Error within Ss
Total

Product-moment correlations were obtained between the number of "good" productions and the total number of other productions. The latter statistic was used instead of TP to eliminate the spuriousness which would have resulted from the common occurrence of high or low frequency productions in both of the correlated measures. Groups I and KR were combined and yielded a coefficient of .65 between the number of low frequency productions ("good" solutions by definition) and productions with a T-L frequency of 6 or higher. For Group F the correlation coefficient between the number of high frequency productions and productions with T-L frequencies below 50 was .76. These coefficients are within the range found by Parnes and Meadow (1959) and Gerlach et al. (1964) in uniqueness studies.
The coefficient of correlation between the number of low frequency FSs and remaining productions was .34 in Groups I and KR. The related coefficient between the number of high frequency responses and other responses for Group F was .68.

Rank order correlation coefficients were obtained between the total number of productions and the median frequency of TP. The resulting coefficients were -.12 in Groups I and KR and -.25 in Group F.

Within Groups Comparisons of Final Solutions and Intervening Productions

Ss in only 3 of the 5 groups (the 3 experimental groups) both emitted an FS and responded to the recognition test. Within each of these groups a comparison was made between the T-L frequencies of the FSs and the IPs.

Neither individual nor group means can be reported because many of the responses have a T-L frequency above 50 and interval data on these frequencies were not available. Therefore, the median FS and median IP were determined for each of the 60 Ss. Group medians (actually medians of the individual Ss' medians) were then determined for each of the 3 groups. These group medians, for FSs and IPs, can be found in Table 9.

Table 9. Medians of final solutions and intervening productions.

Median FS    Median IP

In Group F, the median T-L frequency of the FSs is considerably higher than the median frequency of the IPs. In fact, when each S was considered individually, the median frequency of the FSs was greater than the median frequency of the IPs for every one of the 20 Ss. A sign test was performed on the data and the difference is highly significant.

In Groups I and KR the results were in the opposite direction. In both groups, the group medians are distinctly higher for IPs than for FSs. Again an inspection of individual S medians revealed that there were no reversals. Every S's median IP is greater than his median FS. The sign test indicates the difference is highly significant in both cases.
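
The size of the sign test result for a group of 20 Ss with no reversals can be illustrated with a short calculation. The sketch below assumes a two-sided exact binomial sign test with a chance probability of one half in each direction; the function name is hypothetical and the original computation may have relied on tables instead.

```python
from math import comb


def sign_test_p(n_positive, n_negative):
    """Two-sided exact sign test p-value (ties assumed already dropped)."""
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    # Probability of k or fewer outcomes in the rarer direction
    # when each direction is equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# All 20 Ss in a group differ in the same direction, with no reversals.
p_no_reversals = sign_test_p(20, 0)
```

With 20 differences all in one direction, the two-sided probability is 2 divided by 2 to the 20th power, or about two in a million, which is consistent with describing the difference as highly significant.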
Post-experimental Questionnaire

The post-experimental questionnaire was given to all Ss. However, Questions 1 and 2 were omitted in Group C-1 since those questions pertained to the recognition test, which was not administered to those Ss. Similarly, Question 2 was omitted for Group C-2.

Questions 2 and 4 did not yield any data worth reporting. Responses were too widely varied for Question 4, and Ss apparently did not understand Question 2.

On Question 1, 36 Ss replied that they were very confident and 39 considered themselves fairly confident that they could identify the words on the checklist that they had thought of while working on the problem. Only 3 Ss responded by checking the (?) alternative, and the fairly unconfident and very unconfident responses were each checked once. In all, then, 75 out of 80 Ss responded that they felt some degree of confidence. Analysis of the responses according to groups is presented in Appendix B.

On Question 3, only 11 out of 100 Ss reported that they did not use any special method or technique to solve the problems. Out of the 89 who answered yes, 65 reported using some type of letter substitution system, i.e. going

DISCUSSION

The major purpose of the study was to determine the importance of the production and judgment processes in reaching a "good" solution to a relatively simple word problem. In order to study these intervening processes, it was essential that a good solution be reached. A good solution in this study consisted of a high or low frequency word, depending on the experimental instructions. It would have been impossible to evaluate the extent to which production and judgment processes affected the solution if there had been no differences between the frequencies of the FSs elicited.

The results leave little doubt that the instructions were effective and the basic problems were solved.
There were substantial differences in the FS frequencies of Group F and Groups I, KR, and C-1, all of which received low frequency instructions. The separation of median T-L frequencies for FSs illustrates the extent of the differences. The lowest median T-L frequency in Group F was considerably higher than the highest median T-L frequency in the other groups. The data on the number of high and low frequency FSs substantiate the conclusion that the Ss were able to reach good solutions. With this fundamental assumption satisfied, the major issue can be approached.

If changes in the production phase instrumental to solving took place, there should have been differences in the T-L frequencies of the potential solutions. However, there were no significant differences in the median T-L frequency of the TP, the mean number of high frequency productions, or the mean number of low frequency productions. Thus the data indicate that the Ss produced potential solutions of comparable frequencies whether they were instructed to reach high or low frequency solutions. There is no way to determine whether or not they were capable of making facilitative production changes, but they obviously did not do so in this situation.

Three models of solution production were described in the introductory section. The results are more consistent with the first model than with the latter two. According to the first model, Ss have little control over the production process and proceed mechanically through the frequency hierarchy. The latter models are inappropriate since both predict a production change.

The practice effects in the production phase are also congruent with the spew hypothesis. TP increased significantly from Block I to Block II. The increase in production, moreover, occurred at one end of the frequency continuum.
The increase in TP apparently resulted from an increase of low frequency responses, since a significant practice effect was found for low frequency productions but not for high frequency productions. This latter finding can be explained in terms of the spew hypothesis. In the second block of problems Ss presumably went through the upper part of the hierarchy in the same manner as they did in Block I, but were able to proceed further down the hierarchy. The additional productions, therefore, came from the lower end of the frequency hierarchy, i.e. were uncommon words. It is significant that the increase in low frequency productions occurred in all four groups, including Group F. This indicates that spew, or some similar mechanism, operated in the face of instructions to produce high frequency words.

While the production data can be accounted for by the spew hypothesis, the evidence is necessarily indirect. The spew hypothesis makes specific predictions about the sequence of production, but the method used did not reveal the order in which the potential solutions were produced. Therefore, it is not possible to determine exactly what happened during the production phase. If a major production change had occurred, i.e. if Group F had produced a preponderance of high frequency words while Groups I, KR and C-2 had produced a preponderance of low frequency responses, it would have been possible to conclude that spew had been restricted in the manner suggested by Underwood (1966). It can only be concluded that there is no evidence that this happened. The distribution of frequencies was approximately the same for all groups. This distribution might have occurred in a number of ways. It is possible that spew was operating in all groups in the manner described above. It is also possible that the order of production was haphazard.
Or, it is even possible that Ss actually produced the relevant responses first and then went on to produce the other solutions.

The Ss' responses to Question 3 on the post-experimental questionnaire do provide some suggestions about the production process. Most Ss indicated that they used some strategy to generate solution words. Many Ss mentioned some type of letter substitution technique, e.g. placing vowels in the second letter position and then filling in position three with a series of consonants. This type of method would not necessarily lead to a production order congruent with spew.

While the spew hypothesis does account for the data, other explanations are equally tenable. Future research may determine which alternative is most reasonable. In the present investigation the important fact remains that the Ss were unable to alter or restrict their production in a manner that would facilitate problem solving.

The failure to obtain production differences based on instructions contradicts the findings from the uniqueness studies, particularly Gerlach et al. (1964). Their criteria-cued directions increased the number of unique responses on the Unusual Uses Test. In the present study, the criterion was clearly specified. Ss were instructed to give "infrequent" and "uncommon" solutions and examples were given. More stringent restrictions are placed on production in the present study than in the Unusual Uses Test. Perhaps criteria-oriented instructions are more effective when the potential responses are less limited than in the present study.

The data indicate that the judgment process was critical to solving. Results of the comparison between median T-L frequency of FSs and IPs are clear. In Group F, the median FS frequency is higher than the median IP frequency for every S.
In the three groups given low frequency instructions, the differences are equally obvious, but in the opposite direction. Every S's median FS frequency was lower than his median IP frequency. An inspection of the individual data revealed that the differences were substantial in almost every case. Ss did choose good solutions, high or low frequency as the instructions required, from among those they produced.

Some general conclusions about the problem solving process, based on the results discussed, can be made at this point. Ss had little trouble generating a number of word solutions. They did not succeed in restricting their productions to solutions of relevant frequency, for the distribution of potential solution frequencies was similar for all groups. However, the array of words generated apparently included a sufficient number of both high and low frequency words to allow for problem solution under either set of frequency instructions. The key to the solution was the S's judgment ability. Ss successfully chose a "good" solution from the number produced. The probability of a good solution being available and subsequently being chosen increased if many potential solutions were produced. This is reflected in the correlations between TP and the number of "good" productions and the number of "good" FSs.

The current investigation utilized a unique method for collecting production data. "Thinking out loud" techniques have been used for similar purposes in anagram studies (Mayzner, Tresselt and Helbock, 1964). In the present study, however, the data were collected after the solution had been elicited, to avoid interference with the solving process. The nature of the problem itself afforded an opportunity to employ a recognition test for this purpose, since the major portion of potential productions could be specified in advance.
There were no indications that the administration of the recognition test had any substantial effects on the remaining problems in the series. The FSs of Group C-1, which did not receive the recognition test, did not differ significantly from those of Group I on any of the dependent variables. The data from the post-experimental questionnaire indicated that a large majority of the Ss felt at least fairly confident that they were able to recognize their earlier productions. The recognition test is apparently a workable method and suitable for further research.

The recognition test has at least one important limitation. It does not yield information about the order of production. Order information could be obtained by asking Ss to verbalize productions as they occur. With the present problem, Ss may be able to specify order on a post-test if the number of productions were reduced. This could be done by using five-letter words, which Duncan's (1966a) Ss found quite difficult, or by using problems with fewer potential solutions.

A revision of the current design in which solving time is varied should provide a more direct test of the spew hypothesis. If spew is operating, additional solving time should facilitate problem solving under conditions of low frequency instructions, but not high frequency instructions. If, as spew predicts, high frequency solutions are produced first, all or at least most of the relevant productions for high frequency instructions should occur during the early portion of the solving period and later productions should be relatively useless. With low frequency instructions, however, the spew hypothesis predicts that the longer the S has to solve, the further he will proceed down the hierarchy, the more uncommon his responses will be and therefore the more relevant. If spew is operating there should be a significant

SUMMARY

Problem solving is frequently analyzed into production and judgment processes.
In word problems, solution word frequency strongly influences the production process. Few such word problems involve the production of a number of potential solutions. The present study involves a problem which does and was designed to determine whether Ss would alter their production when instructed to emit high or low frequency solutions. The relative influence of production and judgment processes on the final solution was also investigated.

Each S solved 16 problems consisting of two consonants. Ss were instructed to think of solutions consisting of four-letter words which began with the first letter and ended with the second. Production data were collected after the solution was given by means of a recognition test consisting of all potential solutions to that problem found in the Thorndike-Lorge (T-L) tables. Ss were instructed to check their earlier productions. Five groups, of 20 Ss each, were used. Group F was instructed to elicit the most common solution possible, Group I the most uncommon. Group KR received uncommon instructions and received knowledge of results after every problem. Group C-1 received uncommon instructions but no recognition test. Finally, Group C-2 did not receive any frequency instructions.

T-L frequencies were determined for each production. Frequencies above 50 were designated high frequency solutions, those of 5 or less as low frequency solutions. Median frequencies of final solutions were significantly higher for Group F than for any of the remaining groups. Group F also elicited significantly more high frequency and significantly fewer low frequency solutions. The total number of productions, the number of low frequency productions and the number of high frequency productions did not vary among groups. The total number of productions did increase over blocks of trials and this could largely be accounted for by an increase in low frequency productions.
Ss successfully evaluated their own productions and chose a "good" solution from those produced. Median frequencies of final solutions were much higher than median frequencies of all other productions in Group F and much lower in Groups I and KR.

It was concluded that Ss did not alter their production in a facilitative fashion. However, they produced enough potential solutions of varying frequencies to allow for problem solution under either set of instructions. The judgment phase was critical to solution. These results can be accounted for within the framework of the spew hypothesis, but alternative explanations are possible. A more direct test would require data on production order.

BIBLIOGRAPHY

Attneave, F. Psychological probability as a function of experienced frequency. Journal of Experimental Psychology, 1953, 46, 81-86.
Davis, G.A. Current status of research and theory in human problem solving. Psychological Bulletin, 1966, 66, 35-54.
Duncan, C.P. Effect of word frequency on thinking of a word. Journal of Verbal Learning and Verbal Behavior, 1966, 5, 434-440. (a)
Duncan, C.P. Response hierarchies in problem solving. Presidential Address to Midwestern Psychological Association, Chicago, 1966. (Reprinted in Duncan, C.P. Thinking: Current experimental studies. New York: J.B. Lippincott Company, 1967) (b)
Dominowski, R.L. and Duncan, C.P. Anagram solving as a function of bigram frequency. Journal of Verbal Learning and Verbal Behavior, 1964, 3, 321-325.
Edwards, A.L. Experimental design in psychological research. New York: Holt, Rinehart & Winston, 1960.
Gerlach, V., Schultz, R., Baker, R., and Mazer, G. Effects of variations in test instruction on originality test response. Journal of Educational Psychology, 1964, 55, 79-83.
Guilford, J.P. Fundamental statistics in psychology and education. New York: McGraw-Hill Book Company, 1965.
Howes, D. On the interpretation of word frequency as a variable affecting speed of recognition.
Journal of Experimental Psychology, 1954, 48, 106-112.
Johnson, D.M. Solution of anagrams. Psychological Bulletin, 1966, 66, 371-384.
Johnson, D.M., Lynch, D.O. and Ramsey, J.G. Word frequency and verbal comparisons. Journal of Verbal Learning and Verbal Behavior, 1967, 6, 403-407.
Johnson, T.J. and Van Mondfrans, A.P. Order of solutions in ambiguous anagrams as a function of word frequency of the isolated words. Psychonomic Science, 1965, 3, 565-566.
Lindquist, E.F. Design and analysis of experiments in psychology and education. Boston: Houghton Mifflin Company, 1953.
Maltzman, I. Thinking: from a behavioristic point of view. Psychological Review, 1955, 62, 275-286.
Maltzman, I., Brooks, L.O., Bogartz, W. and Summers, S. The facilitation of problem solving by prior exposure to uncommon responses. Journal of Experimental Psychology, 1958, 56, 399-406.
Maltzman, I., Simon, S., Raskin, D., and Licht, L. Experimental studies in the training of originality. Psychological Monographs, 1960, 74, 493.
Mayzner, M.S. and Tresselt, M.E. Anagram solution times: a function of letter order and word frequency. Journal of Experimental Psychology, 1958, 56, 376-379.
Mayzner, M.S. and Tresselt, M.E. The ranking of letter pairs and single letters to match digram and single-letter frequency counts. Journal of Verbal Learning and Verbal Behavior, 1962, 1, 203-207.
Mayzner, M.S., Tresselt, M.E. and Helbock, H. An exploratory study of mediational responses in anagram problem solving. Journal of Psychology, 1964, 57, 263-274.
Parnes, S.J. and Meadow, A. Evaluation of persistence of effects produced by a creative problem solving course. Psychological Reports, 1960, 7, 357-361.
Rosenbaum, M.E., Arensen, S.J. and Panman, R.A. Training and instructions in the facilitating of originality. Journal of Verbal Learning and Verbal Behavior, 1964, 3, 50-56.
Thorndike, E.L. and Lorge, I. The teacher's word book of 30,000 words. New York: Columbia University Press, 1944.
Underwood, B.J. Experimental psychology. New York: Appleton-Century-Crofts, 1966.

APPENDIX A

INSTRUCTIONS

Complete instructions for Group I are presented below. Instructions for the remaining groups involve various modifications which are described subsequently.

Group I:

(Paragraph 1) "I have a series of short problems I want you to solve. Each problem consists of two letters. You are to produce a four letter word that begins with the first letter and ends with the second. You may not use foreign language words or proper nouns. For example, if I give you the letters T-K, words like TASK, TOOK or TUSK meet all the requirements."

(Paragraph 2) "Words vary in the extent to which they are used in the language. Some words like FROM or DOOR are very common, while others like FURL or GARB are much less common. I want you to think of the most infrequently used word you can which meets the other requirements. As an example, in the T-K problem TUSK is a better solution than TOOK because it is much less common."

(Paragraph 3) "You will have 90 sec. to work on each problem. Do not give me your answer until I tell you the time is up. I will check to see what other words you have produced after the problem is solved, but don't be too concerned with this while you are solving the problem."

(Paragraph 4) "Here again is a list of the solution requirements. (1) It must be a word. (2) It must begin and end with the letters specified. (3) It must have exactly four letters. (4) It can be neither a proper noun or foreign word. (5) Finally, it should be the most uncommon word you can think of, within the other limitations."

Group F: The word "frequently" was substituted for "infrequently" in the second sentence of paragraph 2 and the words "TOOK" and "TUSK" were interchanged in the third sentence. The word "common" was substituted for "uncommon" in paragraph 4, item 5.

Group KR: The following sentence was added to paragraph 2.
"After each problem, I will tell you how infrequent your answer is so you know how well you are doing."

Group C-1: The second sentence was deleted from paragraph 3.

Group C-2: Paragraph 2 was deleted. The second sentence of paragraph 3 was changed to read, "I will check to see what words you thought of after the time is up." Finally, item 5 in paragraph 4 was deleted.

Instructions for the Recognition Test: (Groups I, F, KR and C-2)

"Here is a list of words related to the problem you have just completed. Check those words and only those words which you thought of while you were working on the problem. Some of the words you produced may not appear on the list, but don't be concerned about that."

APPENDIX B

DISTRIBUTION OF RESPONSES TO QUESTIONS #1 AND #3 ON THE POST-EXPERIMENTAL QUESTIONNAIRE

Frequency Distribution of Responses to Question #1. "After each problem you were asked to check the words you had thought of earlier. How confident are you that you were able to make an accurate identification of those words you thought of while you were solving the problem?"

Response
very confident
fairly confident
(?)
fairly unconfident
very unconfident

Frequency Distribution of Responses to Question #3. "Did you use any particular method or technique to solve the problems?"

Response