AN ANALYSIS OF DIFFICULTIES IN ABSTRACT SYLLOGISTIC REASONING

Dissertation for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
FRED HELSABECK, JR.
1973

This is to certify that the thesis entitled AN ANALYSIS OF DIFFICULTIES IN ABSTRACT SYLLOGISTIC REASONING presented by Fred Helsabeck, Jr. has been accepted towards fulfillment of the requirements for the Ph. D. degree in Psychology.

ABSTRACT

AN ANALYSIS OF DIFFICULTIES IN ABSTRACT SYLLOGISTIC REASONING

By

Fred Helsabeck, Jr.

This study analyzes student performance in solving abstract syllogisms from the perspective of instructional psychology; that is, an attempt is made to determine the initial cognitive state of the typical college student without formal logical training, and then to determine what transforming actions would be effective in causing a change to a desired final state.

An "algebraic substitution" model is proposed as being most typical of the reasoning processes of untrained college students. A given subject is hypothesized to make substitutions of letters in the statements as though they were equations, according to a "substitution hierarchy" consisting of one or more of the four standard sentence forms traditionally used in syllogisms. This model fits the pattern of response frequencies somewhat better than two other models based on Woodworth and Sells's atmosphere effect or Chapman and Chapman's invalid conversion with probabilistic inference.

To get further insight into the subject's initial state as well as what transforming actions might be effective, a series of experiments was performed, testing the effect of various modifications of the original syllogism task.

The first experiment tested the effect of changing the word "some" to "at least one" to remove the ambiguity of that word in English.
Scores on the two forms of wording were very similar for the same individual and were highly correlated (r = .90). The small difference between groups was interpreted as sampling error. A number of subject protocols showed confirming evidence for the algebraic substitution model.

The second experiment tested the effect of using spatial wording in place of the traditional wording. This alteration produced mixed results. Some subjects made much higher scores on the spatial task while others simply made different types of errors. The deciding factor in whether or not a subject would profit from the use of drawings seemed to be whether or not all of the possibilities inherent in the statements were considered, but it was not clear whether this was due to lack of imagination or simply failing to look for these possibilities.

The third experiment forced consideration of appropriate possibilities by testing the production of counter-examples to invalid syllogisms. The relative effectiveness of three methods of producing counter-examples was tested, one using set diagrams (spatial), one using concrete objects (verbal-specific), and one using general verbal descriptions (verbal-general). Scores for the spatial and verbal-specific methods were significantly better than the verbal-general method, but the scores for all three methods were relatively low even for this reduced task, owing in large measure to the difficulty subjects had in negating statements, especially statements involving "Some __ are not __."

A fourth experiment removed the difficulty of forming negations of statements from the production of counter-examples by modifying the task to one in which the step of negation was already taken, leaving only the difficulty of example production.
Near-perfect scores resulted for both of the two methods of example generation used, but the spatial method took less time than the verbal-concrete method, even when a correction was made for writing time.

Several conclusions were drawn from these experiments that relate to the process of instruction: (1) Subjects are capable of considering various alternatives when reasoning with syllogisms but must be induced to do so. (2) Subjects have difficulty forming negations of statements and must be instructed in this. (3) Set diagrams are helpful in generating examples. A simple instructional procedure that was consistent with the above conclusions was proposed.

AN ANALYSIS OF DIFFICULTIES IN ABSTRACT SYLLOGISTIC REASONING

By

Fred Helsabeck, Jr.

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Psychology

1973

ACKNOWLEDGMENTS

I would like to thank Dr. Donald Johnson for spending many hours with me from the early exploratory work to the final draft, supplying advice and encouragement. I would also like to thank the other members of my dissertation committee, Dr. David Wessel, Dr. Richard Hart, and especially Dr. John Hunter who has had a particularly strong influence in clarifying my thinking.

TABLE OF CONTENTS

Chapter
I.   THE INITIAL COGNITIVE STATE
        Some Theoretical Hypotheses Regarding the Initial State
        Empirical Evaluation of Models
        Summary
II.  EXPERIMENTS TESTING THE EFFECTS OF CHANGES OF WORDING
        Experiment I
           Method
           Results and Discussion
        Experiment II
           Method
           Results and Discussion
        Summary
III. EXPERIMENTS TESTING ABILITY TO GENERATE COUNTER-EXAMPLES TO INVALID SYLLOGISMS
        Experiment III
           Method
           Results
           Discussion
        Experiment IV
           Method
           Results
           Discussion
IV.  IMPLICATIONS FOR INSTRUCTIONAL PROCESS
REFERENCES
Appendix
A.   LOGICAL REASONING TEST
B.   SPATIAL REASONING TEST
C.   INVALID SYLLOGISMS
D.   ILLUSTRATIONS

LIST OF TABLES

Table
1.   Comparisons of Predictions from Three Models with Data from Chapman and Chapman Study
2.   Comparison of Observed Responses with Algebraic Model Under Assumption of Equal Distribution of Three Hierarchies Among 80% of Subjects
3.   Comparison of Observed Responses with Algebraic Model Under Assumption of Equal Distribution of Four Hierarchies Among 80% of Subjects
4.   Effects of Wording and Order on Number Correct
5.   Percent of Response of Each Kind to Each Premise Combination
6.   Comparison of Spatial and Verbal-Specific Methods of Counter-Example Production
7.   Comparison of Spatial and Verbal-General Methods of Counter-Example Production
8.   Comparison of Verbal-Specific and Verbal-General Methods of Counter-Example Production
9.   Effect of Removing Conclusion-Negation Step from Spatial and Verbal-Specific Methods of Counter-Example Production
10.  Analysis of Variance for Solution Times for Spatial and Verbal-Specific Methods

CHAPTER I

THE INITIAL COGNITIVE STATE

Anyone who has taught undergraduate mathematics or logic should be well aware that the instructional process requires much more than the instructor's mastery of the subject.
He must also take into consideration the psychological processes involved as the student's thinking and behavior undergo modification to the desired state. Even though he might not think of himself in this way, he functions in a manner not too different from that of a behavior therapist or athletic coach in that he must not only know the desired thinking and behavior but also what is required to facilitate desired change in the thinking or behavior of the student, client, or athlete.

Glaser and Resnick (1972) point out the increasing interest of experimental psychologists in the instructional process and the need for analyzing complex tasks. They contrast descriptive theories of learning with prescriptive theories of instruction and go on to describe some characteristics of the latter:

    Regardless of the kind of descriptive theory with which one works, certain characteristics of prescriptive theory for the optimization of learning seem reasonable to consider. They are: (a) a description of the state of knowledge to be achieved; (b) description of the initial state with which one begins; (c) actions which can be taken, or conditions that can be implemented to transform the initial state; (d) assessment of the transformation of the state that results from each action; and (e) evaluation of the attainment of the terminal state desired (p. 208).

The above will serve as a useful framework to which to relate results of this study, an analysis of the task of solving abstract syllogisms.

Before proceeding with this analysis, it is desirable to define a few terms used to describe syllogisms. A categorical proposition is one of the following sentence forms:

    Name                      Expression             Symbol
    Universal Affirmative     All __ are __.         A
    Universal Negative        No __ are __.          E
    Particular Affirmative    Some __ are __.        I
    Particular Negative       Some __ are not __.    O

Into the blanks can be substituted names of categories.
These categories, called terms, can be either abstract, such as letters of the alphabet, or concrete, such as specific nouns or phrases. A syllogism consists of three such propositions, the first two called premises and the third called the conclusion. For example, the following is an abstract syllogism:

    All A are B;
    Some B are C.
    Therefore, Some A are C.

A syllogism is said to be valid if any consistent replacement of its terms by specific categories that results in both premises being true will also result in the conclusion being true. For example, the above abstract syllogism is not valid because it is possible to find at least one replacement of terms that fails to meet the above requirement, namely the replacement A = cats, B = animals, and C = dogs, resulting in the concrete syllogism,

    All cats are animals. (true)
    Some animals are dogs. (true)
    Therefore, Some cats are dogs. (false)

The desired final state is the ability to reliably judge any syllogism to be valid or invalid.

The final state was fairly easy to describe, at least in behavioral terms. The description of the initial state is more difficult and will be discussed next.

Some Theoretical Hypotheses Regarding the Initial State

For the purposes of this study, abstract syllogisms, that is, those with letters rather than nouns as terms, will be used to rule out the more complex effects produced by the subject's attitude toward the content of the material. Research into these effects has been done by Wilkins (1928), Janis and Frick (1964), Luchins and Luchins (1965), Frase (1966a,b), Dillehay, Insko, and Smith (1966), Parrott (1967, 1969), and Stratton (1970). These studies are discussed by Johnson (1972).

Even using abstract syllogisms with the suggestive effect of content neutralized, untrained subjects are remarkably prone to error. In fact, Wilkins (1928), in comparing various types of material, found symbolic material to be one of the more difficult types.
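
The definition of validity given above can be checked mechanically for three-term syllogisms. The following is a minimal sketch, not part of the original study: it assumes that every term names a non-empty category (the traditional reading, an assumption made here), and it represents a proposition as a hypothetical tuple (form, subject, predicate). Because the truth of each of the four sentence forms depends only on which combinations of membership in A, B, and C are populated, it is enough to try every set of populated combinations.

```python
from itertools import product

TERMS = ("A", "B", "C")
# A "kind" of individual is a triple of booleans: membership in A, B, C.
KINDS = list(product([False, True], repeat=3))


def holds(form, subj, pred, world):
    """Truth of one categorical proposition in a world (a list of populated kinds)."""
    i, j = TERMS.index(subj), TERMS.index(pred)
    some_s_are_p = any(k[i] and k[j] for k in world)
    some_s_are_not_p = any(k[i] and not k[j] for k in world)
    if form == "A":
        return not some_s_are_not_p
    if form == "E":
        return not some_s_are_p
    if form == "I":
        return some_s_are_p
    if form == "O":
        return some_s_are_not_p
    raise ValueError(form)


def worlds():
    """Every set of populated kinds in which A, B, and C are all non-empty (assumption)."""
    for mask in product([False, True], repeat=len(KINDS)):
        world = [k for k, present in zip(KINDS, mask) if present]
        if all(any(k[i] for k in world) for i in range(3)):
            yield world


def counterexample(premises, conclusion):
    """Return a world with true premises and a false conclusion, or None."""
    for world in worlds():
        if all(holds(*p, world) for p in premises) and not holds(*conclusion, world):
            return world
    return None


def is_valid(premises, conclusion):
    return counterexample(premises, conclusion) is None


# The abstract syllogism discussed above: All A are B; Some B are C; so Some A are C.
example = ([("A", "A", "B"), ("I", "B", "C")], ("I", "A", "C"))
```

Under these assumptions, `is_valid(*example)` is false, and `counterexample(*example)` returns a world playing the same role as the cats, animals, and dogs replacement.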
Woodworth and Sells (1935) and Sells (1936) used the concept "atmosphere" in their description of the initial state. This could be thought of as a kind of verbal set or tendency to make conclusions that are similar in verbal form to the premises. For example, the following invalid syllogism could be explained by the atmosphere effect:

    Some A are B;
    Some B are C.
    Therefore, Some A are C.

In this case, since both premises are "I" statements, there would be a tendency to conclude an "I" statement. That the above syllogism is invalid can be seen by making the replacement, A = cats, B = black things, and C = dogs, resulting in the syllogism:

    Some cats are black; (true)
    Some black things are dogs. (true)
    Therefore, Some cats are dogs. (false)

Woodworth and Sells supplement their atmosphere hypothesis with two others, an interpretation of the word "some" to mean "some but not all" instead of the logically conventional interpretation "some or all," and a principle of caution which says that subjects tend to favor particular or negative conclusions over universal or positive conclusions. Their three principles can be used to account for the data quite well, but because it is not clear just how far to carry each of these principles, it is not possible to make a priori predictions independent of the data.

Begg and Denny (1969) interpret the effects of atmosphere and caution in a more explicit way as follows:

    The first principle, referring to quality, states that whenever the quality of at least one premise is negative, the quality of the most frequently accepted conclusion will be negative; when neither premise is negative, the conclusion will be affirmative. The second principle, referring to quantity, states that whenever the quantity of at least one premise is particular, the quantity of the most frequently accepted conclusion will be particular; when neither premise is particular, the conclusion will be universal.
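
The two principles quoted above amount to a small decision rule, which can be sketched as follows. This is an illustration only; the function name is hypothetical, and the rule is simply a transcription of the quality and quantity principles into code.

```python
def atmosphere_prediction(premise1, premise2):
    """Predicted conclusion form (A, E, I, or O) under the two atmosphere principles."""
    forms = {premise1, premise2}
    negative = bool(forms & {"E", "O"})    # quality: at least one negative premise
    particular = bool(forms & {"I", "O"})  # quantity: at least one particular premise
    if negative:
        return "O" if particular else "E"
    return "I" if particular else "A"


# The ten unordered premise pairs and the conclusion form predicted for each.
PAIRS = ["AA", "AE", "AI", "AO", "EE", "EI", "EO", "II", "IO", "OO"]
PREDICTIONS = {pair: atmosphere_prediction(pair[0], pair[1]) for pair in PAIRS}
```

Because the rule looks only at the set of premise forms, it ignores both the order of the premises and the arrangement of terms within them.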
Stated another way, subjects would tend to make conclusions having the same quality and quantity as the premises, with mixed types being decided in favor of negatives or particulars. To see how a conclusion would be drawn from a mixed pair of premises, consider the premise pair, AE (all-no). Both premises are universal so the conclusion would also be universal. The A statement is positive but the E statement is negative, so the conclusion would be negative. Therefore the conclusion would be the universal negative proposition, E. Applying the model to each of the ten possible premise pairs yields the following predictions: an A conclusion from an AA premise pair, an E from AE and EE, an I from AI and II, and an O from AO, EI, EO, IO, and OO. An empirical evaluation of these predictions as well as those from other hypotheses will follow after some alternative hypotheses have been discussed.

Chapman and Chapman (1959) proposed that one source of error was an invalid acceptance of the converse of an A or O statement. For example, the O statement, "Some A are not B" would be converted into "Some B are not A." For those pairs not containing A or O statements, they hypothesize a process of "probabilistic inference." They describe it this way:

    By one kind of probabilistic inference, S reasons that things that have common qualities or effects are likely to be the same kinds of things, but things that lack common qualities or effects are not likely to be the same. In the syllogism, the available common characteristic is the middle term.

This idea certainly deserves consideration, but the theory seems to have two drawbacks. (1) The authors in their discussion did not try to apply the probabilistic inference uniformly to all of the premise pairs but only to those remaining that were not successfully covered by the conversion hypothesis. It does not seem likely that subjects would switch methods from item to item in this manner.
(2) While their description of how the inference is made in specific cases sounds reasonable, it is not clear exactly how the inference would be made in other cases. In other words, it seems to lack the properties of a model capable of generating a priori predictions.

At the risk of misinterpreting the authors' intent, it seems that the essential element of their explanations is that of shared properties. The statement "All A are B" could be phrased "All A's have the property B" or, for the purpose of making a priori predictions, "A is linked to B." The order A, I, O, E would be represented by links of decreasing strength, ending with the E proposition, which would be represented by "A has no link with B." When two statements are combined, the resulting link would be as strong as the weaker of the two original links. For example, the premise pair IO would yield the conclusion O, it being represented by the weaker link. Applying this process to each of the ten premise pairs yields the following predictions: A from AA; I from AI and II; O from AO, IO, and OO; and E from AE, IE, OE, and EE. Since this model disregards order of premises or terms, it is not necessary to use the hypothesis of invalid conversion, this reinterpretation of probabilistic inference being extendable to all premise pairs.

The above hypotheses are by no means exhaustive. The following alternative hypothesis is proposed, based on some informal observations of students in their attempts to work syllogisms.

Most undergraduates, even though quite unsophisticated with formal logic, have had some experience with ordinary number algebra. The fact that abstract syllogisms use letters instead of nouns would tend to favor an "algebraic set" in untrained subjects. With this set, the subject would see the relations between terms as "degrees of equality" and perform operations appropriate to equations, such as the substitution of a quantity for its equal.
For example, consider the following pair of premises:

    All A are B;
    Some B are C.

With an algebraic set, the subject would interpret the statement "All A are B" as "A = B" and substitute A for B in the second premise, resulting in the conclusion, "Some A are C." Another example:

    Some A are not B;
    No B are C.

In this case, neither statement is strongly suggestive of equality, but this would not necessarily keep subjects from trying to treat the statements in this way. The question is which terms will be substituted for which? Will A be substituted for B in the second statement, or will C be substituted for B in the first statement? One way to resolve this ambiguity is to postulate the existence of an "equality hierarchy." This would be a given subject's perception of how similar a given statement is to an equation. It seems reasonable that all subjects would rate the A statement as being most like equality, followed by the I statement. Beyond this subjects would vary. One type of subject would see the O statement as being just another, more indirect way of saying the I statement. This would most likely result from the interpretation of "some" as meaning "some but not all" which would imply "some-not." This type of subject would have the equality hierarchy, A-I-O-E. A second type of subject could be thought of as having two hierarchies, one for degree of equality and one for inequality. Many students mistakenly treat inequalities as having the same algebraic properties as equalities, so it would not be very surprising if these subjects formed two sub-orders placing the stronger statement first, producing the sub-orders A-I and E-O, which could be combined as A-I-E-O. A third type of subject would use only the abbreviated hierarchy A-I.

To see how this model would be applied to the above premise pair, the A-I-O-E subject would consider the first statement (O) to be higher on the equality hierarchy and substitute A for B in the second statement.
The A-I-E-O subject would consider the second statement (E) to be higher and substitute C for B in the first statement. The A-I subject would respond "none of the above," as neither premise is an A or I statement. Predicted conclusions for each of the ten premise pairs for each form of the above model are given in Table 1.

Empirical Evaluation of Models

For the purpose of this evaluation, the data gathered by Chapman and Chapman seem to be the most suitable. The multiple-choice format produces more clear-cut patterns of response than the true-false or extended true-false scale used by earlier researchers. Also the fact that four possible conclusions are given in a single test item allows for complete coverage of different syllogisms with fewer items. Another factor arguing for the superiority of their data is the unusually large number of subjects used (222) with the resulting reliability of percentages in their tabulations. A summary of predictions based on each of the three models discussed above, together with the experimental results of Chapman and Chapman, is given in Table 1.

TABLE 1

COMPARISONS OF PREDICTIONS FROM THREE MODELS WITH DATA FROM CHAPMAN AND CHAPMAN STUDY

                          Probabilistic   Algebraic*        Percent observed
Premises     Atmosphere   Inference       (1)  (2)  (3)     (in order A E I O N)
AA           A            A               A    A    A       80 5 1
AE           E            E               E    E    E       2 83
AI           I            I               I    I    I       5 5 76 6
AO           O            O               O    O    O       2 11 70 12
EE           E            E               N    E    E       2 54 4 5 35
EI           O            E               E    E    E       2 55 6 18 19
EO           O            E               N    E    O       3 31 8 24 34
II           I            I               I    I    I       2 65 8 20
IO           O            O               O    O    O       1 6 12 55 26
OO           O            O               N    O    O       0 11 52 30

Percent of
responses
predicted    58%          62%                  74%

*(1)--Equality hierarchy--A-I
 (2)--Equality hierarchy--A-I-O-E
 (3)--Equality hierarchy--A-I-E-O
 N--None of these.

It should be pointed out that each of the models discussed generates predictions from premise pairs without regard to the order of terms within the statements. These variations of order, traditionally called figures, are not necessarily equivalent logically.
For example, both of the test items given below have AA premise pairs, but differ in figure, the second premise of each item being the converse of the other. The correct response to the second item is "None of the above," but all of the models discussed would predict "All A are C" as responses to both items.

    All A are B;                    All A are B;
    All B are C.                    All C are B.
    Therefore,                      Therefore,
    (1) All A are C.                (1) All A are C.
    (2) No A are C.                 (2) No A are C.
    (3) Some A are C.               (3) Some A are C.
    (4) Some A are not C.           (4) Some A are not C.
    (5) None of the above.          (5) None of the above.

There is considerable evidence that most subjects in fact ignore differences of figure (Chapman and Chapman, 1959; Johnson, 1973; Experiment I of this study). This finding permits a convenient simplification in the presentation of the data of Chapman and Chapman. The percentages for different figures of the same premise pair were averaged and listed in a single row of Table 1.

Each of the three models fits the data reasonably well. For the premise pair, EI, the atmosphere model accounts for a considerably lower percent of the responses. The probabilistic model predicts the same responses as the algebraic model with equality hierarchy, A-I-O-E. The algebraic model predicts the highest percent of the responses, but it must be kept in mind that because the model predicts more than one response for some of the items, it would necessarily have a mathematical advantage over the other models. To make the comparison fully legitimate, it would be necessary to have access to individual protocols and to use some a priori method of assigning equality hierarchies to subjects.

In the absence of these protocols, an assumption could be made about the percentage of subjects using various substitution hierarchies. For example, note in Table 1 that 80% of the responses to the premise pair AA are the predicted A and that the predicted responses for the next three rows occur at approximately the same percentage.
Note also that in some of the rows where different hierarchies predict different responses, the totals of the corresponding percentages are not far from 80%. Suppose the assumption is made that 80% of the subjects conformed to the algebraic model and that there were about an equal number of these subjects using the three equality hierarchies. This would result in the pattern of responses given in Table 2. Another possibility would be to assume the existence of another class of subjects possessing the hierarchy consisting only of the A statement and further assume that each of the resulting four hierarchies is used by 20% of the subjects. The predictions following this assumption are given in Table 3. Of course these assumptions are rather speculative and only included to illustrate possibilities.

Further evaluation of these models can come from other approaches such as a systematic examination of individual protocols or the use of experimental manipulations.

TABLE 2

COMPARISON OF OBSERVED RESPONSES WITH ALGEBRAIC MODEL UNDER ASSUMPTION OF EQUAL DISTRIBUTION OF THREE HIERARCHIES AMONG 80% OF SUBJECTS

            Percent Predicted       Percent Observed        Deviations from
Premises    A   E   I   O   N       A   E   I   O   N       Predictions (a)
AA          80  ..  ..  ..  ..      80  ..  ..  ..  ..       0
AE          ..  80  ..  ..  ..      ..  83  ..  ..  ..       3
AI          ..  ..  80  ..  ..      ..  ..  76  ..  ..       4
AO          ..  ..  ..  80  ..      ..  ..  ..  70  ..      10
EE          ..  54  ..  ..  27      ..  54  ..  ..  35       8
EI          ..  80  ..  ..  ..      ..  55  ..  ..  ..      25
EO          ..  27  ..  27  27      ..  31  ..  24  34      14
II          ..  ..  80  ..  ..      ..  ..  65  ..  ..      15
IO          ..  ..  ..  80  ..      ..  ..  ..  55  ..      25
OO          ..  ..  ..  54  27      ..  ..  ..  52  30       5

(a) Average deviation = 10.9.

TABLE 3

COMPARISON OF OBSERVED RESPONSES WITH ALGEBRAIC MODEL UNDER ASSUMPTION OF EQUAL DISTRIBUTION OF FOUR HIERARCHIES AMONG 80% OF SUBJECTS

            Percent Predicted       Percent Observed        Deviations from
Premises    A   E   I   O   N       A   E   I   O   N       Predictions (a)
AA          80  ..  ..  ..  ..      80  ..  ..  ..  ..       0
AE          ..  80  ..  ..  ..      ..  83  ..  ..  ..       3
AI          ..  ..  80  ..  ..      ..  ..  76  ..  ..       4
AO          ..  ..  ..  80  ..      ..  ..  ..  70  ..      10
EE          ..  40  ..  ..  40      ..  54  ..  ..  35      19
EI          ..  60  ..  ..  20      ..  55  ..  ..  19       6
EO          ..  20  ..  20  40      ..  31  ..  24  34      21
II          ..  ..  60  ..  20      ..  ..  65  ..  20       5
IO          ..  ..  ..  60  20      ..  ..  ..  55  26      11
OO          ..  ..  ..  40  40      ..  ..  ..  52  30      22

(a) Average deviation = 10.1.

The latter approach was used by Simpson and Johnson (1966). They constructed a test containing two scales, an atmosphere scale consisting of syllogisms thought to be subject to atmosphere errors but not to conversion errors, and a conversion scale thought to be subject to conversion errors but not to atmosphere errors. The atmosphere scale consisted of syllogisms, EE-E, II-I, OO-O, IO-I or O, and EO-E or O. They interpreted the atmosphere effect a little differently than the model presented earlier in this paper, as can be seen from the fact that they considered either an I or O response to the pair, IO, to be the result of an atmosphere effect, and similarly for the pair, EO. For the conversion scale, they used only different figures of AO-O syllogisms. Other syllogisms were recognized as possibly resulting from invalid conversion, but they were also capable of resulting from the atmosphere effect, so they were not used. For each of these scales, the correct response was "none of the above." Items of this kind are called indeterminant items. Several determinant items were added to the test to prevent a response set. The scales were fairly reliable and uncorrelated with each other, indicating the existence of two separate kinds of error. They used differential training designed specifically as anti-atmosphere and anti-conversion training. The anti-atmosphere training was quite effective in reducing errors on the atmosphere scale, but anti-conversion training was less effective in reducing errors on the conversion scale.

Johnson (1972) modified the above experiment using a reinterpretation of the atmosphere effect.
He still used the same five premise combinations as before, but used the responses predicted by the Begg and Denny interpretation of the atmosphere effect given earlier in Table 1. From these syllogisms, he formed two scales, a primary atmosphere scale, consisting of the syllogisms EE-E, II-I, and OO-O, and a supplementary atmosphere scale consisting of the syllogisms, IO-O and EO-O. His conversion scale was formed as before only from AO-O syllogisms. He also used a larger number of determinant items, so that the effect of the training on these items could be better assessed. The primary atmosphere scale was uncorrelated with the supplementary atmosphere scale, the conversion scale, or the determinant scale, the latter three scales having moderate positive correlations with each other. The anti-atmosphere training clearly reduced the number of errors on the atmosphere, supplementary atmosphere, and conversion scales; but at the same time also reduced the number of correct responses on the determinant scale. These results could be interpreted as the production of a "skeptical set" or a tendency to respond "none of the above" if in doubt. The anti-conversion training produced a similar but weaker effect.

How do these results affect the acceptability of the algebraic model discussed earlier? The only result that really causes trouble is the near zero correlation between the primary atmosphere scale and the other scales. It seems that the best way to deal with this result, which was in evidence in three experiments, is to assume that in addition to the algebraic set, there is a tendency in some subjects to reason in an even more superficial manner than that of the algebraic substitutions. These subjects look at the items and simply make responses that are verbally similar to the premises, without regard for the structure of the sentences.
As the responses predicted for this process would not differ much from those predicted by the algebraic model, probably the best approach to gathering further evidence for a verbal similarity set would be the use of individual protocols. Indirect evidence could also be obtained from such factors as an absence of scratch work and a relatively short time spent taking the test.

Summary

The process of instruction was discussed as a process by which an initial state of knowledge is transformed into a desired state by certain transforming actions. The desired state for the task of judging the validity of syllogisms was defined as a behavioral criterion. Alternative descriptions of possible initial states were made, with two of them, one termed an algebraic set and the other a verbal similarity set, being singled out, at least tentatively, as best typifying untrained subjects. The effects of two transforming actions, anti-atmosphere and anti-conversion training, were discussed; and it was concluded that the major effect was to transform whatever was the initial state to an equally erroneous state termed a skeptical set. Additional attempts to find more effective actions for inducing the desired behavior will be discussed in Chapter II.

CHAPTER II

EXPERIMENTS TESTING THE EFFECTS OF CHANGES OF WORDING

As an alternative to training or instructions, a very convenient and effective way to induce a change in a subject's reasoning process is to simply change the task itself so that the subject will reason as though he has been given appropriate training. Following are two experiments which attempt to do this by changes of wording into logically equivalent, but psychologically more informative forms.

Experiment I

It is generally recognized that the word "some" can be ambiguous to subjects untrained in formal logic, but the effect of clearing up this ambiguity has not been tested specifically.
The purpose of this experiment is to see to what extent this ambiguity contributes to errors in judging the validity of syllogisms.

Method

Subjects.--The subjects were sixteen introductory psychology students who stated that they had not had any training with syllogistic reasoning.

Materials.--A 14-item multiple-choice syllogism test was constructed by random selection of items, subject to the following constraints:

1. The word "some" must occur in at least one of the two premises. (At least one premise must be I or O.)

2. Each of the remaining seven combinations of premise pairs must be represented once in each half-test.

3. With the mixed pairs, each order is used. For example, if the pair AI occurs in the first seven items, then the pair IA must occur in the second seven items.

4. One of the four figures (patterns of assignment of the three terms to positions within the statements) is randomly chosen for each of the items, subject to the conditions that figures for the same premise combination are not repeated, and that if a determinant figure for a given premise combination exists, both a determinant and an indeterminant figure must be used for that premise combination.

5. The three determinant items are spaced through the fourteen-item test by moving, if necessary, the easy AI or IA determinant item to position #1 on the test and the other two to positions #5 and #10.

6. Letters of the alphabet are put into random order and assigned in this order, three at a time, as terms for each of the items.

After the items were selected, two forms of the test were constructed, one with the usual wording and the other with modified wording as shown below:

    Usual Wording           Modified Wording
    All A are B.            Every A is a B.
    Some A are B.           At least one A is a B.
    Some A are not B.       At least one A is not a B.
    No A are B.             No A is a B.
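
The correspondence between the two wordings is mechanical, so either form of an item can be produced from the same underlying proposition. A minimal sketch follows; the function name and template dictionaries are hypothetical and are not part of the original test materials.

```python
# Sentence templates for the four forms, taken from the wording table above.
USUAL = {
    "A": "All {s} are {p}.",
    "I": "Some {s} are {p}.",
    "O": "Some {s} are not {p}.",
    "E": "No {s} are {p}.",
}
MODIFIED = {
    "A": "Every {s} is a {p}.",
    "I": "At least one {s} is a {p}.",
    "O": "At least one {s} is not a {p}.",
    "E": "No {s} is a {p}.",
}


def render(form, subject, predicate, modified=False):
    """Render one categorical proposition in the usual or the modified wording."""
    templates = MODIFIED if modified else USUAL
    return templates[form].format(s=subject, p=predicate)


# The same premise in both wordings.
usual_premise = render("O", "A", "B")
modified_premise = render("O", "A", "B", modified=True)
```

Only the surface wording changes between the two calls; the form and the terms are the same, which is what makes the two test forms logically equivalent.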
Each form of the test contained the following instructions:

    Read each syllogism carefully, then select any conclusion which must follow logically from the premises and circle its number. Feel free to write in the empty space if you find it helpful.

The inclusion of empty space was to encourage subjects to show their work, in case some interesting protocols resulted. The departure from the usual forced-choice instruction was made so that any ambivalence in making responses could be observed. The two forms of the test are given in Appendix A.

Procedure.--Subjects were run in groups of from 6 to 10. However, each subject was randomly assigned to one of two orders on an individual basis, resulting in two treatment groups of 8 subjects each. The first treatment group received the usual form followed by the modified, and the second group, the two forms in opposite order. For each subject, both tests were scored for number correct. There was no limit on the time allowed for completion of the tests.

Results and Discussion

The means and standard deviations for each of the two test forms with each of the two orders are given in Table 4.

TABLE 4

EFFECTS OF WORDING AND ORDER ON NUMBER CORRECT

                         Original Wording    Modified Wording
    Means
      Original first           1.9                 2.1
      Modified first           5.4                 4.9
    Standard Deviations
      Original first           1.4                 1.5
      Modified first           4.3                 4.1

If the scores of the two groups are combined, counterbalancing any effects of order, the means of the original and modified wording are 3.6 and 3.5, respectively, with correlated t = .325, df = 14, and r = .90. If only the first test score for each subject is considered, then we have an independent groups comparison of the two forms of wording. The mean of the original wording is 1.9 and that of the modified wording is 4.9 (Table 4). A one-tailed t-test is statistically significant at the .05 level (t = 1.96, df = 14).

Two interpretations of the above results seem possible.
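The independent-groups comparison reported above can be rechecked from the rounded entries of Table 4. The following is a minimal sketch, assuming that an ordinary pooled-variance t statistic was used (the text does not say which formula was applied) and that each first-test group contained 8 subjects; the function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / std_error, df


# First-test scores only (Table 4): original wording given first vs. modified wording given first.
t, df = pooled_t(mean1=1.9, sd1=1.4, n1=8, mean2=4.9, sd2=4.1, n2=8)
print(f"t = {t:.2f}, df = {df}")
```

Under these assumptions the rounded table values reproduce the reported t of about 1.96 with 14 degrees of freedom, which also shows how much of the standard error comes from the large spread in the modified-first group.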
The first interpretation is that the modification in wording made little difference in subject performance, and that the rather large difference in group means was just a matter of sampling error. That is, it just happened that the three most logically sophisticated subjects were assigned to the same group.

The second interpretation is that subjects tended to form a response set after their first test which caused them to continue with the same thought process on their second test. The modified wording was easier to interpret correctly and when it was given first, subjects were more likely to use valid reasoning processes, not only on the first test but also on the second. The reverse was the case with the opposite order.

Examination of individual test papers seemed to favor the first interpretation, that is, that the modified wording was not significantly better than the original wording. For example, the subject making the highest score and contributing the most to the higher mean and variance of his group used set diagrams, and it seems unlikely that the use of this aid would be influenced by the modification.

Examination of individual protocols reveals that not only are the totals similar on both forms of the test, but also that the response patterns to individual items are similar. Thus, the results on the two forms could be combined, and response frequencies to each premise combination are given in Table 5.

TABLE 5

PERCENT OF RESPONSE OF EACH KIND TO EACH PREMISE COMBINATION

    Premises     A     E     I     O     N    I & O
    AI           9     3    58     6     8     16
    AO           3     5    25    42    14      9
    EI           0    45     5    25    16
    EO           0    34    20    22    22      0
    II           1     1    59     9    18      9
    IO           0     3    25    28    28     13
    OO           2     6    22    36    23     11

There are a few things that could be observed from this tabulation and the original test papers:

1. Even with the modified test format allowing the selection of more than one response to a given set of premises, the pattern of errors was very similar to that of earlier studies using the usual forced-choice format.

2. All of the simultaneous I and O responses shown in Table 5 were made by three of the sixteen subjects. These subjects also showed a considerable number of items in which either the I or O response had been erased in favor of the other response, indicating a degree of ambivalence in response between these two sentence forms. Other subjects also showed some inconsistency in making an O response where an I response had been made earlier in the test to the same premise combination and vice versa. All of the above suggests the possibility that some of the subjects were interpreting the I and O statements as meaning the same. The similarity of responses on the two forms of the test by a given subject indicates that a change from "some" to "at least one" is ineffective in dispelling this interpretation.

3. The format of the test allowed for an examination of scratch work done by the subjects. Six of them left their spaces blank, but ten showed considerable use of symbolic work, mostly in the form of algebraic equations and inequalities, supporting the algebraic model discussed in Chapter I. It is interesting that some of the subjects would write very infrequently and very small, suggesting a desire to try to "do it in their heads" even when supplied with plenty of writing space.

Thirteen of the sixteen subjects ranged in score from 1 to 8 out of a possible 28 (14 + 14), two subjects had intermediate scores of 14 and 17, and one subject had an almost perfect score of 26. The latter subject was the only one to use set diagrams. He claimed not to have had previous training with syllogistic reasoning but said that he had had some exposure to work with sets in mathematics courses.

Experiment II

Set diagrams are often taught as an aid to reasoning in mathematics and logic courses. Frandsen (1969) showed training with diagrams to be especially useful in overcoming difficulties of students with low spatial aptitude.
Whimbey and Ryan (1969) found ability to solve syllogisms to be related to short-term memory as measured by a modified digit test, and proposed that the use of diagrams functioned to reduce dependence on short-term memory. Johnson (1972) used diagrams to aid in his anti-atmosphere and anti-conversion training. Each of these researchers used diagrams as a part of a more complete training procedure, but the question remains: what would be the effect of simply inducing subjects to use diagrams, without additional training?

At the very least, the use of diagrams would be expected to break up either of the two cognitive sets proposed in Chapter I, allowing the possibility of a change to a correct process. Whether or not this happens for a particular subject, additional information would be gained about the initial cognitive state of the untrained subject. To test this effect of inducing subjects to use diagrams, the method of modified wording as used in Experiment I was again used, this time with a change to appropriate spatial wording.

Method

Subjects.--This experiment was run as an adjunct to that of Johnson (1973). The subjects were 20 introductory psychology students from his control group.

Materials.--An 11-item subtest of that used by Johnson was constructed by randomly selecting one of each of the 11 combinations of premises appearing on the original test. Five items were determinant and six were indeterminant. These syllogisms were reworded in logically equivalent spatial terms as follows:

    Original               Reworded
    All A are B.           A is inside B.
    No A are B.            A and B are separate.
    Some A are B.          A and B overlap.
    Some A are not B.      Some of A lies outside B.

The spatial test had the following instructions:

    The following are statements about geographical regions. For each pair of statements circle the number of any (one or more) of the conclusions listed that follow logically. You may use the margins if you find it useful to make notes or drawings of the situations described by the statements.

A copy of the spatial form of the test is given in Appendix B.

Procedure.--Each subject was given the original syllogism test and then given the spatial test. Counterbalancing the order of the two tests, while desirable, was not possible without compromising the Johnson experiment.

Results and Discussion

The mean numbers correct out of eleven were 5.3 and 5.8 for the original and spatial tests, respectively, but the difference between these means was not significant. The relationship between the performances on the two tests was interesting and is given in the scatterplot shown in Figure 1. There seem to be three types of subject. One type scored low on the original test and as low or lower on the spatial test but with a different pattern of errors. A second type scored low on the original test, but scored considerably higher on the spatial form. A third type scored high on both tests. There were no subjects getting more than 6 out of 11 of the items correct on the original test who got 6 or fewer of the items on the spatial test.

Figure 1. Number correct (out of 11) on each form of test. (Scatterplot: Spatial score on the vertical axis, Verbal score on the horizontal axis.)

If it can be assumed that subjects would have no difficulty in translating statements with traditional wording into spatial terms, then it would be expected that those subjects represented in the upper left portion of the scatterplot would have profited from the simple suggestion to make this translation into spatial terms. As for those scoring low on both tests, more would be required.

Examination of individual papers suggests that an important variable in determining whether or not a given subject will profit from a change from verbal to spatial representation is whether or not the subject uses drawings that indicate the consideration of more than one case.
A count of the number of items for which a given subject drew multiple-case drawings was made and, if non-zero, was indicated on the subject's point in Figure 1. A high number was not necessary but seems to have been sufficient for a high score. It seems reasonable that some of the subjects would be able to visualize the cases without actually making the drawings.

Assuming that lack of consideration of all the cases implied by a set of premises is an important reason for invalid conclusions, it is not clear whether subjects have difficulty generating different special cases or whether they just fail to look for these cases. This question will be considered in Chapter III.

Summary

Two attempts were made to modify the initial cognitive state as described in Chapter I. The first, a removal of the ambiguity in the word "some," did not yield conclusive results. Subjects showed little change in response with modification of wording, and the marginal difference between groups was interpreted as the result of sampling error. Both the tabulation of responses and informal examination of individual papers gave additional evidence for the verbal and algebraic models described in Chapter I, as well as the tendency for subjects to interpret I and O statements as the same even with modified wording designed to remove the ambiguity.

The second attempt at modification, the use of spatial wording, was differentially effective. Some subjects could function effectively with either kind of problem. Some subjects scored considerably higher on the spatial task, suggesting that if they were taught to represent statements spatially, they would show dramatic improvement on the original task. Other subjects misused the diagrams, often drawing conclusions on the basis of a single special case. Thus these subjects simply made different kinds of errors on the spatial form and were equally ineffective with both tasks.
In the experiment reported in the next chapter, subjects were forced to consider alternative possibilities.

CHAPTER III

EXPERIMENTS TESTING ABILITY TO GENERATE COUNTER-EXAMPLES TO INVALID SYLLOGISMS

Experiment III

In the process of solving syllogisms by diagrams, successful subjects seem to draw or visualize different positions of the sets making up the diagram that are consistent with the premises, in an effort to eliminate potential conclusions, until it is clear whether or not a conclusion must follow. For example, consider the following item:

    All A are B; Some B are C. Therefore,
        1. All A are C.
        2. No A are C.
        3. Some A are C.
        4. Some A are not C.
        5. None of the above.

In this problem, it seems likely that the subject would first draw the first premise, which would be represented by two regions. A typical example is shown by the solid lines in the diagram at right. Then he could represent the second premise by drawing or imagining alternative positions for the third region, such as are shown by the dotted regions, until it is clear that each of the four alternatives is possible, so that none of the alternatives is necessarily true. Therefore, the correct response would be "None of the above."

Actually, it is possible to be more systematic in the above process. To be certain that none of the alternatives can be deduced, it is enough to find a single example that will refute "Some A are C" and a single example to refute "Some A are not C." Refuting the I statement will automatically refute the stronger A statement, and refuting the O statement will automatically refute the stronger E statement.

To test the ability of subjects to generate counter-examples, the subject's time can be better used if instead of the multiple-choice format, test items consisting of single invalid syllogisms are used.
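The systematic search for counter-examples described above can be mimicked by brute force. The sketch below is an illustration only, not a procedure used in the study: it treats a drawing as a choice of which of the eight possible regions (combinations of membership in A, B, and C) are occupied, and it assumes, as set diagrams normally do, that each of the three terms is non-empty. The names `holds` and `counter_example` are hypothetical.

```python
from itertools import product

TERMS = "ABC"
# A region is named by the terms it lies inside, e.g. "AB" = inside A and B, outside C.
REGIONS = ["".join(t for t, bit in zip(TERMS, bits) if bit)
           for bits in product([0, 1], repeat=3)]


def holds(form, s, p, world):
    """Does the statement (form, subject s, predicate p) hold when exactly
    the regions listed in `world` are occupied?"""
    s_regions = [r for r in world if s in r]
    if form == "A":                      # All s are p
        return all(p in r for r in s_regions)
    if form == "E":                      # No s are p
        return all(p not in r for r in s_regions)
    if form == "I":                      # Some s are p
        return any(p in r for r in s_regions)
    if form == "O":                      # Some s are not p
        return any(p not in r for r in s_regions)
    raise ValueError(form)


def counter_example(premises, conclusion):
    """Return occupied regions that satisfy the premises but not the conclusion, or None."""
    for mask in product([0, 1], repeat=len(REGIONS)):
        world = [r for r, bit in zip(REGIONS, mask) if bit]
        # Assumption: every term must be non-empty, as in an ordinary set diagram.
        if not all(any(t in r for r in world) for t in TERMS):
            continue
        if all(holds(*pr, world) for pr in premises) and not holds(*conclusion, world):
            return world
    return None


premises = [("A", "A", "B"), ("I", "B", "C")]       # All A are B; Some B are C
print(counter_example(premises, ("I", "A", "C")))   # refutes "Some A are C"
print(counter_example(premises, ("O", "A", "C")))   # refutes "Some A are not C"
```

Because both searches succeed for the sample item, none of the four listed conclusions follows, in line with the "None of the above" answer; for a valid syllogism the function returns None.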
If the subject can find counter-examples for a randomly selected list of invalid syllogisms, it is assumed that he could find counter-examples for each of the invalid conclusions given in a list of multiple-choice items.

While set diagrams are in wide use, other methods are possible. Two of these are the use of specific nouns and the use of general verbal descriptions. Each of these methods is illustrated below:

    All A are B; Some B are C. Therefore, Some A are C.

Solutions:

    Spatial
        [set diagram]

    Verbal-Specific
        All dogs are animals; Some animals are cats. Therefore, Some dogs are cats.

    Verbal-General¹
        All A are B, but not all B have to be A. The C's could be just those B's that are not A's, refuting "Some A are C."

    ¹This method was derived from the protocols of a mathematical psychologist, using his customary method.

It would seem reasonable to predict that the spatial approach would give the best results with most subjects. The verbal-specific method has the advantage of concreteness and direct relation to the sentence forms but is not conducive to consideration of general structures. This method would then demand a considerable amount of divergent production, requiring more trial-and-error than would be expected with the other two methods. The verbal-general method has the advantage of inducing consideration of general structures but lacks the concreteness of the other two methods. It also seems likely that it would place a greater demand on short-term memory. Moreover, this method might be more easily misunderstood with limited instruction. The spatial method has both generality and concreteness but requires a translation into spatial terms. However, in Experiment II there was little evidence of difficulty in correctly making the translation from spatial wording to diagrams; and since there is little reason to believe that translation from the original wording to spatial wording would be difficult, it is predicted that this method of generating examples would be the easiest for untrained subjects to learn and apply.

First a pilot study was done to determine the number of items that subjects could handle in the allotted span of time. The number of items was relatively small. In fact, the number of counter-examples that the subjects could generate was much smaller than the number of syllogisms that they could answer in the same length of time. This was true despite the fact that a single counter-example is only part of what is required for the correct solution of the usual multiple-choice item, that is, the rejection of one particular incorrect answer. Actually this is not so surprising when one considers that in this task it is impossible for subjects to use the fast but superficial algebraic substitution or verbal similarity processes. They are effectively forced to come to grips with specific cases. Since this is what the logician must do, performance on this task should reflect more clearly the subject's potential to profit from instructions to consider special cases and to eliminate possibilities.

Method

Subjects.--The subjects were 42 introductory psychology students.

Materials.--Ten invalid syllogisms were randomly selected, subject to the following criteria:

1. The ten kinds of premise pairs that can be formed from A, E, I, or O statements were ordered randomly.
2. For each premise pair, the conclusion was chosen to be either an I or O statement. Five I and five O conclusions were put in random order and assigned to the ten premise pairs.
3. The resulting triples of statement types were randomly assigned one of the four figures. If the resulting syllogism was valid, then the conclusion was strengthened from I to A or O to E, whichever applied. If the syllogism was still valid, then another figure was selected and the process repeated.
   (As it turned out, it was not necessary to apply this rule.) For most items, the premises could be represented by a number of different drawings, requiring some searching by the subject. However, for one item there was only one drawing possible, and a different figure was selected.
4. The alphabet was randomly ordered and assigned to terms of the syllogisms.

The ten items were divided into three parts. The first two items were sample problems, the next three were to be scored but not timed, and the last five were to be scored and timed. Some reordering was done to produce a "better" test. That is, the first item seemed to be particularly difficult and was shifted to the end of the list. The second and third items seemed to make particularly good illustrative problems, so they were moved to the top of the list. The item in the ninth position seemed particularly easy, so it was moved to the third position, just after the sample problems.

Three forms of the test were constructed, one for each of the three methods described earlier. The forms were identical except for the explanations of the sample solutions. (The three methods of counter-example production were judged to be sufficiently different to prevent undue interference from one task to the other. The pilot study showed the instructions to be effective in causing the required shift in the subject's method.) The timed portions were separate from the untimed portions. These tests are given in Appendix C.

Procedure.--Subjects were tested in groups of from 6 to 15. However, each subject was randomly assigned to one of the six treatment conditions on an individual basis. After a brief discussion of the general purpose and design of the experiment at the beginning of the session, the untimed portion of the test was distributed to the subjects. When a subject finished with this portion, he was given the second portion with the time written on the sheet.
When the subject returned, a second time was written on the sheet and then the process was repeated with his second method. Each of the eight test items was scored as correct or incorrect and the number correct was recorded.

Results

For each pair of methods, there were two relevant groups. One of the groups got the two methods in the order AB, the other BA. The results for each such pair of groups were analyzed separately. Table 6 gives the results of comparing subjects' scores for spatial counter-examples with those for verbal-specific counter-examples. Table 7 gives the results of comparing subjects' scores for spatial counter-examples with those for verbal-general counter-examples. Table 8 gives the results of comparing subjects' scores for verbal-specific counter-examples with those for verbal-general counter-examples.

TABLE 6

COMPARISON OF SPATIAL AND VERBAL-SPECIFIC METHODS OF COUNTER-EXAMPLE PRODUCTION

Mean Correct (out of four)

    Groups                      Spatial    Verbal-Specific    Mean
    I Conclusions
      Spatial first               3.3           2.9           3.1
      Verbal-specific first       2.0           2.5           2.3
      Mean                        2.6           2.7           2.7
    O Conclusions
      Spatial first               1.7           0.7           1.2
      Verbal-specific first       0.4           1.0           0.7
      Mean                        1.1           0.8           1.0

Analysis of Variance

    Source               SS         df     MS      F       p      eta²
    Groups (G)           5.785       1     5.8     1.1     ...    ...
    Ss/G                62.429      12     5.2     ...     ...    ...
    Methods (M)          0.071       1     0.07    ...     ...    ...
    G X M                5.787       1     5.8     4.3     ...    ...
    M X Ss/G            16.142      12     1.3     ...     ...    ...
    Conclusions (C)     41.143       1    41.1    39.3     .001   .26
    G X C                0.286       1     0.3     ...     ...    ...
    C X Ss/G            12.571      12     1.0     ...     ...    ...
    M X C                0.286       1     0.3     ...     ...    ...
    G X M X C            0.285       1     0.3     ...     ...    ...
    M X C X Ss/G        15.429      12     1.3     ...     ...    ...
    Total              160.214      55     ...     ...     ...    .26

TABLE 7

COMPARISON OF SPATIAL AND VERBAL-GENERAL METHODS OF COUNTER-EXAMPLE PRODUCTION

Mean Correct (out of four)

    Groups                      Spatial    Verbal-General    Mean
    I Conclusions
      Spatial first               2.4           ...           1.6
      Verbal-general first        2.6           ...           1.9
      Mean                        2.5           1.0           1.8
    O Conclusions
      Spatial first               0.9           1.3           1.1
      Verbal-general first        ...           0.7           ...
      Mean                        0.8           1.0           0.9

Analysis of Variance

    Source               SS         df     MS      F       p      eta²
    Groups (G)           0.055       1     0.05    ...     ...    ...
    Ss/G                44.389      12     3.70    ...     ...    ...
    Methods (M)          4.500       1     4.5     2.1     ...    ...
    G X M                0.056       1     0.06    ...     ...    ...
    M X Ss/G            25.944      12     2.2     ...     ...    ...
    Conclusions (C)      8.000       1     8.0     5.3     .05    .06
    G X C                0.889       1     0.9     ...     ...    ...
    C X Ss/G            17.611      12     1.5     ...     ...    ...
    M X C                7.889       1     7.9     5.3     .05    .06
    G X M X C            ...       ...     ...     ...     ...    ...
    M X C X Ss/G        18.278      12     1.5     ...     ...    ...
    Total              128.033      55     ...     ...     ...    .12

TABLE 8

COMPARISON OF VERBAL-SPECIFIC AND VERBAL-GENERAL METHODS OF COUNTER-EXAMPLE PRODUCTION

Mean Correct

    Groups                      Verbal-Specific    Verbal-General    Mean
    I Conclusions
      Verbal-specific first          1.7                0.1          0.9
      Verbal-general first           2.3                0.4          1.4
      Mean                           2.0                0.3          1.1
    O Conclusions
      Verbal-specific first          1.0                0.3          0.6
      Verbal-general first           0.3                0.3          0.3
      Mean                           0.6                0.3          0.5

Analysis of Variance

    Source               SS         df     MS      F       p      eta²
    Groups (G)           0.018       1     0.02    ...     ...    ...
    Ss/G                12.571      12     1.0     ...     ...    ...
    Methods (M)         15.018       1    15.0    12.4     .005   .21
    G X M                0.160       1     0.16    ...     ...    ...
    M X Ss/G            14.572      12     1.214   ...     ...    ...
    Conclusions (C)      6.446       1     6.4    12.6     .005   .09
    G X C                2.161       1     2.2     4.2     ...    ...
    C X Ss/G             6.143      12     0.51    ...     ...    ...
    M X C                6.268       1     6.3    11.7     .01    .09
    G X M X C            0.876       1     0.88    ...     ...    ...
    M X C X Ss/G         6.428      12     0.54    ...     ...    ...
    Total               70.661      55     ...     ...     ...    .39

To summarize the statistically significant results, scores for the I conclusions were significantly higher than those for the O conclusions. This was true for each of the three analyses. Since this variable and its interactions with method accounted for so much of the significant variance, comparisons between pairs of methods were carried out for I and O conclusions separately. None of the group-by-method interactions (order effects) were significant, so the two orders were combined for each of the three pairs of methods. For the I conclusions, both the spatial and verbal-specific methods had significantly higher means than the verbal-general method (t = 2.94 and t = 4.17, df = 13) but did not differ significantly from each other. Means for the O conclusions were all low and did not differ significantly.
This "floor" effect seems to have been enough to mask the effect of the method variable and produce the significant method-by-conclusion interactions.

Discussion

Subjects evidently found the verbal-general method considerably more difficult than either of the other two methods. This does not mean that this method could not be potentially useful, only that the small amount of training used in this experiment was not sufficient to help the type of student used. Perhaps the results would have been different had either more logically sophisticated subjects or more extensive training been used. However, the emphasis in this study is on the effects of minimal alterations in the cognitive state of untrained subjects, so further investigation will use only the spatial and verbal-specific methods.

It is also evident that subjects had considerably more difficulty with O conclusions than with I conclusions. In negating the O statements, many subjects used I or E statements instead of the A statement, which is the correct negation of the O statement. The pervasiveness of errors traceable to incorrect negation of statements suggests an experimental manipulation that should produce significant reduction of errors and, in addition, allow a more sensitive comparison of the remaining two methods of generating examples. The generation of counter-examples can be considered to be a two-step process, the first to negate the conclusion, and the second to generate an appropriate example which is consistent with both premises and the negation of the conclusion. The effect of eliminating this first step was tested in the next experiment.

Experiment IV

This experiment has two purposes. One is to demonstrate a significant increase in correct solutions when subjects are given what are essentially the same problems as in Experiment III, but with the step of negating the conclusions removed.
This would give additional verification of the hypothesis that this step is a major source of difficulty. The second purpose is to demonstrate what was hypothesized but not verified in Experiment III, that is, that the spatial method of generating examples is superior to the verbal-specific method.

Method

Subjects.--Twenty introductory psychology students were assigned randomly, ten to each of two groups.

Materials.--Each of the ten syllogisms used in Experiment III was modified by negating the conclusion. The instructions were also modified in keeping with the change in items. For example, instead of asking subjects to give a counter-example for the invalid syllogism,

    All A are B; Some B are C. Therefore, Some A are C

they were asked to give an example which satisfies each of the three statements,

    All A are B; Some B are C; No A are C.

Note that the modified item is different only in that the third statement has been replaced by its negation. The tests are given in Appendix D.

Procedure.--This was the same as in Experiment III except that only two experimental groups were used, one for each order of the two methods to be compared.

Results

To test the effect of removing the conclusion-negation component from the task, the initial test scores of each of the twenty subjects in this experiment were compared with twenty corresponding scores selected randomly from Experiment III. This permitted a two-by-two independent groups comparison, and the results are given in Table 9. Scores on the reduced task were significantly higher than those on the original task.

To compare the two methods for the reduced task, solution times for the final five problems were used. In an effort to separate the thinking and writing components of the solution process, writing times were compared for three subjects by having them copy randomly selected answer sheets. The mean difference in writing time was 2:19 min. with 7 sec. standard error of the mean.
A time, 2:40, which is three standard errors above the mean, was subtracted from the times for the verbal-specific method. This would seem to be a liberal correction, especially considering that it would be reasonable to assume that some thinking could take place during writing. With this adjustment, the mean time for the spatial method was 5.1 min. and the corrected mean time for the verbal-specific method was 10.0 min. The analysis of variance for this comparison is given in Table 10.

TABLE 9

EFFECT OF REMOVING CONCLUSION-NEGATION STEP FROM SPATIAL AND VERBAL-SPECIFIC METHODS OF COUNTER-EXAMPLE PRODUCTION

Mean Correct (out of eight)

    Task                       Spatial    Verbal-Specific
    Negation required            4.9           3.3
    Negation not required        7.1           6.5

Analysis of Variance

    Source           SS       df     MS       F       p      eta²
    Tasks (T)        72.9      1    72.9    19.5     .001    .33
    Methods (M)      12.1      1    12.1     3.24    ...     ...
    T X M             2.5      1     2.5     0.67    ...     ...
    Error           134.4     36     3.73    ...     ...     ...
    Total           221.9     39     ...     ...     ...     ...

TABLE 10

ANALYSIS OF VARIANCE FOR SOLUTION TIMES FOR SPATIAL AND VERBAL-SPECIFIC METHODS

    Source           SS        df     MS       F       p      eta²
    Groups (G)       10.1       1    10.1     ...     ...     ...
    Ss/G            261.1      18    14.5     ...     ...     ...
    Methods (M)     245.8       1   245.8    19.1     .001    .32
    G X M            27.5       1    27.5     2.1     ...     ...
    M X Ss/G        232.75     18    12.9     ...     ...     ...
    Total           777.3      39     ...     ...     ...     ...

Discussion

The results of this experiment, together with those of Experiment III, indicate that for the task of generating counter-examples to invalid syllogisms, a major source of difficulty is the first step of forming the negations of the conclusions. For the second step of generating the examples, the spatial method takes less time than the verbal-specific method. Both are more accurate than the verbal-general method, at least for untrained subjects. Implications for instruction in producing counter-examples are that students must be given some training in forming statement negations, and that the conventional tendency to use set diagrams in the instruction of syllogisms has experimental backing.
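The F ratios and the proportion of variance for the task effect in Table 9 can be rechecked from the sums of squares alone. The sketch below assumes that each F is the effect mean square divided by the error mean square and that the last column of the table is eta squared, computed as the effect sum of squares over the total sum of squares; the function name `anova_check` is hypothetical.

```python
def anova_check(effects, ss_error, df_error):
    """Recompute F and eta squared for effects that each have 1 degree of freedom.

    effects: mapping of effect name -> sum of squares (taken from Table 9).
    """
    ss_total = sum(effects.values()) + ss_error
    ms_error = ss_error / df_error
    summary = {}
    for name, ss in effects.items():
        summary[name] = {
            "F": ss / ms_error,        # MS_effect = SS_effect because df = 1
            "eta2": ss / ss_total,     # proportion of total variation
        }
    return ss_total, summary


table9 = {"Tasks (T)": 72.9, "Methods (M)": 12.1, "T X M": 2.5}
ss_total, summary = anova_check(table9, ss_error=134.4, df_error=36)

print(f"Total SS = {ss_total:.1f}")
for name, stats in summary.items():
    print(f"{name:12s} F = {stats['F']:.2f}   eta2 = {stats['eta2']:.2f}")
```

Under these assumptions the recomputed values agree with the table entries (F of about 19.5, 3.24, and 0.67, and eta squared of about .33 for tasks), which supports reading the final column as eta squared.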
CHAPTER IV

IMPLICATIONS FOR INSTRUCTIONAL PROCESS

In Chapter I, descriptions of the initial cognitive state were proposed that were relevant to subject behavior when misinterpreting traditional syllogism items, but these processes are easily changed and seem to have little to do with what happens when attempts are made to induce subjects to use an effective process. As a result of the experiments of this study, the following elements of the initial cognitive state are proposed as more relevant to the instructional process:

1. Ability to translate statements into diagrams and to work with these diagrams, when shown by a couple of examples.
2. Tendency to overlook alternatives.
3. Ability to generate specific examples when tendency to overlook alternatives is corrected. Diagrams are not necessary, but facilitate this process.
4. Tendency to negate propositions incorrectly, especially the O statement.

Using the above as descriptive of untrained subjects, it is now possible to propose a simple training procedure that could be tested experimentally. The subject could be shown a listing of the diagrams representing the five possible relations between two categories and then the A, E, I, and O statements could be defined as collections of these relations by simply indicating them as in Figure 2.

Figure 2. Set of diagrams for use in instruction. (Diagrams of the relations between two categories, grouped under statement labels including "All A are B," "Some A are not B," and "Some A are B.")

This procedure should induce consideration of alternatives by the fact that they are explicitly listed. The negation of a statement could be explained as simply all other cases. For example, the drawings make clear the complementary nature of the A and O statements and of the E and I statements.

An experiment testing the effect of showing subjects the information given in Figure 2 would complement a study done by Ceraso and Provitera (1971). They used syllogisms constructed from simpler statements corresponding to a single drawing of Figure 2.
For example, they used the statement, "Whenever I have a yellow block it is striped, but there are some striped blocks which are not yellow." They presented the premises to their subjects by holding up objects and orally pointing out the information given in the premises. This task was significantly easier for subjects than the corresponding task using traditional statements. Unfortunately, it is not possible to tell from these results to what extent the better performance on the modified syllogisms resulted from the fact that the meaning of the statements was made more explicit or from the fact that they were logically simpler. If the example mentioned above is put into abstract form, the statement "All A are B, but some B are not A" was compared with the traditional statement "All A are B," but the latter statement is logically equivalent to the statement "All A are B, but some B might not be A," a logically more complex statement. A replication of the Ceraso and Provitera study using the latter forms should produce an interesting supplement to the study. This replication in verbal form could then be compared with the proposed study using diagrams. It would be predicted that the diagram method would be superior in that not only would the alternatives be made explicit, but in handling these alternatives it would not place as much burden on the short-term memory as the verbal method.

The elements of the cognitive state listed above were arrived at by a sequence of modifications of the original task of judging the validity of a syllogism. At first modifications of wording were used. Then it was necessary to look at the subtask of counter-example production and finally at an even smaller subtask of this subtask. From this analysis, inferences were drawn about what should be effective instruction for the original task, but the process is not complete until some training such as that mentioned earlier in this chapter has actually been tried.
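The proposal to define each statement as a collection of the five possible relations between two categories, and to explain negation as "all other cases," can be made concrete with sets. The sketch below is an illustration under stated assumptions: the relation labels are my own names for the five drawings (the drawings of Figure 2 themselves are not reproduced here), both categories are assumed to be non-empty, and the function name `negation` is hypothetical.

```python
# Assumed labels for the five possible relations between two non-empty categories A and B.
RELATIONS = frozenset({
    "identical",    # A and B coincide
    "A inside B",   # A is a proper part of B
    "B inside A",   # B is a proper part of A
    "overlap",      # A and B partly overlap
    "separate",     # A and B share nothing
})

# Each statement form as the collection of relations in which it is true.
STATEMENTS = {
    "A": frozenset({"identical", "A inside B"}),                           # All A are B
    "E": frozenset({"separate"}),                                          # No A are B
    "I": frozenset({"identical", "A inside B", "B inside A", "overlap"}),  # Some A are B
    "O": frozenset({"B inside A", "overlap", "separate"}),                 # Some A are not B
}


def negation(form):
    """Return the statement form that is true in exactly the remaining relations."""
    remaining = RELATIONS - STATEMENTS[form]
    for other, cases in STATEMENTS.items():
        if cases == remaining:
            return other
    return None


for form in "AEIO":
    print(form, "is negated by", negation(form))
```

With this encoding the complement of the O cases is exactly the A cases, and the complement of the I cases is exactly the E case, which is the complementary pairing that the proposed training procedure relies on.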
REFERENCES

Begg, I., and Denny, J. P. Empirical reconciliation of atmosphere and conversion interpretations of syllogistic reasoning errors. Journal of Experimental Psychology, 1969, 81, 351-354.

Ceraso, J., and Provitera, A. Sources of error in syllogistic reasoning. Cognitive Psychology, 1971, 2, 400-410.

Chapman, L. J., and Chapman, J. P. Atmosphere effect reexamined. Journal of Experimental Psychology, 1959, 58, 220-226.

Dillehay, R. C., Insko, C. A., and Smith, M. B. Logical consistency and attitude change. Journal of Personality and Social Psychology, 1966, 3, 646-654.

Frandsen, A. N., and Holder, J. R. Spatial visualization in solving complex verbal problems. Journal of Psychology, 1969, 13, 229-233.

Frase, L. T. Validity judgment of syllogisms in relation to two sets of terms. Journal of Educational Psychology,

Frase, L. T. Belief, incongruity, and syllogistic reasoning. Psychological Reports, 1966, 18, 982. (b)

Glaser, R., and Resnick, L. B. Instructional psychology. Annual Review of Psychology, 1972, 23, 207-276.

Janis, I. L., and Frick, F. The relationship between attitudes toward conclusions and errors in judging logical validity of syllogisms. Journal of Experimental Psychology, 1943, 33, 73-77.

Johnson, D. M. Systematic introduction to the psychology of thinking. New York: Harper and Row, 1972, 236-240.

Johnson, D. M. 1973 (unpublished to date).

Luchins, A. S., and Luchins, E. H. Reactions to inconsistencies: Phenomenal versus logical contradictions. Journal of General Psychology, 1965, 12, 47-65.

Parrott, G. L. The effects of instructions, transfer, and content on reasoning time. Unpublished Ph.D. thesis, Michigan State University, 1969.

Parrott, G. L. The effects of premise content on accuracy and solution time in syllogistic reasoning. Unpublished Master's thesis, Michigan State University, 1967.

Sells, S. B. The atmosphere effect: An experimental study of reasoning. Archives of Psychology, 1936, No. 200.
Simpson, M. E., and Johnson, D. M. Atmosphere and conversion errors in syllogistic reasoning. Journal of Experimental Psychology, 1966, 72, 197-200.

Stratton, R. P. Atmosphere and conversion errors in syllogistic reasoning with contextual material and the effect of differential training. Unpublished Master's thesis, Michigan State University, 1970.

Whimbey, A. E., and Ryan, S. F. Role of short term memory and training in solving reasoning problems mentally. Journal of Educational Psychology, 1969, 60, 361-364.

Wilkins, M. C. The effect of changed material on ability to do formal syllogistic reasoning. Archives of Psychology, 1928, No. 102.

Woodworth, R. S., and Sells, S. B. An atmosphere effect in formal syllogistic reasoning. Journal of Experimental Psychology, 1935, 18, 451-460.

APPENDICES

APPENDIX A

LOGICAL REASONING TEST (Test 1)

Name

Logical Reasoning Test (Test 1)

Read each syllogism carefully, then select any conclusion which must follow logically from the premises and circle its number. Feel free to write in the empty space if you find it helpful.

1. All V are X; Some V are Z. Therefore,
   1. All Z are X.
   2. No Z are X.
   3. Some Z are X.
   4. Some Z are not X.
   5. None of the above.

2. Some K are not E; All S are E. Therefore,
   1. All S are K.
   2. No S are K.
   3. Some S are K.
   4. Some S are not K.
   5. None of the above.

3. No I are P; Some U are not I. Therefore,
   1. All U are P.
   2. No U are P.
   3. Some U are P.
   4. Some U are not P.
   5. None of the above.

4. Some B are T; Some M are B. Therefore,
   1. All M are T.
   2. No M are T.
   3. Some M are T.
   4. Some M are not T.
   5. None of the above.

5. No A are G; Some G are W. Therefore,
   1. All W are A.
   2. No W are A.
   3. Some W are A.
   4. Some W are not A.
   5. None of the above.

6. Some N are Q; Some N are J. Therefore,
   1. All J are Q.
   2. No J are Q.
   3. Some J are Q.
   4. Some J are not Q.
   5. None of the above.

7. Some H are not D; Some D are not O. Therefore,
   1. All H are O.
   2. No H are O.
   3. Some H are O.
   4. Some H are not O.
   5. None of the above.

8. Some R are F; No C are F. Therefore,
   1. All C are R.
   2. No C are R.
   3. Some C are R.
   4. Some C are not R.
   5. None of the above.

9. Some L are Y; All V are Y. Therefore,
   1. All V are L.
   2. No V are L.
   3. Some V are L.
   4. Some V are not L.
   5. None of the above.

10. All X are Z; Some K are not Z. Therefore,
   1. All K are X.
   2. No K are X.
   3. Some K are X.
   4. Some K are not X.
   5. None of the above.

11. Some E are not S; Some E are I. Therefore,
   1. All I are S.
   2. No I are S.
   3. Some I are S.
   4. Some I are not S.
   5. None of the above.

12. Some P are U; Some U are B. Therefore,
   1. All B are P.
   2. No B are P.
   3. Some B are P.
   4. Some B are not P.
   5. None of the above.

13. Some T are not M; Some A are not T. Therefore,
   1. All A are M.
   2. No A are M.
   3. Some A are M.
   4. Some A are not M.
   5. None of the above.

14. Some G are not W; No N are W. Therefore,
   1. All N are G.
   2. No N are G.
   3. Some N are G.
   4. Some N are not G.
   5. None of the above.

Name

Logical Reasoning Test (Test 2)

Read each syllogism carefully, then select any conclusion which must follow logically from the premises and circle its number. Feel free to write in the empty space if you find it helpful.

1. Every V is an X; At least one V is a Z. Therefore,
   1. Every Z is an X.
   2. No Z is an X.
   3. At least one Z is an X.
   4. At least one Z is not an X.
   5. None of the above.

2. At least one K is not an E; Every S is an E. Therefore,
   1. Every S is a K.
   2. No S is a K.
   3. At least one S is a K.
   4. At least one S is not a K.
   5. None of the above.

3. No I is a P; At least one U is not an I. Therefore,
   1. Every U is a P.
   2. No U is a P.
   3. At least one U is a P.
   4. At least one U is not a P.
   5. None of the above.

4. At least one B is a T; At least one M is a B. Therefore,
   1. Every M is a T.
   2. No M is a T.
   3. At least one M is a T.
   4. At least one M is not a T.
   5. None of the above.

5. No A is a G. At least one G is a W. Therefore,
   1. Every W is an A.
   2. No W is an A.
   3. At least one W is an A.
   4. At least one W is not an A.
   5. None of the above.

6. At least one N is a Q. At least one N is a J. Therefore,
   1. Every J is a Q.
   2. No J is a Q.
   3. At least one J is a Q.
   4. At least one J is not a Q.
   5. None of the above.

7. At least one H is not a D; At least one D is not an O. Therefore,
   1. Every H is an O.
   2. No H is an O.
   3. At least one H is an O.
   4. At least one H is not an O.
   5. None of the above.

8. At least one R is an F; No C is an F. Therefore,
   1. Every C is an R.
   2. No C is an R.
   3. At least one C is an R.
   4. At least one C is not an R.
   5. None of the above.

9. At least one L is a Y. Every V is a Y. Therefore,
   1. Every V is an L.
   2. No V is an L.
   3. At least one V is an L.
   4. At least one V is not an L.
   5. None of the above.

10. Every X is a Z; At least one K is not a Z. Therefore,
   1. Every K is an X.
   2. No K is an X.
   3. At least one K is an X.
   4. At least one K is not an X.
   5. None of the above.

11. At least one E is not an S; At least one E is an I. Therefore,
   1. Every I is an S.
   2. No I is an S.
   3. At least one I is an S.
   4. At least one I is not an S.
   5. None of the above.

12. At least one P is a U; At least one U is a B. Therefore,
   1. Every B is a P.
   2. No B is a P.
   3. At least one B is a P.
   4. At least one B is not a P.
   5. None of the above.

13. At least one T is not an M; At least one A is not a T. Therefore,
   1. Every A is an M.
   2. No A is an M.
   3. At least one A is an M.
   4. At least one A is not an M.
   5. None of the above.

14. At least one G is not a W; No N is a W. Therefore,
   1. Every N is a G.
   2. No N is a G.
   3. At least one N is a G.
   4. At least one N is not a G.
   5. None of the above.

APPENDIX B

SPATIAL REASONING TEST (Brief Form)

Name

SPATIAL REASONING TEST (Brief Form)

The following are statements about geographical regions. For each pair of statements, circle the number of any (one or more) of the conclusions listed that follow logically. You may use the margins if you find it useful to make notes or drawings of the situations described by the statements.

1. M lies inside N; N lies inside O. Therefore,
   1. M lies inside O.
   2. M and O are separate.
   3. M and O overlap.
   4. Some of M lies outside O.
   5. None of the above.

2. T and X are separate; X and B are separate. Therefore,
   1. T lies inside B.
   2. T and B are separate.
   3. T and B overlap.
   4. Some of T lies outside B.
   5. None of the above.

3. A lies inside B; Some of B lies outside C. Therefore,
   1. A lies inside C.
   2. A and C are separate.
   3. A and C overlap.
   4. Some of A lies outside C.
   5. None of the above.

4. P lies inside Q; Q and R are separate. Therefore,
   1. P lies inside R.
   2. P and R are separate.
   3. P and R overlap.
   4. Some of P lies outside R.
   5. None of the above.

5. V and W overlap; U and V overlap. Therefore,
   1. U lies inside W.
   2. U and W are separate.
   3. U and W overlap.
   4. Some of U lies outside W.
   5. None of the above.

6. X and Y overlap; Some of Y lies outside Z. Therefore,
   1. X lies inside Z.
   2. X and Z are separate.
   3. X and Z overlap.
   4. Some of X lies outside Z.
   5. None of the above.

7. V lies inside T; Some of S lies outside T. Therefore,
   1. S lies inside V.
   2. S and V are separate.
   3. S and V overlap.
   4. Some of S lies outside V.
   5. None of the above.

8. Some of J lies outside K; Some of K lies outside L. Therefore,
   1. J lies inside L.
   2. J and L are separate.
   3. J and L overlap.
   4. Some of J lies outside L.
   5. None of the above.

9. K and L are separate; Some of L lies outside M. Therefore,
   1. K lies inside M.
   2. K and M are separate.
   3. K and M overlap.
   4. Some of K lies outside M.
   5. None of the above.

10. S and M are separate; M and P overlap. Therefore,
   1. S lies inside P.
   2. S and P are separate.
   3. S and P overlap.
   4. Some of S lies outside P.
   5. None of the above.

11. S and T overlap; T lies inside V. Therefore,
   1. S lies inside V.
   2. S and V are separate.
   3. S and V overlap.
   4. Some of S lies outside V.
   5. None of the above.
APPENDIX C

INVALID SYLLOGISMS

Name

INVALID SYLLOGISMS

A syllogism consists of two statements called premises, which are assumed to be true, together with a third statement, called the conclusion, which may or may not logically follow from the premises. For example, consider the following syllogism:

1. No K are V;
   Some K are R.
   Therefore,
   Some R are V.

[Diagram not reproduced.]

Statements of this type can be represented by diagrams as shown above. Note that regions K and V are separate, representing "No K are V," and that regions K and R have points in common, representing "Some K are R," but that "Some R are V" is false as indicated by their separation in the diagram. Therefore, the conclusion does not necessarily follow from the two premises.

Let us look at another example:

2. All C are Q;
   Some C are not S.
   Therefore,
   Some S are not Q.

Solution: [Diagram not reproduced.]

Note that the word "some" is given the interpretation "some or all" in the second premise. This wider interpretation is customary in formal logic.

As an aid to finding an appropriate counter-example, it is often useful to form the negation of the conclusion ahead of time. The above problem would then reduce to finding an example for which each of the following statements is true: All C are Q; Some C are not S; All S are Q (negation of "some S are not Q").

Go to next page.

Name

INVALID SYLLOGISMS

A syllogism consists of two statements called premises, which are assumed to be true, together with a third statement, called the conclusion, which may or may not logically follow from the premises. For example, consider the following syllogism:

1. No K are V;
   Some K are R.
   Therefore,
   Some R are V.

Now consider the following counter-example:

   No animals are plants;
   Some animals are cats.
   Therefore,
   Some cats are plants.

In this example, the two premises are true, but the conclusion is clearly false, showing that in general the conclusion of the above syllogism does not necessarily follow from the two premises.
Let us look at another example:

                              Solution
2. All C are Q;               All Mexicans are North Americans;
   Some C are not S.          Some Mexicans are not Canadians.
   Therefore,                 Therefore,
   Some S are not Q.          Some Canadians are not North Americans.

Note that the word "some" is given the interpretation "some or all" in the second premise. This wider interpretation is customary in formal logic.

As an aid to finding an appropriate counter-example, it is often useful to form the negation of the conclusion ahead of time. The above problem would then reduce to finding an example for which each of the following statements is true: All C are Q; Some C are not S; All S are Q (negation of "some S are not Q").

Go to next page.

Name

INVALID SYLLOGISMS

A syllogism consists of two statements called premises, which are assumed to be true, together with a third statement, called the conclusion, which may or may not logically follow from the premises. For example, consider the following syllogism:

1. No K are V;
   Some K are R.
   Therefore,
   Some R are V.

Some K are R, and in particular, K could be the same as R (K = R). Then in this case, "No K are V" would imply "No R are V," refuting the conclusion given.

Let us look at another example:

2. All C are Q;
   Some C are not S.
   Therefore,
   Some S are not Q.

Solution: All C's are Q's but not all Q's have to be C's. Then the S's could be just those Q's that are not C's (Q = C + S). Then the two premises would be true, but the conclusion false.

Note that the word "some" is given the interpretation "some or all" in the second premise. This wider interpretation is customary in formal logic.

As an aid to finding an appropriate counter-example, it is often useful to form the negation of the conclusion ahead of time. The above problem would then reduce to finding an example for which each of the following statements is true: All C are Q; Some C are not S; All S are Q (negation of "some S are not Q").

Go to next page.
Refute the following invalid syllogisms, using the method illustrated on the first page. Feel free to refer back to the first page if you find it helpful.

3. All J are W;
   No J are D.
   Therefore,
   Some D are W.

4. Some P are M;
   Some M are not S.
   Therefore,
   Some S are P.

5. Some L are E;
   Some E are O.
   Therefore,
   Some O are not L.

Return this part to experimenter when finished to get the remainder of the test.

Refute the following syllogisms. Length of time of work will be used as a secondary measure of the test difficulty, but take time to be careful. Return this sheet to the experimenter when finished.

6. Some Y are not I;
   No Z are Y.
   Therefore,
   Some Z are not I.

7. Some U are not B;
   Some P are not U.
   Therefore,
   Some P are B.

8. All M are X;
   All G are X.
   Therefore,
   Some G are not M.

9. No T are N;
   No F are T.
   Therefore,
   Some F are N.

10. Some F are H;
    All H are A.
    Therefore,
    Some A are not F.

APPENDIX D

ILLUSTRATIONS

Method A

Name

ILLUSTRATIONS

Your task here is to illustrate a number of sets of statements by drawing appropriate diagrams. For example, consider the following set of statements:

1. No K are V.
   Some K are R.
   No R are V.

[Diagram not reproduced.]

Note that in the above diagram, regions K and V are separate, representing "No K are V"; regions K and R have points in common, representing "Some K are R"; and regions R and V are separate, representing "No R are V."

Let us look at another example:

2. All C are Q.
   Some C are not S.
   All S are Q.

Illustration: [Diagram not reproduced.]

Note that the word "some" is given the interpretation "some or all" in the second statement above. This wider interpretation is customary in formal logic.

Illustrate the following sets of statements, using the method shown above.

3. All J are W.
   No J are D.
   No D are W.

4. Some P are M.
   Some M are not S.
   No S are P.

5. Some L are E.
   Some E are O.
   All O are L.

Return to experimenter and receive continuation of test.
Method B

Name

ILLUSTRATIONS

Your task here is to illustrate a number of sets of statements by giving appropriate examples. For example, consider the following set of statements:

                              Illustration
1. No K are V.                No animals are plants.
   Some K are R.              Some animals are cats.
   No R are V.                No cats are plants.

Note that in the above example words were substituted consistently, that is, each letter stands for one and only one word. In addition, effort was made to form statements that are as clearly true as possible.

Let us look at another example:

2. All C are Q.               All Mexicans are North Americans.
   Some C are not S.          Some Mexicans are not Canadians.
   All S are Q.               All Canadians are North Americans.

Note that the word "some" is given the interpretation "some or all" in the second statement above. This wider interpretation is customary in formal logic.

Illustrate the following sets of statements, using the method shown above.

3. All J are W.
   No J are D.
   No D are W.

4. Some P are M.
   Some M are not S.
   No S are P.

5. Some L are E.
   Some E are O.
   All O are L.

Return to experimenter and receive continuation of test.

Name

Continue to illustrate the following by the method shown on the preceding sheet (the one just handed in).

6. Some Y are not I.
   No Z are Y.
   All Z are I.

7. Some U are not B.
   Some P are not U.
   No P are B.

8. All M are X.
   All G are X.
   All G are M.

9. No T are N.
   No F are T.
   No F are N.

10. Some F are H.
    All H are A.
    All A are F.