
EFFECTS OF TWO TYPES OF
TRAINING AND PROBLEM STATUS ON
SYLLOGISTIC PERFORMANCE

 

  

 

   

 


ABSTRACT

EFFECTS OF TWO TYPES OF TRAINING AND PROBLEM
STATUS ON SYLLOGISTIC PERFORMANCE

By
David William Carroll

The major intent of the present study was to produce differ-
ences in syllogistic performance as a function of training techniques
and problem types, and then to assess both the depth and the scope of
these differences.

Forty-two introductory psychology students were exposed to
either a spatial or an algorithmic treatment condition during a
16-problem training session. They were then tested on 32 problems
differing in specifiable characteristics. The results indicated that
both experimental groups performed at a higher level than the control
group which received no training, yet the effect was restricted to
those problems that do not have a valid logical conclusion (indeter-
minate problems). Though there was no treatment effect on determinate
problems, post-hoc analysis revealed a significant effect on two types
of indeterminates.

The results were discussed in terms of behavioral rule
considerations. An analysis of the results was also performed in terms
of the underlying skills of verification and falsification of logical
propositions, and these skills were specified as several plausible
rules.

EFFECTS OF TWO TYPES OF TRAINING AND PROBLEM
STATUS ON SYLLOGISTIC PERFORMANCE

By

David William Carroll

A THESIS

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

MASTER OF ARTS

Department of Psychology

ACKNOWLEDGMENTS

I am pleased to express my appreciation to Dr. Donald M.
Johnson, chairman of my committee, for his guidance and assistance in
the preparation of this thesis. Gratitude is also extended to both
Dr. Andrew Porter and Dr. Gordon Wood for their contributions to the

design and execution of the work.


TABLE OF CONTENTS

INTRODUCTION
   A. The Syllogism
   B. Historical Introduction
METHOD
   A. Subjects
   B. Procedure
   C. Hypotheses
RESULTS
   A. Training results
   B. Major findings
   C. Treatment and Treatment-Measures Effects
   D. Measures Effect
DISCUSSION
   A. Treatment Effect
   B. Treatment-Measures Interaction
   C. Measures Effect
   D. Conclusions
   E. Implications for Further Research
LIST OF REFERENCES
APPENDICES
   A. Training Booklets
   B. Training Items
   C. Test Items
   D. Rule-Exception Results per Problem

LIST OF TABLES

Table
 1. Scores in percentages on each of 16 training problems
 2. Figure, order and premise combination comparisons:
    total number correct per group (n=14) per problem
 3. Total percentage scores of each group on each problem
 4. Total percentage scores of each group on each problem
    dimension
 5. Analysis of variance using original data
 6. Analysis of variance using transformed data
 7. Analysis of variance using total score data
 8. Fourteen two-way ANOVAs and Scheffé contrast confidence
    intervals
 9. Analysis of variance using determinate problems
10. Analysis of variance using indeterminate problems
11. Total percentage scores of each group on each category
12. Analysis of variance for "see rule"
13. Analysis of variance for "correct rule"
14. Analysis of variance for "correct exception"
15. Analysis of variance for "see exception"
16. Between-problem ANOVA
17. U-M ANOVA for determinate problems
18. A-M ANOVA for determinate problems
19. U-M ANOVA for Group B problems
20. A-M ANOVA for Group B problems
21. U-M-P ANOVA for Group A problems
22. A-M-N ANOVA for Group A problems

LIST OF FIGURES

Figure
1. Properties of the syllogism
2. Dominant response chart
3. Profiles of 14 scores for each group
4. Per cent correct response, per group, on three
   dimensions of problems
5. Rules and exceptions
6. Per cent correct response, per group, on three
   types of problems
7. Per cent correct response, per group, on determinate
   rules and group B indeterminates
8. Per cent correct response, on indeterminate problems

INTRODUCTION

This thesis was an attempt to train subjects to improve on
their syllogistic performance, with the goal in mind of assessing the
depth and scope of any improvement such training might provide. There
are undoubtedly many ways of effecting an improvement in any perfor-
mance, but my aim has been to closely examine but two methods of train-
ing, in terms of the quantitative and qualitative changes they have
produced here, and in terms of both the positive and negative aspects
of these changes. Inasmuch as some details of the syllogism will be
employed in the discussions throughout this thesis, a brief
presentation of the terminology to be used will be of merit.

A. The Syllogism

 

The syllogisms used here consist of two premises, with five
possible conclusions. Each statement of the syllogism is selected
from four possible logical propositions (see Figure 1).
When a set of premises is used in a description, the first premise is
mentioned first, such as in AE or "All X are Y, No Y are Z." The set
of two premises may also be described by "averaging" the dimensions of
quantity and quality, thus forming a single description. To take the
same example, the A premise is universal affirmative, and the E premise
is universal negative, so that the AE is universal and mixed. As used
here, the quantity dimension indicates the extent of reference, and has
the values of universal, mixed and particular. The quality dimension
indicates the form of reference, and has the values of affirmative,
mixed and negative. It is the latter description of the problem as a
universal mixed, or UM, and not as an AE, that will be employed more
extensively. This is so because, for one, it is the latter variables
that will be manipulated in this experiment and, secondly, there is an
economy of reference when employing this system. For instance, there
are 4 AE problems, but only two corresponding UM problems in all. There
are 64 problems (16 premise combinations, each in 4 figures) but only 16
problem types.

 

 

Propositions:
  A = All A are B (universal affirmative)
  E = No A are B (universal negative)
  I = Some A are B (particular affirmative)
  O = Some A are not B (particular negative)

Terms: A, B, X, Y, etc.

Premise combinations:
  A+E = AE or universal mixed
  I+E = IE or mixed mixed

Figures:
  1st--All X are Y                  2nd--All A are B
       All Z are X                       All C are B
       All Z are Y (valid)               All C are A (invalid)

  3rd--All J are K                  4th--All P are Q
       All J are L                       All Q are R
       All L are K (invalid)             All P are R (valid)
       Some L are K (valid)              Some R are P (valid)

Problem status:
  1st letter--signifies determinate or indeterminate
  2nd letter--signifies universal, particular or mixed
  3rd letter--signifies affirmative, negative or mixed

Thus, the premise combination AA, in the first figure (AA-1), is a
determinate universal affirmative (DUA). AA-2, however, is
indeterminate (IUA).

Figure 1. Properties of the syllogism.
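
The "averaging" of quantity and quality described above can be sketched in a few lines of Python. This is only an illustration of the labeling scheme in Figure 1; the names PROPOSITIONS, average and premise_status are hypothetical and do not come from the thesis materials.

```python
# Quantity and quality of each proposition type, as listed in Figure 1.
PROPOSITIONS = {
    "A": ("universal", "affirmative"),
    "E": ("universal", "negative"),
    "I": ("particular", "affirmative"),
    "O": ("particular", "negative"),
}


def average(first, second):
    """Return the shared value, or 'mixed' when the two premises differ."""
    return first if first == second else "mixed"


def premise_status(combination):
    """Map a premise combination such as 'AE' to its two-letter status.

    Only the quantity and quality letters are produced; the first letter
    of the full status (determinate or indeterminate) cannot be computed
    this way, because it depends on the whole syllogism.
    """
    (quantity1, quality1), (quantity2, quality2) = (
        PROPOSITIONS[letter] for letter in combination
    )
    quantity = average(quantity1, quantity2)
    quality = average(quality1, quality2)
    return quantity[0].upper() + quality[0].upper()
```

Under this sketch, premise_status("AE") gives "UM" and premise_status("IE") gives "MM", matching the two premise combinations shown in Figure 1.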

The terms of the syllogism, as indicated above, are simply the
elements that are being referred to. Their importance in the present
study is negligible, for only letters of the alphabet are used as terms.

The four figures of any given syllogism leave unaltered the
basic information (quantity and quality of reference), but may change
the determinateness/indeterminateness (whether or not a problem has a
solution) of the syllogism. When referring to a particular figure of
a premise combination, the notation IO-3, EA-4 and so on will be used.
Finally, it should be noted that the status of a problem is but an
ordered set of three properties of that problem; the first property,
however, differs from the other two in several ways.

The quantity and quality dimensions are straightforward "averages"
of the two premises, but the determinateness of the problem is a
property of the whole syllogism, not a combination of its parts.
Furthermore, it is contingent upon many factors--including the other two
dimensions, as well as the figure of the problem, the order of the
premises, and the premise combination. This point will be considered
again later, for though a logical principle (e.g., undistributed middle
term) is available to account for the determinateness of a problem, the
psychological dimensions underlying an individual's recognition of a
determinate item are still in need of clarification.
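
That determinateness belongs to the whole syllogism can be illustrated with a brute-force check. The sketch below is not a procedure used in this study; it simply enumerates every way the regions formed by three terms can be inhabited and asks whether a conclusion holds in all arrangements that satisfy the premises. It assumes that every term has at least one member, an assumption chosen because Figure 1 marks "Some L are K" as valid in the third figure. The names holds, models and follows are hypothetical.

```python
from itertools import product


def holds(form, subject, predicate, regions):
    """Truth of one A/E/I/O proposition in a model.

    A model is the set of inhabited regions; each region is the frozenset
    of terms that its members belong to.
    """
    inside_subject = [r for r in regions if subject in r]
    if form == "A":
        return all(predicate in r for r in inside_subject)
    if form == "E":
        return all(predicate not in r for r in inside_subject)
    if form == "I":
        return any(predicate in r for r in inside_subject)
    if form == "O":
        return any(predicate not in r for r in inside_subject)
    raise ValueError(f"unknown proposition form: {form}")


def models(terms):
    """Yield every model in which each term has at least one member."""
    all_regions = [
        frozenset(t for t, member in zip(terms, bits) if member)
        for bits in product((False, True), repeat=len(terms))
    ]
    for inhabited in product((False, True), repeat=len(all_regions)):
        regions = {r for r, keep in zip(all_regions, inhabited) if keep}
        # Assumption: no term is empty (existential import).
        if all(any(t in r for r in regions) for t in terms):
            yield regions


def follows(premises, conclusion):
    """True if the conclusion holds in every model satisfying the premises."""
    statements = list(premises) + [conclusion]
    terms = sorted({t for _, s, p in statements for t in (s, p)})
    return all(
        holds(*conclusion, m)
        for m in models(terms)
        if all(holds(*p, m) for p in premises)
    )
```

Each proposition is written as a (form, subject, predicate) triple, so the third-figure example of Figure 1 becomes follows([("A", "J", "K"), ("A", "J", "L")], ("I", "L", "K")). The same two premises with the conclusion ("A", "L", "K") do not pass the check, which is the valid/invalid contrast shown in the figure.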

B. Historical Introduction

Psychological investigation of logical problems and the reasoning
process began in the 1930s. The type of research carried out seems
divisible into two rather large classes. One type of work has been
theory-laden, with the principal intent of the investigation being to
obtain supportive evidence for one or another of the theories of the
reasoning process that have been offered in the last forty years. The
other, more theoretically innocent line of research has been specific
in nature, attempting to delineate details concerning the difficulty of
certain problems, the tactics experimental subjects resort to in trying
to solve the problems, and the properties a training procedure must
have in order to be successful. I will want to acknowledge the
contributions of both lines of research to this study before outlining
the purpose behind my own work.

Woodworth and Sells (1935) outlined the first psychological
hypothesis of the reasoning process, by noting that their subjects (Ss)
seemed to process the material non-logically. They gave the name
"atmosphere" error to the tendency for an S to choose an A response to
AA syllogisms, an E response to EE syllogisms, and so on, without
respect to the logical relations involved.

A different post-hoc analysis of errors, by Chapman and Chapman
(1959), supports the view that a primitive type of reasoning is respon-
sible for these atmosphere errors. Their hypothesis of "probabilistic
inference" hinges on the finding that Ss apparently try to convert
propositions such as "All A are B" into "All B are A" (an invalid con-
version), then deal with the problem appropriately after that point.
Henle (1962) and Ceraso and Provitera (1971) have argued similarly.
Henle (1962) argues that an error on a logical item does not necessarily
implicate a non-reasoning problem-solving process, and that errors such
as slipping in probabilistic yet non-implicated additional premises in
an argument may account for these supposed failures to reason. Ceraso
and Provitera (1971) modified traditional syllogisms in order to test
the hypothesis that premise misunderstanding is the basic component of
a high error rate. Their results suggest that errors previously attri-
buted to the atmosphere effect might well be due to premise misunder-
standing, with the subject reasoning properly from that point on.

To experimentally assess the two hypotheses, Simpson and Johnson
(1966) designed two training conditions, one anti-atmosphere and one
anti-(invalid) conversion. The superficiality of the atmosphere effect
is exposed by the fact that it diminishes greatly even after brief
training. Anti-conversion training was somewhat less successful, but
the two interpretations appear to overlap by accounting for the same
error in two different ways. In a more recent study, Johnson (unpub-
lished) found an interesting effect, that both of these types of train-
ing reduce only the indeterminate errors and that, in fact, the control
group did slightly better than the experimental groups on determinate
syllogisms. Ceraso and Provitera's (1971) training effect likewise
produced larger differences on the indeterminate items.

Helsabeck (1973) evaluated several hypotheses concerning problem
difficulty. In line with premise misinterpretation theorizing, he
altered the wording of syllogisms, once to achieve non-ambiguous wording
(changing "All A are B" to "Every A is B") and once to achieve a spatial
wording of premises (A is inside B, B and C overlap). The former change
had no effect and the latter only a small one. Thus, earlier
suppositions that the problem with logical statements was simply a
matter of the meaning of the terms, as used in logic, were not confirmed.

Helsabeck then tried three training procedures--spatial, verbal
concrete and verbal abstract--in a refutation task in which introductory
students were asked to generate counter-examples to conclusions. The
spatial training, utilizing Euler diagrams, facilitated performance the
most, with the verbal concrete method of substituting common nouns for
alphabetic terms also (though less) facilitative. Verbal abstract
training--basically asking Ss to verbally work through the logical
relations of a problem--was not successful in increasing individuals'
performance levels. Helsabeck also found that negative conclusions were
harder to refute than affirmative conclusions, but he only used inde-
terminate syllogisms in the experiment.

Frase (1966) used 1 1/2 hours of programmed instruction, empha-
sizing the distinction between formal validity and material truthhood
with meaningful items. He also noted differences between problems with
either universal or particular conclusions (rather than premises, as
used here). His finding was that the training effect he observed was
restricted to those problems with particular conclusions, which had,
on a pretest, been the more difficult ones.

Much training research has used spatial representations as a
means for improving performance. DeSoto et al. (1965) argue from their
data that such a representation acts as a mediator in the solution of
three-term series problems. Similarly, in syllogistic research, the
results of Helsabeck (1973) and Henle and Michael (1956) argue for the
importance of the concept. Schwartz (1971), working with "who-done-it"
problems, suggests three major heuristic elements of a "mode of
representation": (1) it clearly defines needed information; (2) it
suggests fruitful orders of operation; and (3) it provides consistency
checks.
The representation also ought to be generative or productive by defining
the problem in such a way that further manipulations may be easily car-
ried out. These matters will be brought up again in the discussion
section when considering interpretations for the treatment effect.

Although a systematic treatment of problem difficulty is lacking,
the differences between problems have often been noted and attributed
to various factors. Frase (1966, 1968b) obtained evidence that the
quantity dimension is important, for universals are much easier than
particular problems. Lack of clarity regarding the actual problems
used, however, urges the caution that the determinate-indeterminate
dimension may be involved here, for almost all of the universal problems
used in this line of research have been determinate and particulars are,
by necessity, indeterminate.

Frase (1968a) also investigated the importance of problem figure
and found a marginal effect, with figures three and four being more
troublesome. He suggests an associative explanation, with the first
figure being a forward chain, the fourth a backward chain, and the second
and third as stimulus equivalence and response equivalence, respectively.
Again, however, it is not certain that the determinateness factor has
been properly controlled. For the purposes of the present study, this
supposed figure effect may be divided into two possible effects: (1)
a difference in difficulty between two problems with the same problem
status, but differing in figure (e.g., AA-1 and AA-3 are both DUA); (2)
a difference in difficulty between two problems identical only in the
quantity and quality dimensions (e.g., AE-1 is IUM, AE-2 is DUM).

The pilot work for this experiment used three types of training.
The spatial group practiced on transforming the problems into Euler
diagrams, depicting, for instance, "No X are Y" as completely disjoint
sets in space. The verbal group was instructed on how to transform the
terms of the syllogism into more meaningful sentences, as in "Some ani-
mals are not dogs", with the hope that some of the structure of the
relations would become clearer. The logical group was encouraged to try
the more abstract strategy of thinking of possible cases, in terms of
sets and subsets, and attempting to determine a conclusion without
using spatial representations or verbal substitutions. A control group
was given the same problems to practice on, but with no training instruc-
tions; then all of the groups were given an identical test. The results
lent but little support to the hypothesis that this "strategy training"
would lead to better performance on the test: spatial, logical and
control Ss all performed at or near the 55% level, while the verbal Ss
lagged behind at 49%.

A second manipulation in that experiment dealt with the problem
status. The finding, consistent across groups, was that the determinate
problems were easier than the indeterminate ones; the universals were
easier than the particulars, which were, in turn, easier than the mixeds;
and the affirmatives were easier than the negatives, with the mixed
problems again inferior to all. In addition, specific problems could be
noted in terms of these three dimensions and hence, their difficulty
could be marked. The easiest problem was a DUA at 90% and the most
difficult one, the IMM, was correctly responded to at the 20% level.

Apart from the conclusions it suggested concerning problem status,
the pilot work underlined the importance of developing training proce-
dures that will, in principle (i.e., if followed), be effective enough
to produce a solution to a problem. This is necessary in order to
distinguish the problems that Ss will have in using a procedure from the
problems (i.e., of consistency) of the procedure itself. For these
reasons, the verbal and logical groups were dropped in the present ex-
periment, and an algorithmic one was used to replace them. The question
of a training procedure's "completeness" will be considered briefly in
the next section.

METHOD

A. Subjects

Subjects (Ss) were 42 (21 male; 21 female) introductory psychology
students at Michigan State University who had neither had a formal
course in logic nor could demonstrate any reasonable acquaintance
with logical methods or principles. Ss volunteered for the study and
were given credit for their participation. All Ss were run individually
in the experiment, and were assigned randomly to the treatment conditions.

B. Procedure

 

As experimental Ss arrived, they were handed a two-page booklet
illustrating the training they were to use (see Appendix A). They were
told to read the booklet and encouraged to ask questions about it. They
were then shown one of 16 training problems typed on 4" by 6" index cards
and told that their task was to find the correct answer from the five
alternatives given, concerning what would follow from the top premises.
Ss were instructed to study the problem, then give the strongest answer
possible, and explain their answer. After their answer and their
explanation, they were told, experimenter (E) and S would discuss the
problem. The list of 16 training problems appears in Appendix B. Scratch
paper and a pencil were available, and Ss were again asked if they had
any questions before they started.


The two experimental groups differed only in the booklet they
received (and were allowed to retain with them during the training
session) and the type of explanations offered by both E and S. The
spatial group, as shown in Appendix A, was given examples of the
application of Euler diagrams to the solution of problems. The first
four examples simply show how to diagram the basic four logical
propositions. The latter four examples deal with conversion, and
demonstrate which problems can or cannot be validly converted. For the
algorithmic group, the first page of their booklet consisted of a
description of the structure of the problems, including how to decide
if a conclusion is valid and how to label each premise. The second page
of their booklet shows a tree diagram, illustrating a straightforward,
"semi-algorithmic" method of solving the problems. The tree diagram was
explained to Ss who could not understand it.

The procedure during the training session, as stated above, was
to let Ss respond and explain their response before intervening. If
either their answer or their explanation was incorrect, it was discussed,
and the major topics of discussion were as follows: (1) Elimination--
Ss were encouraged to eliminate hypotheses and to narrow down their
problem to a couple of alternatives. This heuristic is explicit in the
algorithmic condition; in fact, the procedure that was suggested for these
Ss was to follow the tree as far as they could and then deal with the
remaining alternatives (if more than one) by going back to the problem
and considering it in terms of sets and subsets (basically, the logical
method of the pilot experiment). For the spatial group, the concept of
elimination was closely associated with the heuristics of counter-example
and refutation, to be discussed below. (2) Counter-example--Ss were
encouraged to work negatively, especially if they had narrowed down
their hypotheses to two or three. If, for instance, they were
entertaining hypothesis E, Ss were asked "Under what conditions would this
hypothesis be false?" and, if there were none, to select it. If all of
the hypotheses could be falsified, the indeterminate response would be
appropriate, and Ss were told this if uncertain. (3) Case-conclusion
distinction--Ss were informed of the distinction between a single case
in which the A conclusion may be true versus the general conclusion that
it must be true. This distinction seemed facilitated by the spatial
representation of the separate cases, and indeed, spatial Ss were
encouraged to number the cases in terms of compatible conclusions, after
systematically diagramming all possible cases. In this context, their
task was to discover what conclusion, if any, is common to all cases.
(4) Refutation--a special instance of counter-example, based on this
fact: that conclusion A entails both not E and not O and that conclusion
E entails both not A and not I. It follows from this that any syllogism
in which one finds that both alternatives A and E are possible (i.e.,
possible cases) will be a syllogism in which no determinate solution is
necessary, because the necessity of each conclusion is refuted by the
possibility of both A and E. Ss were given a brief explanation of this
argument, with emphasis put upon the usefulness of the principle: if one
can find cases in which A and E are possible for a given problem, then
the problem is indeterminate. (5) Logical sense of "some"--essentially,
Ss were instructed that the logical propositions I and O do not
necessarily entail each other.
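
The case-conclusion distinction and the refutation principle can be stated compactly in Python. The sketch below is an illustration only, not the material given to Ss: it assumes that each diagrammed case has already been reduced to the set of conclusion forms (A, E, I, O) that are true in it, and the function names are hypothetical.

```python
def necessary_conclusions(cases):
    """Conclusion forms common to every diagrammed case (heuristic 3).

    `cases` must contain at least one case; each case is a collection of
    the forms ('A', 'E', 'I', 'O') that are true in that diagram.
    """
    return set.intersection(*(set(case) for case in cases))


def refuted_by_a_and_e(cases):
    """Heuristic 4: A is possible in some case and E in some case."""
    return any("A" in case for case in cases) and any(
        "E" in case for case in cases
    )


def strongest_answer(cases):
    """Pick the strongest necessary conclusion, or 'none' if there is none.

    Universal forms are checked before particular ones, since A is a
    stronger answer than I and E is a stronger answer than O.
    """
    common = necessary_conclusions(cases)
    for form in ("A", "E", "I", "O"):
        if form in common:
            return form
    return "none"
```

As a made-up example, two cases recorded as {"A", "I"} and {"I", "O"} share only I, so the strongest answer is I. Adding a third case {"E", "O"} leaves nothing in common, and the refutation shortcut reaches the same verdict without comparing every case: A is possible in one case and E in another, so the problem is indeterminate.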


These various heuristics could not be completely standardized
across Ss for several reasons, including the fact that some subjects
make fewer errors than others. Yet all were included in basic form for
all of the experimental Ss, and certain orders dominated. Refutation
followed case-conclusion, and elimination usually preceded
counter-example, but often the exact timing of the heuristics was heavily
dependent upon opportunity and a "sense of the appropriate". Any lack of
experimental control that resulted from this flexibility of timing was
weighed against the desire for a relatively informal, not overly
stressful situation for the individual S.

During the training session, E kept notes of S's progress, and
of the order in which the heuristics were delivered. Following the 16
training problems, Ss were asked to read over their booklets to make
sure there wasn't anything they still didn't understand. After they
handed back their booklets, the test session began.

Ss were instructed in the test session as follows. A pack of
32 test problems, face down, was presented to the Ss and they were told
that this task was the same one as before, to decide which of five
problem answers was correct. This time, however, there would be no
feedback concerning the answers to the problems by E, nor any
explanation, by S or E, of the problems. The S was then told that he
would be given 90 seconds to do each problem, but was advised that this
time period was more than sufficient for most problems and most
individuals so that there was no need to hurry. (The time limit was
included in the design because preliminary work indicated that, beyond
two or three minutes, any extra time given to the subject was of no
great benefit. The extra time served only to fatigue the individual and,
in fact, most Ss worked best when they were working quite rapidly.)
Again, scratch paper and a pencil were made available, and the Ss were
given time to ask questions before beginning. Ss simply vocalized their
answers to E, who recorded them, and were permitted to work at their own
pace. The test session usually lasted between twenty and thirty minutes.

The control Ss who did not receive training were given the same
instructions as the experimental groups for the test session, excluding,
of course, any mention of the differences between the training and test
sessions. The instructions emphasized that an answer must necessarily
follow from the premises to be the correct answer and that one and only
one of the five answers is the correct one. The 32 test problems are
included in Appendix C.

A final aspect of the experimental design deals with the
selection of problems for use in the experiment. Several criteria
determined the inclusion of both training and test problems, and these
will be enumerated in descending importance: (1) Determinate items are
rare in syllogistic logic, so as many as possible were incorporated into
the design. A sufficient number are needed so that Ss do not develop an
indeterminate response bias, but caution was exercised so that the
particular determinates employed in the test set did not overlap
exceedingly with those of the training set. (2) Each one of the 16
premise combinations was included in the first figure in the test set.
(3) The remaining 16 problems of the test set were chosen to facilitate
figure (AA-1 vs. AA-3), order (EA-2 vs. AE-2), and premise combination
(IE-1 vs. OA-1) comparisons, as will be discussed later. (4) Finally,
dimensional comparisons of the problem status (DUM to be compared with
IUM, DMM, DUA, etc.) were included in the design, where possible. In
addition, care was taken to ensure that the letter terms of the problem
could not be used as a cue to problem type. Since the problems were
typed onto index cards, individual randomization of the order in which
the problems were received by the Ss was ensured by a shuffling of the
deck.

 

 

Status   Example (in Figure 1)                   Dominant response

UA       All A are B; All B are C                All A are C (a)
UM       All A are B; No B are C                 No A are C
UN       No A are B; No B are C                  None of the above
MA       All A are B; Some B are C               Some A are C
MM       All A are B; Some B are not C           Some A are not C
         No A are B; Some B are C                Some A are not C
MN       No A are B; Some B are not C            None of the above
PA       Some A are B; Some B are C              None of the above
PM       Some A are B; Some B are not C          None of the above
PN       Some A are not B; Some B are not C      None of the above

(a) This is empirically determined, as shown in the next section.

Figure 2. Dominant response chart.

A final note on the structure of the problem set involves the
notion of a dominant response (Figure 2), defined as the form in which
‘a response must be, for each problem, if the problem is determinate.
Hence, Figure 2 corresponds to the algorithmic group's tree diagram, in
which each problem falls into one of two categories: (1) it is either
necessarily indeterminate (UN, MN, PA, PM, PN); or (2) it is either

indeterminate, or if determinate, of the form of that problem's dominant


response (UM, MA, MM). The lone exception is UA, which takes responses
A and I, depending on the figure. The argument to be presented is that
Es act as if they know the second category of responses as they enter
the laboratory, and include the A response to UA as a non-exceptional

instance of the second category.
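The distinction between determinate and indeterminate problems can be illustrated with a small brute-force check. The following is only a sketch under stated assumptions: statements are given a set-theoretic reading, all three terms are assumed non-empty (existential import, which is what allows an I response to follow from two universal premises), and only conclusions running from A to C are considered. The names `holds` and `valid_conclusions` are hypothetical and are not part of the experimental materials.

```python
from itertools import product

TERMS = ("A", "B", "C")
# The 8 regions of a three-circle diagram: (in A?, in B?, in C?).
REGIONS = list(product((False, True), repeat=3))


def holds(form, subj, pred, occupied):
    """Truth of one categorical statement in a model (a list of occupied regions)."""
    s, p = TERMS.index(subj), TERMS.index(pred)
    in_subj = [region for region in occupied if region[s]]
    if form == "A":   # All S are P
        return all(region[p] for region in in_subj)
    if form == "E":   # No S are P
        return not any(region[p] for region in in_subj)
    if form == "I":   # Some S are P
        return any(region[p] for region in in_subj)
    if form == "O":   # Some S are not P
        return any(not region[p] for region in in_subj)
    raise ValueError(f"unknown form: {form}")


def valid_conclusions(premises, subj="A", pred="C"):
    """Forms of 'subj ... pred' conclusions that are true in every model of the premises."""
    valid = {"A", "E", "I", "O"}
    for mask in product((False, True), repeat=len(REGIONS)):
        occupied = [r for r, on in zip(REGIONS, mask) if on]
        # Assumption: existential import, i.e. A, B and C are all non-empty.
        if not all(any(r[i] for r in occupied) for i in range(3)):
            continue
        if all(holds(f, s, p, occupied) for f, s, p in premises):
            valid = {f for f in valid if holds(f, subj, pred, occupied)}
    return valid  # an empty set corresponds to "None of the above"


ua_first = [("A", "A", "B"), ("A", "B", "C")]   # All A are B; All B are C
ua_other = [("A", "B", "A"), ("A", "B", "C")]   # All B are A; All B are C
un_first = [("E", "A", "B"), ("E", "B", "C")]   # No A are B; No B are C
ma_first = [("A", "A", "B"), ("I", "B", "C")]   # All A are B; Some B are C
```

Under these assumptions the first UA pair supports the A response (and the weaker I), the second UA pair supports only the I response, and the UN and MA pairs shown support no A-to-C conclusion at all, which is the sense in which a problem can carry a dominant response and still be indeterminate.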

C. Hypotheses

 

The hypotheses may be broken down into those concerning treatment
differences, problem differences, sex differences, and possible signifi-
cant interactions. The hypothesis was advanced that both experimental
groups were to be more successful on the test problems than the control
group. This was expected to cover both sexes and all problems, but with
the expectation that the greatest treatment differences, in favor of the
experimental groups, would be found on the indeterminate problems. Dif-
ferences related to sex were not expected, but at least one interaction
was plausible: that females might perform better on the spatial train-
ing, while males would be more successful on the algorithmic training.
No other interactions regarding sex, nor an overall sex effect, were
anticipated.

Regarding the problems or measures effect, the hypotheses were
based primarily on the pilot data that showed determinates easier than
indeterminates; affirmatives easier than negatives, with mixed problems
even more difficult; and universals, particulars and mixed problems in
ascending difficulty. It was anticipated that these dimensions of the
stimuli would produce the differences between problems, and that other

dimensions would be found not to be very pertinent to problem difficulty.


Finally, direct comparison of the two experimental groups receives
less emphasis here than the comparison of either to the control group.
Honoring the distinction between a particular treatment manifested in
this experiment and the more general training features of interest, one
must be acutely aware that any difference between the two experimental
groups might plausibly be attributed to a difference in construction;
although there is sufficient reason for thinking that the conditions are
constructed well enough to be comparable to a control group, a higher
level of construction would be necessary to compare the conditions with
each other, in meaningful fashion. Thus, the comparison of the two
experimental conditions with the control group will be emphasized most
in the results and discussion sections; a comparison of the experimental
groups is left to those tests in which the experimental-control difference
does not account for the observed effects.

RESULTS

A. Training Results

 

Table 1 shows the results of both experimental groups, of both
sexes, on the 16 training problems. Since individuals received the
problems in individual random orders, the numbered problems refer not
to a specific problem, but rather to a serial order. The major finding
was an increase in problem success, over all groups, to the level of 70%
within five problems and a maintenance of that level until a drop in
performance over the last few problems.

Table 1. Scores in percentages on each of 16 training problems.

Group-Sex             1      2      3      4      5      6      7      8
Spatial-Male       .571   .714  1.000   .286   .429   .714   .857  1.000
Spatial-Female     .429   .571   .429   .714   .714   .571   .714  1.000
Spatial-Total      .470   .642   .714   .500   .571   .642   .785  1.000
Algorithmic-M      .286   .714   .286   .714   .714   .714   .714   .857
Algorithmic-F      .286   .429   .714   .714  1.000   .857   .714   .857
Algorithmic-T      .286   .571   .500   .714   .857   .785   .714   .857
TOTAL              .378   .606   .606   .606   .714   .714   .750   .929

Group-Sex             9     10     11     12     13     14     15     16  Total
Spatial-M          .571   .857   .571   .857   .857   .714   .714   .429   .697
Spatial-F          .714   .571   .714  1.000   .571  1.000   .857   .714   .705
Spatial-T          .642   .714   .642   .929   .714   .857   .785   .571   .701
Algorithmic-M      .714   .714   .857   .857  1.000   .714   .571   .714   .697
Algorithmic-F      .857  1.000   .714   .714   .857   .714   .857   .857   .759
Algorithmic-T      .785   .857   .785   .785   .929   .714   .714   .785   .728
TOTAL              .785   .785   .785   .857   .821   .785   .750   .678   .714

The training scores for the two experimental groups were then
analyzed for the correspondence between training and test scores. A
correlation coefficient of .556 (df=26, p=.01) was obtained, indicating
a tendency for those who score either high or low on the training set

to do likewise on the test set of problems.
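As a worked check of the reported correlation, a Pearson r can be converted to a t statistic with df = n - 2. The sketch below uses only the two values given in the text; the critical value is the standard two-tailed .01 entry for 26 degrees of freedom from a t table, and the function name `t_from_r` is a hypothetical label.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a Pearson r against zero, with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


r, df = 0.556, 26      # values reported in the text
t = t_from_r(r, df)
T_CRIT_01 = 2.779      # two-tailed .01 critical t for df = 26 (standard table value)
exceeds_01 = t > T_CRIT_01
print(round(t, 2), exceeds_01)
```

The resulting t exceeds the tabled critical value, which is consistent with the significance level stated for the correlation.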

B. Major Findings

 

The test data were analyzed in terms of the three possible rival
hypotheses mentioned earlier, possible figure, order and premise combin-
ation effects. In order to separate these effects from those associated
with problem status, 16 comparisons were used to explore these hypotheses,
but with the stipulation that the problem status was not involved in any
comparisons (capturing, for example, a "pure" figure effect). To take
that example, only the first of the two possible figure effects
distinguished in the introduction is of interest here.

The 16 comparisons are shown in Table 2. Statistical analysis, which
here would aim at retaining the null hypothesis, was not performed, but
the lack of variance attributable
to these factors, with one noteworthy exception, seems evident. The
exception is the AA series, in which the first figure is far easier than
the third or fourth. All three may be characterized as DUA problems, but

the latter two will be henceforth referred to as DUA' problems for two
reasons: (1) that the results indicate a unique difference between the
problems on the basis of figure; and (2) that these DUA' problems require
the I response, while AA-1 requires the A response. As remarked earlier,
the AA problem is unique in this respect, and the empirical determina-
tion of the A response as the dominant response of this problem rests on
the previously cited fact: a problem with an A response is far easier
than those that require the I response.

Table 2. Figure, order and premise combination comparisons: total
number correct per group (n=14) per problem.

Figure: 9 comparisons

    1.   AA-1  AA-3  AA-4   2.   AE-1  AE-3   3.   AI-1  AI-3
S          14     5     5          10    10           9     7
A          14     4     5           9     8          11    11
C          13     1     2           9     9          14    11
T          41    10    12          28    27          34    29

    4.   EA-1  EA-2   5.   EE-1  EE-2   6.   IA-3  IA-4
S          14    12          12    13           8    10
A          13    10          14    14           9    13
C          12    14           5     6          13     9
T          39    36          31    33          30    32

    7.   II-1  II-2   8.   OO-1  OO-2   9.   AE-2  AE-4
S          13    12          12    14          11    13
A          10    13          14    13          10    10
C           0     3           3     7          13    12
T          23    28          29    34          34    35

Premise combination: 2 comparisons

    1.   EI-1  AO-2  OA-3   2.   IE-1  OA-1  AO-1
S           7     4     5          10    11    11
A          10    10     9          12     8     6
C           5     8    10           4     1     1
T          22    22    24          26    20    18

Order: 5 comparisons

    1.   AE-2  EA-2   2.   IA-3  AI-3   3.   OA-1  AO-1
S          11    12           8     7          11    11
A          10    10           9    11           8     6
C          13    14          13    11           1     1
T          34    36          30    29          20    18

    4.   EO-1  OE-1   5.   OI-1  IO-1
S           8    12          12    14
A          11    12           9    12
C           5     4           2     2
T          24    28          23    28

Table 3. Total percentage scores of each group on each problem.

Group-Sex           DUA   DUA'    IUA    DUM    IUM    IUN    DMA
Spatial-Male      1.000   .357   .857   .929   .714   .857   .535
Spatial-Female    1.000   .357   .571   .857   .714   .929   .579
Spatial-Total     1.000   .357   .714   .893   .714   .893   .608
Algorithmic-M     1.000   .286   .571   .821   .714  1.000   .857
Algorithmic-F     1.000   .357   .429   .714   .500  1.000   .714
Algorithmic-T     1.000   .322   .500   .758   .508  1.000   .786
Control-M          .857   .000   .143   .929   .000   .285   .893
Control-F         1.000   .071   .143   .893   .286   .500   .786
Control-T          .929   .035   .143   .911   .143   .393   .840
Male Total         .952   .214   .524   .903   .475   .714   .752
Female Total      1.000   .252   .381   .821   .500   .810   .725
Grand Total        .975   .238   .452   .857   .488   .752   .744

Group-Sex           IMA    DMM    IMM    IMN    IPA    IPM    IPN  TOTAL
Spatial-M          .543   .429   .557   .714  1.000   .929   .857   .723
Spatial-F          .857   .333   .857   .714   .785   .929   .000   .745
Spatial-T          .750   .381   .752   .714   .893   .929   .929   .735
Algorithmic-M      .571   .519   .557   .857   .929   .543   .000   .755
Algorithmic-F      .357   .752   .571   .714   .714   .857   .929   .587
Algorithmic-T      .454   .590   .519   .786   .821   .750   .955   .721
Control-M          .071   .475   .047   .214   .000   .071   .357   .375
Control-F          .143   .519   .238   .429   .214   .214   .357   .455
Control-T          .107   .548   .143   .286   .107   .131   .357   .420
Male Total         .428   .508   .450   .595   .543   .540   .738   .518
Female Total       .452   .571   .555   .519   .571   .557   .752   .532
Grand Total        .440   .540   .508   .507   .507   .503   .750   .525

The test data matrix for the design is shown in Table 3. An
examination of the data indicates great differences between problems as
well as an apparent superiority of the experimental groups over the con-
trol group. The data are further displayed in Table 4, which presents
the group means on the three dimensions of the problem status.

Table 4. Total percentage scores of each group on each problem dimension.

Group-Sex        Deter. Indet.  Univ.   Mix.  Part.   Aff.   Mix.   Neg.  Total
Spatial-M          .633   .794   .786   .582   .929   .667   .735   .809   .723
Spatial-F          .633   .833   .750   .673   .905   .691   .734   .881   .746
Spatial-T          .633   .814   .768   .628   .917   .679   .735   .845   .735
Algorithmic-M      .724   .777   .738   .724   .857   .714   .704   .952   .755
Algorithmic-F      .694   .682   .667   .643   .833   .595   .684   .881   .687
Algorithmic-T      .709   .730   .703   .684   .845   .655   .694   .917   .721
Control-M          .684   .121   .441   .408   .135   .393   .384   .286   .375
Control-F          .694   .286   .536   .490   .262   .429   .510   .429   .465
Control-T          .689   .204   .489   .449   .199   .411   .447   .358   .420
Total-M            .680   .564   .655   .571   .640   .591   .608   .682   .618
Total-F            .674   .600   .651   .602   .667   .572   .643   .730   .632
Grand Total        .677   .583   .653   .587   .659   .582   .626   .706   .626

Because the instances of problems used in the experiment were
not equally represented (for reasons given in the methods section),
analysis of the data was made by means of the percentage of correct re-
sponse, per problem and per individual. Since this mode equates all
problems in the analysis, an analysis was also performed on the total
scores for each individual (i.e., with the problems differentially
weighted, as they were in the experiment). The former data set will be
referred to as the "original data", and the latter as the "total scores
data".

In addition, to account for the differential variability of
problems, a "transformed data" set was created by dividing each individ-
ual's score on each problem by the standard deviation of the problem.
Analyses of variance were then run on each of these data sets, using
conservative tests with an alpha level of .05.
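The construction of the "transformed data" can be sketched as follows. This is an illustration only: the score matrix is invented, the use of the sample standard deviation is an assumption (the text does not say which estimator was used), and the name `transform` is hypothetical.

```python
from statistics import stdev


def transform(scores):
    """Divide every score by the standard deviation of its problem (column).

    scores: one row per individual, one column per problem.
    Assumes every problem has a non-zero standard deviation.
    """
    n_problems = len(scores[0])
    sds = [stdev([row[j] for row in scores]) for j in range(n_problems)]
    return [[row[j] / sds[j] for j in range(n_problems)] for row in scores]


# Hypothetical proportions correct for four individuals on three problems.
scores = [
    [1.0, 0.0, 0.5],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
]
transformed = transform(scores)
column_sds = [stdev([row[j] for row in transformed]) for j in range(3)]
```

After the transformation every problem has unit standard deviation, so problems with very different variability contribute on a comparable scale. A problem on which all individuals score identically would need separate handling, since its standard deviation is zero.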

Table 5 shows the repeated-measures analysis of variance for the
original data. The sources of variation of significance are the treat-
ment effect, the measures or problems effect, and the treatment-measures
interaction. All three exceeded the .05 level of significance. Table
6 shows the transformed data, indicating the same three significant
results. Table 7, ignoring the effect of problems, confirmed the treat-
ment effect.

C. Treatment and Treatment-measures
Effects

In order to more precisely determine the limits of the differences
between treatments, 14 univariate analyses of variance were performed.
The results of these tests and the Scheffé contrast confidence intervals

(Scheffé, 1959) resulting from them are enumerated in Table 8.

Table 5. Analysis of variance using original data.

Source              SS       df        MS          F
Treatment (T)     18.045      2     9.023150     35.85a
Sex (S)             .044      1      .043940      0.18
TS                  .767      2      .383567      1.57
I:TS               8.816     36      .244893      ----
Measures (M)      20.070     13     1.543822     18.55a
TM                12.857     26      .494898      5.94a
SM                  .744     13      .057223      0.69
TSM                1.615     26      .062108      0.75
IM:TS             38.995    468      .083322      ----

Table 6. Analysis of variance using transformed data.

Source              SS       df        MS          F
Treatment        202.481      2    101.24052     40.15a
Sex                 .794      1       .79449      0.32
TS                 9.115      2      4.55726      1.81
I:TS              90.780     36      2.52166      ----
Measures        1305.520     13    100.50923    113.84a
TM               172.719     26      6.64303      7.52a
SM                 9.708     13       .74673      0.85
TSM               17.457     26       .67141      0.76
IM:TS            413.202    468       .88291      ----

Table 7. Analysis of variance using total score data.

Source              SS       df        MS          F
Treatment        866,175      2      433,087     21.45a
Sex                1,093      1        1,093      0.05
TS                51,765      2       25,882      1.28
I:TS             726,885     36       20,191      ----

a p < .05

Table 8. Fourteen two-way ANOVAs and Scheffé contrast confidence
intervals.

Problem  Source          F value    Upper limit  Lower limit
DUA      Treatment (T)    1.0000        .681        -.397
         Sex (S)          1.0000
         TS               1.0000
DUA'     T                3.3182       1.432        -.218
         S                0.1818
         TS               0.0455
IUA      T                5.4444       1.861        -.005
         S                1.0000
         TS               0.3333
DUM      T                2.6308        .420        -.742
         S                1.6615
         TS               0.1385
IUM      T                8.9178*      1.881         .189
         S                0.0411
         TS               1.5205
IUN      T               26.4643*      1.773         .441
         S                1.7143
         TS               0.7500
DMA      T                2.5120        .448       -1.020
         S                0.1627
         TS               1.0301
IMA      T               10.7647*      1.832         .168
         S                0.0441
         TS               1.2353
DMM      T                3.3735        .744        -.798
         S                0.4243
         TS               0.6611
IMM      T               22.2432*      1.791         .399
         S                1.4395
         TS               1.4402
IMN      T                6.5821*      1.757         .099
         S                0.0448
         TS               0.8507
IPA      T               38.0571*      2.204         .796
         S                0.7714
         TS               3.0857
IPM      T               30.6923*      2.141         .693
         S                1.9231
         TS               0.5385
IPN      T               25.5938*      1.869         .491
         S                0.0938
         TS               0.6563

*Indicates an F value significant at .05/14=.0036.
All contrasts were run as S + A - 2C. All significant contrasts
favor S + A.

The results are relatively clear-cut. Not only do the S and TS
effects not appear in the overall analysis, but they are similarly absent
from each two-way analysis reported in Table 8. The treatment-measures
interaction can be formulated in the following manner: the indetermin-
ates, with the exception of IUA, all show a treatment effect (all favor-
ing the experimental groups), while the determinates show no treatment
effect. It is noteworthy that DUA', of all determinates, comes closest
to exhibiting a treatment effect. Figure 3 shows the results graphi-
cally.
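The per-test significance level and the S + A - 2C contrast can be illustrated with a short calculation. The sketch below takes three problems as examples, using the group totals from Table 3 and the interval limits from Table 8; the helper names are hypothetical, and the check that each contrast estimate sits at the midpoint of its reported interval assumes a symmetric interval.

```python
ALPHA, N_TESTS = 0.05, 14
per_test_alpha = ALPHA / N_TESTS   # the .05/14 criterion used for each F test

# problem: (spatial, algorithmic, control, lower limit, upper limit)
reported = {
    "DUA": (1.000, 1.000, 0.929, -0.397, 0.681),
    "IUN": (0.893, 1.000, 0.393, 0.441, 1.773),
    "IPA": (0.893, 0.821, 0.107, 0.796, 2.204),
}


def contrast(spatial, algorithmic, control):
    """Both experimental groups against the control group: S + A - 2C."""
    return spatial + algorithmic - 2 * control


def summarize(table):
    """Contrast estimate, interval midpoint, and whether the interval excludes zero."""
    summary = {}
    for problem, (s, a, c, lower, upper) in table.items():
        estimate = contrast(s, a, c)
        midpoint = (lower + upper) / 2
        summary[problem] = (estimate, midpoint, not (lower <= 0 <= upper))
    return summary


summary = summarize(reported)
```

For these three problems the contrast estimates agree with the interval midpoints to within rounding, and only the intervals for the two indeterminate problems exclude zero.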

Two two-way analyses of variance were then run, which summarize
these 14 tests. Again, the determinate problems show no training effect
(Table 9), but the indeterminate problems manifest one well beyond the
critical F ratio for significance at .05 (Table 10). Taken together
with the fourteen earlier analyses, these results imply that the
control group does as well as the experimental groups on the determinate
problems, but is far inferior on the indeterminates. On those problems
that are necessarily indeterminate (negative or particular problems),
the control group inferiority is more pronounced. On those problems that
are usually or often determinate (universal, affirmative, or mixed in
either way), the margin of difference is smaller. Figure 4 illustrates
these dimensional comparisons.

Table 9. Analysis of variance using determinate problems.

Source        df         SS          MS          F
Treatment      2       43,456      21,728      .7546
Sex            1          378         378      .0131
TS             2        3,122       1,561      .0541
I:TS          36    1,036,597      28,794      ----

Table 10. Analysis of variance using indeterminate problems.

Source        df         SS          MS          F
Treatment      2    3,089,150   1,544,580     52.810
Sex            1       22,791      22,791       .925
TS             2      101,515      50,807      2.055
I:TS          36      885,202      24,589      ----
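The arithmetic behind an analysis-of-variance table can be made explicit with the sums of squares and degrees of freedom reported in Table 9. This is a minimal sketch: each mean square is SS/df and each F is the effect mean square over the error (I:TS) mean square; the function names are hypothetical.

```python
def mean_square(ss, df):
    """Mean square: sum of squares divided by its degrees of freedom."""
    return ss / df


def f_ratio(ms_effect, ms_error):
    """F ratio: effect mean square over error mean square."""
    return ms_effect / ms_error


# (SS, df) for the determinate problems, as reported in Table 9.
table9 = {
    "Treatment": (43456, 2),
    "Sex": (378, 1),
    "TS": (3122, 2),
    "I:TS": (1036597, 36),   # error term
}

ms = {source: mean_square(ss, df) for source, (ss, df) in table9.items()}
f_values = {s: f_ratio(ms[s], ms["I:TS"]) for s in ("Treatment", "Sex", "TS")}
```

All three recomputed ratios fall well below 1, in line with the absence of a training effect on the determinate problems.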

The analysis, however, may go a bit further. As mentioned
earlier, some of the indeterminate problems are indeterminate by virtue
of either a negative value on the quality dimension or a particular

value on the quantity dimension. These indeterminates shall be
designated "Group A" (including IUN, IMN, IPA, IPM, and IPN). "Group
B" indeterminates are not so easily characterized, and a hypothetical
psychological correlate of a figure change is responsible for their
recognition as indeterminates. What is clearer is that they are
exceptions to general rules, those rules specified by the dominant
responses of each problem status.

Figure 4. Per cent correct response, per group, on three dimensions of
problems.

A brief look at Table 8 or Figure 3 demonstrates that the difference
between experimental and control groups--favoring the experimental
groups in all cases--is somewhat larger in Group A than in
Group B (IMM, IMA, IUM, IUA). This difference may be due to the following
consideration: for Group A problems, falsification (i.e., selecting
the "none of the above" response) is a one-step operation. For Group B
problems, one has to (1) refute the dominant response of that problem
(Figure 2) and then (2) falsify the other three determinate solutions.
For the following test, the hypothesis was that control group Es would
do more poorly on Group B problems than experimental Es because of the
first step, not the second step.

The hypothesis was operationalized in the following manner. Nine
problems were selected, on the basis of their being easily categorized
into "rules" or "exceptions". A rule was defined as a problem that
had a dominant response which was the correct response for that problem.
This is the set of determinate problems, excluding DUA'. An exception
was defined as a problem which had a dominant response, but one in which
that response was not the correct one. This is the set of Group B

indeterminates. DUA' was added to the list of exceptions because of the
manner in which the group was defined. It differs from the others in
that its correct response, though also not the dominant response, is not
the indeterminate response, either, but rather another determinate
response. It is argued that although the problem is structurally a
determinate problem, it functions as an exception.

Rules    Exceptions
DUA      IUA, DUA'
DUM      IUM
DMA      IMA
DMM      IMM

Figure 5. Rules and exceptions.

Four categories of response were discriminated: (1) see the
rule--marking a determinate response to any of the four rules; (2) see
the correct rule--marking the correct response, given that one has marked
a determinate response; (3) see the exception--marking a non-dominant
response to any of the five exceptions; and (4) see the correct excep-
tion--marking the correct response, given that one has marked a non-
dominant response. These four types of responses were recorded for each
individual on all four sets of problems. The summaries, over all prob-
lems, are given in Table 11; the same data, analyzed for each problem,
are in Appendix 0.
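The four response categories defined above can be sketched as a scoring routine. This is an illustration under stated assumptions: responses are coded A, E, I, O or N (for "none of the above"), the dominant response forms follow Figure 2, the trial data are invented, and the names `score` and `ratio` are hypothetical.

```python
# Dominant response form per status, after Figure 2:
# A = "All A are C", E = "No A are C", I = "Some A are C", O = "Some A are not C".
DOMINANT = {"UA": "A", "UM": "E", "MA": "I", "MM": "O"}
NONE = "N"  # "None of the above"


def ratio(hits, total):
    """Proportion, or None when the category never applies."""
    return hits / total if total else None


def score(trials):
    """trials: (kind, status, correct, given) tuples; kind is 'rule' or 'exception'."""
    rules = [t for t in trials if t[0] == "rule"]
    exceptions = [t for t in trials if t[0] == "exception"]
    determinate = [t for t in rules if t[3] != NONE]
    non_dominant = [t for t in exceptions if t[3] != DOMINANT[t[1]]]
    return {
        "see rule": ratio(len(determinate), len(rules)),
        "correct rule": ratio(sum(t[3] == t[2] for t in determinate), len(determinate)),
        "see exception": ratio(len(non_dominant), len(exceptions)),
        "correct exception": ratio(sum(t[3] == t[2] for t in non_dominant), len(non_dominant)),
    }


# Hypothetical responses of one individual to the four rules and five exceptions.
trials = [
    ("rule", "UA", "A", "A"), ("rule", "UM", "E", "E"),
    ("rule", "MA", "I", "N"), ("rule", "MM", "O", "I"),
    ("exception", "UA", "I", "A"),   # DUA': the dominant response was given
    ("exception", "UA", "N", "N"),   # IUA
    ("exception", "UM", "N", "O"),   # IUM: exception seen, wrong answer chosen
    ("exception", "MA", "N", "N"),   # IMA
    ("exception", "MM", "N", "O"),   # IMM: the dominant response was given
]
result = score(trials)
```

Note that "correct rule" and "correct exception" are conditional proportions: their denominators count only the trials on which the rule or the exception was seen.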

Four two-way analyses of variance were performed on the following
data (Tables 12-15). The analyses indicate no treatment differences on
"see rule" (Table 12). Scheffé tests were used to examine the three
remaining significant treatment effects and revealed a superiority of
the two experimental groups over the control group (S+A-2C) on "see
exception", but on "correct exception" the experimental-control
difference was slightly less significant, barely missing the .05 level.
Because the control group performed at a higher level than the spatial
group on the "correct rule", two-way contrasts were employed here. No
differences were found between either the control group and the
algorithmic group or the control group and the spatial group; however,
the algorithmic group was superior to the spatial group on this category.

In terms of the hypothesis stated earlier, the results do indicate
that control Es perseverate longer on the dominant response than the
experimental groups, but it is also shown that even when they are able
to see an exception, the proportion of times that the control group is
able to see the correct exception is also inferior to that of the
experimental groups (though it statistically eludes the .05 level). The
lack of a "see rule" effect indicates no treatment differences in the
selection of determinate solutions to determinate problems.

Table 11. Total percentage scores of each group on each category.

                   See    Correct     See       Correct
Group-Sex          Rule    Rule    Exception   Exception
Spatial-M          .857    .792      .786        .783
Spatial-F          .762    .907      .815        .859
Spatial-T          .810    .846      .800        .821
Algorithmic-M      .798   1.000      .786        .762
Algorithmic-F      .786    .955      .586        .807
Algorithmic-T      .792    .978      .687        .782
Control-M          .833    .957      .186        .308
Control-F          .868    .918      .357        .520
Control-T          .853    .937      .271        .448
Male Total         .830    .915      .587        .723
Female Total       .805    .926      .587        .772
Grand Total        .817    .920      .587        .748

Table 12. Analysis of variance for "see rule".

Source                  SS       df         MS          F
Treatment (T)        3.76190      2       1.88095      1.22
Sex (S)               .85714      1        .85714      0.56
TS                   4.42858      2       2.21429      1.43
I:TS                55.42857     36       1.53968      ----

Table 13. Analysis of variance for "correct rule".

Source                  SS       df         MS          F
T                 93,890,439      2    45,945,219      4.09a
S                     28,133      1        28,133      0.00
TS                45,746,719      2    22,873,359      1.99
I:TS             412,999,670     36    11,472,213      ----

Table 14. Analysis of variance for "correct exception".

Source                  SS       df         MS          F
T                 11,088,522      2     5,544,311     12.35a
S                    435,337      1       435,337      0.97
TS                   206,182      2       103,091      0.23
I:TS              16,165,181     36       449,032      ----

Table 15. Analysis of variance for "see exception".

Source                  SS       df         MS          F
T                  215.57143      2     108.28572     25.55a
S                          0      1             0      0.00
TS                  24.57143      2      12.28572      2.91
I:TS               152.00000     36       4.22222      ----

a p < .05

Figure 6 summarizes the differences between experimental and
control groups quite neatly: (1) the control group is inferior on the
Group A indeterminates and (2) the control group is inferior on the
Group B indeterminates, which have been broken down into the constituent
abilities of "see exception" and "correct exception". On both of these
categories, the control group was found to be inferior to the experi-

mental groups.

D. Measures Effect

 

The significant measures effect shown in Tables 5 and 6 was
scrutinized by dimensional comparisons in each of the three types of
problems. Figures 7 and 8, on succeeding pages, illustrate the compari-
sons to be investigated.

The rationale for the tests that were performed deserves some
explicit recognition. The hypotheses entertained at the outset of the
experiment dealt with the three dimensions of the problem status; the
post-hoc analysis of the problem set into Determinate, Group A and Group
B problems enlarges this conception of the possible Measures effect and,
at the same time, affords a more convincing and meaningful method of
testing the original hypotheses.

The plan is as follows: tests were run on the dimensional comparisons
shown in Figures 7 and 8, but only within each of the three
problem types. This leaves seven ANOVAs--one between the three problem
types and two within each problem type--and they are more stringent
tests, for any between-problem variance is thus parcelled out in the
six within-problem tests. These six tests, represented in Tables 17
through 22, thus sought to discover if there was a "pure" quantity or
quality effect in the problem data, just as the quest for a "pure" figure
effect was undertaken earlier.

Figure 6. Per cent correct responses, per group, on three types of
problems.

Figure 7. Per cent correct response, per group, on determinate rules and
group B indeterminates. (* indicates DUA'.)

Figure 8. Per cent correct response on indeterminate problems.

The between-problem test is shown first, in Table 16. The difference
between problems is significant and Scheffé analysis indicated
that both determinate and Group A problems are easier for individuals
than Group B problems. There is no difference in difficulty between
determinates and Group A problems. The significant measures effect
offers support for the strategy of testing dimensional comparisons only
within problem types.

Table 16. Between-problem ANOVA.

Source             SS         df        MS           F
Sex              28.57143      1      28.57143      0.05
I:S          18,678.63492     40     466.96587      ----
Measures      7,959.11112      2   3,979.55555     19.39a
MS               57.33333      2      28.55557      0.14
IM:S         15,414.22222     80     205.17778      ----

a p < .05.

Tables 17 and 18 demonstrate significant quantity and quality
effects for determinate problems. As suggested in Figure 7, for these
problems the universal value is easier than the mixed-quantity value;
similarly, on the quality dimension, the affirmative value is less diffi-

cult than the mixed determinate.

Table 17. U-M ANOVA for determinate problems.

Source             SS         df        MS           F
Sex              10.71429      1      10.71429      0.21
I:S            1996.52381     40      49.91310      ----
Measures       1296.42858      1    1296.42858     20.28a
MS               19.04751      1      19.04751      0.30
IM:S           2555.52381     40      53.91310      ----

Table 18. A-M ANOVA for determinate problems.

Source             SS         df        MS           F
Sex              19.04752      1      19.04752      0.17
I:S            4554.40475     40     113.86012      ----
Measures        109.71428      1     109.71428      5.52a
MS               10.71429      1      10.71429      0.55
IM:S            781.07143     40      19.52579      ----

a p < .05.

Figure 7 also suggests that for Group B indeterminates, these
effects for the quantity and quality dimensions are not retained.
Tables 19 and 20 confirm this position.

Finally, two dimensional tests were run for the Group A indeterminates.
In this case, three-way comparisons were utilized, with
the addition of the negative and particular values. The quantity test
reached significance, and Scheffé analysis showed that, as in the
determinate sample, universal problems were significantly easier than
mixed-quantity problems. Additionally, the universals were easier than
the particular problems that are specific to Group A indeterminates. There
was no difference between the mixed and particular items, nor was there
an effect found for the quality dimension for the Group A indeterminates
(Table 22).

Table 19. U-M ANOVA for Group B problems.

Source             SS         df        MS           F
Sex                .76191      1        .76191      0.01
I:S            2188.38095     40      54.70952      ----
Measures           .42857      1        .42857      0.03
MS                9.33333      1       9.33333      0.71
IM:S            521.23810     40      13.03095      ----

Table 20. A-M ANOVA for Group B problems.

Source             SS         df        MS           F
Sex               1.44048      1       1.44048      0.02
I:S            2435.76190     40      60.89405      ----
Measures          6.29762      1       6.29762      1.29
MS               11.44047      1      11.44047      2.34
IM:S            195.76191     40       4.89405      ----

Table 21. U-M-P ANOVA for Group A problems.

Source             SS         df        MS           F
Sex               2.60854      1       2.60854      0.23
I:S             447.77242     40      11.19431      ----
Measures         19.00000      2       9.50000      4.24a
MS                1.24861      2        .62431      0.28
IM:S            179.08472     80       2.23856      ----

a p < .05.

Table 22. A-M-N ANOVA for Group A problems.

Source             SS         df        MS           F
Sex                .95032      1        .95032      0.05
I:S             600.74603     40      15.01865      ----
Measures          8.82540      2       4.41270      1.78
MS                5.77780      2       2.88889      1.17
IM:S            198.06349     80       2.47579      ----

In total, the measures results appear to be more qualified than
anticipated. The major effect is the functional division of the
problems: the determinates (75%) and the Group A indeterminates (67%)
are both easier than the Group B indeterminates or "exceptions" (43%).
Within-group comparisons are less strong, though relatively consistent.
The quantity dimension is the more sensitive one, showing
differences--both favoring universal problems--in both the determinate
and Group A samples. Determinates also show the only quality effect that
was observed and this effect is again consistent with the earlier
hypothesis: affirmative determinates are easier than mixed determinates.

Finally, though there is no overall difference between determinate
and Group A indeterminates, there has already been demonstrated
a strong interaction present in the data (see Tables 9 and 10). The
conclusions drawn from the measures tests that were run here must then
be limited in generality due to the treatment-measures interaction found
earlier. In particular, as Figure 6 has shown, the determinate problems
are significantly easier than Group A problems for the control group,
but slightly more difficult than the Group A problems for the two
experimental groups. In this way, then, the lack of a difference in
difficulty between the two groups of problems may be attributable to this
interacting result.

DISCUSSION

The discussion here will center on two major issues: (1) the
validity of the effects demonstrated in the last section, particularly
the treatment effect, and (2) the ways in which one might account for
these results, emphasizing states of knowledge inferrable from these

data.

A. Treatment effect

The treatment differences that have been found are strong, yet
not completely clear in their meaning. As stated in the beginning, the
major intent was to examine the changes in performance as a result of
training, rather than to specify the exact conditions under which a
training procedure might produce an improvement in performance. Con-
sistent with that orientation, the elements of a compound treatment
"package"--consisting of a two-page booklet, subjects' practice, experi-
menter's feedback, and experimenter's heuristics--were not isolated in
this experiment, and thus it is not possible to ascertain which of the
above contributed to this treatment effect and which did not. Under
the tutorial setting in which the experiment was realized, it was
simply not possible to isolate all of these possible contributions to the
treatment effect. It must ultimately be accepted that this effect
cannot be attributed to anything less than the composite treatment

"package" to which Es were exposed.


Logically, however, there are features of the two training pro-
cedures employed that would seem to be beneficial to the Es. In terms
of Schwartz' (1971) three components of a "mode of representation", the
two techniques appear as follows: (1) the algorithmic defines needed
information explicitly in terms of its series of questions; the spatial
does so implicitly by the possibility of more than one diagram; (2) the
algorithmic does more than suggest a fruitful order of operation, it
explicitly designates it; the spatial method again is more implicit, by
suggesting the diagramming of the first premise, then the second,
followed by a search for possible conclusions; (3) the spatial technique
provides consistency checks inasmuch as different diagrams may illustrate
the same conclusion, or lack thereof; the algorithmic method, however,
has no inherent checking procedure.

Finally, though these considerations might suggest the operation
of the above three factors in the production of the treatment effect,

further research is necessary to confirm the logical speculations.

B. Treatment-measures Interaction

 

In the introduction, it was noted that several training studies
have revealed a larger training effect for indeterminate problems than
for determinate problems, a result that is replicated in this study.

It is reasonable to suggest that the interaction found may be summed
up by saying that experimental Es simply give the indeterminate response
more often than the control Es, and thus received better scores on the

indeterminate items.


This part is true. However, one inference from this statement
may be tested. An argument may be made that these diverse training pro-
cedures all produce similar interactions because what Es are really
learning is a general attitude of cautiousness in reaching conclusions
from the premises given. Thus, using the analogy of the statisticians,
the naive subject in the control group has a response-bias toward
choosing determinate answers and hence, is highly prone to the Type I
error of making a determinate answer to indeterminate problems. The
trained Es in the experimental groups, however, have curbed this bias and
are prone, if anything, to the opposite tendency, the Type II error
of making indeterminate answers to determinate problems.

Certain parts of this picture are true. Naive subjects do use
the indeterminate response much less than the other groups: control Ss
used it 63 times (14% of their responses), algorithmic Ss used it 213
times (47%), and spatial Ss used it 254 times (57%). For
purposes of context, the indeterminate response is the correct response
56% of the time. Yet the evidence does not show an experimental decline
on determinate or rule items corresponding to the increase on indeterminate
or exception items. Table 9 clearly shows no treatment effect
at all on the determinate items, and Tables 12 and 13 indicate no relevant
differences in "see rule" and "correct rule" abilities. The effect
found for "correct rule" (Table 13) was not due to a difference between
control and experimental groups.
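
The comparison between each group's use of the indeterminate response and
the 56% base rate can be made explicit with a small calculation. The sketch
below uses only the percentages reported above; the "bias" index and the
function name are illustrative assumptions, not measures used in the study.

```python
# Reported share of indeterminate responses per group (percentages from the text).
indeterminate_share = {"control": 0.14, "algorithmic": 0.47, "spatial": 0.57}

# Share of items on which the indeterminate response is correct (from the text).
BASE_RATE = 0.56


def response_bias(share, base_rate=BASE_RATE):
    """Hypothetical index: observed share minus the share expected if the
    indeterminate response were used exactly as often as it is correct."""
    return round(share - base_rate, 2)


bias = {group: response_bias(share) for group, share in indeterminate_share.items()}

for group, value in bias.items():
    direction = "below" if value < 0 else "at or above"
    print(f"{group:12s} {value:+.2f}  ({direction} the base rate)")
```

On this index the control group falls far below the base rate, while neither
experimental group overshoots it by more than a point, which is in line with
the absence of a decline on determinate items.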

Scandura's conception of a rule is helpful here (Scandura, 1970).
As Scandura suggests, the notion of a rule is an appropriate theoretical
entity to invoke whenever behavior manifests all-or-none characteristics,
for then the account of the behavior in question is summarized by the
operation, or lack thereof, of certain rules. These are rules that an
experimental S apparently stores during the process of the experiment,
or brings to the experimental situation.

Scandura identifies two processes of rule use. In what he terms
"decoding", stimuli are mapped into certain classes of stimuli. In
concept-attainment terminology, irrelevant dimensions of stimuli are
ignored in the classification, and thus the present stimuli are mapped
into the 14 classes of DUA, DUM, etcetera (or at least the 9 classes of
UA, MM, and so on). The dimensions of problem figure, premise combination
and premise order are hypothesized to be ignored in this functional
classification. The rule-process that is termed "encoding" involves the
selection of "one of the functionally equivalent overt responses in the
defined class" (Scandura, 1970, p. 522). For our purposes, this process
is essentially that of retrieving a rule corresponding to the classification
obtained in the earlier stage. Once the rule has been applied,
the problem is trivial, and an answer is generated.
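
The decoding step can be illustrated with a short sketch. It assumes, as the
status labels in Appendices B and C suggest, that the first letter of a class
codes quantity (U universal, M mixed, P particular) and the second codes
quality (A affirmative, M mixed, N negative). The function names are
hypothetical and the code is not part of the original procedure.

```python
# Mood letters of the four proposition types used in the problem set.
UNIVERSAL = {"A", "E"}   # "All X are Y", "No X are Y"
NEGATIVE = {"E", "O"}    # "No X are Y", "Some X are not Y"


def _dimension(flags, both, neither):
    """Return `both` if every premise has the property, `neither` if no
    premise has it, and "M" (mixed) otherwise."""
    if all(flags):
        return both
    if not any(flags):
        return neither
    return "M"


def decode(premises):
    """Map a premise pair such as "AE" onto one of the nine classes UA ... PN.

    Premise order and problem figure are deliberately ignored, as the
    text hypothesizes.
    """
    quantity = _dimension([p in UNIVERSAL for p in premises], "U", "P")
    quality = _dimension([p in NEGATIVE for p in premises], "N", "A")
    return quantity + quality


print(decode("AA"), decode("AE"), decode("EI"), decode("OO"))
```

Because only the set of moods matters, `decode("AE")` and `decode("EA")`
return the same class.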

As an example of this procedure, suppose an S were confronted
with a problem such as "All B are C, All A are B" and left to draw an
inference between A and C. The subject "decodes" the problem into UA,
and, in the encoding process, the rule "UA--1", where 1 stands for the
conclusion "All A are C", is derived. This rule is that of the dominant
response for the UA problem. Similar rules would be involved for the
other determinate items, and, apparently, these rules are not learned in
the experimental situation, for even the control Ss act as if they knew
them at the start of the experiment. It would thus appear that these
rules are part of the tacit knowledge of the individual. Of course,
nothing conscious is necessarily involved here; the S may not explicitly
formulate the rule but only behave in accordance with it. A rule construct
is of value in imposing order on what an individual does, not
what he thinks he does.

The aforementioned rules may be referred to as verification rules,
for they function in verifying one of the determinate solutions in the
problem set. The following falsification rules perform a similar function,
that of eliminating all determinate solutions as alternatives to a
given problem (i.e., falsifying the 4 determinate solutions) and leaving
the indeterminate response as the correct response.

N-rule: If both of the premises contain a negative value on the quality
dimension, the problem is indeterminate.

P-rule: If both of the premises contain a particular value on the quantity
dimension, the problem is indeterminate.

Falsification rules act upon the Group A indeterminates quite easily.
The N-rule operates on IUN, IMN, and IPN. The P-rule operates on IPA,
IPM and IPN (again). The two rules are the easiest way to falsify any
problem, and may be seen as a screening ground for all problems, once
decoded. A check for these two values is made (or could be) and, if
confirmed, all processing of the problem would terminate. This is at
least one way in which Ss can answer, at high accuracy, these Group A
problems in less than ten seconds. And it is quite conceivable that
these rules could be learned rapidly in the laboratory; as in the
concept attainment paradigm, exposure to exemplars and appropriate
feedback would be all that is required.
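
The screening role of the two rules can be sketched as a pair of checks on
the premise moods (A, E, I, O). This is a minimal illustration of the N-rule
and P-rule as stated above; the function names are hypothetical, and the
sketch makes no claim about how subjects actually process the problems.

```python
NEGATIVE = {"E", "O"}     # "No X are Y", "Some X are not Y"
PARTICULAR = {"I", "O"}   # "Some X are Y", "Some X are not Y"


def screen(premises):
    """Return the falsification rules that fire for a premise pair like "EO".

    An empty list means neither rule applies and processing must continue.
    """
    fired = []
    if all(p in NEGATIVE for p in premises):
        fired.append("N-rule")
    if all(p in PARTICULAR for p in premises):
        fired.append("P-rule")
    return fired


def is_screened_out(premises):
    """True when the problem can be answered "None of the above" at once."""
    return bool(screen(premises))


for pair in ("EE", "EO", "II", "IO", "OO", "AE", "IA"):
    print(pair, screen(pair))
```

Both rules fire for a pair of O premises, matching the remark that the
P-rule operates on IPN "again".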

An encapsulation, then, of this interaction would emphasize
the pattern of abilities across groups. The two experimental groups
and the control group are equally adept at correctly verifying determinate
problems; the control group gets into difficulty with indeterminate
problems, which require the skills of falsification. It is these skills,
materialized in the form of the N- and P-rules, that the experimental
groups learn during the practice session. Notwithstanding the training-induced
facilitation of falsification for the experimental groups, these
groups also demonstrated verification ability equal to the control group.

C. Measures Effect

 

Several hypotheses relevant to problem difficulty were not confirmed.
Frase's (1968a) associative hypothesis concerning problem figure
gained no support, though no statistical tests were run on these data.
It was noted in the introduction that two types of figure effects were
conceivable. The "pure" figure effect, which was not found, would act
independently of a change in problem status. The "other" figure effect
was quite potent, as noted above, especially for the control group. In
this case, however, the result may be more parsimoniously attributed
to a change in status (i.e., determinate-indeterminate). The only
exception in the figure data was the problem DUA', which will be discussed
below. The other two comparisons sought, concerning premise
combination and order effects, similarly failed to show a systematic
effect on problem difficulty.

The author's hypothesis was a "linear" one, in that it predicted
that the dimensions of quantity and quality would arrange themselves,
from easiest to hardest, from universal/affirmative, through particular/
negative, to the mixed problems. This linear tendency was not manifested
very clearly in the present study, however, and six dimensional tests
(Tables 17 to 22), while revealing three significant comparisons,
nevertheless failed to demonstrate that quantity and quality dimensions
are always important factors determining problem difficulty. In certain
cases (for determinates and, for the quantity dimension, Group A problems)
the predicted effects are shown clearly, while in others (Group B problems)
the effects are non-existent. A clarification of the role of
these two problem dimensions might result from an exceedingly careful
study of individual problems differing only in one value on one of the
mentioned dimensions.

The "other" measures effect, however, is quite clear in its
interpretation. Determinate problems are easier than Group B problems
for all groups, and for the control group they are also easier than
Group A problems. The evidence thus suggests that the functional grouping
of problems into three types has merit inasmuch as the problems
appear to be treated somewhat differently. As mentioned earlier, two
of the types require the use of logical rules either present at the
start of the experiment or easily acquired. It is maintained that the
Group B problems are the most difficult because they are not governed
by rules at all, but are structurally and psychologically exceptions.
Subjects are asked to falsify a conclusion that is very similar to
those that have been correctly verified earlier.

A final consideration relative to problem difficulty concerns
DUA'. This problem is the most exceptional in the study because it may
be properly regarded as an exception, along with the Group B indeterminates,
but, unlike this group, its correct answer is of the form "Some
A are B". This is pertinent, for the experimental groups have shown
what has been called greater skill in falsifying these
problems, and DUA' is the only one in which the act of falsification does
not involve the N (None of the above) response. Instead, only the dominant
A response is falsified, and the weaker form I is retained as the
correct answer. From Table 8, one may note that the treatment effect
for DUA' is not significant, though the experimental groups do considerably
better than the control group. Thus, the hypothesis that the experimental
groups' superiority on these problems is restricted to the N
response cannot be rejected.

However, it can be shown that the I response necessary for DUA'
success is not outside the grasp of the control group. They use the
response more efficiently than the experimental groups on DMA. In sum,
there is a non-significant trend supporting the statement that experimental
Ss perform better than control Ss on indeterminate items because
they have learned the logical concept of an exception, rather than simply
learning an individual response and repeatedly using it.

One hypothesis, however, regarding the difficulty of the DUA'
problem has no support from the data presented here. Frase (1968a)
would have hypothesized that AA-4 was more difficult than AA-1 (see
Appendix C) due to the fact that the former resembles a backward chain
and the latter a forward chain. Table 2, however, shows rather clearly
that both AA-3 and AA-4 (DUA' problems) were more difficult than AA-1
(DUA), but no differentiation is tenable between the two DUA' problems.
Thus, it is evident that something more general than a difference in
figure (e.g., a psychological status) is necessary to account for the
difference.

D. Conclusions

 

One way of viewing the task given to the 42 subjects who were
employed in this experiment is to see the problems they faced as
well-structured problems, de-emphasizing their logical nature. The
structuredness of the problem set was everywhere apparent: each problem
had 2 premises, each had 5 conclusions, each consisted of a selection of 4 basic
propositions, each problem type had its own dominant response, and so
on. It seems clear that Ss took advantage of the structure that was
handed to them, and structured their own responses. One such structure
is a rule, and for the reasons stated above, it is believed that the
evidence for rule-governed behavior is more convincing than the evidence
for logical reasoning.

I have been careful in distinguishing "syllogistic performance"
from that type of performance termed "reasoning". It is patently obvious
that a high performance on a reasoning test is not to be equated
with this concept of reasoning. What is not clear is exactly what behaviors
shall constitute sufficient evidence for this process. Theoretical
consideration must be given as to whether or not the process
shall be ascribed to the use of a rule such as the N-rule, or perhaps
reserved for the more abstract processes underlying Group B problem
performance. It is exceedingly unclear whether "reasoning" is a factor
composed of skills that can be acquired (such as verification or
falsification), or whether the rules previously discussed are but a simple
way of avoiding reasoning altogether and doing the problems "mechanically".
Ultimately, this would lead to a consideration of the relation
between logical rules and more general behavioral rules.

E. Implications for Further Research

The treatment-measures interaction has a couple of significant
implications. (1) The selection of problems may well be a determinant
of training efficacy. In training studies, if very difficult problems
were used, a treatment effect would not be nearly as convincing as if
rather easy problems were employed. Of course, with easy problems, any
treatment would have to contend with a possible ceiling effect; nevertheless,
if such a treatment were successful, it would be highly impressive.
It would be possible, in fact, to grade procedures on one of
several levels of assessability (e.g., this procedure was successful for
indeterminates only, a rather low level). (2) The selection of problems
may well be a confounding variable in other logical studies. Many
studies fail to specify what problems were used and why they were chosen.
Recent studies (e.g., Ceraso and Provitera, 1971; Lippmann, 1972) have
been marred by the confounding effect of failing to control problem
status when dealing with other variables. A lack of representativeness
of the problem set is a related concern for studies not directly aimed
at explicating the factors underlying problem differentiation. The
normative data presented here may well be of assistance in the
problem-selection process.

LIST OF REFERENCES


Ceraso, J. and Provitera, A. Sources of error in syllogistic reasoning.
Cognitive Psychology, 1971, 2, 400-410.

Chapman, L. J. and Chapman, J. P. Atmosphere effect re-examined.
Journal of Experimental Psychology, 1959, 58, 3, 220-226.

DeSoto, C., London, M., and Handel, S. Social reasoning and spatial
paralogic. Journal of Personality and Social Psychology, 1965,
2, 513-521.

Frase, L. T. Validity judgments of syllogisms in relation to 2 sets of
terms. Journal of Educational Psychology, 1966, 57, 239-245.

Frase, L. T. Associative factors in syllogistic reasoning. Journal of
Experimental Psychology, 1968, 76 (3, Part 1), 407-412. (a)

Frase, L. T. Effects of semantic incompatibility upon deductive reasoning.
Psychonomic Science, 1968, lg_l, 64. (b)

 

Helsabeck, F. An analysis of difficulties in abstract syllogistic
reasoning. Unpublished Ph.D. dissertation, Michigan State
University, 1973.

Henle, M. On the relation between logic and thinking. Psychological
Review, 1962, 69, 4, 366-378.

Henle, M. and Michael, M. The influence of attitudes on syllogistic
reasoning. The Journal of Social Psychology, 1956, 45, 115-127.

Lippmann, M. Z. The influence of grammatical transformations in a
syllogistic reasoning task. Journal of Verbal Learning and Verbal
Behavior, 1972, 11, 424-430.

Scandura, J. M. Role of rules in behavior: Toward an operational
definition of what (rule) is learned. Psychological Review,
1970, 77, 6, 516-533.

Scheffé, H. The Analysis of Variance. New York: Wiley, 1969.

Schwartz, S. H. Modes of representation and problem-solving: Well
evolved is half solved. Journal of Experimental Psychology,
1971, 91, 2, 347-350.

Simpson, M. E. and Johnson, D. M. Atmosphere and conversion errors in
syllogistic reasoning. Journal of Experimental Psychology,
1966, 72, 2, 197-200.

Woodworth, R. S. and Sells, S. B. An atmosphere effect in formal
syllogistic reasoning. Journal of Experimental Psychology, 1935,
18, 451-460.

 

APPENDICES

APPENDIX A

TRAINING BOOKLETS

FORM S

This is an experiment in logical syllogisms. The statements
that form the syllogism are illustrated below in numbers 1-4.
1. All X are Y

In order to better understand the nature of this statement, it
is possible to diagram it as shown below:

[Diagrams (A) and (B)]

Figure A indicates that a set of things called "X" lies inside the "Y"
area but does not exhaust it; that is, there are some other things in Y
besides X. B shows another possible diagram: here X and Y are identical,
so here, too, "All X are Y".

2. No X are Y

This statement has only one diagram. The two classes are independent.

[Diagram]

3. Some X are Y

[Diagrams (A), (B), (C)]

This statement has three possible diagrams. Any of these cases may
illustrate the relationship between X and Y.

4. Some X are not Y

[Diagrams (A), (B), (C)]

Again, there is more than one figure possible for the single statement.
Since we do not know which is the correct figure, we must consider all
of them. Now look at these examples:

5. All X are Y
Therefore,
All Y are X. Is this a valid conclusion?

To decide this, you look at both possibilities (the ones in #1). The
statement "All Y are X" is true for B, but not for A. So,
the conclusion is not valid.

[Diagrams (A) and (B)]

6. No X are Y
Therefore,
No Y are X. Is this a valid conclusion? Again, as in #2, we have
only one diagram possible. The answer is yes.

[Diagram]

7. Some X are Y
Therefore,
Some Y are X. Is this a valid conclusion? Look at the three
possibilities, and notice that the statement is true for all three.
The answer is yes.

[Diagrams (A), (B), (C)]

8. Some X are not Y
Therefore,
Some Y are not X. Is this a valid conclusion? Note the second
possibility here. The answer is no.
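
The diagram method used in examples 5 to 8 amounts to checking a conclusion
against every arrangement of X and Y that satisfies the premise. A minimal
sketch of that check follows, under the assumption that the classes are
non-empty subsets of a small three-element universe; the names are
illustrative and are not part of the training booklet.

```python
from itertools import combinations

UNIVERSE = (0, 1, 2)


def nonempty_subsets(items):
    """All non-empty subsets of `items`, as frozensets."""
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


# Truth conditions of the four statement types for classes x and y.
STATEMENTS = {
    "All": lambda x, y: x <= y,
    "No": lambda x, y: not (x & y),
    "Some": lambda x, y: bool(x & y),
    "Some-not": lambda x, y: bool(x - y),
}


def conversion_is_valid(kind):
    """True if "<kind> X are Y" guarantees "<kind> Y are X" in every case."""
    holds = STATEMENTS[kind]
    sets = nonempty_subsets(UNIVERSE)
    return all(holds(y, x) for x in sets for y in sets if holds(x, y))


for kind in STATEMENTS:
    print(kind, conversion_is_valid(kind))
```

The enumeration agrees with the booklet's answers: the "No" and "Some"
statements convert validly, while "All" and "Some ... not" do not.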


FORM A

This is an experiment in logical syllogisms. A syllogism contains
two premises and one conclusion, of the form below.

Premise 1 (true)                       Premise 1 (true)
Premise 2 (true)                       Premise 2 (true)
Conclusion (true)--valid conclusion    Conclusion (false)--invalid conclusion

In this experiment, we are assuming the truth of the two premises, and
you are asked to decide if any of a number of conclusions is also true;
if it is, then it is called a valid conclusion; if it is not, it is an
invalid conclusion. "True" means true in every case; for a conclusion
to be valid, it must be true whenever the two premises are.

The statements that form the premises and conclusions are illustrated
below in numbers 1-4.
1. All X are Y.--Affirmative Universal

This statement is called affirmative because it is in positive
form (it says what all X are, not what they aren't). It is called
universal because it refers to all Xs.

2. No X are Y.--Negative Universal

This statement is called negative because it tells us what the
Xs aren't, not what they are. It is universal because all Xs are
included: All of them are not Y.

3. Some X are Y.--Affirmative Particular

This is affirmative because it is in positive form, and particular
because it is referring only to some of the Xs.

4. Some X are not Y.--Negative Particular

This is negative because it tells us what some of the Xs are
not; it is particular because it refers to only some of the Xs.

              Affirmative    Negative
Universal          1             2
Particular         3             4

 

 

[Flowchart, printed sideways in the original. Legible decision nodes:
"Are both premises negative?", "Are both of them particular?", "Is one of
them negative?", "Is one of them particular?". Legible outcomes:
"Universal Affirmative conclusion (#1)", "Universal Negative conclusion
(#2)", "Particular Affirmative conclusion (#3)", "Particular Negative
conclusion (#4)", "No valid conclusion".]

 

APPENDIX B

TRAINING ITEMS

Premises

AE

EA

AE

AE

Figure
4

Status

DUM

DUM

DUM

IUM

IUM


Correct

2

Problem

All J are K

No K are L

Therefore,

1. All L are J

2. No L are J

3. Some L are J

4. Some L are not J
5. None of the above

No P are Q

All R are P
Therefore,

1. All R are Q

2. No R are Q

3. Some R are Q

4. Some R are not Q
5. None of the above

No T are U

All V are U
Therefore,

1. All V are T

2. No V are T

3. Some V are T

4. Some V are not T
5. None of the above

All M are N

No O are N

Therefore,

1. All O are N

2. No O are N

3. Some O are N

4. Some O are not N
5. None of the above

All S are T

No S are U

Therefore,

1. All U are T

2. No U are T

3. Some U are T

4. Some U are not T
5. None of the above

Premises

EE

EI

EO

IA

IE

Figure
3

Status

IUN

DMM

IMN

IMA

IMM


Correct

5

Problem

No D are E

No D are F

Therefore,

1. All F are E

2. No F are E

3. Some F are E

4. Some F are not E
5. None of the above

No J are K

Some J are L
Therefore,

1. All L are K

2. No L are K

3. Some L are K

4. Some L are not K
5. None of the above

No T are U

Some V are not U
Therefore,

1. All V are T

2. No V are T

3. Some V are T

4. Some V are not T
5. None of the above

Some A are B

All C are B
Therefore,

1. All C are A

2. No C are A

3. Some C are A

4. Some C are not A
5. None of the above

Some Y are X

No Z are X

Therefore,

1. All Z are Y

2. No Z are Y

3. Some Z are Y

4. Some Z are not Y
5. None of the above

Premises

EE

EE

AI

AI

IA

Figure
1

Status

IUN

IUN

DMA

DMA

DMA


Correct

5

Problem

No D are E

No F are D

Therefore,

1. All F are E

2. No F are E

3. Some F are E

4. Some F are not E
5. None of the above

No M are N

No O are N

Therefore,

1. All O are M

2. No O are M

3. Some O are M

4. Some O are not M
5. None of the above

All T are V

Some U are T
Therefore,

1. All U are V

2. No U are V

3. Some U are V

4. Some U are not V
5. None of the above

All O are M

Some O are N
Therefore,

1. All N are M

2. No N are M

3. Some N are M

4. Some N are not M
5. None of the above

Some A are B

All A are C
Therefore,

1. All C are B

2. No C are B

3. Some C are B

4. Some C are not B
5. None of the above

Premises

IA

IA

AI

EI

A0

Figure
4

Status

DMA

IMA

IMA

DMM

DMM


Correct

3

Problem

Some Y are X

All X are Z
Therefore,

1. All Z are Y

2. No Z are Y

3. Some Z are Y

4. Some Z are not Y
5. None of the above

Some U are W

All V are U
Therefore,

1. All V are W

2. No V are W

3. Some V are W

4. Some V are not W
5. None of the above

All A are B

Some C are B
Therefore,

1. All C are A

2. No C are A

3. Some C are A

4. Some C are not A
5. None of the above

No G are H

Some F are G
Therefore,

1. All F are H

2. No F are H

3. Some F are H

4. Some F are not H
5. None of the above

All H are F

Some G are not F
Therefore,

1. All G are H

2. No G are H

3. Some G are H

4. Some G are not H
5. None of the above

APPENDIX C

TEST ITEMS

Premises

AA

AA

AA

AE

Figure
1


Status

DUA

DUA'

DUA'

IUA

DUM

Correct


1

Problem

All A are B

All C are A
Therefore,

1. All C are B

2. No C are B

3. Some C are B

4. Some C are not B
5. None of the above

All P are Q

All P are R
Therefore,

1. All R are Q

2. No R are Q

3. Some R are Q

4. Some R are not Q
5. None of the above

All F are G

All G are H
Therefore,

1. All H are F

2. No H are F

3. Some H are F

4. Some H are not F
5. None of the above

All Y are X

All Z are X
Therefore,

1. All Z are Y

2. No Z are Y

3. Some Z are Y

4. Some Z are not Y
5. None of the above

All E are F

No D are F

Therefore,

1. All D are E

2. No D are E

3. Some D are E

4. Some D are not E
5. None of the above

Premises

OE

II

II

IO

OI

Figure
1

Status

IMN

IPA

IPA

IPM

IPM


Correct

5

Problem

Some V are not W

No X are V

Therefore,

1. All X are W

2. No X are W

3. Some X are W

4. Some X are not W
5. None of the above

Some H are I
Some J are H

Therefore,
1. All J are I
2. No J are I

3. Some J are I
4. Some J are not I
5. None of the above

Some J are K
Some L are K

Therefore,
1. All L are J
2. No L are J

3. Some L are J
4. Some L are not J
5. None of the above

Some R are S
Some T are not R

Therefore,
1. All T are S
2. No T are S

3. Some T are S
4. Some T are not S
5. None of the above

Some N are not O
Some P are N

Therefore,

1. All P are O

2. No P are O

3. Some P are O

4. Some P are not O
5. None of the above

Premises

OO

OO

Figure
1

Status

IPN

IPN


Correct

5

Problem

Some C are not D
Some E are not C

Therefore,

1. All E are D

2. No E are D

3. Some E are D

4. Some E are not D
5. None of the above

Some Y are not W
Some X are not W

Therefore,

1. All X are Y

2. No X are Y

3. Some X are Y

4. Some X are not Y
5. None of the above

APPENDIX D

RULE-EXCEPTION RESULTS PER PROBLEM

Group

Spatial
Algorithmic
Control
Total

Group
Spatial
Algorithmic
Control
Total

Group

Spatial
Algorithmic
Control
Total

Group
Spatial
Algorithmic
Control
Total

RULE-EXCEPTION RESULTS
Problem UA

 

Rule Correct Rule See

dddd
. . C .

Problem UM

 

Rule Correct Rule See

PER PROBLEM

Exception

.714
.594
.190
.500

Exception

Correct Exception

.667
.640
.500
.635

Correct Exception

 

Problem MA

 

Rule Correct Rule See

.786
.714
.143
.548

Exception

.909
.850
1.000
.891

Correct Exception

 

Problem MM

 

Rule Correct Rule See

.857
.643
.143
.583

Exception

.792
.722
.429
.714

Correct Exception

 

.881
.714
.452
.683

.865
.867
.316
.744