
 

 

A TEST OF THE HYPOTHESIS OF
ADDITIVITY-OF-CUES IN A TWO-CHOICE
DISCRIMINATION LEARNING PROBLEM

Thesis for the Degree of M. A.
MICHIGAN STATE UNIVERSITY

Thomas Robert Trabasso
1959


 


 

A TEST OF THE HYPOTHESIS OF ADDITIVITY-OF-CUES
IN A TWO-CHOICE DISCRIMINATION LEARNING PROBLEM

By
Thomas Robert Trabasso

A THESIS

Submitted to the College of Arts and Sciences of
Michigan State University of Agriculture and
Applied Science in partial fulfillment of
the requirements for the degree of

MASTER OF ARTS

Department of Psychology
1959

Thomas R. Trabasso
ABSTRACT

This thesis is concerned with quantitatively formulating and
testing the concept of additivity-of-cues.

Previous experimentation has shown that learning is more rapid
when relevant stimuli were presented in more than one modality.
Qualitatively, it was known that learning is always faster in the
combined-cue situation but prior to the application of mathematical
models of learning to such data, no quantitative laws regarding the
function of additivity had been formulated and tested.

Using a theory of two-choice discrimination learning by Restle
(1955), two hypotheses of additivity of the proportions of relevant
cues were formulated in set-theoretic mathematics and tested. The
hypotheses were additivity-of-cues of two kinds: direct and additivity
by a function.

To test additivity-of-cues, a two-choice discrimination learning
problem was used. Five groups of 16 human subjects each were tested
on separate problems. Two problems had one cue relevant and one cue
irrelevant. A second two problems had one cue relevant, but the
measure of the irrelevant cues was reduced. The remaining problem
had both cues relevant.

The stimuli were patterns of letters which had a fixed alphabetical
order, but varied in form between upper and lower case. The
response was written when the subject saw a stimulus pattern. The
correct answer (an X or O) was predictable from the stimulus pattern
by a consistent principle.

Two hypotheses of additivity were formulated and tested.
Hypothesis 1: direct additivity, where the proportion of relevant
cues in a combined-cue problem was predicted by direct addition of
the proportions of relevant cues in two single-cue problems.
Hypothesis 2: additivity by a derived function, where the proportion
of relevant cues in a combined-cue problem was predicted by a function
of the proportions of relevant cues in two single-cue problems, the
measure of the irrelevant cues being reduced in the latter two problems.

The results may be summarized as four findings:

1. The combined-cue group showed faster learning than the single-cue
groups, indicating some form of additivity. One of the relevant cues
was found to be stronger than the other and the reduction of the
measure of irrelevant cues through the fixing of a letter had a small
beneficial effect on learning.

2. The two methods of estimating the learning rate parameter, θ,
yielded discrepant results, indicating that neither the group nor the
individual learning curves were of the shape predicted by the theory.
An analysis of the discrepancies between the two methods of estimation
suggested a bias inherent in the methods.

3. The predictions of the mean error scores and learning rates by
both hypotheses of additivity were found to be accurate in all cases.
Statistical tests of these hypotheses failed to indicate that they
should be rejected.

4. To account for individual differences in learning rate, an assumption
of a high positive correlation as to subject position in the
groups was made. Predictions of the distributions of rates of learning
in the combined-cue problem by the application of the hypotheses of
additivity to the matched rank values in the single-cue groups were
made. These predictions were found to be accurate for the cumulative
distribution of learning rates, the mean and the median of the
combined-cue group.

The discrepancies between the two methods of estimating learning
rates were discussed. Inspection of the discrepancies indicated that
the discrepancies were larger with faster rates of learning. A Monte
Carlo procedure was used to test this difference but did not clearly
indicate the nature of the discrepancies. It is suggested that such
a procedure would be fruitful for investigations of the variance and
distribution of the learning rate parameter, θ.

A second quantitative analysis, based on the number of stimulus
patterns in the problems was discussed and found not to be consistent

with the additivity-of-cues data.

References

Restle, F. A theory of discrimination learning. Psychol. Rev., 1955,
62, 11-19.

Approved: ____________  Date: ____________

Major Professor

DEDICATION

To my wife

ACKNOWLEDGEMENT

The author wishes to express his gratitude and sincere
appreciation for the guidance and assistance in the planning
and execution of this research, and the development of this
manuscript to Dr. Frank J. Restle, chairman of his committee.

In addition, he wishes to convey profound thanks to
Dr. M. Ray Denny and Dr. Terrence M. Allen for their excellent
criticism and advice during the preparation of this
thesis.

TABLE OF CONTENTS

INTRODUCTION
METHOD
RESULTS
DISCUSSION
SUMMARY
APPENDICES
    Appendix I--Restle's Theory of Discrimination Learning
    Appendix II--Tables and Subject Summary Data
REFERENCES

LIST OF TABLES AND FIGURES

TABLE
I. EXPERIMENTAL GROUPS USED TO TEST ADDITIVITY-OF-CUES
II. INDIVIDUAL COMPARISONS BETWEEN GROUP MEANS
III. θ ESTIMATES FOR THE FIVE EXPERIMENTAL GROUPS AS OBTAINED
     BY THE MEAN TOTAL AND WEIGHTED ERROR SCORES
IV. MEAN TOTAL ERRORS IN 128 TRIALS AND PROPORTION OF
    RELEVANT CUES (θ) FOR GROUP A+E
V. MEAN WEIGHTED ERRORS IN 128 TRIALS AND PROPORTION OF
   RELEVANT CUES (θ) FOR GROUP A+E
VI. E AND θ VALUES ESTIMATED BY THE TOTAL ERROR AND WEIGHTED
    ERROR METHODS OF ESTIMATION FOR 128 TRIALS
VII. SUBJECT SUMMARY DATA, SHOWING NUMBER OF ERRORS PER BLOCK
     OF 8 TRIALS, TOTAL AND WEIGHTED ERRORS, GROUP AND
     INDIVIDUAL θ'S
IX. CORRECT RESPONSE SEQUENCE WITH PAIRED STIMULUS PATTERNS

FIGURES
1. MEAN PROPORTION AND MEAN NUMBER OF CORRECT RESPONSES
   OF GROUPS A, E, A+E, A' AND E'
2. SCATTER PLOT OF 80 INDIVIDUAL TOTAL AND WEIGHTED
   ERROR ESTIMATES OF θ
3. SCATTER PLOT OF 80 HYPOTHETICAL INDIVIDUAL TOTAL AND
   WEIGHTED ERROR ESTIMATES OF θ, WHERE THE GROUP θ = .25

INTRODUCTION

This thesis is concerned with quantitatively formulating and
testing the concept of additivity-of-cues.

In a number of experiments, learning has been found to be more
rapid when relevant stimuli are presented in more than one modality.
Eninger (1952) states that this follows from Spence's theory of
discrimination learning. Thus, the hypothesis that cues have an additive
effect in learning seems plausible.

Prior to the recent application of mathematical models of learning
to such data, the only criterion for deciding whether or not cues
were additive was whether subjects showed improved performance with
increased cues. Performance increments were not algebraic in form and
there was no suitable measure to apply which could reflect a rational
function of additivity. Qualitatively, it was known that learning is
always faster in the combined-cue situation, but no quantitative laws
had been formulated and tested (Restle, 1955).

In a series of recent papers (1955, 1957, 1958), Restle has used
a mathematical model of discrimination learning to quantify the analysis
of additivity-of-cues. Included in his analysis have been examples of
additivity-of-cues in T-maze learning of rats (Blodgett et al., 1949;
Eninger, 1952; Galanter & Shaw, 1954; and Scharlock, 1955) and in color,
form & size discrimination learning by monkeys (Warren, 1955).
Recently (Restle, 1959), this analysis has been extended to human
learning in a simple two-choice discrimination problem.

The assumption made by Restle (1955 and 1959) is that in simple
two-choice discrimination learning, the proportion of relevant cues
determines the rate of learning. Thus it follows that additivity of
relevant cues will be manifested as the additivity of learning rates,
which can be estimated.

The stimulus situation in two-choice discrimination learning
experiments is represented by a set of discriminable aspects called
cues. Every individual cue may be thought of as either "relevant" or
"irrelevant". A cue is relevant if it can be used by the subject to
predict where or how reinforcement is to be obtained. Cues which are
uncorrelated with reinforcement are irrelevant.

This model contains two hypothesized processes of discrimination
learning, "conditioning" and "adaptation". The relevant cues in the
stimulus situation are conditioned to the correct response. On the
other hand, the subject's responses become independent of the irrelevant
cues, i.e., irrelevant cues are adapted. Once a cue is adapted, it has
no effect on response, and other cues contribute toward the probability
of a correct response or an error accordingly as they are conditioned
to the correct or wrong response. On each trial of a given problem,
a constant proportion, θ, of unconditioned relevant cues becomes
conditioned. The "fundamental simplifying assumption" of his theory
deals with the learning rate parameter, θ. This assumption is that

$$\theta = \frac{r}{r+i} \tag{1}$$

where r is the number of relevant cues in the problem and i is the
number of irrelevant cues (Restle, 1955, p. 12, Eq. 5). Thus, θ is
the proportion of relevant cues in the problem. The rates of conditioning
and adaptation are assumed to be equal and to equal the proportion
of relevant cues in the problem.

The above definition of 9 implies that if one increases the number
of relevant cues in the stimulus situation, the effect is to increase
the learning rate. Additivity-of-cues is directly concerned with the
effect of increasing the number of relevant cues in the problem.
Similarly, a reduction in the number of irrelevant cues in the stimulus
situation would have the effect of increasing the learning rate, θ.
Both of these effects will be treated in this thesis.
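
As an illustrative sketch only (it is not part of the original analysis), the two effects described above can be checked numerically from Equation 1. The function name `theta` and the cue counts are assumptions chosen for the example; the theory itself does not specify any particular counts.

```python
def theta(r: float, i: float) -> float:
    """Equation 1: proportion of relevant cues, theta = r / (r + i)."""
    if r < 0 or i < 0 or r + i == 0:
        raise ValueError("cue counts must be non-negative and not both zero")
    return r / (r + i)


# Hypothetical cue counts, chosen only for illustration.
baseline = theta(r=2, i=8)
more_relevant = theta(r=4, i=8)      # adding relevant cues raises theta
fewer_irrelevant = theta(r=2, i=6)   # removing irrelevant cues raises theta

print(baseline, more_relevant, fewer_irrelevant)
```

Both manipulations increase θ relative to the baseline, which is the qualitative prediction the thesis goes on to test.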

Included as Appendix I to this thesis is a more detailed form
of the mathematical model used in the analysis of additivity-of-cues.
Formulas and methods of estimating learning rates are contained therein
and will be referred to when necessary.

Consider a number of experimental problems of two-choice discrimination
learning where the stimulus situation consists of a pattern of
letters. In the present study, the letters are ABDEF which vary between
capital and small from trial to trial, but retain the same alphabetical
order. The responses are X or O, either of which may be correct
depending upon which pattern of letters appears. To make a given
letter relevant, we make the correct response contingent on whether
that letter is capital or small. For example, if E is relevant, whenever
E is capital the correct answer is O and whenever E is small the
correct answer is X. To make a letter irrelevant we make the correct
response independent of whether that letter is capital or small. If
B is irrelevant, when B is capital, the correct answer may be either
X or O. In this case, the subject cannot use B to predict reinforcements.

We suppose that each letter gives rise to a set of cues. We shall
call the set of cues arising from letter A by the name 𝒜, those from
the letter E by the name ℰ, etc. The measure of 𝒜 is written m(𝒜),
and corresponds to the importance of this set of cues in controlling
behavior.

To test additivity-of-cues, we construct a problem where the
letter A is relevant, another problem where the letter E is relevant,
and a third problem where both A and E are relevant and redundant (the
subject can use either A or E or both to predict reinforcement). In
this third problem, the set of relevant cues is 𝒜 ∪ ℰ, that is, the
set of cues which are in 𝒜 or in ℰ or common to both. The hypothesis
of additivity states that 𝒜 and ℰ are disjoint, having no common
elements. In this case,

$$m(\mathcal{A} \cup \mathcal{E}) = m(\mathcal{A}) + m(\mathcal{E}). \tag{2}$$

Thus the measure of relevant cues in the third problem is the sum of
the measures of relevant cues in the first two problems.

In the theory, θ is the proportion of relevant cues. Suppose A
is relevant, E is irrelevant, and all other cues in the situation
(which we designate by the set 𝒮) are irrelevant. We shall call this
problem 1. Then,

$$\theta_1 = \frac{m(\mathcal{A})}{m(\mathcal{A}) + m(\mathcal{E}) + m(\mathcal{S})}. \tag{3}$$

Similarly, if E is relevant, A is irrelevant, and all other cues
in the situation are irrelevant (𝒮), we find the following relation
to hold. We shall call this problem 2.

$$\theta_2 = \frac{m(\mathcal{E})}{m(\mathcal{A}) + m(\mathcal{E}) + m(\mathcal{S})}. \tag{4}$$

Thirdly, problem 3, if both A and E are relevant and redundant,
and all other cues in the situation are irrelevant (𝒮), then,

$$\theta_3 = \frac{m(\mathcal{A}) + m(\mathcal{E})}{m(\mathcal{A}) + m(\mathcal{E}) + m(\mathcal{S})}. \tag{5}$$

Since the denominator terms in the above expressions are all equal,
θ₃ = θ₁ + θ₂. This is Hypothesis 1.

The first hypothesis is that if the total number of cues is the
same in all three problems, then the proportions of relevant cues, θ's,
will add directly. The total number of cues in the above problems is
held constant by ensuring that the same cues are present in all three
problems (i.e., the same stimulus letters ABDEF). In the first problem,
𝒜 is relevant and ℰ is present but irrelevant. In the second problem,
ℰ is relevant and 𝒜 is present but irrelevant. In the third problem,
both 𝒜 and ℰ are present and relevant. The first purpose of the
present thesis is to test this hypothesis of direct additivity of
learning rates.
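
A minimal numerical sketch of Hypothesis 1 follows, assuming hypothetical measures for the relevant set from A, the relevant set from E, and the remaining irrelevant cues. The names `m_a`, `m_e`, `m_s` and the values 3, 1, and 6 are illustrative assumptions, not estimates from this study. Exact fractions are used so that the equality can be checked without rounding error.

```python
from fractions import Fraction


def proportions_same_denominator(m_a: int, m_e: int, m_s: int):
    """Return (theta_1, theta_2, theta_3) for problems 1, 2 and 3.

    All three problems contain the same cues, so they share the
    denominator m_a + m_e + m_s.
    """
    total = m_a + m_e + m_s
    theta_1 = Fraction(m_a, total)         # A relevant, E varying but irrelevant
    theta_2 = Fraction(m_e, total)         # E relevant, A varying but irrelevant
    theta_3 = Fraction(m_a + m_e, total)   # A and E both relevant and redundant
    return theta_1, theta_2, theta_3


# Hypothetical measures, chosen only for illustration.
theta_1, theta_2, theta_3 = proportions_same_denominator(m_a=3, m_e=1, m_s=6)
print(theta_1, theta_2, theta_3)
print(theta_3 == theta_1 + theta_2)
```

With a common denominator the combined-cue proportion equals the sum of the two single-cue proportions, whatever measures are assumed.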

By the simplifying assumption about θ, it was noted that if we
reduce the amount of irrelevancy in the stimulus situation, we could
expect an increase in the learning rate. Suppose we consider two
problems in addition to the ones stated above. In both of these
problems, the same stimulus letters are used (ABDEF) but in each
problem, one of the letters is fixed, e.g., always capital. Such a
change in the stimulus situation is expected to reduce the measure of
irrelevant cues. The assumption is made that the fixing of a letter
removes the variations and hence the set of cues arising from such variation.
For example, in problem 2, where the cues E, e were relevant (set ℰ),
if we fix the letter by making it capital E, the relevant cues disappear.
In this case, the subject could not use E to predict reinforcement and
any cues left are irrelevant. Similarly, for irrelevant cues, taking
problem 1 as an example, where E, e are irrelevant, if we fix the letter
by making it capital E, we have removed the variations and hence we
have reduced the set of irrelevant cues by removing ℰ. These statements
assume that variations between E and e give rise to the set of
cues which may be either relevant or irrelevant or may be removed. The
fixing of the letter is not the same as the physical removal of the
letter and these latter effects are not as yet known. In this study,
if we hold a letter fixed, we assume that we have removed any cues
arising from the variations and by so doing have removed the set of
cues (𝒜 or ℰ) which were relevant in the single-cue problems.

Suppose the letter A is relevant, E is fixed, and all other cues
in the situation are irrelevant (𝒮), then in problem 4,

$$\theta_4 = \frac{m(\mathcal{A})}{m(\mathcal{A}) + m(\mathcal{S})}. \tag{6}$$

Similarly, in problem 5, if E is relevant, A is fixed, and all
other cues are irrelevant (𝒮), then,

$$\theta_5 = \frac{m(\mathcal{E})}{m(\mathcal{E}) + m(\mathcal{S})}. \tag{7}$$

In problems 4 and 5 we have the same number of relevant cues as
in problem 3. In problems 4 and 5, the relevant cues 𝒜 and ℰ are
entirely separate and different, while in problem 3, all the relevant
cues of 4 and 5 are present and relevant. Formally, r₃ = r₄ + r₅.

Because of differences in the denominator terms of the θ values in
problems 3, 4 and 5, the hypothesis of direct additivity of proportions
of relevant cues does not hold. Additivity-of-cues can be shown, but
it is additivity by a function. If we know θ₄ and θ₅, we can compute
θ₃.

Solving equations 6 and 7 for m(𝒜) and m(ℰ) respectively, we
find:

$$m(\mathcal{A}) = \frac{m(\mathcal{S})\,\theta_4}{1-\theta_4} \tag{8}$$

$$m(\mathcal{E}) = \frac{m(\mathcal{S})\,\theta_5}{1-\theta_5} \tag{9}$$

Substituting these values for m(𝒜) and m(ℰ) into equation 5,
we find:

$$\theta_3 = \frac{\dfrac{m(\mathcal{S})\,\theta_4}{1-\theta_4} + \dfrac{m(\mathcal{S})\,\theta_5}{1-\theta_5}}{\dfrac{m(\mathcal{S})\,\theta_4}{1-\theta_4} + \dfrac{m(\mathcal{S})\,\theta_5}{1-\theta_5} + m(\mathcal{S})}.$$

Cancelling m(𝒮), which is common to all terms, and simplifying,
we find our second hypothesis:

$$\text{Hypothesis 2:} \quad \theta_3 = \frac{\theta_4(1-\theta_5) + \theta_5(1-\theta_4)}{1 - \theta_4\theta_5} \tag{10}$$

The second hypothesis is that if the number of relevant cues in
the combined-cue situation (problem 3) is the same as the sum of the
relevant cues in the single-cue situations (problems 4 and 5), the
proportion of relevant cues in the single-cue problems will add by a
function to yield the proportion of relevant cues in the combined-cue
problem. Thus we can find additivity of relevant cues, r₃ = r₄ + r₅,
even in a case where the θ values do not add directly.
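
The algebra leading to Equation 10 can be checked with a short sketch. The function name `predict_theta3` and the measures 3, 1, and 6 are assumptions made for illustration; they are not values from the experiment. The check computes θ₄ and θ₅ from Equations 6 and 7, applies Equation 10, and compares the result with θ₃ computed directly from Equation 5.

```python
from fractions import Fraction


def predict_theta3(theta_4: Fraction, theta_5: Fraction) -> Fraction:
    """Equation 10: combined-cue proportion from the two fixed-letter problems."""
    denominator = 1 - theta_4 * theta_5
    if denominator == 0:
        raise ValueError("Equation 10 is undefined when theta_4 * theta_5 == 1")
    numerator = theta_4 * (1 - theta_5) + theta_5 * (1 - theta_4)
    return numerator / denominator


# Hypothetical measures, chosen only for illustration.
m_a, m_e, m_s = 3, 1, 6
theta_4 = Fraction(m_a, m_a + m_s)                     # Equation 6
theta_5 = Fraction(m_e, m_e + m_s)                     # Equation 7
theta_3_direct = Fraction(m_a + m_e, m_a + m_e + m_s)  # Equation 5

theta_3_predicted = predict_theta3(theta_4, theta_5)
print(theta_4, theta_5, theta_3_predicted, theta_3_direct)
```

Note that θ₄ + θ₅ does not equal θ₃ for these values, while the function of Equation 10 recovers θ₃ exactly, which is the distinction between the two hypotheses.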

In Restle's study (1959), he used a pattern of consonants (e.g.,
letters EXQWU) as stimuli which did not vary in form between upper and
lower case. The responses were verbal. The problems involved one
relevant cue and one irrelevant cue, or both cues relevant (as in our
problems 1, 2 & 3). The hypothesis of direct additivity of θ's
(θ₃ = θ₁ + θ₂) was used in the analysis. The relevant cues were not
found to be of differential strength with respect to influencing
performance.

In this study, the particular letter pattern used (ABDEF) retains
the same alphabetical order, but each letter is made to vary in form
between upper and lower case. Variation of the letters, which increases
the number of stimulus dimensions, is expected to increase the
amount of irrelevancy in the situation as compared with the stimulus
pattern used by Restle. We could therefore expect the problems here to
be more difficult and the learning rates to be slower than those reported
by Restle.

The letter A was chosen as one relevant letter because it appeared
first in the alphabetical order of the letters and always on the left
of the group. The letter E, falling between two irrelevant letters and
in the fourth position in the pattern of letters, was expected to be
weaker than A. Thus 𝒜 was expected to be larger than ℰ. If these
cues proved to be of differential strength, additivity of cues of
unequal strength could then be shown.

Both the stated hypotheses have been used in the analysis of
additivity-of-cues problems in several papers (Restle, 1955, 1957, &
1958) but none of these applications was to problems specifically
formulated for a detailed analysis of the additivity hypothesis. This
thesis formulates these hypotheses in set-theoretic mathematics and
analyzes data well fitted for the purpose. Further, particular attention
is given to the methods of estimating the parameter, θ, and to
statistical tests of the hypotheses of additivity of relevant cues.

METHOD

Subjects

The subjects were 80 students in elementary psychology courses
at Michigan State University who receive credit for taking part in
experiments. The subjects were tested in groups of two or three or
individually, being assigned to the experimental groups as they appeared
in an irregular, prearranged order. There were 5 experimental groups
of 16 subjects, 10 males and 6 females each.

Apparatus

From the point of view of the subject, the apparatus consisted of
a shield (about 9" x 18") with two apertures (1/2" x 3"), one on the
left and one on the right, and a lever on the left. Stimulus patterns
appeared in the left aperture, and responses were recorded by the
subject, in pencil, through the right aperture. The lever permitted
the subject, after responding, to view the reinforcement (an X or an O)
in the left window and then to obtain the next stimulus, the pace being
controlled entirely by the subject.

A standard typewriter carriage was modified so as to take two long
continuous tapes. In the front of the carriage was placed a shield with
apertures constructed on the right and left, each corresponding to one
tape. In the left aperture, in a memory drum fashion, appeared the
group of letters followed by the correct response. The carriage lever
was set at a double spacing position so that only the stimulus pattern
or the correct answer could be seen by the subject at any one time.
The tape on the right was blank and the subject recorded his response
on this tape only when the stimulus pattern appeared in the aperture
on the left. The subject could self-operate the rotation of the
carriage as one does when typewriting. This arrangement was used so
that the subject could work at his own pace and be maximally attentive
when the stimulus patterns appeared.

Procedure and Stimuli

 

Each subject was seated at a table facing the apparatus and given
a pencil to record his responses. If more than one subject was present,
cardboard dividers were used to shield the subject from view of the
other's apparatus.

The same instructions were given to all groups:

"This is what we call a 'cue identification' problem. On each
trial, or when you push the lever to the right, a group of letters,
ABDEF, will appear in the window on the left. The same letters (ABDEF)
will appear every trial, but some will be capitals and others will be
small.

"There is a 'correct' answer to each group of letters, either an
X or an O. The correct answer depends upon which letters are capital
and which are small. Your job is to write the correct answer in the
window on the right when an X or an O appears. At first you must guess,
but later you can learn by experience.

"Step 1. Write your name in the space on the right, opposite
the letters START.

"Step 2. Swing the lever once and the first stimulus or group
of letters will appear on the left.

"Step 3. Write your first guess in the window on the right.
Your guess will be either an O or an X.

"Step 4. Swing the lever once more and the correct answer will
appear on the left. READ IT AND WRITE NOTHING.

"Step 5. Swing the lever once more and a new group of letters
will appear on the left. You will write a new answer down and continue
this procedure throughout the experiment. Once more, the correct
answer depends on the group of letters and which letters are capital
and which are small. A simple principle will solve the problem; you
need not memorize each group of letters. You should be able to anticipate
the O or the X for each group of letters and find out why. You
are to work at your own pace as there is no time limit on this
experiment."

All groups were given 128 training trials. The first trial was
given as a practice trial to all subjects during the instruction
session in order to familiarize them with the procedure. This trial
was not included in the recording of errors. Omitted responses on
other trials were considered one-half error.

Five problems were prepared, all involving the same letters,
ABDEF, the letters varying between upper and lower case over trials.
The same alphabetical order of the letters was always presented. For
all groups, the letters B, D, & F were varied independently of one
another and were uncorrelated with the correct response. Stimulus
patterns were prepared by first making all combinations of the five
letters, varying between upper and lower case (n = 32). The extreme
cases where all letters were capitals or all small were omitted¹,
leaving a base of 30 patterns. For two of the groups (A and E) all
the letters were made to vary and all 30 combinations were used. For
two of the groups (A' and E') one letter was always capitalized,
leaving 15 patterns. For a third group (A+E) two letters were relevant,
again leaving 15 patterns. A complete list of all stimulus patterns,
their trial appearances and total number of errors made per pattern
is given in Table VIII.
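
The pattern counts for the groups with all letters varying and for the groups with one fixed letter can be reproduced with a short enumeration. This is a sketch under the assumption that a pattern is represented as a five-character string such as "aBdEf"; the function names are hypothetical, and the combined-cue group is deliberately left out of the sketch.

```python
from itertools import product

LETTERS = "ABDEF"


def all_patterns(letters: str = LETTERS) -> list[str]:
    """Every upper/lower-case combination of the letters, in fixed order."""
    choices = [(c.upper(), c.lower()) for c in letters]
    return ["".join(p) for p in product(*choices)]


def base_patterns() -> list[str]:
    """All combinations except the all-capital and all-small extremes."""
    return [p for p in all_patterns() if not p.isupper() and not p.islower()]


def fixed_letter_patterns(fixed: str) -> list[str]:
    """Base patterns in which the given letter always appears as a capital."""
    return [p for p in base_patterns() if fixed.upper() in p]


print(len(all_patterns()))               # all combinations
print(len(base_patterns()))              # Groups A and E
print(len(fixed_letter_patterns("E")))   # Group A' (E fixed as capital)
print(len(fixed_letter_patterns("A")))   # Group E' (A fixed as capital)
```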

Correct responses were paired to the stimulus patterns, care
being taken that the stimulus patterns appeared about equally often
and in an irregular order. The sequence of X's and O's was the same
for all groups, and was constructed to have statistical properties
like the expectations of a random sequence. That is, X and O each
appeared 64 times. The relative frequencies of X after X, X after O,
O after X, and O after O were approximately equal. The run structure
was as follows: single X's appeared 16 times; runs of two (XX) occurred
8 times; runs of three (XXX) occurred 5 times; runs of four (XXXX)
occurred 3 times; and there was one run of five (XXXXX). Exactly the
same distribution of O's was used. This is about as close as is
possible to the expected frequencies of 16, 8, 4, 2, and 1. These
frequencies add up to 57 trials. The remaining 7 trials were consumed
in one extra run of three and one of four. Any attempt a subject might
make to guess the correct answer by considering the previous sequence
of X's and O's would give him a probability of success very near 1/2.
A complete listing of the response sequence, the pattern pairings and
trial occurrences is given in Table IX.

1. These extreme patterns were quite easily recognized and rapidly
learned in previous studies using similar procedures (Restle,
personal communication, 1959).
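
The arithmetic of the run structure described above can be verified directly. The dictionaries below simply restate the run counts given in the text (run length mapped to number of runs); the variable and function names are assumptions made for this sketch.

```python
# Run length -> number of runs of that length, for one response (X or O).
runs_used = {1: 16, 2: 8, 3: 5, 4: 3, 5: 1}
runs_expected = {1: 16, 2: 8, 3: 4, 2 + 2: 2, 5: 1}


def trials_consumed(runs: dict[int, int]) -> int:
    """Total trials taken up by a set of runs: sum of length times count."""
    return sum(length * count for length, count in runs.items())


used = trials_consumed(runs_used)
expected = trials_consumed(runs_expected)
print(used, expected, used - expected)
```

The expected frequencies account for 57 trials, and the one extra run of three plus one extra run of four supply the remaining 7, giving 64 trials for each response.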

Experimental Groups

For all groups, the letters B, D, F were irrelevant and were made
to vary independently of one another between upper and lower case.

Group A had problem 1 where capital A was always followed by an
O and small a by an X. The letter E was made to vary and was irrelevant.

Group E had problem 2 where capital E was always followed by an
O and small e by an X. The letter A was made to vary and was irrelevant.

Group A+E had problem 3 where both A and E were relevant and
redundant. Small a and capital E (aE) were always followed by an O
and capital A and small e (Ae) were always followed by an X. AE and
ae did not occur.

Group A' had problem 4 which is the same as Group A except that
the letter E always appeared capitalized.

Group E' had problem 5 which is the same as Group E except that
the letter A always appeared capitalized.

Table I contains a summary of the five experimental groups.
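
The reinforcement rules for the five groups can be written as a small lookup. This is a sketch assuming a pattern is given as a string such as "aBdEf"; the function name `correct_response` and the group labels as strings are assumptions for illustration.

```python
def correct_response(pattern: str, group: str) -> str:
    """Return the correct answer ("X" or "O") for a stimulus pattern.

    Rules follow the group descriptions in the text:
      A, A'  : capital A -> O, small a -> X
      E, E'  : capital E -> O, small e -> X
      A+E    : aE -> O, Ae -> X (AE and ae did not occur)
    """
    a_capital = "A" in pattern
    e_capital = "E" in pattern
    if group in ("A", "A'"):
        return "O" if a_capital else "X"
    if group in ("E", "E'"):
        return "O" if e_capital else "X"
    if group == "A+E":
        if a_capital == e_capital:
            raise ValueError("patterns with AE or ae did not occur in Group A+E")
        return "O" if e_capital else "X"
    raise ValueError(f"unknown group: {group!r}")


print(correct_response("AbdeF", "A"))     # capital A in Group A
print(correct_response("abDef", "E"))     # small e in Group E
print(correct_response("aBdEf", "A+E"))   # aE in Group A+E
print(correct_response("ABdef", "A+E"))   # Ae in Group A+E
```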

These groups gave data of two kinds: total errors and total
weighted errors, where each error is weighted by the number of trials
of training preceding it. For example, if an individual subject made
errors on trials 3, 5, 6 and 8, his total error score would be 4 and
his weighted error score would be 2+4+5+7 = 18. The method of weighted
error scoring gives emphasis to errors made late in learning. The
mathematical formulas used in obtaining the estimates of θ from these two
kinds of scores are given in the theoretical Appendix I. Errors and
corresponding θ values for the two methods are tabled in Table VI.

TABLE I

EXPERIMENTAL GROUPS USED TO TEST ADDITIVITY-OF-CUES

                                            Correct Response
Problem   Group                             X           O
1         A    (A relevant)                 ae, aE      Ae, AE
2         E    (E relevant)                 ae, Ae      aE, AE
3         A+E  (Both A and E relevant)      Ae          aE
4         A'   (A relevant and E fixed)     aE          AE
5         E'   (E relevant and A fixed)     eA          EA
From the group means of total and weighted errors, the proportional
weights of the 𝒜 and ℰ cues could be estimated. These
estimates could then be used to test for additivity of cues in two ways:
Hypothesis 1: prediction of performance of Group A+E by direct
additivity (Groups A and E) and
Hypothesis 2: prediction of performance of Group A+E by additivity
by a function (Groups A' and E').
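
The two error scores described in this section can be sketched as follows, using the worked example from the text (errors on trials 3, 5, 6 and 8). The function names are hypothetical, and the sketch covers only the scoring itself, not the θ estimation formulas of Appendix I.

```python
def total_errors(error_trials: list[int]) -> int:
    """Total error score: the number of trials on which an error occurred."""
    return len(error_trials)


def weighted_errors(error_trials: list[int]) -> int:
    """Weighted error score: each error is weighted by the number of
    training trials preceding it, i.e. (trial number - 1)."""
    return sum(trial - 1 for trial in error_trials)


example = [3, 5, 6, 8]
print(total_errors(example), weighted_errors(example))
```

Because the weight grows with the trial number, an error on a late trial contributes more to the weighted score than an early one, which is the emphasis the text describes.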

RESULTS

The initial running of Groups A and A' gave an anomalous result
in that Group A made 21.00 mean errors whereas Group A' made 31.84 mean
errors. This difference was not statistically significant at the .05
level (t = 1.34, df = 30) but the direction of the difference was
unexpected for several reasons. Both groups were given problems with
the same relevant cue set 𝒜 and in Group A', the irrelevant letter E was
fixed. The groups had essentially the same stimulus patterns, with
Group A' having one-half the number of patterns as Group A. These
experimental conditions led strongly to the expectation that these
groups would show essentially the same performance levels or that
Group A' would learn the problem faster than Group A. As expected,
Group E' learned faster than Group E, so the unexpected result was
not repeated within the experiment. A recheck of the subject sampling
procedures, hours of day or days of the week on which the subjects of
Groups A and A' were run, and the experimental procedures, failed to
find an explanation of this difference. The anomalous nature of the
results led to a decision to rerun these two groups under the same
experimental conditions in order to see whether or not this result
would repeat itself. The second running of the groups did not show
the same results and the data from the second two groups are used in
the following analysis.

Figure 1 gives the group performance curves in terms of mean
proportion of correct responses per 8-trial block.

Insert Figure 1 Here

Inspection of Figure 1 indicates that Group A+E, having both cue sets
𝒜 and ℰ relevant and redundant, learned faster than the other four
experimental groups. Groups A' and A, both having cue set 𝒜 relevant,
learned their problems faster than Groups E' and E, which had cue set ℰ
relevant. Group A' (A relevant and E fixed) tended to learn the
problem faster than Group A (A relevant and E varying). Similarly,
Group E' learned the problem faster than Group E.

Group A+E's performance indicates the effect of additivity-of-cues.
Increasing the number of relevant cues resulted in faster learning
when Group A+E is compared with the other groups which each had one
cue relevant. The superiority in performance of Groups A' and A when
compared to that of Groups E' and E supports the expectancy that 𝒜 was
stronger than ℰ. The tendency of Group A' and Group E' to show fewer
errors than Group A and Group E, respectively, supports the hypothesis
that a reduction in the number of irrelevant cues in the stimulus
situation causes faster learning.

For statistical analysis of the differences between the groups,
the performance of each subject was summarized by the total number of
errors made on trials 2-128. Trial 1 was discarded because it was
given as a practice trial. There was a total of 6 omitted responses
and they were counted as one-half error.

An overall analysis of variance was performed on all five
experimental groups and was found to be statistically significant
(F = 6.23, p < .01). To order the differences discussed above, Tukey's
studentized range (Snedecor, 1956, pages 251-253) was performed.
Table II contains a summary of the comparisons between group means.

TABLE II

INDIVIDUAL COMPARISONS BETWEEN GROUP MEANS

                                    Mean Differences
Group   Mean Errors   X̄ - A+E    X̄ - A'     X̄ - A      X̄ - E'
E         47.44        35.72*     23.56*     19.94*     13.16
E'        34.28        22.56*     10.40       6.78      -----
A         27.50        15.78       3.62      -----      -----
A'        23.88        12.16      -----      -----      -----
A+E       11.72        -----      -----      -----      -----

*Difference statistically significant at the .05 level. In making the
ranked mean comparisons, the within mean square from the overall
analysis of variance was used to calculate the standard error of the
mean (5.28).
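
As a consistency check on Table II, the mean differences can be recomputed from the five group means. This sketch only reproduces the subtraction; it does not carry out Tukey's procedure or the significance tests. The function name `mean_differences` is an assumption for illustration.

```python
# Group mean error scores as listed in Table II.
group_means = {"E": 47.44, "E'": 34.28, "A": 27.50, "A'": 23.88, "A+E": 11.72}


def mean_differences(means: dict[str, float]) -> dict[tuple[str, str], float]:
    """Difference between every higher-ranked mean and every lower-ranked mean."""
    ordered = sorted(means.items(), key=lambda item: item[1], reverse=True)
    differences = {}
    for index, (high_name, high_mean) in enumerate(ordered):
        for low_name, low_mean in ordered[index + 1:]:
            differences[(high_name, low_name)] = round(high_mean - low_mean, 2)
    return differences


differences = mean_differences(group_means)
for (high_name, low_name), value in differences.items():
    print(f"{high_name} - {low_name}: {value:.2f}")
```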

From Table II, Group E and Group E' were found to be different
from Group A+E at the .05 level of significance. Groups A and A' were
not significantly different from Group A+E. However, Groups A and A'
tended to make more errors than the combined-cue group (A+E). It was
predicted that Group E would be slower than Group E' and that Group A
would be slower than Group A'. Both these differences were in the
predicted direction, but they did not reach the .05 level of significance.
It is apparent from these data that 𝒜 was a larger set than
ℰ, especially since the combined-cue problem, Group A+E, was not
demonstrably different from 𝒜 problems, Groups A and A', but was
different from ℰ problems, Groups E and E'.

It will be noted in Figure 1 that all groups, except Group A+E,
failed to reach 100% performance at the end of 128 trials. Plots of
the distributions of total error scores indicated the presence of both
learners and non-learners among the subjects in Groups A, E, A' and E'.
In Group E, 10 subjects showed essentially no learning. In Group A,
5 subjects; in Group E', 6 subjects; and in Group A', 1 subject showed
the same result. Groups E and E', with the weaker set of relevant cues,
e, had mostly non-learners with scores near 64, and a scattering of
learners, producing a distribution which appeared negatively skewed.
Groups A and A', which had the stronger set of relevant cues, a, had
a large number of learners with low error scores and a fair number of
non-learners whose scores distributed themselves near 64 errors in 128
trials, producing a bimodal distribution. Group A+E had all learners,
mostly with low error scores, yielding a positively skewed
distribution. Such differences suggest possible sources of difficulty
for any analysis, especially estimation of learning rates.

Estimation of Learning Rates

 

In order to make and test the hypothesized predictions of additivity
of learning rates, estimates of the θ values for each of the five
experimental groups were obtained. Theoretically, the prediction is
exact for a single subject, but since the three estimates necessary to
show additivity cannot be obtained on a single subject, we use the mean
error score for each group. Two estimation methods were employed on
the data for each group, one based on total errors starting with trial
2, and the other based on weighted errors, where each error is weighted
by the number of trials of training which precede it. The second method
gives emphasis to late errors and has some of the advantages of a
trials-to-criterion score, which could not be used in this study. The
particular equations used in estimating θ by these two methods are
given in the theoretical Appendix I and are tabled with errors and
corresponding θ values in Table VI.

Table III contains the θ estimates obtained for each of the five
experimental groups by the two estimation methods.

 

 

 


 

TABLE III

θ ESTIMATES FOR THE FIVE EXPERIMENTAL GROUPS AS
OBTAINED BY THE MEAN TOTAL AND WEIGHTED ERROR SCORES

                   Source of Estimate
Group   Mean Total Errors   Mean Weighted Errors
A+E           .100                  .070
A'            .060                  .050
A             .056                  .046
E'            .046                  .039
E             .033                  .029

 

From Table III, the estimates of θ based on weighted errors are
consistently smaller than the estimates based on total errors. Such
a discrepancy strongly suggests that the learning curves found did not
have the shape stated by the theory. Since the estimates were based
on group averages, the discrepancy might mean that individual differences
were sufficient to distort the shape of the learning curve, or that
the individual subjects did not distribute their errors as the theory
predicts. It will be recalled that the weighted error score places
emphasis on errors made late in learning. Since the estimates of θ
are low when weighted errors are used, it appears that the groups made
disproportionately more errors late in learning, compared with the
theoretical expectations.

If a group is composed of both learners and non-learners, the
average learning curve will reflect some learning, but even very late
in training, the non-learners will be producing numerous errors. The
result would be that the group average curve is flat compared with the
individual performance curves, with disproportionately many late errors.
Inspection of the data supports the idea that there are large differences
between subjects.

It is possible to test whether the individual subjects' learning
curves were of the shape predicted by the theory, in the following
way. If each subject distributes his errors appropriately, then the
two estimates of θ, from total and weighted errors, should be about
the same for each subject. This hypothesis was tested by calculating
the individual θ values for each subject by both methods. An overall
sign test was then run between the two sets of estimates. It was found
that the individual weighted error estimates were consistently smaller
than the individual total error estimates (z = 3.91, p = .0001).
Although these differences were statistically significant, they were
not large in most cases. Of the 80 comparisons of estimates of θ,
70% of the discrepancies were less than .015. Restle's (1959) estimates
were quite comparable and he computed the average of the two in his
analysis. Because of the discrepancies encountered in our estimates,
such a procedure is not justified in this study. Therefore any further
computations involving θ values had to consider the two estimates of
θ separately.
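
The sign test on the paired estimates can be sketched with the usual normal approximation and a continuity correction. The thesis does not report how many of the 80 comparisons fell in each direction or whether a continuity correction was applied, so both the correction and the 58-to-22 split below are assumptions chosen purely for illustration; that hypothetical split happens to give a z close to the reported 3.91. The function names are also illustrative.

```python
import math


def sign_test_z(n_plus, n_minus):
    """Normal-approximation z for a sign test, with continuity correction."""
    n = n_plus + n_minus  # ties are assumed to have been dropped already
    if n == 0:
        raise ValueError("at least one untied comparison is required")
    deviation = abs(n_plus - n / 2) - 0.5
    return max(deviation, 0.0) / math.sqrt(n / 4)


def two_tailed_p(z):
    """Two-tailed probability of a standard normal deviate at least |z|."""
    return math.erfc(abs(z) / math.sqrt(2))


# HYPOTHETICAL split of 80 comparisons; not reported in the thesis.
z = sign_test_z(58, 22)
print(f"z = {z:.2f}, two-tailed p = {two_tailed_p(z):.5f}")
```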

Additivity-of-Cues

 

If cues a and e are additive, then θ₁ and θ₂ should add directly
to yield θ₃ (Hypothesis 1). Similarly, if cues a and e are additive
in the case where the θ values do not add directly (because the
amount of irrelevant cues has been reduced in the stimulus situation),
θ₄ and θ₅ should add by the described function of

$$\frac{\theta_4(1-\theta_5) + \theta_5(1-\theta_4)}{1-\theta_4\theta_5}$$

to yield θ₃ (Hypothesis 2).
By making use of the estimates of θ for each group obtained from
the two estimation methods (Table III), the above formulas were
calculated and compared with the observed learning rates in Group A+E.
Tables IV and V contain the mean errors in 128 trials and the
proportion of relevant cues (θ) for Group A+E based on the estimates
obtained by the two methods of estimation and the hypothesized
predictions.
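
As a rough check on the two hypotheses, the sketch below applies direct addition to the Table III estimates for Groups A and E, and the additivity function to the estimates for Groups A' and E'. The function names and dictionary layout are illustrative only. Because the inputs are the rounded three-decimal values from Table III, the direct-addition result for total errors comes out at .089 rather than the .088 given in Table IV, possibly because the thesis worked from unrounded estimates.

```python
def direct_additivity(theta_1, theta_2):
    """Hypothesis 1: the proportions of relevant cues add directly."""
    return theta_1 + theta_2


def additivity_by_function(theta_4, theta_5):
    """Hypothesis 2: combine the proportions by the additivity function."""
    numerator = theta_4 * (1 - theta_5) + theta_5 * (1 - theta_4)
    return numerator / (1 - theta_4 * theta_5)


# Rounded theta estimates copied from Table III.
TABLE_III = {
    "total errors": {"A": 0.056, "E": 0.033, "A'": 0.060, "E'": 0.046, "A+E": 0.100},
    "weighted errors": {"A": 0.046, "E": 0.029, "A'": 0.050, "E'": 0.039, "A+E": 0.070},
}

PREDICTIONS = {}
for method, theta in TABLE_III.items():
    PREDICTIONS[method] = {
        "hypothesis 1": direct_additivity(theta["A"], theta["E"]),
        "hypothesis 2": additivity_by_function(theta["A'"], theta["E'"]),
    }
    print(f"{method}: observed A+E = {theta['A+E']:.3f}")
    for name, value in PREDICTIONS[method].items():
        print(f"  {name}: predicted = {value:.3f}")
```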

TABLE IV

MEAN TOTAL ERRORS IN 128 TRIALS AND PROPORTION OF
RELEVANT CUES (θ) FOR GROUP A+E

                            Mean Total
Source              θ₃        Errors       t       p
Observed A+E       .100       11.72      ----    ----
Pred. θ₁ + θ₂      .088       14.57      0.81    >.05
Pred. f(θ₄,θ₅)     .100       11.72      0.00    >.05
TABLE V

MEAN WEIGHTED ERRORS IN 128 TRIALS AND PROPORTION
OF RELEVANT CUES (θ) FOR GROUP A+E

                          Mean Weighted
Source              θ₃        Errors        t       p
Observed A+E       .070       495.34      -----   -----
Pred. θ₁ + θ₂      .075       428.00      0.35    >.05

 

 

The four predictions of θ₃ shown in Tables IV and V were tested
by converting θ₃ into a corresponding expected mean error score or
mean weighted error score. The predicted mean error score or mean
weighted error score was then treated as a fixed value, as the variance
of the predicted score was unknown. Each predicted score was then
tested against the observed error score for Group A+E by means of a t
test using the variance of the mean of Group A+E. Any failure to reject
the null hypothesis of no difference between
the predicted mean error score and the observed mean error score is
favorable to the theory. Therefore, this t test seemed more
conservative, as it would be more likely to reject the null hypothesis
than a t test where the two variances are unknown but presumed equal.
The results indicated a failure to reject the null hypothesis in each
case, and the predictions are all supported. Similarly, these predictions
were tested by means of the binomial test, where the number of cases
falling above and below the predicted score was hypothesized to be
equal. In all four cases, these tests were statistically
non-significant at the .05 level, and supportive of the predictions.
Since this test assumes the predicted median to be fixed, whereas in
fact it is a random variable depending on sampling variations in the
data of groups used to make predictions, it is also overly stringent.

Hypothesis 1, the prediction of direct additivity of cues,
θ₃ = θ₁ + θ₂, is fairly accurate in the predictions made using estimates
obtained by both total and weighted error methods. In the first method
(total errors), the tendency is to underestimate the learning rate of
Group A+E. By the second method (weighted errors), the prediction tends
to overestimate the observed value. When the predicted θ values were
converted to mean error scores and tested against the observed mean
error score for Group A+E, the resulting differences were found to be
statistically non-significant in both cases.

Thus, if the total number of cues is the same in the three problems,
the proportions of relevant cues will also add. Knowing the proportions
of relevant cues in two single-cue problems, it is possible to predict
the proportion of relevant cues in a third problem which contains the
sum of the relevant cues in the first two problems. In addition, it
is possible to predict additivity-of-cues of unequal strength in the
combined-cue problem.

Hypothesis 2, the prediction of additivity-of-cues by a function,

$$\theta_3 = \frac{\theta_4(1-\theta_5) + \theta_5(1-\theta_4)}{1-\theta_4\theta_5},$$

is perfect when mean total error estimates are used. In the case where
mean weighted error estimates are used, the prediction tends to
overestimate the observed value of Group A+E. By the results obtained
using both total and weighted error estimates of θ, we can demonstrate
additivity of relevant cues, r₃ = r₄ + r₅, even in a situation where
the θ values do not add directly because of differences in the
denominator values of the θ's or because of the different strengths of
the relevant cues.

The peculiarities of the differences between the two estimation methods
are again reflected in the predictions. When mean total errors are used,
the predictions tend to underestimate slightly or are perfect. When
mean weighted errors are used, the predictions tend to overestimate the
observed values. Again the suspicion is that individual subjects did
not distribute their errors as the theory predicts. However, the
discrepancy may also be inherent in the methods of estimation. From the
differences observed in Table III, it appears that the subjects in
Group A+E may have made disproportionately more errors late in learning
compared with the other four experimental groups. The difference
between the observed θ values by the two methods for Group A+E is
larger than those in the other groups. The exact reason for these
discrepancies remains unknown at present.

As individual differences were present in terms of performance
in the experimental groups, a further test of the prediction of
additivity of relevant cues was carried out. Using the same two
hypotheses of additivity, the question arose as to whether or not the
cumulative distribution of learning rates in the combined-cue situation
could be predicted. The assumption was made that there was a high
positive correlation as to rank position in the different experimental
groups. Acting upon this assumption, the individual θ values were
ranked from the lowest to the highest in each group. The matched
rank θ values for Groups A and E were then added directly to yield a
predicted distribution (Hypothesis 1). Similarly, the matched rank θ
values for Groups A' and E' were combined by the described additivity
function to yield a second predicted distribution (Hypothesis 2).
These calculations were carried out using both total and weighted error
estimates of θ. The four resulting cumulative distributions were
then tested against the observed cumulative distributions for Group
A+E by means of the Kolmogorov-Smirnov cumulative distribution test
for two observed distributions.
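
The matched-rank procedure and the two-sample comparison can be sketched as follows. The θ values below are hypothetical four-subject groups invented for illustration (the experiment had 16 subjects per group, and the individual estimates are not listed in this section), and the function names are likewise illustrative. The statistic computed is the largest difference between the two cumulative frequency counts, which is the quantity compared with a tabled critical value when both samples are the same size.

```python
def matched_rank_prediction(group_x, group_y, combine):
    """Sort both groups and combine the values that share a rank."""
    if len(group_x) != len(group_y):
        raise ValueError("matched ranks need groups of equal size")
    return [combine(x, y) for x, y in zip(sorted(group_x), sorted(group_y))]


def max_cumulative_count_difference(sample_1, sample_2):
    """Largest absolute gap between the two cumulative frequency counts."""
    cut_points = sorted(set(sample_1) | set(sample_2))
    return max(
        abs(sum(v <= c for v in sample_1) - sum(v <= c for v in sample_2))
        for c in cut_points
    )


# HYPOTHETICAL individual theta estimates, for illustration only.
theta_a = [0.02, 0.10, 0.05, 0.08]
theta_e = [0.01, 0.04, 0.03, 0.02]
observed_combined = [0.04, 0.06, 0.12, 0.20]

# Hypothesis 1: add the matched-rank values directly.
predicted = matched_rank_prediction(theta_a, theta_e, lambda x, y: x + y)
max_gap = max_cumulative_count_difference(predicted, observed_combined)
print("predicted thetas:", [round(p, 2) for p in predicted])
print("maximum cumulative count difference:", max_gap)
```

For Hypothesis 2 the same helper would be called with the additivity function in place of simple addition.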

Hypothesis 1, θ₁ + θ₂ = θ₃, was accurate for both total and
weighted error estimates of θ (the maximum difference observed was equal
to 5, whereas for a two-tailed test, p = .05, a maximum difference of
8 is necessary for rejection of the hypothesis).

Hypothesis 2, the hypothesis of additivity by a function of θ₄
and θ₅ to predict θ₃, was also accurate for both total and weighted
error estimates of θ (the maximum difference observed for total error
θ's was 4 and for weighted error θ's was 3).

Thus in all four cases, the prediction of additivity of cues by
additivity of the proportions of relevant cues was accurate in
predicting the cumulative distribution of the combined-cue problem.

The prediction of the whole distribution, despite the extra
assumption of positive correlation, seemed appropriate since the
prediction takes into account the large variance of the predictor
groups, A & E and A' & E'. The predictions using means disregarded the
fact that the predictions themselves are uncertain and are based on the
means of groups with skewed or bimodal distributions. On the other
hand, the Kolmogorov-Smirnov test is weak against any particular
difference between the distributions compared, making it likely that one
would commit a Type II error favorable to the theory.

As the Kolmogorov-Smirnov test is weak against any particular
difference between the distributions compared, it seemed desirable to
make use of a more powerful test. In the tests of the mean score
predictions (Tables IV and V), the variance of the predicted score
was noted to be unknown. In this case, we have a predicted
distribution of scores and a variance. It is possible to test differences
between the predicted and observed distributions by means of a
parametric t test where the variances are unknown but presumed equal.

Hypothesis 1, θ₃ = θ₁ + θ₂, was again found to be accurate for
both total and weighted error estimates of θ. For the total error
prediction, t = 1.11, df = 30, with .30 > p > .20. For the weighted
error prediction, t = 0.86, df = 30, with .50 > p > .40.

Hypothesis 2, f(θ₄,θ₅) = θ₃, was also found to be accurate. For
the total error prediction, t = 1.16, df = 30, with .30 > p > .20. For
the weighted error prediction, t = .74, df = 30, with .50 > p > .40.

Thus in all four cases, the prediction of additivity-of-cues by
the additivity of individual proportions of relevant cues was accurate
in predicting the distribution of learning rates of the combined-cue
problem. In these tests, it was shown that the mean of the distribution,
taking individual differences into account, could be accurately
predicted.

The median, as another measure of central tendency, could also be
predicted and tested by use of a binomial test. For each predicted
distribution and corresponding observed distribution, the median was
found and a four-fold classification of subjects falling above and
below the median value for the predictor group and the observed group
(A+E) was constructed. A chi-square test, corrected for continuity, was
performed on the four predictions.

In all four predictions, χ² was found to be equal to 1.12, with
1 df and .30 > p > .20. Again, the hypotheses of additivity are not
rejected.

Thus by making an additional assumption of a positive correlation
as to subject position in the predictor and predicted groups, it was
possible to demonstrate the accuracy of the hypothesis of additivity
of cues by predicting the cumulative distribution, mean and median of
the combined-cue group.

The results may be summarized with respect to four points:

First, the general order of the means was as expected: Group A+E
was best, indicating some form of additivity; the letter A produced a
larger measure of cues than the letter E; and reducing the measure of
irrelevant cues had a small beneficial effect on learning.

Second, the two methods of estimating θ yielded discrepant results
suggesting that neither the group nor the individual learning curves
were of the shape predicted by the theory. It is possible that such
discrepancies are inherent in the estimation methods but the exact
nature of the discrepancies is at present unknown.

Third, the quantitative predictions of additivity-of-cues were
fairly accurate. The distributions of scores and the unknown
distribution of θ caused difficulties in making tests, but our most
stringent parametric test failed to indicate that the hypotheses of
additivity must be rejected.

Fourth, by making an additional assumption as to subject position
in the distributions it was possible to make accurate predictions of
the cumulative distribution, mean and median of the combined-cue

group, taking into account individual differences in learning rate.

DISCUSSION

In this experiment, the difficulties encountered in finding
comparable estimates of θ by the two methods (total and weighted errors)
suggested by the theory forced the making of predictions of additivity
with separate θ values. An examination of the discrepancies between
values of θ estimated from the individual total and weighted error
scores reveals some peculiarities inherent in the methods. Figure 2
shows a scatter-plot of individual total and weighted error estimates
of θ.

Insert Figure 2 Here

Inspection of Figure 2 indicates that the two estimates of θ are
very close for low rates of learning (θ < .10). The discrepancies
become larger as θ increases. The observed discrepancy distribution was
positively skewed, with 70% of the differences being very small (from
-.009 to .015). Discrepancies larger than .015 (n = 23) and the
corresponding error scores showed a negative correlation of -.58,
p < .01. The range of the discrepancies was from .016 to .280, while the
corresponding range of errors was from 12 to 1. Thus, the discrepancy
between the two estimation methods is largest for fast learners.

A discrepancy between two estimates of the same parameter may
have any of several causes: the most important possibilities are (a)
one or both of the estimates may be biased, or (b) the data may not
conform to the model. If alternative (b) is the case, the model should
be rejected. To decide this, it is necessary to eliminate alternative
(a), that the estimates are biased.

When estimates are computed on the group mean total errors or the
group mean weighted errors, serious discrepancies can arise from
individual differences among subjects in the parameter θ. As is
mentioned above, the average learning curve of a heterogeneous group
will tend to be flat, some subjects making almost no errors and others
making errors late as well as early in learning. The weighted-error
method yields a low θ (indicating slow learning because of the numerous
late errors) and the total-error method yields a higher θ (indicating
that the group curve is above p = .50 during most of the problem).
However, it was also found that estimates based on individual subjects'
data were discrepant, and this cannot be explained by effects of
averaging over heterogeneous groups.

The largest individual discrepancies occurred for large values
of θ. It is possible that such discrepancies arise, not because of
any psychological phenomenon, but merely as a statistical artifact.
This possibility was given a preliminary test by a Monte-Carlo
computation as follows. The theoretical error curve, (1 - p(n)), was
computed with θ = .25. Using these values and a table of random
numbers, data for 80 hypothetical subjects were constructed. These
data conform exactly to the theoretical curve of learning. The two
methods of estimating parameters, through total errors and weighted
errors, were used on the data of each individual hypothetical subject.
If the methods of estimation behave in a desirable fashion, the
estimated values of θ should be about .25.
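
A Monte-Carlo computation of this general kind can be sketched in a few lines. The sketch is not a reconstruction of the thesis's procedure: the error curve 0.5(1 - θ)^(n-1) is a simple stand-in assumed here in place of the theoretical (1 - p(n)), the two estimators are moment estimators derived from that assumed curve rather than the Appendix I equations, and the 128-trial length, the seed, and all names are illustrative choices.

```python
import math
import random

TRUE_THETA = 0.25  # value used in the thesis's Monte-Carlo computation
N_SUBJECTS = 80    # number of hypothetical subjects in the thesis
N_TRIALS = 128     # ASSUMPTION: same training length as the experiment


def error_probability(n, theta):
    """ASSUMED stand-in error curve; not the thesis's 1 - p(n)."""
    return 0.5 * (1.0 - theta) ** (n - 1)


def simulate_subject(rng, theta=TRUE_THETA, n_trials=N_TRIALS):
    """Return the 1-based trial numbers on which one simulated subject errs."""
    return [n for n in range(1, n_trials + 1)
            if rng.random() < error_probability(n, theta)]


def theta_from_total_errors(error_trials):
    """Moment estimate from total errors T, using E[T] ~ 0.5 / theta."""
    total = len(error_trials)
    # With no errors the estimate is capped at 1 (an arbitrary convention).
    return min(1.0, 0.5 / total) if total else 1.0


def theta_from_weighted_errors(error_trials):
    """Moment estimate from W = sum of (n - 1) over error trials.

    Under the assumed curve, E[W] ~ 0.5 * (1 - theta) / theta**2.
    """
    weighted = sum(n - 1 for n in error_trials)
    if weighted == 0:
        return 1.0
    # Positive root of 2*W*theta**2 + theta - 1 = 0.
    return (math.sqrt(1.0 + 8.0 * weighted) - 1.0) / (4.0 * weighted)


rng = random.Random(1959)  # fixed seed so the run is repeatable
subjects = [simulate_subject(rng) for _ in range(N_SUBJECTS)]
total_estimates = [theta_from_total_errors(s) for s in subjects]
weighted_estimates = [theta_from_weighted_errors(s) for s in subjects]

print("mean estimate, total errors:   ", sum(total_estimates) / N_SUBJECTS)
print("mean estimate, weighted errors:", sum(weighted_estimates) / N_SUBJECTS)
```

Comparing the two means with the true value of .25, and comparing the paired estimates subject by subject, gives an empirical picture of how each estimator behaves on single-subject data under the assumed curve.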

It was found that the estimates were quite variable, with values
centering in the region of .25. Figure 3 contains a scatter-plot of
θ̂_E (θ estimates based on total errors) and θ̂_W (θ estimates based on
weighted errors). Inspection of Figure 3 indicates that the
discrepancies are larger with a higher value of θ, but the discrepancies
are in the opposite direction to those observed in our data. In the
hypothetical procedure, it was found that θ̂_W tended to be larger than
θ̂_E (z = 3.15, p = .0016). This difference is opposite to that observed
in our data. However, the discrepancies between the two estimated
values, (θ̂_E - θ̂_W), yielded a distribution of the same general shape
as that in Figure 2, but the distribution is shifted to the negative
side of the scale. The result strongly suggests that when parameters
are estimated for individual subjects (i.e., with sample size 1), the
estimates are biased, and the bias is different for the two methods of
estimation.1 The Monte-Carlo computation does not explain the
discrepancy found, but suggests that it may well be due to estimation
rather than to a serious discrepancy between theory and facts. The
Monte-Carlo procedure does offer a fruitful method for generating
empirical distributions of parameters such as θ and for exploring the
behavior of such discrepancies further.

1. An estimate is said to be biased if the average of a great many
estimates, each based on a finite sample, fails to converge to the
true value of the parameter. This is distinct from inconsistency, where
a single estimate based on a very large (infinite) sample fails to
converge to the true value. Many methods of estimation commonly used
in statistics are biased, and this fault is usually considered less
important than inconsistency or than inefficiency. In this as in most
learning models (Bush & Mosteller, 1955), estimates are made by the
best method available, but may be quite imperfect. Technically, both
of the methods of estimation used in this thesis are extensions of the
"method of moments" and are closely related to methods in common use
in learning theory. Severe computational difficulties accompany attempts
to use the more desirable method of maximum likelihood.

The foregoing analysis helps to clarify the overestimation of the
rate of learning of Group A+E, when the prediction was made by
weighted-error θ values of the single-cue groups. Group A+E contained
all learners, most of whom solved the problem early in training. Groups
having one-cue problems contained learners and non-learners. It has
been noted that the larger discrepancies between the two types of
estimated values (θ̂_E - θ̂_W) occurred when the rate of learning was
faster. Secondly, Table III shows that for Group A+E, the difference
between the two estimates of θ was larger than the differences for the
other four experimental groups. When the additivity computations were
carried out, it is apparent that these discrepancies remained in the
prediction. It appears reasonable that the discrepancy may be due to
biases in the estimation methods.

The present experiment yielded a quantitative test of the concept
of additivity-of-cues used in several recent papers (Restle, 1955, 1957,
and 1958). Two kinds of additivity of the proportions of relevant
cues were shown, one direct and the second by a function. The
predictions were formulated, a priori, in terms of set-theoretic
mathematics and tested on cues arising from separate letters in a
stimulus pattern. The predictions were found to be reasonably accurate
in all cases.

One or both of these hypotheses of additivity have been applied
in analyses of experiments involving infra-human subjects (Restle,
1955, 1957 and 1958). The first hypothesis of additivity (direct
additivity of θ's from separate-cue groups to predict the θ of a
combined-cue group) was tested against data from "place and response"
experiments on rats in a T-maze (the data of Galanter & Shaw, 1954;
Scharlock, 1955; Blodgett et al., 1949; all in Restle, 1957). The
second hypothesis of additivity (additivity of θ's from separate-cue
groups by a function to predict the θ of a combined-cue group) has
been applied in analysis of color, form and size discrimination
learning by monkeys (the data of Warren, 1953; in Restle, 1958) and
T-maze learning of rats (part of the data of Scharlock, 1955; and
Eninger, 1952; in Restle, 1957 and 1955). These earlier tests of the
theory were a posteriori and based on data not specifically designed for
the testing of the hypothesis of additivity-of-cues. Therefore, any
failure of these predictions, involving data designed for the purpose
of testing them, would have led to serious doubts concerning the
adequacy of the theory as an analytical tool. However, the positive
results obtained here fail to indicate that the hypothesis of additivity
should be rejected and support the use of the theory as a means of
quantifying and analyzing data from suitable learning experiments.

In a study similar to the present one, involving human subjects,
Restle (1959) used the direct additivity hypothesis to predict
performance of a combined-cue group. The relevant cues used were letters
in a pattern of consonants. The letters were all capitalized and
thus did not change in form over trials. In the present study, varying
the relevant and irrelevant letters between upper and lower case was
intended to increase the number of stimulus dimensions that the
subject had to discriminate in order to solve the problem. This increase
of irrelevant cues through the use of changing letter forms was expected
to reduce the rate of learning. A comparison of the estimated
proportions of relevant cues (θ's) obtained by Restle with those of this
study shows the effects of increasing the number of irrelevant cues.
The estimates reported by Restle are consistently larger than those
obtained here.

Further, Restle did not find the relevant cues in his study to
be of unequal strength, whereas our analysis does. The demonstration
of additivity of unequal cues is of interest. One could not reasonably
expect relevant cues in problems of discrimination learning to be
always of equal strength. For example, Warren (1953) found the visual
component of color to be the dominant cue over form and size in the
discrimination learning of monkeys. Our analysis indicates that cues
of unequal strength can be adequately handled in a quantitative analysis.

A second way of analyzing the additivity result is by
configurations or stimulus patterns (Restle, 1959). Since Problems A',
E' and A+E all had 15 patterns, they should be learned at the same rate,
according to the pattern hypothesis. In Figure 1, it is shown that
E' was slower than A' and that both E' and A' are slower than A+E in
learning their problems. Also, since Problems A and E are both
30-pattern problems, they should be learned twice as slowly as Problems
A', E' and A+E. Inspection of Figure 1 shows that Group A+E reached
90% correct in about 28 trials, while Groups A' and A reached the same
level of performance in 80 and 128 trials, respectively. Groups E' and
E never reached 90% correct, but are at about 82% and 73% correct at
the end of 128 trials of training. These findings in no way support
the hypothesis that Problems A+E, A' and E' are learned just twice as
fast as Problems A and E.

The failure of the configurational hypothesis may be partly due
to the existence of unequal cues in the stimulus situation. Such a
hypothesis does not take into account cues of differential strength
but deals solely with the number of patterns. The use of a theory
which considers both the number and strength of relevant cues, as does
the one used in this study, leads to a more accurate prediction of
performance.

SUMMARY

To test additivity-of-cues, a two-choice discrimination learning
problem was used. Five groups of 16 human subjects each were tested
on separate problems. Two problems had one cue relevant and one cue
irrelevant. A second two problems had one cue relevant, but the measure
of the irrelevant cues was reduced. The remaining problem had both
cues relevant.

The stimuli were patterns of letters which had a fixed alphabetical
order but varied in form between upper and lower case. The response
was written when the subject saw a pattern of letters. The correct
answer (an X or O) was predictable from the pattern of letters by a
consistent principle.

Two hypotheses of additivity were formulated and tested: direct
additivity (prediction of the proportion of relevant cues in a
combined-cue problem by addition of the proportions of relevant cues in
two single-cue problems) and additivity by a derived function (prediction
of the proportion of relevant cues in a combined-cue problem by a
function of the proportions of relevant cues in two single-cue problems,
where the measure of the irrelevant cues was reduced).

The results may be summarized as to four findings:

1. The combined-cue group showed faster learning than the
single-cue groups, indicating some form of additivity. One of the
relevant cues was found to be stronger than the other, and the reduction
of the measure of irrelevant cues through the fixing of a letter had a
small beneficial effect on learning.


2. The two methods of estimating the learning rate parameter,
θ, yielded discrepant results, indicating that neither the group nor
the individual learning curves were of the shape predicted by the
theory. An analysis of the discrepancies between the two methods of
estimation suggested a bias inherent in the methods.

3. The predictions of the mean error scores by both hypotheses
of additivity were found to be accurate in all cases. Statistical
tests of these hypotheses failed to indicate that they should be
rejected.

4. To account for individual differences in learning rates, an
assumption of a high positive correlation as to subject position in
the groups was made. Predictions of the distribution of rates in the
combined-cue problem by the application of the hypotheses of additivity
to the matched rank values in the single-cue groups were made. These
predictions were found to be accurate for the cumulative distribution
of learning rates, the mean and the median of the combined-cue group.

The discrepancies between the two methods of estimating learning
rates were discussed. Inspection of the discrepancies indicated that
the discrepancies were larger with faster rates of learning. A Monte-
Carlo procedure was used to test this difference but did not clearly
indicate the nature of the discrepancies. It is suggested that such
a procedure would be fruitful for investigations of the behavior of
the learning rate parameter, θ, and its theoretical distribution.

A second quantitative analysis, based on the number of stimulus
patterns in the problems was discussed and found not to be consistent

with the additivity-of-cues data.

APPENDICES

APPENDIX I
RESTLE'S THEORY OF DISCRIMINATION LEARNING

Recent mathematical formulations of learning (Bush & Mosteller,
1951 & 1950; Estes, 1950; and Restle, 1955, 1957) have described the
stimulus situation as a set of elements, each of which is conditioned
to (i.e., tends to evoke) exactly one response at a given time. During
learning, if a certain response, A1, is reinforced, a cue may switch
and become newly conditioned to A1. The probability of such a change
is the rate of learning parameter, θ.

The major points made in the two-choice discrimination theory are
as follows:

1. The stimulus situation is represented by a set of discriminable
aspects called "cues";

2. A cue may be "conditioned" to either response;

3. A cue may be "adapted" and rendered nonfunctional during
learning;

4. The probability of a response is the proportion of the
unadapted cues conditioned to it.

Stating these assumptions quantitatively shows that under certain
limiting experimental conditions they lead to a process similar to that
in Estes' theory and, in particular, with the same asymptotes.

 

1. This appendix is largely a restatement of Restle's two-choice dis-
crimination theory as developed in two recent papers (1955 and 1957).
It is included primarily as a direct reference for those who are
interested in the mathematical and logical development involved in the
estimation of learning rates. Additions to the theory, in the form of
definitions, examples and formulas, have been made to help clarify the
theoretical formulation to the new reader.

Theory
A) Set of cues

The stimulus situation in two-choice learning experiments is
represented by a set of discriminable aspects called cues, k, k', k"...
The set of cues is called K and the number of cues is N. A subset of
these cues may correspond to anything to which the subject can learn
to make a differential response. Such a definition assumes that the
subject has the capacity to learn a differential response. An
individual cue is thought of as "indivisible" in the sense that different
responses cannot be learned to different parts of it. The term "cue"
is also used to refer to any set of cues, all of which are manipulated
in the same way during a whole experiment.

Every individual cue may be thought of as either "relevant" or
"irrelevant". A cue is relevant if it can be used by the subject to
predict where or how reward is to be obtained. For example, Von Frisch
(1955) gives summaries of experiments on determining the chemical and
color senses of the bee. In one case, bees were trained to feed only
on a card colored blue. This training was accomplished by rewarding
the bees with sugar-water only on a blue-colored card. Thus, cues
which were aroused by the color blue are relevant. The position
of the cards was randomized, so that the bees could not gain reward
from the use of position cues. Cues from position are therefore
uncorrelated with reward, or irrelevant. In the two-choice experiments
testing this theory, the subject has just two choice responses and no
other activities are considered.

B) Conditioning of Cues
We assume that a cue is conditioned to one or the other response
alternative at any time on an all-or-none basis. The probability that
cue k is conditioned to response A1 at trial n is called F(k,n). If a
cue is conditioned to A2 and then A1 is reinforced, it may switch over
and become conditioned to A1. The probability of such a switch is a
constant called θ, the "rate of learning" parameter. On this
assumption, we get the following equation of change of F(k,n):
1) if A1 is reinforced on trial n (k occurring on every trial),

$$F(k,n+1) = F(k,n)(1-\theta) + \theta \qquad (1)$$

This equation may be solved by the linear difference equation method,
where π is 1, i.e., consistent reinforcement, giving

$$F(k,n) = 1 - (1 - F(k,1))(1-\theta)^{n-1} \qquad (2)$$
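
The agreement between the trial-by-trial rule of equation 1 and the
closed-form solution of equation 2 can be checked numerically. What
follows is a minimal Python sketch and not part of Restle's papers; the
function names and the example values (a starting probability of .5 and
a learning rate of .1) are assumptions chosen only for illustration.

```python
def conditioning_recurrence(f1, theta, n):
    """Apply F(k, n+1) = F(k, n)(1 - theta) + theta from trial 1 up to trial n."""
    f = f1
    for _ in range(n - 1):
        f = f * (1.0 - theta) + theta
    return f


def conditioning_closed_form(f1, theta, n):
    """Closed-form solution: F(k, n) = 1 - (1 - F(k, 1))(1 - theta)^(n - 1)."""
    return 1.0 - (1.0 - f1) * (1.0 - theta) ** (n - 1)


if __name__ == "__main__":
    # Hypothetical values: naive subject (F = .5 on trial 1), theta = .1.
    for trial in (1, 2, 5, 10, 50):
        print(trial,
              conditioning_recurrence(0.5, 0.1, trial),
              conditioning_closed_form(0.5, 0.1, trial))
```

The two columns printed should agree to within rounding error on every
trial, and both approach 1 as the number of trials grows.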
C) Adaptation of cues
During learning an irrelevant cue may become "adapted" and lose
its effect on a response. An adapted cue is one which the subject does
not consider in deciding upon his choice response. If a cue is thought
of as a "possible solution" to the problem, an adapted cue is a possible
solution which the subject rejects or ignores. Different cues have
different probabilities of being adapted. If cue k is not adapted by
the beginning of trial n, and it is irrelevant, the probability that
it will be adapted by trial n+1 is θ. This consideration gives us an
equation for a(k,n), the probability that cue k is adapted at the
beginning of trial n, as follows,

$$a(k,n+1) = a(k,n)(1-\theta) + \theta \qquad (3)$$

Again, solving equation 3 by use of the general linear difference
solution, we find,

$$a(k,n) = 1 - (1 - a(k,1))(1-\theta)^{n-1} \qquad (4)$$

It will be noted that the same constant, θ, appears in both
equations 2 and 4. The fundamental simplifying assumption of this
theory deals with θ, where θ is defined as a constant proportion of
unconditioned relevant cues which become conditioned on each trial of
a given problem. This assumption is that:

$$\theta = \frac{r}{r+i}$$

where r is the number of relevant cues in the problem and i is the
number of irrelevant cues. Thus, θ is defined as the proportion of
relevant cues in the problem. This proportion is set equal to the
fraction of unconditioned cues conditioned on each trial.
0) Probability of a Response.

The probability of a response, A1, is the proportion of unadapted
cues conditioned to it. The probability that cue k is unadapted is
(1-a(k,n)) and the probability that cue k is conditioned to A1 is
F(k,n); thus the performance function p(n) is,

$$p(n) = \frac{\sum_k F(k,n)\,(1-a(k,n))}{\sum_k (1-a(k,n))} \qquad (5)$$

The Σ (summation sign) indicates the sum over all cues in the situation.
D) Some Consequences Regarding Simple Learning
If a subject is naive at the beginning of training, so that for
any relevant cue, F(k,1) is near ½ (two-choice learning, where chance
is about 50% for success), then from equation 2,

$$F(k,n) = 1 - \tfrac{1}{2}(1-\theta)^{n-1} \qquad (6)$$

Similarly, on trial one, the probability of a cue being adapted
is 0, i.e., a(k,1) = 0. Then, from equation 4,

$$a(k,n) = 1 - (1-\theta)^{n-1} \qquad (7)$$

Under these circumstances, we can substitute equations 6 and 7 into
equation 5 (performance function) and, taking advantage of the
simplifying effects of our definition of θ, we have,

$$p(n) = 1 - \frac{\tfrac{1}{2}(1-\theta)^{n-1}}{\theta + (1-\theta)^{n}} \qquad (8)$$

The development of equation 8 is as follows:

First, in equation 5, it will be noted that the summation sign
calls for summing the probabilities over all cues. We shall divide our
labor into summing over two kinds of cues, relevant and irrelevant. In
other words, we are partitioning our composite sum into two sums. We
first consider the numerator, F(k,n)(1-a(k,n)). For the r relevant cues,
F(k,n) = 1 - ½(1-θ)^(n-1) and a(k,n) = 0, whence the function is just
1 - ½(1-θ)^(n-1). When this is summed over the r relevant cues, it being
the same for each of them, we get,

$$r\left(1 - \tfrac{1}{2}(1-\theta)^{n-1}\right).$$

Now when we sum over the i irrelevant cues, we have F(k,n) = ½,
because by the nature of the experiment an irrelevant cue cannot be
consistently conditioned to the correct response, but a(k,n) = 1 -
(1-θ)^(n-1). Summing F(k,n)(1-a(k,n)) over the irrelevant cues gives

$$i\left(\tfrac{1}{2}\right)(1-\theta)^{n-1}.$$

Summing F(k,n)(1-a(k,n)) over all cues gives,

$$r\left(1 - \tfrac{1}{2}(1-\theta)^{n-1}\right) + i\left(\tfrac{1}{2}\right)(1-\theta)^{n-1}.$$

The same procedure of summation is used in dealing with the
denominator, which is (1-a(k,n)). Summing over the r relevant cues,
for each of which a(k,n) = 0, we get r. Summing over the i irrelevant
cues gives i(1-θ)^(n-1). The denominator term now is r + i(1-θ)^(n-1).

Forming the ratio, we have,

$$p(n) = \frac{r\left(1 - \tfrac{1}{2}(1-\theta)^{n-1}\right) + \tfrac{1}{2}\,i\,(1-\theta)^{n-1}}{r + i\,(1-\theta)^{n-1}}.$$

Dividing each term in the equation by r+i,

$$p(n) = \frac{\frac{r}{r+i}\left(1 - \tfrac{1}{2}(1-\theta)^{n-1}\right) + \tfrac{1}{2}\,\frac{i}{r+i}\,(1-\theta)^{n-1}}{\frac{r}{r+i} + \frac{i}{r+i}\,(1-\theta)^{n-1}}.$$

Taking advantage of the simplifying effects of θ = r/(r+i), and
(1-θ) = i/(r+i),

$$p(n) = \frac{\theta - \tfrac{1}{2}\theta(1-\theta)^{n-1} + \tfrac{1}{2}(1-\theta)^{n}}{\theta + (1-\theta)^{n}}.$$

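This equation can be reduced to equation 8 by means of an
indirect simplification. We begin by subtracting both sides of the
equation from 1.

$$1 - p(n) = 1 - \frac{\theta - \tfrac{1}{2}\theta(1-\theta)^{n-1} + \tfrac{1}{2}(1-\theta)^{n}}{\theta + (1-\theta)^{n}}$$

Finding the common denominator for the right hand side of the above
equation,

$$1 - p(n) = \frac{\theta + (1-\theta)^{n} + \tfrac{1}{2}\theta(1-\theta)^{n-1} - \tfrac{1}{2}(1-\theta)^{n} - \theta}{\theta + (1-\theta)^{n}}$$

Simplifying the numerator,

$$1 - p(n) = \frac{\tfrac{1}{2}(1-\theta)^{n} + \tfrac{1}{2}\theta(1-\theta)^{n-1}}{\theta + (1-\theta)^{n}} = \frac{\tfrac{1}{2}(1-\theta)^{n-1}(1-\theta+\theta)}{\theta + (1-\theta)^{n}} = \frac{\tfrac{1}{2}(1-\theta)^{n-1}}{\theta + (1-\theta)^{n}}$$

and solving for p(n) yields equation 8,

$$p(n) = 1 - \frac{\tfrac{1}{2}(1-\theta)^{n-1}}{\theta + (1-\theta)^{n}}$$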

Plotting equation 8 shows that p(n) is an S-shaped function of n
with an asymptote (for θ greater than 0) at 1.00. Also p(1) is ½.
Since p(n) is an increasing monotonic function of θ, we can estimate
θ from observations of performance. If we want to know the theoretical
proportion of relevant cues in a problem for a particular subject, we
have the subject work on the problem, record his performance curve, and
solve equation 8 for θ. This result depends directly upon the
simplifying assumption of the definition of θ.
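
As a numerical check on the development above, the performance function
can be computed two ways: by summing over the r relevant and i
irrelevant cues as in equation 5, and by the closed form of equation 8.
The Python sketch below is illustrative only; the function names and
the cue counts (r = 1, i = 9, so that θ = .1) are assumptions and not
values taken from the experiment.

```python
def p_from_cues(r, i, n):
    """Equation 5 evaluated with equations 6 and 7, summing over cue types."""
    theta = r / (r + i)
    decay = (1.0 - theta) ** (n - 1)
    f_relevant = 1.0 - 0.5 * decay      # equation 6; relevant cues are never adapted
    a_irrelevant = 1.0 - decay          # equation 7; irrelevant cues have F = 1/2
    numerator = r * f_relevant + i * 0.5 * (1.0 - a_irrelevant)
    denominator = r + i * (1.0 - a_irrelevant)
    return numerator / denominator


def p_closed_form(theta, n):
    """Equation 8: p(n) = 1 - (1/2)(1 - theta)^(n-1) / (theta + (1 - theta)^n)."""
    return 1.0 - 0.5 * (1.0 - theta) ** (n - 1) / (theta + (1.0 - theta) ** n)


if __name__ == "__main__":
    # Hypothetical problem with 1 relevant and 9 irrelevant cues.
    for trial in (1, 5, 10, 20, 40):
        print(trial, p_from_cues(1, 9, trial), p_closed_form(0.1, trial))
```

Both computations start at ½ on trial 1 and rise toward 1.00, which is
the behavior described for equation 8.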

Since the instability of individual learning curves makes it
difficult to fit curves to them, it is fortunate that θ can be determined
in a different way. Suppose a subject makes E errors in the course of
solving the problem to a very rigorous criterion and it is assumed for
practical purposes that he has made all the errors he is going to make.
Theoretically, the expected total number of errors made on a problem
can be written E = Σ (1-p(n)), summed over trials. Under the conditions
satisfying equation 8, this can be evaluated approximately by using the
continuous time variable t in place of the discrete trial variable n,
and integrating. The result of this integration is that,

$$E = \frac{\tfrac{1}{2}\log\theta}{(1-\theta)\log(1-\theta)} \qquad (9)$$
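Equation 9 may be derived as follows:

Substituting equation 8 into the expected error formula,

$$E = \sum_{n} (1 - p(n)) = \sum_{n} \frac{\tfrac{1}{2}(1-\theta)^{n-1}}{\theta + (1-\theta)^{n}}$$

and multiplying the numerator and denominator of each term by (1-θ),

$$E = \frac{\tfrac{1}{2}}{1-\theta} \sum_{n} \frac{(1-\theta)^{n}}{\theta + (1-\theta)^{n}}$$

As we are integrating the equation for E and using the continuous
time variable t for n, our integral now stands as,

$$E = \frac{\tfrac{1}{2}}{1-\theta} \int_{1}^{n} \frac{(1-\theta)^{t}}{\theta + (1-\theta)^{t}}\,dt.$$

As θ is a constant, let (1-θ) = k; then k^t = (1-θ)^t and
k^t = e^(t log_e k) (p. 97, Burington, 1955). Letting a = log_e(k),
then k^t = e^(at), which is also equal to (1-θ)^t. Our integral then is,

$$\frac{\tfrac{1}{2}}{1-\theta} \int_{1}^{n} \frac{e^{at}}{\theta + e^{at}}\,dt,$$

which when integrated, yields,

$$\frac{\tfrac{1}{2}}{1-\theta}\cdot\frac{1}{a}\Big[\log(\theta + e^{at})\Big]_{1}^{n}$$

(p. 82, Burington, 1955, eq. 310).

Substituting the values for e^(at) and a, and evaluating,

$$E = \left[\frac{\tfrac{1}{2}\log(\theta + (1-\theta)^{t})}{(1-\theta)\log(1-\theta)}\right]_{1}^{n} = \frac{\tfrac{1}{2}\log(\theta + (1-\theta)^{n})}{(1-\theta)\log(1-\theta)} \qquad (9)$$

since the lower limit contributes log(θ + (1-θ)) = log 1 = 0. When n is
large, (1-θ)^n becomes negligible and the numerator reduces to ½ log θ.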
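
The integral in equation 9 is only an approximation to the discrete sum
of error probabilities, and the two can be compared directly. The
following Python sketch is an illustration under stated assumptions:
the function names are hypothetical, the sum is taken over trials 1
through n, and the integral is evaluated between the limits 1 and n.

```python
import math


def expected_errors_sum(theta, n_trials):
    """Discrete sum of 1 - p(n) over trials 1..n_trials, using equation 8."""
    return sum(
        0.5 * (1.0 - theta) ** (n - 1) / (theta + (1.0 - theta) ** n)
        for n in range(1, n_trials + 1)
    )


def expected_errors_integral(theta, n_trials):
    """Continuous approximation: (1/2) log(theta + (1-theta)^n) / ((1-theta) log(1-theta))."""
    return (0.5 * math.log(theta + (1.0 - theta) ** n_trials)
            / ((1.0 - theta) * math.log(1.0 - theta)))


if __name__ == "__main__":
    for theta in (0.05, 0.1, 0.2, 0.5):
        print(theta,
              expected_errors_sum(theta, 128),
              expected_errors_integral(theta, 128))
```

For θ = .500 and 128 trials the integral form gives 1.00, the value
listed in Table VI. Because the error probability falls from trial to
trial, the discrete sum is never smaller than the integral taken between
the same limits.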

E) Methods of Estimation of Learning Rates (θ's)
Two estimation methods, yielding comparable values of θ, may be
employed on the data of an individual subject or of the group. The
first method is based on total errors and the second method on weighted
errors, where each error is weighted by the number of trials of
training which precede it. For example, if a subject made errors on
trials 1, 4, 6 and 10, his total error score would be 4 and his weighted
error score would be 0+3+5+9 = 17. The second method gives emphasis to
late errors and has some of the advantages of a trials-to-criterion
score, which could not be used in a study of this nature.
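
The two scores can be computed from a list of the trials on which a
subject erred. The short Python sketch below restates the worked example
(errors on trials 1, 4, 6 and 10); the function name is an assumption
made for illustration.

```python
def error_scores(error_trials):
    """Return (total errors, weighted errors) for 1-based error trial numbers.

    Each error is weighted by the number of trials preceding it,
    i.e. an error on trial t contributes t - 1.
    """
    total = len(error_trials)
    weighted = sum(trial - 1 for trial in error_trials)
    return total, weighted


if __name__ == "__main__":
    total, weighted = error_scores([1, 4, 6, 10])
    print(total, weighted)  # 4 and 0 + 3 + 5 + 9 = 17
```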

For the first method, that involving total errors, the performance
of each subject is summarized and the mean error score for the group
found. Equation 9 is used, where,

$$E = \frac{\tfrac{1}{2}\log(\theta + (1-\theta)^{n})}{(1-\theta)\log(1-\theta)}$$

where E is the mean error score of the group and n is the total number
of trials (in this study, 128). Table VI contains E evaluated at
n = 128. The expected errors and corresponding θ values used in this
study are contained therein. To find the learning rate for an
individual or a group, one enters the table with the total error score
and reads off the corresponding θ value.
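
The table lookup amounts to inverting the expected-error formula for θ.
As an alternative to reading the table, the inversion can be sketched
numerically. The Python code below uses bisection, which is a technique
substituted here for illustration and not the procedure used in this
study; the function names are hypothetical, and the sketch assumes that
expected errors decrease steadily as θ increases between 0 and 1.

```python
import math


def expected_total_errors(theta, n_trials=128):
    """(1/2) log(theta + (1-theta)^n) / ((1-theta) log(1-theta))."""
    return (0.5 * math.log(theta + (1.0 - theta) ** n_trials)
            / ((1.0 - theta) * math.log(1.0 - theta)))


def estimate_theta(mean_errors, n_trials=128, tol=1e-10):
    """Find theta whose expected total errors equal mean_errors (bisection)."""
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_total_errors(mid, n_trials) > mean_errors:
            lo = mid  # too many expected errors, so theta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # A mean of 1.00 errors in 128 trials corresponds to theta = .500.
    print(round(estimate_theta(1.0), 3))
```

Error scores outside the range the formula can produce (roughly 0 to
63.5 for 128 trials) simply drive the estimate to one end of the
interval.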

For weighted errors, each error made by the subject is multiplied
by the number of trials preceding the one on which the error is made.
The weighted errors are then summed over trials for each subject and
the mean weighted error score for the group is calculated.

For finding values of θ and corresponding weighted error scores,
equation 8 is used, where the probability of an error is,

$$1 - p(n) = \frac{\tfrac{1}{2}(1-\theta)^{n-1}}{\theta + (1-\theta)^{n}}.$$

The expected weighted errors, under the assumption of independence of
trial probabilities, is

$$0(1-p_1) + 1(1-p_2) + 2(1-p_3) + \cdots = \sum_{n=1}^{N} (n-1)(1-p(n)) = \sum_{n=1}^{N} (n-1)\,\frac{\tfrac{1}{2}(1-\theta)^{n-1}}{\theta + (1-\theta)^{n}}.$$

This sum has no explicit solution and the corresponding integral is
also intractable. The function has been tabled for various values of
θ and N. Table VI contains the values of θ and corresponding weighted
error scores for N = 128.
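
Although the weighted-error sum has no closed form, it is easy to
evaluate term by term. The Python sketch below shows one way such a
tabulation could be produced; it is not necessarily the procedure by
which Table VI was computed, and the function name and the θ values
looped over are assumptions for illustration.

```python
def expected_weighted_errors(theta, n_trials=128):
    """Sum over trials n = 1..N of (n - 1) times the error probability 1 - p(n)."""
    return sum(
        (n - 1) * 0.5 * (1.0 - theta) ** (n - 1) / (theta + (1.0 - theta) ** n)
        for n in range(1, n_trials + 1)
    )


if __name__ == "__main__":
    # Tabulate the expected weighted error score for a few learning rates.
    for theta in (0.05, 0.10, 0.20, 0.50):
        print(theta, expected_weighted_errors(theta))
```

The first trial always contributes zero, since no trials precede it,
and the score falls as θ rises.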

APPENDIX II

TABLE VI

E AND θ VALUES ESTIMATED BY THE TOTAL ERROR AND
WEIGHTED ERROR METHODS OF ESTIMATION FOR 128 TRIALS

          Total    Weighted              Total    Weighted
   θ      Errors   Errors         θ      Errors   Errors

  .005    63.91     4053         .200     4.51      30
  .010    62.92     4006         .210     4.19      26
  .015    61.63     3893         .220     3.91      23
  .020    59.27     3672         .230     3.65      20
  .025    55.66     3311         .240     3.42      18
  .030    51.19     2832         .250     3.21      16
  .035    44.70     2313         .260     3.02      14
  .040    39.63     1838         .270     2.85      13
  .045    34.52     1448         .280     2.69      11
  .050    30.50     1145         .290     2.55      10
  .060    24.16      738         .300     2.41       9
  .070    19.71      501         .310     2.29       8
  .080    16.47      355         .320     2.17       8
  .090    14.02      261         .330     2.07       7
  .100    12.13      198         .340     1.97       6
  .110    10.64      153         .350     1.87       6
  .120     9.43      122         .360     1.79       5
  .130     8.42       98         .370     1.71       5
  .140     7.58       80         .380     1.63       5
  .150     6.86       66         .390     1.56       4
  .160     6.62       56         .400     1.49       4
  .170     5.73       47         .450     1.21       3
  .180     5.27       40         .500     1.00       1
  .190     4.87       34        1.000     0.00       0

TABLE VII

SUBJECT SUMMARY DATA, SHOWING NUMBER OF ERRORS PER
BLOCK OF 8 TRIALS, TOTAL AND WEIGHTED ERRORS, GROUP
AND INDIVIDUAL θ'S

 

Group A
Errors per Block of 8 Trials

 

 

 

 

 

S    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16   θ (total) θ (weighted)
J.M. 2 4 4 2 3 5 4 3 5 5 5 5 3 6 4 4 <.005 <.005
J.F. 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 .500 .220
H.J. 1 2 1 3 2 2 0 1 0 0 0 0 0 0 0 0 .100 .080
J.S. 2 4 2 3 5 3 4 2 2 0 1 1 0 1 0 0 .051 .049
0.0. 3 4 5 4 3 4 5 2 4 2 3 3 2 2 5 4 .026 .024
L.K. 5 3 4 4 1 3 1 0 0 0 0 0 0 0 1 0 .065 .066
T.W. 4 2 3 3 6 3 5 4 4 5 6 1 4 2 0 0 .029 .030
0.0. 3 4 2 6 6 1 5 0 0 0 0 0 0 0 0 0 .055 .055
8.2. 2 3 2 0 0 0 1 0 0 0 0 0 0 0 0 0 .135 .110
0.2. 3 3 4 3 6 3 6 4 2 4 2 4 4 6 3 4 .016 .013
J.C. 3 2 4 5 0 2 1 0 0 0 0 0 0 1 0 0 .075 .070
M.W. 2 6 4 4 3 5 6 4 3 3 2 2 2 3 2 2 .026 .029
0.0. 1 2 3 0 6 3 2 2 3 2 2 1 0 0 0 0 .055 .046
0.1. 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .337 .260
A.N. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.000 1.000
S.D. 3 0 1 1 0 2 0 0 1 0 0 0 0 0 0 0 .135 .097
I‘m-.33 3‘9‘ 39' 39 41' 33' 4‘6 2'2' 24' 21' 21' 17 15' 21' 13 1f

2 68 7o 70 70 68 72 69 83 81 84 64 67 88 64 66 89
Correct

S    Total Errors   Total Weighted Errors   Group θ's
J.M. 64.0 4395.0 .055 .046
J.F. 1.0 24.0
H.J. 12.0 346.0
J.S. 30.0 1224.0
0.0. 55.0 3357.0
1.2. 22.0 565.0
T.w. 52.0 2662.0
0.0. 27.0 776.0
6.2. 6.0 133.0
0.2. 61.0 3939.0
J.C. 16.0 503.0
M.w; 53.0 2697.0
0.0. 27.0 1251.0
0.7. 2.0 11.0
A.N. 0.0 0.0
S.D. 8.0 217.0
'15151. 44676 22520760

TABLE VII
(Continued)
Group E
Errors per Block of 8 Trials

S    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16   θ (total) θ (weighted)
0.1. 1 5 4 3 5 6 5 4 3 3 5 5 5 3 4 4 <.005<;005
1.9. 4 5 6 4 6 5 3 4 5 4 6 4 3 5%.5 7 <.005<;oo5
7.x. 4 3 2 0 0 0 1 0 0 0 o 0 0 o 0 2 .100 .077
5.6. 4 4 4 4 4 4 4 4 3 4 6 3 5 5 1 3 .014 .017
0.3. 3- 2 5 5 3 3 1 7 4 5 4 5 4 5 3 4 .009<;005
J.S. 2 5 2 3 7 1 1 4 5 4 1 5 3 5 2 4 .027 .022
4.5. 2 1 0 5' 1 0 0 4 4% 0 0 0 0 0 0 o .176 .117
0.2. 3 4 3 6 6- 4 4 3 0 0 0 0 0 0 0 1 .046 .050
2.x. 3 1 0 1 2 2 3 2 4 1 3 2 2 3 3 1 .047 .035
wza. 1 7 3 3 4 4 3 3 5 4 3 4 0 0 0 0 .036 .037

‘ 1.9. 2 1 3 4 4 4 4 4 7 5 3 3 3 5 4 3 .020 .010
5.3. 3 4 4 3 5 4 4 5 4 5 4 4 2 4 4 5 <5005<2005
0.1. 4 3 2 5 2 4 5 3 4 3 3 5 3 4 5 6 .016<;005
0.0. 3 4 3 3 5 6 4 5 4 2 6 5 4 3 1 5 .009 .011
9.1. 3 1 0 0 o 0 0 0 o 0 o 0 0 0 0 o .217 .220
P.S. 2 1 2 6 5 6 2 2 6 4 4 5 3 7 2 2 .020 .013
TEHﬁﬁﬂﬁﬁEQMﬁQWﬂmMﬂ—ﬁ““
at 666067 61 54 5966 61 57 6662 61 71 61 73 63
Correct

S    Total Errors   Total Weighted Errors   Group θ's
0.1. 65.0 4213.0 .033 .029
1.9. 76.5 4966.5
V.M. 12.0 397.0
was. 62.0 3611.0
0.5. 63.0 4177.0
J.s. 54.0 3533.0
A.H. 5.5 132.5
3.0. 34.0 1174.0
2.x. 33.0 2269.0
v.4. 44.0 2121.0
1.9. 59.0 4000.0
3.1. 64.0 4160.0
0.1. 61.0 4156.0
0.0. 63.0 3974.0
B.T. 4.0 24.0
9.9. 59.0 3939.0
‘15541 75973' 4704975
Mean 47.44 2940.56


TABLE VII
(Continued)

Group A+E

Errors per Block of 8 Trials

 

S    1 2 3 4 5 6
K.B. 3 3 2 4 2 0
R.S. l 0 0 0 0 0
T.H. 3 1- 2 0 4 2
8.0. 3 4% 3 0 5 5
A.B. 3 3 4 3 4 2
J.F. 3 0 2 1 0 0
J.K. 1 0 0 0 0. 0
R.S. 2 0 0 0 0 0
8.0. 2 0 3 0 4 2
B.P. 2 4 4 0 0 0
K.G. l 0 0 0 0 0
P.B. 0 2 0 O 0 0
AJY. 2 0 l 0 0 0
A.B. 0 0 0 0 0 0
M.E. 0 0 0 0 0 0
A.W. 1 3 4 1 4 2
13??27’20%25"9 23 £3
% 79 84 80 93 82 90
Correct
S    Total Errors
K.B. 15.0
R.S. 1.0
T.H. 23.0
BeC'e 35.5
AOB. 45.0
J.F. 6.0
J.K. 1.3
R.S. 3.0
S.C. 27.0
B.P. 10.0
K.G. 1.0
PCB. 2.0
LOW. 3.0
A.B. 0.0
M.E.‘ 0.0
A.W. 15.0
TOtal 187.5
Mean 11.72

7

F‘F4CDCDCDC3CDC)C>h3C>C>C>#-¢-F‘C>C>

00H

.086
.500
.063
.044
.035
.165
.500
.261
.055
.115
.500
.337
.261

.080
.410
.049
.043
.033
.146
.410
.146
.044
.116
.460
.213
.220

1.0001.000
1.0001.000
.086 .079

1 21211111211121: 4.9 .9...
0 0 0 0 0 1 0 0 0
0 0 0 o o 0 0 0 o
0 3 1 2 1 0 2 0 1
1 1 2 3 3 1 0 0 0
1 3 5 4 5 3 0 0 1
0 0 0 0 o 0 o 0 0
o 0 0 0 0 0 0 o 0
0 1 0 0 0 0 0 o 0
1 5 2 2 3 0 0 1 o
0 0 0 0 0 o o o 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 o
0 0 o 0 0 0 0 0 0
0 0 o 0 0 o 0 o o
0 o o 0 o o o 0 0
0 0 0 0 o 0 o 0 0
§i13710711'12' 37 2' 1' 2'
96 90 92 91 91 96 96 99 96

Total Weighted Errors

 

358.0
4.0
1229.0
1623.5
2493.0
72.0
4.0
72.0
1511.0
134.0
3.0
25.0
23.0
0.0
0.0
374.0

495.34

Group θ's

.100 .070

TABLE VII
(Continued)
Group A'
Errors per Block of 8 Trials

S    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16   θ (total) θ (weighted)
0.3. 5 3 2 5 2 6 4 l 0 0 0 0 0 0 0 0 .053 .058
R.V. 4 4 3 0 2 4 0 2 0 l 0 0 0 0 0 0 .069 .067
B.R. 4 5 7 4 6 1 2 0 5 0 2 2 O 3 0 1 .038 .040
A.C. l 0 0 l 0 0 0 O 0 0 0 0 0 0 0 0 .337 .208
D.D.Ol00000000000000.500.340
V.P. l 0 0 0 0 0 0 0 l 0 0 0 0 0 0 0 .337 .151
L.F. 2 4 2 3 1 2 l 1 l 2 1 3 2 2 0 l .053 .044
Jtlh 4 6 3 4. 2 1 l 3 1 0 0 1 0 0 0 0 .057 .060
J.B. 2 l 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .261 .245
IuRe 2 6 3 3 4 3 5 4 5 4 0 1 2 0 0 0 .038 .039
A.A. 5 2 2 0 0 0 0 0 O 0 0 0 0 0 0 0 .124 .136
0.8. 3 3 1 1 4 1 l 1 3 2 2 2 2 2 l 1 .051 .041
A.H. 3 3 3 3 4 3 3 4 2 2 2 4 l l 3 4 .035 .031
ll.l.2100100000101000.165.095
JkP. 3 3 5 6 4 2 1 0 0 0 l 0 0 0 3 0 .053 .053
R.K. 3 5 5 3 4 6 4 4 3 3 3 5 5 6 5 6(.005(.005
7676745 4'6 33 33' 3'4' 22 22' 20' 21 14" 12 16 1'3’ 1‘4‘ 12 13 """"‘"""'"
% 6O 64 72 74 73 77 83 84 84 89 91 86 90 89 91 90
Correct

S    Total Errors   Total Weighted Errors   Group θ's
G.E. 28.0 833.0 .060 .050
ROVI 20.0 570.0
B.R. 42.0 1824.0

A.C. 2.0 27.0

D.D. 1.0 16.0
VtP. 2.0 65.0

L.F. 28.0 1533.0

J.IL 26.0 746.0

J.B. 3.0 17.0

luR. 42.0 1919.0
A.A. 9.0 88.0

0.8. 30.0 1748.0
A.H. 45.0 2731.0
M.l. 6.0 229.0

J.F. 28.0 1011.0

R.K. 70.0 4699.0

Total 382.0 18046.0

Mean 23.88 1127.88

TABLE VII
(Continued)

Group E'
Errors per Block of 8 Trials

 

 

S    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16   θ (total) θ (weighted)
_ 2.7. 4 4 2 5 5 5 1 3 5 5 6 2 3 7 3 3 .009 .006
J.C. 2 6 2 2 3 5 2 ‘1 3 0 1 1 1 0 0 o .049 .050
9.7. 3 4 3 4 2 2 5 4 3 2 5 2 3 2 3 2 .032 .029
“no. 4 2 3 4 4 4 4 5 3 4 6 3 2 4 1 3 .024 .024
1.0. 1 0 1 0 0 0 0 0 0 0 0 0 0 0 o o .337 .206
1.2. 4 4 4 5 4 4 3 3 3 2 0 3 3 4 5 3 .027 .026
0.2. 1 0 2 2 0 0 0 0 0 0 0 0 0 o 0 0 .167 .130
J.W. 12-3 3 4 5 5 4 5 5 4 4 3 3 7 5 3 <.005<;005
J.M.3602400000000000.086.088
D.H. 1 3 1 3 1 0 0 o 0 0 0 o o 0 0 0 .124 .106
2.0. 1 0 0 0 0 0 0 0 0 0 0 0 o 0 o 0 .500 .540
0.22 2 2 5 3 2 3 5 1 3 5 3 4 1 3 0 2 .036 .034
1.0. 3 3 3 3 5 6 5 3 4 6 3 4 3 3 2 2 .022 .023
2.1. 3 1 2 4 7 4 2 3 4 3 0 3 3 3 2 2 .034 .031
J.D. 0 0 0 0 0 0 0 0 0 0 0 o 0 0 0 0 1.0001.000
2.1. 5 5 0 3 4 2 5 1 3 4 3 5 5 1 2 3 .030 .027
‘mmwnammmmmmnwnnnn***'
% 70 65 76 65 64 69 72 77 73 73 76 77 79 73 62 62
Correct
S    Total Errors   Total Weighted Errors   Group θ's
2.7. 63.0 4039.0 .046 .039
J.C. 31.0 1152.0
9.1. . 49.0 2947.0
‘w.0. 56.0 3419.0
1.0. 2.0 27.0
1.2. 54.0 3239.0
0.22 5.0 96.0
J.w. 64.5 4329.5
6.2. 15.0 277.0
0.2. 9.0 170.0
9.0. 1.0 1.0
0.3. 44.0 2621.0
1.c. 56.0 3488.0
2.1. 46.0 2763.0
J.D. 0.0 0.0
2.1. 51.0 3115.0
‘2537 52275 3132375
Mean 34.26 1960.34

TABLE VIII

STIMULUS LETTER PATTERNS, TRIAL APPEARANCES AND TOTAL
NUMBER OF ERRORS MADE PER STIMULUS PATTERN

 

 

 

Group A Errors Group E Errors Trial Appearances
Response Response
0 0

1. ABDEf 7 ABDEf 16 59, 74,101,112

2. ABdEF 10 ABdEF 13 17, 49, 86,117

3. AbDEF 16 AbDEF 21 8, 45, 78,107

4. ABDeF 16 .6066 27 25, 52, 62, 99,123
5. ABdEf 16 ABdEf 20% 2, 36, 64,111

6. AbDEf 15 45061 22 26, 40, 56,113

7. ABDef 17 aBDEf 24 16, 34, 75, 95

6. AdeF 17 AdeF 23 6, 32, 57,102

9. ABdeF 22 aBdEF 25 18, 33, 73, 66,120
10. AbDeF 20 abDEF 23 13, 36, 62,110

11. Adef 13 Adef 3o 0, 29, 60, 76,116
12. ABdef 14 aBdEf 36% 15, 39, 71, 94,125
13. 450.1 11 abDEf 29 10, 20, 65,105

14. AbdeF 17 adeF 25 7, 46,100,116

15. Abdef 17 adef 26 50, 69,103,124
Response Response

x x

16. .6066 - 7 ABDeF 23 55, 79,119,127

17. aBDEf 10 ABDef 21 14, 63,106,115

18. aBdEF 9 ABdeF 22 24, 53, 96,126

19. abDEF 13 450.1 35 3, 43, 66, 65

20. aBDeF 21 aBDeF 32 11, 27, 51, 92,121
21. aBdEf 12 ABdef 25 9, 30, 27, 91
22. abDEf 9 450.1 22% 21, 35, 70,104

23. aBDef 13 aBDef 27 1, 42, 58, 63

24. adeF 17 AbdeF 34 19, 47, 66, 97,114
25. abDeF 23 abDeF 35 23, 37, 54, 93,109
26. aBdeF 16 aBdeF 26 12, 31, 69, 90

27. adef 22 Abdef 24% 4, 28, 61, 61

28. aBdef 17 aBdef 26 22, 44, 60,106,122
29. abDef 15 450.1 24 5, 46, 67, 67
30. abdeF 6 abdeF 17 41, 72, 64, 96

 

TotaI . '4-4—0'

2&9

3&10

(nnhtkrkb

&12

6&15
8&14

11&15

16&20

17&23
18&26

19&25

21&28
22&29
24&30

27

Group

Response

0
aBDEf

aBdEF

abDEF

aBDEF

aBdEf

abDEf
adeF

adef

Response

X

ABDeF

ABDef
ABdeF

AbDeF

ABdef
AbDef
AbdeF

Abdef

.1353.—

Errors

0)
NIH

25

10

25

12
17
17

13
157.5

 

TABLE VIII
(Continued)
Group
A' Errors
Response
0
ABDEf 24
ABdEF 21
AbDEF 30
ABDEf 4
ABdEF 4
AbDEF O
ABdEf 33
45061 22
AdeF ' 26
Adef 15
Response
4x
aBDEF 27
aBDEf 29
aBdEF 25
abDEF 30
aBdEf 28
abDEf 28
adeF 14
adef 22
382.0

Group
E!

Response

0
ABDEf

ABdEF

AbDEF
ABDEf
ABdEF
AbDEf
ABdEf
AbDEf
AdeF

Adef

Response

X

ABDeF

ABDef
ABdeF

AbDeF

ABdef
AbDef
AbdeF

Abdef

 

Errors

39

42

4O

11

(330)01

34
26

37

35
41

39%

31
34
29

21
546.5

60

Trial
Appearances

 

59,
16,
17,
16,
120

8.
13,
25,
25,

74,101,112

33,

45,
36,
99

52,

52,123

62
2.
15,
125
26,
10,
6.
7.
29,
50,

55,
11,
121
14,

1.
24,
12,

3.
23,
109

9.

36,
39,

40,
20,
32,

75, 95
86,117
73, 88,

78,107
62,110

82, 99,123
64,111
71, 94
56,115

65,105
57,102,

46,100,116

60,

76,118,

89,103,124

79,119,127,

27,

51, 92,

63,108,115,

42,
53,
31,
43,
37,

30,

56, 63
96, 126,
69, 90
68, 65,
54, 93,

27, 91,

22,44,60,106,122
21, 35, 70,104,
5, 46, 67, 67
19, 47, 66, 97,
114,41,72,64,96
4, 26, 61, 61

TABLE IX

CORRECT RESPONSE SEQUENCE WITH PAIRED STIMULUS PATTERNS

 

Trial
Number

Practice

(DmQQUIDFOINH

10

12
13
14
16
16
17
18
19
20
21
22
23
24
25
26
27
28
29
3O
31
32
33
34
35
36
37
38
39
4O
41

Correct
Response

Stimulus

 

><c>c>c>>4c>>4c>c>c3>4><c>>4><cac>>4><>4><c>><c>c3<3<3><c>><><c>><c>c>c>>4>4><c>><c>

 

Pattern No.

11
23
5
19
27
29
8
14
3
21
13
20
26
10
17
12
7
2
9
24
13
22
28
25
18
4
6
20
27
11
21
26
8
9
7
22
6
26
10
12
6
3O

Trial
Number

42
43
44
45
46
47
46
49
50
51
52
53
54
55
56
57
56
59
60
61
62
63
64
65
66
67
66 _
69
70
71
72
73
74
75
76
77
76
79
60
61
62
63

Correct

Response

>4C>P4>4>4C3>4C3C>CDC>>4C>><>4>4ﬁ<>4C>C>>4CD>4CDC>>¢C>CD>4>4><C>><<3<D><>4C>C>64><>4

Stimulus
Pattern N2:

 

23
19
28

3
14
24
29

2
16
20

4
18
26
16

6

8
23

1
11
27
10
17

5
13
24
29
19
26
22
12
3O

9

1

7
11
21

3
16
28
27

4
23

Trial

Number

84
85
86
87
88
89
90
91
92
93
94
96
96
97
98
99
100
101
102
103
104
105

Correct

Response

c>c>c>c>c>c>c>>€><>4c>c>><><>4><c>c>><c>>424

 

TABLE IX
(Continued)
Stimulus Trial
Pattern No. Number
30 106
'19 107
2 108
29 109
9 110
15 111
26 112
21 113
20 114
26 116
12 116
7 117
18 118
24 119
30 120
4 121
14 122
1 123
8 124
16 125
22 126
13 127

Correct

Response

><><c>c>c3><><c3><c>c>c>><>4c>c>c>c>><>4c3>4


Stimulus
Pattern N0 0

 

28
3
17
25
10
5
1
6
24
17
14
2
11
16
9
20
28
4
15
12
18
16

 

1

REFERENCES

Blodgett, H. C., McCutchan, K., & Mathews, R. Spatial learning in the
T-maze: the influence of direction, turn, and food location. J. exp.
Psychol., 1949, 39, 800-809.

Burington, R. S. Handbook of mathematical tables and formulas.
(3rd ed.) Sandusky, Ohio: Handbook Publishers, 1956.

Bush, R. R. & Mosteller, F. A model for stimulus generalization and
discrimination. Psychol. Rev., 1951, 58, 413-423.

Bush, R. R. & Mosteller, F. A mathematical model for simple learning.
Psychol. Rev., 1951, 58, 313-323.

Bush, R. R. & Mosteller, F. Stochastic models for learning. New York:
Wiley, 1955.

Eninger, M. U. Habit summation in a selective learning problem. J.
comp. physiol. Psychol., 1952, 45, 604-608.

Estes, W. K. Toward a statistical theory of learning. Psychol. Rev.,
1950, 57, 94-107.

Galanter, E. H. & Shaw, W. A. "Cue" vs. "reactive inhibition" in
place and response learning. J. comp. physiol. Psychol., 1954, 47,
395-398.

Scharlock, D. P. The role of extramaze cues in place and response
learning. J. exp. Psychol., 1955, 51, 9-14.

Restle, F. Additivity of cues and transfer in discrimination of
consonant clusters. J. exp. Psychol., 1959, 57, 9-14.

Restle, F. Discrimination of cues in mazes: A resolution of the
"place-vs.-response" question. Psychol. Rev., 1957, 64, 217-228.

Restle, F. A theory of discrimination learning. Psychol. Rev., 1955,
62, 11-19.

Restle, F. Theory of selective learning with probable reinforcements.
Psychol. Rev., 1957, 64, 182-191.

Restle, F. Toward a quantitative description of learning-set data.
Psychol. Rev., 1958, 65, 77-91.

Snedecor, G. W. Statistical Methods. (5th ed.), Ames: Iowa State
College Press, 1956.

Von Frisch, K. The dancing bees, an account of the life and senses
of the honey bee. (Trans. by Dora Ilse), New York: Harcourt, Brace
& Co., 1955.

Warren, J. M. Additivity of cues in visual pattern discriminations
by monkeys. J. comp. physiol. Psychol., 1953, 46, 484-486.