ABSTRACT

THE FREE-OPERANT PARTIAL REINFORCEMENT EFFECT:
A DISCRIMINATION ANALYSIS

By

Stephen Reed Overmann

According to a revised Response-Unit explanation (Denny, Wells, and Maatsch, 1957), free-operant intermittent reinforcement results in increased resistance to extinction because of the acquisition of a discrimination of visiting the goal area only after a series of non-reinforced operants and the click (SD) associated with the terminal operant have occurred. The maintenance of this discrimination habit of partial reinforcement during extinction results in a series of operants being emitted prior to each non-reinforced goal-approach. Only those operants that are followed by non-reinforced visits to the food area serve to bring about extinction.

Given this analysis as correct, it should be possible to manipulate resistance to extinction through manipulation of the number of goal-approaches during extinction. Twelve CRF controls were extinguished in a modified operant chamber, with an SD for goal-approach after each response. Three groups of twelve rats trained under FR 10 were extinguished with the SD for goal-approach after 6, 10, or 14 responses.

Results showed: (1) number of responses to extinction was a direct function of the response-to-SD ratio during extinction, (2) number of food or goal-approaches was independent of both training and extinction conditions, and (3) prior to the breakdown in discrimination, the revised Response-Unit hypothesis accurately predicted number of responses for each FR group. The results offer strong support for the discrimination analysis of the effect of intermittent schedules of reinforcement on resistance to extinction.

Committee members:
Dr. M. R. Denny (Chairman)
Dr. R. Raisler
Dr. M.
Rilling

THE FREE-OPERANT PARTIAL REINFORCEMENT EFFECT:
A DISCRIMINATION ANALYSIS

By

Stephen Reed Overmann

A THESIS

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

MASTER OF ARTS

Department of Psychology

1973

TO MY DAD
Edwin Frank Overmann

ACKNOWLEDGMENTS

Thanks are due to the members of my committee, Dr. M. Ray Denny, Dr. Robert Raisler, and Dr. Mark Rilling, for their challenging and constructive criticism. I am especially grateful to my chairman, Dr. M. Ray Denny, for his guidance in a manner which allowed me to work to the fullest of my abilities, and for his willingness to provide aid whenever those abilities fell short. Also, I would like to thank Dan Tortora for his invaluable assistance, and my wife, Karen, for her patience and love.

TABLE OF CONTENTS

INTRODUCTION
METHOD
    Subjects
    Apparatus
    Procedure
RESULTS
    Acquisition
    Extinction 1
    Extinction 2
DISCUSSION
REFERENCES

LIST OF TABLES

Table 1. A comparison of mean number of predicted and obtained bar-presses to breakdown of discriminated cup-approach together with obtained mean number of response units and cup-approaches.

LIST OF FIGURES

Figure 1. Mean number of cup-approaches per reinforcement for all extinction groups during training. The FR groups all had identical conditions during training.

Figure 2. Mean latency of approach to the food-cup for all groups following the click during acquisition and extinction.

Figure 3. Mean number of bar-presses and cup-approaches to the 10 min extinction criterion for all groups during Extinction 1 and Extinction 2.

Figure 4. Mean daily duration of cup-visits for all groups during acquisition and extinction.

INTRODUCTION

In attempting to account for the high resistance to extinction following free-operant intermittent reinforcement, Mowrer and Jones (1945) proposed the Response-Unit hypothesis.
According to this explanation the homogeneous chain of non-rewarded operants that ends with a rewarded operant can be seen as a single response unit (RU). The entire RU, whether a single bar-press for CRF animals or, say, five bar-presses for animals trained under FR 5, is strengthened by each reinforcement. From this analysis and the assumption that extinction is a direct function of the number of non-reinforcements, it follows that given equal numbers of reinforcements, both continuously and intermittently reinforced animals should emit the same number of RUs to extinction. The increased resistance to extinction found following intermittent reinforcement is thereby accounted for by the multiple response requirement of the RU.

The RU hypothesis predicts that the number of responses to extinction following intermittent reinforcement is equal to the product of the number of responses to extinction made by CRF animals and the number of responses in the intermittent RU. Consistent with this prediction, Mowrer and Jones found the number of responses to extinction to be a direct function of the size of the fixed ratio (RU) of original conditioning. Somewhat at variance with their prediction, however, the number of responses to extinction was less than predicted from the RU hypothesis, with the discrepancy between the predicted and obtained number of responses to extinction increasing with the size of the FR unit.

Denny, Wells, and Maatsch (1957) demonstrated that the RU hypothesis was correct to the extent that the animal learned to approach the food-cup only after the terminal bar-press of a unit (i.e., typically only after the click of the food magazine). According to their analysis, extinction comes about through non-reinforcement, but non-reinforcement occurs only when the animal anticipates reward (i.e., visits the food-cup) and does not receive it.
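The Mowrer and Jones prediction described earlier in this section can be written compactly. The symbols are labels introduced here purely for illustration and are not taken from the original formulation: N_CRF is the mean number of responses to extinction made by CRF animals, n is the number of bar-presses in the intermittent RU, and N-hat is the predicted number of responses to extinction for animals trained under FR n.

```latex
% Unrevised RU prediction (Mowrer and Jones), in illustrative notation:
\hat{N}_{\mathrm{FR}\,n} = N_{\mathrm{CRF}} \times n
% Example for FR 5:
\hat{N}_{\mathrm{FR}\,5} = 5\,N_{\mathrm{CRF}}
```

Read this way, the hypothesis expects every group to emit the same number of RUs to extinction, so any shortfall in obtained responses shows up as fewer than N_CRF completed units.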
Bar-presses during extinction that are not followed by trips to the food-cup contribute nothing toward extinction; only those presses that precede non-reinforced trips to the food-cup are related to the extinction of bar-pressing. Thus the schedule under which the click occurs during extinction (whether it consistently or intermittently follows a bar-press) is an important determinant of when S visits the food-cup and thus of the true pattern of non-reinforcement.

This discrimination of visiting the cup only after the click, called the discrimination habit of partial reinforcement by Denny et al., is acquired during original training under intermittent reinforcement. That is, visits to the cup that are not preceded by the SD of the click tend to be extinguished. The animal gradually learns a chain consisting of a series of homogeneous responses that are instrumental in producing the to-be-reinforced terminal bar-press. The terminal bar-press produces the click that signals approach to the food-cup.

If the discrimination habit of partial reinforcement develops to a 100 per cent level during training and is maintained during extinction, then the RU hypothesis of Mowrer and Jones accurately predicts resistance to extinction following intermittent reinforcement. The level and strength of discrimination implicitly assumed in Mowrer and Jones' formulation, however, were not supported by the findings of Denny et al. Although the discrimination clearly improved with training, it had not reached the 100 per cent level prior to the onset of extinction. That is, the animals continued to make a few approaches to the cup prior to making the terminal bar-press in the RU. In addition, this discrimination tended to break down during extinction, despite the continued presentation of the SD for cup-approach on the same schedule as used during training. More important, the results of Denny et al.
demonstrate that when the level of discrimination developed during conditioning is fully taken into consideration, the resulting modified RU hypothesis very closely predicts the number of responses to extinction for the intermittently reinforced animals. Also, the finding that all CRF and intermittently reinforced groups visited the food-cup the same number of times during extinction strongly supported their position.

According to Denny et al., increased resistance to extinction is due to the extent to which intermittently reinforced Ss continue to bar-press without visiting the food-cup. If their analysis is correct, it should be possible to manipulate resistance to extinction through manipulating cup-approach during extinction. One obvious way to do this is to vary the presentation ratio of the click during extinction. In the current study, rats trained under FR 10 (i.e., a click accompanying every tenth bar-press) were extinguished in three groups, with the click accompanying every sixth, tenth, or fourteenth bar-press. If the discrimination habit is well established during training, the schedule of click presentation during extinction should specify the number of bar-presses emitted prior to each visit to the food-cup. Therefore, greater intermittency of click presentation should result in a greater number of bar-presses to extinction, but the same number of cup-approaches.

METHOD

Subjects

Forty-eight male Sprague-Dawley rats, obtained from Spartan Research Animals (Haslett, Michigan), were 110-120 days of age at the start of training. A minimum of five days of ad libitum food and water maintenance was given prior to fourteen days of food deprivation. At the start of deprivation all Ss were given 6 g of chow (Wayne Lab Blox) daily until their weight approached 80 per cent of their ad lib weight. Food rations were then individually adjusted to maintain all Ss at approximately 80 per cent of their ad lib weight throughout the experiment.
The Ss were housed four to a cage (66 x 25 x 19 cm) and maintained under constant light. During weight control, each animal was fed individually in a separate cage. Daily weighings prior to training served to habituate the animals to handling.

Apparatus

A modified operant chamber (24 x 22 x 21 cm) in a sound-attenuated compartment was used. The entrance to the food-cup, 9 cm from the bar, was covered by an opaque hinged door (5 x 6 cm), allowing automatic recording of the frequency, latency, and duration of food-cup visits. Latency of food-cup approach was measured from the click of the food magazine that accompanied a to-be-reinforced bar-press until S opened the food-cup door approximately 20°. The timing circuit for latency was automatically broken if an approach was not made within 20 sec following a reinforced bar-press. A second timing circuit, operated as long as the food-cup door was opened 20° or more, recorded duration of food-cup visits. Observation of the animals indicated that they were unable to view the food-cup without a visit being recorded. Latency and duration measures were taken using digital counters and an astable multivibrator calibrated at five pulses per second.

Continuous chamber lighting was provided throughout a session by two three-watt bulbs diffused through translucent plastic. A ventilating fan served to mask extra-chamber noises. All contingencies and recordings were programmed with standard electromechanical equipment.

Procedure

On Day 1 of pre-training, all Ss were fed 10 pellets (Noyes, 97 mg, used throughout) in a carrying cage. On Day 2 of pre-training Ss were manually shaped to a criterion of 40 bar-presses, and on Day 3 all Ss were given 40 reinforcements on a CRF schedule. The time required to earn the 40 reinforcements was used as a measure of the animal's response speed. The Ss were assigned to one of two groups so as to equate the groups' mean response speed and given ten days of further training.
In the continuously reinforced control group, 12 Ss received 40 CRF reinforcements per day, while in the intermittently reinforced experimental group, 36 Ss received 40 FR 10 reinforcements per day for ten days. Each animal was removed from the apparatus immediately after receiving its fortieth reinforcement to prevent the occurrence of any additional visits to the food-cup. Throughout acquisition and training the following measures were recorded: number of bar-presses and approaches to the food-cup, latency of approach, and duration of visits to the food-cup.

Extinction took place on Day 14. CRF animals received a click of the now-empty food magazine after each bar-press. For extinction, the FR 10 animals were assigned to one of three groups so as to equate the mean response speed on Day 13. The groups were balanced on mean response speed since acquisition rate of responding has been found to be significantly correlated with resistance to extinction (Dutch and Quartermain, 1967). The three FR 10 groups of twelve animals each received a click after 6 (FR 10-6), 10 (FR 10-10), or 14 (FR 10-14) bar-presses. All measures recorded during training were taken at extinction criteria of 3, 5, and 10 min without a bar-press. On Day 15 the Ss were given a second extinction session under conditions and procedures identical to those of Day 14.

RESULTS

Acquisition

Throughout training, the three FR 10 groups were very closely matched on mean number of approaches to the food-cup and mean duration of cup-visits, and were equivalent by the end of training on latency of approach to the food-cup (see Figures 1, 2, and 4).

Figure 1 represents the development of the discrimination habit of partial reinforcement in terms of the reduction of unnecessary visits to the food-cup. On Day 1 of conditioning, the FR 10 group made a mean of 4.89 approaches to the cup per RU, while the animals continued on CRF made only 1.65 approaches per RU.
By Day 10 of training these initial differences had largely disappeared, with CRF animals making 1.30 approaches per RU and FR 10 animals making only 1.39 approaches per RU.

An additional measure of the development of discriminated approach is the latency of approach following a to-be-reinforced or click-accompanied bar-press. Prior to the shift to intermittent reinforcement, all stimuli associated with a bar-press were discriminative stimuli for approach. With the onset of FR 10 training, however, the brunt of the stimulus control of approach is placed on the click of the magazine. Figure 2 shows that, as the click presumably gained discriminative control over approach, latency of approach decreased.

Figure 1. Mean number of cup-approaches per reinforcement for all extinction groups during training. The FR groups all had identical conditions during training. [Line graph: mean number of cup-approaches plotted against days of acquisition (1-10) for the CRF, FR 10-6, FR 10-10, and FR 10-14 groups.]

Figure 2. Mean latency of approach to the food-cup for all groups following the click during acquisition and extinction. [Line graph: mean latency of approach plotted against days of acquisition and the extinction sessions.]

Both the number of approaches to the food-cup and the corresponding latencies strongly suggest that the magazine click had attained a high degree of stimulus control of cup-approach by the end of training. This level of control was seen as a prerequisite to using the click for manipulating approaches to the cup during extinction.

Extinction 1

The mean number of bar-presses and cup-approaches to the 10 min extinction criterion for all groups are shown in Figure 3. The partial reinforcement effect is readily evident, with all FR 10 groups making many more bar-presses to extinction than the CRF group.
As tested with an analysis of variance, there was a significant difference in the number of bar-presses emitted by the FR groups (F = 4.79, df = 2/33, p < .025) but no significant difference in mean number of cup-approaches (F = 0.17). These results show, as predicted, that the mean number of approaches to the cup during extinction was independent of both training and extinction conditions, while mean number of bar-presses to extinction was a direct function of the bar-press to click ratio of extinction. It is interesting to note the symmetrical effect of the experimental manipulation on bar-presses; that is, the FR 10-6 group made a mean of 171 fewer bar-presses than the FR 10-10 group, which in turn made a mean of 162 fewer responses than the FR 10-14 group.

Figure 3. Mean number of bar-presses and cup-approaches to the 10 min extinction criterion for all groups during Extinction 1 and Extinction 2. [Bar graph, one panel per extinction session: mean number of responses (cup-approaches and bar-presses) for each group.]

However, the Denny et al. revision of the RU hypothesis did not precisely predict the total number of bar-presses to the 10 min criterion. Presumably, the failure to predict accurately was due to the breakdown of discriminated approaches during extinction. When the data were evaluated with respect to this breakdown, as described below, the Denny et al. revision of the RU hypothesis accurately predicted the number of bar-presses for FR 10-6, FR 10-10, and FR 10-14.

Discriminated cup-approaches can break down in two ways: (1) the animal should approach after a click but does not, and (2) the animal should not approach (no click) but does. For each rat the number of bar-presses and cup-approaches to these breakdown points in extinction were determined.
The selected criterion for breakdown, based on an examination of the data of two randomly selected Ss, was two approach responses either side of 100 per cent discrimination in a block of 5 or 10 response units. Depending on which happened first, S reached the breakdown criteria when it first made: (1) 3 or fewer approaches for a block of 5 RUs or 8 or fewer approaches for a block of 10 RUs, and (2) 7 or more approaches for a block of 5 RUs or 12 or more approaches for a block of 10 RUs. The mean of these criterion points was taken as the point at which discrimination broke down for each animal. For each treatment group the mean number of bar-presses and cup-approaches to the breakdown point was found.

The mean number of bar-presses to extinction, as predicted by Denny et al., involves the calculation of an empirical discrimination index for each intermittent group. This was found by dividing the mean number of cup-approaches on Day 10 of training for CRF animals by the mean number of approaches on Day 10 of training for each FR group. The discrimination index indicates to what extent the development of discriminated approaches by an FR group approximates that of the CRF group. These values, multiplied by the number of bar-presses in the respective RU (press-to-click ratio of extinction) and by the mean number of bar-presses to breakdown for the CRF group (Nbp in Table 1), yielded the predicted number of bar-presses (see Table 1).

Results in Table 1 show that the mean number of bar-presses predicted from the revised RU hypothesis closely matched the number actually obtained. In the bottom row of Table 1 it can be seen that the number of RUs to breakdown was approximately equal for the three FR groups.

It also appears that the size of the press-to-click ratio determines the number of errors of discrimination that an FR group makes prior to discrimination breakdown. With a larger ratio, there is a greater chance of making the error of approaching the cup without the click.
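The arithmetic behind the predicted values can be sketched in a few lines of Python. This is only an illustration of the product named in the row label of Table 1 (Nbp x RU x DI), using the values printed in that table; the function and variable names are hypothetical and are not part of the original analysis.

```python
"""Sketch of the revised Response-Unit prediction, using values from Table 1."""

# Mean bar-presses to breakdown for the CRF group (Nbp in Table 1).
N_BP_CRF = 23.75

# Per-group values transcribed from Table 1. The dictionary layout and key
# names are illustrative assumptions, not part of the original analysis.
GROUPS = {
    "FR 10-6": {"ru": 6, "di": 0.94, "obtained": 121.3},
    "FR 10-10": {"ru": 10, "di": 0.92, "obtained": 216.7},
    "FR 10-14": {"ru": 14, "di": 0.94, "obtained": 313.0},
}


def discrimination_index(crf_approaches, fr_approaches):
    """Day 10 mean cup-approaches for CRF divided by those for an FR group."""
    return crf_approaches / fr_approaches


def predicted_bar_presses(n_bp_crf, ru, di):
    """Predicted bar-presses to breakdown: Nbp x RU x DI."""
    return n_bp_crf * ru * di


def prediction_table(n_bp_crf=N_BP_CRF, groups=GROUPS):
    """Return predicted, obtained, and obtained-minus-predicted per group."""
    rows = {}
    for name, values in groups.items():
        predicted = predicted_bar_presses(n_bp_crf, values["ru"], values["di"])
        rows[name] = {
            "predicted": predicted,
            "obtained": values["obtained"],
            "difference": values["obtained"] - predicted,
        }
    return rows


if __name__ == "__main__":
    for name, row in prediction_table().items():
        print(
            f"{name}: predicted {row['predicted']:.2f}, "
            f"obtained {row['obtained']:.1f}, "
            f"difference {row['difference']:+.2f}"
        )
```

Per-group Day 10 approach means are not reported in the text, so the sketch takes the DI values directly from Table 1. As a rough check, the pooled Day 10 means given under Acquisition (1.30 approaches per RU for CRF, 1.39 for FR 10) yield discrimination_index(1.30, 1.39) of roughly 0.94, in line with the tabled indices.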
As can be seen in the next-to-last row of Table 1, FR 10-14 made the most cup-approaches and FR 10-10 the next most, while all three groups were approximately matched on total number of RUs to breakdown. The CRF group made 20.6 cup-approaches, a value almost identical to that of FR 10-6.

TABLE 1

A comparison of mean number of predicted and obtained bar-presses to breakdown of discriminated cup-approach together with obtained mean number of response units and cup-approaches

                                                   FR 10-6   FR 10-10   FR 10-14
Number of bar-presses to breakdown by CRF (Nbp)      23.75      23.75      23.75
Extinction response unit (RU)                            6         10         14
Discrimination index (DI)                             0.94       0.92       0.94
Predicted number of bar-presses (Nbp x RU x DI)      133.9      218.5      312.5
Obtained number of bar-presses to breakdown          121.3      216.7      313.0
Number of cup-approaches to breakdown                 20.5       25.6       34.8
Number of RUs to breakdown                            20.2       21.8       22.4

During extinction, latency of cup-approach increased (Figure 2). As extinction progressed, the discriminative control of the click over cup-approach weakened, and late in extinction animals often failed to approach after the click, allowing the timing circuit to go to its 20 sec limit.

Throughout training the post-reinforcement pause typically found under FR conditioning presumably resulted in FR groups spending twice as long in the food-cup as CRF animals (Figure 4). During extinction, duration of visits increased for all groups. As revealed by observation of the animals, the increase in cup duration early in extinction was correlated with prolonged and agitated searches of the food-cup, while late in extinction S tended to lie motionless with its head in the food-cup for long periods.

Extinction 2

The second extinction session indicated that the discrimination of approaching the cup only after the click had been substantially weakened.
Figure 3 shows that the mean number of bar-presses to the 10 min extinction criterion was nearly equal for the three FR groups, though the effect of intermittent reinforcement was still obvious. As with Extinction 1, all groups made approximately the same mean number of cup-approaches during extinction.

Early during the second extinction period, the rats did respond to the discriminative stimulus with a rapid approach to the food-cup. However, as extinction progressed, the frequency of failures to visit the cup increased, again allowing the latency timing circuit to run to its 20 sec completion (Figure 2). Duration of cup-visits also increased over Extinction 1, as the behavioral pattern of agitation and apparent "depression" again appeared (Figure 4).

Figure 4. Mean daily duration of cup-visits for all groups during acquisition and extinction. [Line graph: mean duration of cup-visit plotted against days of acquisition and the two extinction sessions for the CRF, FR 10-6, FR 10-10, and FR 10-14 groups.]

DISCUSSION

The results of the current study provide strong support for the discrimination analysis as proposed by Denny et al., namely that intermittent reinforcement results in increased resistance to extinction to the extent that an animal learns the discrimination of approaching the food-cup only after the click has occurred. Over the ten days of intermittent reinforcement given in the present study, this discrimination developed to a high level for all groups. Comparison of the FR 10 groups showed that lengthening the homogeneous chain preceding cup-approach during extinction increases resistance to extinction. Also, as predicted, all groups made nearly the same number of cup-approaches during both extinction sessions, thus receiving the same number of actual non-reinforcements.
The RU hypothesis of Mowrer and Jones seems to have been correct in assuming that intermittent reinforcement involved a homogeneous chain of non-rewarded operants and a final rewarded operant. However, such response chains are only gradually developed and are not perfectly established even with extended training; moreover, they are not maintained when the terminal reinforcement is withdrawn. Through calculation of a discrimination index reflecting the imperfect development of this response chain and through recognition of the breakdown of discriminated approach during extinction, the RU equation can be revised to predict extinction behavior rather accurately. Day and Platt (1972) have also reported data showing that the revised RU hypothesis of Denny et al. very closely predicts the number of responses to extinction following free-operant reinforcement. Further support for this analysis is the finding by Behan (1954) that the lack of an SD for cup-approach resulted in a failure to develop the discrimination habit of partial reinforcement during acquisition and subsequently a failure to show the partial reinforcement effect during extinction.

The finding that cup-durations increased as extinction progressed was unexpected. The agitated searches of the cup were anticipated, but prevailing theories of extinction would predict that the cues of the food area become aversive and elicit withdrawal responses rather than staying responses. The visits to the cup late in extinction, in which S lay flat and motionless with its head in the cup, can seemingly be likened to "depression". Perhaps prolonged inescapable frustrative non-reinforcement in rats can lead to an amotivational state that could be classified as depression, much as Hurlock (1925) found that continued frustration or failure at the human level seems to reduce morale or motivation.
Although the current study involved only FR reinforcement, the discrimination analysis is intended to apply to all intermittent schedules. In fact, the click of the magazine as an SD for cup-approach may be even more important in VI or VR schedules. Under FR or FT schedules, the inherent regularity of reinforcement following a fixed number of responses or amount of time may serve as an additional (albeit secondary) SD for cup-approach. When the ratio or interval is varied, however, the click becomes the only reliable cue and presumably acquires an even greater level of discriminative control over cup-approach.

REFERENCES

Behan, F. L. Resistance to extinction as a function of the percentage of discrimination with fixed ratio reward. Unpublished M.A. thesis, Michigan State College, 1954.

Day, R. and Platt, J. Several tests of the Response-Unit hypothesis. Paper presented at the meeting of the Psychonomic Society, St. Louis, Missouri, November, 1972.

Denny, M. R., Wells, R., and Maatsch, J. Resistance to extinction as a function of the discrimination habit established during fixed ratio reinforcement. Journal of Experimental Psychology, 1957, 53, 451-456.

Dutch, J. and Quartermain, D. Unrewarded trials and resistance to extinction of a bar-pressing response. Psychonomic Science, 1967, 9, 505-506.

Hurlock, E. B. An evaluation of certain incentives used in school work. Journal of Educational Psychology, 1925, 16, 145-149.

Mowrer, O. H., and Jones, H. Habit strength as a function of the pattern of reinforcement. Journal of Experimental Psychology, 1945, 35, 293-311.

Winer, B. J. Statistical Principles in Experimental Design. New York: McGraw-Hill, 1971.