RESISTANCE TO EXTINCTION AS A FUNCTION OF THE PERCENTAGE OF DISCRIMINATION WITH FIXED RATIO REWARD

Thesis for the Degree of M. A.
MICHIGAN STATE COLLEGE
Frances Lorraine Behan
1954

This is to certify that the thesis entitled "Resistance to Extinction as a Function of the Percentage of Discrimination with Fixed Ratio Reward," presented by Frances Lorraine Behan, has been accepted towards fulfillment of the requirements for the M. A. degree in Psychology.

A THESIS

Submitted to the School of Graduate Studies of Michigan State College of Agriculture and Applied Science in partial fulfillment of the requirements for the degree of

MASTER OF ARTS

Department of Psychology
1954

ACKNOWLEDGEMENT

Grateful acknowledgement is made to Dr. M. R. Denny, Dr. S. H. Bartley, and Dr. D. M. Johnson, who all aided in the completion of this investigation.

TABLE OF CONTENTS

INTRODUCTION
STATEMENT OF THE PROBLEM
APPARATUS
SUBJECTS
PROCEDURE
    Preliminary training
    Training
    Extinction
RESULTS
DISCUSSION
SUMMARY
BIBLIOGRAPHY
APPENDIX

LIST OF TABLES

Table
I. Summary of the results of the U-test comparing groups on the last 10 pre-training blocks
II. Summary of the mean percentage of discrimination on the last 10 pre-training blocks
IIIa. Summary of the results of the U-test comparing groups on each set of 10 blocks during training
IIIb. Summary of the results of the U-test comparing scores for each group with scores for the same group at a different point in training
IV. Mean percentage of discrimination for the last 10 pre-training blocks and for the successive sets of 10 blocks of responses during training
V. Summary of the results of the U-test comparing groups on the two extinction criteria
VI. Summary of the mean number of bar presses to extinction for the two extinction criteria
VII. Summary of the predicted number of trials to extinction by two methods, and the actual number of bar presses to extinction for the cue group for the two extinction criteria

APPENDIX
VIII. Mean percentage of discrimination for successive 10 blocks of responses for all animals during training
IX. Mean number of bar presses to extinction for all animals for the two extinction criteria

LIST OF FIGURES

Figure
I. A schematic presentation of the bar pressing apparatus
II. Mean percentage of discrimination for successive ten blocks of responses for the last ten pre-training blocks (PT-10) and for the training blocks (T-10 through T-100) for each of three groups, FRc, FRnc, and C100

INTRODUCTION

During the past few years psychologists have recognized that there are important differences in behavior between situations utilizing continuous and non-continuous reward techniques, particularly when resistance to extinction is considered. Until recently, few studies dealt with percentage of responses rewarded (partial reward) or the pattern of rewarded and non-rewarded responses.
In a recent review of partial reinforcement, Jenkins and Stanley (1) make the following generalizations:

"1. Acquisition. Response strength is built up somewhat more rapidly under a schedule of 100% reinforcement than under a partial regimen. Differences in learning, however, are not always large, and with prolonged training the ultimate level of acquisition for partially rewarded subjects may approach that for the 100% ones.

2. Maintenance. While the behavior in post-acquisition performance is stable in the partial reinforcement situation, it is usually at a lower level than in the 100% instance. Nevertheless, differences are not always statistically significant and may well be of no great practical consequence.

3. Resistance to extinction. The most striking effects of partial reinforcement are apparent in response strength as measured by resistance to extinction. In almost every experiment, large and significant differences in extinction favoring the groups partially reinforced in conditioning over the 100% ones were found. The practical implications of this principle for maintaining behavior is obvious: Administer the reinforcing stimulus in conditioning according to a partial schedule, and the behavior will be maintained for long periods in the absence of external support from primary reward."

Skinner was one of the first to investigate the phenomena associated with partial reward (1933 and 1936). Using his technique of periodic reconditioning, which formed the basis for his earliest investigations, Skinner measured the rate of bar pressing in a standard Skinner box situation. By employing this bar pressing apparatus, he first studied behavior using a periodic reward technique (reward per unit time) and later studied behavior as a function of reward per number of responses (reward at a fixed ratio) (5).
He was presumably studying a response chain which involved at least three elements: the bar pressing response, the approach to the food tray, and the eating of the food (5, p. 54). With respect to fixed ratio reward, he asserted (5, p. 300):

"As a rather general statement it may be said that when reinforcement depends upon the completion of a number of similar acts the whole group tends to acquire the status of a single response and the contribution to the reserve tends to be in terms of groups."

Skinner's thinking was in part used as a basis for a previous study (7). On the basis of Skinner's analysis, it was reasoned that if discrimination were perfect, then the number of blocks of responses* to extinction under fixed ratio reward would be equal to the number of blocks of responses to extinction under continuous reward (2). The groups of the previous study (7) did not attain perfect discrimination, but the results did imply that the amount of discrimination attained before extinction was a pertinent variable in predicting the number of bar presses to extinction for the fixed ratio groups. A group receiving one reward to five bar presses did not, even after having received 80 rewards on the partial schedule, make five times the number of bar presses to extinction of the control group (continuous reward). The function was thought to be more nearly that the number of bar presses to extinction under fixed ratio reward is equal to the product of the number of bar presses to extinction under continuous reward, the number of bar presses in the fixed ratio block, and the percentage of discrimination at the end of training (2, 3).

*Block of responses is equivalent to a specified behavior sequence terminated by reward, e.g., with one to five ratio, the block is a sequence of five bar presses the last of which is followed by reward.

Partial reward in the present study is viewed as leading to rather complex behavior.
Behavior under partial reward is considered to be discrimination learning; the animal is learning that certain cues lead to reward and certain other cues do not lead to reward. If the animals were to learn to discriminate perfectly, then one would expect that the number of responses to extinction under partial reward training would equal the product of the responses to extinction following training under continuous reward and the number of responses in the fixed ratio block during acquisition. This follows from the view that, during extinction, the animals which were trained under partial reward are receiving the same number of non-rewards per block of responses as the animals which were continuously rewarded.

With the apparatus employed in the present study, it is possible to determine how well animals can learn to use the cues in the experimental situation (click for rewarded response; no click for non-rewarded responses) and how well other animals can learn with no discriminable cues in the experimental situation (click for both rewarded and non-rewarded responses). With a conventional Skinner box, in which the bar and food dish are adjacent, it is not possible to record approaches to the food dish independently of the bar pressing response, and thus a measure of discrimination cannot automatically be recorded. With the apparatus designed for the present study, the bar and the food dish are at opposite ends of a short alley, allowing the bar pressing response to be recorded independently of the approach to the food dish.

STATEMENT OF THE PROBLEM

The present study was designed to test, under a different ratio of reward, some notions (2, 3) which were derived from the data of Wells' study (7).
Wells' data seemed to indicate that the number of extinction trials under one to five ratio was equal to the product of the number of trials to extinction under continuous reward, the amount of discrimination at the end of training, and the number of bar presses in the fixed ratio block. The present study was designed to test this notion with one to three ratio of reward and to find out the effect of a discriminable cue upon the learning of the discrimination and upon extinction.

The measure of discrimination in the present study is as follows: the animal was considered to have made a correct approach to the food dish if and only if it approached the food dish after pressing the bar the appropriate number of times. For animals trained with one to three fixed ratio reward, the correct discrimination was pressing the bar three times and then approaching the food dish only after the third bar press. For continuously rewarded animals, the correct discrimination was pressing the bar and approaching the food dish after each bar press. The percentage of discrimination was equal to the number of correct approaches to the food dish per 10 blocks of responses multiplied by 10.

Specific hypotheses are as follows:

Hypothesis I: If a discriminative cue is present during training and extinction under partial reward, so that it differentiates the rewarded bar press from the non-rewarded bar press, then the number of bar presses to extinction following training under fixed ratio reward will be equal to the product of the number of bar presses to extinction following continuous reward, the number of bar presses in the fixed ratio block, and the final percentage of discrimination attained.
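The arithmetic behind the discrimination measure and Hypothesis I can be sketched as follows. This is only an illustrative sketch: the function names and the numerical values are hypothetical and are not data from the study, and the percentage of discrimination is assumed to enter the product as a proportion (percentage divided by 100).

```python
def percentage_of_discrimination(correct_approaches, blocks=10):
    """Correct approaches to the food dish per set of blocks, as a percentage."""
    if blocks <= 0:
        raise ValueError("blocks must be positive")
    if not 0 <= correct_approaches <= blocks:
        raise ValueError("correct_approaches must lie between 0 and blocks")
    return 100.0 * correct_approaches / blocks


def predicted_presses_fixed_ratio(presses_continuous, presses_per_block,
                                  discrimination_percent):
    """Hypothesis I: continuous-reward presses x block size x discrimination.

    The percentage is converted to a proportion before multiplying
    (an assumption of this sketch).
    """
    proportion = discrimination_percent / 100.0
    return presses_continuous * presses_per_block * proportion


# Hypothetical values, for illustration only (not data from the study).
final_discrimination = percentage_of_discrimination(correct_approaches=5)
prediction = predicted_presses_fixed_ratio(
    presses_continuous=40,        # presses to extinction after continuous reward
    presses_per_block=3,          # one reward to three bar presses
    discrimination_percent=final_discrimination,
)
print(final_discrimination, prediction)
```

With perfect discrimination (100 percent) the prediction reduces to the continuous-reward figure multiplied by the number of bar presses in the block, which is the limiting case described in the Introduction.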
Hypothesis II: If no specific cue is present during the fixed ratio schedule, then the number of bar presses to extinction will be less than the number of bar presses to extinction following training under fixed ratio reward with a cue, but greater than the number obtained following continuous reward with a cue.

Let:

$E_{fr,c}$ = the number of bar presses to extinction under fixed ratio reward with a cue present.
$E_{fr,nc}$ = the number of bar presses to extinction under fixed ratio reward with no cue present.
$E_{c,c}$ = the number of bar presses to extinction under continuous reward with a cue present.
$R$ = the number of bar presses in the fixed ratio block.
$D$ = the final percentage of discrimination at the end of fixed ratio training.

Hypothesis I asserts that: If a discriminative cue is present . . ., then

$$E_{fr,c} = E_{c,c} \times R \times D$$

Hypothesis II asserts that: If no specific cue is present . . ., then

$$E_{c,c} < E_{fr,nc} < E_{fr,c}$$

APPARATUS

The apparatus (see Figure 1) employed was a short unpainted wooden alley, the interior of which was lined with sheet metal. The inside dimensions were 6 inches in height, 4½ inches in width, and 24 inches in length. The top was a hinged door constructed of hardware cloth framed with wood. A metal food tray lined with felt was located at the end of the short alley. Food dropped into this tray via a felt-lined chute from the electrically operated food releasing mechanism connected to one end of the box. A 6-inch metal portion of the floor (treadle) just beneath the food dish was hinged so that it was depressed when the animal stepped on it, closing the microswitch, thus recording all approaches to the food dish. A 2-inch metal bar projected into the box at the end opposite to the food dish. The food releasing mechanism was activated when a pressure of 30 grams was applied to this bar, except when the control box was set for partial reward. An audible click was produced by the activation of the food releasing mechanism.
With partial reward the click occurred only following the rewarded bar presses. Illumination was furnished by a 7½ watt bulb which hung 12 inches above the center of the box. Water was present at all times and was introduced through a glass tube connected to a bottle on the outside of the box near the food dish (see discussion p. 33).

The feeding mechanism, bar, and polygraph were connected to an electric control box which was designed and constructed by T. H. Maatsch. A record was made on a polygraph of the number, duration, and spacing of the bar pressing responses; the occurrences of reward; and the time and presence of the animal on the treadle, i.e., presence of the animal at the food dish or at the water bottle.

FIGURE I. A SCHEMATIC PRESENTATION OF THE BAR PRESSING APPARATUS. B-BAR, T-TREADLE, FD-FOOD DISH, WB-WATER BOTTLE, FM-FEEDING MECHANISM

SUBJECTS

The animals used in the present study were 53 female albino rats from the colony maintained by the Department of Psychology of Michigan State College. Thirty-one of the animals were naive and were 90-100 days old when started on the experiment. Six of these animals were eliminated from the study for reasons given in the procedure. Twenty-two of the animals were used on a previous Skinner box study and were approximately 200 days old when started on the present study. Five of these animals did not finish the study.

PROCEDURE

Preliminary Training

All animals received 9 grams of Purina Dog Chow for five days and then were not fed for 48 hours prior to training. While the animals were on this feeding schedule they lived in individual feeding cages. Animals were never handled to tame them. The only handling by the experimenter occurred in the transporting of the animals to the feeding cages and thence seven days later to the apparatus.

Before each animal was introduced into the apparatus, the bar was in place and two pellets of "lab chow" tablets (0.045 gm. each; made by P. J.
Noyes Company, Lancaster, N. H.) had been placed in the food tray. Animals were allowed to explore and press the bar; however, no single bar press was rewarded until the two pellets of food had been eaten. After the animal had eaten the two pellets of food, each subsequent bar press was rewarded with a pellet of food on the condition that the animal had eaten the pellet that was in the food dish prior to the occurrence of that particular bar press. Accordingly, no more than one pellet of food was in the food dish at any one time. This procedure was followed in order to eliminate hoarding on the part of the animals.

If an animal did not approach the end from which the bar projected, a scratching sound was made at the end near the bar to induce the animal to that end of the alley. If an animal had not eaten after 30 minutes in the apparatus, it was discarded. If an animal had not pressed the bar for 30 minutes after having eaten the two pellets, it was discarded. In total, five animals were discarded for failing to eat and three animals for failing to press the bar. In addition, three animals were discarded because of miscellaneous apparatus failure. Thus 42 animals completed the training program, and this report is based upon data obtained from these animals.

After 10 pellets had been received in this manner, the food releasing mechanism was loaded with 40 pellets, extraneous cues were discontinued, and the animal was allowed to proceed at its own pace with continuous reward at a ½-second delay after each bar press. These 40 continuously rewarded bar presses will hereafter be referred to as the pre-training blocks of responses.

Training

Immediately following these continuously rewarded trials, each animal was given the remaining trials according to the group to which it had been previously assigned.
For the sake of brevity, let us introduce the following symbols to represent the groups: FRc - experimental group receiving a cue for rewarded bar presses, FRnc - experimental group receiving no cue for rewarded bar presses, C100 - control group receiving 100 continuously rewarded bar presses, and C50 - control group receiving 50 continuously rewarded bar presses.

The treatment of the groups following pre-training is as follows:

FRc (n=11): This group received 100 rewards with partial reward at the ratio of one reward to three bar presses. For this group the food releasing mechanism made no click when food was not presented. The control box activated the food releasing mechanism only on every third bar press (rewarded bar press).

FRnc (n=9): This group received 100 rewards with partial reward at the ratio of one reward to three bar presses. For this group the food releasing mechanism was activated every time the bar was pressed, but food was given only on every third bar press. This was accomplished by placing food in every third hole of the magazine. Thus each bar press was followed by a click, but only every third bar press was followed by reward. This is the no cue group.

C100 (n=12): This group received 100 continuously rewarded bar presses.

C50 (n=10): This group received 50 continuously rewarded bar presses. This group was added after the above three and was included to check the effect of the lowered discrimination in the C100 group after 50 rewards on the number of trials to extinction.

All of the pre-training trials, training trials, and extinction trials were given on the same day. No interval of time was introduced by the experimenter between the divisions of the study. Each animal performed at its own rate during each division of the study.

Extinction

Immediately following training, each animal was kept on its training schedule, although no further rewards were administered.
Two criteria of extinction were considered: failure to press the bar for three minutes and failure to press the bar for ten minutes.

RESULTS

In the present study the training and pre-training data were recorded as the percentage of discrimination during a set of ten blocks of responses (see p. 2), i.e., the ratio of the number of correct approaches to the food dish per ten blocks of responses multiplied by 100. Since Snedecor (6, pp. 316, 447) advises the use of an arc sine transformation when dealing with percentages, the pre-training and training data were transformed into arc sines using the table presented by Snedecor (6, p. 449).

Since one of the requirements for the use of analysis of variance is homogeneity of variance, Bartlett's test was applied to the arc sines of the percentages of discrimination. The group variances were heterogeneous for the training data; therefore the decision was made to use non-parametric statistics. The Mann-Whitney U-test (4, pp. 128-130) was applied to the pre-training and training data. The results of these analyses are summarized in Tables I, IIIa and IIIb.

Table I summarizes the results of the U-test as applied to the last ten blocks of responses during pre-training. There were no significant differences in percentage of discrimination between groups of sophisticated animals and groups of naive animals, although (Table II) the groups of

TABLE I

SUMMARY OF THE RESULTS OF THE U-TEST COMPARING GROUPS ON THE LAST 10 PRE-TRAINING BLOCKS

Comparisons between:    U       E(U)    σ_U      p
Soph. and naive         135     184     35.01    < .17
C100 and C50            90      66      16.25    < .15
C100 and FRc            42      54      14.07    n.s.
C100 and FRnc           in      42      11.83    n.s.
FRc and FRnc            32.5    31.5    9.16     n.s.
FRc and C50             68.5    49.5    13.16    n.s.
FRnc and C50            27      38.5    11.05    < .30

TABLE II

SUMMARY OF THE MEAN PERCENTAGE OF DISCRIMINATION ON THE LAST 10 PRE-TRAINING BLOCKS

Groups    Sophisticated    Naive    Group Means
FRc       66.7
                           16.0     52.2
FRnc      65.0             use      50.0
C100      52.0             n.3,     v.5
C50       43.3             24.0     3&05
Means     53.1             39.9

sophisticated animals were consistently superior to the groups of naive animals. In addition, there were no significant differences between groups during the last ten pre-training blocks. This latter result is to be expected since all groups had been treated alike up to this point of the study. We can say that all groups began the training period at approximately the same level of discrimination. It is interesting to note (see Table II) that at this point, although the animals had received 40 continuously rewarded blocks of responses, the discrimination was low (approximately 45 percent).

Table IIIa summarizes the results of the U-test comparing groups on successive 10 blocks of responses during training. The experimental groups do not seem to perform differently from each other during training except during the last 10 blocks of responses, but as can be seen in Figure 2 and Table IV, the percentages of discrimination for both groups are extremely low even at the end of training. The results indicate that the auditory cue apparently does not easily facilitate discrimination.

The results of the U-test as summarized in Table IIIa indicate that the control group performs significantly better than both of the experimental groups through 60 blocks of responses. After 70 blocks of responses the discrimination for the control group decreases, as can be seen in Figure 2, and the cue group shows an increase in

TABLE IIIa

SUMMARY OF THE RESULTS OF THE U-TEST COMPARING GROUPS ON EACH SET OF 10 BLOCKS DURING TRAINING

        Comparisons between:    U          E(U)    σ_U      p
T-10    C50&100 and FRc         380        121     26.18    < .0001
T-10    C50&100 and FRnc        168        99      22.98    < .0030
T-10    FRc and FRnc            39.5       49.5    13.16    n.s.
T-20    C50&100 and FRc         (4.12.5    121     26.18