THE FUNCTIONAL ROLE OF SELF-ADMINISTERED CONSEQUENCES: AN INVESTIGATION OF THE STIMULUS CONTINGENCY-CONTIGUITY DIMENSION

Dissertation for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
RONALD CARL RIGGS
1976

ABSTRACT

THE FUNCTIONAL ROLE OF SELF-ADMINISTERED CONSEQUENCES: AN INVESTIGATION OF THE STIMULUS CONTINGENCY-CONTIGUITY DIMENSION

By
Ronald Carl Riggs

Sixteen female white Carneaux pigeons were trained to peck an illuminated disk before eating from a freely available food source, thus rewarding their own performance. This self-reinforcement pattern was established during a training period of punishing non-contingent self-feeding by food withdrawal. Eight subjects were trained to key-peck once before approaching the food hopper (continuous reinforcement or CRF); the remaining eight subjects were trained to key-peck five times before approaching the food hopper (fixed ratio 5 or FR5). The effects of these schedules of reinforcement in free-food and no-food testing conditions were measured in terms of number of responses, number of sessions, number of trials, number of reinforcements, and number of transgressions to extinction. While the free-food variable produced a greater number of responses, sessions, and trials to extinction, the schedule had no effect on any of the five variables. The absence of a partial reinforcement effect was interpreted as indicating that self-administered consequences do not have reinforcing effects on preceding behaviors. The results indicate that contingency may be a necessary condition for reinforcement and cast doubt on the automaticity assumption, the assumption that a positive consequence automatically strengthens a preceding behavior. Implications for further research are discussed, as are implications for counseling/psychotherapeutic applications.
THE FUNCTIONAL ROLE OF SELF-ADMINISTERED CONSEQUENCES: AN INVESTIGATION OF THE STIMULUS CONTINGENCY-CONTIGUITY DIMENSION

By
Ronald Carl Riggs

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Personnel Services, and Educational Psychology

1976

© Copyright by RONALD CARL RIGGS 1976

DEDICATION

As this manuscript represents the culmination of a number of years of endeavor, it should be dedicated to those people who were most important to me over that period of time. Thus, it is for my Mother and Father, who taught me a great many things I could not have learned from books, my Brother, to whom I have grown closer, and certainly Donna, whom I was wise enough to marry.

ACKNOWLEDGEMENTS

I would like to acknowledge the assistance of the following people during the course of this study:

Steve Overmann, Rob Howard, and Rod Young, for their technical assistance in the lab;

James Engelkes and Norman Stewart, for serving as committee members and for critically reviewing the manuscript;

Richard Johnson, for serving as committee chairman, for his assistance and encouragement throughout the project, for critically reviewing the manuscript and for allowing me the latitude to conduct a rather unusual study;

M. Ray Denny, for serving as committee member, for his assistance in securing lab space, equipment, and subjects, for critically reviewing the manuscript, for providing a theory that best explains the data and gives some of my ideas better order, and for teaching me to analyze observations in terms of that theory; and

Donna Riggs, for providing support and encouragement over the course of my studies and for typing the several revisions of the manuscript.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER I
    INTRODUCTION
CHAPTER II
    REVIEW OF THE LITERATURE
        Introduction
        The Mowrer and Ullman Study
        The Mahoney-Bandura Analogue Study
        The Bandura-Mahoney Analogue Study
        The Caplan Analogue Study
        The Mahoney-Bandura-Dirks and Wright Analogue Study
        Related "Free Food" Studies
        Summary
CHAPTER III
    METHODOLOGY
        Subjects
        Apparatus
        Procedure
CHAPTER IV
    RESULTS
        Training
        Testing
        Relationship between Training and Testing Variables
        Summary
CHAPTER V
    DISCUSSION
        Implications for Further Research
        Human Implications
REFERENCES

LIST OF TABLES

Table
1   Summary of Method Parameters, Caplan Study
2   Summary of Method Parameters
3   Mean Number of Sessions to Acquisition of the Self-Reinforcing Response for Groups I-IV
4   Mean Number of Key-Peck Responses to Acquisition of the Self-Reinforcing Response for Groups I-IV
5   Mean Number of Reinforcements to Acquisition of the Self-Reinforcing Response for Groups I-IV
6   Mean Number of Punishments to Acquisition of the Self-Reinforcing Response for Groups I-IV
7   Mean Number of Trials to Acquisition of the Self-Reinforcing Response for Groups I-IV
8   Mean Number of Sessions to Extinction of the Self-Reinforcing Response for Groups I-IV
9   Mean Number of Key-Peck Responses to Extinction of the Self-Reinforcing Response for Groups I-IV
10  Mean Number of Trials to Extinction of the Self-Reinforcing Response for Groups I-IV
11  Number of Testing Sessions to Extinction of the Self-Reinforcing Response for Groups I-IV - ANOVA
12  Number of Key-Peck Responses to Extinction of the Self-Reinforcing Response for Groups I-IV - ANOVA
13  Mean Number of Reinforcements and Mean Number of Transgressions to Extinction of the Self-Reinforcing Response for Groups III and IV
14  Pearson Product-Moment Coefficients of Correlation Showing Relationships between Training and Testing Variables for Groups III and IV
15  Pearson Product-Moment Coefficients of Correlation Showing Relationships between Training and Testing Variables for Groups I and II

LIST OF FIGURES

Figure
1   Number of transgressions per session to acquisition of the self-reinforcing response: Group 1 (FR5, no food).
2   Number of transgressions per session to acquisition of the self-reinforcing response: Group 2 (FR1, no food).
3   Number of transgressions per session to acquisition of the self-reinforcing response: Group 3 (FR5, free food).
4   Number of transgressions per session to acquisition of the self-reinforcing response: Group 4 (FR1, free food).
5   Number of key-peck responses per session to extinction of the key-peck response: Group 1 (FR5, no food).
6   Number of key-peck responses per session to extinction of the key-peck response: Group 2 (FR1, no food).
7   Number of key-peck responses per session and number of self-reinforcements per session to extinction of the self-reinforcing response: Group 3 (FR5, free food).
8   Number of key-peck responses per session and number of self-reinforcements per session to extinction of the self-reinforcing response: Group 4 (FR1, free food).

CHAPTER I

INTRODUCTION

A topic generating much recent investigation by behavioral scientists is that of self-reinforcement. Current data indicate the origins of self-control lie in the biological and social environment (Jeffrey, 1974). In the counseling/psychotherapy situation, this typically involves the client/patient observing his own behavior and controlling what happens after the particular behavior of concern is emitted or withheld. Self-reinforcement has no firm basis in the experimental laboratory. With many behavioral therapists currently using various self-control techniques, it is imperative that research into basic processes be conducted. This would allow for more effective human applications in terms of presenting a more precise conception of when such procedures are indicated as the therapeutic technique of choice and for refining the strategy for changing behavior. This study investigated the existence of a self-reinforcement phenomenon.
Goldiamond (1965), among others, argued that behavior is a function of the environment. Reinforcement contingencies provided by social and natural events continuously modify a person's behavior (Kanfer and Karoly, 1972). It can also be argued that behavior is a function of many environments, including the private environment within the skin (Mahoney and Thoresen, 1974), thus yielding self-control mechanisms as a research area. Extreme environmentalists would maintain, however, that self-control "really refers to certain forms of environmental control of behavior" (Rachlin, 1970). It should also be noted that some authors (e.g., Gewirtz, 1971; Stuart, 1972) are critical of "self-hyphenated" terms for several reasons: 1) they are usually poorly defined, 2) descriptive and explanatory functions are often confused, 3) the role of environmental causative factors is often deemphasized or forgotten.

A great deal of the interest in the self-control area stems from two assumptions (Jeffrey, 1974): 1) the probability of generalization and maintenance of behavior is dramatically increased, 2) self-control strategies are more efficient and cheaper than traditional therapies. There is a paucity of data to substantiate either assumption.

Self-control strategies generally rely on the prearrangement of cues relevant to the target behavior (Mahoney and Thoresen, 1974). The rationale for this type of stimulus control is borrowed directly from laboratory research on discrimination learning (Mahoney, 1972). However, self-control strategies may also call for the prearrangement of response consequences, akin to Skinner's (1953) "controlling response". There is little laboratory research concerning the organism's management of response consequences.
Kanfer's (1971) model of self-regulation posits three components of self-control (SC): self-monitoring (SM), the inputs following a response; self-evaluation (SE), a conditional discrimination; and self-reinforcement (SR), the delivery of a reinforcer to the person by the person. This permits the generation of the following formulation:

SC = SM + SE + SR

The SR component could also be SP or self-punishment, depending on the nature of the self-evaluation.

The concept of observing one's own behavior is intricately entwined with an evaluation of the performance. Self-monitoring has an inconsistent reactive effect - that is, learning to observe one's own behavior and then to perform that observing response may or may not alter or modify the observed behavior (Kazdin, 1974). This may in some instances be due to a negative self-evaluation and resultant self-punishment (Hannum, Thoresen, and Hubbard, 1974). At any rate, the reactive effects of self-monitoring attenuate over time more frequently than not.

A criterion making self-evaluation possible is a necessary component of any explanation of self-control. Kazdin (1974) argues that this feedback component contributes to behavior change. Bolstad and Johnson (1972) maintain that the evidence suggests that behavior maintained by self-reinforcement may be more resistant to extinction than behaviors maintained by external reinforcement and that such results may be due to the conditioning of self-evaluative responses as secondary reinforcers.

Homme (1965) has noted that if behavior is a function of its consequences, it does not matter who manipulates the consequence, even if it is the person himself. The variable outcomes of self-reinforcement studies, however, would indicate that there is some question whether a person can manage his own contingency.
Bandura (1971) indicates that self-reinforcement includes several subsidiary processes, including a) that the reinforcers are under the person's own control, and b) that the person serves as his own reinforcing agent. He goes on to say the basic question is whether self-generated consequences serve a reinforcing function in regulating behavior. Both Premack (1973) and Bandura (1971) express the same thought: there is nothing significant about a person handing himself a reinforcer, but rather his requiring a criterion performance before doing so.

The issue resolves to one of contingency vs. contiguity, a Premackian issue (Mahoney, 1972). According to Premack, in order for a response-consequence pairing to be contiguous, the probability of the consequence can in no way be decreased from its ad libitum value. He goes on to argue that a contingency, in which the probability of the consequence is based on the occurrence of the response, is not sufficient for a reinforcement effect; the issue he is raising is the problem of self-deprivation, since in a satiated organism the consequence has little reinforcing value. It would appear that a self-administered consequence is more akin to stimulus contiguity than to stimulus contingency in that the organism is not likely to deprive itself of a highly potent reinforcer. Harrison and Schaeffer (1975) have presented data indicating that temporal contiguity is not a sufficient condition for reinforcement.

Paul (1969) has recommended the following sequence for behavior modification research: 1) developing treatment procedures in the laboratory, 2) testing procedures in case studies and single-group experiments, 3) evaluating systematically controlled group studies, and 4) conducting comprehensive factorially designed experiments. The first step in this sequence could be conducted with human or non-human subjects. The complexity of human behavior, however, makes it exceedingly difficult to isolate and identify causal factors.
A broad parallel exists between principles governing non-human behavior and principles governing human behavior (Mahoney and Thoresen, 1974). Thus it is possible to investigate basic mechanisms by observing non-human behavior and generalizing to the human domain.

Catania (1975) has raised the question of whether self-administered consequences have reinforcing effects, i.e., does self-reinforcement raise the probability of self-reinforced responses. This study attempts to answer this question by systematically comparing continuous and fixed ratio 5 schedules of reinforcement during training in free-food and no-food testing conditions. If the probability of the self-reinforced response is altered, then the partial reinforcement effect (PRE), which has been so well established in the external reinforcement paradigm (e.g., Reese, 1962; Young and Costelloe, 1974) as well as classical appetitive conditioning (Poulos and Gormezano, 1974), should also be found in the self-reinforcement paradigm. The finding of a partial reinforcement effect would thus support the assertion that self-administered consequences do have reinforcing effects, whereas the absence of a partial reinforcement effect would indicate that self-administered consequences do not have reinforcing effects.

CHAPTER II

REVIEW OF THE LITERATURE

Introduction

The topic to be reviewed is the literature concerning studies of self-reinforcement employing non-human subjects. While there exists a considerable amount of research concerning self-reinforcement in human subjects, many of these studies have methodological deficiencies (Mahoney, 1972), and the possible presence of confounding variables, as previously noted, precludes isolation and identification of causal factors.

Although there exist only five such studies using non-human subjects, this in itself is indicative of the state of knowledge in this area. Each study will be reviewed in detail with particular attention to the parameters employed.
The studies are presented chronologically; logical development between each and extension to the current endeavor is apparent.

Finally, various "free food" studies will be reviewed briefly. While not directly related to the topic of self-reinforcement, there exist methodological similarities as well as the possibility of the operation of the same basic mechanisms.

The Mowrer and Ullman Study

The Mowrer and Ullman study (1945) demonstrated that time is an important variable in self-reinforcement. The subjects were rats maintained at 85% of their free-feeding weight. A discrete trial procedure was employed with an inter-trial interval of 60 seconds. During training a buzzer sounded for two seconds and as the buzzer terminated, food was delivered into a trough. Procedurally, this is a classical conditioning paradigm. The only difference between this procedure and contemporary research in autoshaping is that Mowrer and Ullman did not provide an operandum to which subjects' responses could be directed.

Subjects were trained until the latency of touching the food pellet was less than one second. The experimenters selected punishment as the procedure for establishing self-denial. For three groups of subjects the shock was delivered 3, 6, or 12 seconds after the pellet appeared provided the subjects took the food within a 3-second interval after it appeared in the trough. All three groups were identical except with respect to how soon the shock was administered following touching the food within the 3-second interval. If the subject waited for three seconds after food presentation, punishment was avoided. Therefore, the self-reinforced response was waiting for three seconds after food appeared. As the authors put it, "one might think of this as a kind of 'rat etiquette' according to which it was not 'polite' to eat until the prescribed length of time had elapsed." Finally, the function of the buzzer changed during testing.
It functioned as a warning signal and remained on for 3, 6, or 12 seconds until the shock occurred. The three response options available to the subject were selected as the dependent variables: a) taking the food during the danger period and receiving a shock; b) not responding at all, thereby not receiving food or shock; and c) delaying eating for 3 seconds, thereby obtaining the food without shock.

The results of this experiment showed the acquisition of self-reinforced responding for most of the subjects in the group with the 3-second delay of punishment. The 12-second group showed no improvement in self-reinforced responding and intermediate improvement was obtained for the 6-second group. A limitation on these findings was that the results were dependent upon maintaining the warning stimulus throughout the delay interval. Pilot research demonstrated a failure of self-reinforcement when the buzzer was terminated after 3 seconds.

This study demonstrates that punishment is an effective procedure for eliminating eating responses with short latencies, and indicates that the effects of punishment decline as the temporal interval between the response and punishment increases. The experiment demonstrates that the analysis of self-reinforcement with non-human subjects is feasible. It is unfortunate, however, that this study concerned itself only with the acquisition of the self-reinforced response, rather than proceeding into a testing situation.

The Mahoney-Bandura Analogue Study

Mahoney and Bandura (1972) trained three pigeons to key-peck and then gradually moved the hopper presentation forward temporally until the hopper was presented before the key-peck was exhibited. If the subject entered the hopper without first emitting the key-peck, the hopper was withdrawn and lights turned off for the duration of the inter-trial period. If the subject key-pecked before entering the hopper, it was allowed 3.5 seconds of feeding time.
After the feeding period was completed the hopper was withdrawn and all lights turned off for the duration of the inter-trial interval. It should be noted that the disc was illuminated green except immediately after a peck, when it pulsed white and a buzzer sounded. During test conditions punishment for transgressive behavior was completely removed. Training continued until each subject exhibited self-reinforcement on 100 consecutive trials.

Subject 1 displayed close to 100% self-reinforcement and maintained this level for 1000 test trials and emitted 1.07 key-pecks per self-reinforcement. At this point, reversal was attempted, that is, the experimenters attempted to halt the key-peck and self-reinforcing responses. Modeling was not effective but response-prevention was. In order to reinstitute self-reinforcement it was necessary to employ the shaping procedure, and self-reinforcement extinguished rapidly on subsequent testing trials.

Subject 2 displayed an almost equally high self-reinforcement rate over the first 800 test trials (averaging 2.49 pecks per self-reinforcement) and then showed wide fluctuation over the next 650 trials. Reconditioning of the self-reinforcing response took 17 sessions to reach 100% and the self-reinforcement rate fluctuated widely during the second testing phase.

In the second phase of the experiment, subject 3 was required to adopt increasing performance requirements for each self-reinforcement. The white light-buzzer occurred only on criterion responses. An 80% self-reinforcement criterion was used to raise the fixed ratio level from 1 to 5, at which point punishment for transgressions was discontinued. The fixed ratio was increased by 1 up to 9 using the light-buzzer cue. The fixed ratio 9 rapidly extinguished; although it was possible to reinstitute a fixed ratio 5 rapidly, it extinguished rapidly.

Possible explanations of the maintenance of the key-peck with a freely accessible reinforcer are discussed.
Intrinsic reinforcement value in pecking behavior is dismissed on the basis of other research (Neuringer, 1969). A second explanation in terms of punishment-induced effects, similar to avoidance behavior, is dismissed as lacking ease of reinstatement. The explanation in terms of self-reinforcement is accepted.

It seems to this writer that the explanation in terms of avoidance behavior is much too rapidly discarded. Also Elson (1973) makes an excellent case for an explanation in terms of the light-buzzer as a conditioned secondary reinforcer. This is supported in research by Alferink, Crossman, and Cheney (1973). Elson further points out that this may simply enhance acquisition of self-reinforcement but may not be a necessary condition. The idea that the probabilities have reversed and that the key-peck now serves to reinforce eating behavior is not considered. The results of the fixed ratio study would not support the latter explanatory hypothesis.

Denny (1970, 1971) has presented the idea, drawn from ethological considerations, that blocking ongoing consummatory behavior produces frustration that leads to other appropriate or inappropriate consummatory responses. Thus, removing food from a feeding situation, as in the punishment for transgressions, leads the organism to make feeding responses to other objects, such as the key in the apparatus. This line of thinking is not considered, although it appears viable in explaining acquisition.

The Bandura-Mahoney Analogue Study

Bandura and Mahoney (1974) conducted a two-phased study; the first phase, using two male White King pigeons as subjects, examined conditions maintaining self-reinforcement functions, while the second phase, using a Cocker-Poodle as the subject, examined transfer of self-reinforcement to new responses.

In the first phase, the two subjects were trained to key-peck to an 80% criterion on a fixed ratio 5 schedule using their previously discussed training procedure.
Punishment for transgressions was then discontinued for two days of testing. Then the performance standard was progressively raised by one response until the subjects discontinued the key-peck. At that point training was reinstated and the performance criterion was raised to 150% of the level at which the subjects had discontinued the key-peck. This was conducted to an 80% criterion, followed by testing until the subjects discarded the performance requirement.

The subjects were retrained at their highest previous level. Initial maintenance was accomplished via a 1.00 probability of punishment for transgression; after two days this was decreased to .90 and progressively reduced to 0.00. Findings indicated that a .50 probability of punishment or less had relatively weak sustaining value; this is very similar to some of the results obtained by Caplan, to be discussed next.

An extension of this phase involved a similar procedure with one subject; however, the response was a treadle press rather than a key-peck. The results of the key-peck study were essentially replicated.

In the second phase the Cocker-Poodle had control of both amount of food reinforcement and of the work requirement. Secondary reinforcers were employed during training (but discontinued in testing). The task was "typing", which was extinguished, and pressing a telegraph key was then substituted. A key-press resulted in a tone as a discriminative stimulus for self-reinforcement. The first transfer task was jumping through a 30-inch hoop; there were two testing sessions in which appropriate performance produced the tone but no food. The second transfer task was a treadle press. Findings indicated a 90-99% performance adherence and a 60-80% consumption adherence (the range due to the two different tasks).

The authors discuss consistency across subjects, responses, and species in acquisition and maintenance.
Non-human subjects were able to acquire self-reinforcement functions, but were quick to discard performance standards if transgressions were not punished, the optimal level of sanctions depending on the "onerousness of the requisite performances". This is further explored and commented on in the discussion of the Caplan study. They remark that a high response output is related to more rapid abandonment of self-reinforcement; this has direct implications for the current study. Finally, they discuss briefly the importance of the performance standard as a discriminative stimulus for self-reinforcement and the importance of differential reinforcement in this respect; this discriminative stimulus analysis is very closely related to the analysis of Catania reported in the discussion section of this study.

The Caplan Analogue Study

Caplan (1974) had several purposes in his study: a) five different punishment probabilities were present during testing to study the effect on extinction; b) a lenient and a stringent training criterion were set to study the effect on acquisition and extinction; c) the effect of punishment training was studied by omitting the training procedure; and d) one group received a 0.75 probability of punishment during training to determine if the self-reinforcing response could be acquired with a probability less than 1.00.

The procedure included the pulsating white light; white noise provided a masking sound. Reinforcement was 3.5 seconds access to mixed grain; punishment was immediate withdrawal and a 30-second blackout. Training is summarized in Table 1. During testing it was assumed that once a subject dropped below 40% self-reinforcement, it never exceeded 40% self-reinforcement again (this was supported). Some of the results included:

1) During testing a 0.75 probability of punishment for transgression was more effective than the lower probabilities (.50, .25, and .00) in maintaining the key-peck.
2) A more stringent training criterion (100% self-reinforcement) was more effective in maintaining the self-reinforcing response than a lenient training criterion (94% self-reinforcement).

3) When punishment for transgression does not occur during training, self-reinforcement is reduced during testing.

4) A free-food history resulted in an initial tendency to reduce the percent of self-reinforcement in testing.

5) There was no correlation between number of days to training criterion and number of days to testing criterion.

Table 1
Summary of Method Parameters, Caplan Study

Group   Pretraining              Training              Testing
I       magazine training;       1.00 prob. pun.;      0.00 prob.
        shape; 10 days FR1       2 days X̄ ≤ 3
II      "                        "                     0.25 prob.
III     "                        "                     0.50 prob.
IV      "                        "                     0.75 prob.
V       "                        "                     1.00 prob.
VI      "                        1.00 prob. pun.;      0.00 prob.
                                 2 days at 0
VII     "                        "                     0.75 prob.
VIII    "                        FR1 for 15 days       0.00 prob.
IX      "                        0.75 prob. pun.       -
X       magazine training;       1.00 prob. pun.;      0.00 prob.
        10 days free food;       2 days X̄ ≤ 3
        shape; 10 days FR1

The inverse of these results can also be stated, e.g., the second might be rewritten as a more stringent training criterion helped to reduce transgressions in testing.

In his discussion Caplan addresses Skinner's (1953) question as to "whether the self-generated consequence has any strengthening effect upon the behavior which precedes it", and asserts that his results indicate that it does not. Caplan asserts that his data for self-reinforcement more closely resemble extinction curves and that "the response which is reinforced in the testing phase is that of transgression". The transgression curves are acquisition curves. He provides two explanatory theories.

In a Premackian view, the study involved the relationship between two responses: pecking and transgression. Initially transgressing is the more probable response; punishment for transgression training reverses these probabilities.
In testing, the two responses return to their initial probability relationship; this return can be retarded by continued punishment, increased severity of training criterion, and increased probability of punishment in training (1.00 is a necessary condition), among other factors.

Caplan also discusses the results in terms of elicitation theory (Denny, 1971). During pretraining the grain elicited an eating response. As the magazine was contiguous to the food temporally and physically, the magazine came to elicit an approach response. During magazine training and key-peck shaping, the click of the magazine came to elicit approach to the key. During training both the magazine and the key were present at the start of a trial; two competing approach responses were present. But approach to the magazine was punished whereas approach to the key was followed by grain access. Thus, the eliciting value of the key increased to the point at which the pigeon approached the key rather than the magazine.

The testing phase shall be discussed only for the groups with 0.00 probability of punishment. During testing, the punishment for approach to the magazine was removed. When the punishment was removed the subject either approached the magazine or pecked the key in order to have grain access. The key-peck response extinguished more rapidly for the group trained to the lenient criterion (magazine approach still occurred) than for the stringent criterion (magazine approach no longer occurred); the most likely explanation for this is that the subjects with the stringent criterion approached the magazine less at the start of testing. Since the response of approach to the magazine was again followed by grain, the magazine again began to elicit the approach response. The key began to be a less effective elicitor and the key-peck gradually extinguished.
Thus the eliciting values of the magazine and of the key returned to their initial levels, as did the probabilities of the responses they elicited.

Caplan maintains that this elicitation analysis handles the data and does not leave the mystery of why the key-peck was not reinforced by the freely available grain.

The Mahoney, Bandura, Dirks, and Wright Analogue Study

Mahoney, Bandura, Dirks, and Wright (1974) attempted to determine whether two male Capuchin monkeys would prefer self-monitored or externally imposed systems of reinforcement. A concurrent choice procedure, in which the subject must respond to one operandum in order to make one of two other operanda (self-reinforcement or external reinforcement keys) operative, was coupled to the training procedure previously developed by Mahoney and Bandura (1972). A training criterion of 90% was set before introducing the choice system; during choice system training punishment for transgression was included if the subject chose the self-reinforcement option. Punishment was removed during the preference test.

Subject 1 quickly developed a strong preference for the self-reinforcement system and continued responding for over 1000 trials. Subject 1 also emitted more responses per reinforcement in the self-reinforcement condition than in the external reinforcement condition over the first 500 trials. Subject 2 showed a "small, consistent preference" for external reinforcement, but the findings are confounded by a left-response bias. This does not fully account for the preference pattern because the subject gradually shifted in the self-reinforcement direction, although only to 49% preference. Subject 2 emitted more responses per reinforcement in the external reinforcement than in the self-reinforcement condition.

Mahoney et al. (1974) state the results may indicate subjects prefer a self-monitored system.
The performance of subject 1 clearly supports this, and possible reasons for the lack of a clear preference in the performance of subject 2 are provided (basically it was punished much more in the self-reinforcement condition than was subject 1 - i.e., it was "dumber"). After the subjects had shown a somewhat higher transgression rate, they continued self-reinforcement at a 92% level. The authors hypothesize that the intermittent reinforcement on an external reinforcement basis might constitute more favorable maintaining conditions than paradigms without external supports. However, they refrain from conclusiveness due to the limited data.

A point not made by Mahoney et al. (1974) is that each subject preferred the condition (external reinforcement or self-reinforcement) opposite to that on which it was first trained; the data are not sufficiently extensive to support more than noting this. The secondary reinforcement value of the visibility of the reinforcer in generating a preference for the self-reinforcement condition as seen in subject 1 is likewise not explored.

Related "Free Food" Studies

There have been a number of investigations of the effects of freely available food on an externally reinforced operant. Jensen (1963) demonstrated that rats would continue to bar-press to obtain much of the food consumed during an experimental session; Neuringer (1969) extended these findings to pigeons. Singh (1970) obtained similar results demonstrating that rats prefer reinforcement obtained by means of a bar-press over noncontingent reinforcement which was programmed at the same density as the contingent reinforcement. Carder (1972) observed maintenance of the bar-press operant for a liquid food reinforcer but not for a water reinforcer. Knutson and Carlson (1973), however, found that the operant was maintained in the presence of both free food and free water.
Sawisch and Denny (1973) replicated Neuringer's results and additionally observed, in accord with Premack's reinforcement principle, that availability of the key (high probability response) could reinforce both the eating and non-eating of free food. Tarte (1974), however, found that the presence of free food caused the diminution of extinction responding. Hothersall, Huey, and Thatcher (1973) found rats to show a preference for free food, and that preference increased further when more than one response was required to produce a food pellet. They interpret these results as contraindicating a generalized tendency in rats to prefer response-produced to free food. Alferink, Crossman, and Cheney (1973) suggested that continued responding was an artifact of the design, that is, a function of the conditioned reinforcing properties of the hopper light. Powell (1974) found that neither of two species of rats responded appreciably for water when free water was available but that crows showed appreciable responding for food in the presence of identical free food. He suggests that type of reinforcer and species are variables which significantly influence this phenomenon.

Summary

The experimental study of self-reinforcement with non-human subjects is feasible, and punishment is an effective procedure for shaping the self-reinforcing response. The self-reinforced behavior is typically the key-peck response, and the key-peck typically is maintained over hundreds of trials after training the self-reinforcing response. Whether this resistance to extinction is due to the reinforcing effects of self-administered consequences is, however, open to question. The results of the various "free food" studies further complicate the data.

Bandura and Mahoney (1974) and Caplan (1974) have demonstrated a high probability of punishment during testing to be necessary to sustain a high self-reinforcement rate.
Only Caplan indicates that his data suggest that self-administered consequences do not strengthen the preceding behavior.

Both the Mahoney-Bandura (1972) study and the Bandura-Mahoney (1974) study included some examination of the effect of various schedules of reinforcement. Neither study approached this examination systematically, but the limited data of both studies indicated the absence of a partial reinforcement effect.

The purpose of this study is to investigate whether self-administered consequences have reinforcing effects by systematically comparing groups trained on continuous and fixed ratio 5 schedules of self-reinforcement in both free-food and no-food testing conditions. The finding of a large and statistically significant partial reinforcement effect on number of responses, number of sessions, number of trials, and/or number of self-reinforcements to extinction would support the assertion that self-administered consequences do have reinforcing effects. No differences or differences in the opposite direction would indicate that self-administered consequences do not have reinforcing effects. The finding of an interaction in which the partial reinforcement effect is found in the no-food testing condition but not in the free-food testing condition might indicate that the partial reinforcement effect is an artifact of the traditional operant no-food extinction condition.

CHAPTER III

METHODOLOGY

Subjects

The first sixteen of nineteen experimentally naive white Carneaux pigeons to acquire the key-peck response by means of shaping by successive approximation were randomly assigned to four groups of four pigeons each. The subjects were maintained at approximately 80% of their free-feeding weights and were run at approximately 80 ± 2%. The subjects were housed in individual home cages under conditions of constant illumination where they had free access to water and grit.

Apparatus

One Lehigh Valley Electronics pigeon chamber was used.
The left key was constantly illuminated by a green light; the right key was constantly dark. A house light located at the top of the chamber was illuminated during each trial. A light was present in the food magazine, which was equipped with a photocell to monitor feeding behavior. A Grason-Stadler Model 9013 noise generator, set on white noise, provided a masking sound. All operations were controlled by electro-mechanical programming equipment.

Procedure

Pretraining - Groups I-IV. Following magazine training, the subjects were shaped by successive approximation to peck the key and given 50 reinforced trials. The next session involved introducing a 10-second blackout of the houselight between trials. During the final pretraining session white noise was introduced. During a trial each key-peck was reinforced; each key-peck raised the grain hopper from the lowered position and the subject was allowed 3.5 seconds access to mixed grain reinforcement. At the end of the 3.5-second interval, the hopper was dropped and the 10-second blackout began. Each session comprised 50 reinforcements.

Training - Groups II and IV: conditioning the self-reinforced response with punishment for transgression on a continuous reinforcement schedule. Each trial began with the illumination of the chamber by the houselight and with the hopper in the raised position with grain visible. If the subject first responded on the key prior to placing its head into the food magazine, it received 3.5 seconds access to the mixed grain. If the subject placed its head into the food magazine without pecking the key, thus transgressing, the grain hopper was withdrawn without allowing access to the grain and the houselight went off for 3.5 seconds. The probability of punishment for transgression was 1.00. A 10-second blackout followed each trial. Each session terminated after 50 reinforcements.
Only one response on the key was necessary for reinforcement; this is continuous reinforcement, hereafter termed FR1. Training continued until each subject's mean number of transgressions over two consecutive sessions was four or fewer.

A probable outcome of this phase is the extinction of all responding. If the subject attempted to feed immediately at the onset of the trial - a very high probability response - and if this behavior continued, the subject would not learn the dependency between the self-reinforced response and reinforcement. The problem then is one of getting the subject to respond first to the key. Mahoney and Bandura (1972) solved this problem by gradually moving hopper presentation forward temporally in the pecking sequence until the hopper was presented before the subject emitted any response; this, however, involves a degree of artistry that makes replication difficult. Caplan (1974) solved the problem by giving each subject 10 days of training in the previous phase; this, however, lengthens the total time required for the experiment. Therefore a new procedure was developed in which the hopper was covered with a piece of cardboard on alternate trials until after the self-reinforced response occurred. This procedure was employed in the first two sessions and then gradually faded out. The advantages of this procedure are that it is effective in eliminating extinction, does not require many sessions, and is readily specified.

Training - Groups I and III: conditioning the self-reinforced response with punishment for transgression on a fixed ratio 5 reinforcement schedule. Using parameters identical to those described for Groups II and IV, the subjects were trained to self-reinforce. However, rather than proceeding to criterion after fading out use of the card to prevent direct hopper approach, the work criterion was increased to 3 key-pecks per self-reinforcement.
The card was reintroduced on alternate trials for two sessions and was again faded out. The criterion then was increased to 5 key-pecks per self-reinforcement, again reintroducing the card over the hopper on alternate trials and gradually fading it out. This is a fixed ratio 5 schedule of reinforcement, hereafter termed FR5. A transgression was defined as the subject placing its head into the hopper prior to pecking the key 5 times. Training continued until each subject's mean number of transgressions over two consecutive sessions was four or fewer.

Testing - Groups III and IV. These groups were switched from a 1.00 probability of punishment for transgression during training to a 0.00 probability, i.e., free food. All other parameters remained the same as in training. Testing continued for each subject until a criterion of 0 self-reinforcements in a session was reached.

Testing - Groups I and II. The food hopper was no longer presented to these groups during testing. Each session was defined as 50 trials separated by a 10-second inter-trial interval. A trial was defined as the onset of the houselight and a 10-second period of illumination. Testing continued for each subject until a criterion of 0 key-pecks in a session was reached.

The method parameters are presented in summary form in Table 2.
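The trial rule and the training criterion described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the event labels "peck" and "hopper", and the reading of the criterion as the mean of any two consecutive sessions are assumptions made for the example, not part of the electro-mechanical programming actually used.

```python
def classify_trial(events, required_pecks):
    """Classify one training trial from its ordered events (hypothetical encoding).

    events: sequence of "peck" (key-peck) or "hopper" (head enters the magazine).
    required_pecks: 1 for FR1, 5 for FR5.
    """
    pecks = 0
    for event in events:
        if event == "peck":
            pecks += 1
        elif event == "hopper":
            # The outcome is decided at the first hopper entry.
            if pecks >= required_pecks:
                return "reinforced"     # 3.5 s access to grain
            return "transgression"      # hopper withdrawn, houselight off 3.5 s
    return "incomplete"                 # no hopper entry in the record


def met_criterion(transgressions_per_session, max_mean=4.0):
    """True once any two consecutive sessions average max_mean or fewer transgressions."""
    pairs = zip(transgressions_per_session, transgressions_per_session[1:])
    return any((a + b) / 2 <= max_mean for a, b in pairs)


# Example: an FR5 subject that approaches the hopper after only four pecks.
print(classify_trial(["peck"] * 4 + ["hopper"], required_pecks=5))
# Example: session-by-session transgression counts for one subject.
print(met_criterion([12, 9, 5, 3]))
```

The same classifier covers both schedules because only the required number of key-pecks differs between FR1 and FR5.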
Table 2
Summary of Method Parameters

                                      Group I     Group II    Group III   Group IV
Magazine training                     yes         yes         yes         yes
Shape key peck (FR1)                  yes         yes         yes         yes
Training: probability of punishment   1.00        1.00        1.00        1.00
Criterion a): key pecks to
  grain access                        5           1           5           1
Criterion b): 2 days, mean
  transgressions                      X̄ ≤ 4       X̄ ≤ 4       X̄ ≤ 4       X̄ ≤ 4
Testing                               no food     no food     free food   free food

CHAPTER IV

RESULTS

Training

It took a greater number of sessions and reinforcements to train subjects to criterion in the FR5 schedule than in the FR1 schedule. Similarly, a greater number of responses, punishments, and trials was observed in the FR5 condition than in the FR1 condition. Data including mean number of sessions, mean number of responses, mean number of reinforcements, mean number of punishments, and mean number of trials are presented in summary form in Tables 3-7. No statistical tests were performed on these data to determine group differences.
Table 3
Mean Number of Sessions to Acquisition of the Self-Reinforcing Response for Groups I-IV

                        Schedule
Testing Condition       FR1          FR5
no food                 23.75        33.00
free food               14.00        24.75

Table 4
Mean Number of Key-Peck Responses to Acquisition of the Self-Reinforcing Response for Groups I-IV

                        Schedule
Testing Condition       FR1          FR5
no food                 1,891.75     10,942.00
free food               1,401.75     7,120.50

Table 5
Mean Number of Reinforcements to Acquisition of the Self-Reinforcing Response for Groups I-IV

                        Schedule
Testing Condition       FR1          FR5
no food                 1187.50      1650.00
free food               700.00       1242.00

Table 6
Mean Number of Punishments to Acquisition of the Self-Reinforcing Response for Groups I-IV

                        Schedule
Testing Condition       FR1          FR5
no food                 299.50       490.50
free food               156.50       271.25

Table 7
Mean Number of Trials to Acquisition of the Self-Reinforcing Response for Groups I-IV

                        Schedule
Testing Condition       FR1          FR5
no food                 1487.00      2140.50
free food               856.50       1513.25

Figures 1-4 show individual learning curves for all subjects by group. Data are presented on sessions after complete removal of the card which was used to block entry to the food hopper. (Data plotted are number of transgressions by sessions and thus are extinction curves.)

Figure 1. Number of transgressions per session to acquisition of the self-reinforcing response: Group I (FR5, no food). (S11 = Group I, Subject 1; S12 = Group I, Subject 2; S13 = Group I, Subject 3; S14 = Group I, Subject 4.) [Plot not reproduced; horizontal axis: sessions; vertical axis: transgressions.]

Figure 2. Number of transgressions per session to acquisition of the self-reinforcing response: Group II (FR1, no food). (S21 = Group II, Subject 1; S22 = Group II, Subject 2; S23 = Group II, Subject 3; S24 = Group II, Subject 4.) [Plot not reproduced; horizontal axis: sessions; vertical axis: transgressions.]
Figure 3. Number of transgressions per session to acquisition of the self-reinforcing response: Group III (FR5, free food). (S31 = Group III, Subject 1; S32 = Group III, Subject 2; S33 = Group III, Subject 3; S34 = Group III, Subject 4.) [Plot not reproduced; horizontal axis: sessions; vertical axis: transgressions.]

Figure 4. Number of transgressions per session to acquisition of the self-reinforcing response: Group IV (FR1, free food). (S41 = Group IV, Subject 1; S42 = Group IV, Subject 2; S43 = Group IV, Subject 3; S44 = Group IV, Subject 4.) [Plot not reproduced; horizontal axis: sessions; vertical axis: transgressions.]

Testing

The presence of freely available food resulted in a significantly greater number of sessions and number of trials in testing when all four groups were compared; the reinforcement schedule had a nonsignificant effect. The free-food variable also increased responding in testing; the reinforcement schedule again had a nonsignificant effect on this variable. Data including mean number of sessions, mean number of responses, and mean number of trials are presented in summary form in Tables 8, 9, and 10. Analysis of variance tables on number of sessions and number of responses are presented in Tables 11 and 12.
Table 8
Mean Number of Sessions to Extinction of the Self-Reinforcing Response for Groups I-IV

                        Schedule
Testing Condition       FR1          FR5
no food                 3.75         5.50
free food               14.50        10.75

Table 9
Mean Number of Key-Peck Responses to Extinction of the Self-Reinforcing Response for Groups I-IV

                        Schedule
Testing Condition       FR1          FR5
no food                 216.50       636.50
free food               1522.75      2570.50

Table 10
Mean Number of Trials to Extinction of the Self-Reinforcing Response for Groups I-IV

                        Schedule
Testing Condition       FR1          FR5
no food                 187.50       275.00
free food               725.00       537.50

Table 11
Number of Testing Sessions to Extinction of the Self-Reinforcing Response for Groups I-IV - ANOVA

Source of Variation            Sum of Squares   DF   Mean Square   F       p
Food                           256.000          1    256.000       5.673   .033
Schedule                       4.000            1    4.000         .089    NS
Food x Schedule Interaction    30.250           1    30.250        .670    NS
Residual                       541.500          12   45.125
Total                          831.750          15   55.450

Table 12
Number of Key-Peck Responses to Extinction of the Self-Reinforcing Response for Groups I-IV - ANOVA

Source of Variation            Sum of Squares    DF   Mean Square       F        p
Food                           10,499,220.063    1    10,499,220.063    10.742   .007
Schedule                       2,154,290.063     1    2,154,290.063     2.204    .161
Food x Schedule Interaction    394,070.750       1    394,070.750       .403     NS
Residual                       11,729,019.750    12   977,418.312
Total                          24,776,599.937    15   1,651,773.329

It was possible to compare the effect of reinforcement schedule on number of reinforcements and number of transgressions in testing for the groups in the free-food condition (Groups III and IV). The reinforcement schedule had no significant effect on these variables. Data including mean number of reinforcements, mean number of transgressions, and the statistical probabilities of t-test differences of these values are presented in Table 13.
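As a cross-check on Table 11, the sums of squares for the two main effects and the interaction can be recovered from the cell means in Table 8, given four subjects per group. The sketch below assumes a balanced 2 x 2 design; the function name and dictionary layout are illustrative only, and the residual sum of squares (541.500 on 12 degrees of freedom) is taken from Table 11 because it cannot be derived from cell means alone.

```python
from statistics import mean


def effect_sums_of_squares(cell_means, n_per_cell):
    """Main-effect and interaction sums of squares for a balanced two-factor design.

    cell_means maps (row_level, column_level) to the mean of that cell.
    """
    rows = sorted({r for r, _ in cell_means})
    cols = sorted({c for _, c in cell_means})
    grand = mean(cell_means.values())
    row_mean = {r: mean(cell_means[(r, c)] for c in cols) for r in rows}
    col_mean = {c: mean(cell_means[(r, c)] for r in rows) for c in cols}
    ss_rows = n_per_cell * len(cols) * sum((row_mean[r] - grand) ** 2 for r in rows)
    ss_cols = n_per_cell * len(rows) * sum((col_mean[c] - grand) ** 2 for c in cols)
    ss_inter = n_per_cell * sum(
        (cell_means[(r, c)] - row_mean[r] - col_mean[c] + grand) ** 2
        for r in rows
        for c in cols
    )
    return ss_rows, ss_cols, ss_inter


# Cell means from Table 8 (sessions to extinction), four subjects per cell.
table8 = {
    ("no food", "FR1"): 3.75,
    ("no food", "FR5"): 5.50,
    ("free food", "FR1"): 14.50,
    ("free food", "FR5"): 10.75,
}
ss_food, ss_schedule, ss_interaction = effect_sums_of_squares(table8, n_per_cell=4)

# Each effect has 1 degree of freedom, so its mean square equals its sum of squares.
ms_residual = 541.500 / 12  # residual values as reported in Table 11
f_ratios = [round(ss / ms_residual, 3) for ss in (ss_food, ss_schedule, ss_interaction)]

print(ss_food, ss_schedule, ss_interaction)
print(f_ratios)
```

Under these assumptions the computed sums of squares (256.0, 4.0, 30.25) and F ratios (5.673, .089, .670) agree with the entries in Table 11.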
Table 13
Mean Number of Reinforcements and Mean Number of Transgressions to Extinction of the Self-Reinforcing Response for Groups III and IV

                                  Group III    Group IV    t      p
Mean number of reinforcements     298.75       302.25      .03    .976
Mean number of transgressions     238.75       422.75      .78    .466

Figures 5-8 show individual learning curves for all subjects by group. Data plotted are number of responses and number of reinforcements by sessions and thus are extinction curves.

Figure 5. Number of key-peck responses per session to extinction of the key-peck response: Group I (FR5, no food). (S11 = Group I, Subject 1; S12 = Group I, Subject 2; S13 = Group I, Subject 3; S14 = Group I, Subject 4.) [Plot not reproduced; horizontal axis: sessions; vertical axis: responses.]

Figure 6. Number of key-peck responses per session to extinction of the key-peck response: Group II (FR1, no food). (S21 = Group II, Subject 1; S22 = Group II, Subject 2; S23 = Group II, Subject 3; S24 = Group II, Subject 4.) [Plot not reproduced; horizontal axis: sessions; vertical axis: responses.]

Figure 7. Number of key-peck responses per session and number of self-reinforcements per session to extinction of the self-reinforcing response: Group III (FR5, free food). (S31 = Group III, Subject 1; S32 = Group III, Subject 2; S33 = Group III, Subject 3; S34 = Group III, Subject 4.) [Plot not reproduced; horizontal axis: sessions; vertical axis: responses and self-reinforcements.]

Figure 8. Number of key-peck responses per session and number of self-reinforcements per session to extinction of the self-reinforcing response: Group IV (FR1, free food). (S41 = Group IV, Subject 1; S42 = Group IV, Subject 2; S43 = Group IV, Subject 3; S44 = Group IV, Subject 4.) [Plot not reproduced; horizontal axis: sessions; vertical axis: responses and self-reinforcements.]
Relationships between Training and Testing Variables

None of the relationships between training and testing variables for Groups III and IV reached the α ≤ .05 level of significance; in only two cases were the relationships significant at the α ≤ .10 level. For Groups I and II one such relationship was significant at the α ≤ .05 level; increasing the level to .10 did not include any other relationships. Thus, there appeared to be little relationship between training and testing variables.

Data including number of sessions, number of responses, number of reinforcements, number of punishments, and number of trials during the training phase were correlated with number of sessions, number of responses, number of reinforcements, and number of transgressions during the testing phase for Groups III and IV and are presented in Table 14. Data including number of sessions, number of responses, number of reinforcements, number of punishments, and number of trials during the training phase were correlated with number of sessions and number of responses during the testing phase for Groups I and II and are presented in Table 15.

Table 14
Pearson Product-Moment Coefficients of Correlation Showing Relationships between Training and Testing Variables for Groups III and IV
Training              Testing Variables
Variables             Sessions     Responses    Reinforcements   Transgressions
Sessions              r = -.43     r = .00      r = -.19         r = -.50
Responses             r = -.17     r = .40      r = .04          r = -.25
Reinforcements        r = -.42     r = .00      r = .19          r = -.50
Punishments           r = -.47     r = -.16     r = -.28         r = -.52*
Trials                r = -.44     r = -.03     r = -.21         r = -.51*

* p < .10

Table 15
Pearson Product-Moment Coefficients of Correlation Showing Relationships between Training and Testing Variables for Groups I and II

Training              Testing Variables
Variables             Sessions     Responses
Sessions              r = -.05     r = .38*
Responses             r = .45      r = .71
Reinforcements        r = -.05     r = .38
Punishments           r = -.18     r = .49
Trials                r = -.09     r = .42

* p < .025

Summary

The results may be summarized as follows:

1. Training in the FR5 schedule took somewhat longer than in the FR1 schedule.
2. The free food testing condition increased number of responses, number of sessions, and number of trials to extinction over the no food testing condition.
3. No effect due to schedule was found for any of the above variables.
4. No effect due to schedule was found for number of reinforcements in testing in the free food condition.
5. No effect due to schedule was found for number of transgressions in testing in the free food condition.
6. No interactions were found.
7. No relationship between training variables and testing variables was found.

The major finding of this research, as indicated by statements 3, 4, and 5 above, is the absence of any effect due to reinforcement schedule. The hypothesis that a partial reinforcement effect would be found was not supported.

CHAPTER V

DISCUSSION

In discussing self-reinforcement, both Skinner (1953) and Catania (1975) have noted the questionable status of the self-reinforcing response; both suggest that the question to be answered is whether the consequence strengthens the preceding behavior.
In the present study Bandura's (1971) two requirements for self-reinforcement, that the reinforcers are freely available and that the organism has access to the reinforcers independently of responding, are met; the results suggest that the self-administered consequence does not strengthen the preceding behavior.

The data presented in Figures 7 and 8, showing self-reinforcement in testing, resemble extinction curves. If the key-peck response were being strengthened by grain access, it would show an increase from the 92% training criterion rather than a decrease in occurrence. Further, pecking should then be maintained at approximately 100% self-reinforcement. The extinction of the key-peck response precludes an interpretation of self-reinforcement as increasing response strength, i.e., as reinforcing. The response which is strengthened in testing is that of transgressing.

The results of the present study shed some light on the issue of automaticity of reinforcement. The assumption of automaticity, usually only implied, asserts that any response followed by a reinforcing event will be automatically strengthened. The issue seems to be one of contingency versus temporal contiguity; the results obtained support Harrison and Schaeffer's (1975) statement that temporal contiguity is not a sufficient condition for reinforcement.

The results of the present study also bear on the various free food studies which have been reported (e.g., Neuringer, 1969). Neuringer stated that responding for food appears to be a "natural part of the behavior of animals". Like Caplan's (1974) data, the results of this study do not support this conclusion. Neuringer's results are apparently limited to a particular set of conditions; the key-peck may need to produce food or an environmental stimulus change in order to be maintained.
What does seem to maintain the self-reinforcing response is contingent punishment; according to the research of Bandura and Mahoney (1974) and Caplan (1974) this punishment must occur with a probability greater than .50. That intermittent punishment of less than .50 probability is ineffective in maintaining the self-reinforcing response may be accounted for by the natural eliciting value of the freely available reinforcer.

During pretraining, the grain elicited an eating response. Via a "backchaining" process (Denny, 1970) approach to the key was elicited. During training both the magazine and the key were present at the start of a trial, generating two competing approach responses. However, approach to the magazine was punished whereas approach to the key resulted in grain access, and this set of contingencies increased the eliciting value of the key over that of the magazine. During testing, contingent punishment was removed; approach to the magazine or the key resulted in grain access. In the absence of contingent punishment the eliciting values of the key and of the magazine returned to their original levels, as did the probabilities of the responses they elicited: the key became a less effective elicitor and the key-peck extinguished. In other words, the contingency rather than the contiguity seems to be the critical variable in mediating the acquisition and maintenance of the key-peck response.

The elicitation analysis (Denny, 1971; Caplan, 1974) accounts adequately for the results of the present study, whereas traditional reinforcement theory does not. A complementary analysis, with specific reference to the self-reinforcement phenomenon, has since been presented by Catania (1975) and will be reviewed here.

In comparing reinforcement with self-reinforcement, reference to Figure 9 facilitates understanding.
In A, the phenomena appear different because food is continuously present in self-reinforcement but is present only after pecks in basic reinforcement. In B, the difference vanishes when food is replaced by the opportunity to eat, which occurs only after a peck in both situations. In C, the analysis preferred by Catania, reinforcement and self-reinforcement again appear different: feeder approach comes under the control of food presentation in reinforcement but is under the control of the subject's own prior behavior, i.e., whether or not pecking has occurred, in self-reinforcement. Thus, the essence of the so-called self-reinforcement paradigm is not that the grain itself or the opportunity to eat maintains the key-peck, but that the key-peck provides a discriminative stimulus for eating. This might, Catania suggests, more accurately be called self-discrimination or self-awareness.

Figure 9. Comparison of reinforcement and self-reinforcement (from Catania, 1975). [Diagram not reproduced; two columns, REINFORCEMENT and "SELF-REINFORCEMENT". Panel A: pecks, food, eating. Panel B: pecks, opportunity to eat, eating. Panel C: pecks, food, feeder approach, eating.]

Implications for Further Research

The results of one study with a relatively small sample size can best serve as a stimulus to further research. While there is always the need for replication, further experimentation along two distinct lines is suggested by this study.

The first line of extension involves the manipulation of various stimulus and response variables. First, various schedules of reinforcement should be compared. A comparison of key-pecking and treadle-pressing as self-reinforced responses would allow separating the similarities between the key-peck and grain consumption from the variable of interest, the self-reinforcing response.
Seligman's (1975) theory of learned helplessness, and his contributions in the area of experimental design, suggest the investigation of the effects of prior response-independent and response-dependent food on self-reinforcement. The use of electric shock instead of or in addition to food withdrawal should provide data on the effects of punishment during acquisition of the self-reinforcing response. Finally, a group design investigation of preference for self-reinforcement or external reinforcement employing a concurrent choice procedure is indicated.

The second line of extension involves the exploration of which species will perform which responses in order to obtain which reinforcers. To date, all studies involving non-human subjects have used some form of food as the reinforcer. None of the studies has involved responding for a secondary reinforcer. Investigations should be extended up the phylogenetic scale towards man; while some efforts have been made in this direction (e.g., Bandura and Mahoney, 1974; Mahoney, Bandura, Dirks, and Wright, 1974), a more systematic approach is needed, as cross-species generalizations must be made guardedly.

It should be noted that research in the area of self-control is now at a point at which comparative studies of the behavior of groups of subjects are indicated. Whereas previous studies reporting the behavior of two or three subjects stimulated interest and further investigation, such studies now serve to confound and mask measures of interest. For example, given the great variability observed within groups in this study, it would be possible to choose two subjects whose data would support the assertion that self-administered consequences do have reinforcing effects or to choose two subjects whose data would contradict that assertion.
While N of 1 research methodology suggests the use of the ABAB design, this design has not been extensively employed in investigating self-control phenomena, perhaps because it prolongs the time necessary to complete the research. However, the expenditure of time and effort is prerequisite to meaningful research in this area. The simple observation that a phenomenon persists over a period of time, as in the Mahoney-Bandura (1972) study, or the use of designs that do not adequately control confounding variables, as in the Bandura-Mahoney (1974) study, which did not control history, and the Mahoney, Bandura, Dirks, and Wright (1974) study, which did not provide for meaningful comparisons, is no longer tolerable in research in this area.

Human Implications

While the results of the present study indicate that for pigeons the key-peck response was not reinforced by noncontingent reward, the same might not be true for humans: while the key-peck response was reinforced by the specific stimulus of mixed grain, the responses of the human subject are likely to be followed by the generalized reinforcer of social approval in addition to any specific reinforcing stimulus. However, it may be the self-discrimination which Catania hypothesizes, in combination with this generalized social reinforcement, which maintains the self-reinforced response, with the specific reinforcing stimulus being an extraneous, noncontributory variable.

Mahoney and Thoresen (1974) have pointed out that a broad parallel exists between principles governing infrahuman behavior and those governing human behavior. Thus, the same basic phenomena should be operative: in this instance the effects of temporally contiguous consequences should be similar for both pigeons and humans. Following Catania's (1975) laboratory analysis and generalization to human behavior, the learning of the self-discrimination is the crucial factor in accounting for the self-reinforcement phenomenon.
This, then, would logically be the focus of counseling or psychotherapy. Catania concludes with the proposition that self-discrimination is more likely to be taught effectively if those who teach it recognize it for what it is. Both the self-monitoring (SM) and self-evaluation (SE) components of Kanfer's (1971) model of self-regulation would constitute self-discrimination. There is an adequate laboratory basis to substantiate both the existence of these behaviors and the learning conditions relevant to teaching them efficiently. It remains for counselors and psychotherapists to apply the technology effectively.

A fairly common counseling concern which lends itself to such an analysis in either laboratory or field research is the development of effective study skills. A two-group design could compare the effectiveness of a fairly typical self-reinforcement strategy with a strategy teaching the self-monitoring of study behaviors and the self-discrimination of the performance of such behaviors. Counseling interviews should be tape recorded in order to analyze and control such extraneous variables as external, i.e., counselor, reinforcement. The results of this study would indicate that there should be no differences between the two treatment packages.

While it is suggested that research concentrate on fairly discrete behaviors at this time, it is by no means implied that self-control strategies are limited to the modification of such behaviors. This suggestion is made to ensure that research efforts may yield relatively clear data. In counseling/psychotherapeutic practice, much more complex stimulus situations and behaviors may be considered within this self-control paradigm.

In conclusion, the present study suggests that the self-reinforcing response does not behave as if it were a typical operant. A freely available reinforcer following a response does not strengthen the response.
Temporal contiguity between a response and a reinforcer is not a sufficient condition to maintain a response; a contingency may be required. This casts doubt on the automaticity assumption. Generalizations to the human situation must be made somewhat guardedly, although basic processes are seen as constant across species. Further experimentation, particularly in terms of applying Catania's model to the counseling/psychotherapy situation, is indicated.

REFERENCES

Alferink, L.A., Crossman, K., and Cheney, C.D. Control of responding by a conditioned reinforcer in the presence of free food. Animal Learning and Behavior, 1973, 1, 38-40.

Bandura, A. Vicarious and self-reinforcement processes. In: R. Glaser (Ed.), The nature of reinforcement. New York: Academic Press, 1971.

Bandura, A. and Mahoney, M.J. Maintenance and transfer of self-reinforcement functions. Behavior Research and Therapy, 1974, 12, 89-97.

Bolstad, O.D. and Johnson, S.M. Self-regulation in the modification of disruptive classroom behavior. Journal of Applied Behavior Analysis, 1972, 5, 443-454.

Caplan, H.J. Determinants of "pre-moral" development in pigeons. Unpublished doctoral dissertation, Michigan State University, 1974.

Carder, B. Rats' preference for earned in comparison with free liquid reinforcers. Psychonomic Science, 1972, 26, 25-26.

Catania, A.C. The myth of self-reinforcement. Behaviorism, 1975, 3(2), 192-199.

Denny, M.R. and Ratner, S.C. Comparative psychology: research in animal behavior. Homewood, Illinois: Dorsey Press, 1970.

Denny, M.R. A theory of experimental extinction and its relation to a general theory. In: H.H. Kendler and J.T. Spence (Eds.), Essays in neobehaviorism: a memorial volume to Kenneth W. Spence. New York: Appleton-Century-Crofts, 1971.

Elson, S.E. Determinants of self-reinforcement in pigeons. Unpublished manuscript, 1973 (available from author).

Gewirtz, J.L.
The roles of covert responding and extrinsic reinforcement in "self"- and "vicarious reinforcement" phenomena and in "observational learning" and imitation. In: R. Glaser (Ed.), The nature of reinforcement. New York: Academic Press, 1971.

Goldiamond, I. Self-control procedures in personal behavior problems. Psychological Reports, 1965, 17, 851-868.

Hannum, J.W., Thoresen, C.E., and Hubbard, D.R. A behavioral study of self-esteem with elementary teachers. In: M.J. Mahoney and C.E. Thoresen (Eds.), Self-control: power to the person. Monterey, California: Brooks/Cole Publishing Co., 1974.

Harrison, R.G. and Schaeffer, R.W. Temporal contiguity: is it a sufficient condition for reinforcement? Bulletin of the Psychonomic Society, 1975, 5(5), 230-232.

Homme, L.E. Perspectives in psychology: XXIV. Control of coverants, the operants of the mind. Psychological Record, 1965, 15, 501-511.

Hothersall, D., Huey, D., and Thatcher, K. The preference of rats for free or response-produced food. Animal Learning and Behavior, 1973, 1(4), 241-243.

Jeffrey, D.B. Self-control: methodological issues and research trends. In: M.J. Mahoney and C.E. Thoresen (Eds.), Self-control: power to the person. Monterey, California: Brooks/Cole Publishing Co., 1974.

Jensen, G.D. Preference for bar pressing over "freeloading" as a function of number of rewarded presses. Journal of Experimental Psychology, 1965, 65, 451-454.

Kanfer, F.H. The maintenance of behavior by self-generated stimuli and reinforcement. In: A. Jacobs and L.B. Sachs (Eds.), The psychology of private events. New York: Academic Press, 1971.

Kanfer, F.H. and Karoly, P. Self-control: a behavioristic excursion into the lion's den. Behavior Therapy, 1972, 3, 398-416.

Kazdin, A.E. Self-monitoring and behavior change. In: M.J. Mahoney and C.E. Thoresen (Eds.), Self-control: power to the person. Monterey, California: Brooks/Cole Publishing Co., 1974.

Knutson, J.F. and Carlson, C.W.
Operant responding with free access to the reinforcer: a replication and extension. Animal Learning and Behavior, 1973, 1(2), 133-136.

Mahoney, M.J. Research issues in self-management. Behavior Therapy, 1972, 3, 45-63.

Mahoney, M.J. and Bandura, A. Self-reinforcement in pigeons. Learning and Motivation, 1972, 3, 293-303.

Mahoney, M.J., Bandura, A., Dirks, S.J., and Wright, C.L. Relative preference for external and self-controlled reinforcement in monkeys. Behavior Research and Therapy, 1974, 12, 157-163.

Mahoney, M.J. and Thoresen, C.E. (Eds.), Self-control: power to the person. Monterey, California: Brooks/Cole Publishing Co., 1974.

Mowrer, O.H. and Ullman, A.D. Time as a determinant in integrative learning. Psychological Review, 1945, 52, 61-90.

Neuringer, A.J. Animals respond for food in the presence of free food. Science, 1969, 166, 399-401.

Paul, G.L. Behavior modification research: design and tactics. In: C.M. Franks (Ed.), Behavior therapy: appraisal and status. New York: McGraw Hill, 1969.

Poulos, C.X. and Gormezano, I. Effects of partial and continuous reinforcement on acquisition and extinction in classical appetitive conditioning. Bulletin of the Psychonomic Society, 1974, 1(3), 197-198.

Powell, R.W. Comparative studies of the preference for free vs response-produced reinforcers. Animal Learning and Behavior, 1974, 2(3), 185-188.

Premack, D. and Anglin, B. On the possibilities of self-control in men and animals. Journal of Abnormal Psychology, 1973, 81, 137-151.

Rachlin, H. Introduction to modern behaviorism. San Francisco: Freeman, 1970.

Reese, E.P. Experiments in operant behavior. New York: Appleton-Century-Crofts, 1962.

Sawisch, L.P. and Denny, M.R. Reversing the reinforcement contingencies of eating and keypecking behaviors. Animal Learning and Behavior, 1973, 1(3), 189-192.

Seligman, M.E.P. Helplessness: on depression, development, and death. San Francisco: W.H. Freeman & Co., 1975.

Singh, D.
Preference for bar pressing to obtain reward over freeloading in rats and children. Journal of Comparative and Physiological Psychology, 1970, 73, 320-327.

Skinner, B.F. Science and human behavior. New York: Macmillan, 1953.

Stuart, R.B. Situational versus self-control. In: R.D. Rubin, H. Fensterheim, J.D. Henderson, and L.F. Ullman (Eds.), Advances in behavior therapy. New York: Academic Press, 1972.

Tarte, R.D. Extinction of rats' barpressing in the presence of free food. Animal Learning and Behavior, 1974, 2(4), 289-292.

Young, A.G. and Costelloe, C.A. Resistance to extinction as a function of partial reinforcement and external stimuli: a within-S design. Bulletin of the Psychonomic Society, 1974, 3(3A), 191-192.