
The Behavior Analyst 2003, 26, 1-14 No. 1 (Spring)

Negative Effects of Positive Reinforcement

Michael Perone

West Virginia University

Procedures classified as positive reinforcement are generally regarded as more desirable than those classified as aversive (those that involve negative reinforcement or punishment). This is a crude test of the desirability of a procedure to change or maintain behavior. The problems can be identified on the basis of theory, experimental analysis, and consideration of practical cases. Theoretically, the distinction between positive and negative reinforcement has proven difficult (some would say the distinction is untenable). When the distinction is made purely in operational terms, experiments reveal that positive reinforcement has aversive functions. On a practical level, positive reinforcement can lead to deleterious effects, and it is implicated in a range of personal and societal problems. These issues challenge us to identify other criteria for judging behavioral procedures.

Key words: negative reinforcement, punishment, positive reinforcement, aversive control

The purpose of this article is to cause you to worry about the broad endorsement of positive reinforcement that can be found throughout the literature of behavior analysis. I hope to accomplish this by raising some questions about the nature of positive reinforcement. At issue is whether it is free of the negative effects commonly attributed to the methods of behavioral control known as "aversive."

The topic of aversive control makes many people uncomfortable, and relatively few people study it (Baron, 1991; Crosbie, 1998). Ferster (1967) expressed the common view when he wrote, "It has been clear for some time that many of the ills of human behavior have come from aversive control" (p. 341).

I believe that much of what has been said about aversive control is mistaken, or at least misleading. Aversive control, in and of itself, is not necessarily bad; sometimes it is good. And, more to the point, the alternative, positive reinforcement, is not necessarily good; sometimes it is bad. Aversive control is an inherent part of our world, an inevitable feature of behavioral control, in both natural contingencies and contrived ones. When I say that aversive control is inevitable, I mean just that: Even the procedures that we regard as prototypes of positive reinforcement have elements of negative reinforcement or punishment embedded within them.

This paper is based on the presidential address delivered to the Association for Behavior Analysis convention in Toronto, Ontario, May 2002.

Correspondence should be sent to the author at the Department of Psychology, West Virginia University, P.O. Box 6040, Morgantown, West Virginia 26506-6040 (e-mail: Michael.Perone@mail.wvu.edu).

DEFINING FEATURES OF AVERSIVE CONTROL

It is important to be clear about the narrow meaning of aversive control in scientific discourse. A stimulus is aversive if its contingent removal, prevention, or postponement maintains behavior (that constitutes negative reinforcement) or if its contingent presentation suppresses behavior (punishment). That is all there is to it. There is no mention in these definitions of pain, fear, anxiety, or distress, nor should there be. It is easy to cite instances of effective aversive control in which such negative reactions are absent. Aversive control is responsible for the fact that we button our coats when the temperature drops and loosen our ties when it rises. It leads us to come in out of the rain, to blow on our hot coffee before we drink it, and to keep our fingers out of electrical outlets. The presence of aversive control in these cases clearly works to the individual's advantage.


Figure 1. Scripture's arrangement for studying the effects of the rate of change in the intensity of a stimulus. Heat from a flame (D) was transferred via a ball (C) and rod (B) connected to a beaker of water (A) with a live frog inside. If the water was heated slowly enough, it could be brought to a boil without inducing the frog to escape. Scripture's caption read, "Boiling a frog without his knowing it." (Figure 70 from Scripture, 1895)

By the same token, it is easy to cite cases in which the absence of aversive control is to the individual's disadvantage. Dramatic demonstrations are possible in laboratory settings. Figure 1 illustrates an experiment reported over 100 years ago by E. W. Scripture, director of the first psychological laboratory at Yale. A frog was placed in a beaker of water, which was then heated at a rate of 0.002°C per second. Scripture (1895) reported that "the frog never moved and at the end of two and one half hours was found dead. He had evidently been boiled without noticing it" (p. 120). It was not Scripture's intention to kill the frog; his goal was to study rate of stimulus change in sensory processes. We also should forgive Scripture for his unwarranted inference about the frog's awareness. For our purposes, it is enough to acknowledge that this particular environmental arrangement clearly was not in the frog's long-term best interest. Because the gradual rise in water temperature did not establish a change of scenery as a negative reinforcer, no escape behavior was generated. There just wasn't enough reinforcement (negative reinforcement) to control adaptive behavior.

It should be remembered that the definition of an aversive stimulus (or, for that matter, a positive reinforcer) is based on function, not structure. Aversiveness is not an inherent property of a stimulus. It depends critically on the environmental context of the stimulus, and it cannot be measured apart from the effect of the stimulus on behavior. Consider electric shock, a stimulus so closely associated with the analysis of aversive control that we tend to think of it as inherently aversive. The error is understandable, but it is still an error. Illustrative data come from an experiment by de Souza, de Moraes, and Todorov (1984). These investigators studied rats responding on a signaled shock-postponement schedule. The independent variable was the intensity of the shock, which was varied across a wide range of values in a mixed order. Responding was stabilized at each intensity value. Figure 2 shows the results for 5 individual rats, along with the group average. When the intensity was below about 1 mA, the rats did not respond much, and as a result they avoided only a small percentage of the shocks. At intensities above 1 mA, however, the rats were more successful, avoiding between 80% and 100% of the shocks. By this measure, then, shocks below 1 mA are not aversive.

Now consider a study by Sizemore and Maxwell (1985), who used electric shock to study not avoidance, but punishment. In baseline conditions, rats' responding was maintained by variable-interval (VI) 40-s schedules of food reinforcement. In experimental conditions, some responses also produced an electric shock. Sizemore and Maxwell found that shocks as low as 0.3 to 0.4 mA completely, or almost completely, suppressed responding. The shaded bars in Figure 2 show where these values fall in relation to de Souza et al.'s (1984) avoidance functions.

Figure 2. Proficiency of shock avoidance as a function of the intensity of the shock, as reported by de Souza et al. (1984). Note that shock intensity is represented on a logarithmic scale. Reliable avoidance required an intensity of at least 1 mA in this signaled-shock procedure. The shaded bars designate the range of shock intensities that were successful in suppressing rats' food-maintained responding in a punishment procedure by Sizemore and Maxwell (1985). Shocks as low as 0.3 mA were effective punishers.

Even though a shock of 0.3 to 0.4 mA was effective as a punisher, such a shock did not reliably sustain avoidance. Put another way, in a punishment paradigm a shock intensity of 0.3 mA is aversive, but in an avoidance paradigm it is not aversive. The aversiveness of a stimulus cannot be separated from environmental contingencies. As Morse and Kelleher (1977) observed 25 years ago, in both punishment and reinforcement, contingencies, not stimuli, are fundamental.

The fact that aversiveness is a matter of function, not structure, is often overlooked. If you have young children in enlightened preschool settings, you may have heard teachers say, "We don't use punishment here. We use time-out." But if the time-out is contingent on some behavior and if it effectively reduces that behavior, then it is punishment. And, by definition, the more effective it is, the more aversive it is. But schoolteachers are not the only ones to forget this; even behavior analysts writing for professional audiences can be found to slip up. For example, an article advocating the use of time-out in parental discipline suggested that "through use of time-out, parents learn that punishment need not be aversive or painful."¹ The authors correctly classified time-out as a form of punishment, but erred by suggesting that it is not aversive. If time-out is not aversive, it could not possibly function as a punisher.

The verbal behavior of these authors may be under control of some dimension of the stimulus events besides their aversiveness; perhaps some events are mistakenly described as nonaversive because they are aesthetically inoffensive, or because they do not leave welts or bruises. Granted, it may well be that some forms of aversive control should be preferred over others. Teachers and parents might be right to prefer time-out over spanking. But the justification for the preference cannot be that one is aversive and the other is not.

POSITIVE REINFORCEMENT AS AN ALTERNATIVE TO AVERSIVE CONTROL

Some commentators have serious reservations about any form of aversive control. In Coercion and Its Fallout, Sidman (1989) worried about the negative side effects of aversive control. "People who use punishment become conditioned punishers themselves. ... Others will fear, hate, and avoid them. Anyone who uses shock becomes a shock" (p. 79).

¹ Although this quote does come from a published source in general circulation, the source is not identified because no purpose would be served other than perhaps to embarrass the authors. Equivalent errors are easy to find in the literature, and it would be unfair to single out any particular error for special attention.

This is a powerful indictment of punishment. But Sidman was concerned with aversive control more broadly, and he extended his treatment to negative reinforcement. According to Sidman (1989), punishment and negative reinforcement constitute "coercion." Control by positive reinforcement is given dispensation.

The problem is that the distinction between positive and negative reinforcement is often unclear, even in laboratory procedures. Michael (1975) suggested that the distinction be abandoned altogether, not only in scientific discourse but also as a rough and ready guide to humane practice. A portion of Michael's essay is especially relevant:

[It might be argued] that by maintaining this distinction we can more effectively warn behavior controllers against the use of an undesirable technique. "Use positive rather than negative reinforcement." But if the distinction is quite difficult to make in many cases of human behavior the warning will not be easy to follow; and it is an empirical question at the present time whether such a warning is reasonable, a question which many feel has not been answered. (pp. 41-42)

To illustrate the empirical difficulties in distinguishing between positive and negative reinforcement, consider a pair of experiments conducted by Baron, Williams, and Posner (unpublished data). They studied the responding of rats on progressive-ratio schedules in which the required number of responses increases, with each reinforcer, over the course of the session. The effectiveness of the reinforcer is gauged by the terminal ratio: the highest ratio the animal will complete before responding ceases. The left panel of Figure 3 shows results from 3 rats whose responding produced a signaled time-out from an avoidance schedule that operated on another lever. As the duration of the time-out was raised, the terminal ratio increased, showing that longer time-outs are more effective negative reinforcers. The right panel shows results from rats working for sweetened condensed milk mixed with water.
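For readers unfamiliar with the procedure, the logic of a progressive-ratio session can be sketched in a few lines of code. This is a toy model only: the arithmetic step rule, the session cap, and the decision rule that a ratio is completed whenever its response cost does not exceed the reinforcer's value are illustrative assumptions, not Baron, Williams, and Posner's actual parameters.

```python
def progressive_ratio_session(reinforcer_value, step=5, max_ratios=100):
    """Return the terminal ratio: the highest ratio completed before
    responding ceases.

    The response requirement starts at `step` and grows by `step` after
    each reinforcer. As a toy stand-in for the animal, a ratio is
    completed only while its cost does not exceed `reinforcer_value`.
    """
    terminal = 0
    for i in range(1, max_ratios + 1):
        requirement = step * i
        if requirement > reinforcer_value:  # responding ceases here
            break
        terminal = requirement              # ratio completed; step up
    return terminal

# Larger reinforcer magnitudes sustain higher terminal ratios, which is
# the functional relation plotted in Figure 3.
assert progressive_ratio_session(12) == 10
assert progressive_ratio_session(52) == 50
```

Under this toy rule the terminal ratio tracks reinforcer magnitude regardless of whether the "value" comes from milk concentration or time-out duration, which is the point of the comparison in Figure 3.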

Figure 3. Effectiveness of negative and positive reinforcers as shown by the highest ratio completed by rats on a progressive-ratio schedule. Left: Responding produced a signaled time-out from an avoidance schedule; reinforcer magnitude was manipulated by changing the duration of the time-out. Right: Responding produced a solution of sweetened condensed milk in water (0.01-ml or 0.05-ml cups); magnitude was manipulated by changing the concentration of the milk. (Unpublished data collected at the University of Wisconsin-Milwaukee by Baron, Williams, and Posner)

Raising the concentration of the milk increased the terminal ratio in much the same way as raising the duration of the time-out from avoidance. The contingencies in these two experiments (presenting milk or removing an avoidance schedule) can be distinguished as positive and negative. But the functional relation between responding and reinforcer magnitude appears to be the same.

Despite Michael's (1975) arguments, few behavior analysts seem willing to give up the distinction between positive and negative reinforcement (for rejoinders to Michael, see Hineline, 1984, pp. 496-497; Perone & Galizio, 1987, p. 112). Still, it should be recognized that at least in some instances the differences can be subtle, with the consequence that it may be difficult to identify coercive practices on this basis.

Nevertheless, for Sidman, positive reinforcement can free society of the misery engendered by our current reliance on negative reinforcement and punishment. As he says, "a person who is largely sustained by positive reinforcement, frequently producing 'good things,' will feel quite differently about life than will a person who comes into contact most often with negative reinforcement, frequently having to escape from or prevent 'bad things'" (Sidman, 1989, p. 37).

In my view, Sidman's endorsement of positive reinforcement is too broad. There are plenty of bad things to say about positive reinforcement. In fact, many of them were said by Skinner himself, despite his position as the foremost advocate of positive reinforcement in the service of humankind. As Skinner (1971) observed in Beyond Freedom and Dignity, behavior generated by positive reinforcement may have aversive consequences that occur after a delay. These aversive consequences are difficult to deal with effectively, said Skinner, "because they do not occur at a time when escape or attack is feasible, when, for example, the controller can be identified or is within reach. But the immediate [italics added] reinforcement is positive and goes unchallenged" (p. 35).

Skinner summarized the problem and its solution this way: "A problem arises ... when the behavior generated by positive reinforcement has deferred aversive consequences. The problem to be solved by those concerned with freedom is to create immediate aversive consequences" (1971, p. 33). There is irony here: Not only is positive reinforcement seen as bad, but the antidote is to override the positive with some form of aversive control.

You may wonder if I have quoted Skinner out of context. I do not think so. He repeats the general point many times. In his autobiography, for example, Skinner (1983) comments with dismay that some activities are so reinforcing that they exhaust him. He worries about having enough energy to do the things that are really important (even if they might not be as reinforcing in the short run). Veterans of professional conferences, with their abundant opportunities for well-lubricated, late-night social interaction, will appreciate the phrase Skinner turns here: "Fatigue is a ridiculous hangover from too much reinforcement" (p. 79). To prevent this deleterious side effect of positive reinforcement, Skinner laid down draconian rules prohibiting himself from engaging in the reinforced activities: "Exhausting avocations are a danger. No more chess. No more bridge problems. No more detective stories" (p. 79).

In Beyond Freedom and Dignity, Skinner (1971) alerted us to the dangers that may accompany positive reinforcement. Positive contingencies can be dangerous specifically because they do not generate avoidance, escape, or their emotional counterparts, even when the contingencies are ultimately detrimental. Skinner went on to identify examples of detrimental contingencies, both contrived and natural, that may be accepted and even defended by the people who are controlled by them; in other words, by people who are exploited by governments, employers, gambling casinos, pornographers, drug dealers, and pimps. To take one example, gambling may prevent other behavior that would be more beneficial in the long run, but countercontrol in the form of avoidance, escape, or legal prohibition tends to be weak and ineffective, simply because behavior is more susceptible to control by short-term gains than long-term losses. Thus, the very people who can least afford lottery tickets may be the first to object to proposals to ban the lottery.

Other examples may seem more mundane, but they are just as socially significant. Positive reinforcement is implicated in eating junk food instead of a balanced meal, watching television instead of exercising, buying instead of saving, playing instead of working, and working instead of spending time with one's family. Positive reinforcement underlies our propensity towards heart disease, cancer, and other diseases that are related more to maladaptive lifestyles than to purely physiological or anatomical weaknesses.

CAN AVERSIVE CONTROL BE AVOIDED?

I hope you are beginning to share my concerns. If so, you might be thinking something along these lines: If contingencies of positive reinforcement can be so bad, is it possible to avoid aversive control? The answer is "no."

Aversive control is inevitable because every positive contingency can be construed in negative terms. The point can be made many ways. As Baum (1974) has noted, reinforcement can be understood as a transition from one situation to another. The transition involved in positive reinforcement presumably represents an improvement. But the production of improved conditions may also be regarded as an escape from relatively aversive conditions. Thus, we may say that the rat presses a lever because such behavior produces food (a positive reinforcer) or because it reduces food deprivation (a negative reinforcer).

The issue may be one of perspective.

But outside the laboratory, I cannot help but be impressed with the propensity of people to respond to the negative side of positive contingencies. Consider college students. In my large undergraduate courses I have tried a variety of contingencies to encourage class attendance. Early on, I simply scored attendance and gave the score a weighting of 10% of the course grade. There were lots of complaints. The students clearly saw this system as punitive: Each absence represents a loss of points towards the course grade. So I switched to a system to positively reinforce attendance. When students come to class on time, they earn a point above and beyond the points needed to earn a perfect score in the course. Thus, a student with perfect attendance, and a perfect course performance, would earn 103% of the so-called maximum. A student who never came to class, but otherwise performed flawlessly, would earn 100%. If course points function as reinforcers, then this surely is a positive contingency. But the students reacted pretty much the same as before. They saw this as another form of punishment: With each absence I was denying them a bonus point. Of course the students are right. Whenever a reinforcer is contingent on behavior, it must be denied in the absence of that behavior.

Perhaps the propensity to see the negative side of positive contingencies depends on sophisticated verbal and symbolic repertoires that may filter the impact of contingencies on human behavior. Certainly college students can, and often do, convert course contingencies to symbolic terms, then proceed to manipulate those terms: They calculate the number of points attainable, including bonus points, and label 103% as the maximum. They keep a running tally as the maximum they can attain drops below 103%; thus, they calculate a reduced maximum after each absence. It would appear, then, that when they miss class they deliver the punishing stimulus themselves! So it might be argued that the punishment contingency is an indirect by-product of a certain kind of problem-solving repertoire unique to humans.
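The students' running tally amounts to simple arithmetic, which can be made explicit. The meeting count (30) and the even spread of the 3 bonus percentage points across meetings are hypothetical parameters chosen for illustration; the article does not state them.

```python
def attainable_maximum(absences, meetings=30):
    """Course percentage still attainable after `absences` misses.

    Hypothetical parameters: 30 class meetings, each bonus worth an
    equal share of the 3 extra percentage points; base course work
    alone is worth 100%.
    """
    bonus_per_meeting = 3.0 / meetings
    return round(100.0 + bonus_per_meeting * (meetings - absences), 2)

assert attainable_maximum(0) == 103.0   # perfect attendance: the "maximum"
assert attainable_maximum(30) == 100.0  # never attends, performs flawlessly
assert attainable_maximum(15) == 101.5  # each absence lowers the tally
```

The tally only ever moves downward with each absence, which is exactly the loss-like quality the students react to.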

AVERSIVE FUNCTIONS OF POSITIVE REINFORCEMENT

Is verbal or symbolic sophistication necessary for schedules of positive reinforcement to manifest aversive functions, or is it possible that a more basic process is at work? One answer comes from research on the conditioned properties of discriminative stimuli associated with the components of multiple schedules.

Working in Dinsmoor's laboratory at Indiana University, Jwaideh and Mulvaney (1976) trained pigeons on a multiple schedule with alternating VI components of food reinforcement. When the pecking key was green, a rich schedule was in effect, one that allowed the bird to earn food every 30 s on average. When the key was red, a leaner schedule was in effect; the bird could earn food every 120 s. In the next phase, the colors signaling the VI schedule components were withheld unless the bird pecked side keys. This arrangement is called an observing-response procedure because pecking the side keys allows the bird to see the color correlated with the VI schedule underway on the main key. The experimental question is this: Will the colors correlated with the VI schedules of food reinforcement serve as reinforcers themselves? That is, will they maintain responding on the observing keys? To answer this question, Jwaideh and Mulvaney manipulated the consequences of pecking the two observing keys.

Figure 4. Rates at which a pigeon pecked concurrently available observing keys to produce colors correlated with the VI 30-s and VI 120-s components of a compound schedule of food reinforcement, as reported by Jwaideh and Mulvaney (1976). Responding on the key that produced red (correlated with VI 120 s) as well as green (correlated with VI 30 s) was suppressed relative to responding on the key that produced only green.
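The difference between the rich and lean components can be given a concrete, if simplified, form. One common way to model a VI schedule is as a constant-probability timer: on each one-second tick, a reinforcer is set up with probability 1/(mean interval). The tick size and the generator below are modeling assumptions, not a description of Jwaideh and Mulvaney's apparatus.

```python
import random

def setups_per_hour(mean_interval_s, seed=0):
    """Count reinforcer setups in one simulated hour of a VI schedule,
    modeled as a constant-probability timer ticking once per second."""
    rng = random.Random(seed)
    return sum(rng.random() < 1.0 / mean_interval_s for _ in range(3600))

rich = setups_per_hour(30)    # green key's component: VI 30 s
lean = setups_per_hour(120)   # red key's component: VI 120 s
# Red signals the leaner payoff rate, roughly a quarter of the rich rate.
assert rich > lean
```

The point of the sketch is only that red is a reliable signal of a worse state of affairs than green, which is what gives it the conditioned properties examined next.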

Figure 4 shows the experimental manipulations as well as the results from 1 bird. Response rates on the two observing keys are shown across four experimental conditions. In the first panel, both keys produced green or red, depending on which schedule was in effect on the main key; response rates on the two keys were about equal. In the remaining three panels, pecks on one key continued to produce green or red, but pecks on the other key could produce only green, the color correlated with the rich schedule. The two response rates differed under these conditions. The bird pecked at high rates on the key that produced green (the "rich" color) and low rates on the key that produced red (the "lean" color) as well as green. This pattern held up across several reversals.

In other words, the color red, the stimulus correlated with the leaner of the two schedules of food reinforcement, suppressed responding relative to responding on a key that did not produce this color. Let me underscore the significance of this result. Red was correlated with positive reinforcement in a prototypical arrangement for demonstrating positive reinforcement: a food-deprived pigeon pecking a key for grain on an intermittent schedule. One might expect a stimulus correlated with a VI schedule of food reinforcement to be a good thing. Nevertheless, the red stimulus functioned as a conditioned punisher: It suppressed the observing response that produced it.

This result is not an oddity. It poses no difficulty whatsoever for contemporary theories of conditioning and learning. Indeed, the experiment fits quite nicely with current understanding of Pavlovian conditioning and its role in imbuing otherwise neutral stimuli with conditioned reinforcing or aversive properties (e.g., Dinsmoor, 1983; Fantino, 1977). The result does pose a problem for simplistic conceptions that assign the label "good" to positive reinforcement and "bad" to negative reinforcement. Jwaideh and Mulvaney's (1976) experiment suggests that whether a schedule (or stimuli correlated with it) will be good or bad depends on the broader environmental context in which the schedule or stimulus is embedded. In alternation with a rich VI 30-s schedule, a relatively lean VI 120-s schedule constitutes an aversive condition.

Like Jwaideh and Mulvaney (1976), Metzger and I also considered the aversive aspects of positive reinforcement, although we approached the problem a bit differently. Our research was patterned after a classic study by Azrin (1961). Azrin gave pigeons the opportunity to remove stimuli correlated with a fixed-ratio (FR) schedule. Two response keys were available; pecks on one key (the "food" key) were reinforced according to an FR schedule, and one peck on the other key (the "escape" key) initiated a time-out. During a time-out, the colors of the response keys were changed, the color and intensity of the houselight were changed, and pecks on the food key were ineffective. A second peck on the escape key reinstated the original stimuli and the FR schedule. The birds usually escaped during the period following reinforcement, suggesting that this was the most aversive part of the schedule.

It is important to recognize that animals can escape from a schedule even when the experimenter has not arranged explicit contingencies. When no explicit escape option is made available, animals may pause for extended periods after reinforcement; during this time, birds typically turn or move away from stimuli correlated with the schedule (Cohen & Campagnoni, 1989; Thompson, 1965). Because pausing provides a way to reduce contact with the schedule and the stimuli correlated with it, it may function as a form of escape. To test this idea, Metzger and I wanted to see if factors known to affect pausing would affect escape in a similar way.

Following a procedure developed by Perone and Courtney (1992), we trained pigeons on multiple schedules with two FR components. The only difference was the reinforcer magnitude; ratios in one component ended in a small reinforcer, and ratios in the other ended in a large reinforcer. We were interested in the behavior that occurred between the end of one reinforcer and the start of the ratio leading to the next reinforcer. There were 40 such transitions during each session. The escape key was available during half of these transitions.

                     Upcoming Reinforcer
Past Reinforcer      Small                     Large
Small                5 regular, 5 w/ escape    5 regular, 5 w/ escape
Large                5 regular, 5 w/ escape    5 regular, 5 w/ escape

Figure 5. Metzger and Perone's method for comparing pausing and escape in the four possible transitions between fixed ratios ending in small or large food reinforcers. Over the course of a session, half of the transitions included the activation of an escape key that could be pecked to suspend the schedule. Pausing was measured in the transitions without the escape option.

Figure 5 shows the four kinds of transitions that occurred each session. A ratio ending in a small reinforcer could be followed by another ending in a small reinforcer or by one ending in a large reinforcer. Likewise, a ratio ending in a large reinforcer could be followed by one ending in a small reinforcer or a large reinforcer. The four types of transitions were programmed in an irregular sequence. Each occurred 10 times per session, five times with the escape key available to initiate time-out and five times without it. When the escape key was available,

both the food and escape keys were lit. A single peck on the escape key initiated a time-out, during which the houselight and food key were turned off and the escape key was dimmed. Another peck on the escape key turned on the houselight and food key and reinstated the FR schedule so that pecks on the food key led eventually to reinforcement. The escape key was turned off.

The bird did not have to peck the escape key. If it pecked the food key first, the escape key was simply turned off. Pecks on the food key led eventually to reinforcement.
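The two-key contingency just described amounts to a small state machine. The following is a minimal sketch with hypothetical class and attribute names; nothing here comes from the original apparatus code:

```python
class EscapeTransition:
    """One transition with the escape key available (illustrative names)."""

    def __init__(self):
        self.escape_key_lit = True  # escape key lit at the start of the transition
        self.in_timeout = False     # houselight and food key on, FR in effect

    def peck_escape(self):
        if not self.escape_key_lit:
            return                  # dark key: pecks are ineffective
        if not self.in_timeout:
            # First peck initiates a time-out: houselight and food key
            # off, escape key dimmed but still operative.
            self.in_timeout = True
        else:
            # Second peck reinstates the FR schedule; the escape key
            # is turned off for the rest of the transition.
            self.in_timeout = False
            self.escape_key_lit = False

    def peck_food(self):
        if self.in_timeout:
            return                  # food key is dark during time-out
        # A food-key peck before any escape peck simply darkens the
        # escape key; ratio responding proceeds toward reinforcement.
        self.escape_key_lit = False
```

In this sketch, a bird that pecks the lit escape key once suspends the schedule, a second peck resumes it, and a food-key peck taken first forgoes the time-out for that transition.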

Figure 6 shows results from 1 bird. The upper panels show the pauses that occurred during the transitions without the escape option. These data come from the last 10 sessions at each of several FR sizes. In each panel, pauses in the four transitions are shown separately. Pausing is a joint function of the FR size, the magnitude of the past reinforcer, and the magnitude of the upcoming reinforcer. Most important, however, is the general pattern in the functions across the six conditions in the upper panels. Note also that the highest data point is always the open circle to the right. This represents pausing in the transition after a large reinforcer and before a small one.

The middle panels in Figure 6 show the escapes that occurred in the same sessions, but in the other half of the transitions (the ones with the escape option available). The general pattern of escape behavior is strikingly similar to the pattern of pausing. Indeed, the probability of escape is highest under the same conditions that produce the longest pauses: in the transition after a large reinforcer and before a small one.

The bottom panels present an alternative measure of escape behavior: the percentage of the session the bird spent in the self-imposed time-outs. These data are not as pretty as the others, but they do fall into line.

My students and I have replicated this experiment with fixed-interval schedules leading to different reinforcer magnitudes, and with transitions involving large and small FR sizes. The results are pretty much the same: Pausing and escape change in tandem. Both forms of behavior are influenced by the same variables in the same ways.

Figure 6. One pigeon's pausing and escape in the transitions between fixed ratios ending in small or large food reinforcers (1-s or 7-s access to grain). The reinforcer magnitudes were signaled by distinctive key colors. Data are medians (and interquartile ranges) over 10 sessions. The ratio size was manipulated across conditions (FR 20 through FR 120). Top: pausing. Middle: number of escapes; the session maximum was five per transition. Bottom: percentage of the session spent in the escape-produced time-out. (Unpublished data)

Figure 7 shows data from an experiment in which escape was studied with mixed schedules as well as multiple schedules. In both cases, an FR led to either small or large reinforcers. In the mixed-schedule condition, however, no key colors signaled the upcoming reinforcer magnitude. Pauses were relatively brief on the mixed schedule, and escape behavior was absent. In the multiple-schedule condition (when the upcoming magnitude was signaled), pausing increased, particularly in the large-to-small transition, and the pattern of escape behavior resembled the pattern of pausing.

The clear parallels in the data on pausing and escape suggest that pausing functions as a means of escape from the aversive aspects of the schedule. It seems likely that the pause-respond pattern that typifies performance on FR (and fixed-interval) schedules represents a combination of positive and negative reinforcement.

THE UBIQUITY OF AVERSIVE CONTROL

This observation brings me back to the general theme of this paper. Inside and outside the laboratory, aversive control is ubiquitous. Indeed, it seems to be unavoidable. Given this state of affairs, perhaps it would be worth considering whether aversive control is desirable or at least acceptable.

Figure 7. Pausing and escape in the transitions between fixed ratios ending in small or large food reinforcers (1-s or 7-s access to grain). Left: The reinforcer magnitudes were unsignaled (mixed schedule). Right: The magnitudes were signaled by distinctive key colors (multiple schedule). (Unpublished data)

In a book on teaching, Michael (1993) observed that "College learning is largely under aversive control, and it is our task to make such control effective, in which case it becomes a form of gentle persuasion" (p. 120). The idea is that aversive control might be acceptable if it generates behavior of some long-term utility. Think for a moment what it means to have a truly effective contingency of punishment or negative reinforcement. When a punishment contingency is effective, undesirable behavior is decreased and the aversive stimulus is almost never contacted. When an avoidance contingency is effective, desirable behavior is increased, and again there is minimal contact with the aversive stimulus. In my classes I impose rather stiff penalties when assignments are submitted late. Without this aversive contingency, late papers abound. With it, however, late papers are so rare that I doubt that I impose the penalty more often than once in a hundred opportunities. In short, Michael is right: A well-designed program of aversive control is gentle, and a lot of good can come of it.

That is fortunate, because it is impossible to construct a behavioral system free of aversive control. The forms of behavioral control we call "positive" and "negative" are inextricably linked. Thus, decisions about "good" and "bad" methods of control must be made quite apart from the question of whether the methods meet the technical specification of "positive reinforcement" or "aversive" control. We need to seek a higher standard, one that emphasizes outcomes more than procedures. Our chief concern should not be whether the contingencies involve the processes of positive reinforcement, negative reinforcement, or punishment. Instead, we should emphasize the ability of the contingencies to foster behavior in the long-term interest of the individual. Of course, this is all we can ask of any behavioral intervention, regardless of its classification.

REFERENCES

Azrin, N. H. (1961). Time out from positive reinforcement. Science, 133, 382-383.

Baron, A. (1991). Avoidance and punishment. In I. H. Iversen & K. A. Lattal (Eds.), Experimental analysis of behavior (Part 1, pp. 173-217). Amsterdam: Elsevier.

Baum, W. M. (1974). Chained concurrent schedules: Reinforcement as situation transition. Journal of the Experimental Analysis of Behavior, 22, 91-101.

Cohen, P. S., & Campagnoni, F. R. (1989). The nature and determinants of spatial retreat in the pigeon between periodic grain presentations. Animal Learning & Behavior, 17, 39-48.

Crosbie, J. (1998). Negative reinforcement and punishment. In K. A. Lattal & M. Perone (Eds.), Handbook of research methods in human operant behavior (pp. 163-189). New York: Plenum.

de Souza, D. D., de Moraes, A. B. A., & Todorov, J. C. (1984). Shock intensity and signaled avoidance responding. Journal of the Experimental Analysis of Behavior, 42, 67-74.

Dinsmoor, J. A. (1983). Observing and conditioned reinforcement. Behavioral and Brain Sciences, 6, 693-728. (Includes commentary)

Fantino, E. (1977). Choice and conditioned reinforcement. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 313-339). Englewood Cliffs, NJ: Prentice Hall.

Ferster, C. B. (1967). Arbitrary and natural reinforcement. The Psychological Record, 17, 341-347.

Hineline, P. N. (1984). Aversive control: A separate domain? Journal of the Experimental Analysis of Behavior, 42, 495-509.

Jwaideh, A. R., & Mulvaney, D. E. (1976). Punishment of observing by a stimulus associated with the lower of two reinforcement frequencies. Learning and Motivation, 7, 211-222.

Michael, J. (1975). Positive and negative reinforcement, a distinction that is no longer necessary; or a better way to talk about bad things. Behaviorism, 3, 33-44.

Michael, J. (1993). Concepts and principles of behavior analysis. Kalamazoo, MI: Association for Behavior Analysis.

Morse, W. H., & Kelleher, R. T. (1977). Determinants of reinforcement and punishment. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 174-200). Englewood Cliffs, NJ: Prentice Hall.

Perone, M., & Courtney, K. (1992). Fixed-ratio pausing: Joint effects of past reinforcer magnitude and stimuli correlated with upcoming magnitude. Journal of the Experimental Analysis of Behavior, 57, 33-46.

Perone, M., & Galizio, M. (1987). Variable-interval schedules of timeout from avoidance. Journal of the Experimental Analysis of Behavior, 47, 97-113.

Scripture, E. W. (1895). Thinking, feeling, doing. Meadville, PA: Flood and Vincent.

Sidman, M. (1989). Coercion and its fallout. Boston: Authors Cooperative.

Sizemore, O. J., & Maxwell, F. R. (1985). Selective punishment of interresponse times: The roles of shock intensity and scheduling. Journal of the Experimental Analysis of Behavior, 44, 355-366.

Skinner, B. F. (1971). Beyond freedom and dignity. New York: Knopf.

Skinner, B. F. (1983). A matter of consequences. New York: Knopf.

Thompson, D. M. (1965). Time-out from fixed-ratio reinforcement: A systematic replication. Psychonomic Science, 2, 109-110.