reinforcement: food signals the time and location of

24
REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF FUTURE FOOD SARAH COWIE,MICHAEL DAVISON, AND DOUGLAS ELLIFFE THE UNIVERSITY OF AUCKLAND It has long been understood that food deliveries may act as signals of future food location, and not only as strengtheners of prefood responding as the law of effect suggests. Recent research has taken this idea further—the main effect of food deliveries, or other ‘‘reinforcers’’, may be signaling rather than strengthening. The present experiment investigated the ability of food deliveries to signal food contingencies across time after food. In Phase 1, the next food delivery was always equally likely to be arranged for a left- or a right-key response. Conditions were arranged such that the next food delivery was likely to occur either sooner on the left (or right) key, or sooner on the just-productive (or not-just- productive) key. In Phase 2, similar contingencies were arranged, but the last-food location was signaled by a red keylight. Preference, measured in 2-s bins across interfood intervals, was jointly controlled by the likely time and location of the next food delivery. In Phase 1, when any food delivery signaled a likely sooner next food delivery on a particular key, postfood preference was strongly toward that key, and moved toward the other key across the interreinforcer interval. In other conditions in which food delivery on the two keys signaled different subsequent contingencies, postfood preference was less extreme, and quickly moved toward indifference. In Phase 2, in all three conditions, initial preference was strongly toward the likely-sooner food key, and moved to the other key across the interfood interval. In both phases, at a more extended level of analysis, sequences of same-key food deliveries caused a small increase in preference for the just-productive key, suggesting the presence of a ‘‘reinforcement effect’’, albeit one that was very small. Key words: choice, food reinforcement, signaling, preference pulse, pecking, pigeon _______________________________________________________________________________ The law of effect was described by Thorn- dike (1911) and later by Skinner (1938) as generically asserting that reinforcers increase the probability of the response they follow. More recent research suggests that this pre- diction is true only when past and future contingencies are the same; in situations where future consequences differ from those in the past, it is the future contingency that controls choice following reinforcers, rather than the effect of reinforcers enhancing or maintaining responses emitted just prior to the reinforcer. The ability of reinforcers to function as discriminative stimuli signaling future behav- ior–food contingencies has frequently been acknowledged in the experimental analysis of behavior. The period of decreased or absent responding following a food delivery on fixed- interval and fixed-ratio schedules is thought to occur because each reinforcer delivery signals the start of a period during which no responses will be reinforced (e.g., Schneider, 1969). A closely-related literature further suggests that the signaling properties of reinforcers may sometimes outweigh the strengthening effects. In the radial-arm maze (Olton & Samuelson, 1976), for example, rats quickly learned not to reenter an arm in which they had recently found food. That is, the response of entering a particular arm is not ‘‘reinforced’’ in the simple sense implied by the law of effect, because the discovery of food in one arm signals a change in the response– reinforcer contingency—food will not be found again in that arm in the immediate future. Similarly, in a conditional-discrimina- tion task, the presence or absence of a reinforcer following the sample-stimulus pre- sentation can signal the location of the alternative that will likely produce reinforce- ment in the comparison phase (Randall & Zentall, 1997). Under these conditions, the comparison choice was generally made to the key likely to produce a reinforcer, even when this key was the one that had not produced the reinforcer in the sample phase. The location of the last-obtained reinforcer can also be a discriminative stimulus signaling We thank the members of the Auckland University Experimental Analysis of Behaviour Research Unit for their help in conducting this experiment, which will form part of Sarah Cowie’s doctoral dissertation. We also thank Mick Sibley for looking after the pigeons. Address correspondence to Sarah Cowie, Department of Psychology, The University of Auckland, Private Bag 92019, Auckland 1142, New Zealand (e-mail: scow028@ aucklanduni.ac.nz). doi: 10.1901/jeab.2011.96-63 JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 2011, 96, 63–86 NUMBER 1( JULY) 63

Upload: doanhanh

Post on 02-Jan-2017

216 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF FUTURE FOOD

SARAH COWIE, MICHAEL DAVISON, AND DOUGLAS ELLIFFE

THE UNIVERSITY OF AUCKLAND

It has long been understood that food deliveries may act as signals of future food location, and not onlyas strengtheners of prefood responding as the law of effect suggests. Recent research has taken this ideafurther—the main effect of food deliveries, or other ‘‘reinforcers’’, may be signaling rather thanstrengthening. The present experiment investigated the ability of food deliveries to signal foodcontingencies across time after food. In Phase 1, the next food delivery was always equally likely to bearranged for a left- or a right-key response. Conditions were arranged such that the next food deliverywas likely to occur either sooner on the left (or right) key, or sooner on the just-productive (or not-just-productive) key. In Phase 2, similar contingencies were arranged, but the last-food location was signaledby a red keylight. Preference, measured in 2-s bins across interfood intervals, was jointly controlled bythe likely time and location of the next food delivery. In Phase 1, when any food delivery signaled alikely sooner next food delivery on a particular key, postfood preference was strongly toward that key,and moved toward the other key across the interreinforcer interval. In other conditions in which fooddelivery on the two keys signaled different subsequent contingencies, postfood preference was lessextreme, and quickly moved toward indifference. In Phase 2, in all three conditions, initial preferencewas strongly toward the likely-sooner food key, and moved to the other key across the interfood interval.In both phases, at a more extended level of analysis, sequences of same-key food deliveries caused asmall increase in preference for the just-productive key, suggesting the presence of a ‘‘reinforcementeffect’’, albeit one that was very small.

Key words: choice, food reinforcement, signaling, preference pulse, pecking, pigeon

_______________________________________________________________________________

The law of effect was described by Thorn-dike (1911) and later by Skinner (1938) asgenerically asserting that reinforcers increasethe probability of the response they follow.More recent research suggests that this pre-diction is true only when past and futurecontingencies are the same; in situationswhere future consequences differ from thosein the past, it is the future contingency thatcontrols choice following reinforcers, ratherthan the effect of reinforcers enhancing ormaintaining responses emitted just prior tothe reinforcer.

The ability of reinforcers to function asdiscriminative stimuli signaling future behav-ior–food contingencies has frequently beenacknowledged in the experimental analysis ofbehavior. The period of decreased or absentresponding following a food delivery on fixed-interval and fixed-ratio schedules is thought to

occur because each reinforcer delivery signalsthe start of a period during which noresponses will be reinforced (e.g., Schneider,1969). A closely-related literature furthersuggests that the signaling properties ofreinforcers may sometimes outweigh thestrengthening effects. In the radial-arm maze(Olton & Samuelson, 1976), for example, ratsquickly learned not to reenter an arm in whichthey had recently found food. That is, theresponse of entering a particular arm is not‘‘reinforced’’ in the simple sense implied bythe law of effect, because the discovery of foodin one arm signals a change in the response–reinforcer contingency—food will not befound again in that arm in the immediatefuture. Similarly, in a conditional-discrimina-tion task, the presence or absence of areinforcer following the sample-stimulus pre-sentation can signal the location of thealternative that will likely produce reinforce-ment in the comparison phase (Randall &Zentall, 1997). Under these conditions, thecomparison choice was generally made to thekey likely to produce a reinforcer, even whenthis key was the one that had not produced thereinforcer in the sample phase.

The location of the last-obtained reinforcercan also be a discriminative stimulus signaling

We thank the members of the Auckland UniversityExperimental Analysis of Behaviour Research Unit fortheir help in conducting this experiment, which will formpart of Sarah Cowie’s doctoral dissertation. We also thankMick Sibley for looking after the pigeons.

Address correspondence to Sarah Cowie, Department ofPsychology, The University of Auckland, Private Bag92019, Auckland 1142, New Zealand (e-mail: [email protected]).

doi: 10.1901/jeab.2011.96-63

JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 2011, 96, 63–86 NUMBER 1 ( JULY)

63

Page 2: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

future behavior–food contingencies. Krageloh,Davison and Elliffe (2005) varied the condi-tional probability of obtaining food on onekey, given that the previous food delivery wasobtained on that key. At conditional probabil-ities of .7 and above, postfood preference wasstrongly toward the just-productive key. Prefer-ence pulses—the log response ratio plotted as afunction of time or responses since a fooddelivery—lasted longer at higher conditionalprobabilities, and the level at which preferencestabilized was more extreme. As the probabilitythat the next food delivery would be on the just-productive key was decreased, postfood prefer-ence moved toward the not-just-productive key,although it was never as extreme as that for thejust-productive key at high conditional proba-bilities. Thus, the direction of postfood prefer-ence appeared to be controlled by the likelylocation of the next food delivery, rather thanby the location of the previous food.

Stimuli that are often called ‘‘conditionalreinforcers’’ may also act as signals of futurefood ratios and thus produce effects whichhave been attributed to ‘‘reinforcement’’when they signal similar future food contin-gencies. Davison and Baum (2006; 2010)noted that when ratios of additional nonfoodstimuli were presented, and were highlycorrelated with food ratios, poststimulus pref-erence was in the same direction as postfoodpreference, but when the stimulus-ratio tofood-ratio correlation was negative, poststimu-lus preference pulses were toward the not-just-productive key. This was so whether or not theadditional stimuli were paired with food.

In a steady-state environment, however, thecorrelation between stimulus ratio and foodratio has no effect on poststimulus choice.Under such conditions, Boutros, Davison andElliffe (2009) found that postfood preferencewas affected only by stimulus–food pairing, butthat the effect of the stimulus was discernibleonly at the most local level of analysis, as anincrease in preference-pulse amplitude. Theeffect of the response-contingent stimuli onpreference was also small in comparison withthat of a food delivery. Boutros et al.(2009)concluded that the importance of response-contingent stimuli in a steady-state environ-ment was small because the overall food ratiois already discernible from continued foodratios. In contrast, response-contingent stimuliin a highly variable environment in which the

current food ratio is unknown, as used byDavison and Baum (2006; 2010), are impor-tant in that they do provide additionalinformation about where the next food ismore likely to occur. Thus, in any environ-ment, the pattern of food delivery itself canprovide information about the likely futurecontingencies of food delivery, and choicefollows these future contingencies. This expla-nation was confirmed by Boutros, Davison &Elliffe (2011).

Krageloh and Davison (2003) suggested thatboth time to food and location of next foodmay moderate the magnitude of preference,with relatively more delayed food beingfollowed by smaller, shorter preference pulses(see also Davison & Baum, 2007). The studiesabove also show that the magnitude ofpreference is in some way related to thecontingency; choice for the just-productivealternative generally occurs more frequently,or is more extreme, than choice for the not-just-productive alternative, even when thecontingencies favor the not-just-productivealternative. Randall and Zentall (1997) notedthat when responding was required on acenter key between the presentation of thereinforcer and the comparison phase in aconditional discrimination, their pigeons tend-ed to respond slightly more on the just-productive alternative, even when this wasassociated with a lower probability of rein-forcement. Similarly, although Krageloh, Da-vison and Elliffe (2005) found that thedirection of preference changed with theprobability of obtaining food on that alterna-tive, preference under conditions where thenext food was very likely to occur on the not-just-productive alternative was much less ex-treme. Such findings suggest an effect addi-tional to a signaling effect, which could eitherbe a ‘‘reinforcement’’ effect or possibly asimple continuation of responding on a keysuch as occurs in a normal visit to that key(e.g., Schneider & Davison, 2006).

Behavior also appears to be controlled bythe probability of obtaining a reinforcer on aresponse alternative at any point in time.Catania and Reynolds (1968) observed anincrease in response rate as time since areinforcer increased, either if the variable-interval (VI) schedules were arranged arith-metically, or if the probability of a reinforcerbeing arranged increased with time since the

64 SARAH COWIE et al.

Page 3: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

last reinforcer. In contrast, when VI scheduleswere arranged according to a constant-proba-bility schedule, the rate of responding re-mained relatively constant across time since areinforcer. Elliffe and Alsop (1996) alsoreported clear changes in concurrent choicethat were associated with differences in the wayin which reinforcer availability changed withthe passage of time. Similarly, Church andLacourse (2001) observed that when rats wererequired to work on VI schedules with equalmeans and standard deviations, but withdiffering distributions of food in time afterreinforcers, the pattern of postreinforcerresponding differed depending on the distri-bution of foods in time. When intervals werearranged according to an exponential distri-bution, the mean first postfood response andthe maximum rate of responding occurredearlier than when intervals were arrangedaccording to a Wald distribution, in whichthe probability of food increases and thendecreases in interreinforcer intervals. Theeffects of the different distributions acrosstime since an event again show that the localtime-based probability of obtaining food is animportant determinant of behavior.

Do food deliveries signal local changes infood probabilities on conventional concurrentVI VI schedules? Any such effects require thatthere are local changes in food probability forresponses after food deliveries, and suchchanges will occur in conventional concurrentVI VI schedules that are arranged with achangeover delay (Herrnstein, 1961) thatcontinues to operate in the time followingfood delivery. Consider a 3-s changeover delay.If this operates after food delivery (i.e.,according to the usual definition of a change-over delay), food is only available for respond-ing after food delivery on the key that justproduced food—the food ratio is locallyinfinite. If the animal changes over, it canreceive no food for 3 s. Furthermore, such aneffect may also occur when no changeoverdelay is arranged: If a food delivery on a keyproduces a locally strong preference forresponding on (or an extended visit to) thesame key after food, food deliveries scheduledand held on the not-just- productive key willnot be collected and will become availablelater, again resulting in a locally extreme foodratio to the just-productive key. In this way, adynamical enhancement of preference can

occur: A local preference can produce alocal change in the food ratio on a key, whichwill then presumably further enhance thelocal preference. Thus local preference pulsesmay be held up by their own bootstraps, as itwere.

The present experiment investigated theability of individual reinforcers to signal futuretime-based contingencies. Under some condi-tions, the signaling properties of the reinforc-er were in direct opposition to the strength-ening effects of the reinforcer. Phase 1 of thepresent experiment was designed to investi-gate whether food delivery can signal the likelyavailability of future food in time, usingconcurrent VI schedules on which the overallfood ratio in sessions was always 1:1, but foodcould occur on the average sooner or laterafter food delivery on the just-productive ornot-just-productive key, or on the left or theright keys. In order to reduce or eliminate theeffects of memory decrement over time sincefood, in Phase 2 a key-light stimulus thatsignaled the location of the just-productive keywas arranged throughout the following inter-food interval. No changeover delay was used.

METHOD

Subjects

Six naıve homing pigeons numbered 21 to26 were maintained at 85% 6 15 g of theirfree-feeding body weight. Water and grit wereavailable at all times. Pigeons were fed post-feed of mixed grain when necessary tomaintain their designated body weights.

Apparatus

The pigeons were housed individually intheir home cages (375 mm high by 375 mmdeep by 370 mm wide) which also served asexperimental chambers. On one wall of thecage, 20 mm above the floor, were three plastickeys (20 mm in diameter) set 100 mm apartcenter to center. Each key could be illuminat-ed yellow or red, and responses to illuminatedkeys exceeding 0.1 N were recorded. Beneaththe center key, 60 mm from the perch, was amagazine aperture measuring 40 mm by40 mm. During food delivery, key lights wereextinguished, the aperture was illuminated,and the hopper containing wheat was raisedfor 2.5 s. The subjects could see and hear other

CHOICE AND FUTURE FOOD 65

Page 4: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

pigeons in the room during the experiment;no person entered the room during this time.

Procedure

The pigeons were slowly deprived of food bylimiting their intakes, and were taught to eatfrom the food magazine when it was present-ed. When pigeons were reliably eating during2.5-s magazine presentations, they were auto-shaped to peck the two response keys. One ofthe two keys was illuminated yellow or red for4 s, after which food was presented indepen-dently of responding. If the pigeon pecked theilluminated key, food was presented immedi-ately. Once the pigeons were reliably peckingthe illuminated keys, they were trained over2 weeks on a series of food-delivery schedulesincreasing from continuous reinforcement toVI 50 s presented singly on the left or rightkeys with yellow keylights. They were thenplaced on the final procedure describedbelow.

Sessions were conducted in the pigeons’home cages in a time-shifted environment inwhich the room lights were lit at 12 midnight.Sessions for all 6 pigeons commenced at about1.00 am. Room lights were extinguished at 4pm.

Sessions were conducted once a day, com-mencing with the left and right keylights lityellow, signaling the availability of a VIschedule on each key. Additionally, in Phase2 Conditions (8, 9 and 10), the key that hadproduced the last food was illuminated redduring the following interfood interval. Ses-sions ran for 60 minutes or until 60 fooddeliveries had been collected, whichever oc-curred first. No changeover delay (COD;Herrnstein, 1961) was used.

Phase 1

Food was arranged according to a modifiedconcurrent VI VI schedule, where the next-food location was determined at the prior fooddelivery (and at the start of each session) witha probability of .5. Thus, approximately equalnumbers of food deliveries were available onboth alternatives in each session. Althoughonly one schedule at a time was ever inoperation, both keys remained illuminatedfor the duration of each interfood interval.

Figure 1 shows a diagram of the contingen-cies arranged in Phase 1. In Conditions 1 and

3, both schedules were VI 27 s, so theprobability of food delivery at any time afterfood delivery on either key remained equal,and a food delivery did not signal anydifference in expected time to the next food.In subsequent conditions, the two scheduleswere VI 5 s and VI 50 s, and across conditionswe varied how the last food delivery, and insome conditions, its key location, signaled thekey on which the VI 5-s and the VI 50-sschedules were available in the next interfoodinterval (Figure 1). Thus, in Condition 2, foodwas arranged on a VI 5-s schedule on the just-productive key, or on a VI 50-s schedule on thenot-just-productive key. In Condition 5, 7 and11, the contingencies were the reverse of thosein Condition 2, so that food was available on aVI 5-s schedule on the not-just-productive key,and on a VI 50-s schedule on the just-productive key. Thus, for Conditions 2, 5, 7and 11, the location of the last reinforcerdetermined the location of the subsequent

Fig. 1. Phase 1. A schematic diagram of the likelymean time to food on the left and right keys following aleft- or right-key food delivery in Conditions 1 to 7, and 11.The mean interval for left- and right-key food deliveries,respectively, is shown by filled and open circles, respec-tively. Similar contingencies were arranged in Phase 2(Conditions 8 to 10), but the key that produced the lastfood delivery was illuminated red (the other remainedyellow) during the next interfood interval.

66 SARAH COWIE et al.

Page 5: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

sooner schedule. In Condition 4, food wasavailable on a VI 5-s schedule on the left key,and on a VI 50-s schedule on the right key, andthese schedules were reversed in Condition6—thus, under these conditions the locationof the sooner schedule was independent of thelast-reinforcer location. In all conditions, themean time to the next food was 27 s.

Phase 2

As in Phase 1, the overall food ratio was heldapproximately equal on the two keys, and theprobability that the next food would beobtained sooner on the just-productive key,or sooner on the not-just-productive key, orsooner at a particular location (left or rightkey), was varied across conditions. However,the just-productive key was lit red for theduration of the following interreinforcer in-terval (IRI). In Condition 8, food was arrangedon a VI 5-s schedule on the not-just-productivekey, and a VI 50-s schedule on the just-productive key, as in Conditions 5, 7 and 11in Phase 1. In Condition 9, food was arrangedon a VI 5-s schedule on the right key, and on aVI 50-s schedule on the left key (as forCondition 6, Phase 1). In Condition 10, foodwas available on a VI 5-s schedule on the just-productive key, and on a VI 50-s schedule onthe not-just-productive key (as in Condition 2of Phase 1). Thus, in Conditions 8 and 10, thelocation of the likely-sooner food deliverydepended on the location of the previousreinforcer, but in Condition 9, the location ofthe likely-sooner food delivery was always onthe same key and was thus independent of thelast-reinforcer location.

A PC-compatible computer running MED-PCH IV software in an adjacent room con-trolled and recorded all experimental eventsand the time at which they occurred. Eachcondition lasted for 85 daily sessions, and thedata from the last 60 sessions were analyzed.Stability of data was assessed visually usinggraphs of log response ratios, updated daily.Changes in performance were complete withinthe first 25 sessions, and thus the data from thelast 60 sessions used in the analysis may beregarded as being stable.

RESULTS

In the following data analyses, no groupmean data were plotted if there were fewer

than 120 responses or 60 food deliveries in atime bin summed across all pigeons over the65 sessions. For individual data points, therespective numbers were 20 and 10. Addition-ally, some log-ratio data could not be plottedbecause no responses were emitted, or nofoods obtained, on one key in a 2-s time bin.

Phase 1

Figure 2 shows the mean obtained log left/right food ratios in successive 2-s bins ininterfood intervals after left- and right-keyfood deliveries for Phase 1 (Conditions 1 to 7and 11). As arranged, for all conditions exceptConditions 1 and 3, the local distribution offood deliveries changed with increasing timesince the last food. In the same–later condi-

Fig. 2. Phase 1. Mean log (L/R) obtained food ratio asa function of time since a left or right food delivery, in 2-sbins. Some data fell off the graphs.

CHOICE AND FUTURE FOOD 67

Page 6: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

tions (Conditions 5, 7, and 11), the log foodratio immediately following a food delivery wasstrongly in the direction of the not-just-productive key. As time since the last foodincreased, the log food ratio moved in thedirection of the just-productive key, becomingequal at around 10 s since the last fooddelivery, then moving strongly toward thejust-productive key. In the same–sooner con-dition (Condition 2), the local food ratio ininterfood intervals was in the opposite direc-tion, beginning strongly in the direction of thejust-productive key, and moving toward thenot-just-productive key. In the left–soonercondition (Condition 4), the local food ratiowas strongly toward the left key immediatelyfollowing the delivery of a food, but movedtoward the right key with time since food. Theopposite was true for the left–later condition(Condition 6).

The effect of recent food deliveries onpostfood choice was analyzed using preferencepulses, in which the log left:right responseratios were plotted as a function of time in 2-sbins since the last food, following both left-and right-key food deliveries. Figure 3 showspreference pulses from Phase 1 (Conditions 1to 7 and 11) of the experiment collapsedacross the 6 pigeons. Comparing this figurewith Appendix Figures A1 to A7 (individual-subject data), it is apparent that the group datawere generally representative of individualresponding, with the possible exception ofConditions 5, 7 and 11. In these conditions,choice remained close to indifference for mostbirds, but the pattern of the mean data wasaffected by Pigeon 22, whose postfood choicewas strongly controlled by the sooner-on-other-key contingency (Appendix Figures A5, A7and A8). A similar pattern developed forPigeon 25 in Condition 11. However, despitethese quantitative differences, left-key choicewas higher than right-key choice immediatelyafter food for 5 pigeons in Condition 5 and 7,and all 6 pigeons in Condition 11, showing adegree of control by the other–sooner contin-gency. Henceforth, discussion will focus ongroup data, noting individual-pigeon discrep-ancies. Particular features of interest are themagnitude and direction of the preferencepulses immediately following the food deliv-ery, and mean choice averaged across inter-food intervals (the same as across wholesessions). The mean and range first-bin and

extended log response ratios for each condi-tion are shown in Table 1.

When food deliveries were arranged accord-ing to a VI 27-s schedule on both keys(Conditions 1 and 3; Figure 3), postfoodpreference in the first 2-s time bin wasgenerally close to indifference following left-food deliveries from either alternative (Ta-ble 1). Preference stabilized around indiffer-ence after about 4 s. In Condition 3, immedi-ately after food, Pigeon 22 showed strongchoice toward the key that had just provided

Fig. 3. Phase 1. Mean log (Left/Right) response ratiosas a function of time since left and right food deliveries.Also shown is the extended-level preference averagedacross the last 65 sessions of the condition separately foreach prior reinforcer. Condition 3 was a replication ofCondition 1, and Conditions 7 and 11 replicatedCondition 5.

68 SARAH COWIE et al.

Page 7: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

food—an apparent carryover from its perfor-mance in Condition 2.

Mean first-bin preference in the same–sooner condition (Condition 2; Figure 3),was larger than in the equal VI 27-s conditions(Conditions 1 and 3) following left- and right-key food deliveries (Table 1). Figure 3 showsthat extended choice moved toward thesessional mean, slightly below indifference.As in other conditions, Pigeon 22 showedstrong first-bin choice toward the key thatarranged the sooner food.

Table 1 shows that first-bin preference inthe left–sooner condition (Condition 4; Fig-ure 3) was again more extreme than the VI27-s and same–sooner conditions (Conditions1 to 3). In the left–sooner condition, first-binpreference was strongly toward the left key,irrespective of the last-food location. First-binpreference following left-key food was system-atically more extreme than preference follow-ing a right-key food (Table 1). A binomial testof response ratios during the first 20 s follow-ing a food delivery averaged over the last 65sessions of Condition 4 indicated that thisdifference was significant across individualsubjects, z 5 2.1, N 5 6, p , .05. Preferencemoved toward and past indifference, andstabilized at a level reflecting extreme right-key choice. Such increased preference for thelater-food key at longer times since the lastfood delivery resulted in mean choice acrossinterfood intervals being toward the right(later) key for both the group mean and foreach individual pigeon (Table 1).

In the left-food later condition (Condition6), the pattern of preference was generallyopposite to that observed in the left–soonercondition (Condition 4), as might be expectedgiven the opposing contingencies in effect.First-bin preference was strongly in the direc-tion of the right key following both left-keyand right-key food deliveries (Table 1). Thispattern of responding was opposite to thatseen in Condition 4. First-bin preferencefavored the right key, as indicated by thenegative log ratios in Figure 3. A binomial testof log response ratios during the first 20 s aftera food delivery averaged over the last 65sessions indicated that this difference wassignificant across individual subjects, z 5 2.1,N 5 6, p , .05.

As found in the left-food sooner condition(Condition 4), choice across interfood inter-vals moved toward the key on which the laterschedule operated, reaching the sessionalmean approximately 20 s after the last fooddelivery, and continuing to a level reflectingmore extreme preference for the later-food(left) key. Unlike preference in the otherPhase-1 conditions, preference in the left–sooner and left–later conditions did notstabilize at a level reflecting the sessionalmean. As a result, mean preference in theboth the right-food later and left-food laterconditions (Conditions 4 and 6) reflectedoverall preference for the later-food alterna-tive.

In the same–later conditions (Conditions 5,7 and 11), food deliveries occurred sooner on

Table 1

Mean (m), lower (Min.) and upper (Max.) first-bin, and extended, log response ratios followingleft and right food deliveries. C is condition number.

C

|L food delivery |R food delivery

First-bin Extended First-bin Extended

m Min. Max. m Min. Max. m Min. Max. m Min. Max.

1 20.16 20.42 0.11 20.06 20.42 0.11 20.14 20.45 0.06 20.10 20.45 0.062 0.47 20.17 0.12 20.02 20.17 0.12 20.67 20.17 0.07 20.22 20.53 0.073 0.28 20.20 0.12 20.02 20.20 0.12 20.21 20.28 0.09 20.08 20.28 0.094 1.38 20.37 20.07 20.24 20.37 20.07 0.72 20.54 20.18 20.35 20.54 20.185 0.06 20.46 0.13 20.10 20.46 0.13 0.68 20.21 0.10 20.02 20.21 0.106 20.52 0.06 0.63 0.33 0.06 0.63 20.79 0.01 0.53 0.25 0.01 0.537 20.32 20.53 0.27 20.07 20.53 0.27 0.40 20.23 0.23 0.04 20.23 0.238 20.90 20.36 0.61 0.17 20.36 0.61 1.00 20.42 20.10 20.28 20.42 20.109 21.37 20.20 0.63 0.18 20.20 0.63 20.76 0.01 0.58 0.31 0.01 0.58

10 0.74 20.57 20.29 20.46 20.57 20.29 21.05 0.02 0.64 0.28 0.02 0.6411 20.42 20.50 0.12 20.10 20.48 0.15 0.42 20.14 0.15 20.02 20.16 0.19

CHOICE AND FUTURE FOOD 69

Page 8: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

the not-just productive key, and postfoodpreference in the first 2-s bin was generallytoward the not-just-productive key following afood delivery from either alternative (Ta-ble 1). In Condition 7, mean first-bin prefer-ence was in the direction of the likely sooner-food location, whereas in Condition 5, onlypreference following right-key food deliverieswas in the direction of the likely sooner-foodlocation (Table 1). Although Condition 11 wasconducted after several conditions in whichthe last-food location was signaled (Phase 2),the intervening conditions did not appear toaffect postfood preference; the results fromCondition 11 closely replicated those fromCondition 7. As previously noted, local choicefor all pigeons except Pigeon 22 (in all same–later conditions) and Pigeon 25 (in Condition11) generally was close to indifference imme-diately following food delivery, and quicklystabilized at a level close to the sessional mean.

Although the contingencies were the directopposite of those in the same–sooner condi-tion (Condition 2), preference in the same–later conditions (Conditions 5, 7 and 11) wassubstantially less extreme. Both the magnitudeand the duration of the postfood preferencepulse were notably smaller in the same–laterconditions, possibly reflecting a bias againstpostfood responding on the not-just-produc-tive alternative.

Thus, the contingencies when the VI sched-ules were the same on both keys produced theleast extreme postfood preference (Condi-tions 1 and 3). When the schedule wasarranged to reinforce further the effect ofthe previous food delivery (same–sooner;Condition 2), postfood preference was moreextreme. When the schedule favored a partic-ular key (left sooner and left later; Conditions4 and 6), there were strong postfood pulses inthe direction of that key. The choice data fromconditions in which the schedule favored thenot-just-productive key (same later; Conditions5, 7 and 11) were somewhat ambiguous, butsuggested only weak control by the contingen-cies in most pigeons, with the exception ofPigeon 22, and Pigeon 25 in Condition 11,which showed more extreme preference forthe not-just-productive key.

Effects of Successive Reinforcers

The log response ratios averaged oversuccessive interfood intervals (‘‘trees’’) for

selected sequences of up to four food deliver-ies obtained in a condition are plotted inFigure 4 as a function of food deliveries. InConditions 1 and 3 (VI 27 s), the trees weresimilar in shape to those obtained elsewhere(e.g., Davison & Baum, 2000; 2002), with eachsuccessive food delivery having a small butreliable increasing effect on preference. Thischange in preference was greatest followingdiscontinuations of sequences of same-keyfood deliveries. The tree diagrams weresymmetrical, with left-key and right-key foodshaving a similar-sized effect on preference. Butthe size of the preference changes acrosssuccessive food deliveries were much attenuat-ed compared with those reported by Davisonand Baum (2000; 2002).

Successive-reinforcer analyses for the same–sooner condition (Condition 2), were similarto those in Conditions 1 and 3 in that theywere symmetrical, with each food deliverybeing followed by a change in preferencetoward the just-productive key, and with thegreatest effects occurring following discontin-uations of same-key food deliveries. Thechange in preference following the first fooddelivery was, however, substantially greater,and the difference between post-left and post-right choice more apparent than in Condi-tions 1 and 3. In the left–sooner condition(Condition 4), tree analyses showed a similarmagnitude of change in preference acrosssuccessive food deliveries to that observed inthe same–sooner condition (Condition 2), butwith a smaller change occurring over the firstthree left-key food deliveries. Preference fol-lowing continued right-key food deliveries wasslightly less extreme than following left-keyfood deliveries. The opposite pattern wasobserved in the left–later condition (Condi-tion 6), with the differential effects of theopposite contingencies (left food sooner inCondition 4, right food sooner in Condition 6)apparent in the overall level of the tree graphs.Preference in the left–sooner condition (Con-dition 4) was more toward the right key, whilepreference in the left–later condition (Condi-tion 6) was more toward the left key— in bothcases, toward the key providing the averagelonger time to food. The overall level ofpreference for the later alternative in bothconditions is a necessary consequence of theexcellent control by the local food ratios forthe duration of the IRI. With the VI 5-s and VI

70 SARAH COWIE et al.

Page 9: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

50-s schedules in operation, responding quick-ly moved from the sooner key to the later key,and thus, overall, more time was spentresponding on the later key.

The group-mean data show that each fooddelivery in the same–later conditions (Condi-tions 5, 7 and 11; Figure 4) produced a changein preference toward the not-just-productivekey. The magnitude of change in preferenceproduced across a sequence of four fooddeliveries in these conditions was similar inextent to that in Conditions 2, 4 and 6, but thepattern was quite different. While discontinu-ations again had a greater effect on preferencethan continuations, discontinuations changed

choice toward the key that had not justprovided food. Continued food deliveries ona key, however, changed preference slightlybut progressively toward the key that hadprovided the last food.

Phase 2

In Phase 2, the last-food location wassignaled by a red keylight, and henceforthPhase 1 conditions will be termed ‘‘un-signaled’’, and Phase 2 conditions ‘‘signaled’’,conditions. Log food ratios as a function oftime since the last food delivery for Phase 2(Conditions 8 to 10) are shown in Figure 5.Under these conditions, which were replica-

Fig. 4. Phase 1. Mean log response ratio withininterreinforcer intervals as a function of successive fooddeliveries from left and right keys. Sequences analyzedoverlapped—that is, the data points plotted for an x valueif 1 were from any sequence ending in a left or a rightreinforcer; for x 5 2, the points were for any sequencesending in two left food deliveries, two right food deliveries,or left-right or right-left food deliveries.

Fig. 5. Phase 2. Mean log (L/R) obtained food ratio asa function of time since a left or right food delivery, in 2-sbins. Some data fell off the graphs.

CHOICE AND FUTURE FOOD 71

Page 10: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

tions of the same–later (Conditions 5, 7 and11), left–later (Condition 6), and same–sooner(Condition 2) conditions, but with the last-food location signaled by a red keylight, thelocal food-ratio changes were generally similarto those in the Phase 1 conditions, althoughmore extreme immediately following thedelivery of a food from either key in thesignaled left–later condition (Condition 9),and slightly less extreme later in the IRI in thesignaled same–later condition (Condition 8).

Figure 6 shows the log response ratioplotted as a function of time since a left andright food delivery for Conditions 8 to 10.Comparison of these data with the log foodratios in Figure 5 shows that the log food ratiowas followed closely throughout the IRI in allconditions of Phase 2.

In the signaled same–later condition (Con-dition 8), individual data were well represent-ed by the mean data (Figure 6; AppendixFigure B1), unlike those in the unsignaledsame–later conditions (Conditions 5, 7 and11). First-bin preference in the signaled same–later condition was strongly in the direction ofthe not-just-productive key for all birds. As inthe unsignaled same–later conditions, prefer-ence moved from the not-just-productive keytoward indifference over a period of 10 s, buteventually stabilized at a level strongly in thedirection of the just-productive key, ratherthan at a level reflecting the sessional mean,which was close to indifference. The mean logresponse ratio across the interfood intervalfollowing a left food delivery was thus substan-tially more toward the later alternative than inthe unsignaled same–later conditions (Ta-ble 1).

As in the unsignaled left–later condition(Condition 6; Figure 3), first-bin preference inthe signaled left–later condition (Condition 9;Figure 6) was strongly toward the right key.The magnitude of this preference was slightlymore extreme than in the unsignaled left–latercondition (Table 1). Preference in the first 2-sbin was marginally more toward the right keyfollowing left-key food deliveries than follow-ing right-key food deliveries. The change inpreference over time since food delivery in thesignaled left–later condition was similar to thatthat found in the unsignaled left–later condi-tion, although the mean across interfoodintervals was somewhat lower (Table 1). Thelevel at which IRI preference stabilized was

similar to that in the unsignaled left–latercondition. As in the unsignaled left–soonerand left–later conditions (Conditions 4 and 6,respectively), preference across interreinforcerintervals in the signaled left–later conditionstabilized at a level much more extreme thanthe sessional mean.

First-bin postfood preference in the signaledsame–sooner condition (Condition 10; Fig-ure 6) was directionally similar to that in theunsignaled same–sooner condition (Condi-tion 2), but was much more extreme (Ta-ble 1). Interfood preference did not stabilizeat the same point across the interreinforcer

Fig. 6. Phase 2. Mean log (Left/Right) response ratiosas a function of time since left and right food deliveries.Also shown is the extended-level preference averagedacross the last 65 sessions of the condition separately foreach prior reinforcer.

72 SARAH COWIE et al.

Page 11: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

interval as in the unsignaled same–soonercondition, instead crossing the sessional meanto reflect strong preference for the not-just-productive key. Mean preference across inter-reinforcer intervals was thus more extreme forthe signaled same–sooner condition than forthe unsignaled same–sooner condition, as aresult of the increased time spent respondingon the later-food key, later in the interfoodinterval.

With the addition of a stimulus signaling thelast-food location, postfood preference underconditions where the location of the soonerschedule depended on the last-food location(Conditions 8 and 10) was more extreme, andalways in the direction of the likely soonernext-food location, regardless of whether thenext food delivery was arranged sooner on thejust-productive, or on the not-just-productivekey. The addition of the signaling stimuluswhen the sooner schedule remained on onekey (Condition 9) had only a minimal effecton the magnitude of choice immediatelyfollowing the delivery of a food in comparisonwith the unsignaled left–later condition (Con-dition 6).

In addition, preference later in the inter-food interval for the signaled conditionsapproximated the local food ratio, in accor-dance with the contingencies in effect. Incomparison, preference later in the interfoodinterval in the unsignaled conditions (with theexception of the left–sooner and left–laterconditions; Conditions 4 and 6) approximatedthe overall food ratio. The addition of thesignaling stimuli in Phase 2 thus appears tohave removed the requirement to rememberthe last-food location in order to discriminatethe location of the next sooner schedule;under these signaled conditions, only timesince food was necessary for tracking the next-food location, and thus the behavior was moresimilar to that observed in those conditionswhere last-food location signaled nothingabout the next-food location.

Effects of Successive Reinforcers

Figure 7 shows tree analyses for selectedsequences of food deliveries obtained in acondition, for Conditions 8 to 10. When thesignaled same–later contingencies in opera-tion (Condition 8), the change in preferencethat accompanied the first food delivery in asequence was much greater than in any Phase

1 conditions. Discontinuations again had thegreatest effect on preference, moving it to alevel slightly less than that of a continuation ofthe same length on the other key. Unlike theunsignaled same–later conditions (Conditions5, 7 and 11), food deliveries in the signaledsame–later condition were followed by achange in preference toward the just-produc-tive key, reflecting the increased preferencefor the just-productive (later-food) alternativelater in the interfood interval.

In the signaled left–later condition (Condi-tion 9), overall preference was toward the leftkey—a pattern similar to that observed in the

Fig. 7. Phase 2. Mean log response ratio withininterreinforcer intervals as a function of successive fooddeliveries from left and right keys. See legend to Figure 4.

CHOICE AND FUTURE FOOD 73

Page 12: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

unsignaled left–later condition (Condition 6),which likely reflects a tendency to spend moretime responding on the left key. As in theunsignaled same–later conditions (Conditions5, 7 and 11), but unlike the unsignaled left–later condition (Condition 6), continued left-or right-key food deliveries after the firstmoved choice toward the key on which thefood had been obtained, although the amountof change was slight. Discontinuations had alarger effect, moving choice toward the keythat did not provide the last food delivery; forexample, a left food delivery followed by aright food delivery produced a strong left-keychoice, whereas a right food delivery followedby a left food delivery resulted in a smaller left-food choice.

The pattern of choice changes in the signaledsame–sooner condition (Condition 10) werethe reverse of those in the unsignaled same–sooner condition (Condition 2)—food oneither key produced a strong choice for theother key over the next interfood interval.Again, this difference is attributable to thedifferences in the pattern of interfood choice,in particular the later stabilization of preferenceat a point beyond indifference in the signaledsame–sooner condition, toward the not-just-productive (later-food) alternative. In contrast,local preference in the unsignaled same–soonercondition stabilized close to indifference, withextreme preference occurring only in the firstbins for the just-productive alternative. Thus, inthe unsignaled same–sooner conditions, moretime was spent responding on the just-produc-tive (sooner) alternative, and the pattern ofchoice on a more extended level was thusreversed. However, continued foods on a keydid produce a small but consistent change inchoice toward that key, as observed in theunsignaled same–sooner condition.

In summary, the addition of a stimulus thatsignaled the location of the last food deliveryhad two main effects discernible at the mostlocal level of analysis: heightened postfoodpreference for the next-sooner key, andextreme preference for the next-later key atincreased time since the last food delivery. Aswould be expected, these effects were gener-ally present only when the contingency wassuch that the location of the next-soonerschedule depended on the last-food location.These effects of the signaling stimuli were alsoobservable in analyses of the effects of

sequences of food deliveries, and are likelyresponsible for the majority of differencesbetween the tree analyses for the signaledversus unsignaled conditions.

DISCUSSION

A number of replications were conducted inthis experiment (see Figure 2): Condition 3was the same as Condition 1 (both schedulesVI 27 s), but did not produce an exactreproduction of the results, in that smallpostfood preference pulses lasting about 6 soccurred in Condition 3, but not in Condition1. The difference was largely due to Pigeon 22,the only pigeon that showed strong differen-tial choice in Condition 3. Condition 7 (same–later) was a replication of Condition 5, and wasdone to investigate whether the considerablecontrol over postfood choice in Condition 6(left–later) had any lasting effect on choice.Again, the result was not identical, but first-binchoice given left and right food deliveries inCondition 7 differed directionally in the sameway as in Condition 5. The conditions werereplicated again in Condition 11, and the near-identity of the results of Conditions 7 and 11suggests that it is those of Condition 5, notCondition 7, that are anomalous. Condition 5may have been affected by the prior Condition4, which arranged a left–sooner contingency.Condition 11 was conducted after Phase 2 tocheck whether the Phase 2 conditions had anylasting effect on postfood choice, and, as itprovided an almost perfect reproduction ofthe choice in Condition 7, it appears there wasno lasting effect.

In Phase 1 of the present experiment, fooddeliveries signaled subsequent contingenciesof reinforcement in varying ways, but the keyon which the just-obtained food had occurredwas not separately signaled. In Conditions 1, 3,4 and 6, the last-food location was unrelated tothe location of the next-sooner schedule, whilein Conditions 2, 5, 7 and 11, left- and right-keyfood deliveries differentially signaled thelocations of the next-sooner food. The contin-gencies signaled by food delivery were eithernondifferential with respect to the sooner food(the same mean times for both keys; Condi-tions 1 and 3), or were differential with respectto the next-sooner food (different contingen-cies for each key; Conditions 2, 4, 5, 6, 7 and11).

74 SARAH COWIE et al.

Page 13: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

As a general conclusion, local choice wasjointly controlled by the probability of obtain-ing food at a location at a given time, by thecomplexity of the contingency, and apparentlyby decaying control over time by the last-foodlocation. When the food locations were non-differential signals, and the subsequent con-tingencies were nondifferential (when thelikelihood of obtaining a food on the left orright key was similar; Conditions 1 and 3),there were virtually no preference pulses(Figure 3). When only the key position sig-naled the next sooner-food location (Condi-tions 4, left sooner, and 6, right sooner),postfood preference pulses were stronglytoward the key providing the likely soonerfood as signaled by the time since the last fooddelivery, and control by obtained local foodratios lasted throughout the IRI (compareFigures 2 and 3). When left and right fooddeliveries signaled different postfood contin-gencies (Condition 2, same–sooner), prefer-ence pulses were smaller than in Conditions 4and 6, and control by the local food ratio wasobserved only for about 10 s after either fooddelivery. When left and right food deliveriesnot only signaled different contingencies, buteach signaled contingencies favoring the otherkey sooner (Conditions 5, 7, and 11), prefer-ence pulses were small, and were causedmainly by one individual pigeon in Condition5 and 7 (Appendix Figures A5 and A7). Thusit appears that postfood preference was con-trolled by the contingencies signaled by priorfood location, but that the degree of controldepended on the complexity of the stimulus–behavior–reinforcer contingency. Control wasgreatest when any food signaled sooner foodson a particular key, less when food signaledsooner food on the same key, less again whenfood signaled sooner on the other key, andabsent when food was not a differential signalfor the next likely sooner food.

The present results are important in show-ing that a changeover delay (Herrnstein, 1961)is not necessary for preference pulses todevelop. However, the arrangement of achangeover delay may be sufficient for prefer-ence pulses because such procedures donaturally affect the local probability of fooddelivery on two keys. If, as appears common, afood delivery on a key leaves the changeoverdelay satisfied, the obtained food ratio will beinfinitely toward the just-reinforced key during

the changeover-delay period, as no food canbe obtained on the other key, even if arranged,during this time—although the overall foodrate in the changeover-delay period will belowered. Given the results reported here, suchan immediate local food differential signaledby prior food would strongly affect localchoice.

Reinforcement effects may be thought of aslocal or as extended. Local reinforcementeffects would be shown by choice beingenhanced toward the just-productive key overthe next interfood interval. Extended effectscan be thought of as a similar differentialchoice but across continuations of same-keyfood deliveries (Figure 4). First, local rein-forcement effects are discussed. In Phase 1conditions in which it was possible to seedifferential interfood choice (Figure 3, inparticular the left–sooner and left–later Con-ditions 4 and 6), the location of the just-productive key biased choice toward that keyover the next interreinforcer interval, suggest-ing a local reinforcement effect. But this wasevident in only a transient way (the first 6 s) inConditions 1 and 3, in which it might havebeen expected to have occurred more stronglybecause there could be no signaling effect ofthe last food delivery.

A method of calculating the size of rein-forcement effects is to take the followingmeasure:

0:5 logBL RLjð ÞBL RRjð Þ

: BR RRjð ÞBR RLjð Þ

� �,

where B and R refer to responses andreinforcers respectively, and L and R to theleft and right keys respectively. Notice thesimilarity of this measure to measures taken inconditional-discrimination analyses (e.g., Davi-son & Nevin, 1999), with RL and RR replacingS1 and S2. The measure, applied to responsesin each 2-s time bin, provides an estimate ofthe local reinforcement effect independent ofthe signaling effect of the prior food delivery.Where the measure is zero, the pigeon emittedequal numbers of responses to the two keysregardless of where the last food was delivered;thus, no signaling or reinforcer effects werepresent. When the measure is positive, moreresponses were emitted on the just-productivekey than on the not-just-productive key,indicating a reinforcement effect. When the

CHOICE AND FUTURE FOOD 75

Page 14: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

measure is negative, more responses wereemitted on the not-just-productive key, indi-cating a signaling (or location) effect. Thisanalysis is shown in Figure 8. The measureshowed that in Conditions 2, 3, 4, and 6, therewas a small, transient enhancement in prefer-ence to the just-productive key (reinforcereffect), followed, in the same–sooner and VI27-s conditions (Conditions 2 and 3) by a fallto zero (equal responding on both keys)—thus, the reinforcement effect did not last. Inthe left– and right–sooner conditions (Condi-tions 4 and 6), the local increase in preferenceto the just-reinforced key lasted longer than inConditions 2 and 3, being maintained for 20 to40 s after the last food delivery. However, these

effects were smaller in magnitude than inConditions 2 and 3.

The left– and right–sooner conditions(Conditions 4 and 6) were the only conditionsin Phase 1 in which choice in interreinforcerintervals continued to favor the just-productivekey well into the interreinforcer interval, asshown by the log response ratios following aleft food delivery being consistently more tothe left key than those following a right fooddelivery (Figure 3). Since this effect onlyoccurred in these two conditions, it does notappear to be a ‘‘reinforcement’’ effect; rather,it must be due to some other cause specific tothese conditions, perhaps related to theexcellent control by the time since the lastfood delivery across the interfood interval.

Figure 8 shows that, in the same–laterconditions (Conditions 5, 7 and 11), no effectof prior food (reinforcement effect) waspresent, due to the strong signaling propertiesof the reinforcers. Postfood choice was driventoward the not-just-reinforced key over the first10 s of the interfood interval; after this 10-speriod, very little signaling effect was evident.Thus, if there was a small transient reinforce-ment effect, it was easily overcome by strongsignaling effects of foods when these favorchanging between keys. Perhaps the pure‘‘reinforcement’’ effect is simply a bias tocontinue emitting the same behavior that justpaid off, easily overcome when the subsequentcontingencies favor just one of the two keys(Conditions 4 and 6), and when they favorchanging to the other keys.

Was there evidence of extended reinforcereffects occurring across successive reinforcers?Figure 4 shows that successive same-key fooddeliveries (continuations) did progressivelyincrease preference for a key in Conditions3, 4 and 6, although the changes weregenerally small even after four successivereinforcers—much smaller than found inprocedures in which reinforcer ratios fre-quently changed across relatively short com-ponents (e.g., Davison & Baum, 2002). How-ever, the same–later conditions (Conditions 5,7 and 11) gave a very different pattern ofchoice across continuations: The pattern isalternation, as would be expected given thepreference-pulse results. While a single leftfood moved choice toward the right key,successive reinforcers on the left key produceda pattern of stronger right-key choice. This

Fig. 8. Calculated effects of left and right fooddeliveries across 60 s following those deliveries for selectedPhase 1 conditions. See text for further explanation.

76 SARAH COWIE et al.

Page 15: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

pattern was exactly reversed after right-keyfoods. The pattern of decreasing other-keychoice across successive food deliveries thusmay constitute a small, but consistent, rein-forcement effect. The increase in preferencetoward the just-reinforced key across successivefood in all Phase 1 conditions appears to beclear evidence of extended reinforcementeffects across continued food deliveries onkeys.

In summary, there was little evidence of areinforcement effect within interreinforcerintervals, in which choice was controlledstrongly by the signaling effects of fooddeliveries, but, rather, evidence of a reinforce-ment effect operating across successive fooddeliveries on a key. However, even this effectmight also be a discriminative, rather than areinforcement, effect: Clearly, continuedfoods at a key will begin to undermine thearranged contingencies (e.g., other–sooner)in some conditions: In the same–later condi-tions (Conditions 5, 7, and 11), extendedcontinuations of food deliveries on a key mightbegin to signal more frequent (but moredelayed) food deliveries on the same key.However, in the left–sooner condition (Con-dition 4), continued left-key food deliveriescould signal a high frequency of short-delayedfood deliveries, and continued right-key fooddeliveries could signal a high frequency oflong-delayed food deliveries (Condition 6being the inverse of this). Under theseconditions, choice moved toward the alterna-tive on which the continuations occurred—that is, choice moved toward the key thatappeared to be providing more food deliver-ies. In the same–sooner condition (Condition2), choice also moved toward the key that hadrecently provided more food deliveries, but inthis case, the food availabilities were at thehigher rate on both keys—and choice changeswere more extreme than in the left–sooner orleft–later conditions (Conditions 4 and 6).What is surprising is that continued lower-ratefood deliveries changed overall choice aboutthe same amount as continued higher-ratefood deliveries in Conditions 4 and 6 (that is,the preference trees for these conditions weresymmetric) when high-rate food continuationshad a greater effect on choice in Condition 2.However, in all conditions, continued fooddeliveries on a key constituted training for‘‘stay at a key’’ independent of the time to the

next food, choice coming more under controlof food frequency than of the expected timesto the next food on keys.

Clearly, locally, when food on either keysignaled no differential reinforcer ratios overtime since food (Conditions 1 and 3), therecould be, and was, very little signaling effect offood. When the VI 5-s and VI 50-s schedulesremained on a particular key for the durationof a condition (left–sooner or right–sooner;Conditions 4 and 6), changes in choice closelyfollowed changes in the obtained log foodratio for the duration of the interfood interval,and thus there was a strong signaling effectthat spanned the whole of the next interfoodinterval. When food deliveries on the left andright keys signaled subsequent reinforcerratios that followed the location of the lastfood (same–sooner; Condition 2), the signal-ing effect was smaller and lasted only 8 to 10 s.Thereafter, log response ratios deviated fromthe obtained log food ratio to settle at a levelreflecting the sessional mean food ratio.Finally, when food deliveries on the left andright keys signaled subsequent reinforcerratios that were opposite from the last-foodlocation (same–later; Conditions 5, 7, and 11),the signaling effect was transient, producingbrief but consistent preference for the not-just-reinforced key, a clear, albeit transitory,signaling effect.

Thus, only when any food signaled thatreinforcers would likely follow sooner on a keyindependent of the last-reinforcer location,did the signaling effect last the whole interre-inforcer interval. Of course, only in the left–sooner and left–later conditions (Conditions 4and 6; ignoring Conditions 1 and 3) was thelast reinforcer location irrelevant to howreinforcer ratios changed in the next interre-inforcer interval—all that was relevant was thetime elapsed since the last food delivery.

Phase 2 of this experiment asked whethersignaling the last food location would enhancecontrol over choice across the next interrein-forcer interval, presumably making conditionsin which the last-food location was relevant,more like Conditions 4 and 6. The results fromPhase 2 supported this interpretation. Theaddition of the signal for the last-food locationdid not change the way interreinforcer choicechanged in the signaled left–later condition(Condition 9), which was similar to Condition6—in Condition 9, the signal for the last-food

CHOICE AND FUTURE FOOD 77

Page 16: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

location would be irrelevant. But in Condi-tions 8 (equivalent to Conditions 5, 7 and 11)and 10 (equivalent to Condition 2), choice wasunder precise control of the obtained foodratios across the interreinforcer interval. Inparticular, choice following a left or right fooddelivery in Condition 8 strongly favored initialpreference to the right or left keys respectively,and preference changes in this condition werealmost, but not quite, the inverse of those inCondition 10.

The reasons for the patterns of choicechanges across successive food deliveries inPhase 2 (Figure 7) seem less obvious—forinstance, the pattern in Condition 8 (same–later) was similar to that seen in Condition 2(same–sooner; Figure 4), but they can beunderstood by looking at the mean interre-inforcer choice shown in Figures 3 and 6.Mean interfood choice depends on therelative number of responses on the left andright keys across the whole interfood interval.In Condition 2, and more so in Condition 8,more responses were emitted on the left keyfollowing a left-key food delivery than on theleft following a right-key food delivery simplybecause more of the interreinforcer intervalwas spent choosing left after a right fooddelivery. In Condition 2, this was because ofthe immediate postfood pulses; in Condition8, it was because more time was spentresponding on the left key after a left fooddelivery toward the end of the interreinforcerinterval. As a result, the preference tree forCondition 8 appears to favor responding leftafter a left food delivery, even though theimmediate postfood preference was to theright key. The reverse is true of Condition 10(same–sooner), in which the preference treessuggest preference for the left key after aright food delivery. Condition 9 (left later)gave a similar preference tree to Conditions5, 7 and 11 (same later) for a similar reason:For example, following either a left or rightfood delivery, preference transiently movedto the right key, but more responses wereemitted to the left key over the interreinforc-er interval. This difference provided thestrong overall preference for the left key(the later reinforcer) in Conditions 6 and 9.However, the mean interreinforcer choicefollowing a left food delivery in Condition 9was more to the left, and following a rightreinforcer was more to the right, giving the

preference-tree structure shown in Figure 7—with this pattern reversed in Condition 6.Hence, the differences in tree structure areexplicable. However, it is worth highlightingthat preference trees can clearly provideconfusing information when choice changesacross interreinforcer intervals. It would bepossible to interpret the Condition 8 prefer-ence tree as suggesting that only in this same–later condition did successive food deliveriesfrom a key enhance choice toward that key—as indeed it did, in one sense—but thisconclusion would rest on the different aver-age times to food, and would not reflect theprecise control by the location of the lastfood delivery and the time since the last fooddelivery shown in local choice within theinterreinforcer intervals. An extended-levelexample makes this problem clearer: If VI 5-sVI 300-s schedules had been arranged on thekeys in Condition 9, preference within theinterreinforcer intervals would have lookedsimilar, but the interfood preference towardthe right, later, key would have been consid-erably greater than in VI 5 s VI 50 s becauseof the greater amount of the interreinforcerinterval spent responding on the right key.Clearly, overall preference in such situationswill be controlled by the inverse of the overallreinforcer-frequency ratio, a result whichappears incompatible with the generalizedmatching law (Baum, 1974). Thus, extendedmeasures of choice, be they mean choiceacross interreinforcer intervals or choiceacross whole sessions, fail to represent choiceaccurately when there is local control ofchoice within interreinforcer intervals.

The present results showed remarkably goodcontrol by time, up to 60 s, since food inConditions 4, 6, 8, 9 and 10—conditions inwhich either the postfood reinforcer ratios weresimply located, or in which the location of thelast food delivery was signaled throughout theinterval. For example, in Condition 10, at 60 safter food, preference was strongly to the rightkey when the left was lit red, and strongly to theleft key when the right key was red. But thepresent experiment was not designed to inves-tigate timing, and ought not be used for thispurpose. It should be mentioned, however, thatthe present results do raise questions aboutfood contingencies and signals in timingresearch—an aspect of control that has beenrather neglected in that area.

78 SARAH COWIE et al.

Page 17: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

What do the present results tell us aboutpreference pulses—periods of increased pref-erence following food deliveries, usually butnot invariably toward the key that just providedfood? They likely arise when food deliveriesthemselves, or other stimuli, signal immediatelocal deviations of food ratios from the overallor sessional ratio arranged—and such localdeviations will be directly produced by thearrangement of changeover delays or change-over ratios. Thus, Krageloh and Davison(2003) found that postfood preference pulseswere absent when no changeover delay wasarranged. Further, Davison, Elliffe and Marr(2010) found no postfood preference pulsesin conventional concurrent VI VI scheduleperformance when no changeover delay wasarranged (Phase 1)—they presented no pref-erence-pulse analyses simply because no pulseswere present. Preference pulses thus do notmeasure the effects of the just-received food,but rather the effects of the obtained foodratio following food. Furthermore, this effect isdynamic, because enhanced preference to akey drives obtained food ratios in the directionof the local preference. Preference followingfood is controlled by time signaling local foodratios—and, inasmuch as local food ratios areconsonant with the likely location of food atthe end of an interfood interval, they maysignal the next food location. It is this lattereffect that may produce changing interfoodpreference across continuations of food deliv-eries on a key. It is apparent that foodreinforcement does not produce changes inchoice by increasing prior behavior—rather,the contingencies signaled by the food deliveryitself, and by the time since the food delivery,drive the behavior. It remains the case that, iffood signals more food at a location, prefer-ence for that location will increase as impliedby the law of effect. The caveat is that food maynot always signal more food at a key, but maysignal more food at a different key—each fooddelivery signals the start of a time period with achanged food ratio, and this food-ratio changecan be further amplified by a reversal ofpreference.

In summary, food delivery at a responsealternative can control local choice whencontingencies systematically change after suchfood deliveries, and we need to be aware ofsuch signaling effects. It is not known to whatextent such effects have occurred in data on

choice that have been reported. Take, forexample, concurrent VI 5-s VI 100-s perfor-mance. The degree of signaling that may occurwill depend on how the schedules arearranged. With nonindependent exponential(also termed constant-probability, or random-interval) schedules, the expected time to thenext food delivery does not change with timesince food. Thus, any food delivery signals a 5-saverage interval on one key, and a 100-saverage interval on the other. The situationwill be similar to Condition 6 (left sooner,right later) here. It would be expected thatpreference immediately after any food deliverywould be to the VI 5-s key, but would notsubsequently increase to the right key becauseexpected times to food remain constant. Butother ways of arranging concurrent VI VIschedules will give subtly different effects: Forexample, if the interreinforcer intervals arearranged using randomized lists of eitherexponentially or arithmetically determinedintervals, expected times to food will not stayconstant over time since food, and a reversal ofpreference may follow the initial preferencepulse as in Conditions 4 and 6. Such a reversalappears to be a recipe for extended-levelundermatching (Baum, 1974). However, it isbeyond the scope of this paper to consider thetheory of these differences, and arguably anempirical approach to determine whetherfood deliveries do act as signals in suchsituations is preferable.

It is apparent, however, that control by fooddelivery can occur in some concurrent sched-ules. For example, White and Davison (1973)investigated performance on concurrent fixed-interval (FI) FI schedules and reported that,under some conditions, FI scallops developedon both keys, in some just on one key, and insome on neither key. The scalloped patternsclearly show control by changing food ratiosacross time since food on a key. Additionally,the results of Krageloh et al. (2005) showedthat prior food location can control subse-quent interreinforcer choice when food sig-nals particular conditional probabilities ofsubsequent food locations. It is clear thatsignaling effects of food can happen whenthe conditions support that—the question is,under what conditions does the signalingeffect become important and override anypurely reinforcing (or response-continuation)effects?

CHOICE AND FUTURE FOOD 79

Page 18: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

REFERENCES

Baum, W. M. (1974). On two types of deviation from thematching law: bias and undermatching. Journal of theExperimental Analysis of Behavior, 22, 231–242.

Boutros, N., Davison, M., & Elliffe, D. (2009). Conditionalreinforcers and informative stimuli in a constantenvironment. Journal of the Experimental Analysis ofBehavior, 91, 41–60.

Boutros, N., Davison, M., & Elliffe, D. (2011). Contingentstimuli signal subsequent reinforcer ratios. Journal ofthe Experimental Analysis of Behavior, 96, 39–61.

Catania, A. C., & Reynolds, G. S. (1968). A quantitativeanalysis of the responding maintained by intervalschedules of reinforcement. Journal of the ExperimentalAnalysis of Behavior, 11, 327–383.

Church, R. M., & Lacourse, D. M. (2001). Temporalmemory of interfood interval distributions with thesame mean and variance. Learning and Motivation, 32,2–21.

Davison, M., & Baum, W. M. (2000). Choice in a variableenvironment: Every reinforcer counts. Journal of theExperimental Analysis of Behavior, 74, 1–24.

Davison, M., & Baum, W. M. (2002). Choice in a variableenvironment: Effects of blackout duration and extinc-tion between components. Journal of the ExperimentalAnalysis of Behavior, 77, 65–89.

Davison, M., & Baum, W. M. (2006). Do conditionalreinforcers count? Journal of the Experimental Analysis ofBehavior, 86, 269–283.

Davison, M., & Baum, W. M. (2007). Local effects ofdelayed food. Journal of the Experimental Analysis ofBehavior, 87, 241–260.

Davison, M., & Baum, W. M. (2010). Stimulus effects onlocal preference: Stimulus–response contingencies,stimulus–food pairing, and stimulus–food correlation.Journal of the Experimental Analysis of Behavior, 93,45–59.

Davison, M., Elliffe, D., & Marr, M. J. (2010). The effects ofa local negative feedback function between choiceand relative reinforcer rate. Journal of the ExperimentalAnalysis of Behavior, 94, 197–207.

Davison, M., & Nevin, J. A. (1999). Stimuli, reinforcers,and behavior: An integration. Journal of the Experimen-tal Analysis of Behavior, 71, 439–482.

Elliffe, D., & Alsop, B. (1996). Concurrent choice: Effectsof overall reinforcer rate and the temporal distribu-tion of reinforcers. Journal of the Experimental Analysis ofBehavior, 65, 445–463.

Herrnstein, R. J. (1961). Relative and absolute strength ofresponse as a function of frequency of reinforcement.Journal of the Experimental Analysis of Behavior, 4,267–272.

Krageloh, C. U., & Davison, M. (2003). Concurrent-schedule performance in transition: changeoverdelays and signaled reinforcer ratios. Journal of theExperimental Analysis of Behavior, 79, 87–109.

Krageloh, C. U., Davison, M., & Elliffe, D. M. (2005). Localpreference in concurrent schedules: the effects ofreinforcer sequences. Journal of the Experimental Analy-sis of Behavior, 84, 37–64.

Olton, D. S., & Samuelson, R. J. (1976). Remembrance ofplaces passed: Spatial memory in rats. Journal ofExperimental Psychology: Animal Behavior Processes, 2,97–116.

Randall, C. K., & Zentall, T. R. (1997). Win-stay/lose-shiftand win-shift/lose-stay learning by pigeons in theabsence of overt response mediation. BehaviouralProcesses, 41, 227–236.

Schneider, B. A. (1969). A two-state analysis of fixed-interval responding in the pigeon. Journal of theExperimental Analysis of Behavior, 12, 677–687.

Schneider, S. M., & Davison, M. (2006). Molecular order inconcurrent response sequences. Behavioural Processes,73, 187–198.

Skinner, B. F. (1938). The behavior of organisms. New York:Appleton-Century.

Thorndike, E. L. (1911). Animal intelligence. New York:Macmillan.

White, A. J., & Davison, M. C. (1973). Performance inconcurrent fixed-interval schedules. Journal of theExperimental Analysis of Behavior, 19, 147–153.

Received: November 22, 2010Final Acceptance: March 21, 2011

80 SARAH COWIE et al.

Page 19: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

Fig. A2. Phase 1, Condition 2. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition. The initial point for Pigeon 22 fell offthe graph.

Fig. A1. Phase 1, Condition 1. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition.

CHOICE AND FUTURE FOOD 81

Page 20: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

Fig. A4. Phase 1, Condition 4. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition. Some initial points fell off the graphs.

Fig. A3. Phase 1, Condition 3. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition.

82 SARAH COWIE et al.

Page 21: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

Fig. A6. Phase 1, Condition 6. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition.

Fig. A5. Phase 1, Condition 5. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition.

CHOICE AND FUTURE FOOD 83

Page 22: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

Fig. A8. Phase 1, Condition 11. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition.

Fig. A7. Phase 1, Condition 7. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition.

84 SARAH COWIE et al.

Page 23: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

Fig. B2. Phase 2, Condition 9. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition.Fig. B1. Phase 2, Condition 8. Log response ratio as a

function of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition. The initial point for Pugeon 22 fell offthe graph.

CHOICE AND FUTURE FOOD 85

Page 24: REINFORCEMENT: FOOD SIGNALS THE TIME AND LOCATION OF

Fig. B3. Phase 2, Condition 10. Log response ratio as afunction of time since food, and mean extended-levelpreference, for Pigeons 21 to 26 according to the locationof the last food delivery. The data are the last 65 sessions ofthis condition.

86 SARAH COWIE et al.