chapter7 learning - cankaya.edu.tr


Page 1: chapter7 learning - cankaya.edu.tr

11/19/15


Chapter 7 – Schedules of Reinforcement

Operant Conditioning

Schedules of Reinforcement

-  Does each lever press by the rat result in food, or are several lever presses required?

-  Did your mom give you a cookie each time you asked for one, or only some of the time?

-  A continuous reinforcement schedule (CRF) is one in which each specified response is reinforced.

-  An intermittent (or partial) reinforcement schedule (PRF) is one in which only some responses are reinforced.

EXAMPLES - Schedules of Reinforcement

-  Each time you flick the light switch, the light comes on. The behavior of flicking the light switch is on a(n) _____________ schedule of reinforcement.

-  When the weather is very cold, you are sometimes unable to start your car. The behavior of starting your car in very cold weather is on a(n) _____________ schedule of reinforcement.

Schedules of Reinforcement

-  4 types of intermittent schedules:
   -  Fixed Ratio (FR)
   -  Variable Ratio (VR)
   -  Fixed Interval (FI)
   -  Variable Interval (VI)


Schedules of Reinforcement

-  Fixed Ratio (FR)
-  Reinforcement is contingent upon a fixed, predictable number of responses. Note that an FR 1 schedule is the same as a CRF schedule, in which each response is reinforced.

Schedules of Reinforcement

-  Fixed Ratio (FR)
-  On a fixed ratio 5 schedule (abbreviated FR 5), a rat has to press the lever 5 times to obtain a food pellet. On an FR 50 schedule, it has to press the lever 50 times.
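As a minimal sketch (Python, not from the slides; the function name is made up for illustration), an FR schedule is just a counter that pays off on every Nth response:

```python
def fr_schedule(ratio):
    """Fixed ratio: deliver a reinforcer on every `ratio`-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == ratio:
            count = 0
            return True   # food pellet delivered
        return False
    return respond

# FR 5: the rat must press the lever 5 times per pellet
press = fr_schedule(5)
outcomes = [press() for _ in range(10)]
print(outcomes)  # only presses 5 and 10 are reinforced
```

An FR 1 schedule under this sketch reinforces every response, which is exactly the CRF case noted above.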


Schedules of Reinforcement

-  Fixed Ratio (FR)
-  FR schedules generally produce a high rate of response along with a short pause following the attainment of each reinforcer. This short pause is known as a postreinforcement pause.
-  E.g. a rat will take a short break following each reinforcer.
-  Higher ratio requirements produce longer postreinforcement pauses.

Schedules of Reinforcement

-  Fixed Ratio (FR)
-  An FR 200 schedule of reinforcement will result in a (longer/shorter) _____________ pause than an FR 50 schedule.
-  Schedules in which the reinforcer is easily obtained are said to be very dense or rich, while schedules in which the reinforcer is difficult to obtain are said to be very lean.
-  E.g. an FR 5 schedule is considered a very dense schedule of reinforcement compared to an FR 100.
-  An FR 12 schedule of reinforcement is (denser/leaner) _____________ than an FR 100 schedule.


Schedules of Reinforcement

-  Variable Ratio (VR)
-  Reinforcement is contingent upon a varying, unpredictable number of responses.
-  On a variable ratio 5 (VR 5) schedule, a rat has to emit an average of 5 lever presses for each food pellet, with the number of presses required on any particular trial varying between, say, 1 and 10 (e.g. 3, 7, 5).
-  E.g. gambling, lottery
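A VR schedule can be sketched the same way, except the required count is redrawn at random after each reinforcer (a rough Python illustration; the 1-to-(2N−1) range is an assumption chosen so the draws average N, not something the slides specify):

```python
import random

def vr_schedule(mean_ratio, rng=random.Random(0)):
    """Variable ratio: each reinforcer requires an unpredictable
    number of responses whose average is `mean_ratio`."""
    required = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)  # new, unpredictable requirement
            return True
        return False
    return respond

# VR 5: about one pellet per 5 presses on average, never predictable
press = vr_schedule(5)
reinforced = sum(press() for _ in range(1000))
print(reinforced)  # roughly 1000 / 5 = 200 reinforcers
```

Because the next payoff could always be one press away, there is no point at which pausing is "safe" — which matches the high, steady response rate described below for VR.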


Schedules of Reinforcement

-  Variable Ratio (VR)
-  VR schedules generally produce a high and steady rate of response with little or no postreinforcement pause.

Schedules of Reinforcement

-  Fixed Interval (FI)
-  Reinforcement is contingent upon the first response after a fixed, predictable period of time, e.g. receiving your salary after a 1-month period.

Schedules of Reinforcement

-  Fixed Interval (FI)
-  For a rat on a fixed interval 30-second (FI 30-sec) schedule, the first lever press after a 30-second interval has elapsed results in a food pellet. Following that, another 30 seconds must elapse before a lever press will again produce a food pellet.
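The FI 30-sec rule above can be written as a small timing check (a Python sketch for illustration; the specific press times are invented):

```python
def fi_schedule(interval):
    """Fixed interval: the first response at least `interval` seconds
    after the previous reinforcer is reinforced."""
    last = 0.0
    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t   # restart the interval from this reinforcer
            return True
        return False
    return respond

# FI 30-sec: presses at 10 s and 25 s go unreinforced; the first press
# past the 30-s mark (at 31 s) pays off and restarts the clock
press = fi_schedule(30)
results = [press(t) for t in (10, 25, 31, 40, 62)]
print(results)  # [False, False, True, False, True]
```

Note that responding faster changes nothing here — only the first press after the interval matters, which is why FI responding tends to bunch up near the end of each interval.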



Schedules of Reinforcement

-  Fixed Interval (FI)
-  FI schedules often produce a “scalloped” (upwardly curved) pattern of responding, consisting of a postreinforcement pause followed by a gradually increasing rate of response as the interval draws to a close.

Schedules of Reinforcement

-  Variable Interval (VI)
-  Reinforcement is contingent upon the first response after a varying, unpredictable period of time.
-  For a rat on a variable interval 30-second (VI 30-sec) schedule, the first lever press after an average interval of 30 seconds will result in a food pellet, with the actual interval on any particular trial varying between, say, 1 and 60 seconds.
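A VI schedule is the FI sketch with the delay redrawn at random after each reinforcer (Python, for illustration only; the uniform 0-to-60-second draw is an assumption matching the "1 and 60 seconds" range above, and the every-5-seconds pressing pattern is invented):

```python
import random

def vi_schedule(mean_interval, rng=random.Random(1)):
    """Variable interval: the first response after an unpredictable
    delay (averaging `mean_interval` seconds) is reinforced."""
    wait = rng.uniform(0, 2 * mean_interval)
    last = 0.0
    def respond(t):
        nonlocal wait, last
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_interval)  # next unpredictable delay
            return True
        return False
    return respond

# VI 30-sec: a rat pressing steadily every 5 s for 10 minutes
press = vi_schedule(30)
reinforced = sum(press(t) for t in range(0, 600, 5))
print(reinforced)  # about one pellet per ~30 s on average
```

Since reinforcement could become available at any moment but extra presses never speed it up, steady moderate responding is the sensible strategy — the pattern described on the next slide.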


Schedules of Reinforcement

-  Variable Interval (VI)
-  VI schedules usually produce a moderate, steady rate of response with little or no postreinforcement pause.



Schedules of Reinforcement

-  On ________________ schedules, the reinforcer is largely time contingent, meaning that the rapidity with which responses are emitted has (little/considerable) _______________ effect on how quickly the reinforcer is obtained.

-  In general, ______________ schedules produce postreinforcement pauses because obtaining one reinforcer means that the next reinforcer is necessarily quite (distant/close) _________.

-  In general, (variable/fixed) ________________ schedules produce little or no postreinforcement pausing because such schedules provide the possibility of relatively immediate reinforcement, even if one has just obtained a reinforcer.

Chapter 7 – Theories of Reinforcement

Operant Conditioning


Theories of Reinforcement

-  Clark Hull (Drive Reduction Theory)
-  Sheffield (Drive Induction Theory)
-  D. Premack (Premack Principle)
-  Timberlake & Allison (Response Deprivation Hypothesis)

Clark Hull (Drive Reduction Theory)

-  Food is a reinforcer because the hunger drive is reduced when you obtain it. When a stimulus is associated with a reduction in some type of physiological drive, we can call that stimulus ‘reinforcing,’ and the behavior that the organism performs before the drive reduction is strengthened.

-  E.g. if a hungry rat in a maze turns left just before it finds food in the goal box, the act of turning left in the maze will be automatically strengthened.

Clark Hull (Drive Reduction Theory)

-  The problem with this theory is that some reinforcers do not seem to be associated with any type of drive reduction. E.g. a rat will press a lever to obtain access to a running wheel.

-  So, as opposed to an internal drive state, incentive motivation could be the key.

-  E.g. playing a video game for the fun of it, or attending a concert because you enjoy the music.

-  Going to a restaurant for a meal might be largely driven by hunger; however, the fact that you prefer a restaurant that serves hot, spicy food is an example of incentive motivation.

Sheffield (Drive Induction Theory)

According to Sheffield, Hull explained only half of the story; the other half requires something else. It is not drive reduction but drive induction that makes a stimulus a reinforcer. E.g. the rabbit-and-carrot demonstration: the animal learns to respond to the experimenter and the carrot, yet it never eats the carrot. Where is the drive reduction here? Sheffield says you can support learning by allowing induction of a drive, without allowing reduction of it.


Sheffield (Drive Induction Theory)

Sexual behavior is similar. A clever barber puts some Playboy magazines on the desk; while customers are waiting, they read them, and they unconsciously return to the same barber again and again. This is induction of the sexual drive. Inducing the drive is itself reinforcing (SR)!

Sheffield (Drive Induction Theory)

-  Male and female rats (no drive reduction, but drive induction)
-  male & male version

Sheffield (Drive Induction Theory)

In a factory, you have a standard payment for your workers. As a manager, you promise that if they produce more, you will increase their salary. When you make this promise, you are not reducing anything; just the opposite, you are inducing something new. If you want a very strong reinforcer (SR), you should combine Hull and Sheffield: first induce the drive, then give the opportunity to reduce it. Drive induction followed by drive reduction!

Sheffield (Drive Induction Theory)

In order to sell a product, first create a need state, and then offer the product that reduces the desire: create fear of perspiration, then offer the deodorant. Induce the drive, then reduce it. This is the general strategy of commercials. SR has two legs: induction of the drive and reduction of the drive.


Premack Principle

-  Reinforcers can often be viewed as behaviors rather than stimuli. For example, rather than saying that lever pressing was reinforced by food (a stimulus), we could say that lever pressing was reinforced by the act of eating food (a behavior).

-  Then the process of reinforcement can be conceptualized as a sequence of two behaviors: (1) the behavior that is being reinforced, followed by (2) the behavior that is the reinforcer. Moreover, by comparing the frequency of various behaviors, we can determine whether one can be used as a reinforcer for the other.

Premack Principle

-  A high-probability behavior can be used to reinforce a low-probability behavior.

-  We should first determine the free-choice preference rates of the behaviors.

-  E.g. for a hungry rat, eating food is the high-probability behavior (HPB) and running in a wheel is the low-probability behavior (LPB), so eating can reinforce running. On the other hand, when the rat is not hungry, the preferences can reverse, and running can then reinforce eating.
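The comparison step can be made concrete (a Python sketch; the 75%/25% free-choice rates are hypothetical numbers assumed for illustration, not data from the slides):

```python
# Hypothetical free-choice observation: fraction of session time the
# rat spends on each activity when both are freely available.
free_choice = {"eating": 0.75, "wheel_running": 0.25}

def premack_reinforcers(behaviors, target):
    """Premack principle: any behavior more probable than `target`
    under free choice can serve as a reinforcer for it."""
    return [b for b, p in behaviors.items() if p > behaviors[target]]

# For a hungry rat, eating (HPB) can reinforce wheel running (LPB)
print(premack_reinforcers(free_choice, "wheel_running"))  # ['eating']
print(premack_reinforcers(free_choice, "eating"))         # [] - nothing can reinforce the HPB
```

Swapping the two probabilities (a sated rat that prefers running) reverses the answer, which is the reversibility point made above.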


Premack Principle

-  More probable behaviors will reinforce less probable behaviors.

-  This principle is like Grandma’s rule: first you work (a low-probability behavior), then you play (a high-probability behavior).

-  First eat your spinach, and then you can get your ice cream.

-  If you drink five cups of coffee each day and only one glass of orange juice, then the opportunity to drink ________ can likely be used as a reinforcer for drinking ________.

Premack Principle

-  One problem is that the probabilities of behaviors might fluctuate, so they can be difficult to measure; the principle does not fit well in the lab, and there may be error in determining the free-choice preference rates.

-  Another problem arises when two behaviors have the same probability. Initial probabilities are also not stable over time: every reinforcer (SR) loses its reinforcement power.

   Free-choice preference: eating spinach 10%, eating ice cream 90%


Premack Principle

Every time you reinforce another response, you change the probability of that response, so we have variability in the reinforcement event. The erosion of SR: after extensive use of the same reinforcer, it begins to show a decline. This is called the erosion effect.

Response Deprivation Theory (Timberlake & Allison, 1974)

-  A behavior can be used as a reinforcer if access to the behavior is restricted so that its frequency falls below its baseline rate of occurrence.

-  There is no need to know the relative probabilities of the two behaviors beforehand; the frequency of one behavior relative to its baseline is the important aspect.

-  Example:
   -  A man normally studies 60 min and exercises 30 min a day.
   -  Schedule: every 20 min of study earns 5 min of exercise.
   -  Exercise is the deprived behavior.
   -  Prediction: study time should increase.
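The deprivation check in the example above reduces to one comparison of ratios (a Python sketch; the function name and its second example schedule are invented for illustration):

```python
def deprives(baseline_reinforcer, baseline_instrumental, earned, required):
    """Response deprivation: the contingency should work as reinforcement
    when it holds the reinforcer behavior below its baseline proportion,
    i.e. earned/required < baseline_reinforcer/baseline_instrumental."""
    return earned / required < baseline_reinforcer / baseline_instrumental

# Baseline: 60 min study, 30 min exercise per day.
# Schedule: every 20 min of study earns 5 min of exercise.
# 5/20 = 0.25 < 30/60 = 0.50, so exercise is deprived and
# study time is predicted to increase.
print(deprives(30, 60, earned=5, required=20))   # True

# A generous schedule (15 min exercise per 20 min study) would
# not deprive exercise, so no increase in studying is predicted.
print(deprives(30, 60, earned=15, required=20))  # False
```

Note the contrast with Premack: no free-choice probabilities are needed, only each behavior's own baseline.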


Response Deprivation Theory (Timberlake & Allison, 1974)

-  Example: a rat typically runs for 1 hour a day whenever it has free access to a running wheel (the rat’s preferred level of running).
-  If the rat is then allowed free access to the wheel for only 15 minutes per day, it will be unable to reach this preferred level (deprivation).
-  So, the rat will now be willing to press a lever to obtain additional time on the wheel.
