Chapter 10: Theories of Learning
Unit 4 – AOS 1: Learning
Pages 452-540
Study Design Content
• applications of, and comparisons of, learning theories:
– classical conditioning as informed by Ivan Pavlov: roles of neutral, unconditioned, conditioned stimuli; unconditioned and conditioned responses
– applications of classical conditioning: graduated exposure, aversion therapy, flooding
– three-phase model of operant conditioning as informed by B.F. Skinner: positive and negative reinforcement, response cost, punishment and schedules of reinforcement
– applications of operant conditioning: shaping, token economies
– comparisons of classical and operant conditioning in terms of the processes of acquisition, extinction, stimulus generalisation, stimulus discrimination, spontaneous recovery, role of learner, timing of stimulus and response, and nature of response (reflexive/voluntary)
– one-trial learning with reference to taste aversion as informed by John Garcia and Robert A. Koelling (1966)
– trial-and-error learning as informed by Edward Lee Thorndike’s puzzle-box experiment
– observational learning (modelling) processes in terms of the role of attention, retention, reproduction, motivation, reinforcement as informed by Albert Bandura’s (1961, 1963a, 1963b) experiments with children
– insight learning as informed by Wolfgang Köhler
– latent learning as informed by Edward Tolman
• the extent to which ethical principles were applied to classic research investigations into learning, including John Watson’s ‘Little Albert’ experiment
Conditioning
• There are many different types of learning, many of which will be examined in this chapter:
▫ Classical conditioning
▫ Trial and error learning
▫ Operant conditioning
▫ One trial learning
▫ Observational learning
▫ Insight learning
▫ Latent learning
• Conditioning is the process of learning associations between a stimulus in the environment (one event) and a behavioural response (another event).
Classical Conditioning
• Classical conditioning was first described by Russian physiologist Ivan Pavlov (1899) while he was conducting research into the digestive system of dogs.
• Fig 10.1 – 10.3 – Pavlov and his Dogs, pg. 453-454
• In the course of his research, Pavlov observed that the dogs salivated not only at the sight of the food and when the food entered their mouths, but also at the sight or sound of the laboratory technician who was feeding them.
Pavlov’s Dogs
• These unintentional observations intrigued Pavlov, and he decided to conduct experiments under controlled conditions in order to systematically investigate the phenomenon.
• Pavlov’s subsequent experiments provided clear evidence of a very simple form of learning based on the repeated association of two different stimuli.
• A stimulus is any event that produces a response from an organism.
• A response is a reaction by an organism to a stimulus.
• In Pavlov’s experiment, the stimulus of food initially produced the response of salivation.
• Eventually the sight or sound of the technician became the stimulus that produced the salivation response.
• The salivation response, which is controlled by the autonomic nervous system and occurs involuntarily, had now been associated with and conditioned to a new stimulus (the sight or sound of the technician).
• The process by which the dog learned to associate the sight or sound of the technician with food is, in essence, the process of classical conditioning.
• Classical conditioning refers to a simple form of learning which occurs through repeated association of two different stimuli.
• Learning is said to have occurred when a particular stimulus consistently produces a response that it did not previously produce.
• It results from pairing this stimulus with another over a number of trials until the stimulus becomes associated or linked with the response.
• In further studies, Pavlov found that he could bring on the salivation response with stimuli such as a bell, music, touch, light and even the sight of a circle.
Elements of Classical Conditioning
• There are four key terms used to describe the process of classical conditioning. These are:
▫ Unconditioned Stimulus (UCS)
▫ Unconditioned Response (UCR)
▫ Conditioned Stimulus (CS)
▫ Conditioned Response (CR)
Unconditioned Stimulus (UCS)
• The unconditioned stimulus (UCS) is any stimulus which consistently produces a particular naturally occurring automatic response.
• In Pavlov’s experiments, the unconditioned stimulus was the food.
• Another example of a UCS is the feeding of a cat. This UCS will cause the cat to rub its body against the person’s leg.
Unconditioned Response (UCR)
• The unconditioned response (UCR) is the response which occurs automatically when the unconditioned stimulus is presented.
• A UCR is a reflexive, involuntary response that is predictably caused by a UCS.
• In Pavlov’s experiments, the UCR was the dogs’ salivation in the presence of food.
• In the example of the cat, the UCR would be the action of running up to the person feeding it and rubbing against them.
Conditioned Stimulus (CS)
• The conditioned stimulus (CS) is the stimulus that is neutral at the start of the conditioning process and does not normally produce the unconditioned response; through repeated association with the unconditioned stimulus, however, it comes to trigger the same response as the UCS.
• Association refers to the pairing or linking of one stimulus with another stimulus (usually a stimulus that would not normally produce an automatic response).
• In Pavlov’s experiments, the bell was the conditioned stimulus that was linked with the food given to the dog (UCS).
• For the cat, the CS may be the appearance of the can opener or the sound of the can.
• The cat learns to associate this sight or sound (CS) with the pleasurable sensation of feeding (UCS).
Conditioned Response (CR)
• The conditioned response (CR) is the learned response that is elicited by the conditioned stimulus (CS).
• The CR occurs after the CS has been associated with the UCS.
• Pavlov’s dogs demonstrated the conditioned response when the food (UCS) was removed and the bell (CS) alone caused the dogs to salivate.
• Fig 10.6 – Process of Classical Conditioning, pg. 457
Classical Conditioning
• Learning Activity 10.2 – Review Questions, pg. 459
• Learning Activity 10.3 – Identifying Elements of Classical Conditioning, pg. 459
Classical Conditioning
• In the course of his research, Pavlov distinguished five major processes that may be involved in conditioning.
• These are known as:
▫ Acquisition
▫ Extinction
▫ Spontaneous Recovery
▫ Stimulus Generalisation
▫ Stimulus Discrimination
Acquisition
• Acquisition is the overall process during which the organism learns to associate two events (the CS and the UCS).
• During acquisition, the presentations of the CS and UCS occur close together in time and always in the same sequence.
• The duration of the acquisition stage is usually measured by the number of trials it takes for the CR to be acquired (learned).
• The rate of learning is often very fast in the early stages of the acquisition phase.
• One of the important considerations in classical conditioning is the timing of the CS and UCS pairing.
• Pavlov found that a very short time between presentations of the two stimuli was most effective.
Extinction
• A conditioned stimulus-response connection does not necessarily last forever, and there are situations when the association between the conditioned stimulus (CS) and the conditioned response (CR) needs to be extinguished (e.g. in therapy for a problem behaviour).
• Extinction is the gradual decrease in the strength or rate of a conditioned response when the unconditioned stimulus is no longer presented.
• Pavlov’s dogs eventually ceased salivating in response to the bell after repeated trials in which the bell was presented alone, without food.
• The rate at which extinction occurs varies between individuals.
• It also varies with the type of response (e.g. stopping a flinch compared with removing a phobia: which will take longer?).
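The shape of acquisition (fast early gains that level off) and of extinction (gradual decline once the UCS is withheld) can be illustrated with a toy model. The sketch below is an assumption made for illustration only: the saturating update rule, the `rate` value and the function names are hypothetical and are not taken from the chapter or from Pavlov's data.

```python
def acquisition_trial(strength: float, rate: float = 0.3) -> float:
    """CS paired with UCS: strength moves a fixed fraction of the way towards 1."""
    return strength + rate * (1.0 - strength)


def extinction_trial(strength: float, rate: float = 0.3) -> float:
    """CS presented alone: strength loses a fixed fraction of its current value."""
    return strength - rate * strength


strength = 0.0   # hypothetical CR strength on a 0-to-1 scale
history = []
for _ in range(10):                      # ten CS + UCS pairings
    strength = acquisition_trial(strength)
    history.append(strength)

# Gain on each trial: largest on the first trial, smaller on every later one.
gains = [history[0]] + [b - a for a, b in zip(history, history[1:])]

for _ in range(10):                      # ten CS-alone trials
    strength = extinction_trial(strength)

print([round(g, 3) for g in gains])
print(round(history[-1], 3), round(strength, 3))
```

Under these assumed numbers the per-trial gains shrink steadily, which mirrors the point that learning is fastest early in acquisition, and the response strength falls back towards zero during the CS-alone trials.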
Spontaneous Recovery
• Extinction is not always permanent.
• Spontaneous recovery of the CR may occur.
• Spontaneous recovery is the reappearance of a conditioned response (CR) when the conditioned stimulus (CS) is presented, following a rest period after the conditioned response appeared to have been extinguished.
• If extinction trials continue, spontaneous recovery will eventually cease to occur.
Stimulus Generalisation
• Once a person or an animal has learned to respond to a conditioned stimulus, other stimuli which are similar to the conditioned stimulus may also trigger the conditioned response (but usually at a reduced level).
• Pavlov observed that his dogs began to salivate when they heard noises that were similar to the sound of the bell.
• This phenomenon is called stimulus generalisation.
Stimulus Discrimination
• The opposite of stimulus generalisation, stimulus discrimination occurs when a person or animal responds to the conditioned stimulus only, and not to any other similar stimuli.
• In Pavlov’s experiment, stimulus discrimination would occur if the dog salivated only to the experimental bell, and not to any other type of bell.
• In humans, stimulus discrimination would occur if a person with a phobia of large dogs does not flinch at or fear small dogs.
Classical Conditioning
• Box 10.2 – Factors influencing classical conditioning, pg. 462
• Learning Activity 10.4 – Review questions, pg. 463
• Learning Activity 10.5 – Key terms in classical conditioning, pg. 463
• Learning Activity 10.6 – Summarising key processes of classical conditioning, pg. 464
Applications of Classical Conditioning
• Learning through classical conditioning is by no means restricted to salivating dogs. Pavlov’s scientific approach to conditioning can help us understand a great deal about our own everyday behaviour, from simple behaviour through to more complex behaviour.
Conditioned Reflexes
• Classical conditioning plays an important part in how we learn to adjust to our environment.
• Many of our behaviours that involve no conscious effort may appear to be innate, but actually arise from prior experience.
• Such behaviours have been described as conditioned reflexes.
• A conditioned reflex is an automatic response that occurs as the result of previous experience.
• By learning to associate stimuli in our everyday experience, we gain information about our environment, some of which we take for granted but which is nevertheless valuable.
• Examples of conditioned reflexes include:
▫ Going silent when the lights dim in a cinema.
▫ Hitting the brakes as soon as you see the lit brake lights of the car in front of you.
▫ Reaching for your mobile phone when you hear anyone’s phone ring.
▫ Yelling ‘BALL’ as soon as a player is tackled with the football.
▫ Responding to someone’s greeting without actually thinking about an answer.
▫ Running outside when you hear the sound of the gelati van in your neighbourhood.
Conditioned Emotional Responses
• Many people cringe at the sound of a dentist’s drill, yet other things sound similar without producing the same response.
• The sound of the drill has become a conditioned stimulus which, through association with the unconditioned stimulus (pain/discomfort), elicits a conditioned emotional response (fear).
• A conditioned emotional response is an emotional reaction that usually occurs when the autonomic nervous system produces a response to a stimulus that did not previously trigger that response.
• Learning Activity 10.7 – Review questions, pg. 467
Watson’s Experimentation with Little Albert B.
• One of the most controversial and best-known studies to use classical conditioning to intentionally condition an emotional response was first reported in 1920 by US psychologist John B. Watson and Rosalie Rayner.
• Their research was designed to test the notion that fears can be acquired through classical conditioning.
• Their research participant was Albert B., the 11-month-old son of a woman who worked at the same clinic as Watson.
• The story of Albert B. – Read pg. 467-469
• Learning Activity 10.8 – Review questions, pg. 470
• Watson’s experiment is famous because of the ethical concerns raised by the research.
• In what ways was the study in breach of ethics as we now know them?
Graduated Exposure
• In most cases a CR acquired through classical conditioning will extinguish if the UCS is not paired with the CS at least occasionally.
• However, the association is sometimes so strong and well-established that it persists over time and is difficult to extinguish unless there is some kind of intervention.
• Graduated exposure involves presenting successive approximations of the CS until the CS itself does not produce the CR.
• This technique involves gradually and progressively exposing the individual to stimuli that are increasingly similar to the CS that produces the CR, and ultimately to the CS itself.
Flooding
• Flooding involves bringing the client into direct contact with the anxiety- or fear-producing stimulus, and keeping them in contact with it until the CR is extinguished.
• It is believed that people will stop fearing the stimulus when they are exposed to it and made to realise that it is actually quite harmless.
Aversion Therapy
• When people develop behaviours that are habitual and harmful to themselves or to others, such as substance dependence, a gambling addiction or another habit, it is often difficult to help them permanently stop the unwanted behaviour.
• This is especially the case when the behaviour is immediately followed by a sense of pleasure, or relief from discomfort.
• Aversion therapy is a form of behaviour therapy that applies classical conditioning principles to inhibit or discourage undesirable behaviour by associating or pairing it with an aversive (unpleasant) stimulus such as a feeling of disgust, pain or nausea.
• A key aspect of aversion therapy is the use of punishment to suppress or weaken the undesirable behaviour.
• Aversion therapy is often used to treat alcoholism by associating alcohol (NS – CS) with a drug (UCS) that induces nausea (UCR).
• This pairing will lead alcohol (CS) to produce nausea (CR).
• Aversion therapy can have limitations for individuals with chronic alcoholism: the learned aversion often fails to generalise to situations other than those in which the learning took place.
• Case study – Brendan Fevola
• Learning Activity 10.9 – Review questions, pg. 476
Trial and Error Learning
• Classical conditioning cannot explain behaviour which is voluntary, that is, behaviour which we can control.
• Much of our learning comes from trial and error.
• Trial and error learning involves learning by trying alternative possibilities until the desired outcome is achieved.
• It involves:
▫ Motivation (a desire to achieve a goal)
▫ Exploration (an increase in activity)
▫ Responses (correct or incorrect)
▫ Reward (the correct response is made and rewarded)
Thorndike’s Experiments with Cats
• At about the same time as Pavlov, US psychologist Edward Thorndike was conducting the first noted studies of what later became known as operant conditioning.
• Thorndike designed a box in which he put a hungry cat. Outside the box he placed a piece of fish.
• The only way the cat could get to the food was to push a lever which would release the door.
• Initially the cat bit the bars and tried to claw its way out for about 10 minutes, until it accidentally knocked the lever and opened the door.
• When placed in the box again, the cat went through another series of incorrect responses until it knocked the lever again.
• Eventually the cat became progressively quicker at opening the door, and pushing the lever was no longer a random act but a deliberate one.
• Thorndike referred to this phenomenon as trial and error learning, and developed the law of effect, stating that:
▫ ‘A behaviour which is followed by satisfying consequences is strengthened (more likely to occur) and a behaviour which is followed by annoying consequences is weakened (less likely to occur).’
• In the puzzle box, the cat’s behaviour was instrumental in obtaining its release and the food, hence the term instrumental learning.
• Learning Activity 10.11 – Review questions, pg. 478
Operant Conditioning
• The term operant conditioning was coined by B.F. Skinner.
• An operant is a response (or set of responses) that occurs and acts (operates) on the environment to produce some kind of effect.
• Essentially, an operant is any behaviour that generates consequences.
• Skinner argued that any behaviour that is followed by a consequence will change in strength (more or less) and frequency depending on the nature of that consequence (reward or punishment).
• Operant conditioning is the learning process in which the likelihood of a particular behaviour occurring is determined by the consequences of that behaviour.
• It is based on the assumption that an organism will tend to repeat behaviour that has a desirable consequence and tend not to repeat behaviour that has an undesirable consequence.
Three-phase Model of Operant Conditioning
• The theory of operant conditioning has been expressed as a three-phase model based on Thorndike’s law of effect.
• The three-phase model of operant conditioning has three components:
▫ The stimulus (S) that precedes an operant response
▫ The operant response (R) to the stimulus
▫ The consequence (C) of the operant response
• This can be expressed as:
▫ Stimulus (S) – Response (R) – Consequence (C)
▫ You may also see: Antecedent (A) – Behaviour (B) – Consequence (C)
• The stimulus may refer to a single stimulus or a set of stimuli.
• The response can refer to a single response or a set of responses.
• In Thorndike’s experiment, S is the box, R is the sequence of movements needed to open the door, and C is escape and food.
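The S–R–C structure can be written down as a small record type. This is only a sketch for organising examples: the class name `OperantEpisode` and its `describe` method are hypothetical, and the Thorndike values are taken from the slide above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperantEpisode:
    stimulus: str      # S (antecedent): what precedes the operant response
    response: str      # R (behaviour): the operant response itself
    consequence: str   # C: what follows the operant response

    def describe(self) -> str:
        return f"S: {self.stimulus} -> R: {self.response} -> C: {self.consequence}"


thorndike = OperantEpisode(
    stimulus="puzzle box",
    response="sequence of movements that opens the door",
    consequence="escape and food",
)
print(thorndike.describe())
```

Writing each example in this form makes it easy to check that all three phases have been identified and that the consequence follows, rather than precedes, the response.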
Skinner’s Experiments with Rats
• B.F. Skinner (1904-1990) was inspired by Thorndike’s work and went on to develop the instrumental learning theory into what is now known as operant conditioning.
• He chose this term to emphasise that animals and people learn to operate on the environment to produce desired consequences.
• Skinner created an apparatus called a Skinner Box (Fig 10.27, pg. 481).
• A Skinner Box is a small soundproof chamber in which an experimental animal learns to make a particular response for which the consequences can be controlled by the researcher.
• It is equipped with levers that deliver food when pressed, and can also present flashing lights or mild electric shocks.
• The lever is wired to a cumulative recorder, which records when the desired or undesired response is made.
• In 1938, Skinner used the Skinner Box in a classic experiment to demonstrate operant conditioning.
• A hungry rat was placed in the box.
• The animal proceeded to scurry around the box and randomly touch parts of the floor and walls.
• Eventually the rat accidentally pressed a lever mounted on the wall.
• Immediately a pellet of food dropped into the dish and the rat ate it.
• The rat again accidentally pressed the lever and the same thing happened.
• With more and more repetitions of this behaviour, the rat’s once-random movements were replaced by more consistent lever pressing.
• Fig 10.29 – Typical responses for a rat in the Skinner Box, pg. 482
• Learning Activity 10.12 – Review questions, pg. 483
Elements of Operant Conditioning
• Some elements of operant conditioning are similar to those of classical conditioning.
• These include:
▫ Acquisition
▫ Extinction
▫ Spontaneous Recovery
▫ Stimulus Generalisation
▫ Stimulus Discrimination
• Two elements that differ, however, are:
▫ Reinforcement
▫ Shaping
Reinforcement
• Reinforcement is any stimulus (event) which subsequently strengthens or increases the likelihood of the response that it follows.
• When a dog sits or shakes hands on command, you reinforce it with a biscuit or a pat on the head.
• The stimulus that strengthens or increases the frequency of a particular response is called a reinforcer.
• The term reinforcer is often linked with the term reward, and is contrasted with punishment, which weakens a response.
Schedules of Reinforcement
• A schedule of reinforcement is a program for giving reinforcement, specifically the frequency and manner in which a desired response is reinforced.
• The schedule used will influence the speed of learning (response acquisition rate) and the strength of the learned response.
• In the acquisition phase of the learning process, learning is usually most rapid if the correct response is reinforced every time it occurs.
• The reinforcer is typically provided immediately after every correct response.
• This procedure of reinforcing every correct response after it occurs is called continuous reinforcement.
• Continuous reinforcement is almost always essential in the acquisition phase until some learning has occurred.
• However, once a correct response consistently occurs, a different reinforcement procedure can be used to maintain, increase or strengthen the response.
• During Skinner’s experiments with rats, he ran out of food pellets and was forced to deliver reinforcers less often.
• He found that responses acquired through a program of partial or intermittent reinforcement are stronger and less likely to weaken or cease than those acquired through continuous reinforcement. Why?
• Partial reinforcement is therefore the process of reinforcing some correct responses but not all of them.
• Reinforcement can be given after:
▫ A certain number of correct responses have been made (ratio).
▫ A certain amount of time has elapsed after a correct response (interval).
• Both of these can be fixed (set) or variable (irregular).
Schedules of Reinforcement
Four Main Schedules of Partial Reinforcement
Schedule | Ratio (number) | Interval (time)
Fixed (set) | Fixed-Ratio | Fixed-Interval
Variable (irregular) | Variable-Ratio | Variable-Interval
Fixed-Ratio Schedule
• A fixed-ratio schedule is when the reinforcer is given after a set (fixed), unvarying number (ratio) of desired responses have been made.
• Examples include:
▫ Giving a reinforcer for every 2nd correct response (ratio of 1:2).
▫ Giving $25 for every 1000 pamphlets delivered.
▫ Giving $3 for every bucket of cherries picked.
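The fixed-ratio rule can be sketched in a few lines. The function name `fixed_ratio_reinforced` is hypothetical, and the figure of 3500 pamphlets is an assumed input chosen for illustration; the 1:2 ratio and the $25 per 1000 pamphlets come from the examples above.

```python
def fixed_ratio_reinforced(responses: int, ratio: int) -> list[int]:
    """Return the 1-based numbers of the correct responses that earn a reinforcer."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [n for n in range(1, responses + 1) if n % ratio == 0]


# Every 2nd correct response is reinforced (ratio of 1:2).
print(fixed_ratio_reinforced(10, 2))   # [2, 4, 6, 8, 10]

# $25 for every 1000 pamphlets delivered; 3500 delivered is an assumed figure.
payments = fixed_ratio_reinforced(3500, 1000)
print(len(payments) * 25)              # 75
```

Because the ratio never varies, the learner can predict exactly which response will be reinforced, which is what distinguishes this schedule from the variable-ratio schedule.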
Variable-Ratio Schedule
• A variable-ratio schedule is when the reinforcer is given after an irregular (variable) number of correct responses (ratio).
• However, the mean number of correct responses required per reinforcer is constant.
• For example, on average about 10 of 100 correct responses will be rewarded. These can occur on responses 1, 3, 5, 17, 28, 29, 33, 48, 67 and 98. But for the next 100 responses, the 10 rewards may occur at different points.
• This schedule is a very effective system of reinforcement in terms of the speed with which a response is acquired.
• It seems that the uncertainty of when the next reinforcer is coming keeps organisms responding steadily in the desired way.
• Pokies are an example of this: payouts are set at 90% of what goes into them, but the actual payout to an individual is completely random and unknown.
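The arithmetic in the 10-in-100 example can be checked directly, and a new unpredictable block can be drawn with the same mean. This is a sketch only: `variable_ratio_positions` is a hypothetical helper, and drawing positions uniformly at random is an assumption about how the irregular pattern might be generated.

```python
import random

# Reinforced responses from the example: 10 reinforcers in 100 correct responses.
example = [1, 3, 5, 17, 28, 29, 33, 48, 67, 98]
total_responses = 100
mean_ratio = total_responses / len(example)          # 10.0 responses per reinforcer

# Gaps between successive reinforced responses are irregular.
gaps = [b - a for a, b in zip(example, example[1:])]
print(mean_ratio, gaps)


def variable_ratio_positions(total: int, reinforcers: int, rng: random.Random) -> list[int]:
    """Pick which responses are reinforced; the count is fixed, the positions are not."""
    return sorted(rng.sample(range(1, total + 1), reinforcers))


# The next block of 100 responses: same mean, different positions.
next_block = variable_ratio_positions(100, 10, random.Random(0))
print(next_block)
```

The mean stays at one reinforcer per 10 responses in both blocks, yet no single response can be predicted to pay off, which is the uncertainty the slide credits with keeping responding steady.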
Fixed-Interval Schedule
• The fixed-interval schedule involves delivery of the reinforcer after a specific or fixed period of time has elapsed since the previous reinforcer, provided the correct response has been made.
• For example, reinforcing the first correct response made after a set period (such as 2, 7, 10 or 20 seconds) has elapsed.
• This type of schedule generally produces a moderate response rate which is often erratic.
• An example of a fixed-interval schedule outside of the laboratory is when you approach a pedestrian crossing at a set of traffic lights.
• Quite often you will press the button over and over, even though the WALK signal is governed by a timer.
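A fixed-interval schedule can be sketched as a rule applied to a list of response times. The function name, the assumption that timing starts at 0, and the press times below are all hypothetical values chosen for illustration.

```python
def fixed_interval_reinforced(response_times: list[float], interval: float) -> list[float]:
    """Reinforce the first response made once `interval` has elapsed since the last reinforcer."""
    reinforced = []
    last_reinforcer = 0.0   # assumption: the clock starts at time 0
    for t in sorted(response_times):
        if t - last_reinforcer >= interval:
            reinforced.append(t)
            last_reinforcer = t
    return reinforced


# Hypothetical button presses (seconds from start) on a 10-second fixed interval.
presses = [1, 3, 9, 10, 12, 19, 21, 22, 35]
print(fixed_interval_reinforced(presses, 10))   # [10, 21, 35]
```

Most of the presses earn nothing, as with repeated presses of a pedestrian-crossing button: only the first response after the interval has elapsed is reinforced.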
Variable-Interval Schedule
• A variable-interval schedule is when the reinforcer is given after irregular or variable periods of time have passed, provided the correct response has been made.
• As with variable-ratio, there is a constant mean, but the subject is unaware of when the reinforcer will arrive.
• For example, reinforcement may be given on average once every 10 seconds.
• Using a variable-interval schedule, this may occur at the 7th, 11th, 22nd, 37th and 50th second marks (five reinforcers in 50 seconds still averages out at one every 10 seconds).
• Fishing involves a variable-interval schedule, where you do not know when the fish will strike.
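Whether a set of reinforcer times really averages out to the intended interval can be checked by converting the times into gaps. The helper `intervals_between` and the marks used here are hypothetical, chosen only to show the calculation.

```python
from statistics import mean


def intervals_between(marks: list[int]) -> list[int]:
    """Convert reinforcer times (seconds from start) into the gaps between them."""
    previous = 0
    gaps = []
    for mark in marks:
        gaps.append(mark - previous)
        previous = mark
    return gaps


# Hypothetical reinforcer times for a schedule with a mean interval of 10 seconds.
hypothetical_marks = [6, 15, 22, 41, 50]
gaps = intervals_between(hypothetical_marks)   # [6, 9, 7, 19, 9]
print(gaps, mean(gaps))
```

The individual gaps range from 6 to 19 seconds, but their mean is 10 seconds, so the schedule is irregular from the learner's point of view while remaining constant on average.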
Schedules of Reinforcement
• Learning Activity 10.13 – Review questions, pg. 488
• Learning Activity 10.14 – Identifying schedules of reinforcement, pg. 488
• Learning Activity 10.15 – Using Reinforcement to Change Behaviour, pg. 488
Positive Reinforcement
• A positive reinforcer is a stimulus that strengthens or increases the frequency or likelihood of a desired response by providing a pleasant or satisfying consequence (reward).
• Positive reinforcers are used to encourage a particular behaviour so that it occurs more frequently.
• Positive reinforcement occurs when a positive reinforcer is given or applied after the desired response has been made.
Negative Reinforcement
• A negative reinforcer is any stimulus that (when reduced, removed, or prevented) strengthens or increases the frequency or likelihood of a desired response.
• Negative reinforcement is the removal or avoidance of an unpleasant stimulus; it increases the likelihood of a response being repeated and therefore strengthens the response.
Negative Reinforcement is NOT Punishment
• Negative reinforcement is often confused with punishment.
• This occurs because both involve an unpleasant stimulus.
• The difference between the two is that:
▫ A negative reinforcer will strengthen the desired behaviour.
▫ Punishment will weaken or deter the undesirable behaviour.
• We discuss punishment next.
Positive & Negative Reinforcement
• To grasp the idea of positive and negative reinforcement, think of positive as adding something and negative as taking something away.
• Positive reinforcement (+)
• Negative reinforcement (-)
• In positive reinforcement a stimulus is given, while in negative reinforcement a stimulus is removed; both strengthen the desired response.
• Learning Activity 10.16 – Data analysis, pg. 489
Punishment
• A punisher is an unpleasant stimulus that, when closely associated with a response, weakens the response or decreases the probability of that response occurring again.
• Punishment is the delivery of a punisher following a response.
• Punishment also occurs when a response is followed by the removal of a pleasant event.
• Positive punishment involves the presentation of a stimulus, thereby decreasing or weakening the likelihood of a response occurring again.
• Negative punishment involves the removal of a stimulus, thereby decreasing or weakening the likelihood of a response occurring again.
• Negative punishment (the removal of a stimulus) is often referred to as response cost.
• Response cost may be described as involving any valued stimulus being removed, whether or not it causes the behaviour.
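The four consequence types differ on two questions: is a stimulus added or removed, and is the behaviour strengthened or weakened? A minimal sketch of that decision follows; the function name and the two example cases are hypothetical illustrations, not cases from the textbook.

```python
def classify_consequence(stimulus_added: bool, behaviour_strengthened: bool) -> str:
    """Name the operant consequence from the two defining questions."""
    if behaviour_strengthened:
        return "positive reinforcement" if stimulus_added else "negative reinforcement"
    return "positive punishment" if stimulus_added else "negative punishment (response cost)"


# A dog is given a biscuit for sitting, and sits more often afterwards.
print(classify_consequence(stimulus_added=True, behaviour_strengthened=True))

# A valued privilege is taken away after a behaviour, and the behaviour becomes less frequent.
print(classify_consequence(stimulus_added=False, behaviour_strengthened=False))
```

Note that "positive" and "negative" refer only to adding or removing a stimulus, not to whether the outcome is pleasant; the effect on behaviour decides between reinforcement and punishment.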
Comparing Reinforcement and Punishment
Factors that influence the effectiveness of reinforcement and punishment
• Simply achieving the desired response isn’t enough for operant conditioning to occur.
• What happens after the response is just as important.
• The following factors are important in determining the effectiveness of reinforcement and learning:
▫ The order of presentation
▫ Timing
▫ Appropriateness of the reinforcer
Order of Presentation
• To use reinforcement effectively, it is important that it is presented after a desired response, never before.
• This helps to ensure that the organism learns the consequence of a particular response.
Timing
• Reinforcers should be presented as close in time to the desired response as possible.
• When Skinner’s rats pressed the lever, a food pellet was dispensed immediately.
• This causes the organism to associate the desired response with the reinforcement (whether positive or negative).
Appropriateness of the Reinforcer
• For any stimulus to be a reinforcer, it must provide a pleasing or satisfying consequence for its recipient.
• Technically, it will not be known if something will act as a reinforcer until after it has been used.
• In most situations it can be assumed that if a reinforcer works for one situation, it will work in another.
Reinforcement and Punishment
• Learning Activity 10.17 – Review questions, pg. 492
• Learning Activity 10.18 – Reinforcement and punishment, pg. 493
• Learning Activity 10.19 – Concept summary, pg. 493
• Learning Activity 10.20 – Applying operant conditioning, pg. 494
Key Processes in Operant Conditioning
• The same processes (acquisition, extinction, stimulus generalisation, stimulus discrimination and spontaneous recovery) are involved in both operant and classical conditioning.
• The way in which these processes occur, however, is slightly different in operant conditioning.
Acquisition
• In operant conditioning, acquisition is the establishment of a response through reinforcement.
• The speed with which the response is established depends on the schedule of reinforcement used.
Extinction
• In operant conditioning, extinction is the gradual decrease in the strength or rate of a conditioned response following consistent non-reinforcement of the response.
• With operant conditioning, extinction occurs over time when reinforcement is no longer given.
• Extinction can depend on whether continuous or partial reinforcement was used; when partial reinforcement was used, extinction is less likely to occur.
• This is why gambling is such a difficult behaviour to extinguish.
Spontaneous Recovery
• After the apparent extinction of a response, spontaneous recovery can occur and the organism will once again show the response in the absence of any reinforcement.
• The response, however, is likely to be weaker and will probably not last very long.
Stimulus Generalisation
• Stimulus generalisation occurs in operant conditioning when the correct response is made to another stimulus which is similar to the original stimulus.
• We frequently generalise our responses from one stimulus to another: the sound of a car backfiring may be perceived as a gunshot.
Stimulus Discrimination
• Stimulus discrimination occurs in operant conditioning when an organism makes the correct response to a stimulus and is reinforced, but does not respond to any other similar stimuli.
• Skinner taught pigeons to discriminate between stimuli such as a red and a green light.
Applications of Operant Conditioning•The principles of operant conditioning
were originally applied to animals but have since been utilised and applied to people in numerous settings
•Shaping
•Token Economies
Shaping•In one experiment, Skinner decided to
train a pigeon to turn a full circle in an anticlockwise direction.
•In order to get the pigeon to perform what Skinner called ‘target behaviour’, he used an operant conditioning procedure called shaping to gradually ‘mould’ responses to the target behaviour.
Shaping• Shaping is a strategy in which a reinforcer is
given for any response that successively approximates and ultimately leads to the final desired response or target behaviour.
• Shaping is also known as the method of successive approximations.
• Skinner was able to achieve this by firstly rewarding the bird with a food pellet every time it made a slight turn to the left.
Shaping•Once it had been conditioned to turn
slightly to the left, he ceased reinforcement and started giving it a food pellet once it had then completed a ¼ circle to the left.
•The pigeon eventually learned to perform the desired response because it was reinforced for each successive step leading to the target behaviour (but not for any of the former responses).
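The shaping procedure described above can be sketched as a short simulation. This is only an illustrative sketch, not Skinner's actual procedure: the function name `shape`, the turn sizes and the intermediate criteria (a half circle before the full circle) are assumptions made for the example, and the criterion is raised after a single reinforced response to keep the code short.

```python
def shape(responses, criteria):
    """Return True/False for each response: was it reinforced?

    responses: turn sizes as fractions of a full circle (hypothetical data).
    criteria:  ascending thresholds; the last one is the target behaviour.
    """
    step = 0  # index of the approximation currently being reinforced
    reinforced = []
    for turn in responses:
        met = turn >= criteria[step]
        reinforced.append(met)
        # Once the current approximation has been reinforced, only a closer
        # approximation earns the next food pellet (former responses do not).
        if met and step < len(criteria) - 1:
            step += 1
    return reinforced


# Slight turn, quarter circle, half circle (assumed), full circle (target).
criteria = [0.05, 0.25, 0.5, 1.0]
responses = [0.05, 0.05, 0.3, 0.25, 0.5, 1.0]
print(shape(responses, criteria))
```

The second slight turn and the later quarter turn go unreinforced because the criterion has already moved on, which mirrors the point that former responses are no longer rewarded.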
Shaping• Shaping is used when the desired response has a
low probability of occurring naturally.
• Shaping is an effective procedure that works on both people and animals.
• Many tricks performed in circuses and animal shows have been learned through shaping.
• Shaping is often used by teachers to teach a new skill or series of skills to students.
Token Economies• A token economy is a setting in which an
individual receives tokens (reinforcers) for desired behaviour and these tokens can then be collected and exchanged for other reinforcers such
• Examples of token economies are present in schools, prisons, workplaces etc.
• Learning Activity 10.22 – Review questions, pg. 501
• Learning Activity 10.23 – Data analysis, pg. 501
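The way tokens are earned and later exchanged in a token economy can be sketched in a few lines. This is a minimal illustration only: the class name `TokenEconomy`, the behaviours, the token values and the back-up reinforcer are all hypothetical and are not taken from the textbook.

```python
class TokenEconomy:
    """Toy model: tokens are earned for desired behaviour and exchanged later."""

    def __init__(self, earn_rules, exchange_prices):
        self.earn_rules = dict(earn_rules)            # behaviour -> tokens earned
        self.exchange_prices = dict(exchange_prices)  # reinforcer -> token cost
        self.balance = 0

    def record(self, behaviour):
        """Give tokens for a desired behaviour; other behaviour earns nothing."""
        earned = self.earn_rules.get(behaviour, 0)
        self.balance += earned
        return earned

    def exchange(self, reinforcer):
        """Swap collected tokens for another reinforcer if enough are held."""
        price = self.exchange_prices[reinforcer]
        if price > self.balance:
            return False
        self.balance -= price
        return True


# Hypothetical classroom example.
classroom = TokenEconomy(
    earn_rules={"homework submitted": 2, "helped a classmate": 1},
    exchange_prices={"extra free time": 3},
)
classroom.record("homework submitted")
classroom.record("helped a classmate")
print(classroom.exchange("extra free time"), classroom.balance)
```

The tokens have no value of their own in this sketch; they only matter because they can be exchanged, which is the defining feature of the setting described above.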
Comparison of Classical and Operant Conditioning•Using the information on pg. 503-504,
complete the provided table (this can be used to complete Learning Activity 10.24 – Comparing classical and operant conditioning)
•Learning Activity 10.25 – Classical versus operant conditioning, pg. 505
One-Trial Learning and Taste Aversion• Many of us have a dislike for certain foods.
• Sometimes the dislike is associated with the texture of the food (e.g. oysters) or the origin of the food (e.g. kidney, liver, brains). For others it is simply because the food appears disgusting (e.g. snails, grubs, frogs’ legs).
• Taste aversion is a learned response in which a person or animal establishes an association between a particular food and being or feeling ill after having consumed some or all of it at some time in the past.
One-Trial Learning and Taste Aversion• This association is usually the result of a single experience and the
particular food will be avoided in the future
• One-trial Learning is a form of learning involving a change in behaviour that occurs with only one experience
• One-trial learning is like classical conditioning but is neither the same nor a subtype of it: a classically conditioned response usually takes several pairings to establish and can be extinguished relatively quickly, whereas a conditioned response in one-trial learning is acquired after a single association and is highly resistant to extinction
• This is because the UCS (feeling ill) associated with the CS (the food that was consumed) is very powerful
• Another important distinction is that in classical conditioning the UCS and the CS must occur close together, whereas in one-trial learning the illness (UCS) can occur as much as a day or so after the food (CS) was consumed
One-Trial Learning and Taste Aversion•Taste aversion is sometimes referred to as
the Garcia effect
•Garcia and Koelling (1966) performed the most famous research into taste aversion
•Garcia and Koelling (1966) pg. 507
•Box 10.8 – Cancer patients and taste aversion, pg. 508
Observational Learning•We also learn by
watching others and observing others
•Through observation we can acquire new responses without having to personally experience them
Observational Learning•Observational Learning occurs when
someone uses observation of another person’s actions and their consequences to guide their future actions
•The person being observed is referred to as a model
•Consequently observational learning is often known as modelling or social learning
Observational Learning• Learning by observing someone is an extremely
useful process because it can be more efficient than trial and error learning or waiting for reinforcement or punishment.
• The process of observational learning has been researched extensively by Canadian psychologist Albert Bandura.
• Take 5 minutes to complete Learning Activity 10.29 on pg. 510
Observational Learning•Bandura believes that modelling is not a
form of learning that is separate from conditioning.
•His experiments demonstrated that both classical and operant conditioning can occur vicariously, or indirectly, through observational learning.
•In other words, observational learning involves being conditioned indirectly by observing someone else’s conditioning.
Observational Learning• During vicarious conditioning, the
individual watches another person displaying behaviour that is either reinforced or punished and subsequently behaves in exactly the same way, or in a modified way, or refrains from the behaviour as a result of the observation.
• Vicarious reinforcement increases the likelihood of the observer behaving in a similar way to a model whose behaviour is reinforced.
Observational Learning•Vicarious punishment occurs when the
likelihood of an observer performing a particular behaviour decreases after having seen a model’s behaviour being punished.
•Learning Activity 10.30 – Review questions, pg. 511
Bandura’s Bobo Doll Experiment• In the 1960s, Bandura and his colleagues Ross
and Ross conducted a series of experiments on observational learning in young children.
• Children were required to sit and watch a model performing some actions on television, and then they were given an opportunity to imitate the model.
• One such experiment demonstrated the influence of observational learning on aggression in 4-year-old children.
Bandura’s Bobo Doll Experiment• Outline of Experiment & Results, pg. 511-517
• In their first experiment, Bandura, Ross and Ross (1961) exposed children to aggressive and non-aggressive adult models and then tested the children for the amount of imitative learning
• In the second experiment Bandura, Ross and Ross (1963a) aimed to find out the extent to which observation of aggressive models presented in films by real-life and cartoon characters influences aggressive behaviour by children
• In the third experiment Bandura, Ross and Ross (1963b) studied the influence of reward and punishment on observational learning
Bandura, Ross and Ross (1961)• Bandura, Ross and Ross (1961) concluded that aggressive behaviour can be learnt through exposure to aggressive models and that there are sex differences in aggressive behaviour
Bandura, Ross and Ross (1963a)• Bandura, Ross and Ross (1963a) concluded that exposing children to aggressive models increases the probability that they will respond aggressively when given the opportunity to do so
• They also concluded that children can learn aggressive behaviour through observation of aggression by both real-life models and film-portrayed models
Bandura, Ross and Ross (1963b)• Bandura, Ross and Ross (1963b) concluded that observational learning was influenced by the consequences (or lack of consequences) for the adult model(s)
Bandura’s Bobo Doll Experiment• You will need to complete three evaluations of research, one for each of the Bandura, Ross and Ross experiments
• The following learning activities will assist you in doing these evaluations of research
• Learning Activity 10.31 – Summary of experiments by Bandura, Ross and Ross, pg. 518
• Learning Activity 10.31 – Evaluation of experiments by Bandura, Ross and Ross, pg. 518
Elements of Observational Learning•According to Bandura, there are 4
elements that account for observational learning. These include:
▫ Attention
▫ Retention
▫ Reproduction
▫ Motivation-Reinforcement
Elements of Observational Learning - Attention•In order to learn through observation, we
must pay attention to the model’s behaviour.
•If we don’t pay attention, we won’t recognise the distinctive features of the model’s behaviour.
•The way we focus our attention is not random. According to Bandura, we are more likely to observe and imitate models if they have the following characteristics:
Elements of Observational Learning - Attention• The model is perceived positively and has high
status.
• There are perceived similarities between the model and the observer.
• The model is known or familiar to the observer.
• The model’s behaviour is visible and stands out to the observer.
• The model demonstrates behaviour that can be imitated.
Elements of Observational Learning - Retention•Having observed the model, we must now be able to remember the model’s behaviour.
•Responses learned are often not needed until some time after they have been acquired.
•Memory plays an important role as there is the need to make a mental representation of what we have observed so it can be recalled at a later date.
Elements of Observational Learning - Reproduction•Having observed and remembered the
model’s desired behaviour, we then attempt to reproduce, or imitate, what has been observed.
•We must, however, have the ability to put into practice what was observed.
Elements of Observational Learning - Motivation - Reinforcement•The learner has to be motivated to
perform the behaviour.
•Unless the behavioural response has an incentive or reward for the learner, it is unlikely that they will want to learn it in the first place.
•Bandura suggests that there are 3 aspects to motivation: External Reinforcement, Vicarious Reinforcement & Self Reinforcement.
Elements of Observational Learning - Motivation - Reinforcement
•External Reinforcement – is comparable to learning by consequences. If a full forward kicks a goal and is praised by team-mates, coaches and fans, he will most likely retain that kicking style.
•Vicarious Reinforcement – is observing the modelled behaviour being reinforced for other people. For example, watching Matthew Lloyd being successful with a particular kicking style.
Elements of Observational Learning - Motivation - Reinforcement
• Self Reinforcement – occurs when we are reinforced by meeting certain standards of performance that we set for ourselves.
• Learning Activity 10.34 – Review questions, pg. 523
Insight Learning• Many of us have experienced the relief of solving a problem after having agonised over it for a period of time, by suddenly seeing the solution in a different light and thinking ‘Aha! I know the answer!’
• This is an example of insight learning, a type of learning involving a period of mental manipulation of information associated with a problem, prior to the realisation of a solution to the problem
Insight Learning•The original research into insight learning was conducted by Wolfgang Kohler (1925)
•He used chimpanzees in his research to demonstrate what he called the cognitive processes involved in learning
•Read about Kohler’s (1925) research, pg. 523-524
Insight Learning•Kohler (1925) argued that the behaviour of Sultan, one of the chimpanzees in his research, was an example of insight learning
•Kohler and further researchers identified distinct stages involved in insight learning
Stages of Insight Learning - Preparation•The first stage of insight learning, preparation, is a ‘getting ready’ stage in which a person or animal gathers as much information as possible about what needs to be done
•Sultan picked up the sticks, looked at them and tried to reach the banana with the sticks
Stages of Insight Learning - Incubation•Incubation is a period of mental ‘time out’
where the information gained in the preparation stage appears to be put aside
•Sultan showed this when he squatted on a box and appeared uninterested in solving the problem
Stages of Insight Learning – Insightful Experience• The insightful experience is sometimes described as the ‘ah-ha!’ experience because it occurs so suddenly that people often exclaim ‘Ah-ha!’
• It can feel like a sudden moment of illumination after being in the dark about the solution
• Sultan suddenly behaved as though he had been given a proper understanding of the relationship between the sticks
Stages of Insight Learning - Verification•The verification stage is the final stage in insight learning, in which the visual image that flashed into the mind during the insightful experience is acted upon and tested
•Sultan quickly verified his insightful experience by fitting the two sticks together and retrieving the banana
Features of Insight Learning• Studies of insight learning since Kohler’s experiments have led psychologists to describe insight learning as having the following characteristics:
▫ The learning appears sudden and complete
▫ The first time the solution is performed, it is usually done with no errors
▫ The solution is less likely to be forgotten than if it is learned by rote
▫ The principle underlying the solution is easily applied to other relevant problem-solving situations
• Learning Activity 10.37 – Review questions, pg. 527
Latent Learning•In the 1930s an American psychologist, Edward
Tolman conducted maze learning studies with rats that highlighted two aspects of learning that challenged the traditional conditioning theories
•The studies indicated that:
▫ Learning can occur without reinforcement of observable actions
▫ Learning can occur without revealing itself in observable behaviour
Latent Learning•Read about Tolman and Honzik’s (1930)
experiment with rats, pg. 527-528
Latent Learning•Latent learning is learning that occurs without any direct reinforcement but remains unexpressed, or hidden, until it is needed
•That learning can take place without direct reinforcement challenged Skinner’s operant conditioning theory in which reinforcement has an important role in explaining learning of all types of behaviour
Learning Cognitive Maps•Tolman also concluded from his experiment
that the rats learned the location of places in his mazes
•Tolman devised the term ‘cognitive map’ to describe the mental picture that is formed of the relationship between locations
•Learning Activity 10.38 – Review questions, pg. 530