the learned dog class 6: models & just what is learned anyway?

The Learned Dog

Class 6: Models & Just What is Learned Anyway?

Agenda

• Questions

• And no, it is a different Blumberg who is cited in the text :-)

• Models

• Noodling on “exactly what is learned anyway?”

• CB on practical application of desensitization

The big picture from SWR

• ‘We learn about things only when we process them actively’

• ‘We process things actively only when they are surprising; that is, when we do not yet understand them.’

• ‘As conditioning proceeds, the CS and the US become less surprising. As a result, they get processed less, and we therefore learn less about them’

• NOTE: ‘getting processed less’ does not mean that the animal responds to them less; rather, it spends less energy learning about them.

Schwartz, B., E. A. Wasserman, et al. (2002). Psychology of Learning and Behavior. New York, NY, W. W. Norton & Company, Inc.

Models

What’s a model?

• A model is a simplified description of a complicated real world system that captures enough of how the real world system works that it can be used to...

• Explain some/all of the observed behavior of the real world system.

• Enable you to make predictions about how the real world system would behave if certain things were true (“what if?”)

• A real world system can be modeled at different levels of abstraction depending on what aspect of the system you want to understand...

How do you know if it is a good model?

• How much explanatory power does it have?

• How much predictive power does it have?

• How important are the situations in which it fails?

• Is there a simpler explanation that does just as good a job?

Models of Pavlovian Conditioning...

• Two broad classes of models...

• Associative models: It is all about the strength of the association between the CS and the US, and how that strength changes

• Rescorla - Wagner

• Pearce - Hall

• Representational models: It is all about higher level representations of time and rate and how the animal uses them.

• Gallistel

Associative models...

The big idea behind associative models...

• All stimuli are modeled as having a value, e.g., 0-100.

• US: these values are assumed to be innate and fixed

• CS: these values are learned. Typically, start out at 0.

• According to this view, during Pavlovian conditioning, the value of a CS becomes closer and closer to that of the US that it seems to predict.

• A strong association means that the value of the CS and the US are very close.

• If multiple CS seem to predict a given US, the sum of their values approaches or equals that of the US.

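Learning is inherently retrospective...

[Figure: timeline leading up to a significant event (food, shock, ball), at which point the animal asks “WTH: ok, what did I miss?” Remembered events along the timeline: heard tone 2, saw light flash, heard baby cry, smelled nervous grad student, heard Coke machine, saw Mary scratch. Label: memory?]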

Memory may be different for different classes of stimuli...

[Figure: the same timeline leading up to a significant event (food, shock, ball) and the question “WTH: ok, what did I miss?” Remembered events: heard tone 2, saw light flash, heard baby cry, smelled nervous grad student, heard Coke machine, saw Mary scratch.]

What memories get used may depend on what happened..

[Figure: timeline leading up to a significant event (food). Caption: “WTH: pigeons associate visual stimuli with food”. Events on the timeline: heard tone 2, saw light flash, heard baby cry, heard Coke machine, saw Mary scratch.]

What memories get used may depend on what happened..

[Figure: timeline leading up to a significant event (shock). Caption: “WTH: pigeons associate auditory stimuli with pain”. Events on the timeline: heard tone 2, saw light flash, heard baby cry, heard Coke machine, saw Mary scratch.]

Rescorla-Wagner Model

• The value of the CS approaches that of the US over time...

• The amount it changes on each trial is proportional to ‘how far off it is’ from the value of the US

• CS value changes a lot at first because its value is so far off from that of the US

• CS changes less and less as its value approaches that of the US.
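The bullets above can be turned into a few lines of Python. This is only a sketch under assumed numbers: the US value of 100, the salience of 0.3, and the names `acquisition_curve`, `us_value` and `salience` are illustrative choices, not values from the text.

```python
def acquisition_curve(us_value=100.0, salience=0.3, trials=10):
    """Return the CS value after each CS-US pairing (single CS, illustrative numbers)."""
    cs_value = 0.0  # a novel CS typically starts out at 0
    history = []
    for _ in range(trials):
        unexplained = us_value - cs_value   # 'how far off' the CS is from the US
        cs_value += salience * unexplained  # change is proportional to that gap
        history.append(cs_value)
    return history


curve = acquisition_curve()
# Per-trial change: large at first, then smaller and smaller.
changes = [after - before for before, after in zip([0.0] + curve[:-1], curve)]

for trial, (value, change) in enumerate(zip(curve, changes), start=1):
    print(f"trial {trial:2d}: CS value = {value:6.2f}  (change {change:5.2f})")
```

Printing the curve shows the familiar shape: big jumps on the early trials, then ever smaller steps as the CS value closes in on the US value without overshooting it.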

Rescorla-Wagner Model

• The difference between the value of the US and CS represents value that is currently ‘unexplained’ by the CS

• As a result of learning, the ‘unexplained’ value goes down as the CS value approaches that of US. BUT...

• The key feature of the RW model is that the CSs compete to explain the ‘unexplained’ value...

• US = CS1 + CS2 + ...

RW & Blocking

• In blocking, the first CS does such a good job explaining the value of the US, there isn’t much ‘unexplained’ value to be explained by the second CS once it is added

RW & Inhibition

• Assume that the US is fully explained by CS1. Now on the next trial, CS1 and CS2 are presented together but no US occurs...

• In this case, both CS1 and CS2 share some of the blame but the effect is that CS2 actually takes on a negative value, i.e. it becomes inhibitory.
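Both the blocking and the inhibition stories can be checked with a small simulation. What follows is a sketch, not the authors' code: the helper `rw_trial`, the equal saliences of 0.3, the US value of 100 and the trial counts are all assumptions picked for illustration.

```python
US_VALUE = 100.0
SALIENCE = {"cs1": 0.3, "cs2": 0.3}  # assumed equal and fixed


def rw_trial(values, present, us_value, salience):
    """One Rescorla-Wagner update, in place, for every CS present on the trial."""
    predicted = sum(values[cs] for cs in present)
    unexplained = us_value - predicted
    for cs in present:
        values[cs] += salience[cs] * unexplained


def blocking(pretrain_trials=30, compound_trials=30):
    """Train CS1 alone with the US, then CS1+CS2 together with the same US."""
    values = {"cs1": 0.0, "cs2": 0.0}
    for _ in range(pretrain_trials):
        rw_trial(values, ["cs1"], US_VALUE, SALIENCE)
    for _ in range(compound_trials):
        rw_trial(values, ["cs1", "cs2"], US_VALUE, SALIENCE)
    return values


def inhibition(pretrain_trials=30, no_us_trials=1):
    """Train CS1 alone with the US, then present CS1+CS2 with no US."""
    values = {"cs1": 0.0, "cs2": 0.0}
    for _ in range(pretrain_trials):
        rw_trial(values, ["cs1"], US_VALUE, SALIENCE)
    for _ in range(no_us_trials):
        rw_trial(values, ["cs1", "cs2"], 0.0, SALIENCE)
    return values


blocked = blocking()                   # CS1 pretrained: little left for CS2 to explain
control = blocking(pretrain_trials=0)  # no pretraining: CS1 and CS2 split the value
inhibited = inhibition()

print("blocking  :", blocked)
print("control   :", control)
print("inhibition:", inhibited)
```

With pretraining, CS2 ends up with almost no value; without it, CS1 and CS2 each take about half. In the no-US case both stimuli lose value, and because CS2 started at 0 it ends up negative.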

Just in case you wanted the equation...

Vpredicted = Vcs1 + Vcs2

Change in Vcs1 = K*(Vunexplained)

Change in Vcs2 = M*(Vunexplained)

Vunexplained = (Vus - Vpredicted)

K & M represent the ‘relative salience’ of CS1 and CS2 respectively. And note, RW considers them fixed.
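For those who prefer symbols, the same four lines can be written for any number of CSs. The index i and the per-stimulus salience K_i (standing in for K and M) are notation added here; the closed form for a single CS additionally assumes the CS starts at 0 and that 0 < K ≤ 1.

```latex
\begin{align*}
V_{\text{predicted}}   &= \sum_i V_{\text{cs}_i} \\
V_{\text{unexplained}} &= V_{\text{us}} - V_{\text{predicted}} \\
\Delta V_{\text{cs}_i} &= K_i \, V_{\text{unexplained}} \\[4pt]
\text{Single CS, after } n \text{ trials:}\quad
V_{\text{us}} - V_{\text{cs}}(n) &= (1-K)\,\bigl(V_{\text{us}} - V_{\text{cs}}(n-1)\bigr) \\
\Rightarrow\quad V_{\text{cs}}(n) &= V_{\text{us}}\,\bigl(1 - (1-K)^{n}\bigr)
\end{align*}
```

The last line says the unexplained value shrinks by the same fraction (1 - K) on every trial, which is why the CS value changes a lot at first and less and less later on.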

Rescorla Wagner has been an influential model

• It’s simple and it has explanatory and predictive power.

• The equation is a variant of the so-called ‘delta rule’ that is used in artificial neural nets (models of how groups of neurons work.)

• Once again, it is as if when something significant happens, the animal asks itself “can I explain it?”, and the difference between what actually happened and what it expected to happen is used to adjust its expectations so that, over time, those expectations become more and more reliable.

• If the outcome is perfectly in line with its expectations, no adjustment occurs, i.e. no learning occurs.

The Pearce-Hall Model

Motivation

• The Rescorla-Wagner model does not handle latent inhibition

• Pre-exposure to a CS in the absence of a US slows down subsequent learning when the CS is paired with the US.

• RW misses this because of 2 of its assumptions...

• Learning about a CS only occurs when a US has occurred.

• The salience associated with a given class of CS is fixed.

• The Pearce-Hall model is a response...

Courville, A. C., N. D. Daw, et al. (2006). "Bayesian theories of conditioning in a changing world." Trends in Cognitive Sciences 10(7): 294-300.

The Pearce-Hall Model

• ‘In essence, the model says that the associability of a CS reflects the degree to which the US on the last presentation of the CS was surprising’

• The salience of a CS changes over the trials based on experience

• At the 30,000’ level, it is once again a model that views learning as an exercise in reducing surprise...

Schwartz, B., E. A. Wasserman, et al. (2002). Psychology of Learning and Behavior. New York, NY, W. W. Norton & Company, Inc.

The equations (oh no...)

Previous exposure to the CS in the absence of the US drives ‘surprise last trial’ to 0 going into the first trial with the US, so little is learned on that trial; surprise then goes up on subsequent trials

Surprise last trial = |Actual last trial - Expected last trial|

Change in Vcs this trial = Surprise last trial * Intensity * Vus

Keep ’em guessing if you want them to learn...

Surprise last trial = |Actual last trial - Expected last trial|

The big point: the amount of learning is proportional to the level of surprise on previous trials.

Alternatively, active attention is proportional to surprise
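The two slide equations are enough to play with latent inhibition in code. Treat this as a rough sketch: the function name `pearce_hall`, the 0-to-1 value scale, the intensity of 0.2 and the starting surprise of 1.0 for a novel CS are all assumptions, and the full Pearce-Hall model has more to it than these two lines.

```python
def pearce_hall(us_values, intensity=0.2, initial_surprise=1.0):
    """Return V_cs after each trial, using only the two slide equations.

    us_values holds the actual US value on each trial (0.0 means no US).
    Values are scaled 0..1 instead of 0..100 (an assumption) so that
    surprise * intensity * V_us stays in a sensible range.
    """
    v_cs = 0.0
    surprise_last = initial_surprise  # assumed: a novel CS starts out fully surprising
    history = []
    for actual in us_values:
        expected = v_cs
        # Change in V_cs this trial = surprise last trial * intensity * V_us
        v_cs += surprise_last * intensity * actual
        # Surprise is computed now and only used on the NEXT trial.
        surprise_last = abs(actual - expected)
        history.append(v_cs)
    return history


# Novel CS: five CS-US pairings straight away.
novel = pearce_hall([1.0] * 5)

# Pre-exposed CS: ten CS-alone trials first, then the same five pairings.
preexposed = pearce_hall([0.0] * 10 + [1.0] * 5)[10:]

for trial, (n, p) in enumerate(zip(novel, preexposed), start=1):
    print(f"pairing {trial}: novel V = {n:.3f}   pre-exposed V = {p:.3f}")
```

During pre-exposure nothing surprising happens, so surprise falls to 0 and the first pairing teaches the pre-exposed animal nothing. That first pairing is itself surprising, though, so learning gets going from the second pairing on and the pre-exposed CS lags behind the novel one.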

Orienting Responses as a measure of active attention

• ‘Attention’ is measured by the degree to which the animal orients toward the CS (quality and duration). This is known as the Orienting Response.

• In Pavlovian conditioning experiments,

• initially the level of OR to the CS will be high, reflecting the level of active attention (possibly acquiring as much information as possible).

• over time the level of OR to the CS goes down, and the animal focuses on the US, reflecting a lower level of active attention.

The key insights of Pearce-Hall

• Surprise as a driving force behind learning...

• Animals potentially learn about features of their world even in the absence of a biologically significant event.

• Active attention is proportional to the level of surprise. As an animal learns about a given stimulus, less and less attention is devoted to learning more about it.

• It brings the discussion back to what the animal is biased to attend to in its world, either because of innate biases or learned biases.

• The importance of orienting responses.

Representational Models

Representational Models

• The Rescorla-Wagner and Pearce-Hall models do not incorporate a notion of time...

• The animal ‘magically’ knows when a trial starts and ends.

• Can’t model effects such as ratio of trial time and inter-trial time.

• The models are associative models so rely on this fuzzy notion of ‘value’

Gallistel’s model (in 1 slide...)

• Gallistel’s approach is to argue that...

• Animals seem very good at measuring ‘rate’:

• Rate of reinforcement = (Number of cookies)/(time light is on)

• Implies they can keep track of quantity and time.

• The results of most Pavlovian Conditioning experiments can be reinterpreted as the animal learning rates of US occurrence and apportioning them to the various CS.

• What the animal is trying to explain is the rate of occurrence in a given CS context

Gallistel’s model provides a range of elegant solutions...

• Blocking

• Animal learns the rate of occurrence of US when CS1 is active.

• When CS2 is presented at same time as CS1 and rate of US does not change, the rate of US is totally explained by CS1, so rate apportioned to CS2 is 0

• Latent Inhibition

• The previous experience of no US when CS1 is active biases the initial estimate of the rate of occurrence downward (the accumulated time CS1 has been active is already high) when the US begins to be paired with CS1.
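The rate idea is easy to try with made-up numbers. This sketch assumes that rates simply add across CSs that are on at the same time, which is a simplification of what Gallistel and Gibbon actually propose; the function names and all counts and durations are invented for illustration.

```python
def rate(us_count, seconds_cs_on):
    """Rate of reinforcement = (number of USs) / (time the CS is on)."""
    return us_count / seconds_cs_on


def apportion_blocking(n_cs1_alone, t_cs1_alone, n_compound, t_compound):
    """Split the US rate seen during CS1+CS2 between the two stimuli.

    Assumes rates are additive: rate(CS1+CS2) = rate(CS1) + rate(CS2).
    """
    rate_cs1 = rate(n_cs1_alone, t_cs1_alone)
    rate_compound = rate(n_compound, t_compound)
    rate_cs2 = rate_compound - rate_cs1  # whatever CS1 does not already explain
    return rate_cs1, rate_cs2


# Blocking: 20 USs in 200 s of CS1 alone, then 10 USs in 100 s of CS1+CS2.
rate_cs1, rate_cs2 = apportion_blocking(20, 200.0, 10, 100.0)
print(f"blocking: rate credited to CS1 = {rate_cs1}, to CS2 = {rate_cs2}")

# Latent inhibition: 5 USs in 50 s of CS1, with or without 300 s of
# earlier CS1-alone time counted in the denominator.
naive_rate = rate(5, 50.0)
preexposed_rate = rate(5, 300.0 + 50.0)
print(f"latent inhibition: naive = {naive_rate:.4f}, pre-exposed = {preexposed_rate:.4f}")
```

In the blocking case the US rate does not change when CS2 is added, so nothing is left over to credit to CS2. In the latent inhibition case the same five USs are spread over far more CS1 time, so the pre-exposed animal's estimate of the rate starts out much lower.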

Gallistel’s model hasn’t gained a lot of traction...

• The neural machinery for associative models is a lot clearer than the neural machinery required to support his model

• Traditional behaviorists don’t like representational accounts because you can’t observe them.

• Assumes animal keeps track of CS even in absence of reason to do so...

• Some innate perhaps, some learned??

• I think it is a very intriguing idea...

• Gallistel, C. R. and J. Gibbon (2000). "Time, rate and conditioning." Psychological Review 107

Theories of extinction

• The key point is that ‘extinction seems to produce performance loss rather than association loss’

• 4 forms of recovery

• Spontaneous recovery (give it time)

• Disinhibition (a surprising, novel event presented along with the CS causes recovery)

• Reinstatement (reminded of how bad/good the US can be)

• Renewal (placed back in a specific context)


Theories of extinction, cont.

• ‘One theory holds that organisms gradually stop paying attention to the CS over the course of extinction training. Responding eventually stops not because of a loss in the CS-US association, but rather because the animal now ignores the CS. Recovery of responding occurs whenever circumstances direct the animal’s attention back to the CS’

• This fits with what we learned about the amygdala and the neocortex. You are not replacing associations, rather you are adding associations that work against the existing associations. But they are always there, ready to be expressed.

• This may be why some problems such as reactiveness are so hard to remedy completely.


What is learned?

S-S or S-R?

Like most things in Pavlovian Conditioning, it depends...


Different results if association is S-S vs. S-R

Does the meaning of CS2 change if the meaning of CS1 changes?


S-S or S-R? It’s ambiguous...

• Different results depending on when CS2 is associated with CS1 relative to when CS2 is associated with US.

• However, there are other examples in which second-order conditioning seems to produce S-S associations


Other clues that an association is formed between CS and the internal representation of the US.

• Pigeons experience tone-food pairing in key-peck set-up. Despite lots of experience they don’t peck the key.

• Subsequently, the key light is paired with just the tone, and they start pecking the key...

• ‘what this result indicates is that during tone-food pairing pigeons do indeed form an association, so the tone evokes a representation of food. When the key light is later paired with the tone, it is paired with the representation of food evoked by the tone.’ [BB: maybe something else is going on though]

• Devaluation experiments

• US is devalued and response to CS is accordingly lessened.


Probably won’t go too far wrong if you assume ‘that the CS invokes representation of a particular US arriving at a particular point in time.’

Shettleworth, S. J. (1998). Cognition, Evolution and Behavior. New York, NY, Oxford University Press

How does inhibition work...

• The wicked important point is that inhibitory associations do not replace the excitatory association, but rather modulate it in some way.

• Something changes and you will see that the excitatory association is alive and well

• One theory is that inhibitors work as occasion setters or qualifiers: ‘that is, they set conditions that modify effects of other associations’


A note on counter-conditioning...

• What is it...

• A CS that was previously associated with one US becomes associated with another as well.

• So when the animal experiences the CS, which US does it think will follow, USoriginal or USnew ?

• The one that follows more reliably?

• The one that has been experienced more recently?

• The one that has more innate strength?

What determines the form of the Conditioned Response?

• Pavlov believed that it looked a lot like the Unconditioned Response and was thus tied to the nature of the Unconditioned Stimulus.

• That is part of the story, but the form of the CS is also important. For example, in the case of rats, when food is signaled by light or tone...

• CR in case of light was standing & orienting toward feeder

• CR in case of tone was general increase in activity

• When the CS are rats on a trolley, the CR is orienting toward them and performing social behaviors.

Form of CR can be explained by time interval & form

CR easily understood in context of natural behavior

Shettleworth, S. J. (1998). Cognition, Evolution and Behavior. New York, NY, Oxford University Press

Thoughts...

Pavlovian or classical conditioning is anything but simple...

• It’s all about associations between stimuli, but what associations get made, how easily, and how the animal responds to those associations can only be understood within the context of the animal’s behavioral and perceptual repertoire!

• That is why, IMHO, so much of this comes down to “it depends”

• The use of words like ‘stimuli’ implies that there are unitary features to which the animal responds. This may mask the complexity of the animal’s perception of objects and events.

• Our intuition about what they attend to is a poor guide, especially if you haven’t taken the time to understand the animal.

Some things do seem to be generally true...

• Animals seem biased toward learning certain types of temporal patterns (this is an example of an associative bias)

• By predicting the future, animals can respond sooner and more appropriately

• ‘Surprise’ is the driver for learning. Animals learn so as to avoid surprises

• If some significant event is already well predicted by existing associations, the animal will typically not learn new associations unless something changes.

• For an association to be learned it must have incremental predictive value!

• The form of response an animal makes to a learned association is a function of the nature of the US and the CS and tends to be most easily understood in the context of the animal’s behavioral repertoire.

Some things seem to be generally true...

• Inhibition and/or extinction are more a matter of performance loss (stop performing) or of creating additional associations that modulate existing associations than they are about replacing old associations.

• You gotta know your animal, but then Pavlovian conditioning can be a powerful technique.
