Post on 22-Dec-2015
FIGURE 4 Responses of dopamine neurons to unpredicted primary reward (top) and the transfer of this response to progressively earlier reward-predicting conditioned stimuli with training (middle). The bottom record shows a control baseline task when the reward is predicted by an earlier stimulus and not the light. From Schultz et al. (1995) with permission.
Odor Selective Cells in the Amygdala fire preferentially with regard to outcome or reward value of an odor prior to demonstration that the animal has learned this outcome or value.
Odor Selective Cells in the Amygdala fire preferentially with regard to outcome or reward value of an odor simultaneous to demonstration that the animal has learned this outcome or value.
Cells in Orbitofrontal Cortex (OFC) show less selectivity for outcome in rats without an amygdala. This demonstrates a role for the amygdala in conveying motivational/reward information to the OFC.
Dopamine, reward processing and optimal prediction
ONLY AS A REFERENCE FOR THOSE WHO ARE INTERESTED IN BEGINNING TO CROSS THE NEUROBEHAVIORAL-COMPUTATIONAL DIVIDE – Maybe after the Exam??
Human dopaminergic system
Cortical and striatal projections
Schultz, 1998
Koob & Le Moal, 2001
Schultz, Dayan & Montague 1997
Expected Reward
v = wu
v : expected reward
w : weight (association)
u : stimulus (binary)
Rescorla-Wagner Rule
Association update rule: w ← w + αδu
w : weight (association)
α : learning rate
u : stimulus
Prediction error: δ = r - v
r : actual reward
v : expected reward
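A minimal sketch of the Rescorla-Wagner rule for a single stimulus, using only the three formulas above (v = wu, δ = r - v, w ← w + αδu). The function name, the trial schedule (50 rewarded trials followed by 50 unrewarded ones) and the learning rate of 0.1 are illustrative assumptions, not values from the slides.

```python
def rescorla_wagner(rewards, stimuli, alpha=0.1, w0=0.0):
    """Apply the Rescorla-Wagner rule trial by trial for one stimulus.

    rewards[k] is the actual reward r on trial k and stimuli[k] is the
    binary stimulus u. Returns the weight w after each trial.
    """
    w = w0
    history = []
    for r, u in zip(rewards, stimuli):
        v = w * u                  # expected reward, v = wu
        delta = r - v              # prediction error, delta = r - v
        w = w + alpha * delta * u  # association update
        history.append(w)
    return history


# Hypothetical schedule: acquisition (reward present), then extinction.
n_acq, n_ext = 50, 50
rewards = [1.0] * n_acq + [0.0] * n_ext
stimuli = [1] * (n_acq + n_ext)
weights = rescorla_wagner(rewards, stimuli, alpha=0.1)

print(f"w after acquisition: {weights[n_acq - 1]:.3f}")
print(f"w after extinction:  {weights[-1]:.3f}")
```

During acquisition the weight climbs toward the reward magnitude, and during extinction it decays back toward zero, because δ becomes negative once the reward is withheld.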
Rescorla-Wagner provides an account for:
• Some Pavlovian conditioning
• Extinction
• Partial reinforcement
and, with more than one stimulus:
• Blocking
• Inhibitory conditioning
• Overshadowing
… but not
• Latent inhibition (CS preexposure effect)
• Secondary conditioning
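Blocking can be illustrated with a short simulation. The sketch below assumes the usual multi-stimulus form of the rule, in which the expected reward is the sum v = Σ_i w_i u_i and every stimulus that is present shares the same prediction error δ. The function name, trial counts and learning rate are hypothetical choices for illustration.

```python
def rw_multi(trials, n_stimuli, alpha=0.1):
    """Rescorla-Wagner with several stimuli sharing one prediction error.

    trials is a sequence of (u, r) pairs, where u is a tuple of binary
    stimuli and r is the reward. Returns the final weights.
    """
    w = [0.0] * n_stimuli
    for u, r in trials:
        v = sum(wi * ui for wi, ui in zip(w, u))  # summed prediction
        delta = r - v                             # shared prediction error
        w = [wi + alpha * delta * ui for wi, ui in zip(w, u)]
    return w


# Blocking group: stimulus A alone is rewarded first, then A and B together.
pretrain = [((1, 0), 1.0)] * 100
compound = [((1, 1), 1.0)] * 100
w_blocked = rw_multi(pretrain + compound, n_stimuli=2)

# Control group: only the compound phase.
w_control = rw_multi(compound, n_stimuli=2)

print("blocked group, weight of B:", round(w_blocked[1], 4))
print("control group, weight of B:", round(w_control[1], 4))
```

In the blocked group A already predicts the reward when B is introduced, so δ is close to zero and B acquires almost no weight. In the control group A and B split the prediction between them.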
A recent update: uncertainty (σ_i²)
Kakade, Montague & Dayan, 2001
Kalman weight update rule:
w_i ← w_i + α_i δ
With associability:
α_i = σ_i² u_i / (Σ_j σ_j² u_j + E)
An example:
[Diagram, built up over successive slides, with panels labelled: input U(t) (stimuli U1 to U5), r(t), w(t), ŵ(t), v(t), δ(t)]
Error rule: δ(t) = r(t) - v(t)
Inset: U_i : input, σ_i : uncertainty, w_i : weight
Uncertainty
Kalman learning & associability
weight update rule:
ŵ_i(t+1) = ŵ_i(t) + α_i(t) δ(t)
associability:
α_i(t) = σ_i(t)² x_i(t) / (Σ_j σ_j(t)² x_j(t) + E)
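A minimal sketch of one update step under this rule, reading the associability as each stimulus's uncertainty (σ² times input) divided by the summed uncertainty of all present stimuli plus E. The slides do not say how the uncertainties themselves change over time, so the sketch takes them as given inputs. The function names, the interpretation of E as a noise term, and the numbers are assumptions for illustration.

```python
def associabilities(sigma2, x, noise_var):
    """alpha_i = sigma_i^2 x_i / (sum_j sigma_j^2 x_j + E)."""
    denom = sum(s * xi for s, xi in zip(sigma2, x)) + noise_var
    return [s * xi / denom for s, xi in zip(sigma2, x)]


def kalman_step(w_hat, sigma2, x, r, noise_var):
    """One weight update: w_hat_i <- w_hat_i + alpha_i * delta."""
    v = sum(wi * xi for wi, xi in zip(w_hat, x))  # reward prediction
    delta = r - v                                 # error rule
    alpha = associabilities(sigma2, x, noise_var)
    w_new = [wi + ai * delta for wi, ai in zip(w_hat, alpha)]
    return w_new, alpha, delta


# Hypothetical case: two stimuli present, the first far more uncertain.
w_hat = [0.0, 0.0]
sigma2 = [0.9, 0.1]   # assumed uncertainties sigma_i^2
x = [1, 1]
w_new, alpha, delta = kalman_step(w_hat, sigma2, x, r=1.0, noise_var=1.0)

print("associabilities:", alpha)
print("updated weights:", w_new)
```

The more uncertain stimulus receives the larger share of the same prediction error, which is the sense in which uncertainty sets associability here.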
Stimulus uncertainties
Reward prediction
Predicting future reward
single time steps: v = wu
v : expected reward
w : weight (association)
u : stimulus
total predicted reward: v(t) = Σ_{τ=0}^{t} w(τ) u(t - τ)
t : time steps in a trial
τ : current time step
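The total predicted reward is a sum over past stimulus values, each weighted by w at the corresponding lag τ. A minimal sketch, with a hypothetical stimulus pulse at step 2 and a single nonzero weight at lag 3 (both chosen only for illustration):

```python
def predicted_reward(w, u, t):
    """v(t) = sum over tau = 0..t of w(tau) * u(t - tau)."""
    return sum(w[tau] * u[t - tau] for tau in range(t + 1))


n_steps = 10
u = [0.0] * n_steps
u[2] = 1.0            # assumed stimulus pulse at time step 2
w = [0.0] * n_steps
w[3] = 1.0            # assumed weight: reward expected 3 steps after a stimulus

v = [predicted_reward(w, u, t) for t in range(n_steps)]
print(v)
```

With these choices the prediction is nonzero only at step 5, three steps after the stimulus, which shows how the weights w(τ) carry the timing of the expected reward.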
Sum of discounted future rewards: v(t) = r(t) + γ r(t+1) + γ² r(t+2) + …
With 0 ≤ γ ≤ 1
In recursive form: v(t) = r(t) + γ v(t+1)
Schultz, Dayan & Montague, 1997
Exponential discounting, γ = 0.95
[Plot: reward value (0 to 1) against time steps (0 to 100)]
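A short sketch of exponential discounting with γ = 0.95 over 100 time steps, matching the axes of the plot. It also checks numerically that the sum of discounted future rewards can be computed recursively. The helper names and the sample reward sequence are hypothetical, and the sum is assumed to take the standard form Σ_k γ^k r(t+k).

```python
gamma = 0.95

# Value of a unit reward that lies t steps in the future: gamma ** t.
curve = [gamma ** t for t in range(101)]


def discounted_sum(rewards, gamma):
    """Direct form: sum over k of gamma^k * r(t + k)."""
    return sum(gamma ** k * r for k, r in enumerate(rewards))


def discounted_sum_recursive(rewards, gamma):
    """Recursive form: v(t) = r(t) + gamma * v(t + 1), evaluated backwards."""
    v = 0.0
    for r in reversed(rewards):
        v = r + gamma * v
    return v


sample_rewards = [0.0, 0.0, 1.0, 0.0, 2.0]   # assumed reward sequence
direct = discounted_sum(sample_rewards, gamma)
recursive = discounted_sum_recursive(sample_rewards, gamma)

print(f"value of a reward 100 steps away: {curve[100]:.4f}")
print(f"direct: {direct:.6f}, recursive: {recursive:.6f}")
```

The two functions agree, which is the property the temporal difference rule relies on: the value at one step can be expressed through the reward now and the value at the next step.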
Temporal difference rule
Total estimated future reward: v(t) = r(t) + γ v(t+1), so r(t) = v(t) - γ v(t+1)
Temporal difference rule: δ = r(t) + γ v(t+1) - v(t)
(With single time steps: δ = r - v
r : actual reward
v : expected reward)
Temporal difference rule (without discounting, γ = 1)
Total estimated future reward: v(t) = r(t) + v(t+1), so r(t) = v(t) - v(t+1)
Temporal difference rule: δ = r(t) + v(t+1) - v(t)
(With single time steps: δ = r - v
r : actual reward
v : expected reward)
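A minimal simulation sketch of the temporal difference rule within a trial. It assumes that the prediction is the lagged sum v(t) = Σ_τ w(τ) u(t - τ) and that each weight is updated as w(τ) ← w(τ) + α δ(t) u(t - τ). That weight update is not stated on the slides and is an assumption here, as are the function name, trial length, event times and learning rate. γ defaults to 1, as in the undiscounted form of the rule.

```python
def run_td(n_trials=500, n_steps=25, t_stim=5, t_rew=15, alpha=0.3, gamma=1.0):
    """Train lagged weights with the TD error; return delta for every trial."""
    u = [0.0] * n_steps
    u[t_stim] = 1.0                      # stimulus pulse
    r = [0.0] * n_steps
    r[t_rew] = 1.0                       # reward pulse
    w = [0.0] * n_steps                  # one weight per lag tau
    deltas = []
    v = []
    for _ in range(n_trials):
        # Prediction at each step from the weights at the start of the trial.
        v = [sum(w[tau] * u[t - tau] for tau in range(t + 1))
             for t in range(n_steps)]
        v.append(0.0)                    # no predicted reward after the trial
        # delta(t) = r(t) + gamma * v(t+1) - v(t)
        delta = [r[t] + gamma * v[t + 1] - v[t] for t in range(n_steps)]
        # Assumed weight update: w(tau) += alpha * delta(t) * u(t - tau)
        for t in range(n_steps):
            for tau in range(t + 1):
                w[tau] += alpha * delta[t] * u[t - tau]
        deltas.append(delta)
    return deltas, v


deltas, v_final = run_td()
first, last = deltas[0], deltas[-1]
print("first trial: delta at reward =", first[15])
print("last trial:  delta at reward =", round(last[15], 6))
print("last trial:  delta just before stimulus =", round(last[4], 6))
```

On the first trial the only error occurs at the reward. After training, the error at the reward has vanished and a positive error appears at the transition into the stimulus (step 4 with this indexing, because δ(t) looks ahead to v(t+1)). This mirrors the transfer of the dopamine response from the reward to the predicting stimulus described in the Figure 4 caption.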
Schultz, Dayan & Montague, 1997
Schultz, 1996
Anatomical interpretation
Schultz, Dayan & Montague, 1997
Temporal Difference Rule for Navigation
between successive steps u and u’:
δ = r_a(u) + γ v(u’) - v(u)
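A minimal sketch of this rule applied to locations rather than time steps. It covers only the value-learning part and assumes a fixed policy (always step toward the goal) on a hypothetical five-location corridor; it is not the model of Foster, Morris & Dayan (2000). The corridor, reward placement, learning rate and γ are all illustrative assumptions.

```python
def learn_corridor_values(n_locations=5, gamma=0.9, alpha=0.5, n_episodes=200):
    """Learn v(u) along a corridor with delta = r_a(u) + gamma*v(u') - v(u)."""
    goal = n_locations - 1
    v = [0.0] * n_locations          # v(goal) stays 0: the episode ends there
    for _ in range(n_episodes):
        u = 0                        # start at the far end of the corridor
        while u != goal:
            u_next = u + 1           # assumed fixed action: step toward goal
            reward = 1.0 if u_next == goal else 0.0   # r_a(u)
            delta = reward + gamma * v[u_next] - v[u]
            v[u] += alpha * delta    # move v(u) toward the TD target
            u = u_next
    return v


values = learn_corridor_values()
print([round(x, 3) for x in values])
```

The learned values rise toward the goal, falling off by a factor of γ per step of distance, so the value function forms a gradient that an action-selection mechanism could follow.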
Behavior evaluation Hippocampal place field
Foster, Morris & Dayan 2000
Spatial learning
Foster, Morris & Dayan 2000
Conclusions
• Behavioral study of (nonhuman) neural systems is interesting
• Neural processes amenable to contemporary learning theory
• … they may play distinct roles in a normative framework of learning
e.g. VTA, hippocampus, subiculum; also ACh in NBM/SI, NE in LC, 5-HT, ventral striatum,
lateral connections, core/shell distinctions of the NAAC, patch-matrix anatomy in basal ganglia, the superior colliculus,
psychoalphabetadiscobioaquadodoo