on neural correlates of reinforcement learning the role of ... · on neural correlates of...

37
Genela Morris Dept. of Neurobiology and Ethology Haifa University On neural correlates of On neural correlates of reinforcement learning reinforcement learning the role of dopamine in the role of dopamine in planning and action planning and action

Upload: others

Post on 19-Apr-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Genela Morris

Dept. of Neurobiology and Ethology

Haifa University

On neural correlates of On neural correlates of reinforcement learningreinforcement learning

the role of dopamine in the role of dopamine in planning and action planning and action

Page 2: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University222

OverviewOverview

Learning through reward – reinforcement learningMidbrain dopamine neurons and temporal difference (TD) learningACh in the striatumDopamine neurons and their impact on decision behaviourImplications for computational models of action selection

Page 3: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University333

Reinforcement learningReinforcement learningthe basicsthe basics

Supervised learning –all knowing teacher, detailed feedbackReinforcement learning –scalar (correct/incorrect) feedbackUnsupervised learning –self organization

Page 4: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University444

Reinforcement learning: The law of Reinforcement learning: The law of effecteffect

“The Law of Effect is that: Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur”

Edward Lee Thorndike (1911)

Page 5: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Early attempts at modelingEarly attempts at modeling

January 10January 10January 10BCCN GoettingenBCCN GoettingenBCCN Goettingen555

By associative rulesClassical conditioning

Page 6: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Classical conditioningClassical conditioningThe Elements:US: Unconditioned stimulus UR: Unconditioned response NS: Neutral stimulus CS: Conditioned stimulus

CS1: Conditioned stimulus 1 CS2: Conditioned stimulus 2

CR: Conditioned response

Page 7: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Properties of classical conditioningProperties of classical conditioning

(Pavlov 1927)Acquisition. Partial Reinforcement

Generalization. Interstimulus Interval (ISI) effects. Intertrial Interval (ITI) effects.

Page 8: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

So farSo far……A simple association (coincidence, Hebbian) model can explain the phenomenon.

– But…

Page 9: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Properties of classical conditioningProperties of classical conditioning

(Cnt’d)CS must RELIABLY predict US

Conditioned Inhibition

Relative validity (Wagner 1968). Blocking (Kamin 1968)…

Page 10: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Which simple association canWhich simple association can’’t t explainexplain

Learning occurs not because two events co-occur, but because that co-occurrence is UNPREDICTED

Page 11: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

RescorlaRescorla--Wagner rule (1972)Wagner rule (1972)Learning to predict reward R given stimulus

U=1Goal: Form a prediction of the reward V of

the form:V=ωUAnd learn to change ω :Δ ω =ε(R-V)UAfter learning of consistent pairing: ω=R

Where: U=CS availability (0,1);V=reward prediction:R=reward availability (0,1) :ω = weight of the connection between U and Vε = learning rateR-V = prediction error

Page 12: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Blocking with Rescorla WagnerBlocking with Rescorla WagnerGiven U1, U2 and R, after U1 has been learnt:ω1=RV= ω1U1+ ω2U2

Prediction error: R-V=0And no learning occurs for ω2

RRR 000

Page 13: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University131313

Critical problems in Critical problems in reinforcement learning (and in reinforcement learning (and in

RescorlaRescorla--Wagner)Wagner)

1. Temporal credit assignment

state 1

state 2 state N R

Page 14: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University141414

Critical problems, for controlCritical problems, for control

2. Exploration/exploitation

Page 15: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University151515

TD learning TD learning -- solution for temporal solution for temporal credit assignmentcredit assignment

1. Estimate value of current state (Vt=rt+ γ’rt+1+…) :

(discounted) sum of expected rewards2. Measure ‘truer’ value of current state: reward at

present state + estimated value of next state (rt+ γVt+1)

3. TD error 4. Use TD error to improve 1 (Vt

k+1=Vtk+η δt)

where:Vt = value of the state reached at time t in iteration krt = reward given at time t; η = learning rate, δ = prediction error

tttt VVr −+= +1γδ

Page 16: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University161616

TD error: TD error: tttt VVr −+= +1γδ

time

Page 17: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University171717

TD error: TD error: tttt rVV +−= +1γδ

time

Page 18: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Berlin 2004Berlin 2004Berlin 2004181818

Basal ganglia Basal ganglia -- anatomyanatomy

Page 19: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Berlin 2004Berlin 2004Berlin 2004191919

Dopamine and acetylcholine meet in the Dopamine and acetylcholine meet in the Dopamine and acetylcholine meet in the striatumstriatumstriatum

MouseMouseMouseMonkeyMonkeyMonkey

Page 20: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Berlin 2004Berlin 2004Berlin 2004202020

Facts to remember (1)Facts to remember (1)

Basal ganglia receive cortical inputBasal ganglia project to frontal cortexDopamine and acetylcholine localization

Page 21: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University212121

The midbrain dopamine systemThe midbrain dopamine system

Schultz et al, JNS 13:900-913 ,1993

Page 22: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University222222

…… and it can cause long term and it can cause long term plasticity of corticoplasticity of cortico--striatal synapsesstriatal synapses

Reynolds et al, A cellular mechanism of reward-related learning Nature 413, 67 - 70 (2001)

DA

STR

Ctx

D1/5

Page 23: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University232323

Probabilistic instrumental Probabilistic instrumental conditioning taskconditioning task

tttt rVV +−= +1γδ

Page 24: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

Berlin 2004Berlin 2004Berlin 2004242424

DA responseDA response

Page 25: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University252525

Dopamine population responseDopamine population response--cuecue

n=114

Page 26: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University262626

Dopamine population responseDopamine population response--rewardreward

Page 27: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University272727

Dopamine population response Dopamine population response ––reward omissionreward omission

Page 28: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University282828

Instrumental conditioning Instrumental conditioning -- resultsresultsResponses to visual cue are correlated with future reward probabilityResponses to reward are inversely correlated with reward probabilityResponses to reward omission are indifferent to reward probabilityDopamine neurons provide an accurate TD signal (but only in the positive domain)

Page 29: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

The basal ganglia are involved in action

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University292929

But how is this related to bBut how is this related to behaviourehaviour??

The law of effect is all about actionsAI agents act to maximize a goal…and so do people and animals

O’Doherty et al., Science 2004

Page 30: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University303030

State 1State 2State

Action 1

Agent EnvironmentAction

+Reinforcement

Control Control -- Adding actionAdding action

The agent has to:Learn to predict reinforcement state valueKnow the state-action-state transitions behavioural policy

Page 31: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University313131

Solution 1: actor/critic networksSolution 1: actor/critic networks

Environment

Action

State

Actor

Critic State Reward

TD

Page 32: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University323232

How can the dopamine signal How can the dopamine signal contribute to decision behaviour?contribute to decision behaviour?

Long term policy-shaping effect

through synaptic plasticity

Immediate effect on action

btmaction eP +−+

= )(11δ

DA

STR

Ctx

D1/5

CS

State

Action

State Envirt

Actor

CriticRewardTD

State

Action

State Envirt

Actor

CriticRewardTD

Page 33: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University333333

The two armed bandit taskThe two armed bandit task

Page 34: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University343434

Decision Decision behaviour, theory and practicebehaviour, theory and practice

maximizing

probability-matching

monkeys?

Rright/(Rright+Rleft)

Crig

ht/(C

right

+Cle

ft)

0 0.5 1

1

0.5

right

leftleftright

right

leftright

right

RR

RCC

C

θθ

⋅+=∞

+)(

R = reward

C =

cho

ice

Page 35: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University353535

MonkeysMonkeys’’ decisions:decisions: probability matchingprobability matching

Page 36: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University363636

Lost in translation?Lost in translation?

reward behaviour

dopamineresponse

plasticity in action circuits

Page 37: On neural correlates of reinforcement learning the role of ... · On neural correlates of reinforcement learning the role of dopamine in the role of dopamine in ... A simple association

January 10January 10January 10Haifa UniversityHaifa UniversityHaifa University373737

MonkeysMonkeys’’ decisions: shaping by dopaminedecisions: shaping by dopamine