Bayesian theories of conditioning in a changing world
Advanced Signal Processing 2, SS2012
Alexander Melzer
SPSC, TU Graz
May 6, 2012
Introduction Classical and Statistical Conditioning Sigmoid Belief Networks References
Outline
1 Introduction
2 Classical and Statistical Conditioning
3 Sigmoid Belief Networks
Alexander Melzer SPSC, TU Graz Bayesian theories of conditioning in a changing world
Introduction
Issue: The finding that surprising events provoke animals to learn faster
Prediction of biologically significant events
Quantitative models of conditioning
Recent interest: Reframing in explicitly statistical terms
Introduction
Surprise causes faster learning due to signal change → increased uncertainty
Pearce’s theory of surprise in conditioning ↔ Bayesian inference
Change is a relatively unexplored aspect of the Bayesian model space
Stimuli differentiation
Conditioned stimuli, CSs
Neutral stimuli, events unknown to the animal
Examples: bells, lights

Unconditioned stimuli, USs
Biologically significant reinforcers for animals
Examples: food, shock

Conditioned responses, CRs
Animals' predictions under various patterns of CS/US pairings
Example: light → food
Stimuli differentiation - Pavlovian conditioning
Figure: Pavlovian conditioning
Bayesian accounts of conditioning and change
Interpret the animal's responding as a report of the likelihood of reinforcement, given its experience
Use conditioned responding to reflect subjects’ estimates
P(US(t)|CS(t),D) (1)
where D is the training history of CSs
Different Bayesian accounts can differ in what sort of model they assume → World Models
World Models

Discriminative
P(US(t)|CS(t),D) (2)
→ Reinforcement given the current stimuli
Generative
P(US(t),CS(t)|D) (3)
→ Predict full pattern of both stimuli and reinforcement
Change
How to incorporate the possibility of change?

w(t − 1) → w(t)

w(t) = [w_0(t), w_1(t), ..., w_n(t)]^T
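One simple way to realise the transition w(t − 1) → w(t) is a Gaussian random walk on the weight vector. This is only an illustrative sketch: the drift model and its scale `drift_std` are assumptions, not taken from the source.

```python
import numpy as np

def evolve_weights(w_prev, drift_std=0.05, rng=None):
    """Illustrative random-walk transition w(t-1) -> w(t).

    w_prev    : weight vector [w_0, ..., w_n] at trial t-1
    drift_std : assumed standard deviation of the per-trial drift
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Each weight drifts independently between trials, modelling a
    # (possibly) changing world.
    return w_prev + rng.normal(0.0, drift_std, size=len(w_prev))
```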
Historical models
Consider an experiment with a light l(t) and a sound s(t):

w(t) = [w_l(t), w_s(t)]^T

Pavlovian conditioning: positive association with reward r(t)

Δw_l(t) = α_l(t) (r(t) − w_l(t)) l(t)    (4)

l(t) ∈ {0, 1} ... presence of the light (CS)
r(t) ............ reward (US)
w_l(t) .......... strength of the expectation of reward
α_l(t) .......... learning rate

Similarly,

Δw_s(t) = α_s(t) (r(t) − w_s(t)) s(t)    (5)
Historical models
US-processing theory: Delta rule

w_l(t) = w_l(t − 1) + Δw_l(t),  Δw_l(t) = α_l(t) (r(t) − w_l(t)) l(t)    (6)

and

w_s(t) = w_s(t − 1) + Δw_s(t),  Δw_s(t) = α_s(t) (r(t) − w_s(t)) s(t)    (7)

Associative strength:

V(t) = w_l(t) l(t) + w_s(t) s(t)    (8)

Prediction error: δ(t) = r(t) − V(t)
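The delta rule above can be simulated directly. A minimal sketch, using the per-stimulus errors of (6) and (7) as written on this slide (the classic Rescorla-Wagner rule would use the shared error δ(t) instead); the fixed learning rate `alpha` is an illustrative choice:

```python
import numpy as np

def delta_rule(l, s, r, alpha=0.1):
    """Simulate Eqs. (6)-(8) over T trials.

    l, s : arrays of {0, 1} stimulus presence (light, sound)
    r    : reward on each trial
    Returns weight trajectories w_l, w_s and the prediction errors delta.
    """
    T = len(r)
    wl = np.zeros(T + 1)
    ws = np.zeros(T + 1)
    delta = np.zeros(T)
    for t in range(T):
        wl[t + 1] = wl[t] + alpha * (r[t] - wl[t]) * l[t]  # Eq. (6)
        ws[t + 1] = ws[t] + alpha * (r[t] - ws[t]) * s[t]  # Eq. (7)
        V = wl[t] * l[t] + ws[t] * s[t]                    # Eq. (8)
        delta[t] = r[t] - V                                # prediction error
    return wl, ws, delta
```

With the light always present and a constant reward, w_l converges towards r while w_s stays at zero.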
Paradigms in conditioning
Figure: Paradigms in conditioning (Dickinson, 1980; Mackintosh, 1983)
Paradigms in conditioning
Unblocking with qualitative change in reinforcement
Figure: Unblocking with qualitative change in reinforcement (Courville, Daw and Touretzky, 2006)
Paradigms in conditioning
Overshadowing counteracting latent inhibition

Figure: Overshadowing counteracting latent inhibition (Courville, Daw and Touretzky, 2006)
Paradigms in conditioning
Competition between different stimuli → competition between learning rates

Blocking: nothing unexpected happens when the second stimulus is added (it is shadowed)

Unblocking: learning resumes after a qualitative change in reinforcement (e.g. a delay between rewards)

Extension to the multivariate problem ⇒ statistical formulation
Statistical formulation
Parametrized probability distribution
P[r(t)|s(t), l(t)] (9)
Maximum likelihood inference → maximize the probability P over all samples
⇒ Three natural models of P[r(t)|s(t), l(t)]
Three natural models of P[r(t)|s(t), l(t)]
1) Rescorla-Wagner (Rescorla and Wagner, 1972)

P_G[r(t) | s(t), l(t)] = N[w_l l(t) + w_s s(t), σ²]    (10)

The only addition compared to (8) is the noise variance σ²
Learning of r(t) might be corrupted if substantial noise is present
Downwards unblocking suggests that animals are not using P_G as the basis for their predictions
Three natural models of P[r(t)|s(t), l(t)] (cont)

2) Competitive mixture of experts (Nowlan, 1991; Jacobs et al, 1991)

Mixture of Gaussians model, EM (Expectation-Maximization) algorithm

M step:

Δw_l(t) ∝ (r(t) − w_l(t)) q_l(t)    (11)

where

q_l(t) ∝ π_l(t) exp(−(r(t) − w_l l(t))² / 2σ²)    (12)

and π_l(t) (together with π_s(t)) are the mixing proportions.

P_M[r(t) | s(t), l(t)] = π_l(t) N[w_l, σ²] + π_s(t) N[w_s, σ²] + π̄(t) N[w̄, τ²]    (13)
Model captures downwards unblocking
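One E/M sweep of this competitive mixture can be sketched as follows; the values of σ², τ² and the learning rate are illustrative assumptions, and the mixing proportions are passed in as a fixed tuple for simplicity:

```python
import numpy as np

def competitive_update(r, l, s, wl, ws, wbar, pi,
                       sigma2=0.25, tau2=1.0, alpha=0.1):
    """One E/M sweep of the competitive mixture, Eqs. (11)-(13).

    pi = (pi_l, pi_s, pi_bar) : mixing proportions of the three experts
    """
    # E step: unnormalised responsibilities as in Eq. (12), one per expert,
    # then normalised so they sum to 1.
    ql = pi[0] * np.exp(-(r - wl * l) ** 2 / (2 * sigma2))
    qs = pi[1] * np.exp(-(r - ws * s) ** 2 / (2 * sigma2))
    qb = pi[2] * np.exp(-(r - wbar) ** 2 / (2 * tau2))
    Z = ql + qs + qb
    ql, qs, qb = ql / Z, qs / Z, qb / Z
    # M step: responsibility-weighted delta rules, Eq. (11)
    wl = wl + alpha * (r - wl) * ql * l
    ws = ws + alpha * (r - ws) * qs * s
    wbar = wbar + alpha * (r - wbar) * qb
    return wl, ws, wbar, (ql, qs, qb)
```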
Three natural models of P[r(t)|s(t), l(t)] (cont)
3) Cooperative mixture of experts (Jacobs et al, 1991)
P_J[r(t) | s(t), l(t)] = N[w_l π_l(t) l(t) + w_s π_s(t) s(t), σ²]    (14)

Idea:

P[w_l(t) | r] = N[r, ρ_l^{−1}(t)]    (15)
P[w_s(t) | r] = N[r, ρ_s^{−1}(t)]    (16)

where ρ_l(t) and ρ_s(t) are the inverse variances. Thus,

σ² = (ρ_l(t) + ρ_s(t))^{−1}    (17)

π_l(t) = ρ_l(t) σ²,  π_s(t) = ρ_s(t) σ²    (18)
Three natural models of P[r(t)|s(t), l(t)] (cont)
3) Cooperative mixture of experts (Jacobs et al, 1991)

Normative learning rule:

Δw_l = α_w (π_l(t) / ρ_l(t)) δ(t)    (19)

where δ(t) = r(t) − π_l(t) w_l(t) − π_s(t) w_s(t) is the prediction error
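The precision-weighted quantities (17)–(19) can be combined into a single update step; a sketch, with `alpha_w` as an illustrative learning rate:

```python
def cooperative_update(r, rho_l, rho_s, wl, ws, alpha_w=0.1):
    """One step of the cooperative mixture of experts, Eqs. (14)-(19).

    rho_l, rho_s : inverse variances (precisions) of the two experts
    """
    sigma2 = 1.0 / (rho_l + rho_s)               # Eq. (17)
    pi_l = rho_l * sigma2                        # Eq. (18); pi_l + pi_s = 1
    pi_s = rho_s * sigma2
    delta = r - pi_l * wl - pi_s * ws            # prediction error
    wl = wl + alpha_w * (pi_l / rho_l) * delta   # Eq. (19)
    ws = ws + alpha_w * (pi_s / rho_s) * delta
    return wl, ws, delta
```

Note that π_l/ρ_l = σ², so both weights share the same precision-scaled step size.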
Three natural models of P[r(t)|s(t), l(t)] (summary)
1) Rescorla-Wagner

P_G[r(t) | s(t), l(t)] = N[w_l l(t) + w_s s(t), σ²]

2) Competitive mixture of experts

P_M[r(t) | s(t), l(t)] = π_l(t) N[w_l, σ²] + π_s(t) N[w_s, σ²] + π̄(t) N[w̄, τ²]

3) Cooperative mixture of experts

P_J[r(t) | s(t), l(t)] = N[w_l π_l(t) l(t) + w_s π_s(t) s(t), σ²]
Second-order conditioning
Paradigm           Phase 1 training   Phase 2 training   Test
1st-order cond.    S1-US              -                  S1?
2nd-order cond.    S1-US              S2-S1              S2?
Sensory precond.   S2-S1              S1-US              S2?

Table: Phases of first- and second-order conditioning
Second-order conditioning (cont)
Figure: Transience of second-order conditioning (Gewirtz and Davis, 2000)
Second-order conditioning (cont)
Figure: Schematic representation of hypothetical associations (Gewirtz and Davis, 2000)
Second-order conditioning
Figure: Second-order fear conditioning (Gewirtz and Davis, 2000)
Sigmoid Belief Networks
Conditional probabilities defined as functions of weighted sums of parent nodes:

P(y_j = 1 | x_1, ..., x_c, w_m, m) = 1 / (1 + exp(−∑_i w_ij x_i − w_yj))    (20)

and

P(y_j = 0 | x_1, ..., x_c, w_m, m) = 1 − P(y_j = 1 | x_1, ..., x_c, w_m, m)

w_ij ... weight: influence of the parent node x_i on the child node y_j
w_yj ... bias term
w_m .... model parameters for model structure m
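Equation (20) is simply a logistic function of the summed parental input; a sketch:

```python
import math

def p_child_on(x, w, w_bias):
    """Conditional probability of a child node y_j = 1, Eq. (20).

    x      : list of parent states x_i in {0, 1}
    w      : list of weights w_ij from each parent i to this child
    w_bias : bias term w_yj
    """
    a = sum(wi * xi for wi, xi in zip(w, x)) + w_bias
    return 1.0 / (1.0 + math.exp(-a))
```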
Model representation
Directed graph model
Figure: Sigmoid Belief Network (Courville et al, 2003)
Sigmoid Belief Likelihood
Stimuli are mutually independent given the latent causes → conditional joint probability of the observed stimuli:

∏_{j=1}^{s} P(y_j | x_1, ..., x_c, w_m, m)

Similarly, we assume trials are drawn from a stationary process. Resulting likelihood function of the training data:

P(D | w_m, m) = ∏_{t=1}^{T} ∑_x ∏_{j=1}^{s} P(y_j(t) | x, w_m, m) P(x | w_m, m)    (21)
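For small binary networks, the sum over latent configurations x in (21) can be evaluated by brute-force enumeration. A sketch of the per-trial likelihood; the parameter layout (per-cause biases `wx` giving P(x_i = 1) = sigmoid(wx_i), per-stimulus biases `wy`) is an assumption for illustration:

```python
import itertools
import math

def trial_likelihood(y, w, wx, wy):
    """One trial's contribution to Eq. (21), marginalised over latent x.

    y  : observed stimulus states y_j in {0, 1}
    w  : w[i][j], weight from latent cause i to stimulus j
    wx : biases of the latent causes
    wy : biases of the stimuli
    """
    def sigmoid(a):
        return 1.0 / (1.0 + math.exp(-a))

    c, s = len(wx), len(wy)
    total = 0.0
    for x in itertools.product([0, 1], repeat=c):  # all latent configurations
        px = 1.0
        for i in range(c):                         # P(x | w_m, m)
            p1 = sigmoid(wx[i])
            px *= p1 if x[i] else 1.0 - p1
        py = 1.0
        for j in range(s):                         # prod_j P(y_j | x, w_m, m)
            p1 = sigmoid(sum(w[i][j] * x[i] for i in range(c)) + wy[j])
            py *= p1 if y[j] else 1.0 - p1
        total += px * py
    return total
```

The full likelihood of D is then the product of `trial_likelihood` over the T trials.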
Prediction under Parameter Uncertainty
Consider particular network structure m with parameters wm
Uncertainty associated with the parameters → posterior distribution over w_m:

p(w_m | m, D) ∝ P(D | w_m, m) p(w_m | m)

where the first factor is the likelihood (21) and the second is the prior distribution.

Assume the model parameters are a priori independent:

p(w_m | m) = ∏_{ij} p(w_ij) ∏_i p(w_xi) ∏_j p(w_yj)
Prediction under Parameter Uncertainty (cont)
Measure uncertainty by testing the CR (conditioned response):

P(US | CS, m, D) = ∫ P(US | CS, w_m, m, D) p(w_m | m, D) dw_m    (22)
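The integral (22) is typically intractable in closed form. One standard approximation is a Monte Carlo average over samples w^(k) from the parameter posterior (e.g. obtained by MCMC, as in Courville et al, 2003); the sampling mechanism itself is assumed given here:

```python
def predict_cr(posterior_samples, p_us_given_cs):
    """Monte Carlo approximation of Eq. (22).

    posterior_samples : iterable of parameter samples w^(k) ~ p(w_m | m, D)
    p_us_given_cs     : function w -> P(US | CS, w, m), the network's
                        prediction for fixed parameters
    """
    preds = [p_us_given_cs(w) for w in posterior_samples]
    return sum(preds) / len(preds)  # average over posterior samples
```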
Prediction under Model Uncertainty
Which model m is the correct one to choose?
Standard Bayesian approach: marginalize out the influence of the model choice

P(US | CS, D) = ∑_m P(US | CS, m, D) P(m | D)    (23)

Posterior over models:

P(m | D) = P(D | m) P(m) / ∑_{m'} P(D | m') P(m')

with the marginal likelihood

P(D | m) = ∫ P(D | w_m, m) p(w_m | m) dw_m
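Given per-model marginal likelihoods and priors, the averaging in (23) is just a posterior-weighted sum over the candidate structures; a sketch with illustrative inputs:

```python
import numpy as np

def model_average(preds, marg_liks, priors):
    """Bayesian model averaging, Eq. (23).

    preds     : P(US | CS, m, D) for each candidate structure m
    marg_liks : marginal likelihoods P(D | m)
    priors    : structure priors P(m)
    """
    post = np.array(marg_liks, dtype=float) * np.array(priors, dtype=float)
    post /= post.sum()                  # posterior P(m | D) over models
    return float(np.dot(post, preds)), post
```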
Prediction under Model Uncertainty (cont)
Tradeoff between model fidelity and complexity
Figure: Marginal likelihood (Courville et al, 2003)
Experiments and results
Figure: Experiments summary (Yin et al, 1994)
Experiments and results (cont)
Figure: Simulation results (Courville et al, 2003)
Experiments and results (cont)
Figure: Corresponding Sigmoid Belief Networks (Courville et al, 2003)
References
A. C. Courville, N. D. Daw, G. J. Gordon, and D. S. Touretzky. Model uncertainty in classical conditioning. In Advances in Neural Information Processing Systems, pages 977–984. MIT Press, 2003.

A. C. Courville, N. D. Daw, G. J. Gordon, and D. S. Touretzky. Bayesian theories of conditioning in a changing world. Trends in Cognitive Sciences, pages 294–300. Elsevier, 2006.

P. Dayan and T. Long. Statistical models of conditioning. In NIPS, pages 117–123. MIT Press, 1999.

J. C. Gewirtz and M. Davis. Using Pavlovian higher-order conditioning paradigms to investigate the neural substrates of emotional learning and memory. University of Minnesota, Minneapolis, 2000.