RSS discussion of Girolami and Calderhead, October 13, 2010

About discretising Hamiltonians. Christian P. Robert, Université Paris-Dauphine and CREST, http://xianblog.wordpress.com. Royal Statistical Society, October 13, 2010.


Page 1: RSS discussion of Girolami and Calderhead, October 13, 2010

About discretising Hamiltonians

Christian P. Robert

Université Paris-Dauphine and CREST

http://xianblog.wordpress.com

Royal Statistical Society, October 13, 2010

Christian P. Robert About discretising Hamiltonians

Page 2: RSS discussion of Girolami and Calderhead, October 13, 2010

Hamiltonian dynamics

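The dynamic on the level sets of

$$H(\theta,p) = -\mathcal{L}(\theta) + \frac{1}{2}\log\{(2\pi)^D |G(\theta)|\} + \frac{1}{2}\, p^{\mathsf{T}} G(\theta)^{-1} p\,,$$

where $p$ is an auxiliary vector of dimension $D$, is associated with Hamilton's equations

$$\dot{\theta} = \frac{\partial H}{\partial p}(\theta,p)\,, \qquad \dot{p} = -\frac{\partial H}{\partial \theta}(\theta,p)\,,$$

which preserve the Hamiltonian $H(\theta,p)$ and hence the target distribution at all times $t$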
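A short check of the conservation claim, written as a sketch under the assumptions that $H$ is differentiable and that the flow follows Hamilton's equations $\dot{\theta} = \partial H/\partial p$, $\dot{p} = -\partial H/\partial \theta$. Applying the chain rule along a trajectory, the two terms cancel:

```latex
\begin{align*}
\frac{\mathrm{d}}{\mathrm{d}t} H(\theta_t, p_t)
  &= \frac{\partial H}{\partial \theta}^{\mathsf{T}} \dot{\theta}_t
   + \frac{\partial H}{\partial p}^{\mathsf{T}} \dot{p}_t \\
  &= \frac{\partial H}{\partial \theta}^{\mathsf{T}} \frac{\partial H}{\partial p}
   - \frac{\partial H}{\partial p}^{\mathsf{T}} \frac{\partial H}{\partial \theta}
   = 0 .
\end{align*}
```

The cancellation relies on the exact continuous-time flow; nothing in it guarantees the same identity for a discretised trajectory.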


Page 3: RSS discussion of Girolami and Calderhead, October 13, 2010

Discretised Hamiltonian

Girolami and Calderhead reproduce Hamiltonian equations within the simulation domain by discretisation via the generalised leapfrog(!) generator,

[Subliminal French bashing?!]


Page 5: RSS discussion of Girolami and Calderhead, October 13, 2010

Discretised Hamiltonian

Girolami and Calderhead reproduce Hamiltonian equations within the simulation domain by discretisation via the generalised leapfrog(!) generator, but...

invariance and stability properties of the [background] continuous-time process do not carry over to the discretised version of the process [e.g., Langevin]
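To make the point about discretisation concrete, here is a minimal Python sketch. It is not the generalised leapfrog of Girolami and Calderhead: it assumes a constant metric G = 1, for which the scheme reduces to the standard leapfrog, and a hypothetical one-dimensional log-density L(θ) = −θ⁴. The function names, starting point and step sizes are illustrative choices. The sketch tracks how far the Hamiltonian drifts from its initial value along the discretised trajectory, whereas the continuous flow would keep it constant.

```python
def log_target(theta):
    """Hypothetical log-density L(theta) = -theta^4 (up to a constant)."""
    return -theta ** 4


def grad_log_target(theta):
    return -4.0 * theta ** 3


def hamiltonian(theta, p):
    """H = -L(theta) + p^2 / 2, assuming G = 1 and dropping constants."""
    return -log_target(theta) + 0.5 * p ** 2


def leapfrog_step(theta, p, step):
    """Standard leapfrog: half step in p, full step in theta, half step in p."""
    p = p + 0.5 * step * grad_log_target(theta)
    theta = theta + step * p
    p = p + 0.5 * step * grad_log_target(theta)
    return theta, p


def max_energy_error(step, n_steps, theta=0.5, p=1.0):
    """Largest |H_k - H_0| observed along the discretised trajectory."""
    h0 = hamiltonian(theta, p)
    worst = 0.0
    for _ in range(n_steps):
        theta, p = leapfrog_step(theta, p, step)
        worst = max(worst, abs(hamiltonian(theta, p) - h0))
    return worst


# Same total integration time (5 units) with a coarse and a fine step.
print("step 0.10:", max_energy_error(step=0.1, n_steps=50))
print("step 0.01:", max_energy_error(step=0.01, n_steps=500))
```

The energy error shrinks with the step size but does not vanish, which is why a Metropolis-Hastings accept/reject step is still needed to target the exact distribution.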


Page 6: RSS discussion of Girolami and Calderhead, October 13, 2010

Discretised Hamiltonian (2)

Is it useful to so painstakingly reproduce the continuous behaviour?

Approximations (see R&R's Langevin) can be corrected by a Metropolis-Hastings step, so why bother with a second level of approximation?

Discretisation induces a calibration problem: how long is long enough?

Convergence issues (for the MCMC algorithm) should not be impacted by inexact renderings of the continuous time process in discrete time: loss of efficiency?
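The Metropolis-Hastings correction mentioned above can be sketched in Python as follows. This is an illustrative sketch, not code from the discussion: it assumes the one-dimensional target f(x) ∝ exp(−x⁴), an Euler proposal with scale τ, and hypothetical function names and tuning values. The accept/reject step accounts for the asymmetry of the proposal, so the chain targets f whatever the discretisation error.

```python
import math
import random


def log_target(x):
    """Assumed target: f(x) proportional to exp(-x^4)."""
    return -x ** 4


def grad_log_target(x):
    return -4.0 * x ** 3


def log_proposal(y, x, tau):
    """Log-density (up to a constant) of the Euler proposal y given x."""
    mean = x + 0.5 * tau ** 2 * grad_log_target(x)
    return -0.5 * ((y - mean) / tau) ** 2


def corrected_langevin(n_steps, tau, x0=0.0, seed=0):
    """Langevin proposal followed by a Metropolis-Hastings accept/reject step."""
    rng = random.Random(seed)
    x, chain, accepted = x0, [], 0
    for _ in range(n_steps):
        y = x + 0.5 * tau ** 2 * grad_log_target(x) + tau * rng.gauss(0.0, 1.0)
        log_ratio = (log_target(y) + log_proposal(x, y, tau)
                     - log_target(x) - log_proposal(y, x, tau))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x, accepted = y, accepted + 1
        chain.append(x)
    return chain, accepted / n_steps


chain, rate = corrected_langevin(20_000, tau=0.5)
estimate = sum(x * x for x in chain) / len(chain)
exact = math.gamma(0.75) / math.gamma(0.25)  # E[x^2] under exp(-x^4)
print("acceptance rate:", rate)
print("E[x^2] estimate:", estimate, "exact:", exact)
```

The comparison with the exact second moment is only a rough sanity check on one functional of the chain; it says nothing about efficiency.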


Page 7: RSS discussion of Girolami and Calderhead, October 13, 2010

An illustration

Comparison of the fits of discretised Langevin diffusion sequences to the target $f(x) \propto \exp(-x^4)$ when using a discretisation step $\sigma^2 = .1$ and $\sigma^2 = .0001$, after the same number $T = 10^7$ of steps.

[Figure: density plot; horizontal axis from −1.5 to 1.5, vertical axis "Density" from 0.0 to 0.6]


Page 8: RSS discussion of Girolami and Calderhead, October 13, 2010

An illustration

Comparison of the fits of discretised Langevin diffusion sequences to the target $f(x) \propto \exp(-x^4)$ when using a discretisation step $\sigma^2 = .1$ and $\sigma^2 = .0001$, after the same number $T = 10^7$ of steps.

[Figure: density plot; horizontal axis from −1.5 to 1.5, vertical axis "Density" from 0.0 to 0.8]


Page 9: RSS discussion of Girolami and Calderhead, October 13, 2010

An illustration

Comparison of the fits of discretised Langevin diffusion sequences to the target $f(x) \propto \exp(-x^4)$ when using a discretisation step $\sigma^2 = .1$ and $\sigma^2 = .0001$, after the same number $T = 10^7$ of steps.

[Figure: plot with one axis from −2 to 2 and one axis "time" from 0e+00 to 1e+05]
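The experiment behind this illustration can be approximated with the following Python sketch. It is a reconstruction under stated assumptions, not the code used for the slides: the sequences are generated by the plain Euler scheme with no accept/reject step, the number of steps is cut from 10⁷ to a hypothetical 20,000 to keep the run short, and the second moment is an arbitrary summary chosen here in place of the histograms.

```python
import math
import random


def grad_log_target(x):
    """Gradient of log f(x) for f(x) proportional to exp(-x^4)."""
    return -4.0 * x ** 3


def discretised_langevin(n_steps, sigma2, x0=0.0, seed=1):
    """Euler scheme x' = x + (sigma2 / 2) * grad log f(x) + sigma * eps, uncorrected."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    x, chain = x0, []
    for _ in range(n_steps):
        x = x + 0.5 * sigma2 * grad_log_target(x) + sigma * rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def second_moment(chain):
    return sum(x * x for x in chain) / len(chain)


# Exact value under the target: E[x^2] = Gamma(3/4) / Gamma(1/4).
exact = math.gamma(0.75) / math.gamma(0.25)

T = 20_000  # assumption: far fewer steps than the 10^7 of the slide
coarse = discretised_langevin(T, sigma2=0.1)
fine = discretised_langevin(T, sigma2=0.0001)
print("exact E[x^2]    :", exact)
print("sigma^2 = 0.1   :", second_moment(coarse))
print("sigma^2 = 0.0001:", second_moment(fine))
```

With the larger step the chain moves quickly but targets a perturbed distribution; with the smaller step each move is tiny, so the same number of steps covers much less diffusion time.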


Page 13: RSS discussion of Girolami and Calderhead, October 13, 2010

Back on Langevin

For the Langevin diffusion, the corresponding Langevin (discretised) algorithm could as well use another scale $\eta$ for the gradient, rather than the one $\tau$ used for the noise

$$y = x_t + \eta \nabla \log \pi(x_t) + \tau \epsilon_t$$

rather than a strict Euler discretisation

$$y = x_t + \frac{\tau^2}{2} \nabla \log \pi(x_t) + \tau \epsilon_t$$

A few experiments run in Robert and Casella (1999, Chap. 6, §6.5) hinted that using a scale $\eta \neq \tau^2/2$ could actually lead to improvements

Which [independent] framework should we adopt for assessing discretised diffusions?
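One way to experiment with a separate gradient scale is sketched below in Python. Everything specific here is an assumption made for illustration: the target exp(−x⁴), the Metropolis-Hastings correction, the values of η and τ, and the use of acceptance rate and expected squared jump distance (ESJD) as assessment criteria. It does not reproduce the experiments of Robert and Casella (1999).

```python
import math
import random


def log_target(x):
    """Assumed target: pi(x) proportional to exp(-x^4)."""
    return -x ** 4


def grad_log_target(x):
    return -4.0 * x ** 3


def log_q(y, x, eta, tau):
    """Log-density (up to a constant) of y = x + eta * grad log pi(x) + tau * eps."""
    return -0.5 * ((y - x - eta * grad_log_target(x)) / tau) ** 2


def two_scale_langevin(n_steps, eta, tau, x0=0.0, seed=2):
    """Metropolis-Hastings with gradient scale eta and noise scale tau."""
    rng = random.Random(seed)
    x, accepted, sq_jump = x0, 0, 0.0
    for _ in range(n_steps):
        y = x + eta * grad_log_target(x) + tau * rng.gauss(0.0, 1.0)
        log_ratio = (log_target(y) + log_q(x, y, eta, tau)
                     - log_target(x) - log_q(y, x, eta, tau))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            sq_jump += (y - x) ** 2
            x, accepted = y, accepted + 1
    return accepted / n_steps, sq_jump / n_steps


tau = 0.7
# eta = 0 is a random walk, eta = tau^2 / 2 the strict Euler choice.
for eta in (0.0, tau ** 2 / 2, tau ** 2):
    rate, esjd = two_scale_langevin(20_000, eta, tau)
    print(f"eta = {eta:.3f}: acceptance = {rate:.3f}, ESJD = {esjd:.4f}")
```

ESJD is only one possible criterion, and it is tied to the Metropolis-Hastings version of the algorithm, so it does not settle the question of an independent framework raised on the slide.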
