A vanilla Rao–Blackwellisation of Metropolis–Hastings algorithms

Randal DOUC and Christian ROBERT
Telecom SudParis, France

April 2009
Main themes

1. Rao–Blackwellisation of MCMC.
2. Can be performed in any Metropolis–Hastings algorithm.
3. Asymptotically more efficient than the usual MCMC estimate, with a controlled amount of additional calculation.
Outline

1. Introduction
2. Some properties of the HM algorithm
3. Rao–Blackwellisation
   - Variance reduction
   - Asymptotic results
4. Illustrations
5. Conclusion
Metropolis–Hastings algorithm

1. We wish to approximate
$$ I = \frac{\int h(x)\,\pi(x)\,dx}{\int \pi(x)\,dx} = \int h(x)\,\bar\pi(x)\,dx. $$
2. $x \mapsto \pi(x)$ is known, but $\int \pi(x)\,dx$ is not.
3. Approximate $I$ with $\delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)})$, where $(x^{(t)})$ is a Markov chain with limiting distribution $\bar\pi$.
4. Convergence is obtained from the Law of Large Numbers or the CLT for Markov chains.
Metropolis–Hastings algorithm

Suppose that $x^{(t)}$ has been drawn.

1. Simulate $y_t \sim q(\cdot \mid x^{(t)})$.
2. Set $x^{(t+1)} = y_t$ with probability
$$ \alpha(x^{(t)}, y_t) = \min\left\{1,\ \frac{\pi(y_t)}{\pi(x^{(t)})}\,\frac{q(x^{(t)} \mid y_t)}{q(y_t \mid x^{(t)})}\right\}. $$
Otherwise, set $x^{(t+1)} = x^{(t)}$.
3. $\alpha$ is chosen so that the detailed balance equation is satisfied:
$$ \pi(x)\,q(y \mid x)\,\alpha(x, y) = \pi(y)\,q(x \mid y)\,\alpha(y, x), $$
so $\bar\pi$ is the stationary distribution of $(x^{(t)})$.

Note that the accepted candidates are simulated with the rejection algorithm.
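As a concrete sketch of the two steps above (not part of the slides), here is a minimal random-walk Metropolis–Hastings sampler; the standard Gaussian target, the proposal scale and the chain length are illustrative assumptions.

```python
import math
import random

def mh_chain(log_pi, n, x0=0.0, tau=1.0, seed=0):
    """Random-walk Metropolis-Hastings for an unnormalised target.

    log_pi: log of the unnormalised target density pi(x).
    With a symmetric proposal q(y|x) = N(x, tau^2), the ratio
    q(x|y)/q(y|x) cancels and alpha reduces to min(1, pi(y)/pi(x)).
    """
    rng = random.Random(seed)
    x = x0
    chain = [x]
    for _ in range(n - 1):
        y = x + tau * rng.gauss(0.0, 1.0)            # simulate y_t ~ q(.|x_t)
        alpha = min(1.0, math.exp(log_pi(y) - log_pi(x)))
        if rng.random() < alpha:                     # accept with probability alpha
            x = y
        chain.append(x)                              # otherwise x_{t+1} = x_t
    return chain

# Unnormalised N(0,1) target: pi(x) = exp(-x^2 / 2)
chain = mh_chain(lambda x: -0.5 * x * x, 20000, tau=2.5)
delta = sum(chain) / len(chain)                      # estimates E[X] = 0
```

Because the proposal is symmetric, the code never needs the normalising constant of $\pi$, which is the whole point of the algorithm.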
An alternative representation of the estimator $\delta$ is
$$ \delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\,h(z_i), $$
where
- the $z_i$'s are the accepted $y_j$'s,
- $M_N$ is the number of accepted $y_j$'s up to time $N$,
- $n_i$ is the number of times $z_i$ appears in the sequence $(x^{(t)})_t$.
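The identity above can be checked numerically by recording the accepted states and their occupation counts during a run; the Gaussian target and all tuning values below are illustrative assumptions, not from the slides.

```python
import math
import random

def mh_accept_records(log_pi, n, x0=0.0, tau=1.0, seed=1):
    """Run random-walk MH, recording the accepted values z_i together
    with their occupation counts n_i (copies of z_i in the chain)."""
    rng = random.Random(seed)
    x = x0
    chain = [x0]
    zs, ns = [x0], [1]
    for _ in range(n - 1):
        y = x + tau * rng.gauss(0.0, 1.0)
        if rng.random() < min(1.0, math.exp(log_pi(y) - log_pi(x))):
            x = y
            zs.append(x)
            ns.append(1)          # a new accepted state z_i starts its run
        else:
            ns[-1] += 1           # one more copy of the current z_i
        chain.append(x)
    return chain, zs, ns

h = lambda x: x * x
chain, zs, ns = mh_accept_records(lambda x: -0.5 * x * x, 5000, tau=2.0)
N = len(chain)
delta_raw = sum(h(x) for x in chain) / N
delta_z = sum(n_i * h(z_i) for z_i, n_i in zip(zs, ns)) / N
# the two representations of delta coincide (up to float rounding)
```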
The accepted states move according to the transition density
$$ \tilde q(\cdot \mid z_i) = \frac{\alpha(z_i, \cdot)\,q(\cdot \mid z_i)}{p(z_i)} \le \frac{q(\cdot \mid z_i)}{p(z_i)}, $$
where $p(z_i) = \int \alpha(z_i, y)\,q(y \mid z_i)\,dy$. To simulate according to $\tilde q(\cdot \mid z_i)$:

1. Propose a candidate $y \sim q(\cdot \mid z_i)$.
2. Accept it with probability
$$ \tilde q(y \mid z_i)\Big/\left(\frac{q(y \mid z_i)}{p(z_i)}\right) = \alpha(z_i, y). $$
Otherwise, reject it and start again.
3. This is exactly the transition of the HM algorithm.

The transition kernel $\tilde q$ admits $\tilde\pi$ as a stationary distribution:
$$ \underbrace{\frac{\pi(x)\,p(x)}{\int \pi(u)\,p(u)\,du}}_{\tilde\pi(x)}\ \underbrace{\frac{\alpha(x, y)\,q(y \mid x)}{p(x)}}_{\tilde q(y \mid x)} = \frac{\pi(x)\,\alpha(x, y)\,q(y \mid x)}{\int \pi(u)\,p(u)\,du} = \frac{\pi(y)\,\alpha(y, x)\,q(x \mid y)}{\int \pi(u)\,p(u)\,du} = \tilde\pi(y)\,\tilde q(x \mid y), $$
that is, $\tilde\pi(x)\,\tilde q(y \mid x) = \tilde\pi(y)\,\tilde q(x \mid y)$: detailed balance holds for $\tilde q$ with respect to $\tilde\pi$.
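The rejection mechanism above can be sketched as follows; the Gaussian target, the random-walk proposal and the state $z = 0$ are illustrative assumptions. The number of attempts needed is exactly the geometric count $n_i$, with success probability $p(z)$.

```python
import math
import random

def sample_qtilde(z, log_pi, tau, rng):
    """Simulate one draw from q~(.|z) by rejection: propose y ~ q(.|z)
    and accept it with probability alpha(z, y); the number of attempts
    is geometric with success probability p(z)."""
    attempts = 0
    while True:
        attempts += 1
        y = z + tau * rng.gauss(0.0, 1.0)                    # y ~ q(.|z)
        if rng.random() < min(1.0, math.exp(log_pi(y) - log_pi(z))):
            return y, attempts

rng = random.Random(2)
log_pi = lambda x: -0.5 * x * x                              # unnormalised N(0,1)
draws = [sample_qtilde(0.0, log_pi, 2.0, rng) for _ in range(20000)]
n_bar = sum(a for _, a in draws) / len(draws)                # estimates 1/p(0)
```

For this particular $z$ and $\tau$, $p(0)$ can be computed in closed form as $1/\sqrt{5}$, so the average number of attempts should settle near $\sqrt{5} \approx 2.24$.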
Lemma

The sequence $(z_i, n_i)$ satisfies:

1. $(z_i, n_i)_i$ is a Markov chain;
2. $z_{i+1}$ and $n_i$ are independent given $z_i$;
3. $n_i$ is distributed as a geometric random variable with probability parameter
$$ p(z_i) := \int \alpha(z_i, y)\,q(y \mid z_i)\,dy; \tag{1} $$
4. $(z_i)_i$ is a Markov chain with transition kernel $\tilde Q(z, dy) = \tilde q(y \mid z)\,dy$ and stationary distribution $\tilde\pi$ such that
$$ \tilde q(\cdot \mid z) \propto \alpha(z, \cdot)\,q(\cdot \mid z) \qquad\text{and}\qquad \tilde\pi(\cdot) \propto \pi(\cdot)\,p(\cdot). $$
Dependence structure (figure): along the chain $z_{i-1} \to z_i \to z_{i+1}$, each occupation count $n_i$ is generated from $z_i$ and is independent of the neighbouring states, and
$$ \delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\,h(z_i). $$
1. A natural idea:
$$ \delta^* = \frac{1}{N}\sum_{i=1}^{M_N} \frac{h(z_i)}{p(z_i)} \;\simeq\; \frac{\sum_{i=1}^{M_N} h(z_i)/p(z_i)}{\sum_{i=1}^{M_N} 1/p(z_i)} = \frac{\sum_{i=1}^{M_N} \frac{\pi(z_i)}{\tilde\pi(z_i)}\,h(z_i)}{\sum_{i=1}^{M_N} \frac{\pi(z_i)}{\tilde\pi(z_i)}}. $$
2. But $p$ is not available in closed form.
3. The geometric $n_i$ is the obvious substitute, the one used in the original Metropolis–Hastings estimate:
$$ n_i = 1 + \sum_{j=1}^{\infty}\ \prod_{\ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\}. $$
Lemma

If $(y_j)_j$ is an iid sequence with distribution $q(y \mid z_i)$, the quantity
$$ \hat\xi_i = 1 + \sum_{j=1}^{\infty}\ \prod_{\ell \le j}\,\{1 - \alpha(z_i, y_\ell)\} $$
is an unbiased estimator of $1/p(z_i)$ whose variance, conditional on $z_i$, is lower than the conditional variance of $n_i$, namely $\{1 - p(z_i)\}/p^2(z_i)$.
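The lemma's estimator can be simulated exactly for a symmetric random-walk proposal, since the running product vanishes as soon as a candidate with $\pi(y_\ell) \ge \pi(z_i)$ appears; the target, scale, state and sample size below are illustrative assumptions.

```python
import math
import random

def xi_hat(z, log_pi, tau, rng):
    """Exact draw of xi_i = 1 + sum_j prod_{l<=j} (1 - alpha(z, y_l)) for a
    symmetric random-walk proposal: the running product hits zero as soon
    as some alpha(z, y_l) = 1 (i.e. pi(y_l) >= pi(z)), so the sum is
    almost surely finite away from the mode of pi."""
    total, prod = 1.0, 1.0
    while prod > 0.0:
        y = z + tau * rng.gauss(0.0, 1.0)            # y_l ~ q(.|z), symmetric
        alpha = min(1.0, math.exp(log_pi(y) - log_pi(z)))
        prod *= 1.0 - alpha
        total += prod
    return total

log_pi = lambda x: -0.5 * x * x                      # unnormalised N(0,1)
z, tau = 1.0, 2.0

# reference value of p(z) by crude quadrature on [-10, 10]
dy = 1e-3
p = sum(min(1.0, math.exp(log_pi(-10 + k * dy) - log_pi(z)))
        * math.exp(-((-10 + k * dy - z) ** 2) / (2 * tau ** 2))
        / (tau * math.sqrt(2 * math.pi)) * dy
        for k in range(20001))

rng = random.Random(3)
mean_xi = sum(xi_hat(z, log_pi, tau, rng) for _ in range(20000)) / 20000
# mean_xi is a Monte Carlo estimate of the unbiased target 1/p(z)
```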
Recall
$$ \hat\xi_i = 1 + \sum_{j=1}^{\infty}\ \prod_{\ell \le j}\,\{1 - \alpha(z_i, y_\ell)\}. $$

1. This is an infinite sum, but it is sometimes finite: since
$$ \alpha(x^{(t)}, y_t) = \min\left\{1,\ \frac{\pi(y_t)}{\pi(x^{(t)})}\,\frac{q(x^{(t)} \mid y_t)}{q(y_t \mid x^{(t)})}\right\}, $$
the running product vanishes as soon as some $\alpha(z_i, y_\ell) = 1$. For example, take a symmetric random walk as a proposal.
2. What if we wish to be sure that the sum is finite?
Variance reduction

Proposition

If $(y_j)_j$ is an iid sequence with distribution $q(y \mid z_i)$ and $(u_j)_j$ is an iid uniform sequence, then, for any $k \ge 0$, the quantity
$$ \hat\xi_i^{\,k} = 1 + \sum_{j=1}^{\infty}\ \prod_{1 \le \ell \le k \wedge j}\{1 - \alpha(z_i, y_\ell)\}\ \prod_{k+1 \le \ell \le j}\mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\} \tag{2} $$
is an unbiased estimator of $1/p(z_i)$ with an almost surely finite number of terms. Moreover, for $k \ge 1$,
$$ \mathbb{V}\big[\hat\xi_i^{\,k}\ \big|\ z_i\big] = \frac{1 - p(z_i)}{p^2(z_i)} - \frac{1 - \big(1 - 2p(z_i) + r(z_i)\big)^k}{2p(z_i) - r(z_i)}\left(\frac{2 - p(z_i)}{p^2(z_i)}\right)\big(p(z_i) - r(z_i)\big), $$
where $p(z_i) := \int \alpha(z_i, y)\,q(y \mid z_i)\,dy$ and $r(z_i) := \int \alpha^2(z_i, y)\,q(y \mid z_i)\,dy$.
Therefore, we have
$$ \mathbb{V}\big[\hat\xi_i\ \big|\ z_i\big] \le \mathbb{V}\big[\hat\xi_i^{\,k}\ \big|\ z_i\big] \le \mathbb{V}\big[\hat\xi_i^{\,0}\ \big|\ z_i\big] = \mathbb{V}[n_i \mid z_i]. $$
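A sketch of an exact draw of $\hat\xi_i^{\,k}$, again under an assumed Gaussian target and random-walk proposal: the first $k$ proposals contribute Rao–Blackwellised factors $(1-\alpha)$ and the later ones ordinary acceptance indicators, so evaluation stops at the first acceptance after rank $k$. Comparing $k = 3$ with $k = 0$ (which reproduces $n_i$) illustrates both the common mean $1/p(z)$ and the variance ordering.

```python
import math
import random

def xi_hat_k(z, log_pi, tau, k, rng):
    """Exact draw of the estimator (2): Rao-Blackwellised factors
    (1 - alpha) for the first k proposals, then plain acceptance
    indicators, so the sum stops at the first u_l < alpha(z, y_l), l > k."""
    total, prod = 1.0, 1.0
    for _ in range(k):                      # terms j = 1, ..., k
        y = z + tau * rng.gauss(0.0, 1.0)
        prod *= 1.0 - min(1.0, math.exp(log_pi(y) - log_pi(z)))
        total += prod
    while prod > 0.0:                       # terms j > k, each equal to prod
        y = z + tau * rng.gauss(0.0, 1.0)
        if rng.random() < min(1.0, math.exp(log_pi(y) - log_pi(z))):
            break                           # indicator is 0 from here on
        total += prod
    return total

log_pi = lambda x: -0.5 * x * x             # unnormalised N(0,1)
z, tau = 1.0, 2.0
rng = random.Random(4)
vals_k = [xi_hat_k(z, log_pi, tau, 3, rng) for _ in range(20000)]
vals_0 = [xi_hat_k(z, log_pi, tau, 0, rng) for _ in range(20000)]
mean_k = sum(vals_k) / len(vals_k)          # both estimate 1/p(z) ...
mean_0 = sum(vals_0) / len(vals_0)          # ... but k = 3 has smaller variance
```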
Variance reduction

Dependence structure (figure): along the chain $z_{i-1} \to z_i \to z_{i+1}$, the weights $\hat\xi_{i-1}^{\,k}$ and $\hat\xi_i^{\,k}$ are, unlike the $n_i$'s, not independent of the neighbouring states. Set
$$ \delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^{\,k}\,h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^{\,k}}. $$
Asymptotic results

Let
$$ \delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^{\,k}\,h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^{\,k}}. $$
For any positive function $\varphi$, we denote $\mathcal{C}_\varphi = \{h;\ |h/\varphi|_\infty < \infty\}$. Assume that there exists a positive function $\varphi \ge 1$ such that
$$ \forall h \in \mathcal{C}_\varphi,\qquad \frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} \xrightarrow{\ P\ } \pi(h). $$

Theorem

Under the assumption that $\pi(p) > 0$, the following convergence property holds:

i) If $h$ is in $\mathcal{C}_\varphi$, then
$$ \delta_M^k \xrightarrow[M \to \infty]{\ P\ } \pi(h) \qquad (\text{consistency}). $$
Assume further that there exists a positive function $\psi$ such that
$$ \forall h \in \mathcal{C}_\psi,\qquad \sqrt{M}\left(\frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} - \pi(h)\right) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}\big(0, \Gamma(h)\big). $$

Theorem

Under the assumption that $\pi(p) > 0$, the following convergence property holds:

ii) If, in addition, $h^2/p \in \mathcal{C}_\varphi$ and $h \in \mathcal{C}_\psi$, then
$$ \sqrt{M}\big(\delta_M^k - \pi(h)\big) \xrightarrow[M \to \infty]{\ \mathcal{L}\ } \mathcal{N}\big(0, V_k[h - \pi(h)]\big) \qquad (\text{CLT}), $$
where
$$ V_k(h) := \pi(p)\int \pi(dz)\,\mathbb{V}\big[\hat\xi_i^{\,k}\ \big|\ z\big]\,h^2(z)\,p(z) + \Gamma(h). $$
Asymptotic results

We will need some additional assumptions. Assume a maximal inequality for the Markov chain $(z_i)_i$: there exists a measurable function $\zeta$ such that, for any starting point $x$,
$$ \forall h \in \mathcal{C}_\zeta,\qquad P_x\left(\sup_{0 \le i \le N}\left|\sum_{j=0}^{i}\big[h(z_j) - \tilde\pi(h)\big]\right| > \epsilon\right) \le \frac{N\,C_h(x)}{\epsilon^2}. $$
Moreover, assume that there exists $\phi \ge 1$ such that, for any starting point $x$,
$$ \forall h \in \mathcal{C}_\phi,\qquad \tilde Q^n(x, h) \xrightarrow{\ P\ } \tilde\pi(h) = \pi(ph)/\pi(p). $$
Asymptotic results

Theorem

Assume that $h$ is such that $h/p \in \mathcal{C}_\zeta$ and $\{C_{h/p},\ h^2/p^2\} \subset \mathcal{C}_\phi$. Assume moreover that
$$ \sqrt{M}\big(\delta_M^0 - \pi(h)\big) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}\big(0, V_0[h - \pi(h)]\big). $$
Then, for any starting point $x$,
$$ \sqrt{M_N}\left(\frac{\sum_{t=1}^{N} h(x^{(t)})}{N} - \pi(h)\right) \xrightarrow[N \to \infty]{\ \mathcal{L}\ } \mathcal{N}\big(0, V_0[h - \pi(h)]\big), $$
where $M_N$ is defined by
$$ \sum_{i=1}^{M_N} \hat\xi_i^{\,0} \le N < \sum_{i=1}^{M_N + 1} \hat\xi_i^{\,0}. \tag{3} $$
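The budget-matching index $M_N$ of (3) can be computed with a short scan of the realised weights; the occupation counts used below are hypothetical values chosen only for illustration.

```python
def m_n(weights, N):
    """Largest M_N with sum_{i <= M_N} xi_i^0 <= N; by construction
    N < sum_{i <= M_N + 1} xi_i^0 when enough weights are supplied."""
    total, m = 0, 0
    for w in weights:
        if total + w > N:       # adding the next weight would exceed the budget
            break
        total += w
        m += 1
    return m

ns = [3, 1, 4, 2, 5]            # hypothetical realisations of xi_i^0 = n_i
budget_10 = m_n(ns, 10)         # cumulative sums: 3, 4, 8, 10, 15
```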
Illustrations
Figure: Overlay of the variations of 250 iid realisations of the estimates $\delta$ (gold) and $\delta^\infty$ (grey) of $E[X] = 0$ for 1000 iterations, along with the 90% interquantile range for the estimates $\delta$ (brown) and $\delta^\infty$ (pink), in the setting of a Gaussian random-walk proposal with scale $\tau = 10$.
Figure: Overlay of the variations of 500 iid realisations of the estimates $\delta$ (deep grey), $\delta^\infty$ (medium grey) and of the importance-sampling version (light grey) of $E[X] = 10$ when $X \sim \mathrm{Exp}(0.1)$, for 100 iterations, along with the 90% interquantile ranges (same colour code), in the setting of an independent exponential proposal with scale $\mu = 0.02$.
Consider the geometric target and random-walk proposal
$$ \pi(x) = \beta(1-\beta)^x \qquad\text{and}\qquad 2\,q(y \mid x) = \begin{cases} \mathbb{I}_{|x-y|=1} & \text{if } x > 0, \\ \mathbb{I}_{|y|\le 1} & \text{if } x = 0. \end{cases} $$
For this problem,
$$ p(x) = 1 - \beta/2 \qquad\text{and}\qquad r(x) = 1 - \beta + \beta^2/2. $$
We can therefore compute the gain in variance
$$ \frac{p(x) - r(x)}{2p(x) - r(x)}\,\frac{2 - p(x)}{p^2(x)} = 2\,\frac{\beta(1-\beta)(2+\beta)}{(2-\beta^2)(2-\beta)^2}, $$
which is optimal for $\beta = 0.174$, leading to a gain of $0.578$, while the relative gain in variance is
$$ \frac{p(x) - r(x)}{2p(x) - r(x)}\,\frac{2 - p(x)}{1 - p(x)} = \frac{(1-\beta)(2+\beta)}{2-\beta^2}\,, $$
which is decreasing in $\beta$.
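The closed forms for $p(x)$ and $r(x)$ can be verified directly at an interior state, where the two possible moves have acceptance probabilities $1-\beta$ (up) and $1$ (down); $\beta = 0.3$ below is an arbitrary illustrative value, and the boundary state $x = 0$ is deliberately left aside.

```python
def p_and_r(beta):
    """At an interior state x >= 2 of the geometric target
    pi(x) = beta * (1 - beta)**x with the +/-1 random-walk proposal:
    alpha(x, x + 1) = min(1, 1 - beta) = 1 - beta and alpha(x, x - 1) = 1,
    each move proposed with probability 1/2, so p(x) = E[alpha] and
    r(x) = E[alpha**2] under q(.|x)."""
    a_up, a_down = 1.0 - beta, 1.0
    p = 0.5 * a_up + 0.5 * a_down
    r = 0.5 * a_up ** 2 + 0.5 * a_down ** 2
    return p, r

beta = 0.3
p, r = p_and_r(beta)   # should match 1 - beta/2 and 1 - beta + beta^2/2
```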
Conclusion
a) Rao–Blackwellisation of any HM algorithm with a controlled amount of additional calculation.
b) Link with the importance sampling of Markov chains.
c) Analysis with asymptotic results on triangular arrays.