![Page 1: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/1.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling MethodsBishop PRML Ch. 11
Alireza Ghane
Sampling Methods A. Ghane / T. Möller / G. Mori 1
![Page 2: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/2.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Recall – Inference For General Graphs
• Junction tree algorithm is an exact inference method for arbitrarygraphs
• A particular tree structure defined over cliques of variables• Inference ends up being exponential in maximum clique size• Therefore slow in many cases
• Sampling methods: represent desired distribution with a set ofsamples, as more samples are used, obtain more accuraterepresentation
Sampling Methods A. Ghane / T. Möller / G. Mori 2
![Page 3: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/3.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Outline
Sampling
Rejection Sampling
Importance Sampling
Markov Chain Monte Carlo
Sampling Methods A. Ghane / T. Möller / G. Mori 3
![Page 4: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/4.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Outline
Sampling
Rejection Sampling
Importance Sampling
Markov Chain Monte Carlo
Sampling Methods A. Ghane / T. Möller / G. Mori 4
![Page 5: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/5.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling
• The fundamental problem we address in this lecture is how toobtain samples from a probability distribution p(z)
• This could be a conditional distribution p(z|e)• We often wish to evaluate expectations such as
E[f ] =∫f(z)p(z)dz
• e.g. mean when f(z) = z
• For complicated p(z), this is difficult to do exactly, approximateas
f̂ =1
L
L∑l=1
f(z(l))
where {z(l)|l = 1, . . . , L} are independent samples from p(z)
Sampling Methods A. Ghane / T. Möller / G. Mori 5
![Page 6: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/6.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling
p(z) f(z)
z
• Approximate
f̂ =1
L
L∑l=1
f(z(l))
where {z(l)|l = 1, . . . , L} are independent samples from p(z)
Sampling Methods A. Ghane / T. Möller / G. Mori 6
![Page 7: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/7.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Bayesian Networks - Generating Fair Samples
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
• How can we generate a fair set of samples from this BN?Sampling Methods A. Ghane / T. Möller / G. Mori 7
![Page 8: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/8.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling from Bayesian Networks
• Sampling from discrete Bayesian networks with no observationsis straight-forward, using ancestral sampling
• Bayesian network specifies factorization of joint distribution
P (z1, . . . , zn) =
n∏i=1
P (zi|pa(zi))
• Sample in-order, sample parents before children• Possible because graph is a DAG
• Choose value for zi from p(zi|pa(zi))
Sampling Methods A. Ghane / T. Möller / G. Mori 8
![Page 9: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/9.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling From Empty Network – Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
Sampling Methods A. Ghane / T. Möller / G. Mori 9
![Page 10: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/10.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling From Empty Network – Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
Sampling Methods A. Ghane / T. Möller / G. Mori 10
![Page 11: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/11.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling From Empty Network – Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
Sampling Methods A. Ghane / T. Möller / G. Mori 11
![Page 12: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/12.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling From Empty Network – Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
Sampling Methods A. Ghane / T. Möller / G. Mori 12
![Page 13: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/13.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling From Empty Network – Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
Sampling Methods A. Ghane / T. Möller / G. Mori 13
![Page 14: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/14.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling From Empty Network – Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
Sampling Methods A. Ghane / T. Möller / G. Mori 14
![Page 15: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/15.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling From Empty Network – Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
Sampling Methods A. Ghane / T. Möller / G. Mori 15
![Page 16: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/16.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Ancestral Sampling
• This sampling procedure is fair, the fraction of samples with aparticular value tends towards the joint probability of that value
• Define SPS(z1, . . . , zn) to be the probability of generating theevent (z1, . . . , zn)
• This is equal to p(z1, . . . , zn) due to the semantics of theBayesian network
• Define NPS(z1, . . . , zn) to be the number of times we generatethe event (z1, . . . , zn)
limN→∞
NPS(z1, . . . , zn)
N= SPS(z1, . . . , zn) = p(z1, . . . , zn)
Sampling Methods A. Ghane / T. Möller / G. Mori 16
![Page 17: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/17.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling Marginals
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01
• Note that this procedure can be applied togenerate samples for marginals as well
• Simply discard portions of sample whichare not needed
• e.g. For marginal p(rain), sample(cloudy = t, sprinkler = f, rain =t, wg = t) just becomes (rain = t)
• Still a fair sampling procedure
Sampling Methods A. Ghane / T. Möller / G. Mori 17
![Page 18: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/18.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling with Evidence
• What if we observe some values and want samples from p(z|e)?• Naive method, logic sampling:
• Generate N samples from p(z) using ancestral sampling• Discard those samples that do not have correct evidence values
• e.g. For p(rain|cloudy = t, spr = t, wg = t), sample(cloudy = t, spr = f, rain = t, wg = t) discarded
• Generates fair samples, but wastes time• Many samples will be discarded for low p(e)
Sampling Methods A. Ghane / T. Möller / G. Mori 18
![Page 19: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/19.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Other Problems
• Continuous variables?• Gaussian okay, Box-Muller and other methods• More complex distributions?
• Undirected graphs (MRFs)?
Sampling Methods A. Ghane / T. Möller / G. Mori 19
![Page 20: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/20.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Outline
Sampling
Rejection Sampling
Importance Sampling
Markov Chain Monte Carlo
Sampling Methods A. Ghane / T. Möller / G. Mori 20
![Page 21: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/21.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Rejection Sampling
p(z) f(z)
z
• Consider the case of an arbitrary, continuous p(z)• How can we draw samples from it?• Assume we can evaluate p(z), up to some normalization constant
p(z) =1
Zpp̃(z)
where p̃(z) can be efficiently evaluated (e.g. MRF)
Sampling Methods A. Ghane / T. Möller / G. Mori 21
![Page 22: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/22.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Proposal Distribution
z0 z
u0
kq(z0) kq(z)
p̃(z)
• Let’s also assume that we have some simpler distribution q(z)called a proposal distribution from which we can easily drawsamples
• e.g. q(z) is a Gaussian
• We can then draw samples from q(z) and use these• But these wouldn’t be fair samples from p(z)?!
Sampling Methods A. Ghane / T. Möller / G. Mori 22
![Page 23: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/23.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Comparison Function and Rejection
z0 z
u0
kq(z0) kq(z)
p̃(z)
• Introduce constant k such that kq(z) ≥ p̃(z) for all z• Rejection sampling procedure:
• Generate z0 from q(z)• Generate u0 from [0, kq(z0)] uniformly• If u0 > p̃(z) reject sample z0, otherwise keep it
• Original samples are uniform in grey region• Kept samples uniform in white region – hence samples from p(z)
Sampling Methods A. Ghane / T. Möller / G. Mori 23
![Page 24: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/24.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Rejection Sampling Analysis
• How likely are we to keep samples?• Probability a sample is accepted is:
p(accept) =
∫{p̃(z)/kq(z)}q(z)dz
=1
k
∫p̃(z)dz
• Smaller k is better subject to kq(z) ≥ p̃(z) for all z• If q(z) is similar to p̃(z), this is easier
• In high-dim spaces, acceptance ratio falls off exponentially• Finding a suitable k challenging
Sampling Methods A. Ghane / T. Möller / G. Mori 24
![Page 25: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/25.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Outline
Sampling
Rejection Sampling
Importance Sampling
Markov Chain Monte Carlo
Sampling Methods A. Ghane / T. Möller / G. Mori 25
![Page 26: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/26.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Discretization
• Importance sampling is a sampling technique for computingexpectations:
E[f ] =∫f(z)p(z)dz
• Could approximate using discretization over a uniform grid:
E[f ] ≈L∑l=1
f(z(l))p(z(l))
• c.f. Riemannian sum
• Much wasted computation, exponential scaling in dimension• Instead, again use a proposal distribution instead of a uniform
grid
Sampling Methods A. Ghane / T. Möller / G. Mori 26
![Page 27: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/27.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Importance samplingp(z) f(z)
z
q(z)
• Approximate expectation
E[f ] =
∫f(z)p(z)dz =
∫f(z)
p(z)
q(z)q(z)dz
≈ 1
L
L∑l=1
f(z(l))p(z(l))
q(z(l))
• Quantities p(z(l))/q(z(l)) are known as importance weights• Correct for use of wrong distribution q(z) in sampling
Sampling Methods A. Ghane / T. Möller / G. Mori 27
![Page 28: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/28.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Likelihood Weighted Sampling• Consider the case where we have evidence e and again desire an
expectation over p(x|e)• If we have a Bayesian network, we can use a particular type of
importance sampling called likelihood weighted sampling:• Perform ancestral sampling• If a variable zi is in the evidence set, set its value rather than
sampling
• Importance weights are: ??
p(z(l))
q(z(l))= ?
p(z(l))
q(z(l))=
p(x, e)
p(e)
1∏zi 6∈e p(zi|pai)
∝∏zi∈e
p(zi|pai)
Sampling Methods A. Ghane / T. Möller / G. Mori 28
![Page 29: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/29.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Likelihood Weighted Sampling• Consider the case where we have evidence e and again desire an
expectation over p(x|e)• If we have a Bayesian network, we can use a particular type of
importance sampling called likelihood weighted sampling:• Perform ancestral sampling• If a variable zi is in the evidence set, set its value rather than
sampling
• Importance weights are:
??
p(z(l))
q(z(l))= ?
p(z(l))
q(z(l))=
p(x, e)
p(e)
1∏zi 6∈e p(zi|pai)
∝∏zi∈e
p(zi|pai)
Sampling Methods A. Ghane / T. Möller / G. Mori 29
![Page 30: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/30.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Likelihood Weighted Sampling• Consider the case where we have evidence e and again desire an
expectation over p(x|e)• If we have a Bayesian network, we can use a particular type of
importance sampling called likelihood weighted sampling:• Perform ancestral sampling• If a variable zi is in the evidence set, set its value rather than
sampling
• Importance weights are:
??
p(z(l))
q(z(l))= ?
p(z(l))
q(z(l))=
p(x, e)
p(e)
1∏zi 6∈e p(zi|pai)
∝∏zi∈e
p(zi|pai)
Sampling Methods A. Ghane / T. Möller / G. Mori 30
![Page 31: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/31.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Likelihood Weighted Sampling Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
w = 1.0
Sampling Methods A. Ghane / T. Möller / G. Mori 31
![Page 32: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/32.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Likelihood Weighted Sampling Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
w = 1.0
Sampling Methods A. Ghane / T. Möller / G. Mori 32
![Page 33: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/33.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Likelihood Weighted Sampling Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
w = 1.0× 0.1
Sampling Methods A. Ghane / T. Möller / G. Mori 33
![Page 34: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/34.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Likelihood Weighted Sampling Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
w = 1.0× 0.1
Sampling Methods A. Ghane / T. Möller / G. Mori 34
![Page 35: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/35.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Likelihood Weighted Sampling Example
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01[from Russell and Norvig, AIMA]
w = 1.0× 0.1× 0.99 = 0.099
Sampling Methods A. Ghane / T. Möller / G. Mori 35
![Page 36: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/36.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Sampling Importance Resampling
• Note that importance sampling, e.g. likelihood weightedsampling, gives approximation to expectation, not samples
• But samples can be obtained using these ideas• Sampling-importance-resampling uses a proposal distributionq(z) to generate samples
• Unlike rejection sampling, no parameter k is needed
Sampling Methods A. Ghane / T. Möller / G. Mori 36
![Page 37: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/37.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
SIR - Algorithm
• Sampling-importance-resampling algorithm has two stages• Sampling:
• Draw samples z(1), . . . ,z(L) from proposal distribution q(z)• Importance resampling:
• Put weights on samples
wl =p̃(z(l))/q(z(l))∑m p̃(z
(m))/q(z(m))
• Draw samples from the discrete set z(1), . . . ,z(L) according toweights wl (uniform distribution)
• This two stage process is correct in the limit as L→∞
Sampling Methods A. Ghane / T. Möller / G. Mori 37
![Page 38: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/38.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Outline
Sampling
Rejection Sampling
Importance Sampling
Markov Chain Monte Carlo
Sampling Methods A. Ghane / T. Möller / G. Mori 38
![Page 39: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/39.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Markov Chain Monte Carlo
• Markov chain Monte Carlo (MCMC) methods also use aproposal distribution to generate samples from anotherdistribution
• Unlike the previous methods, we keep track of the samplesgenerated z(1), . . . ,z(τ)
• The proposal distribution depends on the current state: q(z|z(τ))• Intuitively, walking around in state space, each step depends only
on the current state
Sampling Methods A. Ghane / T. Möller / G. Mori 39
![Page 40: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/40.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Metropolis Algorithm
• Simple algorithm for walking around in state space:• Draw sample z∗ ∼ q(z|z(τ))• Accept sample with probability
A(z∗, z(τ)) = min
(1,
p̃(z∗)
p̃(z(τ))
)• If accepted z(τ+1) = z∗, else z(τ+1) = z(τ)
• Note that if z∗ is better than z(τ), it is always accepted• Every iteration produces a sample
• Though sometimes it’s the same as previous• Contrast with rejection sampling
• The basic Metropolis algorithm assumes the proposaldistribution is symmetric q(zA|zB) = q(zB|zA)
Sampling Methods A. Ghane / T. Möller / G. Mori 40
![Page 41: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/41.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Metropolis Algorithm
• Simple algorithm for walking around in state space:• Draw sample z∗ ∼ q(z|z(τ))• Accept sample with probability
A(z∗, z(τ)) = min
(1,
p̃(z∗)
p̃(z(τ))
)• If accepted z(τ+1) = z∗, else z(τ+1) = z(τ)
• Note that if z∗ is better than z(τ), it is always accepted• Every iteration produces a sample
• Though sometimes it’s the same as previous• Contrast with rejection sampling
• The basic Metropolis algorithm assumes the proposaldistribution is symmetric q(zA|zB) = q(zB|zA)
Sampling Methods A. Ghane / T. Möller / G. Mori 41
![Page 42: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/42.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Metropolis Algorithm
• Simple algorithm for walking around in state space:• Draw sample z∗ ∼ q(z|z(τ))• Accept sample with probability
A(z∗, z(τ)) = min
(1,
p̃(z∗)
p̃(z(τ))
)• If accepted z(τ+1) = z∗, else z(τ+1) = z(τ)
• Note that if z∗ is better than z(τ), it is always accepted• Every iteration produces a sample
• Though sometimes it’s the same as previous• Contrast with rejection sampling
• The basic Metropolis algorithm assumes the proposaldistribution is symmetric q(zA|zB) = q(zB|zA)
Sampling Methods A. Ghane / T. Möller / G. Mori 42
![Page 43: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/43.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Metropolis Algorithm
• Simple algorithm for walking around in state space:• Draw sample z∗ ∼ q(z|z(τ))• Accept sample with probability
A(z∗, z(τ)) = min
(1,
p̃(z∗)
p̃(z(τ))
)• If accepted z(τ+1) = z∗, else z(τ+1) = z(τ)
• Note that if z∗ is better than z(τ), it is always accepted• Every iteration produces a sample
• Though sometimes it’s the same as previous• Contrast with rejection sampling
• The basic Metropolis algorithm assumes the proposaldistribution is symmetric q(zA|zB) = q(zB|zA)
Sampling Methods A. Ghane / T. Möller / G. Mori 43
![Page 44: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/44.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Metropolis Example
0 0.5 1 1.5 2 2.5 30
0.5
1
1.5
2
2.5
3
• p(z) is anisotropic Gaussian, proposal distribution q(z) isisotropic Gaussian
• Red lines show rejected moves, green lines show accepted moves• As τ →∞, distribution of z(τ) tends to p(z)
• True if q(zA|zB) > 0• In practice, burn-in the chain, collect samples after some
iterations• Only keep every M th sample
Sampling Methods A. Ghane / T. Möller / G. Mori 44
![Page 45: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/45.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Metropolis Example - Graphical Model
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01
• Consider running Metropolis algorithm to draw samples fromp(cloudy, rain|spr = t, wg = t)
• Define q(z|zτ ) to be uniformly pick from cloudy, rain,uniformly reset its value
Sampling Methods A. Ghane / T. Möller / G. Mori 45
![Page 46: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/46.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Metropolis Example
Cloudy
RainSprinkler
WetGrass
Cloudy
RainSprinkler
WetGrass
Cloudy
RainSprinkler
WetGrass
Cloudy
RainSprinkler
WetGrass
• Walk around in this state space, keep track of how many timeseach state occurs
Sampling Methods A. Ghane / T. Möller / G. Mori 46
![Page 47: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/47.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Metropolis-Hastings Algorithm
• A generalization of the previous algorithm for asymmetricproposal distributions known as the Metropolis-Hastingsalgorithm
• Accept a step with probability
A(z∗, z(τ)) = min
(1,
p̃(z∗)q(z(τ)|z∗)p̃(z(τ))q(z∗|z(τ))
)
• A sufficient condition for this algorithm to produce the correctdistribution is detailed balance
Sampling Methods A. Ghane / T. Möller / G. Mori 47
![Page 48: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/48.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Gibbs Sampling
• A simple coordinate-wise MCMC method• Given distribution p(z) = p(z1, . . . , zM ), sample each variable
(either in pre-defined or random order)• Sample z(τ+1)
1 ∼ p(z1|z(τ)2 , z(τ)3 , . . . , z
(τ)M )
• Sample z(τ+1)2 ∼ p(z2|z(τ+1)
1 , z(τ)3 , . . . , z
(τ)M )
• . . .• Sample z(τ+1)
M ∼ p(zM |z(τ+1)1 , z
(τ+1)2 , . . . , z
(τ+1)M−1 )
• These are easy if Markov blanket is small, e.g. in MRF withsmall cliques, and forms amenable to sampling
Sampling Methods A. Ghane / T. Möller / G. Mori 48
![Page 49: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/49.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Gibbs Sampling - Example
z1
z2
L
l
Sampling Methods A. Ghane / T. Möller / G. Mori 49
![Page 50: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/50.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Gibbs Sampling Example - Graphical Model
Cloudy
RainSprinkler
WetGrass
C
TF
.80
.20
P(R|C)C
TF
.10
.50
P(S|C)
S R
T TT FF TF F
.90
.90
.99
P(W|S,R)
P(C).50
.01
• Consider running Gibbs sampling onp(cloudy, rain|spr = t, wg = t)
• q(z|zτ ): pick from cloudy, rain, reset its value according top(cloudy|rain, spr, wg) (or p(rain|cloudy, spr, wg))
• This is often easy – only need to look at Markov blanketSampling Methods A. Ghane / T. Möller / G. Mori 50
![Page 51: Sampling Methods - Bishop PRML Ch. 11vda.univie.ac.at/Teaching/ML/16s/LectureNotes/11... · 2016-06-03 · Sampling Methods Bishop PRML Ch. 11 Alireza Ghane ... The fundamental problem](https://reader034.vdocuments.site/reader034/viewer/2022042308/5ed45995116674658f37d755/html5/thumbnails/51.jpg)
Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo
Conclusion
• Readings: Ch. 11.1-11.3 (we skipped much of it)• Sampling methods use proposal distributions to obtain samples
from complicated distributions• Different methods, different methods of correcting for proposal
distribution not matching desired distribution• In practice, effectiveness relies on having good proposal
distribution, which matches desired distribution well
Sampling Methods A. Ghane / T. Möller / G. Mori 51