UAI MCMC Tutorial
TRANSCRIPT
-
8/7/2019 Uai Mcmc Tutorial
1/61
Inference on Relational Models Using
Markov Chain Monte Carlo
Brian Milch
Massachusetts Institute of Technology
UAI Tutorial
July 19, 2007
-
Example 1: Bibliographies
Two noisy citations of the same book:
  S. Russel and P. Norvig (1995). Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ: Prentice Hall.
  Russell, Stuart and Norvig, Peter. Articial Intelligence. Prentice-Hall, 1995.
Underlying objects: researchers Stuart Russell and Peter Norvig; book "Artificial Intelligence: A Modern Approach"
-
Example 2: Aircraft Tracking
[Figure: radar blips at t=1, t=2, t=3, each labeled with an estimated position/state vector such as (1.9, 6.1, 2.2), (0.6, 5.9, 3.2), (1.8, 7.4, 2.3), ...]
-
Inference on Relational Structures
[Figure: candidate assignments of truncated citation strings ("Rus...", "AI: A Mod...", "Haml...", ...) to underlying researchers (Russell, Norvig, Roberts, Seuss, Shakespeare) and their works, each hypothesis scored with a posterior probability: 1.2 x 10^-12, 2.3 x 10^-12, 4.5 x 10^-14, 6.7 x 10^-16, 8.9 x 10^-16, 5.0 x 10^-20]
-
Markov Chain Monte Carlo (MCMC)
* Run a Markov chain s1, s2, ... over worlds where the evidence E is true
* Approximate P(Q | E) as the fraction of s1, s2, ... that satisfy the query Q
[Figure: outcome space with evidence region E and query region Q]
-
Outline
* Probabilistic models for relational structures
  - Modeling the number of objects
  - Three mistakes that are easy to make
* Markov chain Monte Carlo (MCMC)
  - Gibbs sampling
  - Metropolis-Hastings
  - MCMC over events
* Case studies
  - Citation matching
  - Multi-target tracking
-
Simple Example: Clustering
[Figure: bird wingspans (cm) on an axis from 10 to 100, forming clusters around θ = 22, θ = 49, θ = 80]
-
Simple Bayesian Mixture Model
* Number of latent objects is known to be k
* For each latent object i, a parameter:
    θ_i ~ Uniform[0, 100]
* For each data point j, an object selector and an observable value:
    C_j ~ Uniform({1, ..., k})
    X_j ~ Normal(θ_{C_j}, 25)
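As a concrete sketch, the generative process above can be written in a few lines of Python (hypothetical code: 0-based cluster labels for convenience, and standard deviation 5, i.e. variance 25):

```python
import random

random.seed(0)

def sample_mixture(n, k=3):
    """theta_i ~ Uniform[0, 100]; C_j ~ Uniform over the k labels (0-based here);
    X_j ~ Normal(theta_{C_j}, 25), i.e. standard deviation 5."""
    theta = [random.uniform(0.0, 100.0) for _ in range(k)]   # one parameter per latent object
    c = [random.randrange(k) for _ in range(n)]              # object selector per data point
    x = [random.gauss(theta[cj], 5.0) for cj in c]           # observed wingspans
    return theta, c, x

theta, c, x = sample_mixture(10)
```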
-
BN for Mixture Model
[Figure: Bayesian network with parameter nodes θ1 ... θk, selector nodes C1 ... Cn, and observations X1 ... Xn; every θi and the selector Cj are parents of Xj]
-
Context-Specific Dependencies
[Figure: the same network, with selector values shown (C1 = 2, C2 = 1, C3 = 2, ...); given its selector's value, each Xj depends on only one θi]
-
Extensions to Mixture Model
* Random number of latent objects k, with a distribution p(k) such as:
  - Uniform({1, ..., 100})
  - Geometric(0.1)
  - Poisson(10)  ← unbounded!
* Random distribution π for selecting objects:
    p(π | k) ~ Dirichlet(α1, ..., αk)   (Dirichlet: a distribution over probability vectors)
  Still symmetric: each αi = α/k
-
Existence versus Observation
* A latent object can exist even if no observations correspond to it
  - Bird species may not be observed yet
  - Aircraft may fly over without yielding any blips
* Two questions:
  - How many objects correspond to observations?
  - How many objects are there in total?
* Observed 3 species, each 100 times: probably no more
* Observed 200 species, each 1 or 2 times: probably more exist
-
Expecting Additional Objects
* P(ever observe a new species | seen r so far) is bounded by P(k > r)
* So as the number of observed species → ∞, the probability of ever seeing more → 0
* What if we don't want this?
[Figure: r observed species; will we observe more later?]
-
Dirichlet Process Mixtures
* Set k = ∞ and let π be an infinite-dimensional probability vector with a stick-breaking prior [Ferguson 1983; Sethuraman 1994]
* Another view: define the prior directly on partitions of data points, allowing an unbounded number of blocks
* Drawback: can't ask about the number of unobserved latent objects (always infinite)
[Figure: stick-breaking weights π1, π2, π3, π4, π5, ...]
[tutorials: Jordan 2005; Sudderth 2006]
-
Outline
* Probabilistic models for relational structures
  - Modeling the number of objects
  - Three mistakes that are easy to make
* Markov chain Monte Carlo (MCMC)
  - Gibbs sampling
  - Metropolis-Hastings
  - MCMC over events
* Case studies
  - Citation matching
  - Multi-target tracking
-
Mistake 1: Ignoring Interchangeability
* Which birds are in species S1? Latent object indices are interchangeable:
  - The posterior on a selector variable such as C_B1 is uniform
  - The posterior on θ_S1 has a peak for each cluster of birds
* What we really care about is the partition of the observations, e.g. {{1, 3}, {2}, {4, 5}}
* A partition with r blocks corresponds to k! / (k - r)! instantiations of the C_j variables:
  (1, 2, 1, 3, 3), (1, 2, 1, 4, 4), (1, 4, 1, 3, 3), (2, 1, 2, 3, 3), ...
-
Ignoring Interchangeability, Cont'd
* Say k = 4. What's the prior probability that B1 and B3 are in one species and B2 in another?
* Multiply the probabilities for C_B1, C_B2, C_B3: (1/4) × (1/4) × (1/4)?
* Not enough! The partition {{B1, B3}, {B2}} corresponds to 12 instantiations of the Cs:
  (S1, S2, S1), (S1, S3, S1), (S1, S4, S1), (S2, S1, S2), (S2, S3, S2), (S2, S4, S2),
  (S3, S1, S3), (S3, S2, S3), (S3, S4, S3), (S4, S1, S4), (S4, S2, S4), (S4, S3, S4)
* A partition with r blocks corresponds to kPr = k! / (k - r)! instantiations
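The count of 12 can be checked by brute-force enumeration (a small sketch; the 0-based species labels are just a convenient encoding):

```python
from itertools import product

k = 4
# Assignments (c1, c2, c3) of species labels to birds B1, B2, B3 that
# induce the partition {{B1, B3}, {B2}}: B1 and B3 share a species,
# B2 sits in a different one.
count = sum(1 for c1, c2, c3 in product(range(k), repeat=3)
            if c1 == c3 and c2 != c1)
print(count)   # 12, i.e. kPr = 4 * 3 with r = 2 blocks
```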
-
Mistake 2: Underestimating the Bayesian Ockham's Razor Effect
* Say k = 4. Are B1 and B2 in the same species?
* Maximum-likelihood estimation would yield one species with θ = 50 and another with θ = 52
* But a Bayesian model trades off the likelihood against the prior probability of getting those θ values
[Figure: wingspan axis (cm), 10 to 100, with observations X_B1 = 50 and X_B2 = 52]
-
Bayesian Ockham's Razor
[Figure: wingspan axis (cm) with X_B1 = 50, X_B2 = 52]
H1: Partition is {{B1, B2}}
  p(H1, data) = (1/4) ∫₀¹⁰⁰ p(θ) p(x_B1 | θ) p(x_B2 | θ) dθ ≈ 1.3 × 10⁻⁴
H2: Partition is {{B1}, {B2}}
  p(H2, data) = (3/4) [∫₀¹⁰⁰ p(θ₁) p(x_B1 | θ₁) dθ₁] [∫₀¹⁰⁰ p(θ₂) p(x_B2 | θ₂) dθ₂] ≈ 7.5 × 10⁻⁵
  (the prior density p(θ) of Uniform(0, 100) is 0.01)
Don't use more latent objects than necessary to explain your data
[MacKay 1992]
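These two marginal likelihoods can be reproduced numerically (a sketch assuming the mixture model from the earlier slides: Uniform(0, 100) prior, Normal likelihood with standard deviation 5, and partition priors 1/4 and 3/4 for k = 4):

```python
from math import exp, pi, sqrt

def npdf(x, mu, sd):
    return exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * sqrt(2.0 * pi))

def integrate(f, a, b, n=20000):
    """Trapezoidal rule; good enough for these smooth 1-D integrands."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

prior = 0.01                    # density of the Uniform(0, 100) prior on theta
x1, x2, sd = 50.0, 52.0, 5.0

# H1: {{B1, B2}} share one theta; partition prior 1/4 when k = 4
h1 = 0.25 * integrate(lambda t: prior * npdf(x1, t, sd) * npdf(x2, t, sd), 0.0, 100.0)

# H2: {{B1}, {B2}} have independent thetas; partition prior 3/4
marg = lambda x: integrate(lambda t: prior * npdf(x, t, sd), 0.0, 100.0)
h2 = 0.75 * marg(x1) * marg(x2)

print(h1, h2, h1 > h2)   # h1 ≈ 1.4e-4, h2 ≈ 7.5e-5: the one-species hypothesis wins
```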
-
Mistake 3: Comparing Densities Across Dimensions
[Figure: wingspan axis (cm) with X_B1 = 50, X_B2 = 52]
H1: Partition is {{B1, B2}}, θ = 51
  p(H1, data) = (1/4) × 0.01 × N(50; 51, 5²) × N(52; 51, 5²) ≈ 1.5 × 10⁻⁵
H2: Partition is {{B1}, {B2}}, θ_B1 = 50, θ_B2 = 52
  p(H2, data) = (3/4) × 0.01² × N(50; 50, 5²) × N(52; 52, 5²) ≈ 4.8 × 10⁻⁷
H1 wins by a greater margin
-
What If We Change the Units?
[Figure: wingspan axis (m), 0.1 to 1.0, with X_B1 = 0.50, X_B2 = 0.52]
H1: Partition is {{B1, B2}}, θ = 0.51
  p(H1, data) = (1/4) × 1 × N(0.50; 0.51, 0.05²) × N(0.52; 0.51, 0.05²) ≈ 15
H2: Partition is {{B1}, {B2}}, θ_B1 = 0.50, θ_B2 = 0.52
  p(H2, data) = (3/4) × 1² × N(0.50; 0.50, 0.05²) × N(0.52; 0.52, 0.05²) ≈ 48
  (the density of Uniform(0, 1) is 1!)
Now H2 wins by a landslide
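A short sketch makes the unit dependence concrete (same model as above; the 1/4 and 3/4 partition priors for k = 4 are taken from the preceding slides):

```python
from math import exp, pi, sqrt

def npdf(x, mu, sd):
    return exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * sqrt(2.0 * pi))

def scores(x1, x2, sd, prior_density):
    """Peak 'joint densities' for the two hypotheses, as on the slide:
    H1 plugs in the shared theta at the midpoint; H2 puts one theta on
    each observation."""
    mid = (x1 + x2) / 2.0
    h1 = 0.25 * prior_density * npdf(x1, mid, sd) * npdf(x2, mid, sd)
    h2 = 0.75 * prior_density ** 2 * npdf(x1, x1, sd) * npdf(x2, x2, sd)
    return h1, h2

h1_cm, h2_cm = scores(50.0, 52.0, 5.0, 0.01)   # centimetres: Uniform(0, 100) prior
h1_m,  h2_m  = scores(0.50, 0.52, 0.05, 1.0)   # metres: Uniform(0, 1) prior

print(h1_cm > h2_cm)   # True: in cm, H1 "wins"
print(h1_m > h2_m)     # False: in m, H2 "wins" on exactly the same data
```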
-
Lesson: Comparing Densities Across Dimensions
* Densities don't behave like probabilities (e.g., they can be greater than 1)
* Heights of density peaks in spaces of different dimension are not comparable
* Work-arounds:
  - Find the most likely partition first, then the most likely parameters given that partition
  - Find the region in parameter space where most of the posterior probability mass lies
-
Outline
* Probabilistic models for relational structures
  - Modeling the number of objects
  - Three mistakes that are easy to make
* Markov chain Monte Carlo (MCMC)
  - Gibbs sampling
  - Metropolis-Hastings
  - MCMC over events
* Case studies
  - Citation matching
  - Multi-target tracking
-
Why Not Exact Inference?
* The number of possible partitions is superexponential in n
* Variable elimination?
  - Summing out θ_i couples all the C_j's
  - Summing out C_j couples all the θ_i's
[Figure: mixture-model Bayesian network with θ1 ... θk, C1 ... Cn, X1 ... Xn]
-
Markov Chain Monte Carlo (MCMC)
* Start in an arbitrary state (possible world) s1 satisfying the evidence E
* Sample s2, s3, ... according to a transition kernel T(si, si+1), yielding a Markov chain
* Approximate p(Q | E) by the fraction of s1, s2, ..., sL that are in Q
[Figure: outcome space with evidence region E and query region Q]
-
Why a Markov Chain?
Why use a Markov chain rather than sampling independently?
* Stochastic local search for high-probability s
* Once we find such an s, explore around it
-
Convergence
* A stationary distribution π is one satisfying:
    π(s) = Σ_{s'} π(s') T(s', s)
* If the chain is ergodic (can get to anywhere from anywhere*), then:
  - It has a unique stationary distribution π
  - The fraction of s1, s2, ..., sL in Q converges to π(Q) as L → ∞
* We'll design T so that π(s) = p(s | E)
(* and it's aperiodic)
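The stationarity condition π(s) = Σ_{s'} π(s') T(s', s) can be checked numerically on a toy chain (the 3-state kernel below is a made-up example, not from the tutorial):

```python
# Toy 3-state transition kernel T (rows sum to 1); made-up numbers.
T = [[0.5, 0.25, 0.25],
     [0.2, 0.6,  0.2 ],
     [0.3, 0.3,  0.4 ]]

# Find the stationary distribution by iterating pi <- pi T.
pi = [1.0 / 3.0] * 3
for _ in range(1000):
    pi = [sum(pi[s] * T[s][t] for s in range(3)) for t in range(3)]

# Verify pi(s) = sum over s' of pi(s') T(s', s) holds at the fixed point.
resid = max(abs(pi[t] - sum(pi[s] * T[s][t] for s in range(3))) for t in range(3))
print(resid < 1e-12)   # True
```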
-
Gibbs Sampling
* Order the non-evidence variables V1, V2, ..., Vm
* Given state s, sample from T as follows:
    Let s' = s
    For i = 1 to m:
      Sample v'_i from p(Vi | s'_{-i})   ← the conditional for Vi given the other vars in s'
      Let s' = (s'_{-i}, Vi = v'_i)
    Return s'
* Theorem: the stationary distribution is p(s | E)
[Geman & Geman 1984]
-
Gibbs on a Bayesian Network
* The conditional for V depends only on the factors that contain V
* So condition on V's Markov blanket mb(V): parents, children, and co-parents:
    p(v | s_{-V}) ∝ p(v | s[Pa(V)]) ∏_{Y ∈ ch(V)} p(s[Y] | s[Pa(Y)], V = v)
-
Gibbs on the Bayesian Mixture Model
Given the current state s:
* Resample each θ_i given the prior and {X_j : C_j = i in s}
* Resample each C_j given X_j and θ_{1:k}   (context-specific Markov blanket)
[Figure: mixture-model Bayesian network]
[Neal 2000]
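A minimal Gibbs sweep for this model might look as follows (a sketch only: the Uniform(0, 100) prior on θ is treated as flat, ignoring its truncation, so the conditional for θ_i is a plain Normal):

```python
import random
from math import exp

def gibbs_sweep(x, c, theta, sd=5.0):
    """One Gibbs sweep for the mixture model. Sketch: with a flat prior,
    the conditional for theta_i is Normal(mean of assigned points, sd^2/n)."""
    k, n = len(theta), len(x)
    # Resample each theta_i given the points currently assigned to it.
    for i in range(k):
        pts = [x[j] for j in range(n) if c[j] == i]
        if pts:
            mu = sum(pts) / len(pts)
            theta[i] = random.gauss(mu, sd / len(pts) ** 0.5)
        else:   # no assigned points: fall back to the prior
            theta[i] = random.uniform(0.0, 100.0)
    # Resample each selector C_j given x_j and theta (context-specific blanket).
    for j in range(n):
        w = [exp(-0.5 * ((x[j] - theta[i]) / sd) ** 2) for i in range(k)]
        c[j] = random.choices(range(k), weights=w)[0]
    return c, theta
```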
-
Sampling Given the Markov Blanket
    p(v | s_{-V}) ∝ p(v | s[Pa(V)]) ∏_{Y ∈ ch(V)} p(s[Y] | s[Pa(Y)], V = v)
* If V is discrete, just iterate over the values, normalize, and sample from the discrete distribution
* If V is continuous:
  - Simple if the child distributions are conjugate to V's prior: the posterior has the same form as the prior, with different parameters
  - In general, even sampling from p(v | s_{-V}) can be hard
[See BUGS software: http://www.mrc-bsu.cam.ac.uk/bugs]
-
Convergence Can Be Slow
[Figure: wingspan axis (cm); the data should form two clusters, but θ1 = 20 while θ2 = 90 (species 2) is far away]
* The C_j's won't change until θ2 is in the right area
* θ2 does an unguided random walk as long as no observations are associated with it
* Especially bad in high dimensions
-
Outline
* Probabilistic models for relational structures
  - Modeling the number of objects
  - Three mistakes that are easy to make
* Markov chain Monte Carlo (MCMC)
  - Gibbs sampling
  - Metropolis-Hastings
  - MCMC over events
* Case studies
  - Citation matching
  - Multi-target tracking
-
Metropolis-Hastings
Define T(si, si+1) as follows:
* Sample s' from a proposal distribution q(s' | s)
* Compute the acceptance probability:
    α = min(1, [p(s' | E) q(s | s')] / [p(s | E) q(s' | s)])
  (relative posterior probabilities × backward/forward proposal probabilities)
* With probability α, let si+1 = s'; else let si+1 = si
Can show that p(s | E) is the stationary distribution for T
[Metropolis et al. 1953; Hastings 1970]
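The transition kernel can be sketched generically in log space (hypothetical helper names; the 1-D standard-normal target with a symmetric random-walk proposal is just an illustration, not from the tutorial):

```python
import random
from math import exp

def mh_step(s, log_p, propose, log_q):
    """One Metropolis-Hastings transition. log_p(s) is the (unnormalized)
    log posterior; propose(s) samples s' ~ q(. | s); log_q(a, b) = log q(a | b)."""
    s_new = propose(s)
    log_alpha = (log_p(s_new) - log_p(s)) + (log_q(s, s_new) - log_q(s_new, s))
    if log_alpha >= 0.0 or random.random() < exp(log_alpha):
        return s_new        # accept
    return s                # reject: the chain stays where it is

# Illustration: standard-normal target; the q-ratio cancels for a symmetric q.
log_p = lambda x: -0.5 * x * x
propose = lambda x: x + random.gauss(0.0, 1.0)
log_q = lambda a, b: 0.0

random.seed(0)
s, samples = 0.0, []
for _ in range(20000):
    s = mh_step(s, log_p, propose, log_q)
    samples.append(s)
mean = sum(samples) / len(samples)   # should be near 0
```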
-
Metropolis-Hastings
Benefits:
* The proposal distribution can propose big steps involving several variables
* Only need to compute the ratio p(s' | E) / p(s | E), ignoring normalization factors
* Don't need to sample from conditional distributions
Limitations:
* Proposals must be reversible, else q(s | s') = 0
* Need to be able to compute q(s | s') / q(s' | s)
-
Split-Merge Proposals
Choose two observations i, j:
* If C_i = C_j = c, then split cluster c:
  - Get an unused latent object c'
  - For each observation m such that C_m = c, change C_m to c' with probability 0.5
  - Propose new values for θ_c, θ_c'
* Else merge clusters C_i and C_j:
  - For each m such that C_m = C_j, set C_m = C_i
  - Propose a new value for θ_c
[Jain & Neal 2004]
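A stripped-down version of the proposal over the selector variables only (a sketch: the θ resampling and the Hastings correction from Jain & Neal are omitted, and `k_max` is a hypothetical cap on labels):

```python
import random

def propose_split_merge(c, k_max):
    """Sketch of the selector-variable part of a split-merge proposal."""
    c = list(c)                              # work on a copy
    i, j = random.sample(range(len(c)), 2)
    if c[i] == c[j]:
        # Split: move members of the shared cluster to a fresh label w.p. 0.5
        old, new = c[i], max(c) + 1
        if new >= k_max:
            return c                         # no unused latent object available
        for m in range(len(c)):
            if c[m] == old and random.random() < 0.5:
                c[m] = new
        c[i], c[j] = old, new                # make sure i and j end up apart
    else:
        # Merge: relabel j's whole cluster with i's label
        old, tgt = c[j], c[i]
        for m in range(len(c)):
            if c[m] == old:
                c[m] = tgt
    return c
```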
-
Split-Merge Example
[Figure: wingspan axis (cm); θ1 = 20 explains all the data, θ2 = 90 is far away]
* Split two birds from species 1
* Resample θ2 to match these two birds (e.g., θ2 = 27)
* The move is likely to be accepted
-
Mixtures of Kernels
* If T1, ..., Tm all have stationary distribution π, then so does the mixture:
    T(s, s') = Σ_{i=1}^{m} w_i T_i(s, s')
* Example: a mixture of split-merge and Gibbs moves
* Point: faster convergence
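In code, a mixture of kernels is just a weighted random choice among transition functions (a hypothetical sketch; `split_merge_step` and `gibbs_step` in the comment are assumed names):

```python
import random

def mixture_kernel(s, kernels, weights):
    """Apply one of several transition kernels, chosen with fixed weights.
    If every kernel leaves pi stationary, the mixture does too."""
    k = random.choices(kernels, weights=weights)[0]
    return k(s)

# e.g. mixture_kernel(state, [split_merge_step, gibbs_step], [0.3, 0.7])
```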
-
Outline
* Probabilistic models for relational structures
  - Modeling the number of objects
  - Three mistakes that are easy to make
* Markov chain Monte Carlo (MCMC)
  - Gibbs sampling
  - Metropolis-Hastings
  - MCMC over events
* Case studies
  - Citation matching
  - Multi-target tracking
-
MCMC States in Split-Merge
* Not complete instantiations! No parameters for unobserved species
* States are partial instantiations of random variables:
    k = 12, C_B1 = S2, C_B2 = S8, θ_S2 = 31, θ_S8 = 84
* Each state corresponds to an event: the set of outcomes satisfying the description
-
MCMC over Events
* Markov chain over events ω, with stationary distribution proportional to p(ω)
* Theorem: the fraction of visited events in Q converges to p(Q | E) if:
  - Each ω is either a subset of Q or disjoint from Q
  - The events form a partition of E
[Figure: evidence region E partitioned into events; query region Q]
[Milch & Russell 2006]
-
Computing Probabilities of Events
* The engine needs to compute p(ω') / p(ω_n) efficiently (without summations)
* Use instantiations that include all active parents of the variables they instantiate
* Then the probability is a product of CPDs:
    p(ω) = ∏_{X ∈ vars(ω)} p(ω[X] | ω[Pa_ω(X)])
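Under that condition, the product-of-CPDs computation is straightforward (hypothetical representation: CPDs as callables keyed by variable name):

```python
def event_prob(inst, cpds, parents):
    """p(omega) as a product of CPDs. Valid only when inst includes all
    active parents of every variable it instantiates. A CPD here is a
    callable (parent_values_tuple, value) -> probability."""
    prob = 1.0
    for var, val in inst.items():
        pa_vals = tuple(inst[p] for p in parents[var])
        prob *= cpds[var](pa_vals, val)
    return prob

# Tiny two-variable example: A ~ Bernoulli(0.6); B matches A with prob. 0.9
parents = {"A": [], "B": ["A"]}
cpds = {
    "A": lambda pa, v: 0.6 if v else 0.4,
    "B": lambda pa, v: 0.9 if v == pa[0] else 0.1,
}
p = event_prob({"A": True, "B": True}, cpds, parents)
print(p)   # ≈ 0.54 (0.6 * 0.9)
```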
-
States That Are Even More Abstract
* Typical partial instantiation:
    k = 12, C_B1 = S2, C_B2 = S8, θ_S2 = 31, θ_S8 = 84
  This specifies particular species numbers, even though species are interchangeable
* Let states be abstract partial instantiations:
    ∃ distinct x, y [k = 12, C_B1 = x, C_B2 = y, θ_x = 31, θ_y = 84]
* See [Milch & Russell 2006] for conditions under which we can compute the probabilities of such events
-
Outline
* Probabilistic models for relational structures
  - Modeling the number of objects
  - Three mistakes that are easy to make
* Markov chain Monte Carlo (MCMC)
  - Gibbs sampling
  - Metropolis-Hastings
  - MCMC over events
* Case studies
  - Citation matching
  - Multi-target tracking
-
Representative Applications
* Tracking cars with cameras [Pasula et al. 1999]
* Segmentation in computer vision [Tu & Zhu 2002]
* Citation matching [Pasula et al. 2003]
* Multi-target tracking with radar [Oh et al. 2004]
-
Citation Matching Model
#Researcher ~ NumResearchersPrior();
Name(r) ~ NamePrior();

#Paper ~ NumPapersPrior();
FirstAuthor(p) ~ Uniform({Researcher r});
Title(p) ~ TitlePrior();

PubCited(c) ~ Uniform({Paper p});
Text(c) ~ NoisyCitationGrammar
             (Name(FirstAuthor(PubCited(c))), Title(PubCited(c)));

[Pasula et al. 2003; Milch & Russell 2006]
-
Citation Matching
* An elaboration of the generative model shown earlier
* Parameter estimation:
  - Priors for names, titles, and citation formats learned offline from labeled data
  - String corruption parameters learned with Monte Carlo EM
* Inference:
  - MCMC with split-merge proposals
  - Guided by "canopies" of similar citations
  - Accuracy stabilizes after ~20 minutes
[Pasula et al., NIPS 2002]
-
Citation Matching Results
Four data sets of ~300-500 citations, referring to ~150-300 papers
[Chart: error (fraction of clusters not recovered correctly) on the Reinforce, Face, Reason, and Constraint data sets, comparing phrase matching [Lawrence et al. 1999], the generative model + MCMC [Pasula et al. 2002], and a conditional random field [Wellner et al. 2004]]
-
Cross-Citation Disambiguation
Wauchope, K. Eucalyptus: Integrating Natural Language Input with a Graphical User Interface. NRL Report NRL/FR/5510-94-9711 (1994).
Is "Eucalyptus" part of the title, or is the author named K. Eucalyptus Wauchope?
Kenneth Wauchope (1994). Eucalyptus: Integrating natural language input with a graphical user interface. NRL Report NRL/FR/5510-94-9711, Naval Research Laboratory, Washington, DC, 39pp.
The second citation makes it clear how to parse the first one.
-
Preliminary Experiments: Information Extraction
* P(citation text | title, author names) modeled with a simple HMM
* For each paper: recover the title, author surnames, and given names
* Fraction whose attributes are recovered perfectly in the last MCMC state:
  - among papers with one citation: 36.1%
  - among papers with multiple citations: 62.6%
* Can use the inferred knowledge for disambiguation
-
Multi-Object Tracking
[Figure: blips over time with inferred tracks, including a false detection and an unobserved object]
-
State Estimation for Aircraft
#Aircraft ~ NumAircraftPrior();

State(a, t)
  if t = 0 then ~ InitState()
  else ~ StateTransition(State(a, Pred(t)));

#Blip(Source = a, Time = t)
  ~ NumDetectionsCPD(State(a, t));

#Blip(Time = t)
  ~ NumFalseAlarmsPrior();

ApparentPos(r)
  if (Source(r) = null) then ~ FalseAlarmDistrib()
  else ~ ObsCPD(State(Source(r), Time(r)));
-
Aircraft Entering and Exiting
#Aircraft(EntryTime = t) ~ NumAircraftPrior();

Exits(a, t)
  if InFlight(a, t) then ~ Bernoulli(0.1);

InFlight(a, t)
  if t < EntryTime(a) then = false
  elseif t = EntryTime(a) then = true
  else = (InFlight(a, Pred(t)) & !Exits(a, Pred(t)));

State(a, t)
  if t = EntryTime(a) then ~ InitState()
  elseif InFlight(a, t) then ~ StateTransition(State(a, Pred(t)));

#Blip(Source = a, Time = t)
  if InFlight(a, t) then ~ NumDetectionsCPD(State(a, t));

...plus the last two statements from the previous slide
-
MCMC for Aircraft Tracking
* Uses the generative model from the previous slide (although not with BLOG syntax)
* Examples of Metropolis-Hastings proposals
[Figures by Songhwai Oh]
[Oh et al., CDC 2004]
-
Aircraft Tracking Results
[Charts: estimation error and running time]
* MCMC has the smallest error, and it hardly degrades at all as tracks get dense
* MCMC is nearly as fast as the greedy algorithm, and much faster than MHT
[Oh et al., CDC 2004] [Figures by Songhwai Oh]
-
Toward General-Purpose Inference
Currently, each new application requires new code for:
* Proposing moves
* Representing MCMC states
* Computing acceptance probabilities
Goal:
* The user specifies the model and the proposal distribution
* General-purpose code does the rest
-
General MCMC Engine
* Model (in a declarative language): defines p(s)
* Custom proposal distribution (Java class): proposes an MCMC state s' given s_n; computes the ratio q(s_n | s') / q(s' | s_n)
* General-purpose engine (Java code): computes the acceptance probability based on the model; sets s_{n+1}; MCMC states are partial worlds
* Handles arbitrary proposals efficiently using context-specific structure
[Milch & Russell 2006]
-
Summary
* Models for relational structures go beyond standard probabilistic inference settings
* MCMC provides a feasible path for inference
* Open problems:
  - More general inference
  - Adaptive MCMC
  - Integrating discriminative methods
-
References
Blei, D. M. and Jordan, M. I. (2005) Variational inference for Dirichlet process mixtures. J. Bayesian
Analysis 1(1):121-144.
Casella, G. and Robert, C. P. (1996) Rao-Blackwellisation of sampling schemes. Biometrika 83(1):81-94.
Ferguson T. S. (1983) Bayesian density estimation by mixtures of normal distributions. In Rizvi, M. H.
et al., eds. Recent Advances in Statistics: Papers in Honor of Herman Chernoff on His Sixtieth Birthday.
Academic Press, New York, pages 287-302.
Geman, S. and Geman, D. (1984) Stochastic relaxation, Gibbs distributions and the Bayesian
restoration of images. IEEE Trans. on Pattern Analysis and Machine Intelligence 6:721-741.
Gilks, W. R., Thomas, A. and Spiegelhalter, D. J. (1994) A language and program for complex Bayesian
modelling. The Statistician 43(1):169-177.
Gilks, W. R., Richardson, S., and Spiegelhalter, D. J., eds. (1996) Markov Chain Monte Carlo in Practice.
Chapman and Hall.
Green, P. J. (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model
determination. Biometrika 82(4):711-732.
-
References
Hastings, W. K. (1970) Monte Carlo sampling methods using Markov chains and their applications.Biometrika 57:97-109.
Jain, S. and Neal, R. M. (2004) A split-merge Markov chain Monte Carlo procedure for the Dirichletprocess mixture model. J. Computational and Graphical Statistics 13(1):158-182.
Jordan M. I. (2005) Dirichlet processes, Chinese restaurant processes, and all that. Tutorial at theNIPS Conference, available at http://www.cs.berkeley.edu/~jordan/nips-tutorial05.ps
MacKay, D. J. C. (1992) Bayesian interpolation. Neural Computation 4(3):414-447.
MacEachern, S. N. (1994) Estimating normal means with a conjugate style Dirichlet process prior. Communications in Statistics: Simulation and Computation 23:727-741.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953) Equations ofstate calculations by fast computing machines. J. Chemical Physics 21:1087-1092.
Milch, B., Marthi, B., Russell, S., Sontag, D., Ong, D. L., and Kolobov, A. (2005) BLOG: ProbabilisticModels with Unknown Objects. In Proc. 19th Intl Joint Conf. on AI, pages 1352-1359.
Milch, B. and Russell, S. (2006) General-purpose MCMC inference over relational structures. In Proc. 22nd Conf. on Uncertainty in AI, pages 349-358.
-
References
Neal, R. M. (2000) Markov chain sampling methods for Dirichlet process mixture models. J. Computational and Graphical Statistics 9:249-265.
Oh, S., Russell, S. and Sastry, S. (2004) Markov chain Monte Carlo data association for general multi-target tracking problems. In Proc. 43rd IEEE Conf. on Decision and Control, pages 734-742.
Pasula, H., Russell, S. J., Ostland, M., and Ritov, Y. (1999) Tracking many objects with many sensors.In Proc. 16th Intl Joint Conf. on AI, pages 1160-1171.
Pasula, H., Marthi, B., Milch, B., Russell, S., and Shpitser, I. (2003) Identity uncertainty and citationmatching. In Advances in Neural Information Processing Systems 15, MIT Press, pages 1401-1408.
Richardson, S. and Green, P. J. (1997) On Bayesian analysis of mixtures with an unknown number of components. J. Royal Statistical Society B 59:731-792.
Sethuraman, J. (1994) A constructive definition of Dirichlet priors. Statistica Sinica 4:639-650.
Sudderth, E. (2006) Graphical models for visual object recognition and tracking. Ph.D. thesis, Dept. of
EECS, Massachusetts Institute of Technology, Cambridge, MA.
Tu, Z. and Zhu, S.-C. (2002) Image segmentation by data-driven Markov chain Monte Carlo. IEEE
Trans. Pattern Analysis and Machine Intelligence 24(5):657-673.