Christopher M. Bishop's tutorial on graphical models
Part 1: Graphical Models
Machine Learning Techniques
for Computer Vision
Microsoft Research Cambridge
ECCV 2004, Prague
Christopher M. Bishop
About this Tutorial
• Learning is the new frontier in computer vision
• Focus on concepts
  – not lists of algorithms
  – not technical details
• Graduate level
• Please ask questions!
Overview
• Part 1: Graphical models
  – directed and undirected graphs
  – inference and learning
• Part 2: Unsupervised learning
  – mixture models, EM
  – variational inference, model complexity
  – continuous latent variables
• Part 3: Supervised learning
  – decision theory
  – linear models, neural networks
  – boosting, sparse kernel machines
Probability Theory
• Sum rule: $p(x) = \sum_y p(x, y)$
• Product rule: $p(x, y) = p(y \mid x)\, p(x)$
• From these we have Bayes' theorem
  $p(x \mid y) = \frac{p(y \mid x)\, p(x)}{p(y)}$
  – with normalization $p(y) = \sum_x p(y \mid x)\, p(x)$
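The rules above can be checked numerically on a small joint table (the table and the helper names below are invented for illustration, not from the slides):

```python
# Hypothetical 2x2 joint distribution p(x, y), used to verify the
# sum rule, product rule, and Bayes' theorem numerically.
joint = {
    ("x0", "y0"): 0.3, ("x0", "y1"): 0.2,
    ("x1", "y0"): 0.1, ("x1", "y1"): 0.4,
}

def marginal_x(x):
    # Sum rule: p(x) = sum_y p(x, y)
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    return sum(p for (_, yi), p in joint.items() if yi == y)

def conditional_y_given_x(y, x):
    # Product rule rearranged: p(y | x) = p(x, y) / p(x)
    return joint[(x, y)] / marginal_x(x)

def bayes_x_given_y(x, y):
    # Bayes' theorem: p(x | y) = p(y | x) p(x) / p(y)
    return conditional_y_given_x(y, x) * marginal_x(x) / marginal_y(y)

# Bayes' result agrees with conditioning the joint table directly.
direct = joint[("x0", "y1")] / marginal_y("y1")
assert abs(bayes_x_given_y("x0", "y1") - direct) < 1e-12
```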
Role of the Graphs
• New insights into existing models
• Motivation for new models
• Graph-based algorithms for calculation and computation
  – cf. Feynman diagrams in physics
Decomposition
• Consider an arbitrary joint distribution $p(x_1, \ldots, x_K)$
• By successive application of the product rule
  $p(x_1, \ldots, x_K) = p(x_1)\, p(x_2 \mid x_1) \cdots p(x_K \mid x_1, \ldots, x_{K-1})$
Directed Acyclic Graphs
• Joint distribution
  $p(x) = \prod_i p(x_i \mid \mathrm{pa}_i)$
  where $\mathrm{pa}_i$ denotes the parents of node $i$
• No directed cycles
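The factorization can be sketched on a small hypothetical network $a \to c \leftarrow b$ over binary variables (all conditional-probability numbers below are invented):

```python
import itertools

# Hypothetical three-node DAG a -> c <- b over binary variables;
# the joint factorizes as p(a, b, c) = p(a) p(b) p(c | a, b).
p_a = {0: 0.6, 1: 0.4}
p_b = {0: 0.7, 1: 0.3}
p_c_given_ab = {(0, 0): {0: 0.9, 1: 0.1},
                (0, 1): {0: 0.5, 1: 0.5},
                (1, 0): {0: 0.4, 1: 0.6},
                (1, 1): {0: 0.2, 1: 0.8}}

def joint(a, b, c):
    # Product over nodes of p(node | parents of node).
    return p_a[a] * p_b[b] * p_c_given_ab[(a, b)][c]

# A valid factorization sums to one over all configurations.
total = sum(joint(a, b, c) for a, b, c in itertools.product((0, 1), repeat=3))
assert abs(total - 1.0) < 1e-12
```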
Undirected Graphs
• Provided $p(x) > 0$, the joint distribution is a product of non-negative functions over the cliques of the graph
  $p(x) = \frac{1}{Z} \prod_C \psi_C(x_C)$
  where $\psi_C(x_C)$ are the clique potentials and $Z$ is a normalization constant
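A minimal sketch for a hypothetical three-node chain MRF (the edge potential is invented), showing how $Z$ is obtained by summing the potential product over all states:

```python
import itertools

# Hypothetical undirected chain x1 - x2 - x3 with binary nodes; the maximal
# cliques are the two edges, each with a non-negative potential.
def psi(xi, xj):
    return 2.0 if xi == xj else 1.0   # invented: favour equal neighbours

def unnormalized(x1, x2, x3):
    return psi(x1, x2) * psi(x2, x3)

# Normalization constant Z sums the potential product over all states.
Z = sum(unnormalized(*x) for x in itertools.product((0, 1), repeat=3))

def p(x1, x2, x3):
    return unnormalized(x1, x2, x3) / Z

assert abs(sum(p(*x) for x in itertools.product((0, 1), repeat=3)) - 1.0) < 1e-12
```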
Conditioning on Evidence
• Variables may be hidden (latent) or visible (observed)
• Latent variables may have a specific interpretation, or may be introduced to permit a richer class of distributions
Conditional Independences
• x is independent of y given z if, for all values of z,
  $p(x, y \mid z) = p(x \mid z)\, p(y \mid z)$
• For undirected graphs this is given by graph separation!
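A quick numeric sketch (all numbers invented): building a joint as $p(z)\,p(x \mid z)\,p(y \mid z)$ makes the identity above hold by construction, which the loop verifies for every value of $z$:

```python
import itertools

# Hypothetical distribution built so that x and y are independent given z:
# p(x, y, z) = p(z) p(x | z) p(y | z), all variables binary.
p_z = {0: 0.5, 1: 0.5}
p_x_given_z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
p_y_given_z = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}

def joint(x, y, z):
    return p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]

# Check the defining identity p(x, y | z) = p(x | z) p(y | z) for every z.
for x, y, z in itertools.product((0, 1), repeat=3):
    pz = sum(joint(a, b, z) for a, b in itertools.product((0, 1), repeat=2))
    p_xy_given_z = joint(x, y, z) / pz
    assert abs(p_xy_given_z - p_x_given_z[z][x] * p_y_given_z[z][y]) < 1e-12
```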
“Explaining Away”
• C.I. for directed graphs is similar, but with one subtlety
• Illustration: pixel colour in an image
  (figure: image colour is the child of two parents, surface colour and lighting colour; observing the image colour makes the two parents dependent, so one can "explain away" the other)
Directed versus Undirected
Example: State Space Models
• Hidden Markov model
• Kalman filter
Example: Bayesian SSM
Example: Factorial SSM
• Multiple hidden sequences
• Avoid exponentially large hidden space
Example: Markov Random Field
• Typical application: image region labelling
Example: Conditional Random Field
Inference
• Simple example: Bayes’ theorem
Message Passing
• Example: a chain of discrete nodes
• Find the marginal for a particular node $x_n$
  – for M-state nodes, naive cost is $O(M^N)$: exponential in the length of the chain
  – but we can exploit the graphical structure (conditional independences)
• Joint distribution
  $p(x) = \frac{1}{Z}\, \psi_{1,2}(x_1, x_2)\, \psi_{2,3}(x_2, x_3) \cdots \psi_{N-1,N}(x_{N-1}, x_N)$
• Exchange sums and products
  $p(x_n) = \frac{1}{Z} \Big[ \sum_{x_{n-1}} \psi_{n-1,n}(x_{n-1}, x_n) \cdots \Big[ \sum_{x_1} \psi_{1,2}(x_1, x_2) \Big] \Big] \Big[ \sum_{x_{n+1}} \psi_{n,n+1}(x_n, x_{n+1}) \cdots \Big]$
• Express as a product of messages
  $p(x_n) = \frac{1}{Z}\, \mu_\alpha(x_n)\, \mu_\beta(x_n)$
• Recursive evaluation of messages
  $\mu_\alpha(x_n) = \sum_{x_{n-1}} \psi_{n-1,n}(x_{n-1}, x_n)\, \mu_\alpha(x_{n-1})$, and similarly for $\mu_\beta$
• Find Z by normalizing
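The recursion above can be sketched for a hypothetical chain of 3-state nodes with an invented positive potential; the message-passing marginal is checked against brute-force summation over the full joint:

```python
import itertools

# Hypothetical 5-node chain of 3-state variables with pairwise potentials;
# sum-product messages give the marginal p(x_n) in O(N M^2) instead of O(M^N).
N, M = 5, 3
def psi(i, a, b):
    # invented positive potential on edge (i, i+1)
    return 1.0 + ((a * 3 + b + i) % 4)

# Forward messages: mu_a[n][x] = sum_{x'} psi(n-1, x', x) * mu_a[n-1][x']
mu_a = [[1.0] * M]
for n in range(1, N):
    mu_a.append([sum(psi(n - 1, ap, a) * mu_a[n - 1][ap] for ap in range(M))
                 for a in range(M)])
# Backward messages, evaluated from the other end of the chain.
mu_b = [[1.0] * M for _ in range(N)]
for n in range(N - 2, -1, -1):
    mu_b[n] = [sum(psi(n, a, bp) * mu_b[n + 1][bp] for bp in range(M))
               for a in range(M)]

def marginal(n):
    # p(x_n) is proportional to the product of the two incoming messages.
    unnorm = [mu_a[n][a] * mu_b[n][a] for a in range(M)]
    Z = sum(unnorm)
    return [u / Z for u in unnorm]

# Brute-force check against the explicit (unnormalized) joint.
def joint(x):
    prod = 1.0
    for i in range(N - 1):
        prod *= psi(i, x[i], x[i + 1])
    return prod

n = 2
brute = [0.0] * M
for x in itertools.product(range(M), repeat=N):
    brute[x[n]] += joint(x)
Ztot = sum(brute)
for a in range(M):
    assert abs(marginal(n)[a] - brute[a] / Ztot) < 1e-9
```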
Belief Propagation
• Extension to general tree-structured graphs
• At each node:
  – form product of incoming messages and local evidence
  – marginalize to give outgoing message
  – one message in each direction across every link
• Fails if there are loops
Junction Tree Algorithm
• An efficient exact algorithm for a general graph
  – applies to both directed and undirected graphs
  – compile original graph into a tree of cliques
  – then perform message passing on this tree
• Problem:
  – cost is exponential in size of largest clique
  – many vision models have intractably large cliques
Loopy Belief Propagation
• Apply belief propagation directly to general graph
  – need to keep iterating
  – might not converge
• State-of-the-art performance in error-correcting codes
Max-product Algorithm
• Goal: find the most probable configuration $x^{\max} = \arg\max_x p(x)$
  – define $p^{\max} = \max_x p(x)$
  – then $p^{\max} = \max_{x_1} \cdots \max_{x_N} p(x)$
• Message passing algorithm with "sum" replaced by "max"
• Example:
  – Viterbi algorithm for HMMs
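A minimal Viterbi sketch for a hypothetical 2-state HMM (all probabilities invented). Working in log space, `delta` holds the max-product messages and `back` the arg-max pointers used to decode the most probable hidden path:

```python
import math

# Hypothetical 2-state HMM: state 0 mostly emits symbol 0, state 1 symbol 1.
trans = [[0.7, 0.3], [0.4, 0.6]]   # p(z_t | z_{t-1})
emit = [[0.9, 0.1], [0.2, 0.8]]    # p(obs | z)
init = [0.5, 0.5]
obs = [0, 0, 1, 1, 0]

# delta[t][k] = max over paths ending in state k of log p(path, obs_1..t)
delta = [[math.log(init[k]) + math.log(emit[k][obs[0]]) for k in range(2)]]
back = []
for t in range(1, len(obs)):
    row, ptr = [], []
    for k in range(2):
        # "sum" of sum-product replaced by "max", plus an arg-max pointer.
        scores = [delta[-1][j] + math.log(trans[j][k]) for j in range(2)]
        j_best = max(range(2), key=lambda j: scores[j])
        row.append(scores[j_best] + math.log(emit[k][obs[t]]))
        ptr.append(j_best)
    delta.append(row)
    back.append(ptr)

# Decode by following back-pointers from the best final state.
state = max(range(2), key=lambda k: delta[-1][k])
path = [state]
for ptr in reversed(back):
    state = ptr[state]
    path.append(state)
path.reverse()
```

With these numbers the decoded path tracks the observations, since each state strongly prefers its own symbol.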
Inference and Learning
• Data set $X = \{x_1, \ldots, x_N\}$
• Likelihood function (independent observations)
  $p(X \mid \theta) = \prod_n p(x_n \mid \theta)$
• Maximize (log) likelihood
  $\ln p(X \mid \theta) = \sum_n \ln p(x_n \mid \theta)$
• Predictive distribution $p(x \mid \theta^{\mathrm{ML}})$
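As a concrete case, the Gaussian log likelihood is maximized in closed form by the sample mean and (biased) sample variance; a sketch with invented data:

```python
import math

# Hypothetical 1-D data; for a Gaussian, sum_n ln p(x_n | mu, var) is
# maximized by the sample mean and the biased sample variance.
data = [2.1, 1.9, 2.4, 2.0, 1.6]
N = len(data)
mu_ml = sum(data) / N
var_ml = sum((x - mu_ml) ** 2 for x in data) / N

def log_lik(mu, var):
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
               for x in data)

# The closed-form estimate beats nearby parameter settings.
assert log_lik(mu_ml, var_ml) >= log_lik(mu_ml + 0.1, var_ml)
assert log_lik(mu_ml, var_ml) >= log_lik(mu_ml, var_ml + 0.1)
```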
Regularized Maximum Likelihood
• Prior $p(\theta)$, posterior $p(\theta \mid X) \propto p(X \mid \theta)\, p(\theta)$
• MAP (maximum posterior)
  $\theta^{\mathrm{MAP}} = \arg\max_\theta \big[ \ln p(X \mid \theta) + \ln p(\theta) \big]$
• Predictive distribution $p(x \mid \theta^{\mathrm{MAP}})$
• Not really Bayesian
Bayesian Learning
• Key idea is to marginalize over unknown parameters, rather than make point estimates
  – avoids severe over-fitting of ML and MAP
  – allows direct model comparison
• Parameters are now latent variables
• Bayesian learning is an inference problem!
And Finally … the Exponential Family
• Many distributions can be written in the form
  $p(x \mid \eta) = h(x)\, g(\eta) \exp\{\eta^{\mathrm{T}} u(x)\}$
• Includes:
  – Gaussian
  – Dirichlet
  – Gamma
  – Multinomial
  – Wishart
  – Bernoulli
  – …
• Building blocks in graphs to give rich probabilistic models
Illustration: the Gaussian
• Use precision (inverse variance) $\lambda$
  $p(x \mid \mu, \lambda) = \left(\frac{\lambda}{2\pi}\right)^{1/2} \exp\left\{ -\frac{\lambda}{2}(x - \mu)^2 \right\}$
• In standard form
  $u(x) = (x, x^2)^{\mathrm{T}}, \quad \eta = (\lambda\mu, -\lambda/2)^{\mathrm{T}}$
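The precision form and the natural-parameter standard form can be confirmed to agree numerically; a sketch with invented $\mu$ and $\lambda$:

```python
import math

# Numeric check that the precision form of the Gaussian,
#   p(x | mu, lam) = sqrt(lam / 2 pi) exp(-lam (x - mu)^2 / 2),
# equals the exponential-family form g(eta) exp(eta1 * x + eta2 * x^2)
# with natural parameters eta1 = lam * mu, eta2 = -lam / 2.
mu, lam = 1.3, 2.0

def gauss(x):
    return math.sqrt(lam / (2 * math.pi)) * math.exp(-0.5 * lam * (x - mu) ** 2)

eta1, eta2 = lam * mu, -lam / 2
# Completing the square moves the x-independent factor into g(eta).
g = math.sqrt(lam / (2 * math.pi)) * math.exp(-0.5 * lam * mu ** 2)

for x in (-1.0, 0.0, 0.5, 2.0):
    assert abs(gauss(x) - g * math.exp(eta1 * x + eta2 * x * x)) < 1e-12
```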
Maximum Likelihood
• Likelihood function (independent observations)
  $p(X \mid \eta) = \Big( \prod_n h(x_n) \Big)\, g(\eta)^N \exp\Big\{ \eta^{\mathrm{T}} \sum_n u(x_n) \Big\}$
• Depends on the data only via the sufficient statistics $\sum_n u(x_n)$, which have fixed dimension
Conjugate Priors
• Prior has same functional form as likelihood
  $p(\eta \mid \chi, \nu) \propto g(\eta)^\nu \exp\{\eta^{\mathrm{T}} \chi\}$
• Hence posterior is of the form
  $p(\eta \mid X, \chi, \nu) \propto g(\eta)^{\nu + N} \exp\Big\{ \eta^{\mathrm{T}} \Big( \chi + \sum_n u(x_n) \Big) \Big\}$
• Can interpret the prior as $\nu$ effective observations of average value $\chi / \nu$
• Examples:
  – Gaussian for the mean of a Gaussian
  – Gaussian-Wishart for mean and precision of a Gaussian
  – Dirichlet for the parameters of a discrete distribution
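A sketch of the first listed example with invented numbers: a Gaussian prior on the mean of a Gaussian likelihood with known precision is conjugate, so the posterior update just adds precisions and precision-weights the means:

```python
# Conjugate update for the mean of a Gaussian with known precision lam:
# prior N(mu | m0, tau0^-1) and likelihood prod_n N(x_n | mu, lam^-1)
# give a Gaussian posterior N(mu | m_post, tau_post^-1).
lam = 4.0                      # known likelihood precision (invented)
m0, tau0 = 0.0, 1.0            # invented prior mean and precision on mu
data = [0.9, 1.1, 1.0, 1.2]
N = len(data)

tau_post = tau0 + N * lam                           # precisions add
m_post = (tau0 * m0 + lam * sum(data)) / tau_post   # precision-weighted mean
```

With four observations near 1.0 the posterior mean is pulled from the prior mean 0 almost all the way to the data, reflecting the prior acting like a single effective observation.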
Summary of Part 1
• Directed graphs: $p(x) = \prod_i p(x_i \mid \mathrm{pa}_i)$
• Undirected graphs: $p(x) = \frac{1}{Z} \prod_C \psi_C(x_C)$
• Inference by message passing: belief propagation