TRANSCRIPT
Computer vision: models, learning and inference
Chapter 8 Regression
Structure
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince
• Linear regression
• Bayesian solution
• Non-linear regression
• Kernelization and Gaussian processes
• Sparse linear regression
• Dual linear regression
• Relevance vector regression
• Applications
Models for machine vision
Body Pose Regression
Encode the silhouette as a 100×1 vector and the body pose as a 55×1 vector, then learn the relationship between them.
Type 1: Model Pr(w|x) - Discriminative
How to model Pr(w|x)?
• Choose an appropriate form for Pr(w)
• Make the parameters a function of x
• The function takes parameters θ that define its shape

Learning algorithm: learn the parameters θ from training data pairs {x, w}
Inference algorithm: just evaluate Pr(w|x)
Linear Regression
• For simplicity we will assume that each dimension of the world state is predicted separately.
• Concentrate on predicting a univariate world state w.

Choose a normal distribution over the world state w, and make:
• the mean a linear function of the data x
• the variance constant

This gives the model Pr(w|x) = Norm_w[φ₀ + φᵀx, σ²].
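As a minimal sketch (hypothetical NumPy code, not from the book), the model is just a Gaussian density whose mean is linear in x and whose variance is constant:

```python
import numpy as np

def pr_w_given_x(w, x, phi0, phi, sigma2):
    # Pr(w|x) = Norm_w[phi0 + phi^T x, sigma^2]
    mean = phi0 + phi @ x
    return np.exp(-(w - mean) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
```

Evaluating the density at the mean returns the Gaussian peak value 1/√(2πσ²).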
Linear Regression
Neater Notation
To make the notation easier to handle, we:
• attach a 1 to the start of every data vector x
• attach the offset φ₀ to the start of the gradient vector φ

New model: Pr(w|x) = Norm_w[φᵀx, σ²]
Combining Equations
We have one equation for each {x_i, w_i} pair:
Pr(w_i|x_i) = Norm_{w_i}[φᵀx_i, σ²]

The likelihood of the whole dataset is the product of these individual distributions and can be written as
Pr(w|X) = Norm_w[Xᵀφ, σ²I]
where X = [x₁, x₂, …, x_I] and w = [w₁, w₂, …, w_I]ᵀ.
Learning
Maximum likelihood:
φ̂ = argmax_φ Pr(w|X, φ, σ²)

Substituting in the normal likelihood, taking the derivative, setting the result to zero, and re-arranging gives:
φ̂ = (XXᵀ)⁻¹Xw
σ̂² = (w − Xᵀφ̂)ᵀ(w − Xᵀφ̂)/I
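These closed-form estimators are easy to check numerically. The sketch below uses hypothetical data; X stores one data vector per column with a prepended 1, as in the neater notation above:

```python
import numpy as np

rng = np.random.default_rng(0)
I, D = 50, 3
X = rng.normal(size=(D, I))            # columns are data vectors x_i
X = np.vstack([np.ones(I), X])         # attach a 1 to the start of every vector
true_phi = np.array([1.0, 2.0, -1.0, 0.5])
w = X.T @ true_phi + 0.1 * rng.normal(size=I)    # noisy linear world states

# phi_hat = (X X^T)^{-1} X w  (solve is more stable than an explicit inverse)
phi_hat = np.linalg.solve(X @ X.T, X @ w)
# sigma2_hat = (w - X^T phi_hat)^T (w - X^T phi_hat) / I
sigma2_hat = np.mean((w - X.T @ phi_hat) ** 2)
```

With 50 examples and noise σ = 0.1, the recovered φ̂ is close to the generating parameters.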
Regression Models
Bayesian Regression
Likelihood: Pr(w|X, φ) = Norm_w[Xᵀφ, σ²I]
Prior: Pr(φ) = Norm_φ[0, σ_p²I]

(We concentrate on φ – we come back to σ² later!)

Bayes' rule:
Pr(φ|X, w) = Pr(w|X, φ)Pr(φ) / Pr(w|X)
Posterior Dist. over Parameters
Pr(φ|X, w) = Norm_φ[(1/σ²)A⁻¹Xw, A⁻¹]
where
A = (1/σ²)XXᵀ + (1/σ_p²)I
Inference
Take an expectation over the posterior on φ to get the predictive distribution:
Pr(w*|x*, X, w) = Norm_{w*}[(1/σ²)x*ᵀA⁻¹Xw, x*ᵀA⁻¹x* + σ²]
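The posterior and predictive formulas translate directly into NumPy. This is a sketch with hypothetical variable names; X holds one example per column:

```python
import numpy as np

def posterior_phi(X, w, sigma2, sigma_p2):
    # A = X X^T / sigma^2 + I / sigma_p^2; posterior is Norm[A^{-1} X w / sigma^2, A^{-1}]
    D = X.shape[0]
    A = X @ X.T / sigma2 + np.eye(D) / sigma_p2
    A_inv = np.linalg.inv(A)
    return A_inv @ X @ w / sigma2, A_inv

def predictive(x_star, X, w, sigma2, sigma_p2):
    # Mean and variance of Pr(w*|x*, X, w)
    mu, A_inv = posterior_phi(X, w, sigma2, sigma_p2)
    return x_star @ mu, x_star @ A_inv @ x_star + sigma2
```

With plenty of data and a weak prior, the posterior mean approaches the maximum-likelihood solution.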
Practical Issue
Problem: in high dimensions, the D×D matrix A may be too big to invert.

Solution: re-express using the matrix inversion lemma:
A⁻¹ = σ_p²I_D − σ_p²X(XᵀX + (σ²/σ_p²)I_I)⁻¹Xᵀ

In the final expression the inverses are (I×I), not (D×D).
Fitting Variance
• We'll fit the variance σ² with maximum likelihood
• Optimize the marginal likelihood (the likelihood after the gradient parameters φ have been integrated out)
Regression Models
Non-Linear Regression
GOAL:
Keep the math of linear regression, but extend to more general functions
KEY IDEA:
You can make a non-linear function from a linear weighted sum of non-linear basis functions
Non-linear regression
Linear regression: Pr(w_i|x_i) = Norm_{w_i}[φᵀx_i, σ²]

Non-linear regression: Pr(w_i|x_i) = Norm_{w_i}[φᵀz_i, σ²]
where z_i = f[x_i] is a fixed non-linear transformation of x_i.
In other words, create z by evaluating x against basis functions, then linearly regress against z.
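A sketch of this recipe with radial basis functions (hypothetical centers and data): build Z from the basis-function responses, then reuse the linear-regression ML formula with Z in place of X:

```python
import numpy as np

def rbf_features(x, centers, lam=1.0):
    # z_i = [1, exp(-(x_i - alpha_1)^2 / lam), ..., exp(-(x_i - alpha_K)^2 / lam)]
    Z = np.exp(-((x[:, None] - centers[None, :]) ** 2) / lam)
    return np.hstack([np.ones((len(x), 1)), Z]).T    # columns are the z_i

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100)
w = np.sin(x) + 0.05 * rng.normal(size=100)          # non-linear ground truth

Z = rbf_features(x, np.linspace(-3, 3, 10))
phi = np.linalg.solve(Z @ Z.T, Z @ w)                # phi = (Z Z^T)^{-1} Z w
pred = Z.T @ phi                                     # fitted non-linear function
```

Even though the fit is linear in φ, the resulting function of x is non-linear.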
Example: polynomial regression

A special case of the non-linear model where the basis functions are polynomials:
z_i = [1, x_i, x_i², x_i³]ᵀ
Radial basis functions
Arc Tan Functions
Maximum Likelihood

Same as linear regression, but substitute Z for X:
φ̂ = (ZZᵀ)⁻¹Zw
σ̂² = (w − Zᵀφ̂)ᵀ(w − Zᵀφ̂)/I
Regression Models
Bayesian Approach
Learn σ² from the marginal likelihood as before.

Final predictive distribution:
Pr(w*|z*, Z, w) = Norm_{w*}[(1/σ²)z*ᵀA⁻¹Zw, z*ᵀA⁻¹z* + σ²]
where A = (1/σ²)ZZᵀ + (1/σ_p²)I
The Kernel Trick
Notice that the final equation doesn't need the data itself, but just dot products between data items of the form z_iᵀz_j.

So, we take data x_i and x_j, pass them through the non-linear function to create z_i and z_j, and then take the dot product z_iᵀz_j.
The Kernel Trick
So, we take data x_i and x_j, pass them through the non-linear function to create z_i and z_j, and then take the dot product z_iᵀz_j.

Key idea:
Define a "kernel" function that does all of this together:
• takes data x_i and x_j
• returns a value for the dot product z_iᵀz_j

If we choose this function carefully, then it will correspond to some underlying transformation z = f[x].

We never compute z explicitly - it can be very high or even infinite dimensional.
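A tiny concrete instance (hypothetical 2-D example): the kernel k[a, b] = (aᵀb)² corresponds to an explicit quadratic feature map, and the dot product of the mapped vectors equals the kernel value:

```python
import numpy as np

def f(x):
    # Explicit feature map for k[a, b] = (a^T b)^2 in two dimensions
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

a, b = np.array([1.0, 2.0]), np.array([3.0, -1.0])
kernel_value = (a @ b) ** 2          # computed without ever forming z
dot_of_features = f(a) @ f(b)        # z_a^T z_b
```

Both quantities equal 1.0 for this pair; the kernel route never materializes the feature vectors.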
Gaussian Process Regression
Before: the predictive distribution is written in terms of dot products z_iᵀz_j.
After: every dot product is replaced by a kernel evaluation k[x_i, x_j]. The resulting model is Gaussian process regression.
Example Kernels
(Equivalent to having an infinite number of radial basis functions at every position in space. Wow!)
RBF Kernel Fits
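A sketch of RBF-kernel fitting using the standard textbook GP predictive equations (with the prior variance absorbed into the kernel; data and hyper-parameters here are hypothetical):

```python
import numpy as np

def rbf_kernel(A, B, lam=1.0):
    # k[x_i, x_j] = exp(-||x_i - x_j||^2 / (2 lam^2))
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * lam ** 2))

def gp_predict(X, w, X_star, sigma2=0.01, lam=1.0):
    # Predictive mean and variance at the test points X_star
    K = rbf_kernel(X, X, lam) + sigma2 * np.eye(len(X))
    K_star = rbf_kernel(X_star, X, lam)
    mean = K_star @ np.linalg.solve(K, w)
    var = (rbf_kernel(X_star, X_star, lam).diagonal()
           - np.einsum('ij,ji->i', K_star, np.linalg.solve(K, K_star.T))
           + sigma2)
    return mean, var
```

Only kernel evaluations appear; the (possibly infinite-dimensional) z vectors are never formed.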
Fitting Variance
• We'll fit the variance σ² with maximum likelihood
• Optimize the marginal likelihood (the likelihood after the gradient parameters have been integrated out)
• Have to use non-linear optimization
Regression Models
Sparse Linear Regression
Perhaps not every dimension of the data x is informative.

A sparse solution forces some of the coefficients in φ to be zero.

Method:
• apply a different prior on φ that encourages sparsity
• use a product of t-distributions
Sparse Linear Regression
Apply a product of t-distributions to the parameter vector:
Pr(φ) = ∏_{d=1}^{D} Stud_{φ_d}[0, 1, ν]

As before, we use the normal likelihood Pr(w|X) = Norm_w[Xᵀφ, σ²I].

Now the prior is not conjugate to the normal likelihood, so we cannot compute the posterior in closed form.
Sparse Linear Regression
To make progress, write the prior as the marginal of a joint distribution over φ and hidden variables {h_d}:
Pr(φ) = ∫ Norm_φ[0, H⁻¹] ∏_{d=1}^{D} Gam_{h_d}[ν/2, ν/2] dH
where H is a diagonal matrix with the hidden variables {h_d} on its diagonal.
Sparse Linear Regression
Substituting in the prior gives the marginal likelihood. We still cannot compute this exactly, but we can approximate it.
Sparse Linear Regression
To fit the model, we alternately update the variance σ² and the hidden variables {h_d}:
• update each hidden variable h_d from the current posterior over φ
• update the variance σ² from the residuals
Sparse Linear Regression
After fitting, some of the hidden variables h_d become very large. This implies that the corresponding prior is tightly fitted around zero, so those dimensions can be eliminated from the model.
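One concrete way to implement the alternating scheme is the classic ARD fixed-point update, in the style of Tipping's relevance-vector updates. This is a sketch under those assumptions, not necessarily the book's exact equations, and the data are hypothetical:

```python
import numpy as np

def sparse_linear_regression(X, w, n_iter=100, sigma2=0.1):
    # X: D x I data matrix (columns are examples), w: I targets
    D, I = X.shape
    h = np.ones(D)                            # hidden variables (prior precisions)
    for _ in range(n_iter):
        A = X @ X.T / sigma2 + np.diag(h)
        Sigma = np.linalg.inv(A)              # posterior covariance of phi
        mu = Sigma @ X @ w / sigma2           # posterior mean of phi
        gamma = 1.0 - h * np.diag(Sigma)      # how well-determined each dimension is
        h = gamma / np.maximum(mu ** 2, 1e-12)  # fixed-point update for h_d
        h = np.minimum(h, 1e12)               # huge h_d => dimension pruned
        sigma2 = np.sum((w - X.T @ mu) ** 2) / max(I - gamma.sum(), 1.0)
    return mu, h, sigma2
```

Dimensions that do not help predict w end up with enormous h_d (prior clamped to zero) and near-zero posterior means.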
Sparse Linear Regression
This doesn't work in the non-linear case: we need one hidden variable per dimension of the transformed space, which becomes intractable for high-dimensional transformations. To solve this problem, we move to the dual model.
Dual Linear Regression
KEY IDEA:
The gradient φ is just a vector in the data space. We can represent it as a weighted sum of the data points:
φ = Xψ
Now solve for ψ. There is one parameter per training example.
Dual Linear Regression
Original linear regression: Pr(w|X) = Norm_w[Xᵀφ, σ²I]
Dual variables: φ = Xψ
Dual linear regression: Pr(w|X) = Norm_w[XᵀXψ, σ²I]
Maximum likelihood
Maximum likelihood solution for the dual variables:
ψ̂ = (XᵀX)⁻¹w

Substituting back through φ = Xψ gives the same result as before.
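A numerical check on hypothetical data that the dual route reproduces the primal solution. Here I > D, so XᵀX is singular and a pseudo-inverse stands in for the inverse in the dual solve:

```python
import numpy as np

rng = np.random.default_rng(0)
D, I = 4, 30
X = rng.normal(size=(D, I))            # columns are examples
w = rng.normal(size=I)

# Primal: phi = (X X^T)^{-1} X w
phi_primal = np.linalg.solve(X @ X.T, X @ w)

# Dual: psi = (X^T X)^+ w, then phi = X psi
psi = np.linalg.pinv(X.T @ X) @ w
phi_dual = X @ psi
```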
Bayesian case
Compute the posterior distribution over the dual parameters ψ by applying Bayes' rule with a normal prior over ψ. The result is again normal, with a mean and covariance that depend on the data only through XᵀX.
Bayesian case
Predictive distribution: again normal, with a mean and variance that depend on the data only through dot products.

Notice that both the maximum likelihood and the Bayesian cases depend on the data only through dot products XᵀX. They can be kernelized!
Regression Models
Relevance Vector Machine
Combines ideas of
• Dual regression (1 parameter per training example)
• Sparsity (most of the parameters are zero)
i.e., a model that depends only sparsely on the training data.
Relevance Vector Machine
Using the same approximations as for the sparse model, we get an analogous optimization problem.

To solve it, update the variance σ² and the hidden variables {h_d} alternately.

Notice that this only depends on dot products, and so it can be kernelized.
Body Pose Regression (Agarwal and Triggs 2006)
Encode the silhouette as a 100×1 vector and the body pose as a 55×1 vector, then learn the relationship between them.
Shape Context
Returns a 60×1 vector for each of 400 points around the silhouette.
Dimensionality Reduction
Cluster the 60-D space (based on all training data) into 100 vectors.
Assign each 60×1 vector to the closest cluster (Voronoi partition).
The final data vector is a 100×1 histogram over the distribution of assignments.
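The vector-quantization step can be sketched as follows (toy dimensions here; in the application the descriptors are 60-D shape contexts and there are 100 cluster centers):

```python
import numpy as np

def vq_histogram(descriptors, centers):
    # Assign each descriptor to its closest cluster center (Voronoi partition)
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    assignments = d2.argmin(axis=1)
    # Normalized histogram over the assignments: the final data vector
    hist = np.bincount(assignments, minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

The histogram summarizes hundreds of local descriptors as one fixed-length vector suitable for regression.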
Results
• 2636 training examples; the solution depends on only 6% of these
• 6 degrees average error
Displacement experts
Regression
• Not actually used much in vision
• But the main ideas all apply to classification:
– non-linear transformations
– kernelization
– dual parameters
– sparse priors