Computer vision: models, learning and inference
Chapter 9 Classification models
Please send errata to [email protected]
Structure
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince
• Logistic regression
• Bayesian logistic regression
• Non-linear logistic regression
• Kernelization and Gaussian process classification
• Incremental fitting, boosting and trees
• Multi-class classification
• Random classification trees
• Non-probabilistic classification
• Applications
Models for machine vision
Example application: Gender Classification
Type 1: Model Pr(w|x) - Discriminative
How to model Pr(w|x)?
– Choose an appropriate form for Pr(w)
– Make the parameters a function of x
– The function takes parameters θ that define its shape
Learning algorithm: learn the parameters θ from training data x, w
Inference algorithm: just evaluate Pr(w|x)
Logistic Regression
Consider the two-class problem.
• Choose a Bernoulli distribution over the world state w
• Make the parameter λ a function of the data x
Model the activation with a linear function of the data: a = φ₀ + φᵀx
This creates a number between −∞ and ∞. The logistic sigmoid sig[a] = 1/(1 + exp[−a]) maps it to the range (0, 1), giving Pr(w=1|x) = sig[a].
Two parameters: the offset φ₀ and the gradient φ₁
Learning: by standard methods (ML, MAP, Bayesian)
Inference: just evaluate Pr(w|x)
Neater Notation
To make notation easier to handle, we• Attach a 1 to the start of every data vector
• Attach the offset φ₀ to the start of the gradient vector φ
New model: Pr(w=1|x) = sig[φᵀx]
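A minimal sketch of this model in Python (the parameter values are made up for illustration): prepend a 1 to the data vector, take the inner product with φ, and pass the activation through the logistic sigmoid.

```python
import math

def sig(a):
    # Logistic sigmoid: maps any real activation to the range (0, 1)
    return 1.0 / (1.0 + math.exp(-a))

def pr_w1(phi, x):
    # Prepend a 1 to the data vector so that phi[0] acts as the offset
    x_aug = [1.0] + list(x)
    a = sum(p * xi for p, xi in zip(phi, x_aug))   # activation a = phi^T x
    return sig(a)                                  # Pr(w = 1 | x)

phi = [-1.0, 2.0]            # offset -1, gradient 2 (illustrative values)
print(pr_w1(phi, [0.5]))     # activation is 0, so the probability is 0.5
```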
Logistic regression
Maximum Likelihood
Likelihood: Pr(w|X, φ) = ∏ᵢ sig[aᵢ]^wᵢ (1 − sig[aᵢ])^(1−wᵢ), where aᵢ = φᵀxᵢ
Take logarithm: L = Σᵢ wᵢ log sig[aᵢ] + (1 − wᵢ) log(1 − sig[aᵢ])
Take derivative: ∂L/∂φ = −Σᵢ (sig[aᵢ] − wᵢ) xᵢ
Derivatives
Unfortunately, there is no closed form solution – we cannot get an expression for φ in terms of x and w
Have to use a general purpose technique:
“iterative non-linear optimization”
Optimization
Goal: find the parameters that minimize a cost function, φ̂ = argmin f[φ]
How can we find the minimum?
Basic idea:
• Start with an initial estimate φ[0]
• Take a series of small steps to φ[1], φ[2], …
• Make sure that each step decreases the cost
• When we can no longer improve, we must be at a minimum
Cost function or objective function
Local Minima
Convexity
If a function is convex, then it has only a single minimum.
We can tell whether a function is convex by looking at its second derivatives.
Gradient Based Optimization
• Choose a search direction s based on the local properties of the function
• Perform an intensive search along the chosen direction. This is called line search
• Then set φ[t+1] = φ[t] + λ·s, where the distance λ along the search direction s is chosen by the line search
Gradient Descent
Consider standing on a hillside
Look at gradient where you are standing
Find the steepest direction downhill
Walk in that direction for some distance (line search)
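The hill-walking procedure above can be sketched as follows. The cost function, candidate step lengths, and iteration count are illustrative choices, and the "line search" here is just a crude search over a few step lengths rather than a proper bracketing method.

```python
def f(p):
    # Cost function to minimize (an arbitrary convex example); minimum at (1, -2)
    x, y = p
    return (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2

def grad(p):
    # Analytic gradient of f
    x, y = p
    return [2.0 * (x - 1.0), 4.0 * (y + 2.0)]

p = [0.0, 0.0]
for _ in range(50):
    g = grad(p)
    s = [-gi for gi in g]                 # steepest-descent search direction
    # Crude line search: evaluate a few step lengths and keep the best one
    lam = min((f([p[0] + l * s[0], p[1] + l * s[1]]), l)
              for l in (0.001, 0.01, 0.1, 0.5, 1.0))[1]
    p = [p[0] + lam * s[0], p[1] + lam * s[1]]   # take the step
print(p)   # close to the minimum at (1, -2)
```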
Finite differences
What if we can’t compute the gradient?
Compute a finite-difference approximation:
∂f/∂φⱼ ≈ (f[φ + h·eⱼ] − f[φ]) / h
where eⱼ is the unit vector in the jth direction and h is a small step.
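A sketch of this approximation (the test function and the step size h are arbitrary choices):

```python
def finite_diff_grad(f, p, h=1e-6):
    # Approximate each partial derivative with a one-sided finite difference
    g = []
    for j in range(len(p)):
        pj = list(p)
        pj[j] += h                 # step along the j-th unit vector e_j
        g.append((f(pj) - f(p)) / h)
    return g

f = lambda p: p[0] ** 2 + 3.0 * p[0] * p[1]   # arbitrary test function
print(finite_diff_grad(f, [1.0, 2.0]))        # analytic gradient is [8, 3]
```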
Steepest Descent Problems (close-up)
Second Derivatives
In higher dimensions, the second derivatives determine how far we should move in each direction, and the curvature changes the best direction to move in.
Newton’s Method
Approximate the function with a Taylor expansion around the current estimate φ[t]:
f[φ] ≈ f[φ[t]] + (φ − φ[t])ᵀg + ½(φ − φ[t])ᵀH(φ − φ[t])
Take the derivative and set it to zero: g + H(φ − φ[t]) = 0
Re-arrange: φ[t+1] = φ[t] − H⁻¹g
Adding a line search over the step size λ: φ[t+1] = φ[t] − λ·H⁻¹g
Newton’s Method
The matrix of second derivatives is called the Hessian.
It is expensive to compute via finite differences.
If the Hessian is positive definite everywhere, then the function is convex.
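A one-dimensional sketch of the Newton update (the convex test function is an arbitrary choice, and no line search is used here):

```python
import math

def newton_1d(df, d2f, x, iters=10):
    # Newton update: x <- x - f'(x) / f''(x)
    for _ in range(iters):
        x = x - df(x) / d2f(x)
    return x

# Minimize f(x) = exp(x) - 2x; f'(x) = exp(x) - 2, f''(x) = exp(x) > 0 (convex)
x_min = newton_1d(lambda x: math.exp(x) - 2.0, lambda x: math.exp(x), 0.0)
print(x_min)   # converges to the true minimizer ln 2
```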
Newton vs. Steepest Descent
Line Search
Gradually narrow down the range
Optimization for Logistic Regression
Derivatives of the negative log likelihood:
g = Σᵢ (sig[aᵢ] − wᵢ) xᵢ
H = Σᵢ sig[aᵢ](1 − sig[aᵢ]) xᵢxᵢᵀ
The Hessian is positive definite! The cost function is convex, so there are no local minima.
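Because the gradient and Hessian have these closed forms, Newton's method fits logistic regression directly. A pure-Python sketch for 1D data with a prepended 1 (the toy data set is made up, and the 2×2 matrix inverse is written out by hand):

```python
import math

def sig(a):
    return 1.0 / (1.0 + math.exp(-a))

def fit_logistic_newton(xs, ws, iters=10):
    # xs: scalar inputs, ws: 0/1 labels; phi = [offset, gradient]
    phi = [0.0, 0.0]
    for _ in range(iters):
        g = [0.0, 0.0]                    # gradient of negative log likelihood
        H = [[0.0, 0.0], [0.0, 0.0]]      # Hessian (positive definite)
        for x, w in zip(xs, ws):
            xa = [1.0, x]                 # prepend a 1 to the data
            p = sig(phi[0] * xa[0] + phi[1] * xa[1])
            for i in range(2):
                g[i] += (p - w) * xa[i]
                for j in range(2):
                    H[i][j] += p * (1.0 - p) * xa[i] * xa[j]
        # Newton step: phi <- phi - H^{-1} g (2x2 inverse written out)
        det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
        phi[0] -= (H[1][1] * g[0] - H[0][1] * g[1]) / det
        phi[1] -= (H[0][0] * g[1] - H[1][0] * g[0]) / det
    return phi

# Toy, non-separable data (one noisy label on each side of the boundary)
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ws = [0, 0, 1, 0, 1, 1]
phi = fit_logistic_newton(xs, ws)
```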
Maximum likelihood fits
Bayesian Logistic Regression
Likelihood: Pr(w=1|x, φ) = sig[φᵀx]
Prior (not conjugate): Pr(φ) = Norm[0, σₚ²I]
Apply Bayes’ rule:
(no closed form solution for posterior)
Laplace Approximation
Approximate the posterior distribution with a normal:
• Set the mean to the MAP estimate
• Set the covariance so its curvature matches that of the posterior at the MAP estimate
Laplace Approximation
Find the MAP solution by optimizing the log posterior.
Approximate with a normal: Pr(φ|X, w) ≈ Norm[φ̂, Σ]
where φ̂ is the MAP estimate and Σ = −H⁻¹ is the negative inverse Hessian of the log posterior at φ̂.
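A one-parameter sketch of the idea (toy data, an assumed prior standard deviation, a crude grid search standing in for a proper MAP optimizer, and a finite-difference second derivative):

```python
import math

def sig(a):
    return 1.0 / (1.0 + math.exp(-a))

# Toy 1D data with a single slope parameter phi (no offset, for brevity)
xs = [-2.0, -1.0, 1.0, 2.0]
ws = [0, 0, 1, 1]
sigma_p = 3.0        # assumed prior standard deviation

def log_posterior(phi):
    # log likelihood plus log normal prior (up to an additive constant)
    ll = sum(w * math.log(sig(phi * x)) + (1 - w) * math.log(1.0 - sig(phi * x))
             for x, w in zip(xs, ws))
    return ll - phi ** 2 / (2.0 * sigma_p ** 2)

# MAP estimate: crude grid search stands in for a proper optimizer
phi_map = max((log_posterior(p / 100.0), p / 100.0) for p in range(-500, 501))[1]

# Laplace approximation: normal with mean phi_map and variance set from the
# (finite-difference) second derivative of the log posterior at the peak
h = 1e-4
d2 = (log_posterior(phi_map + h) - 2.0 * log_posterior(phi_map)
      + log_posterior(phi_map - h)) / h ** 2
var = -1.0 / d2      # approximate posterior is Norm[phi_map, var]
```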
Laplace Approximation
(figure: the prior, the actual posterior, and the normal approximation)
Inference
Using the transformation properties of normal distributions, we can re-express the prediction in terms of the activation a = φᵀx:
Pr(a|x, X, w) ≈ Norm[μᵀx, xᵀΣx]
Approximation of Integral
The predictive integral ∫ sig[a] Norm[μₐ, σₐ²] da has no closed form, but it is well approximated by sig[ μₐ / √(1 + πσₐ²/8) ].
Bayesian Solution
Non-linear logistic regression
Same idea as for regression.
• Apply a non-linear transformation to the data: z = f[x]
• Build the model as usual on the transformed data: Pr(w=1|x) = sig[φᵀz]
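A sketch of this recipe with radial basis functions as the assumed non-linear transformation (toy data chosen so the classes are not linearly separable in x; plain gradient ascent stands in for a proper optimizer):

```python
import math

def sig(a):
    return 1.0 / (1.0 + math.exp(-a))

def transform(x, centers=(-1.0, 0.0, 1.0)):
    # z = f[x]: radial basis functions (an assumed choice), with a leading 1
    # so that the first weight acts as the offset
    return [1.0] + [math.exp(-((x - c) ** 2)) for c in centers]

# Toy data: class 1 in the middle of the range, so not linearly separable in x
xs = [-2.0, -1.5, -0.3, 0.0, 0.3, 1.5, 2.0]
ws = [0, 0, 1, 1, 1, 0, 0]

phi = [0.0] * 4
for _ in range(2000):          # plain gradient ascent on the log likelihood
    for x, w in zip(xs, ws):
        z = transform(x)
        p = sig(sum(fk * zk for fk, zk in zip(phi, z)))
        phi = [fk + 0.5 * (w - p) * zk for fk, zk in zip(phi, z)]

pred_mid = sig(sum(fk * zk for fk, zk in zip(phi, transform(0.0))))
pred_out = sig(sum(fk * zk for fk, zk in zip(phi, transform(2.0))))
```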
Non-linear logistic regression
Example transformations: Heaviside step functions of projections, arc tangent functions, radial basis functions.
Fit using non-linear optimization (including the transformation parameters):
Non-linear logistic regression in 1D
Non-linear logistic regression in 2D
Dual Logistic Regression
KEY IDEA:
The parameter vector φ is just a vector in the data space.
It can be represented as a weighted sum of the data points: φ = Xψ
Now solve for ψ: one parameter per training example.
Maximum Likelihood
Likelihood: Pr(w=1|x) = sig[ψᵀXᵀx] (with a 1 prepended to each data vector, as before)
Derivatives with respect to the dual parameters ψ depend on the data only through inner products xᵢᵀxⱼ!
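A sketch of the dual model (toy data, an RBF kernel as the assumed inner product, and gradient ascent in place of a proper optimizer). Note that the fit touches the data only through the Gram matrix K:

```python
import math

def sig(a):
    return 1.0 / (1.0 + math.exp(-a))

def kernel(xi, xj):
    # RBF kernel standing in for the inner product (an assumed choice)
    return math.exp(-((xi - xj) ** 2))

# Toy 1D data: class 1 in the middle, class 0 at the edges
xs = [-2.0, -0.3, 0.0, 0.3, 2.0]
ws = [0, 1, 1, 1, 0]
K = [[kernel(a, b) for b in xs] for a in xs]   # Gram matrix: all the fit needs

psi = [0.0] * len(xs)                          # one dual parameter per example
for _ in range(2000):
    for i, w in enumerate(ws):
        a = sum(p * K[j][i] for j, p in enumerate(psi))  # activation of example i
        # Gradient ascent on the log likelihood: touches the data only through K
        psi = [p + 0.5 * (w - sig(a)) * K[j][i] for j, p in enumerate(psi)]

def pred(x):
    # Activation of a new point: weighted sum of kernels to the training points
    return sig(sum(p * kernel(xj, x) for p, xj in zip(psi, xs)))
```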
Kernel Logistic Regression: replace each inner product xᵢᵀxⱼ with a kernel function k[xᵢ, xⱼ]
ML vs. Bayesian
Bayesian case is known as Gaussian process classification
Relevance vector classification
Apply a sparse prior to the dual variables: Pr(ψ) = ∏ᵢ Stud[ψᵢ; 0, 1, ν]
As before, write as marginalization of dual variables:
Relevance vector classification
Applying this sparse prior to the dual variables gives the likelihood:
Relevance vector classification
Use Laplace approximation result:
giving:
Relevance vector classification
Previous result:
Second approximation:
To solve, alternately update hidden variables in H and mean and variance of Laplace approximation.
Relevance vector classification
Results:
Most hidden variables increase to larger values.
This means the prior over the corresponding dual variables becomes very tight around zero.
The final solution depends on only a very small number of examples, so it is efficient.
Incremental Fitting
Previously wrote: a = φ₀ + φᵀz, with all parameters fit simultaneously
Now write: a = φ₀ + Σₖ φₖ f[x, ξₖ], a sum of non-linear functions f, each with weight φₖ and its own parameters ξₖ
Incremental Fitting
KEY IDEA: Greedily add terms one at a time.
STAGE 1: Fit φ₀, φ₁, ξ₁
STAGE 2: Fit φ₀, φ₂, ξ₂
…
STAGE K: Fit φ₀, φₖ, ξₖ
At each stage, the parameters of the previously fitted terms are held fixed.
Incremental Fitting
Derivative
It is worth considering the form of the derivative in the context of the incremental fitting procedure:
∂L/∂φ = −Σᵢ (sig[aᵢ] − wᵢ) zᵢ
where wᵢ is the actual label and sig[aᵢ] is the predicted label.
Points contribute more to the derivative if they are still misclassified: the later classifiers become increasingly specialized to the difficult examples.
Boosting
Incremental fitting with step functions
Each step function is called a "weak classifier"
Can't take the derivative with respect to the step position α, so we have to use exhaustive search over a discrete set of candidates
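A sketch of boosting by exhaustive search (toy data; the candidate step positions and weights are arbitrary grids, and the offset φ₀ is omitted for brevity):

```python
import math

def sig(a):
    return 1.0 / (1.0 + math.exp(-a))

def heaviside(t):
    return 1.0 if t > 0 else 0.0

# Toy data: class 1 only in the middle of the range, so one step is not enough
xs = [-2.0, -1.0, -0.3, 0.3, 1.0, 2.0]
ws = [0, 0, 1, 1, 0, 0]

def neg_log_lik(acts):
    return -sum(w * math.log(sig(a)) + (1 - w) * math.log(1.0 - sig(a))
                for a, w in zip(acts, ws))

base = [0.0] * len(xs)      # activation accumulated from earlier weak classifiers
for _ in range(4):          # greedily add one weak classifier per round
    best = None
    for alpha in (-1.5, -0.6, 0.0, 0.6, 1.5):          # candidate step positions
        for phi in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0):  # candidate weights
            acts = [b + phi * heaviside(x - alpha) for b, x in zip(base, xs)]
            cost = neg_log_lik(acts)
            if best is None or cost < best[0]:
                best = (cost, alpha, phi)
    _, alpha, phi = best
    base = [b + phi * heaviside(x - alpha) for b, x in zip(base, xs)]
```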
Boosting
Branching Logistic Regression
New activation: a = (1 − g[x, ω]) φ₀ᵀx + g[x, ω] φ₁ᵀx
The term g[x, ω] is a gating function.
• Returns a number between 0 and 1
• If 0, then we get one logistic regression model
• If 1, then we get a different logistic regression model
A different way to make non-linear classifiers
Branching Logistic Regression
Logistic Classification Trees
Multiclass Logistic Regression
For multiclass recognition, choose a categorical distribution over w and then make the parameters of this distribution a function of x.
The softmax function maps the real activations {aₖ}, where aₖ = φₖᵀx, to numbers between zero and one that sum to one:
Pr(w=k|x) = exp[aₖ] / Σⱼ exp[aⱼ]
Parameters are the vectors {φₖ}
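A sketch of the softmax mapping (with the usual max-subtraction trick for numerical stability):

```python
import math

def softmax(activations):
    # Subtract the maximum activation for numerical stability, then normalize
    m = max(activations)
    exps = [math.exp(a - m) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

lam = softmax([2.0, 0.0, -1.0])
print(lam)   # all positive, sum to one, largest activation gets largest share
```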
Multiclass Logistic Regression
The softmax function maps activations, which can take any real value, to the parameters of a categorical distribution, which lie between 0 and 1.
Multiclass Logistic Regression
To learn the model, maximize the log likelihood.
There is no closed-form solution, so learn with non-linear optimization, where
∂L/∂φₖ = −Σᵢ (Pr(wᵢ=k|xᵢ) − δ[wᵢ − k]) xᵢ
Random classification tree
Key idea:
• Binary tree
• Randomly chosen function at each split
• Choose threshold t to maximize log probability
For given threshold, can compute parameters in closed form
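A sketch of one split (toy data; the random linear function and the exhaustive threshold search follow the key idea above, and the closed-form parameter on each side is just the empirical fraction of class-1 labels):

```python
import math
import random

def split_log_lik(left_ws, right_ws):
    # The ML Bernoulli parameter in each child node is just the empirical
    # fraction of class-1 labels; plug it back in to score the split
    def ll(ws):
        n, n1 = len(ws), sum(ws)
        if n == 0 or n1 == 0 or n1 == n:
            return 0.0               # empty or pure node: log likelihood is 0
        lam = n1 / n
        return n1 * math.log(lam) + (n - n1) * math.log(1.0 - lam)
    return ll(left_ws) + ll(right_ws)

def best_threshold(projections, ws):
    # Exhaustive search over candidate thresholds for one random function
    best = None
    for t in sorted(set(projections)):
        left = [w for p, w in zip(projections, ws) if p < t]
        right = [w for p, w in zip(projections, ws) if p >= t]
        s = split_log_lik(left, right)
        if best is None or s > best[0]:
            best = (s, t)
    return best

random.seed(0)
xs = [[-2.0, 1.0], [-1.0, 0.5], [1.0, -0.5], [2.0, -1.0]]
ws = [0, 0, 1, 1]
g = [random.uniform(-1, 1), random.uniform(-1, 1)]   # randomly chosen function
proj = [g[0] * x[0] + g[1] * x[1] for x in xs]
score, t = best_threshold(proj, ws)   # score 0 means both sides are pure
```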
Random classification tree
Related models:
Fern:
• A tree where all of the functions at a level are the same
• Thresholds may be the same or different
• Very efficient to implement
Forest:
• A collection of trees
• Average the results to get a more robust answer
• Similar to a 'Bayesian' approach – an average of models with different parameters
Non-probabilistic classifiers
Most people use non-probabilistic classification methods such as neural networks, AdaBoost, and support vector machines. This is largely for historical reasons.
Probabilistic approaches:
• No serious disadvantages
• Naturally produce estimates of uncertainty
• Easily extensible to the multi-class case
• Easily related to each other
Non-probabilistic classifiers
Multi-layer perceptron (neural network)
• Non-linear logistic regression with sigmoid functions
• Learning is known as back-propagation
• The transformed variable z is the hidden layer

Adaboost
• Very closely related to logitboost
• Performance very similar

Support vector machines
• Similar to relevance vector classification, but the objective function is convex
• No estimates of certainty
• Not easily extended to the multi-class case
• Produces solutions that are less sparse
• More restrictions on the kernel function
Gender Classification
Incremental logistic regression
300 arc tan basis functions:
Results: 87.5% (humans=95%)
Fast Face Detection (Viola and Jones 2001)
Computing Haar Features
Pedestrian Detection
Semantic segmentation
Recovering surface layout
Recovering body pose