Supervised object recognition, unsupervised object recognition then Perceptual organization Bill Freeman, MIT 6.869 April 12, 2005


Page 1: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Supervised object recognition, unsupervised object recognition

then Perceptual organization

Bill Freeman, MIT

6.869 April 12, 2005

Page 2:

Readings

• Brief overview of classifiers in context of gender recognition:
  – http://www.merl.com/reports/docs/TR2000-01.pdf, Gender Classification with Support Vector Machines. Citation: Moghaddam, B.; Yang, M-H., "Gender Classification with Support Vector Machines", IEEE International Conference on Automatic Face and Gesture Recognition (FG), pp. 306-311, March 2000

• Overview of support vector machines: Statistical Learning and Kernel Methods, Bernhard Schölkopf, ftp://ftp.research.microsoft.com/pub/tr/tr-2000-23.pdf

• M. Weber, M. Welling and P. Perona, Proc. 6th Europ. Conf. Comp. Vis., ECCV, Dublin, Ireland, June 2000, ftp://vision.caltech.edu/pub/tech-reports/ECCV00-recog.pdf

Page 3:

Gender Classification with Support Vector Machines

Baback Moghaddam

Moghaddam, B.; Yang, M-H, "Learning Gender with Support Faces", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), May 2002

Page 4:

Support vector machines (SVM’s)

• The 3 good ideas of SVM’s

Page 5:

Good idea #1: Classify rather than model probability distributions.

• Advantages:
  – Focuses the computational resources on the task at hand.
• Disadvantages:
  – Don’t know how probable the classification is
  – Lose the probabilistic model for each object class; can’t draw samples from each object class.

Page 6:

Good idea #2: Wide margin classification

• For better generalization, you want to use the weakest function you can.
  – Remember polynomial fitting.
• There are fewer ways a wide-margin hyperplane classifier can split the data than an ordinary hyperplane classifier.

Page 7:

Too weak

Bishop, neural networks for pattern recognition, 1995

Page 8:

Just right

Bishop, neural networks for pattern recognition, 1995

Page 9:

Too strong

Bishop, neural networks for pattern recognition, 1995
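
The "too weak / just right / too strong" progression can be reproduced with ordinary polynomial fitting. The sketch below is only an illustration: the sine target, the noise level, and the degrees 1, 3 and 9 are assumptions chosen for the demo, not values taken from the slides or from Bishop's figures.

```python
import numpy as np

def fit_and_score(degree, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; returns (train_mse, test_mse)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

# Assumed toy problem: 10 noisy samples of one period of a sine wave.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
x_test = np.linspace(0.05, 0.95, 50)
y_train = np.sin(2 * np.pi * x_train) + 0.2 * rng.standard_normal(x_train.size)
y_test = np.sin(2 * np.pi * x_test)

# Degree 1: too weak, degree 3: plausible, degree 9: passes through every training point.
scores = {d: fit_and_score(d, x_train, y_train, x_test, y_test) for d in (1, 3, 9)}
for degree, (train_mse, test_mse) in scores.items():
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

Training error can only go down as the degree grows, so it is the held-out error that shows which function class is "just right".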

Page 10:

Finding the wide-margin separating hyperplane: a quadratic programming problem, involving inner products of data vectors

Learning with Kernels, Scholkopf and Smola, 2002

Page 11:

Good idea #3: The kernel trick

Page 12:

Non-separable by a hyperplane in 2-d

[Figure axes: x1, x2]

Page 13:

Separable by a hyperplane in 3-d

[Figure axes: x1, x2, x2^2]

Page 14:

Embedding

Learning with Kernels, Scholkopf and Smola, 2002

Page 15:

The kernel idea

• There are many embeddings where the dot product in the high dimensional space is just the kernel function applied to the dot product in the low-dimensional space.
• For example:
  – K(x,x’) = (<x,x’> + 1)^d
• Then you “forget” about the high dimensional embedding, and just play with different kernel functions.

Page 16:

Example kernel

$$K(x, x') = (\langle x, x' \rangle + 1)^d$$

Here, for $d = 2$ and two-dimensional inputs, the high-dimensional vector is

$$(x_1, x_2) \mapsto (1, \sqrt{2}\,x_1, \sqrt{2}\,x_2, x_1^2, x_2^2, \sqrt{2}\,x_1 x_2)$$

You can see for this case how the dot product of the high-dimensional vectors is just the kernel function applied to the low-dimensional vectors. Since all we need to find the desired hyperplanes separating the high-dimensional vectors is their dot product, we can do it all with kernels applied to the low-dimensional vectors.

$$\begin{aligned}
K((x_1, x_2), (x_1', x_2')) &= (x_1 x_1' + x_2 x_2' + 1)^2 \\
&= 1 + 2 x_1 x_1' + 2 x_2 x_2' + (x_1 x_1')^2 + (x_2 x_2')^2 + 2 x_1 x_1' x_2 x_2' \\
&= \langle (1, \sqrt{2}\,x_1, \sqrt{2}\,x_2, x_1^2, x_2^2, \sqrt{2}\,x_1 x_2),\ (1, \sqrt{2}\,x_1', \sqrt{2}\,x_2', x_1'^2, x_2'^2, \sqrt{2}\,x_1' x_2') \rangle
\end{aligned}$$

The first line is the kernel function applied to the low-dimensional vectors; the last line is the dot product of the high-dimensional vectors.
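
A quick numerical check of this identity, as a minimal sketch. It assumes d = 2, two-dimensional inputs, and the six-component embedding (1, √2·x1, √2·x2, x1², x2², √2·x1·x2), which includes the cross term needed for the two sides to agree; the function names and the test vectors are made up for the example.

```python
import numpy as np

def poly_kernel(x, xp, d=2):
    """K(x, x') = (<x, x'> + 1)^d, computed in the low-dimensional space."""
    return (np.dot(x, xp) + 1.0) ** d

def phi(x):
    """Explicit degree-2 embedding of a 2-d point into 6 dimensions."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

x = np.array([0.5, -1.5])
xp = np.array([2.0, 3.0])

k_low = poly_kernel(x, xp)          # kernel on the 2-d vectors
k_high = np.dot(phi(x), phi(xp))    # dot product of the 6-d vectors
print(k_low, k_high)
```

Both numbers agree, but the kernel never forms the 6-d vectors; for larger d or higher input dimension that saving is the whole point.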

Page 17:

• See also nice tutorial slides http://www.bioconductor.org/workshops/NGFN03/svm.pdf

Page 18:

Example kernel functions

• Polynomials
• Gaussians
• Sigmoids
• Radial basis functions
• Etc…
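
For concreteness, here is a sketch of three of these kernels in their commonly used forms. The parameter names and defaults (`degree`, `c`, `sigma`, `kappa`, `theta`) are assumptions for illustration; the slides do not fix any particular parameterization. The Gaussian kernel is itself a radial basis function, since it depends only on the distance between the two points.

```python
import numpy as np

def polynomial_kernel(x, xp, degree=3, c=1.0):
    """(<x, x'> + c)^degree"""
    return (np.dot(x, xp) + c) ** degree

def gaussian_kernel(x, xp, sigma=1.0):
    """exp(-||x - x'||^2 / (2 sigma^2)); a radial basis function."""
    diff = np.asarray(x, dtype=float) - np.asarray(xp, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

def sigmoid_kernel(x, xp, kappa=1.0, theta=0.0):
    """tanh(kappa <x, x'> + theta)"""
    return np.tanh(kappa * np.dot(x, xp) + theta)

def gram_matrix(kernel, X):
    """Matrix of kernel values K[i, j] = kernel(X[i], X[j])."""
    n = len(X)
    return np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
gram = gram_matrix(gaussian_kernel, X)
print(gram)
```

The Gram matrix is all an SVM solver needs from the data, which is why swapping kernels is cheap to experiment with.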

Page 19:

The hyperplane decision function

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{m} y_i \alpha_i \, (x \cdot x_i) + b \right)$$

Eq. 32 of “Statistical Learning and Kernel Methods”, MSR-TR-2000-23
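
The decision function translates almost directly into code. This is a sketch only: the two support vectors, their multipliers and the offset below are hand-picked toy values, not the output of a trained classifier, and the `kernel` argument is an assumed hook for replacing the plain dot product.

```python
import numpy as np

def decision_function(x, support_x, support_y, alpha, b, kernel=np.dot):
    """f(x) = sgn(sum_i y_i * alpha_i * k(x, x_i) + b)."""
    total = sum(a * y * kernel(x, xi)
                for a, y, xi in zip(alpha, support_y, support_x))
    return np.sign(total + b)

# Hand-picked toy values: one support vector per class, symmetric about the origin.
support_x = np.array([[1.0, 0.0], [-1.0, 0.0]])
support_y = np.array([1.0, -1.0])
alpha = np.array([0.5, 0.5])
b = 0.0

label_right = decision_function(np.array([2.0, 3.0]), support_x, support_y, alpha, b)
label_left = decision_function(np.array([-0.5, 1.0]), support_x, support_y, alpha, b)
print(label_right, label_left)
```

With these values the implied hyperplane is x1 = 0, so points with positive x1 are labeled +1 and points with negative x1 are labeled -1.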

Page 20:

Learning with Kernels, Scholkopf and Smola, 2002

Page 21:

Discriminative approaches: e.g., Support Vector Machines

Page 22:

Gender Classification with Support Vector Machines

Baback Moghaddam

Page 23:

Gender Prototypes

Images courtesy of University of St. Andrews Perception Laboratory

Page 24:

Gender Prototypes

Images courtesy of University of St. Andrews Perception Laboratory

Page 25:

Classifier Evaluation

• Compare “standard” classifiers

• 1755 FERET faces
  – 80-by-40 full-resolution
  – 21-by-12 “thumbnails”

• 5-fold Cross-Validation testing

• Compare with human subjects
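
The 5-fold cross-validation protocol can be sketched as follows. Everything here is a stand-in: the data are synthetic 2-d points rather than FERET faces, and a 1-nearest-neighbor classifier is used only because it needs no training code. The helper names are hypothetical.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle the sample indices and split them into n_folds disjoint test sets."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_folds)

def nearest_neighbor_predict(train_x, train_y, test_x):
    """Label each test point with the label of its closest training point."""
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    return train_y[d2.argmin(axis=1)]

def cross_validated_error(x, y, n_folds=5, seed=0):
    """Average test error over the folds; each fold is held out exactly once."""
    folds = k_fold_indices(len(x), n_folds, seed)
    errors = []
    for k, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        pred = nearest_neighbor_predict(x[train_idx], y[train_idx], x[test_idx])
        errors.append(np.mean(pred != y[test_idx]))
    return float(np.mean(errors))

# Synthetic stand-in data: two well-separated classes of 50 points each.
rng = np.random.default_rng(0)
data_x = np.vstack([rng.normal(-5.0, 1.0, (50, 2)), rng.normal(5.0, 1.0, (50, 2))])
data_y = np.array([0] * 50 + [1] * 50)
cv_error = cross_validated_error(data_x, data_y)
print(f"5-fold CV error: {cv_error:.3f}")
```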

Page 26:

Face Processor

[Moghaddam & Pentland, PAMI-19:7]

Page 27:

Gender (Binary) Classifier

Page 28:

Binary Classifiers

[Figure panels: NN, Linear, Fisher, Quadratic, RBF, SVM]

Page 29:

Linear SVM Classifier

• Data: {xi, yi}, i = 1, 2, 3, …, N, with yi ∈ {-1, +1}
• Discriminant: f(x) = (w . x + b) > 0
• minimize || w ||
• subject to yi (w . xi + b) ≥ 1 for all i
• Solution: QP gives {αi}
• wopt = Σ αi yi xi
• f(x) = Σ αi yi (xi . x) + b

Note we just need the vector dot products, so this is easy to “kernelize”.
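
A small runnable sketch of this pipeline. Note the substitutions: instead of a general QP solver it uses dual coordinate ascent on the soft-margin dual (a different, simpler technique), and it folds the offset b into w by appending a constant 1 to each input, which slightly changes the objective. The penalty `c`, the epoch count and the toy data are assumptions.

```python
import numpy as np

def train_linear_svm(x, y, c=10.0, n_epochs=200):
    """Dual coordinate ascent for a soft-margin linear SVM (hinge loss).

    Returns (w, b, alpha) with w = sum_i alpha_i * y_i * x_i.
    """
    xa = np.hstack([x, np.ones((len(x), 1))])   # constant feature carries the offset b
    alpha = np.zeros(len(xa))
    w = np.zeros(xa.shape[1])
    q_diag = (xa ** 2).sum(axis=1)              # x_i . x_i, at least 1 by construction
    for _ in range(n_epochs):
        for i in range(len(xa)):
            grad = y[i] * np.dot(w, xa[i]) - 1.0
            new_alpha = min(max(alpha[i] - grad / q_diag[i], 0.0), c)
            w += (new_alpha - alpha[i]) * y[i] * xa[i]   # keep w in sync with alpha
            alpha[i] = new_alpha
    return w[:-1], w[-1], alpha

# Toy separable data, three points per class.
train_x = np.array([[2.0, 2.0], [3.0, 3.0], [2.0, 3.0],
                    [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]])
train_y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

w_opt, b_opt, alphas = train_linear_svm(train_x, train_y)
print("w =", w_opt, "b =", b_opt)
print("support vectors:", train_x[alphas > 1e-8])
```

Only the points nearest the boundary end up with nonzero αi; the training loop touches the data only through dot products, which is what makes the kernelized version possible.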

Page 30:

“Support Faces”

Page 31:

Classifier Performance

Page 32:

Classifier Error Rates

[Bar chart of error rates, axis 0 to 60; bars: SVM - Gaussian, SVM - Cubic, Large ERBF, RBF, Quadratic, Fisher, 1-NN, Linear]

Page 33:

Gender Perception Study

• Mixture: 22 males, 8 females
• Age: mid-20s to mid-40s
• Stimuli: 254 faces (randomized)
  – low-resolution 21-by-12
  – high-resolution 84-by-48
• Task: classify gender (M or F)
  – forced-choice
  – no time constraints

Page 34:

How would you classify these 5 faces?

True classification: F, M, M, F, M

Page 35:

Human Performance

            High-Res    Low-Res
Stimuli     84 x 48     21 x 12
N           4032        252
Results     6.54%       30.7%

σ = 3.7%

But note how the pixellated enlargement hinders recognition. Shown below with pixellation removed.

Page 36:

Machine vs. Humans

[Bar chart of % Error, axis 0 to 35; groups: SVM, Humans; series: Low-Res, High-Res]

Page 37:

End of SVM section

Page 38:

6.869

Previously: Object recognition via labeled training sets.
Now: Unsupervised Category Learning
Followed by: Perceptual organization:
  – Gestalt Principles
  – Segmentation by Clustering
    • K-Means
    • Graph cuts
  – Segmentation by Fitting
    • Hough transform
    • Fitting

Readings: F&P Ch. 14, 15.1-15.2

Page 39:

Unsupervised Learning

• Object recognition methods in last two lectures presume:
  – Segmentation
  – Labeling
  – Alignment
• What can we do with unsupervised (weakly supervised) data?
• See work by Perona and collaborators
  – (the third of the 3 bits needed to characterize all computer vision conference submissions, after SIFT and Viola/Jones style boosting).

Page 40:

References

• Unsupervised Learning of Models for Recognition
  M. Weber, M. Welling and P. Perona
  Proc. 6th Europ. Conf. Comp. Vis., ECCV, Dublin, Ireland, June 2000
• Towards Automatic Discovery of Object Categories
  M. Weber, M. Welling and P. Perona
  Proc. IEEE Comp. Soc. Conf. Comp. Vis. and Pat. Rec., CVPR, June 2000

Page 41:

[Figure panels: Yes, contains object; No, does not contain object]

Page 42:

What are the features that let us recognize that this is a face?

Page 43:
Page 44:
Page 45:

[Figure labels: A, B, C, D]

Page 46:

[Figure labels: A, B, C]

Page 47:

Feature detectors

• Keypoint detectors [Foerstner87]

• Jets / texture classifiers [Malik-Perona88, Malsburg91,…]

• Matched filtering / correlation [Burt85, …]

• PCA + Gaussian classifiers [Kirby90, Turk-Pentland92….]

• Support vector machines [Girosi-Poggio97, Pontil-Verri98]

• Neural networks [Sung-Poggio95, Rowley-Baluja-Kanade96]

• ……whatever works best (see handwriting experiments)

Page 48:

Representation

From: Rob Fergus, http://www.robots.ox.ac.uk/%7Efergus/

Use a scale-invariant, scale-sensing feature keypoint detector (like the first steps of Lowe’s SIFT).

[Slide from Bradski & Thrun, Stanford]

Page 49:

Data

Slide from Li Fei-Fei, http://www.vision.caltech.edu/feifeili/Resume.htm

[Slide from Bradski & Thrun, Stanford]

Page 50:

Features for Category Learning

From: Rob Fergus, http://www.robots.ox.ac.uk/%7Efergus/

A direct appearance model is taken around each located key. This is then normalized by its detected scale to an 11x11 window. PCA further reduces these features.

[Slide from Bradski & Thrun, Stanford]
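
The PCA step can be sketched with a plain SVD. This is an illustration under assumptions: the patches here are random numbers standing in for scale-normalized 11x11 windows, the choice of 15 components is arbitrary, and the function names are hypothetical.

```python
import numpy as np

def fit_pca(patches, n_components):
    """patches: (n, 121) array of flattened 11x11 windows.

    Returns the mean patch and the top principal directions (as rows).
    """
    mean = patches.mean(axis=0)
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(patches, mean, components):
    """Coordinates of each patch in the reduced PCA basis."""
    return (patches - mean) @ components.T

# Random stand-in for 200 normalized 11x11 patches.
rng = np.random.default_rng(0)
patches = rng.standard_normal((200, 11 * 11))

mean_patch, components = fit_pca(patches, n_components=15)
features = project(patches, mean_patch, components)
print(features.shape)
```

Each keypoint is then described by 15 numbers instead of 121, which keeps the later density estimation manageable.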

Page 51:

Unsupervised detector training - 2

“Pattern Space” (100+ dimensions)

Page 52:
Page 53:

[Figure: parts labeled A, B, C, D, E]

$E = (x_E, y_E, \sigma_E, \theta_E, R_E)$

$D = (x_D, y_D, \sigma_D, \theta_D, R_D)$

Hypothesis: H = (A, B, C, D, E)
Probability density: P(A, B, C, D, E)

Page 54:

Learning

• Fit with E-M (this example is a 3 part model)
• We start with the dual problem of what to fit and where to fit it.

From: Rob Fergus, http://www.robots.ox.ac.uk/%7Efergus/

Assume that an object instance is the only consistent thing somewhere in a scene.

We don’t know where to start, so we use the initial random parameters.

1. (E) We find the best (consistent across images) assignment given the params.
2. (M) We refit the feature detector params. and repeat until converged.
   • Note that there isn’t much consistency
3. This repeats until it converges at the most consistent assignment with maximized parameters across images.

[Slide from Bradski & Thrun, Stanford]

Page 55:

ML using EM

1. Current estimate (pdf)
2. Assign probabilities to constellations

[Figure: candidate constellations in Image 1, Image 2, ..., Image i, each marked Large P or Small P]

3. Use probabilities as weights to reestimate parameters. Example: µ

(Large P) × x + (Small P) × x + … = new estimate of µ
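
The "probabilities as weights" update is easiest to see on a much simpler model than the constellation model. The sketch below is a stand-in: a 1-D mixture of two Gaussians with known, equal variance and equal mixing weights, where only the two means are re-estimated. The data and starting values are made up.

```python
import numpy as np

def em_two_means(x, mu_init, sigma=1.0, n_iters=50):
    """EM for the means of a two-component 1-D Gaussian mixture."""
    mu = np.array(mu_init, dtype=float)
    for _ in range(n_iters):
        # E-step: probability that each point belongs to each component,
        # under the current estimate of the means.
        log_p = -0.5 * ((x[:, None] - mu[None, :]) / sigma) ** 2
        log_p -= log_p.max(axis=1, keepdims=True)     # for numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: new mean = probability-weighted average of the data.
        mu = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
    return mu

rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(-4.0, 1.0, 200), rng.normal(4.0, 1.0, 200)])
mu_hat = em_two_means(samples, mu_init=[-1.0, 1.0])
print(mu_hat)
```

Points that are probable under a component pull its mean strongly; improbable points contribute almost nothing, exactly as in the Large P / Small P picture.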

Page 56:

Learned Model

From: Rob Fergus, http://www.robots.ox.ac.uk/%7Efergus/

The shape model. The mean location is indicated by the cross, with the ellipse showing the uncertainty in location. The number by each part is the probability of that part being present.

Page 57:

From: Rob Fergus, http://www.robots.ox.ac.uk/%7Efergus/

[Slide from Bradski & Thrun, Stanford]

Page 58:

Block diagram

Page 59:
Page 60:
Page 61:

Recognition

From: Rob Fergus, http://www.robots.ox.ac.uk/%7Efergus/

Page 62:

Result: Unsupervised Learning

Slide from Li Fei-Fei, http://www.vision.caltech.edu/feifeili/Resume.htm

[Slide from Bradski & Thrun, Stanford]

Page 63:

From: Rob Fergus, http://www.robots.ox.ac.uk/%7Efergus/

Page 64:

6.869

Previously: Object recognition via labeled training sets.
Previously: Unsupervised Category Learning
Now: Perceptual organization:
  – Gestalt Principles
  – Segmentation by Clustering
    • K-Means
    • Graph cuts
  – Segmentation by Fitting
    • Hough transform
    • Fitting

Readings: F&P Ch. 14, 15.1-15.2

Page 65:

Segmentation and Line Fitting

• Gestalt grouping
• K-Means
• Graph cuts
• Hough transform
• Iterative fitting

Page 66:

Segmentation and Grouping

• Motivation: vision is often simple inference, but for segmentation
• Obtain a compact representation from an image/motion sequence/set of tokens
• Should support application
• Broad theory is absent at present
• Grouping (or clustering)
  – collect together tokens that “belong together”
• Fitting
  – associate a model with tokens
  – issues
    • which model?
    • which token goes to which element?
    • how many elements in the model?

Page 67:

General ideas

• Tokens
  – whatever we need to group (pixels, points, surface elements, etc., etc.)
• Top down segmentation
  – tokens belong together because they lie on the same object
• Bottom up segmentation
  – tokens belong together because they are locally coherent
• These two are not mutually exclusive

Page 68:

Why do these tokens belong together?

Page 69:

What is the figure?

Page 70:
Page 71:

Basic ideas of grouping in humans

• Figure-ground discrimination
  – grouping can be seen in terms of allocating some elements to a figure, some to ground
  – impoverished theory
• Gestalt properties
  – A series of factors affect whether elements should be grouped together

Page 72:
Page 73:
Page 74:
Page 75:
Page 76:
Page 77:

Occlusion is an important cue in grouping.

Page 78:

Consequence: Groupings by Invisible Completions

* Images from Steve Lehar’s Gestalt papers: http://cns-alumni.bu.edu/pub/slehar/Lehar.html

Page 79:

And the famous…

Page 80:

And the famous invisible dog eating under a tree:

Page 81:

• We want to let machines have these perceptual organization abilities, to support object recognition and both supervised and unsupervised learning about the visual world.

Page 82:

Segmentation as clustering

• Cluster together (pixels, tokens, etc.) that belong together…
• Agglomerative clustering
  – attach closest to cluster it is closest to
  – repeat
• Divisive clustering
  – split cluster along best boundary
  – repeat
• Dendrograms
  – yield a picture of output as clustering process continues
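
A naive sketch of the agglomerative variant. The slide does not say how the distance between two clusters is measured; single-link distance (closest pair of members) is an assumption here, as are the stopping rule (a target number of clusters) and the toy points.

```python
import numpy as np

def agglomerative(points, n_clusters):
    """Repeatedly merge the two closest clusters (single-link distance).

    Returns a list of clusters, each a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]     # start: every point alone
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i, j] for i in clusters[a] for j in clusters[b])
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)               # b > a, so index a stays valid
    return clusters

toy_points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                       [10.0, 10.0], [10.0, 11.0]])
toy_clusters = agglomerative(toy_points, n_clusters=2)
print(toy_clusters)
```

Recording the merge order and merge distances instead of stopping at a fixed count is what produces a dendrogram.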

Page 83:

Clustering Algorithms

Page 84:
Page 85:

K-Means

• Choose a fixed number of clusters
• Choose cluster centers and point-cluster allocations to minimize error

$$\sum_{i \in \text{clusters}} \left\{ \sum_{j \in \text{elements of } i\text{'th cluster}} \left\| x_j - \mu_i \right\|^2 \right\}$$

• can’t do this by search, because there are too many possible allocations.
• Algorithm
  – fix cluster centers; allocate points to closest cluster
  – fix allocation; compute best cluster centers
• x could be any set of features for which we can compute a distance (careful about scaling)
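
The two alternating steps can be written in a few lines. This is a minimal sketch: the caller supplies the initial centers (the result depends on them, since the iteration only finds a local minimum of the error), and the 2-d toy data are made up for the example.

```python
import numpy as np

def k_means(x, init_centers, n_iters=100):
    """Alternate allocation and center updates; returns (centers, labels, error)."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iters):
        # Fix cluster centers; allocate points to the closest cluster.
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Fix allocation; the best center of each cluster is the mean of its points.
        new_centers = centers.copy()
        for i in range(len(centers)):
            members = x[labels == i]
            if len(members) > 0:                 # leave an empty cluster where it is
                new_centers[i] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    error = d2[np.arange(len(x)), labels].sum()  # the objective being minimized
    return centers, labels, error

rng = np.random.default_rng(1)
points = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
centers, labels, error = k_means(points, init_centers=points[[0, 50]])
print(centers, error)
```

For image segmentation, each row of `x` would be a per-pixel feature vector such as intensity, color, or color plus position, with the features scaled so that no single one dominates the distance.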

Page 86:

K-Means

Page 87:

[Figure panels: Image; Clusters on intensity (K=5); Clusters on color (K=5)]

K-means clustering using intensity alone and color alone

Page 88:

[Figure panels: Image; Clusters on color]

K-means using color alone, 11 segments

Page 89: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

K-means using color alone, 11 segments.

Color alone often will not yield salient segments!

Page 90: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

K-means using color and position, 20 segments

Still misses goal of perceptually pleasing segmentation!

Hard to pick K…
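One way to cluster on color and position together is to give each pixel a feature vector that concatenates both. The sketch below is an assumption about how such features could be built; the function name `pixel_features` and the `position_weight` parameter are hypothetical. The weight sets the relative scale of position against color, which is the scaling issue mentioned on the K-Means slide.

```python
def pixel_features(image, position_weight=1.0):
    """Return one feature vector (r, g, b, w*x, w*y) per pixel.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    `position_weight` (w) is a hypothetical knob: 0 gives color-only
    features, larger values make spatial distance matter more.
    """
    features = []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            features.append((r, g, b, position_weight * x, position_weight * y))
    return features

# A tiny 2x2 example image: reddish top row, bluish bottom row.
image = [
    [(255, 0, 0), (250, 5, 0)],
    [(0, 0, 255), (0, 10, 250)],
]
color_only = pixel_features(image, position_weight=0.0)
color_and_position = pixel_features(image, position_weight=50.0)
```

The resulting vectors can be passed to any K-means routine; the choice of weight changes which segments come out.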

Page 91: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Graph-Theoretic Image Segmentation

Build a weighted graph G=(V,E) from image

V: image pixels

E: connections between pairs of nearby pixels

$W_{ij}$: probability that i & j belong to the same region

Page 92: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

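Graph Representations

[Figure: graph with nodes a, b, c, d, e]

Adjacency Matrix

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix}$$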

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Page 93: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Weighted Graphs and Their Representations

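[Figure: weighted graph with nodes a, b, c, d, e]

Weight Matrix: a 5×5 matrix over the nodes whose entries are the edge weights, with 0 on the diagonal and ∞ for pairs of nodes that are not connected.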

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
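As an illustration of the two representations, the sketch below builds a weight matrix and the matching adjacency matrix from an edge list. The edge list and its weights are made up for the example and are not the graph on the slides; the convention of 0 on the diagonal and ∞ for missing edges is the one the weight-matrix slide appears to use.

```python
import math

def weight_matrix(nodes, weighted_edges):
    """Symmetric weight matrix: 0 on the diagonal, inf where there is no edge."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    W = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for u, v, w in weighted_edges:
        W[index[u]][index[v]] = w
        W[index[v]][index[u]] = w  # undirected graph
    return W

def adjacency_matrix(W):
    """1 where an edge exists (finite off-diagonal weight), 0 elsewhere."""
    return [[1 if i != j and w != math.inf else 0 for j, w in enumerate(row)]
            for i, row in enumerate(W)]

nodes = ["a", "b", "c", "d", "e"]
# Hypothetical weighted edges, chosen only for illustration.
weighted_edges = [("a", "b", 2), ("b", "c", 5), ("c", "d", 1), ("d", "e", 4)]
W = weight_matrix(nodes, weighted_edges)
A = adjacency_matrix(W)
```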

Page 94: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Boundaries of image regions defined by a number of attributes

– Brightness/color
– Texture
– Motion
– Stereoscopic depth
– Familiar configuration

[Malik]

Page 95: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

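Measuring Affinity

Intensity

$$\mathrm{aff}(x, y) = \exp\left\{ -\left(\frac{1}{2\sigma_i^2}\right) \left( \left\| I(x) - I(y) \right\|^2 \right) \right\}$$

Distance

$$\mathrm{aff}(x, y) = \exp\left\{ -\left(\frac{1}{2\sigma_d^2}\right) \left( \left\| x - y \right\|^2 \right) \right\}$$

Color

$$\mathrm{aff}(x, y) = \exp\left\{ -\left(\frac{1}{2\sigma_t^2}\right) \left( \left\| c(x) - c(y) \right\|^2 \right) \right\}$$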
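All three affinities share the same Gaussian form and differ only in the feature being compared. A minimal sketch, assuming features are given as tuples of numbers (the function names are illustrative):

```python
import math

def gaussian_affinity(f_x, f_y, sigma):
    """exp(-||f_x - f_y||^2 / (2 sigma^2)) for two feature vectors."""
    squared = sum((a - b) ** 2 for a, b in zip(f_x, f_y))
    return math.exp(-squared / (2.0 * sigma ** 2))

def affinity_matrix(features, sigma):
    """Pairwise affinities; features can be intensities, positions or colors."""
    return [[gaussian_affinity(p, q, sigma) for q in features] for p in features]

# Example with positions as the feature (the "Distance" affinity).
positions = [(0.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
A_narrow = affinity_matrix(positions, sigma=0.5)
A_wide = affinity_matrix(positions, sigma=5.0)
```

With a small σ only very close points have noticeable affinity; with a large σ even distant points look similar, so σ has to be matched to the scale of the data.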

Page 96: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Eigenvectors and affinity clusters

• Simplest idea: we want a vector a giving the association between each element and a cluster
• We want elements within this cluster to, on the whole, have strong affinity with one another
• We could maximize $a^T A a$
• But need the constraint $a^T a = 1$
• This is an eigenvalue problem - choose the eigenvector of A with largest eigenvalue
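A small numerical check of this idea, assuming NumPy is available. The 1-D points and σ below are made up for the example: four points form a tight cluster and two more sit far away, so the affinity matrix is nearly block diagonal.

```python
import numpy as np

# Hypothetical 1-D data: a dominant cluster of four points and a smaller one.
points = np.array([0.0, 0.1, 0.2, 0.3, 5.0, 5.1])
sigma = 0.5

# Gaussian affinity on distance between every pair of points.
differences = points[:, None] - points[None, :]
A = np.exp(-differences ** 2 / (2.0 * sigma ** 2))

# eigh returns eigenvalues in ascending order for a symmetric matrix,
# so the last column is the eigenvector with the largest eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(A)
a = eigenvectors[:, -1]
if a.sum() < 0:  # the sign of an eigenvector is arbitrary; make it positive
    a = -a
```

The entries of `a` are large for the four points of the dominant cluster and close to zero for the other two, which is the behaviour the slide describes.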

Page 97: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Example eigenvector

[Figure panels: points; matrix; eigenvector]

Page 98: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Example eigenvector

[Figure panels: points; matrix; eigenvector]

Page 99: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Scale affects affinity

[Figure panels: σ = 0.2; then σ = 0.1, σ = 0.2, σ = 1]

Page 100: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Scale affects affinity

[Figure panels: σ = 0.1, σ = 0.2, σ = 1]

Page 101: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Some Terminology for Graph Partitioning

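• How do we bipartition a graph:

$$\mathrm{cut}(A, B) = \sum_{u \in A,\, v \in B} W(u, v), \quad \text{with } A \cap B = \emptyset$$

$$\mathrm{assoc}(A, A') = \sum_{u \in A,\, v \in A'} W(u, v), \quad A \text{ and } A' \text{ not necessarily disjoint}$$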

[Malik]
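Both quantities are plain sums over entries of the weight matrix. A minimal sketch on a made-up four-node graph (the weights are illustrative only):

```python
def cut(W, A, B):
    """Sum of W(u, v) over u in A, v in B; A and B must be disjoint."""
    assert not set(A) & set(B), "cut is defined for disjoint sets"
    return sum(W[u][v] for u in A for v in B)

def assoc(W, A, A_prime):
    """Sum of W(u, v) over u in A, v in A'; the sets may overlap."""
    return sum(W[u][v] for u in A for v in A_prime)

# Hypothetical symmetric weights: nodes 0-1 and 2-3 are strongly linked.
W = [
    [0, 3, 1, 0],
    [3, 0, 0, 1],
    [1, 0, 0, 2],
    [0, 1, 2, 0],
]
V = [0, 1, 2, 3]
A, B = [0, 1], [2, 3]

cut_AB = cut(W, A, B)        # edges 0-2 and 1-3 straddle the partition
assoc_AV = assoc(W, A, V)    # all edge weight with one end in A
assoc_AA = assoc(W, A, A)    # internal edges, each counted in both directions
```

For a bipartition of V, assoc(A,V) = assoc(A,A) + cut(A,B), which the example values satisfy.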

Page 102: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Minimum Cut

A cut of a graph G is a set of edges S such that removal of S from G disconnects G.

Minimum cut is the cut of minimum weight, where weight of cut <A,B> is given as

$$w(A, B) = \sum_{x \in A,\, y \in B} w(x, y)$$

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Page 103: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Minimum Cut and Clustering

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Page 104: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Drawbacks of Minimum Cut

• The weight of a cut tends to grow with the number of edges in the cut, so minimum cut favors cutting off small, isolated sets of nodes.

[Figure labels: Cuts with lesser weight than the ideal cut; Ideal Cut]

* Slide from Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Page 105: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Normalized cuts

• First eigenvector of affinity matrix captures within-cluster similarity, but not across-cluster difference
• Min-cut can find degenerate clusters
• Instead, we’d like to maximize the within-cluster similarity compared to the across-cluster difference
• Write graph as V, one cluster as A and the other as B
• Minimize

$$\frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}$$

where cut(A,B) is the sum of the weights of the edges that straddle A and B; assoc(A,V) is the sum of the weights of all edges with one end in A.

Since assoc(A,V) = assoc(A,A) + cut(A,B), minimizing this is the same as maximizing assoc(A,A)/assoc(A,V) + assoc(B,B)/assoc(B,V). I.e. construct A, B such that their within-cluster similarity is high compared to their association with the rest of the graph
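The effect of the normalization can be seen on a small example. The sketch below uses a made-up seven-node graph: two tightly connected triangles joined by two weak edges, plus one node hanging off the second triangle. The helper names and all weights are assumptions for illustration.

```python
def build_weights(n, edges):
    """Symmetric n x n weight matrix from (u, v, weight) triples."""
    W = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        W[u][v] = W[v][u] = w
    return W

def cut(W, A, B):
    return sum(W[u][v] for u in A for v in B)

def assoc(W, A, V):
    return sum(W[u][v] for u in A for v in V)

def ncut(W, A, B):
    """cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V) with V = A union B."""
    V = list(A) + list(B)
    c = cut(W, A, B)
    return c / assoc(W, A, V) + c / assoc(W, B, V)

# Hypothetical graph: triangles {0,1,2} and {3,4,5}, weak links 2-3 and 1-4,
# and node 6 attached only to node 5.
edges = [(0, 1, 5.0), (0, 2, 5.0), (1, 2, 5.0),
         (3, 4, 5.0), (3, 5, 5.0), (4, 5, 5.0),
         (2, 3, 1.0), (1, 4, 1.0), (5, 6, 1.5)]
W = build_weights(7, edges)

balanced = ([0, 1, 2], [3, 4, 5, 6])     # separates the two triangles
isolated = ([6], [0, 1, 2, 3, 4, 5])     # chops off the single node

cut_balanced, cut_isolated = cut(W, *balanced), cut(W, *isolated)
ncut_balanced, ncut_isolated = ncut(W, *balanced), ncut(W, *isolated)
```

Here the plain cut weight prefers chopping off node 6 (1.5 against 2.0), while the normalized value prefers the balanced split, because the isolated node has almost no association with the rest of the graph.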

Page 106: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Solving the Normalized Cut problem

• Exact discrete solution to Ncut is NP-complete even on regular grid
  – [Papadimitriou’97]

• Drawing on spectral graph theory, good approximation can be obtained by solving a generalized eigenvalue problem.

[Malik]

Page 107: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

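Normalized Cut As Generalized Eigenvalue problem

Let $x_i = 1$ if node i is in A and $x_i = -1$ if it is in B, and let D be the diagonal degree matrix with $D(i,i) = \sum_j W(i,j)$.

$$
\begin{aligned}
\mathrm{Ncut}(A, B) &= \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)} \\
&= \frac{1}{4} \left[ \frac{(1 + x)^T (D - W)(1 + x)}{k\, 1^T D 1} + \frac{(1 - x)^T (D - W)(1 - x)}{(1 - k)\, 1^T D 1} \right]; \quad k = \frac{\sum_{x_i > 0} D(i,i)}{\sum_i D(i,i)} \\
&= \dots
\end{aligned}
$$

• after simplification, we get

$$\mathrm{Ncut}(A, B) = \frac{y^T (D - W) y}{y^T D y}, \quad \text{with } y_i \in \{1, -b\},\ y^T D 1 = 0,$$

where $b = k / (1 - k)$.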

[Malik]

Page 108: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

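Normalized cuts

• Instead, relax y to take real values and solve

$$\min_y \; y^T (D - W) y \quad \text{subject to} \quad y^T D y = 1$$

• which gives the generalized eigenvalue problem

$$(D - W) y = \lambda D y$$

The eigenvector with the smallest eigenvalue is the trivial constant vector (λ = 0), so use the eigenvector with the second-smallest eigenvalue.

• Now look for a quantization threshold that gives the best (lowest) value of the Ncut criterion, i.e. all components of y above that threshold go to 1, all below go to -b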
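A compact sketch of this procedure, assuming NumPy and a graph in which every node has positive degree. It is an illustration under those assumptions, not the Shi/Malik implementation: the generalized problem is solved through the equivalent symmetric problem D^{-1/2}(D − W)D^{-1/2} z = λz with y = D^{-1/2} z, and every component of y is tried as a threshold. The example graph is made up.

```python
import numpy as np

def ncut_value(W, mask):
    """Ncut(A,B) for the partition A = mask, B = ~mask."""
    cut = W[mask][:, ~mask].sum()
    return cut / W[mask].sum() + cut / W[~mask].sum()

def normalized_cut_bipartition(W):
    d = W.sum(axis=1)                       # D(i,i) = sum_j W(i,j)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form of (D - W) y = lambda D y.
    L_sym = d_inv_sqrt[:, None] * (np.diag(d) - W) * d_inv_sqrt[None, :]
    eigenvalues, eigenvectors = np.linalg.eigh(L_sym)   # ascending order
    y = d_inv_sqrt * eigenvectors[:, 1]     # second-smallest eigenvalue
    # Try each component of y as the quantization threshold.
    best_mask, best_value = None, np.inf
    for threshold in np.sort(y)[:-1]:
        mask = y > threshold
        if not mask.any():
            continue
        value = ncut_value(W, mask)
        if value < best_value:
            best_mask, best_value = mask, value
    return y, best_mask, best_value

# Hypothetical graph: triangles {0,1,2} and {3,4,5}, weak links 2-3 and 1-4,
# and node 6 attached only to node 5.
edges = [(0, 1, 5.0), (0, 2, 5.0), (1, 2, 5.0),
         (3, 4, 5.0), (3, 5, 5.0), (4, 5, 5.0),
         (2, 3, 1.0), (1, 4, 1.0), (5, 6, 1.5)]
W = np.zeros((7, 7))
for u, v, w in edges:
    W[u, v] = W[v, u] = w

y, mask, value = normalized_cut_bipartition(W)
group = set(np.flatnonzero(mask).tolist())
```

On this graph the selected partition separates the two triangles rather than cutting off the weakly attached node 6.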

Page 109: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Brightness Image Segmentation

Page 110: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Brightness Image Segmentation

Page 111: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then
Page 112: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Results on color segmentation

Page 113: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Motion Segmentation with Normalized Cuts

• Networks of spatial-temporal connections:

• Motion “proto-volume” in space-time

Page 114: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then
Page 115: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Comparison of Methods

Authors | Matrix used | Procedure / Eigenvectors used
Perona/Freeman | Affinity A | 1st x: $Ax = \lambda x$. Recursive procedure
Shi/Malik | D-A with D a degree matrix, $D(i,i) = \sum_j A(i,j)$ | 2nd smallest generalized eigenvector: $(D - A)x = \lambda D x$. Also recursive
Scott/Longuet-Higgins | Affinity A, user inputs k | Finds k eigenvectors of A, forms V. Normalizes rows of V. Forms Q = VV’. Segments by Q. Q(i,j)=1 -> same cluster
Ng, Jordan, Weiss | Affinity A, user inputs k | Normalizes A. Finds k eigenvectors, forms X. Normalizes X, clusters rows

Nugent, Stanberry UW STAT 593E
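The Ng, Jordan, Weiss row can be sketched as follows, assuming NumPy. Several details are assumptions rather than something the table specifies: the normalization is taken to be D^{-1/2} A D^{-1/2}, the k eigenvectors are those with the largest eigenvalues, and the rows are clustered with a small K-means whose centers start from rows that are least aligned with each other. The data are made up.

```python
import numpy as np

def njw_embedding(A, k):
    """Normalize A, take the top-k eigenvectors as X, normalize the rows of X."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]   # D^-1/2 A D^-1/2
    eigenvalues, eigenvectors = np.linalg.eigh(L)       # ascending order
    X = eigenvectors[:, -k:]                            # k largest eigenvalues
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def cluster_rows(Y, k, iterations=20):
    """Small K-means on the rows of Y with a deterministic, assumed init."""
    centers = [Y[0]]
    while len(centers) < k:
        # Pick the row least aligned with the centers chosen so far.
        similarity = np.max(np.abs(Y @ np.array(centers).T), axis=1)
        centers.append(Y[np.argmin(similarity)])
    centers = np.array(centers)
    for _ in range(iterations):
        distances = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(distances, axis=1)
        for i in range(k):
            if np.any(labels == i):
                centers[i] = Y[labels == i].mean(axis=0)
    return labels

# Hypothetical 1-D data with two well-separated groups.
points = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
A = np.exp(-(points[:, None] - points[None, :]) ** 2 / (2.0 * 0.5 ** 2))
labels = cluster_rows(njw_embedding(A, k=2), k=2)
```

After row normalization, points from the same group map to nearly the same unit vector, which is why a simple clustering of the rows is enough.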

Page 116: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Advantages/Disadvantages

• Perona/Freeman
  – For block diagonal affinity matrices, the first eigenvector finds points in the “dominant” cluster; not very consistent
• Shi/Malik
  – 2nd generalized eigenvector minimizes affinity between groups relative to affinity within each group; no guarantee of the optimal discrete partition, because the constraints on y are relaxed

Nugent, Stanberry UW STAT 593E

Page 117: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Advantages/Disadvantages

• Scott/Longuet-Higgins
  – Depends largely on choice of k
  – Good results
• Ng, Jordan, Weiss
  – Again depends on choice of k
  – Claim: effectively handles clusters whose overlap or connectedness varies across clusters

Nugent, Stanberry UW STAT 593E

Page 118: Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

[Figure: three examples, each shown as four panels: Affinity Matrix; Perona/Freeman (1st eigenv.); Shi/Malik (2nd gen. eigenv.); Scott/Lon.Higg (Q matrix)]

Nugent, Stanberry UW STAT 593E