cvpr2007 object category recognition p3 - discriminative models

82
Part 3: classifier based methods Antonio Torralba

Upload: zukun

Post on 10-May-2015

952 views

Category:

Technology


2 download

TRANSCRIPT

Page 1: Cvpr2007 object category recognition   p3 - discriminative models

Part 3: classifier based methodsAntonio Torralba

Page 2: Cvpr2007 object category recognition   p3 - discriminative models

Overview of section• A short story of discriminative methods

• Object detection with classifiers

• Boosting– Gentle boosting– Weak detectors– Object model– Object detection

• Multiclass object detection

• Context based object recognition

Page 3: Cvpr2007 object category recognition   p3 - discriminative models

Classifier based methodsObject detection and recognition is formulated as a classification problem.

Bag of image patches

… and a decision is taken at each window about if it contains a target object or not.

Decision boundary

Computer screen

Background

In some feature space

Where are the screens?

The image is partitioned into a set of overlapping windows

Page 4: Cvpr2007 object category recognition   p3 - discriminative models

(The lousy painter)

Discriminative vs. generative

0 10 20 30 40 50 60 700

0.05

0.1

x = data

• Generative model

0 10 20 30 40 50 60 700

0.5

1

x = data

• Discriminative model

0 10 20 30 40 50 60 70 80

-1

1

x = data

• Classification function

(The artist)

Page 5: Cvpr2007 object category recognition   p3 - discriminative models

• The representation and matching of pictorial structures Fischler, Elschlager (1973). • Face recognition using eigenfaces M. Turk and A. Pentland (1991). • Human Face Detection in Visual Scenes - Rowley, Baluja, Kanade (1995) • Graded Learning for Object Detection - Fleuret, Geman (1999) • Robust Real-time Object Detection - Viola, Jones (2001)• Feature Reduction and Hierarchy of Classifiers for Fast Object Detection in Video Images - Heisele, Serre, Mukherjee, Poggio (2001)•….

Page 6: Cvpr2007 object category recognition   p3 - discriminative models

• The representation and matching of pictorial structures Fischler, Elschlager (1973). • Face recognition using eigenfaces M. Turk and A. Pentland (1991). • Human Face Detection in Visual Scenes - Rowley, Baluja, Kanade (1995) • Graded Learning for Object Detection - Fleuret, Geman (1999) • Robust Real-time Object Detection - Viola, Jones (2001)• Feature Reduction and Hierarchy of Classifiers for Fast Object Detection in Video Images - Heisele, Serre, Mukherjee, Poggio (2001)•….

Page 7: Cvpr2007 object category recognition   p3 - discriminative models

Face detection

• The representation and matching of pictorial structures Fischler, Elschlager (1973). • Face recognition using eigenfaces M. Turk and A. Pentland (1991). • Human Face Detection in Visual Scenes - Rowley, Baluja, Kanade (1995) • Graded Learning for Object Detection - Fleuret, Geman (1999) • Robust Real-time Object Detection - Viola, Jones (2001)• Feature Reduction and Hierarchy of Classifiers for Fast Object Detection in Video Images - Heisele, Serre, Mukherjee, Poggio (2001)•….

Page 8: Cvpr2007 object category recognition   p3 - discriminative models

Face detection

Page 9: Cvpr2007 object category recognition   p3 - discriminative models

• Formulation: binary classification

Formulation

+1-1

x1 x2 x3 xN

… xN+1 xN+2 xN+M

-1 -1 ? ? ?

Training data: each image patch is labeledas containing the object or background

Test data

Features x =

Labels y =

Where belongs to some family of functions

• Classification function

• Minimize misclassification error(Not that simple: we need some guarantees that there will be generalization)

Page 10: Cvpr2007 object category recognition   p3 - discriminative models

Discriminative methods

106 examples

Nearest neighbor

Shakhnarovich, Viola, Darrell 2003Berg, Berg, Malik 2005…

Neural networks

LeCun, Bottou, Bengio, Haffner 1998Rowley, Baluja, Kanade 1998…

Support Vector Machines and Kernels Conditional Random Fields

McCallum, Freitag, Pereira 2000Kumar, Hebert 2003…

Guyon, VapnikHeisele, Serre, Poggio, 2001…

Page 11: Cvpr2007 object category recognition   p3 - discriminative models

A simple object detector with Boosting

Download

• Toolbox for manipulating dataset

• Code and dataset

Matlab code

• Gentle boosting

• Object detector using a part based model

Dataset with cars and computer monitors

http://people.csail.mit.edu/torralba/iccv2005/

Page 12: Cvpr2007 object category recognition   p3 - discriminative models

• A simple algorithm for learning robust classifiers– Freund & Shapire, 1995– Friedman, Hastie, Tibshhirani, 1998

• Provides efficient algorithm for sparse visual feature selection– Tieu & Viola, 2000– Viola & Jones, 2003

• Easy to implement, not requires external optimization tools.

Why boosting?

Page 13: Cvpr2007 object category recognition   p3 - discriminative models

Boosting

Boosting fits the additive model

by minimizing the exponential loss

Training samples

The exponential loss is a differentiable upper bound to the misclassification error.

Page 14: Cvpr2007 object category recognition   p3 - discriminative models

Boosting

Sequential procedure. At each step we add

For more details: Friedman, Hastie, Tibshirani. “Additive Logistic Regression: a Statistical View of Boosting” (1998)

to minimize the residual loss

inputDesired outputParametersweak classifier

Page 15: Cvpr2007 object category recognition   p3 - discriminative models

Weak classifiers • The input is a set of weighted training

samples (x,y,w)

• Regression stumps: simple but commonly used in object detection.

Four parameters:

b=Ew(y [x> ])

a=Ew(y [x< ])x

fm(x)

Page 16: Cvpr2007 object category recognition   p3 - discriminative models

Flavors of boosting

• AdaBoost (Freund and Shapire, 1995)

• Real AdaBoost (Friedman et al, 1998)

• LogitBoost (Friedman et al, 1998)

• Gentle AdaBoost (Friedman et al, 1998)

• BrownBoosting (Freund, 2000)

• FloatBoost (Li et al, 2002)

• …

Page 17: Cvpr2007 object category recognition   p3 - discriminative models

From images to features:A myriad of weak detectors

We will now define a family of visual features that can be used as weak classifiers (“weak detectors”)

Takes image as input and the output is binary response.The output is a weak detector.

Page 18: Cvpr2007 object category recognition   p3 - discriminative models

A myriad of weak detectors

• Yuille, Snow, Nitzbert, 1998• Amit, Geman 1998• Papageorgiou, Poggio, 2000• Heisele, Serre, Poggio, 2001• Agarwal, Awan, Roth, 2004• Schneiderman, Kanade 2004 • Carmichael, Hebert 2004• …

Page 19: Cvpr2007 object category recognition   p3 - discriminative models

Weak detectors

Textures of textures Tieu and Viola, CVPR 2000

Every combination of three filters generates a different feature

This gives thousands of features. Boosting selects a sparse subset, so computations on test time are very efficient. Boosting also avoids overfitting to some extend.

Page 20: Cvpr2007 object category recognition   p3 - discriminative models

Haar wavelets

Haar filters and integral imageViola and Jones, ICCV 2001

The average intensity in the block is computed with four sums independently of the block size.

Page 21: Cvpr2007 object category recognition   p3 - discriminative models

Haar waveletsPapageorgiou & Poggio (2000)

Polynomial SVM

Page 22: Cvpr2007 object category recognition   p3 - discriminative models

Edges and chamfer distance

Gavrila, Philomin, ICCV 1999

Page 23: Cvpr2007 object category recognition   p3 - discriminative models

Edge fragments

Weak detector = k edge fragments and threshold. Chamfer distance uses 8 orientation planes

Opelt, Pinz, Zisserman, ECCV 2006

Page 24: Cvpr2007 object category recognition   p3 - discriminative models

Histograms of oriented gradients

• Dalal & Trigs, 2006

• Shape context

Belongie, Malik, Puzicha, NIPS 2000• SIFT, D. Lowe, ICCV 1999

Page 25: Cvpr2007 object category recognition   p3 - discriminative models

Weak detectors

Part based: similar to part-based generative models. We create weak detectors by using parts and voting for the object center location

Car model Screen model

These features are used for the detector on the course web site.

Page 26: Cvpr2007 object category recognition   p3 - discriminative models

Weak detectors

First we collect a set of part templates from a set of training objects.

Vidal-Naquet, Ullman, Nature Neuroscience 2003

Page 27: Cvpr2007 object category recognition   p3 - discriminative models

Weak detectors

We now define a family of “weak detectors” as:

= =

Better than chance

*

Page 28: Cvpr2007 object category recognition   p3 - discriminative models

Weak detectors

We can do a better job using filtered images

Still a weak detectorbut better than before

* * ===

Page 29: Cvpr2007 object category recognition   p3 - discriminative models

TrainingFirst we evaluate all the N features on all the training images.

Then, we sample the feature outputs on the object center and at random locations in the background:

Page 30: Cvpr2007 object category recognition   p3 - discriminative models

Representation and object model

4 10

Selected features for the screen detector

1 2 3

100

Lousy painter

Page 31: Cvpr2007 object category recognition   p3 - discriminative models

Representation and object modelSelected features for the car detector

1 2 3 4 10 100

… …

Page 32: Cvpr2007 object category recognition   p3 - discriminative models

Detection• Invariance: search strategy

• Part based

fi, Pigi

Here, invariance in translation and scale is achieved by the search strategy: the classifier is evaluated at all locations (by translating the image) and at all scales (by scaling the image in small steps).

The search cost can be reduced using a cascade.

Page 33: Cvpr2007 object category recognition   p3 - discriminative models

Example: screen detectionFeature output

Page 34: Cvpr2007 object category recognition   p3 - discriminative models

Example: screen detectionFeature output

Thresholded output

Weak ‘detector’Produces many false alarms.

Page 35: Cvpr2007 object category recognition   p3 - discriminative models

Example: screen detectionFeature output

Thresholded output

Strong classifier at iteration 1

Page 36: Cvpr2007 object category recognition   p3 - discriminative models

Example: screen detectionFeature output

Thresholded output

Strongclassifier

Second weak ‘detector’Produces a different set of false alarms.

Page 37: Cvpr2007 object category recognition   p3 - discriminative models

Example: screen detection

+

Feature output

Thresholded output

Strongclassifier

Strong classifier at iteration 2

Page 38: Cvpr2007 object category recognition   p3 - discriminative models

Example: screen detection

+

Feature output

Thresholded output

Strongclassifier

Strong classifier at iteration 10

Page 39: Cvpr2007 object category recognition   p3 - discriminative models

Example: screen detection

+

Feature output

Thresholded output

Strongclassifier

Adding features

Finalclassification

Strong classifier at iteration 200

Page 40: Cvpr2007 object category recognition   p3 - discriminative models

We want the complexity of the 3 features classifier with the performance of the 100 features classifier:

Cascade of classifiersFleuret and Geman 2001, Viola and Jones 2001

Recall

Precision

0% 100%

100%

3 features

30 features

100 features

Select a threshold with high recall for each stage.

We increase precision using the cascade

Page 41: Cvpr2007 object category recognition   p3 - discriminative models

Some goals for object recognition

• Able to detect and recognize many object classes

• Computationally efficient

• Able to deal with data starving situations:– Some training samples might be harder to

collect than others– We want on-line learning to be fast

Page 42: Cvpr2007 object category recognition   p3 - discriminative models

Multiclass object detection

Page 43: Cvpr2007 object category recognition   p3 - discriminative models

Multiclass object detection

Number of classes

Amount of labeled data

Number of classes

Complexity

Page 44: Cvpr2007 object category recognition   p3 - discriminative models

Shared features• Is learning the object class 1000 easier

than learning the first?

• Can we transfer knowledge from one object to another?

• Are the shared properties interesting by themselves?

Page 45: Cvpr2007 object category recognition   p3 - discriminative models

Multitask learning

•horizontal location of doorknob •single or double door•horizontal location of doorway center •width of doorway•horizontal location of left door jamb

•horizontal location of right door jamb•width of left door jamb •width of right door jamb•horizontal location of left edge of door •horizontal location of right edge of door

Primary task: detect door knobs

Tasks used:

R. Caruana. Multitask Learning. ML 1997

Page 46: Cvpr2007 object category recognition   p3 - discriminative models

Sharing invariancesS. Thrun. Is Learning the n-th Thing Any Easier Than Learning The First? NIPS 1996

“Knowledge is transferred between tasks via a learned model of the invariances of the domain: object recognition is invariant to rotation, translation, scaling, lighting, … These invariances are common to all object recognition tasks”.

Toy world

Without sharing

With sharing

Page 47: Cvpr2007 object category recognition   p3 - discriminative models

Sharing transformationsMiller, E., Matsakis, N., and Viola, P. (2000). Learning from one example

through shared densities on transforms. In IEEE Computer Vision and Pattern Recognition.

Transformations are sharedand can be learnt from other tasks.

Page 48: Cvpr2007 object category recognition   p3 - discriminative models

Models of object recognitionI. Biederman, “Recognition-by-components: A theory of human image understanding,” Psychological Review, 1987.

M. Riesenhuber and T. Poggio, “Hierarchical models of object recognition in cortex,” Nature Neuroscience 1999.

T. Serre, L. Wolf and T. Poggio. “Object recognition with features inspired by visual cortex”. CVPR 2005

Page 49: Cvpr2007 object category recognition   p3 - discriminative models

Sharing in constellation models

Pictorial StructuresFischler & Elschlager, IEEE Trans. Comp. 1973

Constellation ModelBurl, Liung,Perona, 1996; Weber, Welling, Perona, 2000

Fergus, Perona, & Zisserman, CVPR 2003

SVM DetectorsHeisele, Poggio, et. al., NIPS 2001

Model-Guided SegmentationMori, Ren, Efros, & Malik, CVPR 2004

Page 50: Cvpr2007 object category recognition   p3 - discriminative models

E-Step

Random initialization

Variational EMVariational EM

prior knowledge of p()

new estimate of p(|train)

M-Step

new ’s

(Attias, Hinton, Beal, etc.)Slide from Fei Fei Li

Fei-Fei, Fergus, & Perona, ICCV 2003

Page 51: Cvpr2007 object category recognition   p3 - discriminative models

Reusable Parts

Goal: Look for a vocabulary of edges that reduces the number of features.

Krempp, Geman, & Amit “Sequential Learning of Reusable Parts for Object Detection”. TR 2002

Num

ber

of f

eatu

res

Number of classes

Examples of reused parts

Page 52: Cvpr2007 object category recognition   p3 - discriminative models

Sharing patches• Bart and Ullman, 2004

For a new class, use only features similar to features that where good for other classes:

Proposed Dog features

Page 53: Cvpr2007 object category recognition   p3 - discriminative models

Multiclass boosting

• Adaboost.MH (Shapire & Singer, 2000)

• Error correcting output codes (Dietterich & Bakiri, 1995; …)

• Lk-TreeBoost (Friedman, 2001)

• ...

Page 54: Cvpr2007 object category recognition   p3 - discriminative models

Shared features

Screen detector

Car detector

Face detector

• Independent binary classifiers:

Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007

Screen detector

Car detector

Face detector

• Binary classifiers that share features:

Page 55: Cvpr2007 object category recognition   p3 - discriminative models

50 training samples/class29 object classes2000 entries in the dictionary

Results averaged on 20 runsError bars = 80% interval

Krempp, Geman, & Amit, 2002

Torralba, Murphy, Freeman. CVPR 2004

Shared features

Class-specific features

Page 56: Cvpr2007 object category recognition   p3 - discriminative models

Generalization as a function of object similarities

12 viewpoints12 unrelated object classes

Number of training samples per class Number of training samples per class

Are

a un

der

RO

C

Are

a un

der

RO

C K = 2.1 K = 4.8

Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007

Page 57: Cvpr2007 object category recognition   p3 - discriminative models

Opelt, Pinz, Zisserman, CVPR 2006

Efficiency Generalization

Page 58: Cvpr2007 object category recognition   p3 - discriminative models

Some references on multiclass

• Caruana 1997• Schapire, Singer, 2000• Thrun, Pratt 1997• Krempp, Geman, Amit, 2002• E.L.Miller, Matsakis, Viola, 2000• Mahamud, Hebert, Lafferty, 2001• Fink 2004• LeCun, Huang, Bottou, 2004• Holub, Welling, Perona, 2005• …

Page 59: Cvpr2007 object category recognition   p3 - discriminative models

Context based methodsAntonio Torralba

Page 60: Cvpr2007 object category recognition   p3 - discriminative models

Why is this hard?

Page 61: Cvpr2007 object category recognition   p3 - discriminative models

2

1

What are the hidden objects?

Page 62: Cvpr2007 object category recognition   p3 - discriminative models

What are the hidden objects?

Chance ~ 1/30000

Page 63: Cvpr2007 object category recognition   p3 - discriminative models

Context-based object recognition

• Cognitive psychology– Palmer 1975 – Biederman 1981– …

• Computer vision– Noton and Stark (1971)– Hanson and Riseman (1978)– Barrow & Tenenbaum (1978) – Ohta, kanade, Skai (1978)– Haralick (1983)– Strat and Fischler (1991)– Bobick and Pinhanez (1995)– Campbell et al (1997)

Page 64: Cvpr2007 object category recognition   p3 - discriminative models
Page 65: Cvpr2007 object category recognition   p3 - discriminative models
Page 66: Cvpr2007 object category recognition   p3 - discriminative models
Page 67: Cvpr2007 object category recognition   p3 - discriminative models

Global and local representations

building

car

sidewalk

Urban street scene

Page 68: Cvpr2007 object category recognition   p3 - discriminative models

Global and local representations

Image index: Summary statistics, configuration of textures

Urban street scene

features

histogram

building

car

sidewalk

Urban street scene

Page 69: Cvpr2007 object category recognition   p3 - discriminative models

Global scene representations

Spatial structure is important in order to provide context for object localization

Sivic, Russell, Freeman, Zisserman, ICCV 2005Fei-Fei and Perona, CVPR 2005Bosch, Zisserman, Munoz, ECCV 2006

Bag of words Spatially organized textures

Non localized textons

S. Lazebnik, et al, CVPR 2006Walker, Malik. Vision Research 2004

M. Gorkani, R. Picard, ICPR 1994A. Oliva, A. Torralba, IJCV 2001

Page 70: Cvpr2007 object category recognition   p3 - discriminative models

Contextual object relationshipsCarbonetto, de Freitas & Barnard (2004) Kumar, Hebert (2005)

Torralba Murphy Freeman (2004)

Fink & Perona (2003)E. Sudderth et al (2005)

Page 71: Cvpr2007 object category recognition   p3 - discriminative models

Context

• Murphy, Torralba & Freeman (NIPS 03)Use global context to predict presence and location of objects

Op1,c1

vp1,c1

OpN,c1

vpN,c1. . .

Op1,c2

vp1,c2

OpN,c2

vpN,c2. . .

Class 1 Class 2

E1 E2

S

c2maxVc1

maxV

X1X2vg

Keyboards

Page 72: Cvpr2007 object category recognition   p3 - discriminative models

3d Scene Context

Image World

[Hoiem, Efros, Hebert ICCV 2005]

Page 73: Cvpr2007 object category recognition   p3 - discriminative models

3d Scene Context

Image Support Vertical Sky

V-Left V-Center V-Right V-Porous V-Solid

[Hoiem, Efros, Hebert ICCV 2005]

ObjectSurface?

Support?

Page 74: Cvpr2007 object category recognition   p3 - discriminative models

Object-Object Relationships

Enforce spatial consistency between labels using MRF

Carbonetto, de Freitas & Barnard (04)

Page 75: Cvpr2007 object category recognition   p3 - discriminative models

Object-Object Relationships

Use latent variables to induce long distance correlations between labels in a Conditional Random Field (CRF)

He, Zemel & Carreira-Perpinan (04)

Page 76: Cvpr2007 object category recognition   p3 - discriminative models

• Fink & Perona (NIPS 03)Use output of boosting from other objects at previous

iterations as input into boosting for this iteration

Object-Object Relationships

Page 77: Cvpr2007 object category recognition   p3 - discriminative models

CRFsObject-Object Relationships

[Torralba Murphy Freeman 2004] [Kumar Hebert 2005]

Page 78: Cvpr2007 object category recognition   p3 - discriminative models

Hierarchical Sharing and Context

• Scenes share objects

• Objects share parts

• Parts share features

E. Sudderth, A. Torralba, W. T. Freeman, and A. Wilsky.

Page 79: Cvpr2007 object category recognition   p3 - discriminative models

Some references on context

• Strat & Fischler (PAMI 91) • Torralba & Sinha (ICCV 01), • Torralba (IJCV 03) • Fink & Perona (NIPS 03)• Murphy, Torralba & Freeman (NIPS 03)• Kumar and M. Hebert (NIPS 04)• Carbonetto, Freitas & Barnard (ECCV 04)• He, Zemel & Carreira-Perpinan (CVPR 04)• Sudderth, Torralba, Freeman, Wilsky (ICCV 05)• Hoiem, Efros, Hebert (ICCV 05)• …

With a mixture of generative and discriminative approaches

Page 80: Cvpr2007 object category recognition   p3 - discriminative models

A car out of context …

Page 81: Cvpr2007 object category recognition   p3 - discriminative models

Banksy

Integrated models for scene and object recognition

Page 82: Cvpr2007 object category recognition   p3 - discriminative models

Summary

Many techniques are used for training discriminative models that I have not mention here:

• Conditional random fields

• Kernels for object recognition

• Learning object similarities

• …