beyond bags of features: adding spatial information

24
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Upload: zared

Post on 23-Mar-2016

35 views

Category:

Documents


0 download

DESCRIPTION

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba. Adding spatial information. Forming vocabularies from pairs of nearby features – “doublets” or “bigrams” Computing bags of features on sub-windows of the whole image - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Beyond bags of features: Adding spatial information

Beyond bags of features:Adding spatial information

Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Page 2: Beyond bags of features: Adding spatial information

Adding spatial information• Forming vocabularies from pairs of nearby

features – “doublets” or “bigrams”• Computing bags of features on sub-windows

of the whole image• Using codebooks to vote for object position• Generative part-based models

Page 3: Beyond bags of features: Adding spatial information

From single features to “doublets”1. Run pLSA on a regular visual vocabulary2. Identify a small number of top visual words

for each topic3. Form a “doublet” vocabulary from these top

visual words4. Run pLSA again on the augmented

vocabulary

J. Sivic, B. Russell, A. Efros, A. Zisserman, B. Freeman, Discovering Objects and their Location in Images, ICCV 2005

Page 4: Beyond bags of features: Adding spatial information

From single features to “doublets”

J. Sivic, B. Russell, A. Efros, A. Zisserman, B. Freeman, Discovering Objects and their Location in Images, ICCV 2005

Ground truth All features “Face” features initiallyfound by pLSA

One doublet Another doublet “Face” doublets

Page 5: Beyond bags of features: Adding spatial information

Spatial pyramid representationSpatial pyramid representation• Extension of a bag of features• Locally orderless representation at several levels of resolution

level 0

Lazebnik, Schmid & Ponce (CVPR 2006)

Page 6: Beyond bags of features: Adding spatial information

Spatial pyramid representationSpatial pyramid representation• Extension of a bag of features• Locally orderless representation at several levels of resolution

level 0 level 1

Lazebnik, Schmid & Ponce (CVPR 2006)

Page 7: Beyond bags of features: Adding spatial information

Spatial pyramid representationSpatial pyramid representation

level 0 level 1 level 2

• Extension of a bag of features• Locally orderless representation at several levels of resolution

Lazebnik, Schmid & Ponce (CVPR 2006)

Page 8: Beyond bags of features: Adding spatial information

Scene category datasetScene category dataset

Multi-class classification results(100 training images per class)

Page 9: Beyond bags of features: Adding spatial information

Caltech101 datasetCaltech101 datasethttp://www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.html

Multi-class classification results (30 training images per class)

Page 10: Beyond bags of features: Adding spatial information

Implicit shape models• Visual codebook is used to index votes for

object position

B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004

training image

visual codeword withdisplacement vectors

Page 11: Beyond bags of features: Adding spatial information

Implicit shape models• Visual codebook is used to index votes for

object position

B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004

test image

Page 12: Beyond bags of features: Adding spatial information

Implicit shape models: Training1. Build codebook of patches around extracted

interest points using clustering

Page 13: Beyond bags of features: Adding spatial information

Implicit shape models: Training1. Build codebook of patches around extracted

interest points using clustering2. Map the patch around each interest point to

closest codebook entry

Page 14: Beyond bags of features: Adding spatial information

Implicit shape models: Training1. Build codebook of patches around extracted

interest points using clustering2. Map the patch around each interest point to

closest codebook entry3. For each codebook entry, store all positions

it was found, relative to object center

Page 15: Beyond bags of features: Adding spatial information

Implicit shape models: Testing1. Given test image, extract patches, match to

codebook entry 2. Cast votes for possible positions of object center3. Search for maxima in voting space4. Extract weighted segmentation mask based on

stored masks for the codebook occurrences

Page 16: Beyond bags of features: Adding spatial information

Generative part-based models

R. Fergus, P. Perona and A. Zisserman, Object Class Recognition by Unsupervised Scale-Invariant Learning, CVPR 2003

Page 17: Beyond bags of features: Adding spatial information

Probabilistic model

h: assignment of features to parts

)|(),|(),|(max)|,()|(

objecthpobjecthshapepobjecthappearancePobjectshapeappearancePobjectimageP

h

Partdescriptors

Partlocations

Candidate parts

Page 18: Beyond bags of features: Adding spatial information

Probabilistic model

h: assignment of features to parts

Part 2

Part 3

Part 1

)|(),|(),|(max)|,()|(

objecthpobjecthshapepobjecthappearancePobjectshapeappearancePobjectimageP

h

Page 19: Beyond bags of features: Adding spatial information

Probabilistic model

h: assignment of features to parts

Part 2

Part 3

Part 1

)|(),|(),|(max)|,()|(

objecthpobjecthshapepobjecthappearancePobjectshapeappearancePobjectimageP

h

Page 20: Beyond bags of features: Adding spatial information

Probabilistic model

)|(),|(),|(max)|,()|(

objecthpobjecthshapepobjecthappearancePobjectshapeappearancePobjectimageP

h

High-dimensional appearance space

Distribution over patchdescriptors

Page 21: Beyond bags of features: Adding spatial information

Probabilistic model

)|(),|(),|(max)|,()|(

objecthpobjecthshapepobjecthappearancePobjectshapeappearancePobjectimageP

h

2D image space

Distribution over jointpart positions

Page 22: Beyond bags of features: Adding spatial information

Results: Faces

Faceshapemodel

Patchappearancemodel

Recognitionresults

Page 23: Beyond bags of features: Adding spatial information

Results: Motorbikes and airplanes

Page 24: Beyond bags of features: Adding spatial information

Summary: Adding spatial information• Doublet vocabularies

• Pro: takes co-occurrences into account, some geometric invariance is preserved

• Con: too many doublet probabilities to estimate

• Spatial pyramids• Pro: simple extension of a bag of features, works very well• Con: no geometric invariance

• Implicit shape models• Pro: can localize object, maintain translation and possibly

scale invariance• Con: need supervised training data (known object positions

and possibly segmentation masks)

• Generative part-based models• Pro: very nice conceptually• Con: combinatorial hypothesis search problem