HOGgles: Visualizing Object Detection Features (to appear in ICCV 2013). Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz, and Antonio Torralba, MIT. Presented by: Yonatan Dishon, Nov 2013


Page 1:

HOGgles: Visualizing Object Detection Features (to appear in ICCV 2013). Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz, and Antonio Torralba, MIT

Presented by: Yonatan Dishon, Nov 2013

Page 2:

Today's talk: Motivation; Related Work; HOGgles under the hood (3 baselines + main algorithm); Limitations; Quantitative + qualitative evaluation; Human+HOG detector; Paper conclusions; Future development

Page 3:

So why HOGgles?

Image from: C. Vondrick, A. Khosla, T. Malisiewicz, A. Torralba. "HOGgles: Visualizing Object Detection Features" 2013

Page 4:

So why HOGgles?

"?! Maybe I should put on my HOGgles!" (HOGgles visualization)

Page 5:

So why HOGgles? This is a visualization of the descriptor space: this is what a classification/detection algorithm sees! So which of the following would you classify as a car?

Remember: humans are the "perfect" classifier!

Object categories (PASCAL VOC): Airplane, Bicycle, Bird, Boat, Bottle, Bus, Car, Cat, Chair, Cow, Table, Dog, Horse, Motorbike, Person, Potted Plant, Sheep, Sofa, Train, TV/Monitor

Candidate images: 1 2 3 4 5 6 7

Page 6:

Motivation: So why did my detector fail?

Training set: maybe not a good one?

Learning algorithm: maybe it should have been different?

Features: maybe there are better features for this kind of problem?

Visualizing the feature space can give us an intuitive understanding of our detection system's limitations and failures.

Page 7:

HOGgles Contributions

A tool to explain some of the failures of object detection systems.

An algorithm to present the feature space of object detectors (general features, not only HOG!).

4 different algorithms to do so are presented.

Public feature visualization toolbox: http://web.mit.edu/vondrick/ihog/#code


Page 9:

Related Work (1) Reconstruct an image from its SIFT keypoint descriptors, based on a huge database

P. Weinzaepfel, H. Jégou, and P. Pérez. Reconstructing an image from its local descriptors. In CVPR, 2011.

Image from: P. Weinzaepfel, H. Jégou, and P. Pérez. Reconstructing an image from its local descriptors. In CVPR, 2011.

Page 10:

Related Work (1)

Page 11:

Related Work (1)

Steps: calculate the SIFT elliptic region of interest; affine normalization to a square patch.

Figure panels: original image, copied patches, blending, interpolation.

P. Weinzaepfel, H. Jégou, and P. Pérez. Reconstructing an image from its local descriptors. In CVPR, 2011.

Slide credit: Ezgi Mercan

Page 12:

Related Work (1)

Image from: P. Weinzaepfel, H. Jégou, and P. Pérez. Reconstructing an image from its local descriptors. In CVPR, 2011.

Website for more examples: http://www.irisa.fr/texmex/people/jegou/projects/reconstructing/index.html

Page 13:

Related Work (2) Reconstruct an image given only local binary descriptors (LBDs)

E. d’Angelo, A. Alahi, and P. Vandergheynst. Beyond bits: Reconstructing images from local binary descriptors. In ICPR, 2012.

Uses LBDs (Local Binary Descriptors): BRIEF and FREAK

No external information! Reconstruction is posed as a regularized inverse problem

A. Alahi, R. Ortiz, and P. Vandergheynst. FREAK: Fast Retina Keypoint. In CVPR, 2012.

M. Calonder, V. Lepetit, C. Strecha, and P. Fua. BRIEF: Binary Robust Independent Elementary Features. In ECCV, 2010, pages 778-792.

Page 14:

Related Work (2)

Image reconstruction results of FREAK (top row) and BRIEF (bottom row) descriptors.

E. d’Angelo, A. Alahi, and P. Vandergheynst. Beyond bits: Reconstructing images from local binary descriptors. In ICPR, 2012.


Page 16:

HOGgles - under the hood

Feature visualization is posed as a feature inversion problem: given a feature vector, what was the image/patch that created it?

Let $x \in \mathbb{R}^D$ be an image and $y = \phi(x)$ be the corresponding HOG feature descriptor. $\phi(\cdot)$ is a many-to-one function.

The inversion problem cannot be solved analytically!

Page 17:

HOGgles under the hood

The problem is formalized as an optimization problem: given a descriptor $y$, we seek an image $x^*$ that minimizes the reconstruction error in feature space:

$$x^* = \phi^{-1}(y) = \operatorname*{argmin}_{x \in \mathbb{R}^D} \lVert \phi(x) - y \rVert_2^2$$

This function isn't convex. Trying to find a minimum with ordinary optimization algorithms (steepest descent and Newton's method) didn't work.
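
To make the many-to-one point concrete, here is a minimal sketch of the inversion objective. The function `toy_hog` is a hypothetical stand-in for $\phi$ (a single normalized orientation histogram, much cruder than real HOG), and the image size and bin count are assumptions chosen only for illustration.

```python
import numpy as np

def toy_hog(image, n_bins=9):
    """Hypothetical stand-in for phi: one unsigned-orientation histogram per image."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    orientation = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned angle in [0, pi)
    bins = np.minimum((orientation / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    return hist / (np.linalg.norm(hist) + 1e-8)

def inversion_loss(image, y):
    """Objective from the slide: squared L2 distance between phi(x) and y."""
    return float(np.sum((toy_hog(image) - y) ** 2))

rng = np.random.default_rng(0)
x = rng.random((16, 16))
y = toy_hog(x)

# Two different images that reach (numerically) zero loss for the same y:
brighter = x + 0.5          # a constant offset leaves the gradients unchanged
higher_contrast = 2.0 * x   # a global scale is removed by the normalization
```

Both `brighter` and `higher_contrast` differ from `x` pixel by pixel yet produce essentially the same descriptor, which is why the minimizer is not unique.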


Page 19:

HOGgles under the hood

4 algorithms are shown: 3 as baselines and one offered as the main algorithm.

Baseline 1: Exemplar LDA

Baseline 2: Ridge Regression

Baseline 3: Direct Optimization

Main Algorithm: Paired Dictionary

Page 20:

HOGgles under the hood (Baseline 1 - ELDA)

Baseline 1: Exemplar LDA (B. Hariharan, J. Malik & D. Ramanan, ECCV 2012)

The HOG inverse is the average, in RGB space, of the top $K$ detections $z_1, \dots, z_K$ of the ELDA detector:

$$\phi^{-1}(y) = \frac{1}{K} \sum_{i=1}^{K} z_i$$
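
A rough sketch of this baseline, under simplifying assumptions: instead of running a sliding-window detector over a large image database, it scores a fixed set of pre-cropped patches. The template $w = \Sigma^{-1}(y - \mu)$ is the usual exemplar-LDA form; the function name `elda_inverse`, the `ridge` term, and the random toy data are hypothetical.

```python
import numpy as np

def elda_inverse(y, patch_pixels, patch_hog, top_k=20, ridge=1e-3):
    """Average the top-K database patches ranked by an exemplar-LDA template built from y.

    patch_pixels: (N, H, W) or (N, H, W, 3) image patches
    patch_hog:    (N, d) descriptors of the same patches
    """
    mu = patch_hog.mean(axis=0)
    centered = patch_hog - mu
    cov = centered.T @ centered / len(patch_hog)
    cov += ridge * np.eye(cov.shape[0])      # assumption: small ridge keeps cov invertible
    w = np.linalg.solve(cov, y - mu)         # ELDA template: Sigma^{-1} (y - mu)
    scores = patch_hog @ w                   # detector response for every patch
    top = np.argsort(scores)[::-1][:top_k]   # indices of the K highest-scoring patches
    return patch_pixels[top].mean(axis=0)

# Toy usage with random data standing in for a real patch database.
rng = np.random.default_rng(0)
pixels = rng.random((50, 8, 8))
hogs = rng.random((50, 12))
inverse = elda_inverse(hogs[3], pixels, hogs, top_k=5)
```

Averaging is what makes the result blurry: every detail that is not shared by all top detections is smoothed away.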

Page 21:

HOGgles under the hood (Baseline 1 -ELDA)

Slide credit: Ezgi Mercan

Page 22:

HOGgles under the hood (Baseline 1 - ELDA)

PROs: Simple. Surprisingly accurate results, even when the database doesn't contain the category of the HOG template!

CONs: Computationally expensive (requires running an object detector over a large database). Yields blurred results.

Page 23:

HOGgles under the hood (Baseline 2 - Ridge Regression)

Baseline 2: Ridge Regression

Statistically most likely image given a HOG descriptor: calculate the most probable grayscale image given its HOG feature.

Modeling $P(X, Y) \sim \mathcal{N}(\mu, \Sigma)$, the HOG inverse (the visualization) is given by the mean of $P(X \mid Y = y)$:

$$\phi_B^{-1}(y) = \Sigma_{XY} \Sigma_{YY}^{-1} (y - \mu_Y) + \mu_X$$

$\mu$ and $\Sigma$ are estimated on a large database. Inversion is a single matrix multiplication!
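
The Gaussian inversion formula can be sketched in a few lines of NumPy. The function names, the `ridge` regularizer on $\Sigma_{YY}$, and the synthetic linear data are assumptions for illustration; the slides do not give the actual settings.

```python
import numpy as np

def fit_gaussian_inverter(X, Y, ridge=1e-3):
    """X: (N, D) images as rows, Y: (N, d) descriptors as rows. Returns (B, mu_x, mu_y)."""
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mu_x, Y - mu_y
    sigma_xy = Xc.T @ Yc / len(X)                                # (D, d)
    sigma_yy = Yc.T @ Yc / len(X) + ridge * np.eye(Y.shape[1])   # (d, d), regularized
    # B = Sigma_XY Sigma_YY^{-1}; solve() avoids forming the inverse explicitly.
    B = np.linalg.solve(sigma_yy, sigma_xy.T).T
    return B, mu_x, mu_y

def ridge_inverse(y, B, mu_x, mu_y):
    """Conditional mean of X given Y = y: one matrix-vector product."""
    return B @ (y - mu_y) + mu_x

# Toy check on synthetic data where images depend linearly on descriptors.
rng = np.random.default_rng(0)
Y = rng.normal(size=(500, 5))
A = rng.normal(size=(8, 5))
X = Y @ A.T + 0.3
B, mu_x, mu_y = fit_gaussian_inverter(X, Y, ridge=1e-8)
x_hat = ridge_inverse(Y[0], B, mu_x, mu_y)
```

Once `B`, `mu_x`, and `mu_y` are fitted offline, each inversion costs only the product `B @ (y - mu_y)`, which is why this baseline is so fast.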

Page 24:

HOGgles under the hood (Baseline 2 - Ridge Regression)

PROs: Simple: this is a matrix multiplication! Very fast: under a second per inversion.

CONs: Inversion yields blurry images.

Page 25:

HOGgles under the hood (Baseline 3 - Direct)

Baseline 3: Direct Optimization

Let $U \in \mathbb{R}^{D \times K}$ describe a natural image basis. Any image $x \in \mathbb{R}^D$ can be encoded by coefficients $\rho \in \mathbb{R}^K$ in this basis: $x = U\rho$. We wish to find

$$\rho^* = \operatorname*{argmin}_{\rho \in \mathbb{R}^K} \lVert \phi(U\rho) - y \rVert_2^2$$
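
A minimal sketch of direct optimization over basis coefficients, using finite-difference gradients with backtracking. Everything specific here is an assumption: `toy_feature` is a hypothetical smooth stand-in for HOG on a 1-D signal, the basis is a random orthonormal matrix rather than a natural image basis, and the optimizer is a generic choice, not the one used in the paper.

```python
import numpy as np

def toy_feature(x):
    """Hypothetical stand-in for phi: normalized smoothed absolute differences."""
    d = np.sqrt(np.diff(x) ** 2 + 1e-6)
    return d / np.linalg.norm(d)

def direct_inverse(y, U, feature, steps=50, eps=1e-5, seed=0):
    """Minimize ||feature(U rho) - y||^2 over rho with numerical gradients."""
    rng = np.random.default_rng(seed)
    rho = rng.normal(size=U.shape[1])

    def loss(r):
        return float(np.sum((feature(U @ r) - y) ** 2))

    history = [loss(rho)]
    for _ in range(steps):
        base = history[-1]
        # Forward-difference estimate of the gradient, one coordinate at a time.
        grad = np.array([(loss(rho + eps * e) - base) / eps for e in np.eye(len(rho))])
        step = 1.0
        while step > 1e-6 and loss(rho - step * grad) >= base:
            step *= 0.5                      # backtrack until the loss decreases
        if step <= 1e-6:
            break                            # no improving step found: stop
        rho = rho - step * grad
        history.append(loss(rho))
    return U @ rho, history

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.normal(size=(16, 6)))   # random orthonormal basis (assumption)
y = toy_feature(U @ rng.normal(size=6))
x_hat, history = direct_inverse(y, U, toy_feature)
```

Because the objective is not convex, the result depends on the random starting point; the sketch only guarantees that the loss never increases from one step to the next.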

Page 26:

HOGgles under the hood (Baseline 3 - Direct)

PROs: Recovers high frequencies.

CONs: Adds noise to the image.

Page 27:

HOGgles under the hood (Main Algorithm - Paired Dictionary)

Let $x \in \mathbb{R}^D$ be an image and $y \in \mathbb{R}^d$ be its HOG descriptor. Suppose we write $x$ and $y$ in terms of bases $U \in \mathbb{R}^{D \times K}$ and $V \in \mathbb{R}^{d \times K}$ respectively, $x = U\alpha$ and $y = V\alpha$, where $\alpha \in \mathbb{R}^K$ are shared coefficients.

$y$ can be projected onto the basis $V$ and then mapped into the image basis $U$.
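
The projection step can be sketched as sparse coding followed by a matrix product. This sketch swaps in a penalized lasso solved with ISTA (iterative soft thresholding); the penalty weight `lam`, the iteration count, and the random dictionaries are assumptions, and the paper's exact formulation and solver may differ.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(y, V, lam=0.1, iters=500):
    """ISTA for min_alpha 0.5 * ||V alpha - y||^2 + lam * ||alpha||_1."""
    L = np.linalg.norm(V, 2) ** 2          # Lipschitz constant of the smooth term
    alpha = np.zeros(V.shape[1])
    for _ in range(iters):
        alpha = soft_threshold(alpha - V.T @ (V @ alpha - y) / L, lam / L)
    return alpha

def paired_dict_inverse(y, U, V, lam=0.1):
    """Code y in the HOG basis V, then decode the same coefficients with the image basis U."""
    return U @ sparse_code(y, V, lam)

# Toy usage with random dictionaries standing in for learned ones.
rng = np.random.default_rng(0)
D, d, K = 30, 20, 50
U = rng.normal(size=(D, K))
V = rng.normal(size=(d, K))
V /= np.linalg.norm(V, axis=0)
alpha_true = np.zeros(K)
alpha_true[[3, 17, 41]] = [1.0, -0.5, 2.0]
y = V @ alpha_true
x_hat = paired_dict_inverse(y, U, V, lam=0.01)
```

The design point is that $U$ never sees $y$ directly: all information flows through the shared coefficients, so the quality of the inverse depends on how well $U$ and $V$ were paired during learning.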

Page 28:

HOGgles under the hood (Main Algorithm - PairDict)

Figure labels: $V \in \mathbb{R}^{d \times K}$ (HOG basis), $U \in \mathbb{R}^{D \times K}$ (image basis)

Page 29:

HOGgles under the hood (Main Algorithm - PairDict)

How are the bases U and V found?

By solving a paired dictionary learning problem.

The objective is simplified to a standard sparse coding and dictionary learning problem, optimized with SPAMS.

J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In ICML, 2009.

SPAMS (SParse Modeling Software), code available at http://spams-devel.gforge.inria.fr/
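
One way to see why the paired problem reduces to standard dictionary learning: assuming the paired objective sums an image reconstruction term and a descriptor reconstruction term with shared coefficients, stacking the image on top of its descriptor and $U$ on top of $V$ gives exactly the same cost. The sizes and random values below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, K = 12, 6, 4
U, V = rng.normal(size=(D, K)), rng.normal(size=(d, K))
x, y, alpha = rng.normal(size=D), rng.normal(size=d), rng.normal(size=K)

# Paired form: two reconstruction errors that share the coefficients alpha.
separate = np.sum((x - U @ alpha) ** 2) + np.sum((y - V @ alpha) ** 2)

# Stacked form: a single reconstruction error with one taller dictionary.
W = np.vstack([U, V])                         # (D + d, K)
stacked = np.sum((np.concatenate([x, y]) - W @ alpha) ** 2)

# After learning W with any off-the-shelf solver, split it back into the pair.
U_learned, V_learned = W[:D], W[D:]
```

Since `separate` and `stacked` agree, any standard sparse dictionary learner applied to the stacked samples yields a matched pair of bases.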

Page 30:

HOGgles under the hood (Main Algorithm - PairDict)

Dictionary optimization takes a few hours (offline). U and V are estimated with dictionary size $K \approx 10^3$ and $N \approx 10^6$ training samples from a large database.

Examples of U & V pairs: notice the correlation between the dictionaries.

Page 31:

HOGgles visual results


Page 33:

HOGgles - Limitations

There may exist a better visualization than paired dictionaries, although it may not be tractable to construct it.

Under the recursive (iterated) solution, some high frequencies are lost.

Page 34:

HOGgles - Limitations (cont.)

Inversion is sensitive to the HOG dimensionality.


Page 36:

Evaluation of Inversions

PASCAL VOC 2011 dataset. The inverted patches correspond to objects.

Quantitative evaluation: how well is each pixel in x reconstructed from y by each algorithm?

Qualitative evaluation: how well is the high-level content preserved? (Human study using the MTurk platform.)

Page 37:

Evaluation of Inversions: Quantitative

Mean normalized cross correlation of the inverse image to the ground truth. (Higher is better; max. is 1.)

[Bar chart "Evaluation of Performance". Series: ELDA, Ridge, Direct, PairDict. Categories: bicycle, bottle, car, cat, chair, table, motorbike, person, Mean. Vertical axis: 0.40 to 0.80. Data labels shown, in category order: 0.56, 0.67, 0.64, 0.71, 0.62, 0.61, 0.59, 0.65, 0.64.]
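
The metric itself is short to write down. This is a generic zero-mean normalized cross correlation between two same-sized images; the function name and the handling of constant images are assumptions, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross correlation of two same-shaped images, in [-1, 1]."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Assumption: a constant image carries no structure, so report 0 for it.
    return float(a @ b / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
truth = rng.random((8, 8))
same_structure = 2.0 * truth + 3.0   # differs only in brightness and contrast
inverted = -truth                    # contrast-inverted copy
```

Because the mean and scale are normalized away, the score rewards matching structure rather than matching absolute pixel values, which suits inversions that cannot recover brightness.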

Page 38:

Evaluation of Inversions: Qualitative

MTurk workers were asked to classify inversions into 1 of 20 categories. MIT PhD students in computer vision served as experts, trying the same task with HOG glyphs.

(*) Numbers are the fraction classified correctly; chance is 0.05.

Page 39:

Evaluation of Inversions: Qualitative

[Bar chart "Evaluation of Visualization Performance". Series: ELDA, Ridge, Direct, PairDict, Glyph, Expert. Categories: bicycle, bottle, car, cat, chair, table, motorbike, person, Mean. Vertical axis: -0.10 to 0.70. Data labels shown, in category order: 0.31, 0.45, 0.59, 0.20, 0.39, 0.24, 0.22, 0.68, 0.38.]

Graphs credit: Ezgi Mercan

Page 40:

Evaluation of Inversions: Qualitative. Glyphs vs. HOGgles


Page 42:

Human+HOG detector

Get insight into the performance of HOG with the perfect learning algorithm (people).

Large human experiment consisting of: MTurk workers. Dataset: top detections from DPM on PASCAL VOC 2007. 5000 windows per category. 20% are true positives. 25 votes per window.
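
One plausible way to turn the votes into a detector score is to use the fraction of workers who said "object" for each window and then rank windows by that fraction. The sketch below does this and computes average precision; the aggregation rule, the function name, and the toy votes are all assumptions, not the procedure reported in the paper.

```python
import numpy as np

def average_precision(scores, labels):
    """Average precision of a ranking: mean precision at each true positive."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ranked = np.asarray(labels, dtype=float)[order]
    precision_at_k = np.cumsum(ranked) / np.arange(1, len(ranked) + 1)
    n_pos = ranked.sum()
    return float((precision_at_k * ranked).sum() / n_pos) if n_pos > 0 else 0.0

# Hypothetical toy data: 5 windows, 25 binary votes each, 1 true positive (20%).
votes = np.zeros((5, 25), dtype=int)
votes[0, :20] = 1                      # true positive: 20 of 25 workers say "object"
votes[1, :5] = 1                       # false positive that fools 5 workers
labels = np.array([1, 0, 0, 0, 0])

human_scores = votes.mean(axis=1)      # fraction of positive votes per window
ap = average_precision(human_scores, labels)
```

The same `average_precision` function can be applied to the DPM scores of the same windows, which gives a like-for-like comparison between the human "classifier" and the learned one.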

Page 43:

Human+HOG detector: Results

Page 44:

Paper Conclusions

DPM is operating very close to the performance limit of HOG.

HOG may be too lossy a descriptor for high-performance object detection.

The features we use are the ones to blame in current object detection algorithms.

To advance to the next level of recognition, features that capture finer details and higher-level information need to be built.

Page 45:

Writing Quality of the Paper

Pros: Well written. Well referenced. Novel solution. Large and detailed human experiment. Website.

Cons: Details/examples on the baseline algorithms are lacking. The chosen algorithm settings are not given. Some conclusions on the Human+HOG experiment are questionable.

Page 46:

Future development

Applying the visualization to other descriptors.

Color reconstruction of HOG features.

Developing new features that are more discriminative.

Developing algorithms that better model the interaction of the simple atoms of the image.

Page 47:

THANK YOU! QUESTIONS?

Page 48:

Links: PASCAL 2 challenge; Deformable Part Model; HOGgles