Authors: Antonio Torralba, Kevin P. Murphy, William T. Freeman, and Mark A. Rubin

An opposition to:

Context-Based Vision System for Place and Object Recognition
Contextual Models for Object Detection Using BRFs

Authors: Antonio Torralba, Kevin P. Murphy, William T. Freeman, and Mark A. Rubin

Opponent: Carlos Vallespi


Post on 23-Jan-2016


Page 1:

An opposition to:

Context-Based Vision System for Place and Object Recognition
Contextual Models for Object Detection Using BRFs

Authors: Antonio Torralba, Kevin P. Murphy, William T. Freeman, and Mark A. Rubin

Opponent: Carlos Vallespi

Page 2:

Paper claims

Claims to recognize 63 different locations.

Claims to categorize new environments.

Claims to help object recognition by suggesting presence and location.

Page 3:

Place recognition

Temporal information is available, so the HMM helps the classifier a lot.

Given the current state, only 2-3 next locations are possible at any time.

Is the classifier really doing anything?

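To make the point about the HMM concrete, here is a minimal sketch of one filtering step. It assumes a hypothetical chain of 63 places in which the camera either stays put or moves to an adjacent place; the transition values and the "flat" classifier are illustrative assumptions, not taken from the paper.

```python
def forward_step(belief, transition, likelihood):
    """One HMM filtering step: predict with the transitions, then reweight by the likelihood."""
    n = len(belief)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def chain_transitions(n, stay=0.5):
    """Hypothetical layout: from each place you can stay or step to a neighbouring place."""
    transition = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n]
        transition[i][i] = stay
        for j in neighbours:
            transition[i][j] = (1.0 - stay) / len(neighbours)
    return transition


n_places = 63
transition = chain_transitions(n_places)

# Start with certainty about the current place (index 10 is arbitrary).
belief = [0.0] * n_places
belief[10] = 1.0

# A classifier that carries no information at all: equal likelihood everywhere.
flat_likelihood = [1.0] * n_places

posterior = forward_step(belief, transition, flat_likelihood)
support = [j for j, p in enumerate(posterior) if p > 0]
print(support)  # [9, 10, 11]
```

Even with an uninformative classifier, the posterior is confined to 3 of the 63 places, which is why the temporal model alone can account for much of the reported accuracy.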

Page 4:

Simple place recognition with SIFT

Database

Page 5:

Simple place recognition with SIFT

Test DB

Page 6:

Comparing with SIFT

74 matches

Page 7:

Comparing with SIFT

Some correct matches

Page 8:

Comparing with SIFT

Correct non-matches

Page 9:

Comparing with SIFT

No incorrect mismatches.

Just one weak match (22 matches):

With 9 locations in the database, accuracy on the test set was 100%.
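The slides do not say how the SIFT matches were counted, so the following is only a sketch of one common way to do it: nearest-neighbour matching with Lowe's ratio test, picking the database place with the most matches. The toy 2-D descriptors (real SIFT descriptors are 128-D), the `ratio` and `min_matches` values, and the place names are all assumptions for illustration.

```python
import math


def ratio_test_matches(query, reference, ratio=0.8):
    """Count query descriptors whose nearest reference descriptor passes the ratio test."""
    count = 0
    for q in query:
        dists = sorted(math.dist(q, r) for r in reference)
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            count += 1
    return count


def recognize_place(query, database, min_matches=2):
    """Return (place, match_count) for the best entry, or (None, count) if the match is too weak."""
    best_place, best_count = None, 0
    for place, reference in database.items():
        count = ratio_test_matches(query, reference)
        if count > best_count:
            best_place, best_count = place, count
    if best_count < min_matches:
        return None, best_count
    return best_place, best_count


# Hypothetical database: one set of descriptors per known place.
database = {
    "corridor": [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)],
    "office": [(50.0, 50.0), (60.0, 50.0), (50.0, 60.0)],
}

seen_before = [(0.1, 0.0), (10.0, 0.2), (0.0, 9.9)]  # close to the corridor descriptors
never_seen = [(200.0, 200.0), (210.0, 190.0)]        # far from everything

print(recognize_place(seen_before, database))  # ('corridor', 3)
print(recognize_place(never_seen, database))   # (None, 0)
```

The match count plays the role of the numbers quoted on the slides (74 matches, 22 for the weak match), and the threshold is what allows an image of an unknown place to be rejected.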

Page 10:

Scene categorization

The paper claims to categorize 17 previously unseen scenarios.

We have seen other scene categorization methods in the past that also worked well (with up to 13 classes):

Bag-of-words approaches (using textons, for instance).

Histogram-based approaches.

Torralba’s paper (using image frequencies).

They average local features over the image with a sliding window. In fact, this is just a kind of histogram approach (nothing new).

The DB does not seem very generic.

They do not compare with other methods.

It performs poorly, except when the HMM is used.
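For reference, the sliding-window averaging criticized on this slide can be sketched in a few lines. This assumes a single 2-D map of local feature responses; the window size, the step, the toy values, and the nearest-centroid classifier are illustrative assumptions, not the paper's actual pipeline.

```python
def window_means(feature_map, win, step):
    """Average a 2-D map of local feature responses over sliding windows."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - win + 1, step):
        for c in range(0, cols - win + 1, step):
            values = [feature_map[i][j]
                      for i in range(r, r + win)
                      for j in range(c, c + win)]
            pooled.append(sum(values) / len(values))
    return pooled


def nearest_centroid(vector, centroids):
    """Pick the category whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vector, centroids[label])),
    )


# Toy 4x4 response map for one image (hypothetical values).
feature_map = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 4, 4],
    [0, 0, 4, 4],
]

descriptor = window_means(feature_map, win=2, step=2)
print(descriptor)  # [1.0, 0.0, 0.0, 4.0]

# Hypothetical per-category centroids of such descriptors.
centroids = {"corridor": [1, 0, 0, 4], "office": [0, 2, 2, 0]}
print(nearest_centroid(descriptor, centroids))  # corridor
```

The pooled vector is a coarse spatial summary of the responses, which is why it behaves much like a histogram-style global descriptor.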

Page 11:

Object presence and location

Their own images speak for themselves ;)

???

A file cabinet is expected almost everywhere in the image.

Most of the objects that are highly expected to be found do not show up.

Page 12:

Object presence and location

Their own images speak for themselves ;)

Except for the building (and I am sure I could get something similar by averaging all the bounding boxes of buildings), all the others are wrong… even the sky.
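The bounding-box baseline mentioned here is easy to sketch: average the training boxes of one object class into a location prior that ignores the image entirely. The grid size and the three boxes are hypothetical, chosen only to show the computation.

```python
def average_box_prior(boxes, height, width):
    """Fraction of training boxes that cover each cell of a height x width grid."""
    prior = [[0.0] * width for _ in range(height)]
    for top, left, bottom, right in boxes:
        for r in range(top, bottom):
            for c in range(left, right):
                prior[r][c] += 1.0
    n = len(boxes)
    return [[value / n for value in row] for row in prior]


# Hypothetical building boxes as (top, left, bottom, right), bottom/right exclusive.
boxes = [(0, 0, 2, 4), (1, 0, 3, 4), (1, 1, 3, 3)]
prior = average_box_prior(boxes, height=4, width=4)

for row in prior:
    print(["%.2f" % value for value in row])
```

Cells covered by every training box get a prior of 1.0 and cells never covered get 0.0, so the map highlights where buildings tend to appear without looking at the test image at all.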

Page 13:

Conclusions

Place recognition:

It seems to be an easy problem that can be solved by simpler methods without temporal information.

An HMM alone could have done similar work.

Scene categorization:

Suspicious DB.

Only works because of the temporal information.

Object presence and location:

Just does not work.