Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection





Joseph J. Lim (MIT), C. Lawrence Zitnick (MSR), Piotr Dollár (MSR)

Overview

Method          ODS    OIS    AP     Speed
Human           0.80   0.80   -
Canny           0.60   0.64   0.58   1/15s
gPb             0.73   0.76   0.73   240s
SCG             0.74   0.76   0.77   280s
Sketch tokens   0.73   0.75   0.78   1s

200x faster!
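The "200x faster" callout can be checked against the Speed column. The following is a minimal sketch, assuming the listed values are per-image runtimes in seconds; the names `runtime_s` and `speedup` are illustrative and not part of the released code.

```python
# Runtimes copied from the Speed column of the table above
# (assumed to be seconds per image).
runtime_s = {
    "Canny": 1 / 15,
    "gPb": 240.0,
    "SCG": 280.0,
    "Sketch tokens": 1.0,
}


def speedup(baseline: str, method: str = "Sketch tokens") -> float:
    """Return how many times faster `method` runs than `baseline`."""
    return runtime_s[baseline] / runtime_s[method]


for name in ("gPb", "SCG"):
    print(f"Sketch tokens vs {name}: {speedup(name):.0f}x faster")
```

Under that assumption, both contour detectors with comparable accuracy (gPb and SCG) are more than 200 times slower than Sketch tokens.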

Sketch Tokens

Goal: learn and detect a local contour-based representation for mid-level features.

Sketch Tokens:
• Local edge structures (e.g. straight lines, t-junctions, y-junctions)
• Discovered from human-generated image sketches

We demonstrate our approach on both top-down and bottom-up tasks.

• State-of-the-art result on contour detection, while 200x faster
• Large improvements on object and pedestrian detection

Method

Contour Detection (BSDS 500)

Object Detection on PASCAL2007

Conclusion

MATLAB code is available on the website

We are given a set of images, I, and the corresponding set of binary contour images, S.

Defining Sketch Tokens

Sketch Tokens are clusters of extracted patches from the binary contour images S.

- Each patch has a fixed size of 35x35, and its center pixel must be on a labeled contour.
- 150 clusters are extracted using K-means on Daisy descriptors computed on binary patches.
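The two steps above (collect contour-centered patches, then cluster them) can be sketched on a toy example. This is an illustration under loud assumptions, not the authors' MATLAB code: it uses 3x3 patches instead of 35x35, clusters raw binary pixels instead of Daisy descriptors, and uses k = 2 instead of 150. The function names are hypothetical.

```python
import random


def extract_patches(sketch, size):
    """Return flattened size x size patches whose center pixel is on a contour."""
    r = size // 2
    h, w = len(sketch), len(sketch[0])
    patches = []
    for y in range(r, h - r):
        for x in range(r, w - r):
            if sketch[y][x]:  # center pixel must lie on a labeled contour
                patches.append([sketch[yy][xx]
                                for yy in range(y - r, y + r + 1)
                                for xx in range(x - r, x + r + 1)])
    return patches


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centers, labels from the last assignment)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Toy binary contour image: one horizontal and one vertical stroke.
sketch = [[0] * 9 for _ in range(9)]
for x in range(9):
    sketch[2][x] = 1
for y in range(4, 9):
    sketch[y][6] = 1

patches = extract_patches(sketch, size=3)
centers, labels = kmeans(patches, k=2)
print(len(patches), "patches; cluster labels:", labels)
```

On this toy image the horizontal-stroke patches and the vertical-stroke patches end up in different clusters, which is the small-scale analogue of discovering token classes such as straight lines and junctions.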

Detecting Sketch Tokens

Given a set of sketch token classes, our goal is to detect them in color images.

Each color patch's ground-truth class is either one of the Sketch Token classes or the background class.
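One plausible reading of this labeling rule is sketched below. It assumes that a patch whose center pixel is off-contour is background and that any other patch takes the class of its nearest token cluster center; the constant `BACKGROUND`, the function `ground_truth_class`, and the toy 3x3 centers are hypothetical.

```python
BACKGROUND = 0  # hypothetical id for the background class


def ground_truth_class(sketch_patch, token_centers):
    """Class for the color patch aligned with a flattened binary sketch patch.

    Returns BACKGROUND when the center pixel is off-contour, otherwise
    1 + the index of the nearest token center (squared Euclidean distance).
    """
    center_pixel = sketch_patch[len(sketch_patch) // 2]
    if not center_pixel:
        return BACKGROUND
    dists = [sum((p - c) ** 2 for p, c in zip(sketch_patch, center))
             for center in token_centers]
    return 1 + dists.index(min(dists))


# Two toy 3x3 token centers: a horizontal and a vertical stroke.
token_centers = [
    [0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 0],
]

print(ground_truth_class([0, 0, 0, 1, 1, 0, 0, 0, 0], token_centers))
print(ground_truth_class([0, 1, 0, 0, 1, 0, 0, 0, 0], token_centers))
print(ground_truth_class([1, 0, 0, 0, 0, 0, 0, 0, 1], token_centers))
```

With 150 token classes this yields 151 possible labels per patch, which matches the 150 + 1 response dimensions used later for object detection.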

We used a random forest classifier with various features (e.g. CIE-LUV intensity, orientation, and self-similarity).
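To make the orientation features concrete, the following shows one common way to build gradient-magnitude and orientation channels from a grayscale image. It is an assumption for illustration only: the poster does not specify how its channels are computed, and `gradient_channels`, the central-difference gradient, and the choice of four bins are all hypothetical.

```python
import math


def gradient_channels(gray, n_bins=4):
    """Return (magnitude, oriented) channels for a 2-D grayscale image.

    oriented[b][y][x] holds the gradient magnitude at (y, x) when the
    gradient orientation falls into bin b, and 0 otherwise. Border pixels
    are left at 0 because central differences are undefined there.
    """
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    oriented = [[[0.0] * w for _ in range(h)] for _ in range(n_bins)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (gray[y][x + 1] - gray[y][x - 1]) / 2.0
            gy = (gray[y + 1][x] - gray[y - 1][x]) / 2.0
            m = math.hypot(gx, gy)
            mag[y][x] = m
            if m > 0:
                theta = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
                b = min(int(theta / math.pi * n_bins), n_bins - 1)
                oriented[b][y][x] = m
    return mag, oriented


# A vertical step edge: intensity changes along x only.
vertical_step = [[0, 0, 1, 1] for _ in range(4)]
mag, oriented = gradient_channels(vertical_step)
print(mag[1][1], oriented[0][1][1])
```

A per-pixel classifier such as a random forest would then read values from channels like these (together with color and self-similarity features) inside each patch.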

INRIA Pedestrian Detection

Method       # channels   miss rate
LUV+M+O      10           17.2%
ST           151          19.5%
ST+LUV+M+O   161          14.7%
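The channel counts and the miss-rate change in the pedestrian-detection table can be verified with a few lines of arithmetic. The sketch assumes that the 151 ST channels are the 150 token classes plus one background class, as stated for the object detector; the variable names are illustrative.

```python
# Channel counts (assumption: ST = 150 token classes + 1 background class).
n_tokens, n_background = 150, 1
baseline_channels = 10  # LUV+M+O row
st_channels = n_tokens + n_background
combined_channels = baseline_channels + st_channels

# Miss rates in percent, copied from the table.
miss_rate = {"LUV+M+O": 17.2, "ST": 19.5, "ST+LUV+M+O": 14.7}
abs_drop = miss_rate["LUV+M+O"] - miss_rate["ST+LUV+M+O"]
rel_drop = abs_drop / miss_rate["LUV+M+O"]

print(f"ST channels: {st_channels}, combined: {combined_channels}")
print(f"Miss rate drop: {abs_drop:.1f} points ({100 * rel_drop:.1f}% relative)")
```

Sketch Tokens alone have a higher miss rate than the baseline channels, but combining the two lowers the miss rate from 17.2% to 14.7%.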

AP per class on PASCAL2007:

Method   plane  bike   bird   boat   bottle  bus    car    cat    chair  cow
HOG      19.7   43.9   2.2    4.8    13.4    36.6   40.2   5.4    10.9   15.7
ST       17.8   41.1   4.8    5.7    11.1    31.9   33.8   5.1    10.8   16.1
HOG+ST   21.9   48.5   6.3    6.4    14.6    41.5   43.3   6.1    15.7   19.2

Method   table  dog    horse  moto   person  plant  sheep  sofa   train  tv
HOG      7.5    2.1    41.9   30.9   23.9    3.4    9.3    14.8   26.9   32.4
ST       7.4    3.1    32.9   27.0   20.9    4.6    8.6    10.4   18.9   26.3
HOG+ST   14.2   3.8    46.1   34.5   30.9    8.1    15.3   18.9   30.3   36.6

We used Sketch Token responses on images (150 Sketch Token dimensions + 1 background dimension) as additional features for the deformable parts model detector.

On average, adding Sketch Token responses to HOG improved AP by 3.8 points.
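The 3.8 AP figure can be reproduced from the HOG and HOG+ST rows of the per-class tables. The snippet below only re-does that arithmetic; the variable names are illustrative.

```python
# Per-class AP copied from the PASCAL2007 tables, in column order:
# plane bike bird boat bottle bus car cat chair cow
# table dog horse moto person plant sheep sofa train tv
hog = [19.7, 43.9, 2.2, 4.8, 13.4, 36.6, 40.2, 5.4, 10.9, 15.7,
       7.5, 2.1, 41.9, 30.9, 23.9, 3.4, 9.3, 14.8, 26.9, 32.4]
hog_st = [21.9, 48.5, 6.3, 6.4, 14.6, 41.5, 43.3, 6.1, 15.7, 19.2,
          14.2, 3.8, 46.1, 34.5, 30.9, 8.1, 15.3, 18.9, 30.3, 36.6]

gains = [b - a for a, b in zip(hog, hog_st)]
mean_gain = sum(gains) / len(gains)

print(f"Mean AP gain over {len(gains)} classes: {mean_gain:.1f}")
print("Every class improved:", all(g > 0 for g in gains))
```

The gain is positive for each of the 20 classes, ranging from 0.7 (cat) to 7.0 (person).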

For pedestrian detection, we added Sketch Token responses to the standard features used in Dollár et al.'s implementation.

[Figure: panels labeled t1 to t9, t14, and t15]