
Face detection

Slides adapted from Grauman & Leibe’s tutorial
• http://www.vision.ee.ethz.ch/~bleibe/teaching/tutorial-aaai08/

Also see Paul Viola’s talk (video)
• http://www.cs.washington.edu/education/courses/577/04sp/contents.html#DM


Limitations of Eigenfaces

Eigenfaces are cool.

But they’re not great for face detection.

Chief limitations
• not very accurate
• not very fast

To make it work on the camera, we need ~30fps, and near-perfect accuracy.

Rectangle filters

[Figure: a rectangle filter shown as a grid of weights: +1 in one rectangular region, -1 in another, 0 everywhere else.]

P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.

http://research.microsoft.com/en-us/um/people/viola/pubs/detect/violajones_cvpr2001.pdf


Why rectangles?

Answer: very, very fast to compute
• Trick: integral images (aka summed-area tables)

[Figure: integral image.] The value at (x,y) is the sum of the pixels above and to the left of (x,y).
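The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the Viola-Jones paper: the function name integral_image and the 3 x 3 test image are made up for the example, and the sum is taken inclusively (the pixel at (x,y) itself is counted).

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of img[r][c] for all r <= y, c <= x.

    img is a list of equal-length rows of numbers (an assumed toy format).
    """
    height = len(img)
    width = len(img[0]) if height else 0
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(width):
            row_sum += img[y][x]
            above = ii[y - 1][x] if y > 0 else 0  # everything in the rows above
            ii[y][x] = above + row_sum
    return ii


# Hypothetical 3 x 3 image used only to exercise the function.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
```

Each pixel is visited once, so the whole table costs a single pass over the image; the bottom-right entry equals the sum of every pixel.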

Integral images

What’s the sum of pixels in the blue rectangle?

[Figure sequence: the input image beside its integral image, with the rectangle’s four corner values A, B, C, D highlighted one at a time.]

Computing sum within a rectangle

• Let A, B, C, D be the values of the integral image at the corners of a rectangle (D top-left, B top-right, C bottom-left, A bottom-right)

• Then the sum of original image values within the rectangle can be computed as: sum = A – B – C + D

• Only 3 additions/subtractions are required for any size of rectangle!

Lana Lazebnik
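A short Python sketch of the corner formula sum = A – B – C + D, followed by one two-rectangle filter built on it. Assumptions made for the example: the integral image is padded with an extra zero row and column so the formula needs no boundary checks, the function names are invented here, and the filter's sign convention (left half minus right half) is one arbitrary choice.

```python
def padded_integral_image(img):
    """ii[y][x] = sum of img[r][c] for r < y and c < x (extra zero row/column)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    A = ii[bottom][right]  # bottom-right corner
    B = ii[top][right]     # top-right corner
    C = ii[bottom][left]   # bottom-left corner
    D = ii[top][left]      # top-left corner
    return A - B - C + D


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a height x width window (width even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, top + height, left + half)
    right_sum = rect_sum(ii, top, left + half, top + height, left + width)
    return left_sum - right_sum


# Hypothetical 4 x 4 image.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
ii = padded_integral_image(image)
center = rect_sum(ii, 1, 1, 3, 3)            # the 2 x 2 block 6, 7, 10, 11
feature = two_rect_feature(ii, 0, 0, 2, 4)   # top two rows, left half minus right half
```

The cost of rect_sum does not depend on the rectangle's size, which is why a filter made of a few rectangles can be evaluated at any scale in constant time.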

Filter as a classifier

How to convert the filter into a classifier?

[Figure: outputs of a rectangle feature on faces and non-faces.]

Resulting weak classifier: a threshold on the filter output.
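One way to picture such a weak classifier is a threshold plus a polarity on a single filter output. The sketch below is illustrative only: the feature values and labels are made up, the function names are invented, and the search minimises the plain (unweighted) number of mistakes rather than the weighted error used inside boosting.

```python
def weak_classify(value, threshold, polarity):
    """Return +1 (face) if polarity * value > polarity * threshold, else -1."""
    return 1 if polarity * value > polarity * threshold else -1


def fit_stump(values, labels):
    """Try every midpoint between sorted feature values, with both polarities.

    Returns (errors, threshold, polarity) for the setting with the fewest
    training mistakes; the first such setting wins ties.
    """
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    best = None
    for threshold in candidates:
        for polarity in (1, -1):
            errors = sum(
                weak_classify(v, threshold, polarity) != y
                for v, y in zip(values, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    return best


# Made-up filter outputs: faces (+1) tend to score higher than non-faces (-1).
values = [90, 70, 80, 20, 30, 75]
labels = [1, 1, 1, -1, -1, -1]
errors, threshold, polarity = fit_stump(values, labels)
```

On this toy data no single threshold separates the classes, so the best stump still makes a mistake; that is what "weak" means here, and it is the reason several such classifiers are combined.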

Finding the best filters...

Considering all possible filter parameters: position, scale, and type:

180,000+ possible filters associated with each 24 x 24 window

Which of these filter(s) should we use to determine if a window has a face?

Boosting

[Figure sequence. Slide credit: Paul Viola]
• Weak Classifier 1
• Weights increased
• Weak Classifier 2
• Weights increased
• Weak Classifier 3
• Final classifier is a combination of weak classifiers

Boosting: training

• Initially, weight each training example equally

• In each boosting round:
– find the weak classifier with the lowest weighted training error
– raise the weights of training examples misclassified by the current weak classifier

• Final classifier is a linear combination of all weak classifiers
– the weight of each learner increases with its accuracy

• Exact formulas for re-weighting and combining weak classifiers depend on the particular boosting scheme

Slide credit: Lana Lazebnik

Perceptual and Sensory Augmented Computing · Visual Object Recognition Tutorial

AdaBoost Algorithm

Start with uniform weights on training examples {x1,…xn}.

For T rounds:
• Evaluate weighted error for each feature, pick best.
• Re-weight the examples:
– Incorrectly classified -> more weight
– Correctly classified -> less weight

Final classifier is combination of the weak ones, weighted according to error they had.

Freund & Schapire 1995
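The loop above can be sketched as discrete AdaBoost over threshold classifiers. This is one common formulation, chosen for illustration, not necessarily the exact variant used in the slides or in the Viola-Jones paper: the weak classifier's vote is alpha = 0.5 * ln((1 - err) / err), and all names and the toy one-feature dataset are assumptions. The sketch also assumes each feature takes at least two distinct values.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    """+1 if polarity * x[feature] > polarity * threshold, else -1."""
    return 1 if polarity * x[feature] > polarity * threshold else -1


def best_stump(X, y, weights):
    """Exhaustive search for the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        points = sorted({x[feature] for x in X})
        thresholds = [(a + b) / 2 for a, b in zip(points, points[1:])]
        for threshold in thresholds:
            for polarity in (1, -1):
                err = sum(
                    w for x, label, w in zip(X, y, weights)
                    if stump_predict(x, feature, threshold, polarity) != label
                )
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds):
    n = len(X)
    weights = [1.0 / n] * n  # start with uniform weights
    ensemble = []
    for _ in range(rounds):
        err, feature, threshold, polarity = best_stump(X, y, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)  # lower error -> larger vote
        ensemble.append((alpha, feature, threshold, polarity))
        # Misclassified examples are scaled up, correct ones scaled down.
        weights = [
            w * math.exp(-alpha * label * stump_predict(x, feature, threshold, polarity))
            for x, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in ensemble)
    return 1 if score > 0 else -1


# Toy data with one feature; no single threshold classifies it correctly.
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(X, y, rounds=3)
predictions = [predict(ensemble, x) for x in X]
```

On this toy set every individual stump misclassifies at least three examples, yet the weighted vote of three stumps fits all ten, which is the point of the combination step.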


First two features selected

Viola-Jones Face Detector: Results

• Even if the filters are fast to compute, each new image has a lot of possible windows to search.

• How to make the detection more efficient?

Cascading classifiers for detection

• Form a cascade with low false negative rates early on

• Apply less accurate but faster classifiers first to immediately discard windows that clearly appear to be negative

Kristen Grauman
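A minimal Python sketch of the cascade idea: each stage is a (score function, threshold) pair, and a window is dropped the moment any stage scores it below threshold. The two stages and the four-number "windows" are made-up stand-ins, not real boosted classifiers.

```python
def run_cascade(window, stages):
    """Return (accepted, number of stages evaluated) for one window."""
    for index, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, index  # rejected early; later stages never run
    return True, len(stages)


# Hypothetical stages, ordered from cheapest to most expensive.
stages = [
    (lambda w: max(w) - min(w), 10),          # stage 1: is there any contrast?
    (lambda w: sum(w[:2]) - sum(w[2:]), 20),  # stage 2: stand-in for a boosted stage
]

windows = [
    [5, 5, 5, 5],      # flat patch
    [50, 60, 10, 20],  # bright first half, dark second half
    [10, 20, 50, 60],  # the reverse pattern
]
results = [run_cascade(w, stages) for w in windows]
```

The flat patch costs only one stage. Since most windows in a typical image are easy negatives, the average work per window stays close to the cost of the first stage, provided the early thresholds are loose enough that real faces are rarely rejected.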

Viola-Jones detector: summary

• Train with 5K positives, 350M negatives
• Real-time detector using 38-layer cascade
• 6061 features in all layers
• [Implementation available in OpenCV: http://www.intel.com/technology/computing/opencv/]

[Figure: faces and non-faces are used to train a cascade of classifiers with AdaBoost, giving selected features, thresholds, and weights; the trained cascade is then applied to each subwindow of a new image.]

Kristen Grauman


Detecting profile faces?

Can we use the same detector?


Paul Viola, ICCV tutorial


Everingham, M., Sivic, J. and Zisserman, A. "Hello! My name is... Buffy" - Automatic naming of characters in TV video, BMVC 2006. http://www.robots.ox.ac.uk/~vgg/research/nface/index.html

Example using Viola-Jones detector

Frontal faces detected and then tracked, character names inferred with alignment of script and subtitles.

Application: streetview



Consumer application: iPhoto 2009

http://www.apple.com/ilife/iphoto/

Slide credit: Lana Lazebnik

Consumer application: iPhoto 2009

Things iPhoto thinks are faces

Slide credit: Lana Lazebnik

http://www.flickr.com/groups/977532@N24/pool/

What other categories are amenable to window-based representation?


Pedestrian detection

• Detecting upright, walking humans is also possible using the sliding window’s appearance/texture; e.g., SVM with HoG [Dalal & Triggs, CVPR 2005]

Kristen Grauman


Window-based detection: strengths

• Sliding window detection and global appearance descriptors:
– Simple detection protocol to implement
– Good feature choices critical
– Past successes for certain classes

Kristen Grauman


Window-based detection: Limitations

• High computational complexity
– For example: 250,000 locations x 30 orientations x 4 scales = 30,000,000 evaluations!
– If training binary detectors independently, cost increases linearly with the number of classes

• With so many windows, false positive rate better be low
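To make the arithmetic concrete, here is a small Python sketch that reproduces the 30,000,000 figure and counts the square windows a detector would visit in one image. The image size, window size, stride, and scale list are hypothetical, and count_windows is a name invented for this example.

```python
def count_windows(image_w, image_h, window, stride, scales):
    """Count square sliding-window positions over all scales.

    window is the base window side in pixels; at each scale both the window
    side and the stride are multiplied by that scale and rounded.
    """
    total = 0
    for scale in scales:
        size = int(round(window * scale))
        if size > image_w or size > image_h:
            continue  # the window no longer fits in the image
        step = max(1, int(round(stride * scale)))
        nx = (image_w - size) // step + 1  # horizontal positions
        ny = (image_h - size) // step + 1  # vertical positions
        total += nx * ny
    return total


# The example from the bullet above.
evaluations = 250_000 * 30 * 4

# Hypothetical setup: 384 x 288 image, 24 x 24 base window, 1-pixel stride.
single_scale = count_windows(384, 288, 24, 1, [1.0])
multi_scale = count_windows(384, 288, 24, 1, [1.0, 2.0, 4.0])
```

Even a single scale on a small image gives tens of thousands of windows, so a per-window false positive rate that sounds small can still produce many false detections per image.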

Kristen Grauman


Limitations (continued)

• Not all objects are “box” shaped

Kristen Grauman



• Non-rigid, deformable objects not captured well with representations assuming a fixed 2d structure; or must assume fixed viewpoint

• Objects with less-regular textures not captured well with holistic appearance-based descriptions

Kristen Grauman



• If considering windows in isolation, context is lost

[Figure: the sliding window vs. the detector’s view. Figure credit: Derek Hoiem]

Kristen Grauman



• In practice, often entails large, cropped training set (expensive)

• Requiring good match to a global appearance description can lead to sensitivity to partial occlusions

Image credit: Adam, Rivlin, & Shimshoni

Kristen Grauman
