cs.wellesley.edu/~vision/slides/faces.pdf

The Analysis of Faces in Brains and Machines

9.523 Aspects of a Computational Theory of Intelligence

Rafael Reif

stay tuned...

Why is face analysis important for intelligence?

•  Remember/recognize people we’ve seen before

•  Categorization – e.g. gender, race, age, kinship

•  Social communication – emotions/mood, intentions, trustworthiness, competence or intelligence, attractiveness

•  Scene understanding, e.g. direction of gaze suggests focus of attention

Why is face recognition hard?

•  changing pose

•  changing illumination

•  changing expression

•  clutter

•  occlusion

•  aging

Jenkins, White, Van Montfort & Burton, Cognition, 2011

How good are we at face recognition?

Face recognition performance in humans

chance performance

testmybrain.org

Wilmer et al., 2012; Duchaine & Nakayama, 2006

Bruce et al., 1999

Face recognition performance in humans

Which of the 10 photos on the bottom depicts the target face?

•  Viewers are ~ 70% correct

•  Performance degrades with changes in pose, expression

•  Only slight improvement with short video clip of target

Importance of familiar vs. unfamiliar face recognition!

How good are the best machines?

Public databases of face images serve as benchmarks:

Labeled Faces in the Wild (LFW, http://vis-www.cs.umass.edu/lfw): > 13,000 images of celebrities, 5,749 different identities

YouTube Faces Database (YTF, http://www.cs.tau.ac.il/~wolf/ytfaces): 3,425 videos, 1,595 different identities

Private face image datasets:

(Facebook) Social Face Classification dataset: 4.4 million face photos, 4,030 different identities

(Google) 100-200 million face images, ~ 8 million different identities

                     LFW      YTF
Facebook DeepFace    97.4%    91.4%
Google FaceNet       99.6%    95.1%
Human performance    97.5%    89.7%

Machine vision applications of face recognition

surveillance

access control

security, forensics

More applications of face recognition

content-based image retrieval

social media

graphics, HCI

humanoid robots

Aspects of face processing

•  Face detection – find image regions that contain faces

•  Face identification – who is the person?

•  Categorization – gender, age, race

•  Facial expression – mood, emotion

•  Non-verbal social perception and communication

It all began with Takeo Kanade (1973)…

PhD thesis, Picture Processing System by Computer Complex and Recognition of Human Faces

•  Special purpose algorithms to locate eyes, nose, mouth, boundaries of face

•  ~ 40 geometric features, e.g. ratios of distances and angles between features

Eigenfaces for recognition (Turk & Pentland)

Principal Components Analysis (PCA)

Goal: reduce the dimensionality of the data while retaining as much information as possible in the original dataset

PCA allows us to compute a linear transformation that maps data from a high dimensional space to a lower dimensional subspace
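
A minimal sketch of that linear transformation, assuming NumPy and a synthetic data set (random vectors standing in for images). The helper names `pca_fit` and `pca_project` are illustrative, not from the slides.

```python
import numpy as np

def pca_fit(X, k):
    """Return the data mean and the top-k principal directions of X.

    X holds one sample per row (n_samples x n_features).
    """
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of Vt are orthonormal directions,
    # ordered by decreasing singular value (decreasing variance).
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_project(X, mean, components):
    """Map samples to the k-dimensional subspace (a linear transformation)."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
# Synthetic stand-in for image data: 50 samples in 100 dimensions that
# really vary along only 3 hidden directions, plus a little noise.
latent = rng.normal(size=(50, 3))
mixing = rng.normal(size=(3, 100))
X = latent @ mixing + 0.01 * rng.normal(size=(50, 100))

mean, components = pca_fit(X, k=3)
Z = pca_project(X, mean, components)      # low-dimensional codes, shape (50, 3)
X_back = Z @ components + mean            # map back to 100 dimensions
rel_error = np.linalg.norm(X - X_back) / np.linalg.norm(X - mean)
```

Here three numbers per sample recover the 100-dimensional data almost exactly, because the synthetic data was built to lie near a 3-dimensional subspace. Real face images would need more components.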

Typical sample training set…

One or more images per person

Aligned & cropped to common pose, size

Simple background

Sample images from the Yale face database, results from C. deCoro http://www.cs.princeton.edu/~cdecoro/eigenfaces/

Eigenfaces for recognition (Turk & Pentland)

Perform PCA on a large set of training images, to create a set of eigenfaces, Ei(x,y), that span the data set

First components capture most of the variation across the data set, later components capture subtle variations

Each face image F(x,y) can be expressed as a weighted combination of the eigenfaces Ei(x,y):

Ψ(x,y): average face (across all faces)

http://vismod.media.mit.edu/vismod/demos/facerec/basic.html

F(x,y) = Ψ(x,y) + Σi wi*Ei(x,y)
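
A small sketch of this decomposition, assuming NumPy and random 8 x 8 arrays in place of real aligned face images. The sizes and the helper names `weights` and `reconstruct` are assumptions for illustration. Because the eigenfaces are orthonormal, each weight is the inner product of an eigenface with the mean-subtracted image.

```python
import numpy as np

rng = np.random.default_rng(1)
h, w, n_faces, k = 8, 8, 20, 5

# Hypothetical training set: random arrays standing in for aligned faces.
faces = rng.random((n_faces, h, w))

psi = faces.mean(axis=0)                              # average face Ψ(x,y)
centered = (faces - psi).reshape(n_faces, h * w)
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = Vt[:k].reshape(k, h, w)                  # E_i(x,y), orthonormal

def weights(F):
    """w_i = sum over pixels of E_i(x,y) * (F(x,y) - Ψ(x,y))."""
    return np.tensordot(eigenfaces, F - psi, axes=([1, 2], [0, 1]))

def reconstruct(w_vec):
    """Ψ(x,y) + Σ_i w_i * E_i(x,y), using only the first k eigenfaces."""
    return psi + np.tensordot(w_vec, eigenfaces, axes=([0], [0]))

F = faces[0]
w_vec = weights(F)           # k numbers describing this face
approx = reconstruct(w_vec)  # approximation of F from those k numbers
```

With only k eigenfaces the reconstruction is approximate; it becomes exact for training images once all components with nonzero variance are used.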

Representing individual faces

Each face image F(x,y) can be expressed as a weighted combination of the eigenfaces Ei(x,y):

F(x,y) = Ψ(x,y) + Σi wi*Ei(x,y)

Recognition process:

(1)  Compute weights wi for novel face image

(2)  Find image m in face database with most similar weights, e.g.

$\min_m \sum_{i=1}^{k} \left( w_i - w_i^m \right)^2$
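
Step (2) of the recognition process amounts to a nearest-neighbour search in weight space. A minimal sketch, assuming NumPy; the weight vectors below are made-up numbers and `nearest_face` is an illustrative name.

```python
import numpy as np

def nearest_face(w_novel, w_database):
    """Return the index m minimising sum_i (w_i - w_i^m)^2, and that distance."""
    d2 = ((w_database - w_novel) ** 2).sum(axis=1)
    m = int(np.argmin(d2))
    return m, float(d2[m])

# Hypothetical weight vectors (k = 3) for four stored faces.
w_database = np.array([
    [ 1.0,  0.0,  2.0],
    [-1.5,  0.5,  0.0],
    [ 0.2,  3.0, -1.0],
    [ 2.0, -2.0,  0.5],
])
w_novel = np.array([0.0, 2.8, -0.7])   # weights of a novel image

m, d2 = nearest_face(w_novel, w_database)
print(m, round(d2, 2))   # prints: 2 0.17
```

A practical system would also compare the smallest distance against a threshold, so that a face absent from the database is not forced onto its nearest stored identity.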

Changing expressions & lighting

Eigenfaces approach handles changes in facial expression ok…

… but not changes in lighting

(results from C. deCoro)

Face detection: Viola & Jones

Multiple view-based classifiers based on simple features that best discriminate faces vs. non-faces

Most discriminating features learned from thousands of samples of face and non-face image windows

Attentional mechanism: cascade of increasingly discriminating classifiers improves performance

Viola & Jones use simple features

Use simple rectangle features:

Σ I(x,y) in gray area – Σ I(x,y) in white area within 24 x 24 image sub-windows

•  Initially consider 160,000 potential features per sub-window!

•  features computed very efficiently
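
The efficiency comes from the integral image (summed-area table) that Viola & Jones use: once it is built, the sum over any rectangle takes four lookups. A minimal sketch, assuming NumPy; the window contents are random and `two_rect_feature` is one hypothetical feature layout (gray left half, white right half).

```python
import numpy as np

def integral_image(img):
    """ii[r, c] = sum of img[:r, :c], with a leading row and column of zeros."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four lookups, whatever its size."""
    b, r = top + height, left + width
    return ii[b, r] - ii[top, r] - ii[b, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Σ I(x,y) in gray (left half) minus Σ I(x,y) in white (right half)."""
    half = width // 2
    gray = rect_sum(ii, top, left, height, half)
    white = rect_sum(ii, top, left + half, height, half)
    return gray - white

rng = np.random.default_rng(0)
window = rng.integers(0, 256, size=(24, 24))   # stand-in for a 24 x 24 sub-window
ii = integral_image(window)
value = two_rect_feature(ii, top=4, left=6, height=8, width=10)
```

The integral image is computed once per image, so evaluating many rectangle features per sub-window stays cheap.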

Which features best distinguish face vs. non-‐face?

Learn most discriminating features from thousands of samples of face and non-‐face image windows

Learning the best features

x = image window, f = feature, p = +1 or -1, θ = threshold

weak classifier using one feature:
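
With the symbols defined above, a single-feature weak classifier is commonly written as a thresholded feature value, as in Viola & Jones's published formulation. The class labels (1 = face, 0 = non-face) are an assumption here, chosen to match the sample labels used in the training procedure.

```latex
h(x, f, p, \theta) =
\begin{cases}
1 & \text{if } p\, f(x) < p\, \theta \\
0 & \text{otherwise}
\end{cases}
```

The polarity p only selects the direction of the inequality: p = +1 accepts windows whose feature value is below θ, and p = -1 accepts those above it.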

AdaBoost

•  n training samples, equal weights, known classes: (x1,w1,1) … (xn,wn,0)

•  normalize weights

•  find next best weak classifier

•  use classification errors to update weights

•  repeat for τ rounds

•  final classifier

~ 200 features yields good results for “monolithic” classifier
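
The training loop can be sketched as follows, assuming NumPy. This follows the usual Viola & Jones style of AdaBoost (labels in {0, 1}, weights of correctly classified samples multiplied by β = ε/(1 − ε)). The toy data, the exhaustive threshold search and all function names are assumptions for illustration, not the original implementation.

```python
import numpy as np

def best_stump(F, y, w):
    """Search every feature, threshold and polarity for the lowest weighted error.

    F: (n_samples, n_features) feature values; y: labels in {0, 1}; w: weights.
    """
    best = None
    for j in range(F.shape[1]):
        for theta in np.unique(F[:, j]):
            for p in (+1, -1):
                pred = (p * F[:, j] < p * theta).astype(int)
                err = float(np.sum(w * (pred != y)))
                if best is None or err < best[0]:
                    best = (err, j, theta, p)
    return best

def adaboost(F, y, rounds):
    n = len(y)
    w = np.full(n, 1.0 / n)                       # equal initial weights
    stumps, alphas = [], []
    for _ in range(rounds):
        w = w / w.sum()                           # normalize weights
        err, j, theta, p = best_stump(F, y, w)    # next best weak classifier
        err = min(max(err, 1e-10), 1 - 1e-10)     # guard against log(0)
        beta = err / (1.0 - err)
        pred = (p * F[:, j] < p * theta).astype(int)
        w = w * np.where(pred == y, beta, 1.0)    # shrink weights of correct samples
        stumps.append((j, theta, p))
        alphas.append(np.log(1.0 / beta))
    return stumps, np.array(alphas)

def strong_classify(F, stumps, alphas):
    """Final classifier: weighted vote of the weak classifiers."""
    votes = np.zeros(F.shape[0])
    for (j, theta, p), a in zip(stumps, alphas):
        votes += a * (p * F[:, j] < p * theta)
    return (votes >= 0.5 * alphas.sum()).astype(int)

rng = np.random.default_rng(0)
F = rng.normal(size=(200, 3))                 # hypothetical feature values
y = (F[:, 0] + F[:, 1] > 0).astype(int)       # toy labels: 1 = "face"
stumps, alphas = adaboost(F, y, rounds=20)
accuracy = float((strong_classify(F, stumps, alphas) == y).mean())
```

Each round selects one feature, so the number of rounds equals the number of features in the final classifier. Samples that are still misclassified keep their weight and therefore dominate the next round's search.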

“Attentional cascade” of increasingly discriminating classifiers

Early classifiers use a few highly discriminating features, low threshold

•  1st classifier uses two features, removes 50% non-face windows

•  later classifiers distinguish harder examples

•  Increases efficiency

•  Allows use of many more features

→ Cascade of 38 classifiers, using ~6000 features
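
The control flow of such a cascade can be sketched as below, assuming NumPy. The three stage functions and their thresholds are made-up placeholders rather than learned classifiers; only the early-rejection logic is the point.

```python
import numpy as np

def cascade_classify(window, stages):
    """Return (is_face, stages_evaluated).

    A window is rejected at the first stage whose score falls below that
    stage's threshold, so most non-face windows cost only a stage or two.
    """
    for n, (score, threshold) in enumerate(stages, start=1):
        if score(window) < threshold:
            return False, n
    return True, len(stages)

# Hypothetical stages, cheapest first: (score function, threshold).
stages = [
    (lambda w: w[12:].mean() - w[:12].mean(), 0.05),
    (lambda w: w.std(), 0.1),
    (lambda w: w[8:16, 4:20].mean() - w.mean(), 0.1),
]

flat = np.full((24, 24), 0.5)            # uniform patch
split = np.zeros((24, 24))
split[12:] = 1.0                         # dark top half, bright bottom half
blob = split.copy()
blob[8:16, 4:20] = 1.0                   # same, plus a bright central block

print(cascade_classify(flat, stages))    # prints: (False, 1)
print(cascade_classify(split, stages))   # prints: (False, 3)
print(cascade_classify(blob, stages))    # prints: (True, 3)
```

Setting early thresholds low keeps nearly all true faces in play while still discarding a large share of windows, which is what allows the later, more expensive stages to use many more features.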

Training with normalized faces

•  5000 faces

•  many more non-face patches

•  faces are normalized for scale, rotation

•  small variation in pose

Viola & Jones results

With additional diagonal features, classifiers were created to handle image rotations and profile views

Feature based vs. holistic processing

Face Inversion Effect

•  inversion disrupts recognition of faces more than other objects

•  prosopagnosics do not show inversion effect

Composite Face Effect

•  identical top halves seen as different when aligned with different bottom halves

•  when misaligned, top halves perceived as identical

Feature based vs. holistic processing

Which features are more diagnostic?

Whole-Part Effect

Identification of the “studied” face is significantly better in the whole vs. part condition

Test conditions

Eyebrows are important!

View generalization mediated by motion?

Hypothesis: Temporal association is used to link multiple views of a person’s face

12 female faces scanned for 3D shape and visual texture

image sequences were created that morph between two different faces

observers viewed morph sequences, back and forth

same or different person? (shown separated in time)

performance within morph groups was compromised by temporal association

✔

Wallis & Bulthoff, PNAS, 2001

The power of averages (Burton & colleagues)

Improves accuracy in the recognition of famous faces

-  PCA

-  commercial system

-  human experiments

average “texture”

average “shape”

Faces everywhere...
