Soma Biswas
Department of Electrical Engineering
IISc, Bangalore
Face Detection and Recognition
Biometrics
Biometrics in the early 1900s
Bertillon’s system of bodily measurements, called anthropometry, as used in the United States in the early 1900s
Alphonse Bertillon (image source: Wikipedia)
Biometrics Now
Automatic methods of recognizing / validating the identity of a person based on physiological or behavioral characteristic(s)
Why Face?
Natural, non-intrusive, easy to acquire and use
Ability to acquire the biometric signature from non-cooperating subjects
Applications
Usefulness
What is Face Recognition?
Figure: an unknown probe face ("Mr. ?") is matched to a known identity ("Mr. A")
Slide Credit: Prof. Rama Chellappa, UMD
Different FR tasks
Source: S. Kevin Zhou, Ph. D. Dissertation. Univ. of Maryland.
Face Detection: First Step In Face Recognition
First step in any automatic face recognition system
Most FR algorithms assume the face has already been detected and accurately cropped
Accurate localization is required for good recognition
Pipeline: Input image -> Detected face -> Extract features from detected face -> Identify who the person is
Problem Definition
Given an arbitrary image, the goal is to determine if there are any faces in the image and return the image location and extent of each face.
Related Problems:
1) Facial feature detection: detect the presence and location of features, such as eyes, nose, etc.
2) Face tracking: follow a face through an image sequence
3) Face localization: find the position of the face, assuming exactly one face in the image
Challenges
Non-rigid object: Pose, expression variations
Illumination variations
Variations in scale, location, and orientation
Presence/absence of glasses, beard, moustache
Occlusion, e.g. in group images
Viola Jones Face Detector - Basic Idea
Two class problem – face / non-face
Pipeline: Training data (face & non-face) -> Feature Extraction -> Feature Selection -> Cascade of Classifiers
Feature Extraction: 1) Simple 2) Easy to compute 3) Large number
Feature Selection: select the useful ones
Cascade of Classifiers: 1) Efficiency, speed 2) Reduce number of non-face subwindows
P. A. Viola, M. J. Jones. “Robust Real-Time Face Detection”, International Journal of Computer Vision 57(2), 137-154, 2004 (2196 citations, Google Scholar)
Feature Extraction
Rectangular features (Haar-like features) -> simple, easy to compute -> efficient detector
Feature value: sum of the pixels in the white areas subtracted from the sum of the pixels in the black areas
Two-rectangle feature
Three-rectangle feature
Four-rectangle feature
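As a minimal sketch of the feature value described above, the following computes a two-rectangle feature by summing pixels directly. The function names, the toy image, and the choice of which half counts as "white" or "black" are illustrative assumptions, not taken from the slides.

```python
def region_sum(img, top, left, height, width):
    """Sum of the pixel values inside a rectangle, computed directly."""
    return sum(img[r][c]
               for r in range(top, top + height)
               for c in range(left, left + width))


def two_rect_feature(img, top, left, height, width):
    """Horizontal two-rectangle feature.

    Assumption: the left half is the 'white' area and the right half the
    'black' area. Value = sum(black) - sum(white).
    """
    half = width // 2
    white = region_sum(img, top, left, height, half)
    black = region_sum(img, top, left + half, height, half)
    return black - white


# Toy 4x4 image: low values on the left, high values on the right.
image = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]
value = two_rect_feature(image, 0, 0, 4, 4)  # strong response to the vertical edge
```

On a uniform image the two sums cancel and the feature is zero, which is why these features respond to local contrast such as edges and bars.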
Integral Image
Features computed very rapidly using integral image
Integral image at (x,y): cumulative sum of all pixels above and to the left of (x,y), inclusive, with the origin (0,0) at the top-left corner
Efficient Computation of Rectangular Value
Integral image can be computed in one pass
The sum over any rectangle can then be computed with four array references
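The one-pass construction and the four-reference rectangle sum can be sketched as follows. This is an illustrative pure-Python version; the function names are hypothetical, and the integral image is padded with an extra row and column of zeros (an implementation convenience, assumed here) so that no boundary checks are needed.

```python
def integral_image(img):
    """Build a zero-padded integral image in a single pass.

    ii[r][c] holds the sum of img over rows 0..r-1 and columns 0..c-1.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current row
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum over a rectangle using four lookups in the integral image."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
total = rect_sum(ii, 0, 0, 3, 3)         # whole image
lower_right = rect_sum(ii, 1, 1, 2, 2)   # pixels 5, 6, 8, 9
```

The cost of `rect_sum` does not depend on the rectangle size, which is what makes evaluating many rectangular features per subwindow affordable.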
Rectangular Features
Very simple, efficient
Coarse - sensitive to vertical, horizontal or diagonal orientations
Large number of features: each feature type is scaled and shifted -> rich representation
For a 24 x 24 detection window, the number of possible features is ~160,000
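The order of magnitude quoted above can be checked by enumeration. The sketch below assumes five base shapes (two orientations each of the two-rectangle and three-rectangle features, plus one four-rectangle feature), each scaled in integer multiples of its base size and shifted to every position that fits; this choice of shapes is an assumption, not stated on the slide.

```python
def count_features(window=24,
                   base_shapes=((2, 1), (1, 2), (3, 1), (1, 3), (2, 2))):
    """Count scaled and shifted rectangle features inside a square window.

    base_shapes are (width, height) of the smallest instance of each
    feature type (assumed set of five types).
    """
    total = 0
    for base_w, base_h in base_shapes:
        # Widths and heights grow in multiples of the base size.
        for w in range(base_w, window + 1, base_w):
            for h in range(base_h, window + 1, base_h):
                # Number of positions where a w x h feature fits.
                total += (window - w + 1) * (window - h + 1)
    return total


n_features = count_features()  # 162336 under the assumptions above
```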
Feature Selection
No. of possible features ~ 160,000 (24 X 24 region)
Computing the entire feature set for every subwindow is impractical
Questions:
1) Can a good classifier be created using a small subset of features?
2) How to select this subset?
Training set (positive and negative images) -> Features -> Machine learning algorithm
Adaboost
Adaboost used for selecting features and training classifier
Boosting: classification scheme that works by combining weak learners into a more accurate classifier
Step 1: Given a set of weak classifiers, pick the best one
Weak Classifier 1
Adaboost - Step 2
Reweight the data so that the inputs where the first classifier has made errors get larger weights
Adaboost – Step 3
Now choose the second weak classifier based on the weighted data
Adaboost – Step 4
Reweight the data according to the errors, choose the next classifier, and so on
Adaboost – Final Step
Final classifier is a weighted combination of all the weak classifiers
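The steps above can be written compactly in the standard discrete AdaBoost formulation. This is a sketch under assumptions: labels y_i are taken in {-1, +1}, h_t is the weak classifier chosen in round t, w_i^{(t)} are the example weights, and Z_t is a normalizer that makes the weights sum to one. The Viola-Jones paper uses a closely related variant with labels in {0, 1}.

```latex
\[
\epsilon_t = \sum_{i} w_i^{(t)} \, \mathbf{1}\!\left[ h_t(x_i) \neq y_i \right],
\qquad
\alpha_t = \tfrac{1}{2} \ln \frac{1 - \epsilon_t}{\epsilon_t}
\]
\[
w_i^{(t+1)} = \frac{w_i^{(t)} \exp\!\left( -\alpha_t \, y_i \, h_t(x_i) \right)}{Z_t},
\qquad
H(x) = \operatorname{sign}\!\left( \sum_{t=1}^{T} \alpha_t \, h_t(x) \right)
\]
```

Misclassified examples have y_i h_t(x_i) = -1, so their weights grow, and weak classifiers with lower weighted error receive a larger vote alpha_t in the final combination.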
Boosting for Face Detection
Greedy feature selection process
Weak learners - based on rectangular features
In each round of boosting
Evaluate each feature on the examples
Select the best threshold, so that the minimum number of (weighted) examples are misclassified
Choose classifier: best feature and threshold combination
Reweight examples
Using more features improves classification accuracy
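The boosting round described above can be sketched with threshold classifiers ("stumps") on precomputed feature values. This is an illustrative discrete AdaBoost with labels in {-1, +1}, not the authors' implementation; all names and the toy data are hypothetical, and the exhaustive threshold search is written for clarity rather than speed.

```python
import math


def best_stump(feature_values, labels, weights):
    """Find the feature, threshold and polarity with the lowest weighted error.

    feature_values[j][i] is feature j evaluated on example i.
    """
    best = None
    for j, values in enumerate(feature_values):
        for threshold in sorted(set(values)):
            for polarity in (1, -1):
                preds = [polarity if v >= threshold else -polarity for v in values]
                error = sum(w for p, y, w in zip(preds, labels, weights) if p != y)
                if best is None or error < best[0]:
                    best = (error, j, threshold, polarity)
    return best


def adaboost(feature_values, labels, rounds):
    """Greedy feature selection: one stump (one feature) per round."""
    n = len(labels)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, j, threshold, polarity = best_stump(feature_values, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, j, threshold, polarity))
        # Reweight: misclassified examples gain weight, the rest lose weight.
        preds = [polarity if v >= threshold else -polarity for v in feature_values[j]]
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, labels, preds)]
        z = sum(weights)
        weights = [w / z for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of the selected stumps on one feature vector x."""
    score = sum(a * (pol if x[j] >= thr else -pol) for a, j, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy data: feature 0 is informative, feature 1 is constant (useless).
labels = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
feature_values = [list(range(10)), [5] * 10]
examples = [[f[i] for f in feature_values] for i in range(len(labels))]

one_round = adaboost(feature_values, labels, rounds=1)
three_rounds = adaboost(feature_values, labels, rounds=3)
errors_1 = sum(predict(one_round, x) != y for x, y in zip(examples, labels))
errors_3 = sum(predict(three_rounds, x) != y for x, y in zip(examples, labels))
```

On this toy set no single stump separates the classes, but the weighted combination of three stumps does, which illustrates why adding features improves accuracy.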
Features Selected by Boosting
Example Classifier
A classifier with 200 features learned by Adaboost
Detection rate = 95% on test data, with a false positive rate of 1 in 14,084
Time: 0.7 sec for a 384 x 288 image
Adding more features increases computation time
Attentional Cascade: Cascade of Classifiers
Improve detection, reduce computation time
In an image, the majority of subwindows do not contain a face
Simple classifiers reject the majority of subwindows before more complex classifiers are used
Most non-face regions are rejected at an early stage
Early-stage classifiers deal with easy instances, while deeper classifiers face more difficult cases.
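The early-rejection behaviour described above can be sketched as a loop over stages that stops at the first failure. The stage functions and thresholds below are hypothetical stand-ins chosen for illustration; in the actual detector each stage is a boosted classifier over rectangular features.

```python
def cascade_classify(stages, window):
    """Run a window through the cascade.

    stages is a list of (score_function, threshold) pairs, ordered from
    cheapest to most expensive. Returns (is_face, stages_evaluated).
    """
    for index, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, index  # rejected: later stages are never evaluated
    return True, len(stages)


# Hypothetical stage scores on a flattened toy window.
def mean_intensity(window):
    return sum(window) / len(window)


def left_right_contrast(window):
    half = len(window) // 2
    return abs(sum(window[:half]) - sum(window[half:]))


stages = [(mean_intensity, 2.0), (left_right_contrast, 10.0)]

dark = cascade_classify(stages, [0, 1, 0, 1])       # rejected by stage 1
flat = cascade_classify(stages, [5, 5, 5, 5])       # rejected by stage 2
contrast = cascade_classify(stages, [1, 1, 9, 9])   # passes both stages
```

Because most subwindows leave at the first or second stage, the average cost per subwindow stays close to the cost of the cheapest stages.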
Cascade of Classifiers: Details
Final detector: 38 cascade layers, total 6060 features
Classifier 1: 2 features, rejects ~50% non-faces while detecting ~100% faces
Classifier 2: 10 features, rejects ~80% non-faces
In each layer, the number of features was increased until the false positive rate was sufficiently reduced
More layers were added until the overall false positive rate was ~zero while the correct detection rate remained high
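Since a window must pass every layer to be reported, the overall detection and false positive rates of a cascade are the products of the per-layer rates, assuming the layers behave independently. The sketch below uses hypothetical per-layer rates (10 layers, each with 99% detection and 30% false positives), not the figures of the detector described above.

```python
import math


def cascade_rates(stage_detection_rates, stage_false_positive_rates):
    """Overall rates of a cascade as products of the per-stage rates."""
    detection = math.prod(stage_detection_rates)
    false_positive = math.prod(stage_false_positive_rates)
    return detection, false_positive


# Hypothetical cascade: 10 stages, each keeps 99% of faces
# and lets 30% of non-faces through.
overall_detection, overall_false_positive = cascade_rates([0.99] * 10, [0.30] * 10)
```

Each layer can therefore be individually weak at rejecting non-faces, provided it almost never rejects a true face.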
Training Data
4916 hand-labeled faces
Scaled and aligned to a resolution of 24 x 24
Faces downloaded from the World Wide Web
Faces roughly aligned
Detection Results
Faces are detected at multiple scales
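Multi-scale detection is commonly implemented by sliding a window over the image at several sizes. The sketch below enumerates such windows; the scale step of 1.25, the shift of one pixel at the base scale, and the function name are assumptions for illustration, not details given on the slide.

```python
def sliding_windows(image_width, image_height,
                    base_size=24, scale_step=1.25, shift=1.0):
    """Yield (left, top, side) for square windows at multiple scales.

    The window starts at base_size and grows by scale_step until it no
    longer fits; the step between windows grows with the scale.
    """
    size = float(base_size)
    while size <= min(image_width, image_height):
        side = int(round(size))
        step = max(1, int(round(shift * size / base_size)))
        for top in range(0, image_height - side + 1, step):
            for left in range(0, image_width - side + 1, step):
                yield left, top, side
        size *= scale_step


windows = list(sliding_windows(48, 36))
window_sizes = sorted({side for _, _, side in windows})
```

Each yielded window would be passed to the classifier; rescaling the rectangular features rather than the image keeps the integral image reusable across scales.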
More Detection Results
Failure Modes
Faces tilted more than ±15 degrees in plane and about ±45 degrees out of plane (towards profile)
Detector trained on roughly aligned, frontal, upright faces
Harsh backlighting: face dark, background light
Normalization improves the detection rate
Computational cost increases greatly
Significant occlusion, especially of the eyes
A face with a covered mouth is usually still detected
Evaluation Metrics
Detection rate: ratio of the number of faces correctly detected to the number of faces determined by a human.
An image region identified as a face by a classifier is considered correctly detected if it covers more than a certain percentage of a face in the image.
Two types of errors:
False negatives: faces are missed, causing low detection rates.
False positives: an image region is declared to be a face, but it is not.
Both must be reported, since parameters can be tuned to increase the detection rate at the cost of also increasing the number of false detections.
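The metrics above can be computed by matching detections to annotated faces. The sketch below uses intersection-over-union with a 0.5 threshold as the overlap criterion; both the criterion and the threshold are assumptions, since the slide only says "a certain percentage". Boxes are (left, top, right, bottom) tuples and all names are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes (left, top, right, bottom)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def evaluate(detections, ground_truth, min_overlap=0.5):
    """Return (detection_rate, false_positives, false_negatives)."""
    matched = set()
    false_positives = 0
    for det in detections:
        # Find the best still-unmatched annotated face for this detection.
        best_idx, best_overlap = None, 0.0
        for idx, gt in enumerate(ground_truth):
            overlap = iou(det, gt)
            if idx not in matched and overlap > best_overlap:
                best_idx, best_overlap = idx, overlap
        if best_idx is not None and best_overlap >= min_overlap:
            matched.add(best_idx)
        else:
            false_positives += 1
    false_negatives = len(ground_truth) - len(matched)
    rate = len(matched) / len(ground_truth) if ground_truth else 0.0
    return rate, false_positives, false_negatives


faces = [(0, 0, 10, 10), (20, 20, 30, 30)]
found = [(1, 1, 11, 11), (50, 50, 60, 60)]
detection_rate, n_false_pos, n_false_neg = evaluate(found, faces)
```

Here one face is found, one is missed, and one detection lands on background, so both error types appear in the same result.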
Evaluation of Viola Jones Detector
MIT+CMU frontal face test data
130 images, 507 labeled frontal faces
15 times faster than [3] and 600 times faster than [4]
Areas where this idea has been used
Car detection
Pedestrian detection
References:
M. J. Jones, D. Snow, "Pedestrian detection using boosted features over many frames", ICPR 2008.
S. Kluckner, G. Pacher, H. Grabner, H. Bischof, J. Bauer, "A 3D Teacher for Car Detection in Aerial Images", ICCV 2007.
A. Torralba, K. P. Murphy, W. T. Freeman, "Sharing Visual Features for Multiclass and Multiview Object Detection", PAMI 2007.
Knowledge-Based Method – Basic Idea
Rule-based methods encode human knowledge of what constitutes a typical face.
A face has two eyes that are symmetric to each other, a nose, and a mouth.
Usually, the rules capture the relationships between facial features, such as relative distances and positions.
Facial features are extracted and face candidates are identified based on the rules.
Knowledge-Based Method: Summary
Advantages:
Easy to come up with simple rules to describe the features of a face and their relationships
Works well for frontal faces in uncluttered scenes
Drawbacks:
Difficult to translate human knowledge into well-defined rules
Too strict: fail to detect faces that do not pass all the rules.
Too general: many false positives.
Difficult for non-frontal poses since it is challenging to enumerate all possible cases.
Template Matching
Store a template
Predefined: based on edges or regions
Deformable: based on facial contours (e.g. Snakes)
Templates are usually hand-coded (not learned)
Use correlation to locate faces
- Relative brightness of different face parts does not change
- Use relative pair-wise ratios of the brightness of facial regions
- Eyes are usually darker than the surrounding face
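Locating a face by correlation with a stored template can be sketched with normalized cross-correlation, which is insensitive to uniform changes in brightness and contrast. This is an illustrative pure-Python version on a toy image; the function names and data are hypothetical, and a real template would be a face-sized intensity or edge pattern.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equal-size 2D lists, in [-1, 1]."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p) *
                    sum((b - mean_t) ** 2 for b in t))
    return num / den if den > 0 else 0.0  # flat patches score 0


def match_template(image, template):
    """Slide the template over the image; return the best (top, left) and score."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, None
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            patch = [row[left:left + tw] for row in image[top:top + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos, best_score


template = [[9, 1],
            [1, 9]]
image = [
    [5, 5, 5, 5],
    [5, 5, 9, 1],
    [5, 5, 1, 9],
    [5, 5, 5, 5],
]
position, score = match_template(image, template)
```

The search above covers only translation at a single scale, which reflects the drawback of template matching: each additional pose or scale needs its own template or search dimension.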
Template Matching: Summary
Advantage:
Simple to implement.
Drawback:
Difficult to enumerate templates for different poses
Cannot effectively deal with variation in scale, pose, and shape.
Face Detection in Video
Advantages: An easier problem than detection in still images
Use all available cues: motion (frame differencing, background modeling)
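Frame differencing, one of the motion cues mentioned above, can be sketched as a per-pixel threshold on the change between consecutive frames, followed by a bounding box around the changed pixels. The threshold value and function names are assumptions; the resulting region would only narrow down where a face detector needs to search.

```python
def motion_mask(prev_frame, curr_frame, threshold=20):
    """Mark pixels whose intensity changed by more than the threshold."""
    return [[abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around moving pixels, or None."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, moving in enumerate(row) if moving]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


previous = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
current = [
    [10, 10, 10, 10],
    [10, 90, 95, 10],
    [10, 10, 92, 12],
]
mask = motion_mask(previous, current)
candidate_region = bounding_box(mask)  # region to hand to the face detector
```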
Detecting Faces in Unconstrained Setting
FDDB: A benchmark for Face Detection in Unconstrained Settings. Technical Report UM-CS-2010-009, Dept. of Computer Science, University of Massachusetts, Amherst. 2010.
Annotations for 5171 faces from 2845 images (http://vis-www.cs.umass.edu/fddb/)
Face Recognition
The ‘Thatcher Illusion’
Source: Thompson, P. (1980). Margaret Thatcher: A new illusion. Perception, 9(4), 483-484
Face Recognition: Challenges
Different Modalities
Image Formation Model
Figure labels: Light source, Surface normals, Albedo/Texture, Intensity image, Illumination-insensitive signature
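One common way to relate the quantities named above is the Lambertian reflectance model. It is stated here as an assumption, since the slide itself does not give the formula: \rho is the albedo, \mathbf{n} the unit surface normal, and \mathbf{s} the light source direction scaled by its intensity.

```latex
\[
I(x, y) = \rho(x, y) \, \max\!\left( \mathbf{n}(x, y)^{\top} \mathbf{s},\; 0 \right)
\]
```

Under this model the albedo does not depend on the light source, which is why it can serve as an illumination-insensitive signature once the shading term has been estimated and factored out.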
Shape From Shading (SfS)
Figure panels: Input image, Albedo estimate, Normalized image, Shape estimate, Albedo-mapped shape
Shape vs Texture
Overview of Morphable Model Approach
Ref: V. Blanz and T. Vetter, A Morphable Model for the Synthesis of 3D Faces, SIGGRAPH 1999
Applications of FR across Aging
Homeland security
Missing individuals
Multimedia
Example: Fujitaka’s ‘Child Check System’ (http://www.digitalworldtokyo.com/index.php/digital_tokyo/articles/face_recognition_machines_to_stop_under_age_smoking_not/)
FR across Aging
Facial Aging (Shape vs. Texture)
Facial aging effects are manifested in different forms during different ages
Facial aging can be described as a problem of characterizing facial shape and facial texture as functions of time
http://www.psychology.ecu.edu.au/photoaging/pages/look.html
Photoaging, i.e. skin aging due to solar radiation
Drug use, smoking & stress
Advantages of Video FR
Compared to single-image FR, multiple images of the subject are available
Can integrate information temporally across the video sequence
Dynamic information can help
Typical Video Based Face Recognition
A typical two-stage system
Tracking module
Recognition module
Head pose may vary significantly
Robust to misalignment errors
Robust to partial occlusion
Large amounts of data – data management/storage
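Different Ways of Fusion
Feature level fusion: combine the features, then compute a single score and decision
Score level fusion: compute a score per input, then combine the scores into one decision
Decision level fusion: make a decision per input, then combine the decisions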
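Score-level and decision-level fusion can give different answers on the same data, as the sketch below shows for per-frame match scores. The sum rule, the majority vote, the identity names, and the scores are all illustrative assumptions; feature-level fusion is omitted because it depends on the feature representation.

```python
from collections import Counter


def score_level_fusion(frame_scores):
    """Sum the per-frame scores for each identity, then decide once."""
    totals = {}
    for scores in frame_scores:
        for identity, score in scores.items():
            totals[identity] = totals.get(identity, 0.0) + score
    return max(totals, key=totals.get)


def decision_level_fusion(frame_scores):
    """Decide per frame, then take a majority vote over the decisions."""
    votes = Counter(max(scores, key=scores.get) for scores in frame_scores)
    return votes.most_common(1)[0][0]


# Hypothetical similarity scores of three video frames against a gallery.
frame_scores = [
    {"A": 0.90, "B": 0.10},  # confident match to A
    {"A": 0.45, "B": 0.55},  # weak preference for B
    {"A": 0.40, "B": 0.60},  # weak preference for B
]
by_score = score_level_fusion(frame_scores)
by_decision = decision_level_fusion(frame_scores)
```

Score-level fusion keeps the confidence of each frame, so one strong match can outweigh two weak ones, whereas decision-level fusion discards that information before combining.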
Use Temporal Information
Video Sequence
Pose Manifolds
Transition between the pose manifolds – utilizes temporal information
Compute distance of test data from gallery appearance manifold
Training
Testing
K. C. Lee, J. Ho, M. H. Yang and D. Kriegman, Video-based face recognition using probabilistic appearance manifolds, CVPR 2003
Matching Faces across Plastic Surgery
G. Aggarwal, S. Biswas, P.F. Flynn, K.W. Bowyer: A sparse representation approach to face matching across plastic surgery, WACV 2012.
Matching Faces of Identical Twins
http://www.twinsdays.org/