Matthias Wimmer, Ursula Zucker and Bernd Radig
Chair for Image Understanding, Computer Science
Technische Universität München
{ wimmerm, zucker, radig }@in.tum.de
Human Capabilities on Video-based Facial Expression Recognition
2007-09-10 2/10 Technische Universität München Ursula Zucker
Motivation
Facial Expression Recognition
  goal: human-like man-machine communication
  six universal facial expressions [Ekman]: anger, disgust, fear, happiness, sadness, surprise
  minimal muscle activity -> reliable recognition is difficult
  recognition rate of state-of-the-art approaches: ~ 70%
Question
  How reliably do humans recognize facial expressions?
  -> survey to determine human capabilities
The Facial Expression Database
Cohn-Kanade AU-Coded Facial Expression Database
  488 image sequences (each containing 4 to 66 images)
  each sequence shows one of the six universal facial expressions
  no natural facial expressions (simulated ground truth)
  no context information
Description of Our Survey
Execution of the Survey
  participants are shown randomly selected sequences
  250 participants
  5413 annotations -> approx. 11 per sequence
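The "approx. 11 per sequence" figure follows from the counts on this slide. A minimal sketch of the arithmetic, assuming every one of the 488 database sequences received annotations; the variable names are illustrative only.

```python
# Counts taken from the slides.
NUM_SEQUENCES = 488
NUM_PARTICIPANTS = 250
NUM_ANNOTATIONS = 5413

# Average number of annotations per sequence (assumes all 488
# sequences were shown during the survey).
per_sequence = NUM_ANNOTATIONS / NUM_SEQUENCES

# Average number of sequences annotated by one participant.
per_participant = NUM_ANNOTATIONS / NUM_PARTICIPANTS

print(f"{per_sequence:.1f} annotations per sequence")
print(f"{per_participant:.1f} annotations per participant")
```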
Evaluation
Evaluation of the Survey
  no ground truth -> comparison of the annotations to one another
  annotation rate for each sequence and each facial expression
  relative agreement for an expression
  confusion between facial expressions
Comparison to algorithms
  recognition rate
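The annotation rate per sequence and expression can be sketched as follows. This assumes the rate is the fraction of a sequence's annotations that name a given expression, and that each annotation names exactly one expression; the slides do not spell out the definition. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

EXPRESSIONS = ("anger", "disgust", "fear", "happiness", "sadness", "surprise")


def annotation_rates(annotations):
    """Map sequence id -> {expression: fraction of its annotations}.

    `annotations` is an iterable of (sequence_id, expression) pairs.
    """
    counts = defaultdict(Counter)
    for seq_id, expression in annotations:
        counts[seq_id][expression] += 1

    rates = {}
    for seq_id, counter in counts.items():
        total = sum(counter.values())
        rates[seq_id] = {e: counter[e] / total for e in EXPRESSIONS}
    return rates


# Toy annotations (made up, not survey data).
toy = [
    ("s1", "happiness"), ("s1", "happiness"),
    ("s1", "happiness"), ("s1", "surprise"),
    ("s2", "fear"), ("s2", "surprise"),
]
rates = annotation_rates(toy)
print(rates["s1"]["happiness"], rates["s2"]["fear"])
```

Because no ground truth is used, the rates only describe how the annotators agree with one another, not whether they are correct.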
Annotation Rate for Each Sequence
Explanation:
  488 rows, 1 row = 1 sequence
  darker regions denote a higher annotation rate
  sorted by similar annotation
Result:
  happiness: best annotation rates
  surprise and fear: get confused often
  fear: difficult to tell apart
Relative Agreement
Explanation:
  example: annotating the sequences as happiness
  ~ 350 sequences annotated as happiness by nobody
  ~ 50 sequences annotated as happiness by everybody
  well-recognized facial expressions have peaks at “0” and at “1”
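The peaks at "0" and "1" can be made visible by counting sequences per annotation-rate bin. A minimal sketch, assuming one annotation rate per sequence for the expression in question (a value between 0 and 1); the helper name, the bin count, and the toy rates are hypothetical.

```python
def agreement_histogram(rates, bins=10):
    """Count how many sequences fall into each annotation-rate bin.

    `rates` holds one value in [0, 1] per sequence for a single
    expression. A rate of exactly 1.0 is placed in the last bin.
    """
    hist = [0] * bins
    for rate in rates:
        index = min(int(rate * bins), bins - 1)
        hist[index] += 1
    return hist


# Toy rates (made up): three sequences nobody labeled as the expression,
# two that everybody labeled as it, and one split evenly.
toy_rates = [0.0, 0.0, 0.0, 1.0, 1.0, 0.5]
hist = agreement_histogram(toy_rates)
print(hist)
```

A well-recognized expression concentrates its mass in the first and last bins; an expression that is hard to tell apart spreads across the middle bins.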
Confusion Between Facial Expressions
[Equation: definition of the confusion rate between two facial expressions; not recoverable from the transcript]
fear and surprise: high confusion
happiness and disgust: low confusion
confusion rate
            anger  disgust  fear  happiness  sadness  surprise
anger       100%   42%      24%   7%         43%      29%
disgust     42%    100%     33%   6%         19%      25%
fear        24%    33%      100%  11%        16%      44%
happiness   7%     6%       11%   100%       7%       14%
sadness     43%    19%      16%   7%         100%     29%
surprise    29%    25%      44%   14%        29%      100%
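The two highlighted pairs can be read off the confusion table directly. The sketch below only inspects the published percentages; it does not reproduce how the confusion rate itself is computed, and the variable names are illustrative.

```python
from itertools import combinations

EXPRESSIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

# Confusion rates in percent, copied from the table on this slide.
CONFUSION = [
    [100, 42, 24, 7, 43, 29],
    [42, 100, 33, 6, 19, 25],
    [24, 33, 100, 11, 16, 44],
    [7, 6, 11, 100, 7, 14],
    [43, 19, 16, 7, 100, 29],
    [29, 25, 44, 14, 29, 100],
]

# The table is symmetric, so each unordered pair is listed once.
pairs = {
    (EXPRESSIONS[i], EXPRESSIONS[j]): CONFUSION[i][j]
    for i, j in combinations(range(len(EXPRESSIONS)), 2)
}

most_confused = max(pairs, key=pairs.get)
least_confused = min(pairs, key=pairs.get)
print(most_confused, pairs[most_confused])
print(least_confused, pairs[least_confused])
```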
Comparison: humans vs. algorithms
ground truth: provided by Michel et al.
Results:
  Michel et al.: worse than humans at recognizing anger
  Schweiger et al.: worse than humans at recognizing disgust, fear, and happiness, and on average
facial       human specification   result of the algorithm   result of the algorithm
expression   during our survey     of Michel et al.          of Schweiger et al.
anger        72%                   67%                       76%
disgust      64%                   64%                       30%
fear         28%                   67%                       0%
happiness    91%                   92%                       79%
sadness      53%                   63%                       61%
surprise     77%                   83%                       90%
average      64%                   72%                       56%
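The statements about where each algorithm falls behind the survey participants can be checked against the table. A minimal sketch using the rounded percentages from the slide; the dictionary keys and function names are illustrative. Means recomputed from rounded values may differ by about a point from the averages printed on the slide.

```python
# Recognition rates in percent, copied from the comparison table.
RATES = {
    "anger":     {"humans": 72, "michel": 67, "schweiger": 76},
    "disgust":   {"humans": 64, "michel": 64, "schweiger": 30},
    "fear":      {"humans": 28, "michel": 67, "schweiger": 0},
    "happiness": {"humans": 91, "michel": 92, "schweiger": 79},
    "sadness":   {"humans": 53, "michel": 63, "schweiger": 61},
    "surprise":  {"humans": 77, "michel": 83, "schweiger": 90},
}


def mean_rate(source):
    """Unweighted mean over the six expressions for one column."""
    return sum(row[source] for row in RATES.values()) / len(RATES)


def worse_than_humans(source):
    """Expressions where the given algorithm scores below the humans."""
    return [e for e, row in RATES.items() if row[source] < row["humans"]]


print(worse_than_humans("michel"))
print(worse_than_humans("schweiger"))
print(round(mean_rate("humans")), round(mean_rate("schweiger")))
```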
Conclusion
The survey makes the same assumptions as the algorithms:
  consideration of visual information only
  no context information
  no natural facial expressions
Summary of our results:
  poor recognition rate of humans, worse than expected
  some facial expressions get confused easily
Conclusion & Outlook:
  integration of more sources of information is highly recommended, e.g. audio/language, context, ...