![Page 1: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/1.jpg)
Where, Who and What?
@AIT
Intelligent Affective Interaction
ICANN, Sept. 14, Athens, Greece
Aristodemos Pnevmatikakis, John Soldatos and Fotios Talantzis
Athens Information Technology, Autonomic & Grid Computing
![Page 2: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/2.jpg)
Overview
• CHIL
  – AIT SmartLab
• Signal Processing for perceptual components
  – Video Processing
  – Audio Processing
• Services
• Middleware
  – Easing application assembly
![Page 3: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/3.jpg)
Computers in the Human Interaction Loop
• EU FP6 Integrated Project (IP 506909)
• Coordinators: Universität Karlsruhe (TH), Fraunhofer Institute IITB
• Duration: 36 months
• Total project costs: over 24M€
• Goal: Create environments in which computers serve humans who focus on interacting with other humans, as opposed to having to attend to and being preoccupied with the machines themselves
• Key Research Areas:
  – Perceptual Technologies
  – Software Infrastructure
  – Human-Centric Pervasive Services
![Page 4: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/4.jpg)
AIT SmartLab Equipment
• Five fixed cameras (one with fish-eye lens)
• PTZ camera
• NIST 64-channel array
• 4 inverted T-shaped clusters of 4 SHURE microphones each
• 4 tabletop microphones
• 6 dual Xeon 3 GHz, 2 GB PCs
• Firewire cables & repeaters
![Page 5: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/5.jpg)
AIT SmartLab
![Page 6: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/6.jpg)
Perceptual Components
![Page 7: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/7.jpg)
Detection and Identification System
[Block diagram. Blocks: Detector, Recognizer, Tracker, Head detector, Eye detector, Face normalizer, Face recognizer, Frontal verifier, Confidence estimator, Weighted voting. Signals: Classifier confidence, Frontality confidence, ID]
![Page 8: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/8.jpg)
Unconstrained Video Difficulties
![Page 9: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/9.jpg)
Where and Who are the World Cup Finalists?
• and European Champions?
![Page 10: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/10.jpg)
Tracking
[Block diagram of the tracker. Modules: Adaptive Background Module, Evidence Generation Module, Kalman Module, Track Consistency Module. Other labels: frames, adaptive background, parameters' adaptation, edge detection, edges, evidence extraction, PPM, target association, track initialization, target split? (split: existing / new; no split), state, prediction, measurement update, predicted tracks, state information, new state, track consistency, track memory, targets]
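The Kalman Module in the diagram names a prediction step and a measurement update. The slide gives no model details, so the following is only a minimal sketch under stated assumptions: one image coordinate is tracked with a constant-velocity model, and the noise levels `q` and `r`, the initial covariance, and all function names are illustrative choices, not the authors' settings.

```python
def kalman_predict(x, v, P, dt=1.0, q=0.01):
    """Predict position x and velocity v one step ahead (assumed constant velocity).

    P is the 2x2 state covariance as nested lists; q is an assumed process noise.
    """
    x_pred = x + dt * v
    p00, p01, p10, p11 = P[0][0], P[0][1], P[1][0], P[1][1]
    # P_pred = F P F^T + q*I with F = [[1, dt], [0, 1]]
    P_pred = [
        [p00 + dt * (p01 + p10) + dt * dt * p11 + q, p01 + dt * p11],
        [p10 + dt * p11, p11 + q],
    ]
    return x_pred, v, P_pred


def kalman_update(x, v, P, z, r=1.0):
    """Correct the prediction with a position measurement z (assumed noise r)."""
    s = P[0][0] + r                  # innovation variance
    k0, k1 = P[0][0] / s, P[1][0] / s  # Kalman gain for position and velocity
    innovation = z - x
    x_new = x + k0 * innovation
    v_new = v + k1 * innovation
    # P_new = (I - K H) P with H = [1, 0]
    P_new = [
        [(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
        [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]],
    ]
    return x_new, v_new, P_new


def track_1d(measurements, dt=1.0):
    """Run predict/update over a sequence of position measurements."""
    x, v = float(measurements[0]), 0.0
    P = [[10.0, 0.0], [0.0, 10.0]]   # assumed large initial uncertainty
    for z in measurements[1:]:
        x, v, P = kalman_predict(x, v, P, dt)
        x, v, P = kalman_update(x, v, P, float(z))
    return x, v
```

A two-dimensional target can be handled by running one such filter per image axis; the predicted position is what a target-association step would compare against new evidence.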
![Page 11: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/11.jpg)
Tracking – Smart Spaces
![Page 12: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/12.jpg)
Tracking – 3D from Synchronized Cameras
![Page 13: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/13.jpg)
Tracking – Outdoors Surveillance
• AIT system 2nd in the VACE / NIST surveillance evaluations
![Page 14: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/14.jpg)
Head Detection
[Pipeline diagram: Tracker, Head detector, Eye detector, Face normalizer, Face recognizer, Frontal verifier, Confidence estimator, Weighted voting]
• Detection of head by processing the outline of the foreground belonging to the body
![Page 15: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/15.jpg)
Eye Detection
[Pipeline diagram: Tracker, Head detector, Eye detector, Face normalizer, Face recognizer, Frontal verifier, Confidence estimator, Weighted voting]
• Vector quantization of colors in head region
• Detect candidate eye regions
  – Based on resemblance to skin, brightness, shape and size
• Selection amongst candidates based on face geometry
![Page 16: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/16.jpg)
Face Recognition from Video
![Page 17: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/17.jpg)
Effect of Eye Misalignment: LDA
[Plot: PMC (%) versus number of training images per person (2 to 10). Curves: ideal eyes; ideal eyes for training, detected eyes for testing; detected eyes for training and testing]
![Page 18: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/18.jpg)
Effect of Eye Misalignment
[Plot: PMC (%) versus RMS eye perturbation (%, relative to eye distance; 0 to 7). Curves: PCA, PCA w/o 3, LDA, EBGM, Laplacianfaces, MACE, 2D-HMM]
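The horizontal axis of this plot is an RMS eye perturbation expressed as a percentage of the eye distance. The slide does not define the measure exactly, so the sketch below is one plausible reading: the RMS of the Euclidean displacement of the two eye centers, divided by the true inter-eye distance. The function name and argument layout are hypothetical.

```python
import math


def rms_eye_perturbation(true_eyes, detected_eyes):
    """RMS eye localization error as a percentage of the true eye distance.

    Both arguments are ((x_left, y_left), (x_right, y_right)) in pixels.
    Assumed definition: RMS over the two eyes of the Euclidean displacement.
    """
    (lx, ly), (rx, ry) = true_eyes
    eye_distance = math.hypot(rx - lx, ry - ly)
    squared_errors = [
        (dx - tx) ** 2 + (dy - ty) ** 2
        for (tx, ty), (dx, dy) in zip(true_eyes, detected_eyes)
    ]
    rms = math.sqrt(sum(squared_errors) / len(squared_errors))
    return 100.0 * rms / eye_distance
```

Under this reading, eyes that are 100 pixels apart and both detected 5 pixels off give a perturbation of 5%.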
![Page 19: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/19.jpg)
Classifier Fusion
• Classifier fusion addresses the fact that different classifiers are optimum for different recognition impairments
[Two bar charts of PMC (%), titled "Illumination variations" and "Pose variations". Each compares four cases: Edginess, No preprocessing, Feature vector, Post-decision]
![Page 20: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/20.jpg)
Fusion Across Time, Classifiers and Modalities
[Flow diagram]
• Inputs: faces of an individual collected over 5 seconds; speech of an individual collected over 5 seconds
• Visual branch: histogram equalization of the N images, then PCA and LDA classifiers
• Each classifier outputs N IDs and confidences (PMC of 60% for one classifier, 58% for the other)
• Fusion across time, per classifier: single ID and confidence (PMC of 31% and 36%)
• Fusion across classifiers: visual ID and confidence, PMC of 29%
• Audio branch: audio ID and confidence, PMC of 9.7%
• Fusion across modalities: audio-visual ID, PMC of 6.8%
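The three fusion stages in this diagram all combine several (ID, confidence) pairs into one. The slide does not give the fusion rule, so the following is a sketch of one simple possibility, confidence-weighted voting, reused at each stage. The function name, the identities "A" and "B", and all confidence values are made up for illustration and are not results from the presentation.

```python
from collections import defaultdict


def weighted_vote(decisions):
    """Fuse (identity, confidence) pairs by summing confidence per identity.

    Returns the winning identity and its share of the total confidence.
    This is an assumed rule, not necessarily the one used by the authors.
    """
    scores = defaultdict(float)
    for identity, confidence in decisions:
        scores[identity] += confidence
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("need at least one decision with positive confidence")
    winner = max(scores, key=scores.get)
    return winner, scores[winner] / total


# Fusion across time: one decision per face image, per classifier.
pca_id = weighted_vote([("A", 0.9), ("B", 0.4), ("A", 0.7)])
lda_id = weighted_vote([("B", 0.6), ("A", 0.5), ("B", 0.3)])

# Fusion across classifiers, then across modalities.
visual_id = weighted_vote([pca_id, lda_id])
audio_id = ("A", 0.95)
audio_visual_id = weighted_vote([visual_id, audio_id])
```

Because every stage consumes and produces the same (ID, confidence) shape, the stages can be chained in any order, which is what lets the diagram fuse over time first and over modalities last.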
![Page 21: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/21.jpg)
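Face Recognition @ CLEAR2006
Columns give training duration / testing duration, in seconds.

| System | 15 / 1 | 15 / 5 | 15 / 10 | 15 / 20 | 30 / 1 | 30 / 5 | 30 / 10 | 30 / 20 |
|---|---|---|---|---|---|---|---|---|
| AIT | 50.57 | 29.68 | 23.18 | 20.22 | 47.31 | 31.14 | 26.64 | 24.72 |
| UKA | 46.82 | 33.58 | 28.03 | 23.03 | 40.13 | 23.11 | 20.42 | 16.29 |
| UPC | 79.77 | 78.59 | 77.51 | 76.40 | 80.42 | 77.13 | 74.39 | 73.03 |
| New AIT | 45.35 | 27.01 | 17.65 | 15.73 | 43.72 | 17.76 | 13.49 | 7.86 |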
![Page 22: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/22.jpg)
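Speaker ID @ CLEAR2006
Columns give training duration / testing duration, in seconds.

| System | 15 / 1 | 15 / 5 | 15 / 10 | 15 / 20 | 30 / 1 | 30 / 5 | 30 / 10 | 30 / 20 |
|---|---|---|---|---|---|---|---|---|
| AIT | 26.92 | 9.73 | 7.96 | 4.49 | 15.17 | 2.68 | 1.73 | 0.56 |
| CMU | 23.65 | 7.79 | 7.27 | 3.93 | 14.36 | 2.19 | 1.38 | 0.00 |
| LIMSI | 51.71 | 10.95 | 6.57 | 3.37 | 38.83 | 5.84 | 2.08 | 0.00 |
| UPC | 24.96 | 10.71 | 10.73 | 11.80 | 15.99 | 2.92 | 3.81 | 2.81 |
| AIT IS2006 | 25.69 | 5.60 | 4.50 | 2.25 | 15.01 | 2.19 | 2.42 | 0.0 |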
![Page 23: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/23.jpg)
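Audiovisual ID @ CLEAR2006
Columns give training duration / testing duration, in seconds.

| System | 15 / 1 | 15 / 5 | 15 / 10 | 15 / 20 | 30 / 1 | 30 / 5 | 30 / 10 | 30 / 20 |
|---|---|---|---|---|---|---|---|---|
| AIT | 23.65 | 6.81 | 6.57 | 2.81 | 13.70 | 2.19 | 1.73 | 0.56 |
| UIUC primary | 17.61 | 2.68 | 1.73 | 0.56 | 13.21 | 2.43 | 1.38 | 0.56 |
| UIUC contrast | 20.55 | 5.60 | 3.81 | 2.25 | 15.99 | 3.41 | 2.42 | 1.12 |
| UKA / CMU | 43.07 | 29.20 | 23.88 | 20.22 | 35.73 | 19.71 | 16.61 | 12.36 |
| UPC | 23.16 | 8.03 | 5.88 | 3.93 | 13.38 | 2.92 | 2.08 | 1.12 |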
![Page 24: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/24.jpg)
Audiovisual Tracker
• Information-theoretic speaker localization from mic. array
  – Accurate azimuth, approximate depth, no elevation
• Moderate targeting of speaker’s face using a PTZ camera
• Refine targeting by visual face detection
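The slide does not describe the information-theoretic estimator itself, so the sketch below covers only a generic geometric step that a delay-based localizer could use to obtain an azimuth: converting the time delay between two microphones into an angle under a far-field assumption. The function name, the 343 m/s speed of sound, and the example spacing are assumptions, not values from the presentation.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def azimuth_from_delay(delay_s, mic_spacing_m, c=SPEED_OF_SOUND):
    """Far-field azimuth in degrees from broadside for a two-microphone delay.

    delay_s is the arrival-time difference between the microphones in seconds;
    mic_spacing_m is their separation in meters.
    """
    ratio = c * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp small overshoots from noisy delays
    return math.degrees(math.asin(ratio))


# Example: a delay that corresponds to a source 30 degrees off broadside.
spacing = 0.2
delay = spacing * math.sin(math.radians(30.0)) / SPEED_OF_SOUND
pan_angle = azimuth_from_delay(delay, spacing)
```

An angle like `pan_angle` could serve as the initial pan command for the PTZ camera, with visual face detection then refining the framing as the bullets describe.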
![Page 25: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/25.jpg)
Services
![Page 26: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/26.jpg)
Memory Jog
• Memory Jog:
  – Context-Aware Human-Centric Assistant for meetings, lectures, presentations
  – Proactive, Reactive Assistance and Information Retrieval
• Features-Functionalities
  – Sophisticated Situation Modeling / Tracking
  – Essentially Non-obtrusive Operation
  – Intelligent Meeting Recording Functionality
  – GUI also runs on PDA
  – Full Compliance with CHIL Architecture
  – Integration of actuating devices (Targeted Audio, Projectors)
![Page 27: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/27.jpg)
Context as Network of Situations

| Transition | Elements & Components |
|---|---|
| NIL → S1 | Table Watcher (people in table area), SAD |
| S1 → S2 | White-Board Watcher (presenter in speaker area), Face ID, Speaker ID |
| S2 → S3 | Speaker ID (speaker ID ≠ presenter ID), Speaker Tracking |
| S3 → S2 | Face Detection (presenter in speaker area), Face ID, Speaker ID |
| S2 → S4 | White-Board Watcher (no face in speaker area for N seconds), Table Watcher (all participants in meeting table) |
| S4 → S5 | Table Watcher (nobody in table area) |
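The situation network above behaves like a finite-state machine driven by events from the perceptual components. A minimal sketch follows, assuming each listed condition is delivered as a single named event; the event names and function names are hypothetical, and the slide does not say what S1 to S5 stand for, so the states are kept as opaque labels.

```python
# (current situation, event) -> next situation, following the listed transitions.
# Event names are illustrative stand-ins for the perceptual-component outputs.
TRANSITIONS = {
    ("NIL", "people_in_table_area"): "S1",
    ("S1", "presenter_in_speaker_area"): "S2",
    ("S2", "speaker_is_not_presenter"): "S3",
    ("S3", "presenter_in_speaker_area"): "S2",
    ("S2", "speaker_area_empty_and_all_at_table"): "S4",
    ("S4", "nobody_in_table_area"): "S5",
}


def step(situation, event):
    """Return the next situation; events with no matching transition are ignored."""
    return TRANSITIONS.get((situation, event), situation)


def run(events, situation="NIL"):
    """Feed a sequence of events and return the list of situations visited."""
    trace = [situation]
    for event in events:
        situation = step(situation, event)
        trace.append(situation)
    return trace


meeting = run([
    "people_in_table_area",
    "presenter_in_speaker_area",
    "speaker_is_not_presenter",
    "presenter_in_speaker_area",
    "speaker_area_empty_and_all_at_table",
    "nobody_in_table_area",
])
```

Keeping the transitions in a plain table means a new situation or trigger can be added without touching the stepping logic.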
![Page 28: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/28.jpg)
What Happened While I was Away?
![Page 29: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/29.jpg)
Middleware
![Page 30: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/30.jpg)
Virtualized Sensor Access
![Page 31: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/31.jpg)
CHIL Compliant Perceptual Components
• Several sites develop site-, room- and configuration-specific Perceptual Components for CHIL
• Provide common abstractions in the input and output of each Perceptual Component (PC), treated as a black box
• Facilitate Component Exchange Across Sites & Vendors
• Standardization commenced for Body Trackers
  – Continues to Face ID Components
![Page 32: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/32.jpg)
Architecture for Body Tracker Exchange
Information retrieval
Transparent connection to sensor output
Common control API (CHILiX)
Services complying to current API
Non-CHIL Compliant Body Tracker
Sensor abstraction
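One way to read this architecture is as an adapter: a non-compliant body tracker is wrapped so that services only see a common interface. The sketch below illustrates that idea only. It does not reproduce the CHILiX API, and every class, method, and field name in it (including the centimeter output of the legacy tracker) is a hypothetical stand-in.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class BodyTrack:
    """Common output record (hypothetical): target ID and floor position in meters."""
    target_id: int
    x: float
    y: float


class BodyTracker(Protocol):
    """Common interface (hypothetical) that services are written against."""
    def get_tracks(self) -> List[BodyTrack]: ...


class LegacyTracker:
    """Stand-in for a site-specific tracker with its own output format."""
    def poll(self):
        return [{"id": 7, "pos_cm": (120.0, 340.0)}]


class LegacyTrackerAdapter:
    """Wraps the legacy tracker so it satisfies the common interface."""
    def __init__(self, legacy: LegacyTracker):
        self._legacy = legacy

    def get_tracks(self) -> List[BodyTrack]:
        return [
            BodyTrack(t["id"], t["pos_cm"][0] / 100.0, t["pos_cm"][1] / 100.0)
            for t in self._legacy.poll()
        ]


def count_people(tracker: BodyTracker) -> int:
    """Example service: depends only on the common interface."""
    return len(tracker.get_tracks())
```

A service such as `count_people` then works unchanged with any tracker that is either compliant or wrapped, which is the exchange across sites that the preceding slide aims for.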
![Page 33: Where, Who and What? @AIT Intelligent Affective Interaction ICANN, Sept. 14, Athens, Greece](https://reader036.vdocuments.site/reader036/viewer/2022062309/56813b8f550346895da4bf6b/html5/thumbnails/33.jpg)
Thank you!
Questions?