Using Grammars for Action Recognition
Aniket Bera
Video analysis with CFGs
The “Inverse Hollywood Problem”: From video to scripts and storyboards via causal analysis. Brand 1997
Action Recognition using Probabilistic Parsing. Bobick and Ivanov 1998
Recognizing Multitasked Activities from Video using Stochastic Context-Free Grammar. Moore and Essa 2001
CFG for human activities
enter detach leave enter detach attach touch touch detach attach leave
M. Brand. The "Inverse Hollywood Problem": From video to scripts and storyboards via causal analysis. AAAI 1997.
Parse tree
[Figure: parse tree rooted at SCENE (Open up a PC). Nonterminal labels in the tree: IN, OUT, ACTION (Open PC), ACTION (unscrew), ADD, MOVE, REMOVE, MOTION. Leaves, left to right: enter detach leave enter detach attach touch touch detach attach leave]
• Deterministic low-level primitive detection
• Deterministic parsing
Stochastic CFGs
Action Recognition using Probabilistic Parsing. Bobick and Ivanov 1998
Gesture analysis with CFGs
Primitive recognition with HMMs
[Figures, one per slide: the four gesture primitives left-right, up-down, right-left, down-up]
Parse Tree
S
  RH
    TOP
      LR
        left-right
    UD
      up-down
    BOT
      RL
        right-left
    DU
      down-up
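The parse tree above can be reproduced with a very small grammar. The following is a minimal sketch, assuming the productions are exactly those readable off the tree (S → RH, RH → TOP UD BOT DU, TOP → LR, BOT → RL, plus one terminal per gesture). Each nonterminal has a single rule here, so every probability is 1.0; these are placeholders, not values from Bobick and Ivanov. The names `GRAMMAR`, `parse`, and `recognize` are hypothetical.

```python
# Productions read off the parse tree. Each nonterminal has one rule here,
# so every rule probability is 1.0 (an illustrative assumption).
GRAMMAR = {
    "S": [(1.0, ["RH"])],
    "RH": [(1.0, ["TOP", "UD", "BOT", "DU"])],
    "TOP": [(1.0, ["LR"])],
    "BOT": [(1.0, ["RL"])],
    "LR": [(1.0, ["left-right"])],
    "UD": [(1.0, ["up-down"])],
    "RL": [(1.0, ["right-left"])],
    "DU": [(1.0, ["down-up"])],
}


def parse(symbol, tokens, pos=0):
    """Yield (tree, next_pos, prob) for each way symbol derives tokens[pos:next_pos]."""
    if symbol not in GRAMMAR:  # terminal symbol
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1, 1.0
        return
    for prob, rhs in GRAMMAR[symbol]:
        # Expand the right-hand side left to right, carrying partial results.
        partial = [([], pos, prob)]
        for child in rhs:
            partial = [
                (kids + [tree], nxt, p * q)
                for kids, cur, p in partial
                for tree, nxt, q in parse(child, tokens, cur)
            ]
        for kids, nxt, p in partial:
            yield (symbol, kids), nxt, p


def recognize(tokens):
    """Return (tree, prob) for the most probable full parse, or None."""
    full = [(t, p) for t, end, p in parse("S", tokens) if end == len(tokens)]
    return max(full, key=lambda tp: tp[1]) if full else None
```

Calling `recognize(["left-right", "up-down", "right-left", "down-up"])` returns a tree rooted at `S`, while a sequence that skips a side of the square returns `None`. This top-down expansion only works because the toy grammar has no left recursion; a chart parser would be needed in general.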
Errors
• HMM output is a likelihood value over time (not discrete symbols)
[Figure labels: HMM a, HMM b]
• Errors are inevitable, but the grammar acts as a top-down constraint
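One simple way to turn per-frame likelihood traces into candidate symbols for a parser is peak picking. This is an assumption made for illustration, not necessarily the procedure used by Bobick and Ivanov; the function name `candidate_events`, the threshold value, and the trace format are all hypothetical.

```python
def candidate_events(traces, threshold=0.5):
    """Return (time, symbol, likelihood) for local maxima at or above threshold.

    traces maps an HMM name to a list of per-frame likelihood values.
    """
    events = []
    for symbol, values in traces.items():
        # Interior frames only: a peak needs a neighbour on each side.
        for t in range(1, len(values) - 1):
            v = values[t]
            if v >= threshold and v > values[t - 1] and v >= values[t + 1]:
                events.append((t, symbol, v))
    # Sort by time so the parser sees candidates in temporal order.
    return sorted(events)
```

Each candidate keeps its likelihood, so a probabilistic parser can weigh a weak detection against what the grammar expects instead of committing to a hard symbol decision.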
Dealing with uncertainty & errors
• Stolcke-Earley (probabilistic) parser
• SKIP rules to deal with insertion errors
[Figure labels: HMM a, HMM b, HMM c]
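The idea behind SKIP rules can be illustrated without a full Earley parser. The sketch below is a deliberate simplification: it assumes the grammar derives a single terminal string (`expected`) and uses dynamic programming to find the most probable alignment in which spurious observed tokens are absorbed by a low-probability SKIP symbol. The value of `P_SKIP` and the function name are assumptions, and this is not the Stolcke-Earley algorithm itself.

```python
P_SKIP = 0.1  # assumed probability that an observed token is a spurious insertion


def best_parse_with_skips(observed, expected, p_skip=P_SKIP):
    """Best alignment of observed tokens to the expected terminal string.

    Inserted tokens are labelled "SKIP". Returns (probability, labels),
    or None if the observed tokens cannot cover the expected string.
    """
    n, m = len(observed), len(expected)
    # best[i][j]: best (prob, labels) after consuming observed[:i]
    # and matching expected[:j]; None if that state is unreachable.
    best = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = (1.0, [])
    for i in range(n):
        for j in range(m + 1):
            if best[i][j] is None:
                continue
            prob, labels = best[i][j]
            # Option 1: treat observed[i] as an insertion error.
            candidates = [(i + 1, j, prob * p_skip, "SKIP")]
            # Option 2: observed[i] is the next expected terminal.
            if j < m and observed[i] == expected[j]:
                candidates.append((i + 1, j + 1, prob * (1 - p_skip), expected[j]))
            for ni, nj, p, label in candidates:
                if best[ni][nj] is None or p > best[ni][nj][0]:
                    best[ni][nj] = (p, labels + [label])
    return best[n][m]
```

With `expected` set to the four gesture terminals, an observed sequence containing one extra `up-down` still parses, with exactly one token labelled `SKIP` and a correspondingly reduced probability. A missing token (a deletion error) is not handled by this sketch and yields `None`.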
SCFG for Blackjack
Recognizing Multitasked Activities from Video using Stochastic Context-Free Grammar. Moore and Essa 2001
• Deals with more complex activities
• Deals with more error types
Stochastic Grammars: Overview
• Representation: Stochastic grammar
  • Terminals: object interactions
  • Context-sensitive due to internal scene models
• Domain: Towers of Hanoi
  • Requires activities with strong temporal constraints
• Contributions
  • Showed recognition & decomposition with very weak appearance models
  • Demonstrated usefulness of feedback from high-level to low-level reasoning components
Expectation Grammars (CVPR 2003)
• Analyze video of a person physically solving the Towers of Hanoi task
• Recognize valid activity
• Identify each move
• Segment objects
• Detect distracters / noise
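Recognizing a valid activity and identifying each move both depend on an internal scene model of the pegs. The sketch below replays a sequence of moves against such a model and reports the first illegal one. The move encoding as `(source_peg, destination_peg)` pairs, the peg names, and the function name are assumptions for illustration and are not the representation used in the paper.

```python
def validate_moves(num_disks, moves):
    """Replay (src, dst) peg moves on an internal scene model.

    Returns (is_valid, index_of_first_illegal_move_or_None, solved).
    Disks are numbered 1 (smallest) to num_disks (largest) and start on peg "A".
    """
    pegs = {"A": list(range(num_disks, 0, -1)), "B": [], "C": []}
    for index, (src, dst) in enumerate(moves):
        if not pegs[src]:
            return False, index, False  # nothing to pick up from src
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False, index, False  # larger disk placed on a smaller one
        pegs[dst].append(pegs[src].pop())
    solved = len(pegs["C"]) == num_disks
    return True, None, solved
```

For two disks, `validate_moves(2, [("A", "B"), ("A", "C"), ("B", "C")])` is valid and solved, while repeating `("A", "B")` twice fails at the second move. The same state tracking is what lets a parser reject interaction events that are locally plausible but inconsistent with the scene.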
System Overview
ToH: Low-Level Vision
[Figure labels: Raw Video; Background Model; Foreground and shadow detection; Foreground Components]
Low-Level Features
• Explanation-based symbols
• Blob interaction events
  • merge, split, enter, exit, tracked, noise
  • Future Work: hidden, revealed, blob-part, coalesce
• All possible explanations generated
• Inconsistent explanations heuristically pruned
[Figure labels: Enter, Merge]
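Blob interaction events of this kind can be derived from how foreground blobs overlap between consecutive frames. The following is a rough sketch under several assumptions: blobs are given as sets of pixel coordinates, correspondence is decided by pixel overlap alone, "exit" simply means a blob has no successor, and only one explanation is produced per blob, with no noise handling or pruning. The function name and data layout are hypothetical.

```python
def blob_events(prev_blobs, curr_blobs):
    """Label blob interactions between two consecutive frames.

    Each argument maps a blob id to a set of (row, col) pixels.
    """
    # For each blob, list the blobs in the other frame that it overlaps.
    forward = {p: [c for c, cpix in curr_blobs.items() if ppix & cpix]
               for p, ppix in prev_blobs.items()}
    backward = {c: [p for p, ppix in prev_blobs.items() if ppix & cpix]
                for c, cpix in curr_blobs.items()}
    events = []
    for p, succ in forward.items():
        if not succ:
            events.append(("exit", p))  # no successor in the new frame
        elif len(succ) > 1:
            events.append(("split", p, tuple(succ)))
    for c, pred in backward.items():
        if not pred:
            events.append(("enter", c))  # no predecessor in the old frame
        elif len(pred) > 1:
            events.append(("merge", tuple(pred), c))
        elif len(forward[pred[0]]) == 1:
            events.append(("tracked", pred[0], c))  # one-to-one match
    return events
```

For example, a hand blob and a disk blob that both overlap a single blob in the next frame produce one `merge` event. A system that generates all possible explanations would emit several alternative labels per frame pair and leave the choice to the parser.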
Contributions
• Showed activity recognition and decomposition without appearance models
• Demonstrated usefulness of feedback from high-level, long-term interpretations to low-level, short-term decisions