3-d scene analysis via sequenced predictions over points and regions
DESCRIPTION
3-D Scene Analysis via Sequenced Predictions over Points and Regions. Daniel Munoz. Drew Bagnell. Martial Hebert. Xuehan Xiong. Problem: 3D Scene Understanding . Solution: Contextual C lassification. B uilding. W ire. P ole. V eg. T runk. C ar. G round. - PowerPoint PPT PresentationTRANSCRIPT
1
3-D Scene Analysis via Sequenced Predictions over Points and Regions
Xuehan Xiong
Daniel Munoz
Drew Bagnell
Martial Hebert
2
Problem: 3D Scene Understanding
3
Car
Pole
Ground
Trunk
Wire
Building
Veg
Solution: Contextual Classification
4
• Intractable inference
• Difficult to train
• Limited success
Graphical models
Fig. from Anguelov, et al. CVPR 2005
Classical Approach: Graphical Models
Anguelov, et al. CVPR 2005Triebel, et. al. IJCAI 2007 Munoz, et al. CVPR 2009
Kulesza NIPS 2007Wainwright JMLR 2006Finley & Joachims ICML 2008
Belief propagationMean fieldMCMC
5
• Intractable inference
• Difficult to train
• Limited success
Graphical models
Fig. from Anguelov, et al. CVPR 2005
Classical Approach: Graphical Models
Anguelov, et al. CVPR 2005Triebel, et. al. IJCAI 2007 Munoz, et al. CVPR 2009
Kulesza NIPS 2007Wainwright JMLR 2006Finley & Joachims ICML 2008
Belief propagationMean fieldMCMC
6
• Intractable inference
• Difficult to train
• Limited success
Graphical models
Fig. from Anguelov, et al. CVPR 2005
Classical Approach: Graphical Models
Anguelov, et al. CVPR 2005Triebel, et. al. IJCAI 2007 Munoz, et al. CVPR 2009
KuleszaWainwrightFinley & Joachims ICML 2008
Belief propagationMean fieldMCMC
7
8
9
10
11
Our Approach: Inference Machines• Train an inference procedure, not a model.– To encode spatial layout and long range relations– Daume III 2006, Tu 2008, Bagnell 2010, Munoz 2010
12
• Train an inference procedure, not a model.– To encode spatial layout and long range relations– Daume III 2006, Tu 2008, Bagnell 2010, Munoz 2010
• Inference via sequential prediction
C0 C1 C2
Reject
T T
F F F
E.g. Viola-Jones 2001
Our Approach: Inference Machines
13
C0 C1 C2
context context
Ours
• Train an inference procedure, not a model.– To encode spatial layout and long range relations– Daume III 2006, Tu 2008, Bagnell 2010, Munoz 2010
• Inference via sequential prediction
Our Approach: Inference Machines
14
point features
)(LogRegˆ )0()0()0( XY
Example features
15
point features
16
point features
top mid bottom
Contextual features
17
point features
)(LogRegˆ )1()1()1( XY
top mid bottom
18
point features
)(LogRegˆ )1()1()1( XY
top mid bottom
19
point features
)(LogRegˆ )1()1()1( XY
top mid bottom
20
point features
)(LogRegˆ )2()2()2( XY
top mid bottom
21
Local features only
Car Pole
Building
VegGround
Wire
22
Round 1
23
Round 2
24
Round 3
Car
Veg
25
Create regions
Level 2
Level 1
26
27
region features
Region level Pt level
28
Level 2
Level 1
29
point features
Point level Region level
30
With Regions
31
Learned Relationships
Neighbor contextual feature Learned weights
point features
top mid bottom
32
Learned Relationships
Neighbor contextual feature Learned weights
point features
top mid bottom
33
Experiments
• 3 large-scale datasets– CMU (26M), Moscow State (10M), Univ. Wash (10M)
• Multiple classes (4 to 8) – car, building, veg, wire, fence, people, trunk, pole,
ground, street sign• Different sensors– SICK (ground), ALTM 2050 (aerial), Velodyne (ground)
• Comparisons– Graphical models, exemplar based
34
Quantitative Results
CMU [1] Moscow A [2] Moscow B [2] UWash [3]0.3
0.4
0.5
0.6
0.7
0.8
0.9
Ours Related Work LogReg
Aver
age
F1 S
core
[1] Munoz CVPR 2009 [2] Shapovalov PCV 2010 [3] Lai RSS 2010 *
* Use additional semi-supervised data not leveraged by other methods.
35
CMU DatasetOurs Max Margin CRF [1]
[1] Munoz, et. al. CVPR 2009
36
Ours Max Margin CRF [1]
CMU Dataset[1] Munoz, et. al. CVPR 2009
37
Ours Max Margin CRF [1]
CMU Dataset[1] Munoz, et. al. CVPR 2009
38
Moscow State DatasetOurs Logistic regression
39
Conclusion
• Simple and fast approach for scene labeling– No graphical model– Labeling via 5x logistic regression predictions
• Support flexible contextual features– Learning rich relationships
C0 C1 C2
context context
40
Thank you! And Questions?
• Acknowledgements– US Army Research Laboratory, Collaborative
Technology Alliance– QinetiQ North America Robotics Fellowship