3-d scene analysis via sequenced predictions over points and regions

1

3-D Scene Analysis via Sequenced Predictions over Points and Regions

Xuehan Xiong

Daniel Munoz

Drew Bagnell

Martial Hebert

2

Problem: 3D Scene Understanding

3

Car

Pole

Ground

Trunk

Wire

Building

Veg

Solution: Contextual Classification

4

• Intractable inference

• Difficult to train

• Limited success

Graphical models

Fig. from Anguelov, et al. CVPR 2005

Classical Approach: Graphical Models

Anguelov, et al. CVPR 2005Triebel, et. al. IJCAI 2007 Munoz, et al. CVPR 2009

Kulesza NIPS 2007Wainwright JMLR 2006Finley & Joachims ICML 2008

Belief propagationMean fieldMCMC

5



• Limited success

Graphical models




Kulesza NIPS 2007Wainwright JMLR 2006Finley & Joachims ICML 2008


6



• Limited success

Graphical models




KuleszaWainwrightFinley & Joachims ICML 2008


11

Our Approach: Inference Machines• Train an inference procedure, not a model.– To encode spatial layout and long range relations– Daume III 2006, Tu 2008, Bagnell 2010, Munoz 2010

12

• Train an inference procedure, not a model.– To encode spatial layout and long range relations– Daume III 2006, Tu 2008, Bagnell 2010, Munoz 2010

• Inference via sequential prediction

C0 C1 C2

Reject

T T

F F F

E.g. Viola-Jones 2001

Our Approach: Inference Machines

13

C0 C1 C2

context context

Ours

• Train an inference procedure, not a model.– To encode spatial layout and long range relations– Daume III 2006, Tu 2008, Bagnell 2010, Munoz 2010

• Inference via sequential prediction

Our Approach: Inference Machines

14

point features

)(LogRegˆ )0()0()0( XY

Example features

15

point features

16

point features

top mid bottom

Contextual features

17

point features


top mid bottom

18

point features


top mid bottom

19

point features


top mid bottom

20

point features


top mid bottom

21

Local features only

Car Pole

Building

VegGround

Wire

22

Round 1

23

Round 2

24

Round 3

Car

Veg

25

Create regions

Level 2

Level 1

27

region features

Region level Pt level

28

Level 2

Level 1

29

point features

Point level Region level

30

With Regions

31

Learned Relationships

Neighbor contextual feature Learned weights

point features

top mid bottom

32

Learned Relationships

Neighbor contextual feature Learned weights

point features

top mid bottom

33

Experiments

• 3 large-scale datasets– CMU (26M), Moscow State (10M), Univ. Wash (10M)

• Multiple classes (4 to 8) – car, building, veg, wire, fence, people, trunk, pole,

ground, street sign• Different sensors– SICK (ground), ALTM 2050 (aerial), Velodyne (ground)

• Comparisons– Graphical models, exemplar based

34

Quantitative Results

CMU [1] Moscow A [2] Moscow B [2] UWash [3]0.3

0.4

0.5

0.6

0.7

0.8

0.9

Ours Related Work LogReg

Aver

age

F1 S

core

[1] Munoz CVPR 2009 [2] Shapovalov PCV 2010 [3] Lai RSS 2010 *

* Use additional semi-supervised data not leveraged by other methods.

35

CMU DatasetOurs Max Margin CRF [1]

[1] Munoz, et. al. CVPR 2009

36

Ours Max Margin CRF [1]

CMU Dataset[1] Munoz, et. al. CVPR 2009

37

Ours Max Margin CRF [1]

CMU Dataset[1] Munoz, et. al. CVPR 2009

38

Moscow State DatasetOurs Logistic regression

39

Conclusion

• Simple and fast approach for scene labeling– No graphical model– Labeling via 5x logistic regression predictions

• Support flexible contextual features– Learning rich relationships

C0 C1 C2

context context

40

Thank you! And Questions?

• Acknowledgements– US Army Research Laboratory, Collaborative

Technology Alliance– QinetiQ North America Robotics Fellowship

3-d scene analysis via sequenced predictions over points and regions

Documents

inference procedure

local features

inference machines11train

classical approach

graphical modelsanguelov

graphical modelsfig

point levelregion level29icra

regions level