
3D Object Recognition Pipeline

Kurt Konolige, Radu Rusu, Victor Eruhimov, Suat Gedikli

Willow Garage

Stefan Holzer, Stefan Hinterstoisser

TUM

Morgan Quigley, Stephen Gould

Stanford

Marius Muja

UBC

3D and Object Recognition

•Provides more info than just visual texture

•Good for scale and segmentation

•Verification

Need a good device for 3D info

3D Cameras

Technology | Examples | Pro/Con
Stereo | Newcombe, Davison CVPR 2010 | Con: not dense, smearing. Pro: real-time, good resolution. Registration + regularization
Stereo + texture | WG device | Pro: dense, real-time, good resolution. Con: short range
Laser line scan | STAIR Borg scanner | Pro: dense, most accurate. Con: short range, not real time
Structured light | PrimeSense | Pro: dense, real-time, good resolution. Con: short range, ambient light/scene texture
Phase shift | SR4, PMD, Canesta | Pro: dense, real-time, medium range. Con: low resolution, low accuracy, gross errors
Gated reflectance | 3DV | Pro: dense, real-time. Con: low resolution, low accuracy

Tabletop manipulation:

• Short range
• High resolution
• High range accuracy
• Real-time
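Why stereo-based devices pair naturally with short range and high range accuracy can be seen from the standard rectified-stereo relations: depth is Z = f·B/d, and a disparity error δd produces a depth error of roughly Z²/(f·B)·δd. The sketch below only illustrates that scaling. The focal length, baseline, and matching error are hypothetical values, not the parameters of any device in the table.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, focal_px, baseline_m, disparity_error_px):
    """First-order depth uncertainty: dZ = Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical rig (assumption): 500 px focal length, 9 cm baseline,
# 0.25 px disparity matching error.
FOCAL_PX, BASELINE_M, DISP_ERR_PX = 500.0, 0.09, 0.25

for z in (0.5, 1.0, 2.0, 4.0):
    err_mm = 1000.0 * depth_error(z, FOCAL_PX, BASELINE_M, DISP_ERR_PX)
    print(f"range {z:.1f} m: depth error about {err_mm:.1f} mm")
```

The error grows with the square of the range, so doubling the working distance quadruples the depth uncertainty.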

WG Projected Texture Stereo Device

• Paint the scene with texture from a projector
  - vs. single camera with structured light

• Advantages:
  - Simple projector
  - Standard algorithms
  - Full frame rates (640x480)
  - Dynamic scenes

WG Projected Texture Device

Projector

• Red LED
• Eye safe
• Synchronized to cameras

3D Fly-thru

Object Recognition Pipeline

•Textured objects via keypoints [Victor Eruhimov, Suat Gedikli]

•Untextured objects via DOT [Stefan Holzer, Stefan Hinterstoisser]

•Simple 3D model matching [Marius Muja]

•STAIR 2D/3D features [Stephen Gould]

Pre-filter → Detect → Verify

MOPED – Textured object recognition with pose

•Model: Stereo view of an object at a known pose

•Extract keypoints and features

•For a new scene, match keypoints to each model

•Run SfM geometric check to verify and recover pose

[Torres, Romea, Srinivasa, ICRA 2010]
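The keypoint-matching step can be sketched as follows. This is a minimal illustration, not MOPED's implementation: descriptors are plain tuples, matching is brute-force nearest neighbour with a ratio test (an assumption, the slides do not say which matching rule is used), and the geometric check is left out. The names `match_descriptors`, `best_model`, and the toy data are hypothetical.

```python
import math


def match_descriptors(scene, model, ratio=0.8):
    """Return (scene_idx, model_idx) pairs that pass the nearest-neighbour ratio test."""
    matches = []
    for i, s in enumerate(scene):
        dists = sorted((math.dist(s, m), j) for j, m in enumerate(model))
        # Keep a match only if the best candidate is clearly better than the second.
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches


def best_model(scene, models, min_matches=3):
    """Pick the model with the most matches, or None if too few for a geometric check."""
    counts = {name: len(match_descriptors(scene, desc)) for name, desc in models.items()}
    name = max(counts, key=counts.get)
    return (name, counts[name]) if counts[name] >= min_matches else None


# Toy 2-D "descriptors" (assumption: real descriptors are high-dimensional).
models = {
    "mug": [(0, 0), (10, 0), (0, 10), (10, 10)],
    "box": [(50, 50), (60, 50), (50, 60), (60, 60)],
}
scene = [(0.5, 0.2), (9.8, 0.1), (0.1, 10.3), (55, 55)]

print(best_model(scene, models))
```

A model that survives this stage would then go to the geometric check, which both rejects inconsistent matches and recovers the pose.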

- Need texture
- Need high res camera

Dominant Orientation Templates (DOT)

Stefan Hinterstoisser, Stefan Holzer (TUM; CVPR 2010, ECCV 2010)

● DOT is a template-matching-based approach

[Figure: template | current scene]

- The template is slid over the image to compute the response at each image position
- If the response is above a threshold, it is considered a detection of the template
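The slide-and-threshold idea can be sketched in a few lines. The response used here, the fraction of cells whose labels agree with the template, is a simplified stand-in chosen for illustration and is not DOT's actual similarity measure. The function name and toy grids are hypothetical.

```python
def slide_template(image, template, threshold):
    """Return (row, col, score) for every position whose score reaches the threshold."""
    th, tw = len(template), len(template[0])
    detections = []
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            # Response: fraction of template cells that agree with the image.
            agree = sum(
                image[r + i][c + j] == template[i][j]
                for i in range(th)
                for j in range(tw)
            )
            score = agree / (th * tw)
            if score >= threshold:
                detections.append((r, c, score))
    return detections


# Toy label grids (assumption: integers stand in for per-cell features).
template = [[1, 2],
            [3, 4]]
image = [[0, 0, 0, 0, 0],
         [0, 1, 2, 0, 0],
         [0, 3, 4, 0, 0],
         [0, 0, 0, 0, 0]]

print(slide_template(image, template, threshold=0.9))
```

Lowering the threshold trades missed detections for more false positives, which is why later verification stages matter.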

DOT – Basic Principle

● DOT uses gradients instead of color or gray values

[Figure: template | current scene]

- Gradients are less sensitive to illumination changes
- Gradients have orientation and magnitude
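A small numerical check makes the illumination argument concrete: scaling and offsetting the intensities (a simple gain-and-bias lighting model, assumed here) changes gradient magnitude but leaves orientation untouched. The quantization into 8 bins over 180 degrees, ignoring gradient polarity, is an illustrative assumption and not a parameter taken from the slides.

```python
import math


def gradients(image):
    """Central-difference gradient (orientation in degrees, magnitude) at interior pixels."""
    out = {}
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            out[(r, c)] = (math.degrees(math.atan2(gy, gx)), math.hypot(gx, gy))
    return out


def quantize(angle_deg, bins=8):
    """Map an orientation to one of `bins` sectors over 180 degrees (polarity ignored)."""
    return int((angle_deg % 180.0) / (180.0 / bins)) % bins


# Toy intensity ramp and a brighter, higher-contrast copy of it.
image = [[10 * c + 5 * r for c in range(4)] for r in range(4)]
brighter = [[2 * v + 30 for v in row] for row in image]

g_dark, g_bright = gradients(image), gradients(brighter)
for key in sorted(g_dark):
    print(key, g_dark[key], g_bright[key], quantize(g_dark[key][0]))
```

The orientation column is identical for both images while the magnitude doubles, so a template built on orientations matches under either lighting.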

Offline Learning

● Good learning is necessary to reduce the false-positive rate

● We try to use all available information to segment the object:

- Point cloud from narrow stereo is used to detect the table and segment the point cloud of the object
- Object point cloud is used to create an initial mask
- Mask is refined using GrabCut (see OpenCV)
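The table-detection and segmentation step can be sketched as a RANSAC plane fit followed by keeping the points that lie off the plane. This is a generic illustration under stated assumptions (the table is the dominant plane, and the object sits on one side of it), not the authors' code. All function names, tolerances, and the synthetic cloud are hypothetical, and the GrabCut refinement of the resulting mask is not shown.

```python
import random


def plane_from_points(p, q, r):
    """Plane (unit normal n, offset d) with n.x + d = 0 through three points, or None."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0]]
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-12:
        return None  # the three samples were collinear
    n = [c / norm for c in n]
    return n, -sum(n[i] * p[i] for i in range(3))


def signed_distance(plane, pt):
    n, d = plane
    return sum(n[i] * pt[i] for i in range(3)) + d


def fit_table(points, iterations=200, tol=0.01, seed=0):
    """RANSAC: keep the sampled plane supported by the most points within tol."""
    rng = random.Random(seed)
    best, best_count = None, -1
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        count = sum(abs(signed_distance(plane, p)) <= tol for p in points)
        if count > best_count:
            best, best_count = plane, count
    return best


def segment_object(points, plane, tol=0.01):
    """Points farther than tol from the plane, on the side holding most of them."""
    above = [p for p in points if signed_distance(plane, p) > tol]
    below = [p for p in points if signed_distance(plane, p) < -tol]
    return above if len(above) >= len(below) else below


# Synthetic cloud: a 10x10 table patch at z = 0 and a small box resting on it.
table_points = [(0.1 * i, 0.1 * j, 0.0) for i in range(10) for j in range(10)]
object_points = [(0.4 + 0.05 * i, 0.4 + 0.05 * j, 0.05 * (k + 1))
                 for i in range(3) for j in range(3) for k in range(3)]
cloud = table_points + object_points

table = fit_table(cloud)
segmented = segment_object(cloud, table)
print(len(segmented), "object points")
```

Projecting the segmented points into the image would give the initial mask that GrabCut then refines.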

False-Positive Rejection

● Two more precise templates for validation:

- A more precise, non-discretized gradient template
- A disparity template to compare expected with real disparities
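One simple way to compare expected with real disparities is to count how many template pixels agree within a tolerance. The scoring rule, the 1-pixel tolerance, and the use of `None` for pixels without a disparity are assumptions made for this sketch. The slides do not specify how the disparity template is scored.

```python
def disparity_agreement(expected, observed, tol_px=1.0):
    """Fraction of valid pixels whose observed disparity is within tol_px of the expected one."""
    hits = total = 0
    for exp_row, obs_row in zip(expected, observed):
        for e, o in zip(exp_row, obs_row):
            if e is None or o is None:
                continue  # no disparity available at this pixel
            total += 1
            hits += abs(e - o) <= tol_px
    return hits / total if total else 0.0


# Toy disparity template and two candidate detections (values in pixels).
expected = [[30, 30, 31],
            [30, None, 31]]
observed_good = [[30.4, 29.8, 31.2],
                 [30.1, 30.0, None]]
observed_bad = [[12, 12.5, 30.9],
                [11.8, 12, 12]]

print(disparity_agreement(expected, observed_good))
print(disparity_agreement(expected, observed_bad))
```

A detection whose agreement falls below some chosen fraction would be rejected as a false positive.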

False-Positive Rejection

● Compute the error between the reference point cloud and the point cloud at the detected position

- Optimize the initial 3D point cloud pose given by the detection
- Directly gives the object pose if the model is associated with learned point clouds
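A point-cloud error and a pose refinement can be sketched with nearest-neighbour distances. The refinement below is a translation-only, ICP-style loop used as a simplified stand-in for full pose optimization. The error metric, the function names, and the toy clouds are assumptions for illustration.

```python
import math


def nearest(p, cloud):
    """Closest point in cloud to p (brute force)."""
    return min(cloud, key=lambda q: math.dist(p, q))


def cloud_error(reference, observed):
    """Mean distance from each reference point to its nearest observed point."""
    return sum(math.dist(p, nearest(p, observed)) for p in reference) / len(reference)


def refine_translation(reference, observed, iterations=10):
    """Shift the reference cloud by the mean residual to its nearest neighbours, repeatedly."""
    cloud = [tuple(p) for p in reference]
    for _ in range(iterations):
        pairs = [(p, nearest(p, observed)) for p in cloud]
        shift = [sum(q[i] - p[i] for p, q in pairs) / len(pairs) for i in range(3)]
        cloud = [tuple(p[i] + shift[i] for i in range(3)) for p in cloud]
    return cloud


# Observed points, and a reference model placed slightly off by the detector.
observed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
reference = [(x + 0.10, y - 0.05, z + 0.02) for x, y, z in observed]

before = cloud_error(reference, observed)
refined = refine_translation(reference, observed)
after = cloud_error(refined, observed)
print(f"error before: {before:.4f}, after: {after:.2e}")
```

A residual error that stays large after refinement suggests the detection was a false positive, while a small one yields the refined pose directly.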

STAIR Vision Library (SVL)

Stanford STAIR project [Andrew Ng, Stephen Gould]

• Initially developed to support the Stanford AI Robot (STAIR) project

• Builds on top of the OpenCV computer vision library and the Eigen matrix library

• Provides a range of software infrastructure for
  - computer vision
  - machine learning
  - probabilistic graphical models

• Hosted on SourceForge

Object Detection in SVL

• Sliding-window object detector

• Features are extracted from a local window

• Learned boosted decision-tree classifier scores each window

• Image is scanned at multiple resolutions to detect objects at different scales
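The multi-resolution scan can be sketched as a fixed-size window run over an image pyramid. This is a generic illustration, not SVL code: `score_fn` stands in for the learned boosted decision-tree classifier, and the mean-brightness scorer, the factor-of-two pyramid, and all names are hypothetical.

```python
def downsample(image):
    """Halve the resolution by averaging 2x2 blocks."""
    return [
        [(image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
         for c in range(0, len(image[0]) - 1, 2)]
        for r in range(0, len(image) - 1, 2)
    ]


def detect_multiscale(image, window, score_fn, threshold, stride=1):
    """Scan a fixed window over a pyramid; boxes are (row, col, size, score) in original pixels."""
    detections, scale = [], 1
    while len(image) >= window and len(image[0]) >= window:
        for r in range(0, len(image) - window + 1, stride):
            for c in range(0, len(image[0]) - window + 1, stride):
                patch = [row[c:c + window] for row in image[r:r + window]]
                score = score_fn(patch)
                if score >= threshold:
                    detections.append((r * scale, c * scale, window * scale, score))
        image, scale = downsample(image), scale * 2
    return detections


def mean_brightness(patch):
    """Stand-in scorer (assumption): a real detector would apply a learned classifier."""
    return sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))


# 8x8 image containing a bright 4x4 square at rows 2-5, columns 2-5.
image = [[1.0 if 2 <= r <= 5 and 2 <= c <= 5 else 0.0 for c in range(8)] for r in range(8)]

detections = detect_multiscale(image, window=2, score_fn=mean_brightness, threshold=1.0)
print(max(detections, key=lambda d: d[2]))
```

The same 2x2 window fires on parts of the square at full resolution and on the whole square one pyramid level up, which is how a single classifier covers objects of different sizes.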

Image Channels

• Image decomposed into multiple channels

• Depth at each pixel, obtained from a laser scanner, can be thought of as an additional channel

[Figure: intensity image | edge map | depth map]

[Quigley et al., ICRA 2009]

Object Detection Features

• Learn a “patch” dictionary over intensity, edge and depth channels

• Patches encode localized templates for matching

• Depth patches capture shape; intensity and edge patches capture appearance

• Patch responses (over entire dictionary) are combined to form the feature vector

[Quigley et al., ICRA 2009]
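The way patch responses form a feature vector can be sketched as follows. Each dictionary entry is a small patch tied to one channel, and its response is the best match score inside the window. Using negative sum of squared differences as the score and searching the whole window (rather than only near the patch's original location) are simplifications assumed here, and all names and data are hypothetical.

```python
def patch_response(channel, patch):
    """Best match score of patch anywhere in channel; score = -(sum of squared differences)."""
    ph, pw = len(patch), len(patch[0])
    best = float("-inf")
    for r in range(len(channel) - ph + 1):
        for c in range(len(channel[0]) - pw + 1):
            ssd = sum(
                (channel[r + i][c + j] - patch[i][j]) ** 2
                for i in range(ph)
                for j in range(pw)
            )
            best = max(best, -ssd)
    return best


def feature_vector(window_channels, dictionary):
    """One response per dictionary entry; each entry is (channel_name, patch)."""
    return [patch_response(window_channels[name], patch) for name, patch in dictionary]


# One 3x3 detection window seen through three channels (toy values).
window_channels = {
    "intensity": [[0, 0, 0], [0, 9, 9], [0, 9, 9]],
    "edge": [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
    "depth": [[2.0, 2.0, 2.0], [2.0, 1.5, 1.5], [2.0, 1.5, 1.5]],
}
dictionary = [
    ("intensity", [[9, 9], [9, 9]]),
    ("depth", [[2.0, 1.5], [2.0, 1.5]]),
    ("edge", [[1, 1], [1, 1]]),
]

features = feature_vector(window_channels, dictionary)
print(features)
```

The resulting vector is what a window classifier would consume, with depth entries contributing shape cues alongside the appearance cues.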

Results

• 150 images of cluttered indoor scenes

• 5-fold cross-validation

• Depth information provides significant improvement in area under precision-recall curve

[Quigley et al., ICRA 2009]

[Figure: three precision-recall plots, annotated 8% improvement, 3% improvement, 38% improvement]
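For reference, area under a precision-recall curve can be computed from ranked detections as sketched below. This uses the step-wise (average precision) form, which is one common convention. The slides do not say which variant was used, and the scores and labels here are made up for illustration.

```python
def precision_recall(scores, labels):
    """(recall, precision) after each detection, taken in order of decreasing score."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    positives = sum(labels)
    true_pos = 0
    points = []
    for k, (_, is_pos) in enumerate(ranked, start=1):
        true_pos += is_pos
        points.append((true_pos / positives, true_pos / k))
    return points


def area_under_pr(points):
    """Step-wise area: precision at each point times the recall gained there."""
    area, prev_recall = 0.0, 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Toy detector output: confidence scores and ground-truth labels (1 = true object).
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 1]

curve = precision_recall(scores, labels)
print(curve)
print(area_under_pr(curve))
```

Comparing this area for a detector with and without the depth channel gives the kind of relative improvement reported on the slide.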

Conclusions

•Real-time, accurate 3D devices are becoming available

•3D can help in object detection for untextured objects

- A combination of visual and 3D features works best

•3D is useful for verification

•Check out the PR2 Grasping Demo!