Stereo and Projective Structure from Motion

Computer Vision

CS 543 / ECE 549

University of Illinois

Derek Hoiem

04/13/10

Many slides adapted from Lana Lazebnik, Silvio Savarese, Steve Seitz


This class

• Recap of epipolar geometry

• Recovering structure
– Generally, how can we estimate 3D positions for matched points in two images? (triangulation)
– If we have a moving camera, how can we recover 3D points? (projective structure from motion)
– If we have a calibrated stereo pair, how can we get dense depth estimates? (stereo fusion)


Basic Questions

• Why can’t we get depth if the camera doesn’t translate?

• Why can’t we get a nice panorama if the camera does translate?


Recap: Epipoles

• Point x in left image corresponds to epipolar line l’ in right image

• Epipolar line passes through the epipole (the intersection of the cameras’ baseline with the image plane)


Recap: Fundamental Matrix

• Fundamental matrix maps from a point in one image to a line in the other

• If x and x’ correspond to the same 3D point X: $x'^\top F x = 0$
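As a rough illustration of the point-to-line mapping, the sketch below computes the epipolar line l’ = F x and measures how far a candidate match x’ lies from it. The function names and the example F are assumptions made for this sketch: F is taken to be the cross-product matrix of a pure translation t = (1, 0.5, 0) with identity intrinsics, which is only one convenient special case.

```python
import math

def epipolar_line(F, x):
    """Line l' = F x in the second image, as (a, b, c) with a*u + b*v + c = 0."""
    return tuple(sum(F[r][k] * x[k] for k in range(3)) for r in range(3))

def point_line_distance(line, point):
    """Perpendicular distance from an inhomogeneous point (u, v) to a line."""
    a, b, c = line
    u, v = point
    return abs(a * u + b * v + c) / math.hypot(a, b)

# Hypothetical example: F = [t]_x for t = (1, 0.5, 0), cameras [I | 0] and [I | t].
F = [[0.0, 0.0, 0.5],
     [0.0, 0.0, -1.0],
     [-0.5, 1.0, 0.0]]

# The 3D point (2, 1, 5) projects to x in the first image and x' in the second.
x = (0.4, 0.2, 1.0)
x_prime = (0.6, 0.3)

line = epipolar_line(F, x)
on_line = point_line_distance(line, x_prime)      # true match: distance near zero
off_line = point_line_distance(line, (0.6, 0.4))  # wrong match: clearly nonzero
```

In a RANSAC loop, a distance of this kind is one common way to decide whether a putative match is an inlier.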


Recap: Automatically Relating Projections

Assume we have matched points x ↔ x’ with outliers

Homography (No Translation)

• Correspondence relation: $x' = Hx \;\Rightarrow\; x' \times Hx = 0$

1. Normalize image coordinates: $\tilde{x} = Tx$, $\tilde{x}' = T'x'$

2. RANSAC with 4 points

3. De-normalize: $H = T'^{-1}\tilde{H}T$

Fundamental Matrix (Translation)

• Correspondence relation: $x'^\top F x = 0$

1. Normalize image coordinates: $\tilde{x} = Tx$, $\tilde{x}' = T'x'$

2. RANSAC with 8 points

3. Enforce $\det \tilde{F} = 0$ by SVD

4. De-normalize: $F = T'^\top \tilde{F} T$
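The slides do not say which normalization is meant in step 1. A common choice is Hartley normalization (translate the centroid to the origin and scale so the mean distance from it is √2). The sketch below assumes that scheme, and its function names are illustrative only.

```python
import math

def normalization_transform(points):
    """3x3 similarity T: centroid to origin, mean distance from origin sqrt(2)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    mean_dist = sum(math.hypot(p[0] - cx, p[1] - cy) for p in points) / n
    s = math.sqrt(2.0) / mean_dist
    return [[s, 0.0, -s * cx],
            [0.0, s, -s * cy],
            [0.0, 0.0, 1.0]]

def apply_transform(T, points):
    """Apply a 3x3 transform to inhomogeneous 2D points."""
    out = []
    for u, v in points:
        w = T[2][0] * u + T[2][1] * v + T[2][2]
        out.append(((T[0][0] * u + T[0][1] * v + T[0][2]) / w,
                    (T[1][0] * u + T[1][1] * v + T[1][2]) / w))
    return out

# Hypothetical pixel coordinates of four matched features in one image.
pts = [(100.0, 200.0), (300.0, 200.0), (300.0, 400.0), (100.0, 400.0)]
T = normalization_transform(pts)
normalized = apply_transform(T, pts)
```

Each image gets its own transform (T and T’), which is why both appear in the de-normalization step.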


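Recap

• We can get projection matrices P and P’ up to a projective ambiguity:

$P = [\,I \mid 0\,]$, $P' = [\,[e']_\times F \mid e'\,]$, where $e'^\top F = 0$

• Code:

function P = vgg_P_from_F(F)
% Returns the second canonical camera; the first is [I | 0].
[U,S,V] = svd(F);
e = U(:,3);                  % left null vector of F (the epipole e')
P = [-vgg_contreps(e)*F e];  % vgg_contreps builds a skew-symmetric matrix from e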

See HZ p. 255-256


Recap

• Fundamental matrix song


Triangulation: Linear Solution

• Generally, rays Cx and C’x’ will not exactly intersect

• Can solve via SVD, finding a least squares solution to a system of equations

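$x \times PX = 0$, $x' \times P'X = 0$ $\;\Rightarrow\;$ $AX = 0$, with

$$A = \begin{bmatrix} u\,p_3^\top - p_1^\top \\ v\,p_3^\top - p_2^\top \\ u'\,p_3'^\top - p_1'^\top \\ v'\,p_3'^\top - p_2'^\top \end{bmatrix}$$

[Figure: 3D point X with its image points x and x’]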

Further reading: HZ p. 312-313
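For readers who want the intermediate step: with $x = (u, v, 1)^\top$ and $p_i^\top$ denoting the i-th row of $P$, writing out the cross product shows where the two rows per view come from. This is a short derivation sketch of the standard argument and does not reproduce any slide.

$$
x \times (PX) =
\begin{bmatrix}
v\,p_3^\top X - p_2^\top X \\
p_1^\top X - u\,p_3^\top X \\
u\,p_2^\top X - v\,p_1^\top X
\end{bmatrix} = 0
$$

The third row equals $-u$ times the first row minus $v$ times the second, so each view contributes only two independent equations, $(u\,p_3^\top - p_1^\top)X = 0$ and $(v\,p_3^\top - p_2^\top)X = 0$. Stacking these for both views gives the $4 \times 4$ system $AX = 0$.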


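Triangulation: Linear Solution

Given P, P’, x, x’

1. Precondition points and projection matrices
2. Create matrix A
3. [U, S, V] = svd(A)
4. X = V(:, end)

Pros and Cons

• Works for any number of corresponding images
• Not projectively invariant

$$x = \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}, \quad x' = \begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix}, \quad P = \begin{bmatrix} p_1^\top \\ p_2^\top \\ p_3^\top \end{bmatrix}, \quad P' = \begin{bmatrix} p_1'^\top \\ p_2'^\top \\ p_3'^\top \end{bmatrix}$$

$$A = \begin{bmatrix} u\,p_3^\top - p_1^\top \\ v\,p_3^\top - p_2^\top \\ u'\,p_3'^\top - p_1'^\top \\ v'\,p_3'^\top - p_2'^\top \end{bmatrix}$$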

Code: http://www.robots.ox.ac.uk/~vgg/hzbook/code/vgg_multiview/vgg_X_from_xP_lin.m
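The following is a dependency-free sketch of linear triangulation. It builds the same rows of A, but instead of the SVD solution it uses the inhomogeneous variant: it fixes the last coordinate of X to 1 and solves the resulting least-squares problem through the normal equations. That substitution is an assumption made to avoid a linear-algebra library, it fails for points at infinity, and it skips the preconditioning step. All function names are illustrative.

```python
def triangulation_rows(P, x):
    """Two rows of A for one view: u*p3 - p1 and v*p3 - p2."""
    u, v = x
    p1, p2, p3 = P
    return [[u * c3 - c1 for c1, c3 in zip(p1, p3)],
            [v * c3 - c2 for c2, c3 in zip(p2, p3)]]

def solve_3x3(M, b):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            f = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= f * aug[col][c]
    sol = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = aug[r][3] - sum(aug[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / aug[r][r]
    return sol

def triangulate(cameras, points):
    """Least-squares 3D point (X, Y, Z) from any number of views."""
    A = []
    for P, x in zip(cameras, points):
        A.extend(triangulation_rows(P, x))
    # With X = (X, Y, Z, 1): A[:, :3] y = -A[:, 3], solved via normal equations.
    M = [[sum(r[i] * r[j] for r in A) for j in range(3)] for i in range(3)]
    b = [-sum(r[i] * r[3] for r in A) for i in range(3)]
    return solve_3x3(M, b)

# Hypothetical two-view setup: second camera shifted along the x-axis.
P1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
P2 = [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
X_est = triangulate([P1, P2], [(0.125, 0.05), (-0.125, 0.05)])
```

Because the rows are simply stacked, the same function accepts more than two views, which matches the first item under "Pros and Cons".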


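Triangulation: Non-linear Solution

• Minimize projected error while satisfying $\hat{x}'^\top F \hat{x} = 0$ for the corrected points $\hat{x}$, $\hat{x}'$

• Solution is a root of a degree-6 polynomial in t, minimizing $d(x, l(t))^2 + d(x', l'(t))^2$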

Further reading: HZ p. 318


Projective structure from motion

• Given: m images of n fixed 3D points

• $x_{ij} = P_i X_j$, i = 1, …, m, j = 1, …, n

• Problem: estimate m projection matrices $P_i$ and n 3D points $X_j$ from the mn corresponding points $x_{ij}$

[Figure: 3D point $X_j$ projected by cameras $P_1$, $P_2$, $P_3$ to image points $x_{1j}$, $x_{2j}$, $x_{3j}$]

Slides from Lana Lazebnik


Projective structure from motion

• Given: m images of n fixed 3D points

• $x_{ij} = P_i X_j$, i = 1, …, m, j = 1, …, n

• Problem: estimate m projection matrices $P_i$ and n 3D points $X_j$ from the mn corresponding points $x_{ij}$

• With no calibration info, cameras and points can only be recovered up to a 4x4 projective transformation Q:
– $X \rightarrow QX$, $P \rightarrow PQ^{-1}$

• We can solve for structure and motion when
– $2mn \geq 11m + 3n - 15$

• For two cameras, at least 7 points are needed
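The counting argument can be checked directly: each of the mn image points gives 2 equations, each camera has 11 unknowns, each 3D point has 3, and the 4x4 projective ambiguity removes 15. A minimal sketch, with hypothetical helper names:

```python
def enough_constraints(m, n):
    """True when 2mn measurements cover 11m + 3n - 15 unknowns."""
    return 2 * m * n >= 11 * m + 3 * n - 15

def min_points(m, max_n=1000):
    """Smallest number of points n that satisfies the inequality for m >= 2 views."""
    if m < 2:
        raise ValueError("structure from motion needs at least two views")
    for n in range(1, max_n + 1):
        if enough_constraints(m, n):
            return n
    return None

two_view = min_points(2)    # 4n >= 3n + 7, so n >= 7
three_view = min_points(3)  # 6n >= 3n + 18, so n >= 6
```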


Sequential structure from motion

• Initialize motion from two images using fundamental matrix

• Initialize structure by triangulation

• For each additional view:
– Determine projection matrix of new camera using all the known 3D points that are visible in its image (calibration)
– Refine and extend structure: compute new 3D points, re-optimize existing points that are also seen by this camera (triangulation)

• Refine structure and motion: bundle adjustment

[Figure: matrix of observations with axes "cameras" and "points"]


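Bundle adjustment

• Non-linear method for refining structure and motion

• Minimizing reprojection error:

$$E(P, X) = \sum_{i=1}^{m} \sum_{j=1}^{n} D(x_{ij}, P_i X_j)^2$$

[Figure: observed points $x_{1j}$, $x_{2j}$, $x_{3j}$ and the reprojections $P_1 X_j$, $P_2 X_j$, $P_3 X_j$ of point $X_j$]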
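To make the objective concrete, the sketch below only evaluates the reprojection error for given cameras and points. It does not perform the non-linear minimization. Storing observations in a dictionary keyed by (camera index, point index) is an assumption of this sketch, chosen so that points not visible in a view can simply be left out.

```python
def project(P, X):
    """Project a homogeneous 3D point X (4 values) with a 3x4 camera P."""
    h = [sum(p * c for p, c in zip(row, X)) for row in P]
    return (h[0] / h[2], h[1] / h[2])

def reprojection_error(cameras, points, observations):
    """Sum over observed (i, j) of the squared image distance D(x_ij, P_i X_j)^2."""
    total = 0.0
    for (i, j), (u, v) in observations.items():
        pu, pv = project(cameras[i], points[j])
        total += (u - pu) ** 2 + (v - pv) ** 2
    return total

# Hypothetical data: two cameras, one 3D point, exact and perturbed observations.
cameras = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
]
points = [[0.5, 0.2, 4.0, 1.0]]
exact = {(0, 0): (0.125, 0.05), (1, 0): (-0.125, 0.05)}
noisy = {(0, 0): (0.225, 0.05), (1, 0): (-0.125, 0.05)}

e_exact = reprojection_error(cameras, points, exact)
e_noisy = reprojection_error(cameras, points, noisy)
```

A bundle adjuster would pass a residual of this kind to a non-linear least-squares solver that updates all cameras and points jointly.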


Self-calibration

• Self-calibration (auto-calibration) is the process of determining intrinsic camera parameters directly from uncalibrated images

• For example, when the images are acquired by a single moving camera, we can use the constraint that the intrinsic parameter matrix remains fixed for all the images
– Compute initial projective reconstruction and find 3D projective transformation matrix Q such that all camera matrices are in the form $P_i = K [R_i \mid t_i]$

• Can use constraints on the form of the calibration matrix: zero skew


Summary so far

• From two images, we can:
– Recover fundamental matrix F
– Recover canonical cameras P and P’ from F
– Estimate 3D position values X for corresponding points x and x’

• For a moving camera, we can:
– Initialize by computing F, P, X for two images
– Sequentially add new images, computing new P, refining X, and adding points
– Auto-calibrate assuming fixed calibration matrix to upgrade the projective reconstruction to one that is correct up to a similarity transform


Photosynth

Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring photo collections in 3D," SIGGRAPH 2006

http://photosynth.net/


3D from multiple images

Building Rome in a Day: Agarwal et al. 2009


Plug: Steve Seitz Talk

• Steve Seitz will talk about “Reconstructing the World from Photos on the Internet”
– Monday, April 26th, 4pm in Siebel Center


Special case: Dense binocular stereo

• Fuse a calibrated binocular stereo pair to produce a depth image

[Figure: image 1, image 2, and the resulting dense depth map]

Many of these slides adapted from Steve Seitz and Lana Lazebnik


Basic stereo matching algorithm

• For each pixel in the first image
– Find corresponding epipolar line in the right image
– Examine all pixels on the epipolar line and pick the best match
– Triangulate the matches to get depth information

• Simplest case: epipolar lines are scanlines
– When does this happen?


Simplest Case: Parallel images

• Image planes of cameras are parallel to each other and to the baseline

• Camera centers are at same height

• Focal lengths are the same

• Then, epipolar lines fall along the horizontal scan lines of the images


Special case of fundamental matrix

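Epipolar constraint: $x'^\top E x = 0$, $E = [t]_\times R$

R = I, t = (T, 0, 0)

$$E = [t]_\times R = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -T \\ 0 & T & 0 \end{bmatrix}$$

$$\begin{bmatrix} u' & v' & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -T \\ 0 & T & 0 \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = 0 \;\Rightarrow\; \begin{bmatrix} u' & v' & 1 \end{bmatrix} \begin{bmatrix} 0 \\ -T \\ Tv \end{bmatrix} = 0 \;\Rightarrow\; Tv' = Tv$$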

The y-coordinates of corresponding points are the same!



Depth from disparity

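[Figure: camera centers O and O’ separated by baseline B, focal length f, scene point X at depth z, image points x and x’]

$$\text{disparity} = x - x' = \frac{B \cdot f}{z}$$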

Disparity is inversely proportional to depth!
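Inverting the relation gives depth from a measured disparity, z = B·f / (x − x’). A minimal sketch, assuming a rectified pair with focal length and disparity in pixels and baseline in metres (the units and the function name are assumptions, not from the slides):

```python
def depth_from_disparity(disparity, focal_length, baseline):
    """Depth z = B * f / d for a rectified stereo pair; requires d > 0."""
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return baseline * focal_length / disparity

# Hypothetical camera: f = 800 px, B = 0.25 m.
near = depth_from_disparity(80.0, focal_length=800.0, baseline=0.25)
far = depth_from_disparity(40.0, focal_length=800.0, baseline=0.25)
```

Halving the disparity doubles the depth, which is the inverse proportionality stated above.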


Stereo image rectification

• Reproject image planes onto a common plane parallel to the line between optical centers

• Pixel motion is horizontal after this transformation

• Two homographies (3x3 transform), one for each input image reprojection

C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. IEEE Conf. Computer Vision and Pattern Recognition, 1999.


Rectification example


Basic stereo matching algorithm

• If necessary, rectify the two stereo images to transform epipolar lines into scanlines

• For each pixel x in the first image
– Find corresponding epipolar scanline in the right image
– Examine all pixels on the scanline and pick the best match x’
– Compute disparity x−x’ and set depth(x) = B·f/(x−x’), which is proportional to 1/(x−x’)


Correspondence search

[Figure: left and right images with a scanline; plot of matching cost versus disparity]

• Slide a window along the right scanline and compare contents of that window with the reference window in the left image

• Matching cost: SSD or normalized correlation
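The window search can be sketched on a single scanline. The code below simplifies to 1D windows (real implementations typically use 2D patches) and searches only non-negative disparities d, so that the right-image position is x’ = x − d. The function names and the toy intensity rows are assumptions.

```python
def ssd(a, b):
    """Sum of squared differences between two equally sized windows."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def best_disparity(left_row, right_row, x, half_window, max_disparity):
    """Disparity d >= 0 minimizing SSD between the left window at x and the right window at x - d.

    The caller must keep the left window inside the row.
    """
    ref = left_row[x - half_window: x + half_window + 1]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        xr = x - d
        if xr - half_window < 0:
            break  # the right window would leave the image
        cand = right_row[xr - half_window: xr + half_window + 1]
        cost = ssd(ref, cand)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Toy scanlines: the left row is the right row shifted by 3 pixels.
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0]
d = best_disparity(left_row, right_row, x=6, half_window=1, max_disparity=8)
```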


Correspondence search

[Figure: left and right scanline windows with the SSD matching cost along the scanline]


Correspondence search

[Figure: left and right scanline windows with the normalized correlation matching cost along the scanline]
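One reason to prefer normalized correlation over SSD is that it tolerates brightness and contrast differences between the two images. The sketch below assumes the zero-mean variant (subtract each window's mean, then divide by the product of the norms). The slides do not specify which variant is meant.

```python
import math

def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equally sized windows, in [-1, 1]."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [p - mean_a for p in a]
    db = [q - mean_b for q in b]
    denom = math.sqrt(sum(p * p for p in da) * sum(q * q for q in db))
    if denom == 0.0:
        return 0.0  # a constant window carries no texture to correlate
    return sum(p * q for p, q in zip(da, db)) / denom

window = [9.0, 5.0, 7.0]
brighter = [2.0 * v + 10.0 for v in window]      # gain and offset change
score_same = normalized_correlation(window, brighter)
score_flat = normalized_correlation(window, [4.0, 4.0, 4.0])
```

Here the best match is the one with the highest score, the opposite of SSD, where the lowest cost wins.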


Effect of window size

– Smaller window
+ More detail
• More noise

– Larger window
+ Smoother disparity maps
• Less detail

[Figure: disparity maps for W = 3 and W = 20]


Failures of correspondence search

• Textureless surfaces

• Occlusions, repetition

• Non-Lambertian surfaces, specularities


Results with window search

[Figure: input data, window-based matching result, and ground truth]


How can we improve window-based matching?

• So far, matches are independent for each point

• What constraints or priors can we add?


Stereo constraints/priors

• Uniqueness
– For any point in one image, there should be at most one matching point in the other image

• Ordering
– Corresponding points should be in the same order in both views

[Figure: example where the ordering constraint doesn’t hold]


Non-local constraints

• Uniqueness
– For any point in one image, there should be at most one matching point in the other image

• Ordering
– Corresponding points should be in the same order in both views

• Smoothness
– We expect disparity values to change slowly (for the most part)


Stereo matching as energy minimization

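[Figure: images $I_1$ and $I_2$ with windows $W_1(i)$ and $W_2(i + D(i))$, and the disparity map D with value $D(i)$]

$$E = \alpha\, E_{\text{data}}(I_1, I_2, D) + \beta\, E_{\text{smooth}}(D)$$

$$E_{\text{data}} = \sum_{i} \left( W_1(i) - W_2(i + D(i)) \right)^2$$

$$E_{\text{smooth}} = \sum_{\text{neighbors } i,j} \rho\left( D(i) - D(j) \right)$$

• Energy functions of this form can be minimized using graph cuts

Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001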
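To show how a data term and a smoothness term trade off, the sketch below evaluates an energy of this form on one scanline. It does not minimize it, and no graph cut is implemented. Several simplifications are assumptions of the sketch: each window is a single pixel, neighbors are adjacent pixels on the scanline, the smoothness penalty is the absolute difference, and the weights `alpha` and `beta` default to 1.

```python
def stereo_energy(left_row, right_row, disparities, alpha=1.0, beta=1.0):
    """alpha * E_data + beta * E_smooth for a disparity labeling of one scanline."""
    data = 0.0
    for i, d in enumerate(disparities):
        j = i + d  # matched position in the second image
        if not 0 <= j < len(right_row):
            raise IndexError("disparity points outside the second image")
        data += (left_row[i] - right_row[j]) ** 2
    smooth = sum(abs(disparities[i] - disparities[i + 1])
                 for i in range(len(disparities) - 1))
    return alpha * data + beta * smooth

# Toy scanlines where the true offset is 1 everywhere.
left_row = [5, 1, 8, 3]
right_row = [0, 5, 1, 8, 3]
e_true = stereo_energy(left_row, right_row, [1, 1, 1, 1])
e_bad = stereo_energy(left_row, right_row, [1, 0, 1, 1])
```

The mislabeled pixel is penalized twice: once by the data term (a poor intensity match) and once by the smoothness term (it disagrees with both neighbors).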


Many of these constraints can be encoded in an energy function and solved using graph cuts

[Figure: graph cuts result compared with ground truth]

For the latest and greatest: http://www.middlebury.edu/stereo/

Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001


Summary

• Recap of epipolar geometry
– Epipoles are the intersections of the baseline with the image planes
– Matching point in second image is on a line passing through its epipole
– Fundamental matrix maps from a point in one image to an epipolar line in the other
– Can recover canonical camera matrices from F (with projective ambiguity)

• Recovering structure
– Triangulation to recover 3D position of two matched points in images with known projection matrices
– Sequential algorithm to recover structure from a moving camera, followed by auto-calibration by assuming fixed K
– Get depth from stereo pair by rectifying via homographies and searching across scanlines to match; depth is inversely proportional to disparity


Next class

• KLT tracking

• Elegant SFM method using tracked points, assuming orthographic projection

• Optical flow