psi5787 - virtual realityalemart/psi5787/presentation.pdf · psi5787 - virtual reality alexandre...

PSI5787 - Virtual Reality

Alexandre Martins

[email protected]

December 10, 2012

Overview

- The problem
- A few solutions
  - First approach
    - Homography estimation
    - Linear algebra review
    - Stereographic projection
  - Camera matrix formulation
  - Piecewise-Linear approximation
- Conclusion

The problem

We need a projection system for Spheree. We have:

- A spherical display surface D;
- N projectors: P1, ..., PN;
- A camera C (pinhole camera model).

How can we display an image I on the surface D?

The problem

We’ll somehow “align” the projectors by warping the projected images. This is known as geometric correction.

The problem

In order to display I on D, we need to determine how the individual projectors’ images map to I.

For projector Pi, let Gi be this mapping.

How can we find all the Gi’s?

The problem

D, which is a spherical display, is a parametric display surface.

The camera C can observe projected features from each Pi, thus establishing a mapping Hi from the i-th projector’s coordinates to the camera coordinates.

The camera C, which observes D, has its own coordinate system. Let FCD be the transform from the camera coordinates to the display coordinates.

We display image I on D. Let FDI be the mapping from the display coordinates to the image coordinates.

Finally, we can combine FCD and FDI to form F, a transform from the camera coordinates to the image coordinates.

The problem

...thus establishing a mapping Hi from the i-th projector’s coordinates to the camera coordinates.

...F, a transform from the camera coordinates to the image coordinates.

Hence, Gi, which is a mapping from the i-th projector’s coordinates to the image coordinates, is merely their composition (apply Hi first, then F):

Gi = F · Hi

The problem

For the VR class, we just want to find a transform from camera to image.

Our problem is this:

Find F .

No projectors are involved!!!

First approach

Our first attempt at finding some F...

Find:

- FCD: camera to display surface
- FDI: display surface to image

Then, set F = FDI · FCD

First approach: FCD, camera to display surface

Let:

- S be the unit sphere in R^3 centered at the origin O;
- π be the plane z = z0, where −1 < z0 < 0;
- C = (u, v, w) be the position of the optical center of the camera, where we enforce w < z0.


We’ll say that the base B of the sphere S is the circle (together with its interior) in π with radius $(1 - z_0^2)^{1/2}$ centered at (0, 0, z0).

Also, points in S having z-coordinate less than z0 are said to be “invalid”. Points that are not invalid are called “valid”.


Given: q = (qx, qy), the position of a pixel in the camera.
Want: the corresponding point P = (x, y, z) in S.

Assume that there is a bijective transform H from the camera coordinates to the plane π, taking q to some P′ such that P′ is the intersection between the line CP and the plane π.


We have: S, π: z = z0, C = (u, v, w), q = (qx, qy).
We want: P = (x, y, z) ∈ S corresponding to pixel q.

Outline of the algorithm:

1. Let P′ = (x0, y0, z0) ∈ π be H applied to q. That is, P′ = H q.

2. If P′ ∉ B, then return null (say that pixel q is garbage). In other words, return null if $x_0^2 + y_0^2 > 1 - z_0^2$.

3. Walk from C in the direction P′ − C until you find, in S, a valid point, which will be P.

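Notice that C, P′ and P are collinear. Hence,

$$
(P' - C) \times (P - P') = 0
$$

Expanding the cross product:

$$
0 = (P' - C) \times (P - P') =
\begin{vmatrix}
i & j & k \\
x_0 - u & y_0 - v & z_0 - w \\
x - x_0 & y - y_0 & z - z_0
\end{vmatrix}
=
\begin{vmatrix} y_0 - v & z_0 - w \\ y - y_0 & z - z_0 \end{vmatrix} i
-
\begin{vmatrix} x_0 - u & z_0 - w \\ x - x_0 & z - z_0 \end{vmatrix} j
+
\begin{vmatrix} x_0 - u & y_0 - v \\ x - x_0 & y - y_0 \end{vmatrix} k
$$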


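Therefore, (P′ − C) × (P − P′) = 0 implies:

$$
x = x_0 + \left(\frac{z - z_0}{z_0 - w}\right)(x_0 - u), \qquad
y = y_0 + \left(\frac{z - z_0}{z_0 - w}\right)(y_0 - v)
$$

Similarly, (P′ − C) × (P − C) = 0 implies:

$$
x_0 = u + \left(\frac{z_0 - w}{z - w}\right)(x - u), \qquad
y_0 = v + \left(\frac{z_0 - w}{z - w}\right)(y - v)
$$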


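Now, plug the first two equations into the sphere equation and solve for (x, y, z):

$$
\begin{aligned}
x &= x_0 + \left(\frac{z - z_0}{z_0 - w}\right)(x_0 - u) \\
y &= y_0 + \left(\frac{z - z_0}{z_0 - w}\right)(y_0 - v) \\
x^2 + y^2 + z^2 &= 1
\end{aligned}
$$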


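Let:

$$
\begin{aligned}
a &= 1 + \left(\frac{x_0 - u}{z_0 - w}\right)^2 + \left(\frac{y_0 - v}{z_0 - w}\right)^2 \\
b &= 2\left\{\left[x_0 - \left(\frac{x_0 - u}{z_0 - w}\right) z_0\right]\left(\frac{x_0 - u}{z_0 - w}\right) + \left[y_0 - \left(\frac{y_0 - v}{z_0 - w}\right) z_0\right]\left(\frac{y_0 - v}{z_0 - w}\right)\right\} \\
c &= -1 + \left[x_0 - \left(\frac{x_0 - u}{z_0 - w}\right) z_0\right]^2 + \left[y_0 - \left(\frac{y_0 - v}{z_0 - w}\right) z_0\right]^2
\end{aligned}
$$

Solve $a z^2 + b z + c = 0$ for z.

There are at most two real solutions. Since P′ ∈ B, only one of them results in a valid point (i.e., not below the base) in S: the greater one.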

Then, find the values of x and y . Now you know P.
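The steps above can be sketched in Python as follows. This is a minimal sketch, assuming P′ has already been obtained from the pixel q through H; the function name `plane_point_to_sphere` and its argument layout are illustrative and do not come from the slides.

```python
import math

def plane_point_to_sphere(p_plane, cam):
    """Map P' = (x0, y0, z0) on the plane z = z0 to the valid point P on the unit sphere.

    p_plane: (x0, y0, z0), the image of the pixel q under H (assumed given).
    cam:     (u, v, w), optical center of the camera, with w < z0.
    Returns (x, y, z), or None when P' lies outside the base B.
    """
    x0, y0, z0 = p_plane
    u, v, w = cam

    # Step 2: reject pixels that fall outside the base of the sphere.
    if x0 * x0 + y0 * y0 > 1.0 - z0 * z0:
        return None

    # Slopes of the line through C and P', parametrized by z.
    m = (x0 - u) / (z0 - w)
    n = (y0 - v) / (z0 - w)
    ex = x0 - m * z0  # x = ex + m * z
    ey = y0 - n * z0  # y = ey + n * z

    # Coefficients of a z^2 + b z + c = 0, as defined on the slide.
    a = 1.0 + m * m + n * n
    b = 2.0 * (ex * m + ey * n)
    c = -1.0 + ex * ex + ey * ey

    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # guard against round-off; not expected when P' is in B

    # Step 3: the greater root is the valid intersection (not below the base).
    z = (-b + math.sqrt(disc)) / (2.0 * a)
    return (ex + m * z, ey + n * z, z)
```

For instance, with the camera at (0, 0, −3) and P′ = (0, 0, −0.5), the line is the z-axis and the function returns the north pole of the sphere.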


But:

...there is a bijective transform H from the camera to π...

Can we compute such an H?

Homography estimation

First approach: Homography estimation

Recall that RP^2, the 2-dimensional projective space, is the set of lines in R^3 passing through the origin (except the origin itself).

A homography is a bijective transformation from a projective space to itself that maps straight lines to straight lines.

Consider two coordinate systems represented by planes:

- the image plane I(x, y) and
- the world plane W(x′, y′).


Figure: homography. Extracted from mmlab.disi.unitn.it/wiki.


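Matrix notation:

$$
\begin{bmatrix} s\,x \\ s\,y \\ s \end{bmatrix}
=
\overbrace{\begin{bmatrix} a_1 & a_2 & a_3 \\ a_4 & a_5 & a_6 \\ a_7 & a_8 & a_9 \end{bmatrix}}^{H^{-1}}
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
$$

$H^{-1}$ maps world to image. s ≠ 0.

Given: a set of m correspondences (xi, yi) ←→ (x′i, y′i).

Want: estimate that matrix.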


Now,

$$
x = \frac{s\,x}{s} = \frac{a_1 x' + a_2 y' + a_3}{a_7 x' + a_8 y' + a_9}
$$

$$
y = \frac{s\,y}{s} = \frac{a_4 x' + a_5 y' + a_6}{a_7 x' + a_8 y' + a_9}
$$


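That is,

$$
\begin{aligned}
a_1 x' + a_2 y' + a_3 - a_7 x' x - a_8 y' x - a_9 x &= 0 \\
a_4 x' + a_5 y' + a_6 - a_7 x' y - a_8 y' y - a_9 y &= 0
\end{aligned}
$$

Which means,

$$
\begin{bmatrix}
x' & y' & 1 & 0 & 0 & 0 & -x'x & -y'x & -x \\
0 & 0 & 0 & x' & y' & 1 & -x'y & -y'y & -y
\end{bmatrix}
\begin{bmatrix} a_1 \\ \vdots \\ a_9 \end{bmatrix}
= 0
$$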



Want: estimate a.

$$
\begin{bmatrix}
x'_1 & y'_1 & 1 & 0 & 0 & 0 & -x'_1 x_1 & -y'_1 x_1 & -x_1 \\
0 & 0 & 0 & x'_1 & y'_1 & 1 & -x'_1 y_1 & -y'_1 y_1 & -y_1 \\
x'_2 & y'_2 & 1 & 0 & 0 & 0 & -x'_2 x_2 & -y'_2 x_2 & -x_2 \\
0 & 0 & 0 & x'_2 & y'_2 & 1 & -x'_2 y_2 & -y'_2 y_2 & -y_2 \\
 & & & & \vdots & & & & \\
x'_m & y'_m & 1 & 0 & 0 & 0 & -x'_m x_m & -y'_m x_m & -x_m \\
0 & 0 & 0 & x'_m & y'_m & 1 & -x'_m y_m & -y'_m y_m & -y_m
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ \vdots \\ a_8 \\ a_9 \end{bmatrix}
= 0
$$

Solution??? Trivial solution: a = 0 always satisfies this system, so we need an extra constraint on a.


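Given: a set of m ≥ 4 correspondences (xi, yi) ←→ (x′i, y′i).

Want: estimate a, where we enforce a9 = 1.

$$
\begin{bmatrix}
x'_1 & y'_1 & 1 & 0 & 0 & 0 & -x'_1 x_1 & -y'_1 x_1 \\
0 & 0 & 0 & x'_1 & y'_1 & 1 & -x'_1 y_1 & -y'_1 y_1 \\
x'_2 & y'_2 & 1 & 0 & 0 & 0 & -x'_2 x_2 & -y'_2 x_2 \\
0 & 0 & 0 & x'_2 & y'_2 & 1 & -x'_2 y_2 & -y'_2 y_2 \\
 & & & \vdots & & & & \\
x'_m & y'_m & 1 & 0 & 0 & 0 & -x'_m x_m & -y'_m x_m \\
0 & 0 & 0 & x'_m & y'_m & 1 & -x'_m y_m & -y'_m y_m
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ \vdots \\ a_7 \\ a_8 \end{bmatrix}
=
\begin{bmatrix} x_1 \\ y_1 \\ x_2 \\ y_2 \\ \vdots \\ x_m \\ y_m \end{bmatrix}
$$

Solution???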

Linear algebra review

First approach: Linear algebra review

Let:

- A ∈ R^(m×n), m ≥ n;
- b ∈ R^m.

The system of equations

A x = b

may not have a solution!

Then, let’s find the x that minimizes $\|b - Ax\|_2$.


b may not belong to Im(A).


y is the orthogonal projection of b onto Im(A): y = Pb.


Theorem: A vector x minimizes the 2-norm of the error r = b − Ax if, and only if, Pb = Ax, where P projects b orthogonally onto Im(A).


Proof: we’ll show that y = Pb is the only guy that minimizes ‖z − b‖ among all z’s in Im(A).

This has to be the case because, for any z ≠ y in Im(A), we have y − z ∈ Im(A) and y − b ⊥ Im(A), so the Pythagorean theorem gives:

$$
\|z - b\|^2 = \overbrace{\|y - z\|^2}^{> 0} + \|y - b\|^2 > \|y - b\|^2
$$

Therefore, y is better than z.


Theorem: x is such that Pb = Ax if, and only if, r ⊥ Im(A).

Proof:

1. (⇐) Suppose that r ⊥ Im(A). We know that b = y + r for some y ∈ Im(A). That said, we have y = Pb. Now, since Pb ∈ Im(A), we can just set x such that Pb = Ax.

2. (⇒) Suppose that Pb = Ax. Observe that (b − Pb) ⊥ Im(A). Now, $A^t r = A^t (b - Ax) = A^t (b - Pb) = 0$.


Fact: r ⊥ Im(A) ⇐⇒ $A^t A x = A^t b$.

Proof:

$$
\begin{aligned}
A^t r &= 0 \\
A^t (b - Ax) &= 0 \\
A^t b - A^t A x &= 0 \\
A^t b &= A^t A x
\end{aligned}
$$

Therefore, a vector x minimizes the 2-norm of the error r = b − Ax if, and only if, that last equation is true.


Finally, if $A^t A$ is non-singular, then the last fact suggests picking:

$$
x = (A^t A)^{-1} A^t b
$$

in order to minimize the error ‖b − Ax‖.

Note: $(A^t A)^{-1} A^t$ is known as the pseudo-inverse matrix.

Alternatives: SVD, QR.


Use that to compute homography H.
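As a minimal sketch of that last step, assuming NumPy is available, the a9 = 1 system can be assembled from the correspondences and solved through the normal equations. The function names `estimate_homography` and `apply_homography` are illustrative, not taken from the slides or from the demo.

```python
import numpy as np

def estimate_homography(world_pts, image_pts):
    """Least-squares estimate of the 3x3 matrix taking world (x', y') to image (x, y).

    Enforces a9 = 1 and needs at least 4 correspondences with no three world
    points collinear. In the notation of the slides this matrix is H^{-1}.
    """
    rows, rhs = [], []
    for (xw, yw), (x, y) in zip(world_pts, image_pts):
        # Two equations per correspondence, unknowns a1..a8.
        rows.append([xw, yw, 1.0, 0.0, 0.0, 0.0, -xw * x, -yw * x])
        rows.append([0.0, 0.0, 0.0, xw, yw, 1.0, -xw * y, -yw * y])
        rhs.extend([x, y])
    A = np.asarray(rows, dtype=float)
    b = np.asarray(rhs, dtype=float)

    # Normal equations: A^t A a = A^t b (assumes A^t A is non-singular).
    a = np.linalg.solve(A.T @ A, A.T @ b)
    return np.append(a, 1.0).reshape(3, 3)

def apply_homography(M, pt):
    """Apply a 3x3 homography to a 2D point, dividing by the scale s."""
    xh, yh, s = M @ np.array([pt[0], pt[1], 1.0])
    return np.array([xh / s, yh / s])
```

Since the returned matrix plays the role of $H^{-1}$ (world to image), `np.linalg.inv` of it would give a candidate for H. When the system is poorly conditioned, `np.linalg.lstsq(A, b, rcond=None)`, which is SVD-based, is one of the alternatives mentioned above and may be the safer choice.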

We have computed FCD: the transformation from the camera coordinates to the coordinates of the display surface.

Now we want to compute FDI: the mapping from the surface to the image displayed on it.

Stereographic projection

First approach: FDI, display surface to image

Stereographic projection is a bijective, smooth mapping that projects a sphere onto a plane.

Let:

- S be the unit sphere in R^3 centered at the origin;
- Q = (0, 0, −1) be the “south pole”;
- π be the plane z = 1.

Our projection is defined on S − {Q}.


Let:

- P = (x, y, z) be a point in S − {Q};
- r be the line connecting P and Q.

We define the stereographic projection P′ = (x′, y′) of P as r ∩ π.


Given: P = (x, y, z) ∈ S − {Q}.

Want: P′ = (x′, y′) ∈ π: P projected from the “south pole”.

We find:

$$
x' = \frac{2x}{1 + z}, \qquad y' = \frac{2y}{1 + z}
$$

We also find the inverse transform:

$$
x = \frac{4x'}{4 + (x')^2 + (y')^2}, \qquad
y = \frac{4y'}{4 + (x')^2 + (y')^2}, \qquad
z = \frac{4 - (x')^2 - (y')^2}{4 + (x')^2 + (y')^2}
$$
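The two formulas can be checked against each other with a short round trip. The sketch below is illustrative only; the function names `stereographic` and `inverse_stereographic` are assumptions, and the projection follows the setup above (south pole Q = (0, 0, −1), plane z = 1).

```python
def stereographic(p):
    """Project P = (x, y, z) on the unit sphere from the south pole onto z = 1."""
    x, y, z = p
    if z <= -1.0:
        raise ValueError("the projection is undefined at the south pole")
    k = 2.0 / (1.0 + z)
    return (k * x, k * y)

def inverse_stereographic(pp):
    """Map P' = (x', y') on the plane z = 1 back to the unit sphere."""
    xp, yp = pp
    d = 4.0 + xp * xp + yp * yp
    return (4.0 * xp / d, 4.0 * yp / d, (4.0 - xp * xp - yp * yp) / d)
```

With these definitions the north pole maps to the origin of the plane, and points of the equator land on the circle of radius 2.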


How to find FDI?

1. Compute the stereographic projection;

2. Rescale the projection to a suitable region, say, [−1, 1] × [−1, 1].

We have FCD and FDI. Therefore, we have F.

Note: one might use a different parametrization for the image (e.g., geodesic polar coordinates).

First approach: F, camera to image

First approach: Conclusion

In practice, this formulation requires:

- At least 4 correspondences between camera pixels and points on the sphere
- Diameter of the sphere and of the base of the sphere
- Position of the camera

Working JavaScript demo:

1. www.ime.usp.br/~alemart/psi5787/spheree/

2. www.ime.usp.br/~alemart/psi5787/spheree/index2.html

Camera matrix formulation

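Camera matrix formulation: Introduction

A camera matrix C is a 3 × 4 matrix used to describe the transform of a pinhole camera from points in 3D space to 2D points in the image.

$$
\begin{bmatrix} s\,x' \\ s\,y' \\ s \end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
$$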

Facts:

- C has 11 DoF;
- Encodes intrinsics and extrinsics;
- Use ≥ 6 correspondences (x′, y′) ↔ (x, y, z) to estimate it.
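The slides do not say how C is estimated. One common option, shown here only as a sketch and not as the author's method, is the homogeneous direct linear transform: each correspondence contributes two linear equations in the 12 entries of C, and the singular vector with the smallest singular value gives C up to scale. The names `estimate_camera_matrix` and `project` are illustrative, and NumPy is assumed.

```python
import numpy as np

def estimate_camera_matrix(points_3d, points_2d):
    """Estimate a 3x4 camera matrix (up to scale) from >= 6 correspondences.

    points_3d: iterable of (x, y, z); points_2d: iterable of (x', y').
    Assumes the points are in general position (e.g. not all coplanar).
    """
    rows = []
    for (x, y, z), (xp, yp) in zip(points_3d, points_2d):
        X = [x, y, z, 1.0]
        # s x' = row1 . X and s = row3 . X  =>  row1 . X - x' (row3 . X) = 0
        rows.append(X + [0.0] * 4 + [-xp * v for v in X])
        # Same elimination of s for y' with row2.
        rows.append([0.0] * 4 + X + [-yp * v for v in X])
    A = np.asarray(rows, dtype=float)

    # Right singular vector of the smallest singular value spans the null space.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)

def project(C, p):
    """Project a 3D point with camera matrix C and divide by the scale s."""
    xh, yh, s = C @ np.array([p[0], p[1], p[2], 1.0])
    return np.array([xh / s, yh / s])
```

Because C is only defined up to a non-zero scale, the estimate should be compared with a reference through reprojected pixels rather than entry by entry.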

Camera matrix formulation: Introduction

In order to find F, we need:

- FCD: camera to display surface (we’ll see);
- FDI: display surface to image (as before).

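Camera matrix formulation: FCD, camera to display surface

Solve this system of equations:

$$
\begin{bmatrix} s\,x' \\ s\,y' \\ s \end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix},
\qquad
x^2 + y^2 + z^2 = 1
$$

Using C, write x and y as functions of z (x′, y′ are fixed):

$$
x = \alpha_1 z + \alpha_0, \qquad y = \beta_1 z + \beta_0
$$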


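Let $\Delta = (\alpha_0\alpha_1 + \beta_0\beta_1)^2 - (1 + \alpha_1^2 + \beta_1^2)(\alpha_0^2 + \beta_0^2 - 1)$.

Outline of the algorithm:

- if Δ < 0, discard the pixel;
- else, for k ∈ {0, 1}, let zk be:

$$
z_k = \frac{-(\alpha_0\alpha_1 + \beta_0\beta_1) + (-1)^k \sqrt{\Delta}}{1 + \alpha_1^2 + \beta_1^2}
$$

- and sk be:

$$
s_k =
\begin{bmatrix}
\dfrac{a_{11}(\alpha_1 z_k + \alpha_0) + a_{12}(\beta_1 z_k + \beta_0) + a_{13} z_k + a_{14}}{a_{31}(\alpha_1 z_k + \alpha_0) + a_{32}(\beta_1 z_k + \beta_0) + a_{33} z_k + a_{34}} \\[2ex]
\dfrac{a_{21}(\alpha_1 z_k + \alpha_0) + a_{22}(\beta_1 z_k + \beta_0) + a_{23} z_k + a_{24}}{a_{31}(\alpha_1 z_k + \alpha_0) + a_{32}(\beta_1 z_k + \beta_0) + a_{33} z_k + a_{34}}
\end{bmatrix}
$$

- z0: upper half of the sphere. z1: lower half. Which one should you pick?
- set $\hat{k}$ to $\arg\min_k \left\| \begin{bmatrix} x' \\ y' \end{bmatrix} - s_k \right\|$;
- return point P = (x, y, z), where:
  - $x = \alpha_1 z_{\hat{k}} + \alpha_0$;
  - $y = \beta_1 z_{\hat{k}} + \beta_0$;
  - $z = z_{\hat{k}}$.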

So, now we know FCD .
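The slides leave the coefficients α and β implicit. One way to obtain them, sketched below under the assumption that NumPy is available, is to eliminate s from the projection equations, which leaves two linear equations in x and y with z as a parameter. The function name `pixel_to_sphere_candidates` is illustrative; the sketch returns both intersections and leaves the choice of $\hat{k}$ to the caller.

```python
import math
import numpy as np

def pixel_to_sphere_candidates(C, pixel):
    """Intersect the back-projected line of a pixel with the unit sphere.

    C: 3x4 camera matrix; pixel: (x', y').
    Returns [] if Delta < 0, else [(x, y, z) for k = 0, (x, y, z) for k = 1].
    Raises numpy.linalg.LinAlgError when x and y cannot be written as
    functions of z (the 2x2 system below is singular).
    """
    C = np.asarray(C, dtype=float)
    xp, yp = pixel

    # Eliminating s: (row1 - x' row3) . X = 0 and (row2 - y' row3) . X = 0.
    r1 = C[0] - xp * C[2]
    r2 = C[1] - yp * C[2]
    M = np.array([r1[:2], r2[:2]])

    # M [x, y]^t = -(z [r1_z, r2_z]^t + [r1_w, r2_w]^t)
    a1, b1 = np.linalg.solve(M, -np.array([r1[2], r2[2]]))  # alpha_1, beta_1
    a0, b0 = np.linalg.solve(M, -np.array([r1[3], r2[3]]))  # alpha_0, beta_0

    q = 1.0 + a1 * a1 + b1 * b1
    h = a0 * a1 + b0 * b1
    delta = h * h - q * (a0 * a0 + b0 * b0 - 1.0)
    if delta < 0.0:
        return []  # the line misses the sphere: discard the pixel

    candidates = []
    for k in (0, 1):
        z = (-h + (-1) ** k * math.sqrt(delta)) / q
        candidates.append((a1 * z + a0, b1 * z + b0, z))
    return candidates
```

Both candidates lie on the same back-projected line, so in exact arithmetic each of them reprojects to the input pixel. For that reason the sketch does not pick one itself.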

Camera matrix formulation: Conclusion

In practice, this formulation requires:

- At least 6 correspondences camera ←→ sphere;
- No need to specify the position of the camera!!

Not implemented :(

Piecewise-Linear approximation

Piecewise-Linear approximation: Introduction

Cover any parametric surface with lots of patches.

Piecewise-Linear approximation: Algorithm

Given:

- a point p′ = (x′, y′) in the camera space;
- two dense sets of quads (i.e., convex quadrilaterals):
  1. a set of quads in camera space: {q1, ..., qm};
  2. a set of quads in display space: {Q1, ..., Qm};
  3. we assume that quad qi corresponds to Qi, for all i.

Want:

- transform p′: camera → display surface.

Sketch of the algorithm:

1. for all j:
   1.1 let:
       - cj be the centroid of quad qj;
       - Cj be the centroid of quad Qj.
   1.2 estimate homography Hj using quads (qj, Qj);
       - coordinate system: origin at the centroids
2. return a convex combination of {Cj + Rj Hj (p′ − cj)}
   - where Rj converts to the plane (in 3D) formed by Qj

Piecewise-Linear approximation: Convex combination

Given a finite set of points {u1, u2, ..., uk} in 3D space, a convex combination of them is a point:

$$
\alpha_1 u_1 + \alpha_2 u_2 + \ldots + \alpha_k u_k
$$

where:

- scalar αi ≥ 0 for all i;
- $\sum_{j=1}^{k} \alpha_j = 1$.


One might use indicator variables:

$$
\alpha_j =
\begin{cases}
1 & \text{if } p' \in q_j \\
0 & \text{otherwise}
\end{cases}
$$

The display surface will be approximated by linear patches.


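Alternatively,

$$
\alpha_j = \frac{1}{\sum_{i=1}^{m} \exp\left(-\frac{1}{2}\left(\frac{d_i}{\sigma}\right)^2\right)} \exp\left(-\frac{1}{2}\left(\frac{d_j}{\sigma}\right)^2\right)
$$

where:

- σ is a fixed constant (say, 1);
- di = ‖p′ − ci‖ for all i.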
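The Gaussian weights above can be computed in a few lines. This is a minimal sketch assuming NumPy; `gaussian_weights` and `blend` are illustrative names, and the per-quad estimates Cj + Rj Hj (p′ − cj) are assumed to be computed elsewhere and passed in as 3D points.

```python
import numpy as np

def gaussian_weights(p, centroids, sigma=1.0):
    """Weights alpha_j proportional to exp(-(d_j / sigma)^2 / 2), d_j = ||p - c_j||.

    p: (x', y') in camera space; centroids: m x 2 array of quad centroids c_j.
    Assumes at least one centroid is close enough that the sum does not underflow.
    """
    d = np.linalg.norm(np.asarray(centroids, dtype=float) - np.asarray(p, dtype=float), axis=1)
    w = np.exp(-0.5 * (d / sigma) ** 2)
    return w / w.sum()  # normalization makes this a convex combination

def blend(estimates_3d, weights):
    """Convex combination of the per-quad 3D estimates (m x 3 array)."""
    return np.asarray(weights, dtype=float) @ np.asarray(estimates_3d, dtype=float)
```

A small σ makes the result follow the nearest quad closely, similar to the indicator variables, while a larger σ blends neighbouring patches more smoothly.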


Figure: plot of $\exp(-x^2/2)$.


Alternative: use triangulations instead of homography matrices.


Given:

- a point p′ = (x′, y′) in the camera space;
- a dense set of correspondences camera ↔ display.

Want: move p′ to display space.

Sketch of the algorithm:

1. compute a triangulation in the camera space;

2. move that triangulation to display space;

3. suppose that p′ is in triangle T′ with vertices v1, v2, v3:
   - p′ can be written as a convex combination of the vertices:

$$
p' = \alpha_1 v_1 + \alpha_2 v_2 + (1 - \alpha_1 - \alpha_2) v_3
$$

4. move p′ to the corresponding triangle T in the display space, using the same weights on the vertices of T.
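Steps 3 and 4 amount to computing barycentric coordinates in the camera triangle and reusing them in the display triangle. The sketch below assumes NumPy and that the triangle containing p′ has already been found; `barycentric` and `map_point` are illustrative names.

```python
import numpy as np

def barycentric(p, tri):
    """Weights (alpha1, alpha2, 1 - alpha1 - alpha2) of p in a 2D triangle.

    tri: three vertices (v1, v2, v3); assumes the triangle is not degenerate.
    """
    v1, v2, v3 = np.asarray(tri, dtype=float)
    # p - v3 = alpha1 (v1 - v3) + alpha2 (v2 - v3)
    T = np.column_stack((v1 - v3, v2 - v3))
    a1, a2 = np.linalg.solve(T, np.asarray(p, dtype=float) - v3)
    return np.array([a1, a2, 1.0 - a1 - a2])

def map_point(p, tri_camera, tri_display):
    """Move p from a camera-space triangle to the matching display-space triangle."""
    w = barycentric(p, tri_camera)
    return w @ np.asarray(tri_display, dtype=float)
```

All three weights are non-negative exactly when p′ lies inside the triangle (or on its boundary), which also gives a simple containment test for step 3.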

Piecewise-Linear approximation: Conclusion

This method:

- requires a dense set of correspondences;
- models any parametric surface.

Conclusion

- Pinhole camera model
- First approach
  - ≥ 4 correspondences cam ↔ sphere
  - diameter of the sphere & base
  - position of the camera
- Camera matrix formulation
  - ≥ 6 correspondences cam ↔ sphere
- Piecewise-Linear approximation
  - dense set of correspondences
  - models any parametric surface

the end
