Algebraic Functions of Views for 3D Object Recognition. CS773C Advanced Machine Intelligence Applications, Spring 2008: Object Recognition

Post on 15-Jan-2016


Page 1: Algebraic Functions of Views for 3D Object Recognition CS773C Advanced Machine Intelligence Applications Spring 2008: Object Recognition

Algebraic Functions of Views for 3D Object Recognition

CS773C Advanced Machine Intelligence Applications

Spring 2008: Object Recognition

Page 2

Object Appearance

• The appearance of an object can vary widely due to:

– Photometric effects

– Scene clutter

– Changes in shape (e.g., non-rigid objects)

– Viewpoint changes

Page 3

Algebraic Functions of Views (AFoVs)

• A powerful mathematical foundation for investigating variations in the geometrical appearance of an object due to viewpoint changes.

“the variety of 2D views depicting the geometrical appearance of a 3D object can be expressed as a combination of a small number of 2D views of the object”

S. Ullman and R. Basri, "Recognition by Linear Combinations of Models", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 10, pp. 992-1006, 1991.

Page 4

Orthographic Projection

• Case of 3D rigid transformations (3 ref. views)

$$x = a_1 x' + a_2 x'' + a_3 x''' + a_4$$

$$y = b_1 y' + b_2 y'' + b_3 y''' + b_4$$

Page 5

Orthographic Projection

• Case of 3D linear transformations (2 ref. views)

$$x = a_1 x' + a_2 y' + a_3 x'' + a_4$$

$$y = b_1 x' + b_2 y' + b_3 x'' + b_4$$
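The two-view linear combination can be checked numerically. The sketch below is only an illustration under stated assumptions: the 3D points, the poses, and the helper names `rotation` and `orthographic_view` are made up, the first reference view is taken at the identity pose, and the coefficients are recovered by ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

def rotation(ax, ay, az):
    """3D rotation matrix built from rotations about the x, y and z axes (radians)."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rz @ ry @ rx

def orthographic_view(points, rot, shift):
    """Rotate the 3D points, drop the depth coordinate, then translate in the image."""
    return (points @ rot.T)[:, :2] + shift

points = rng.uniform(-1.0, 1.0, size=(12, 3))                      # hypothetical object
view1 = orthographic_view(points, np.eye(3), np.zeros(2))          # (x', y')
view2 = orthographic_view(points, rotation(0.2, 0.5, 0.1), np.zeros(2))  # (x'', y'')
novel = orthographic_view(points, rotation(-0.3, 0.9, 0.4), np.array([0.3, -0.2]))

# Each row of P is (x', y', x'', 1), matching the two equations above.
P = np.column_stack([view1[:, 0], view1[:, 1], view2[:, 0], np.ones(len(points))])
a, *_ = np.linalg.lstsq(P, novel[:, 0], rcond=None)
b, *_ = np.linalg.lstsq(P, novel[:, 1], rcond=None)

residual_x = np.max(np.abs(P @ a - novel[:, 0]))
residual_y = np.max(np.abs(P @ b - novel[:, 1]))
```

With noise-free points both residuals should be at the level of floating-point round-off, which is what "the novel view is a linear combination of the reference views" means in practice.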

Page 6

More Results …

• Perspective projection (2 ref. views, obtained under orthographic projection)

$$x = \frac{a_1 x'' + a_2 x' + a_3 y' + a_4}{a_5 x'' + a_6 x' + a_7 y' + a_8}, \qquad y = \frac{b_1 x'' + b_2 x' + b_3 y' + b_4}{b_5 x'' + b_6 x' + b_7 y' + b_8}$$

• Objects with smooth surfaces and non-rigid objects

– More reference views are required.

A. Shashua, “Algebraic functions for recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 8, pp. 779-789, 1995.

Page 7

A Word of Caution!

• Only common features in the reference views can be predicted in a novel view.

novel view

reference view reference view

Page 8

Recognition Framework Using AFoVs

“novel 2D views of a 3D object can be recognized by matching them to combinations of a small number of known 2D views of the object”

Page 9

Representation and Matching using AFoVs

• Representation

– Objects are represented by a small number of views.

– Each view is represented by some geometric features (e.g., points).

• Matching

– Predict the geometric appearance of an object in a novel view by combining a small number of reference views of the object.

Page 10

Advantages of the Method

• No 3D models or camera calibration are required.

• Only a small number of 2D views are required.

• Novel views can be different from the stored ones.

• Simpler verification scheme.

• More general framework (“family” of methods).

• Evidence that the human visual system works similarly.

Page 11

Main Challenges

• Which model views to combine to predict a novel view?

• How to establish the correspondences between novel and reference views?

• How to find the coefficients of the combination?

• How to handle occlusions?

• How to choose the reference views?

Integrate AFoVs with Indexing!

Page 12

Overview of the Method (cont’d)

Page 13

Which Model Groups to Choose?

• Cluster geometric features into higher level descriptions.

• Consider properties that are unlikely to occur at random.

• Property used in our work: convexity

Page 14

Which Model Groups to Choose? (cont’d)

Page 15

How to generate the appearances of a group?

• Estimate each parameter’s range of values:

$$a_i \in [a_i^{\min}, a_i^{\max}], \qquad b_i \in [b_i^{\min}, b_i^{\max}]$$

• Sample the space of parameter values $(a_1, a_2, a_3, a_4)$ and $(b_1, b_2, b_3, b_4)$

• Generate a new appearance for each sample of values:

$$x = a_1 x' + a_2 y' + a_3 x'' + a_4$$

$$y = b_1 x' + b_2 y' + b_3 x'' + b_4$$
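The sampling step can be sketched as a grid over the coefficient ranges. This is a minimal illustration, not the authors' implementation: the point coordinates, the ranges, the uniform grid, and the names `sample_range` and `generate_appearances` are all assumptions, and only the x-coordinates are generated.

```python
from itertools import product

def sample_range(lo, hi, steps):
    """Return `steps` evenly spaced values covering [lo, hi]."""
    if steps == 1:
        return [(lo + hi) / 2.0]
    return [lo + k * (hi - lo) / (steps - 1) for k in range(steps)]

def generate_appearances(view1, view2_x, a_ranges, steps):
    """Yield (coefficients, predicted x-coordinates) for every grid sample.

    view1    : list of (x', y') points from the first reference view
    view2_x  : list of x'' coordinates from the second reference view
    a_ranges : four (min, max) pairs, one per coefficient a1..a4
    """
    grids = [sample_range(lo, hi, steps) for lo, hi in a_ranges]
    for a1, a2, a3, a4 in product(*grids):
        xs = [a1 * x1 + a2 * y1 + a3 * x2 + a4
              for (x1, y1), x2 in zip(view1, view2_x)]
        yield (a1, a2, a3, a4), xs

# Hypothetical group of three points and hypothetical coefficient ranges.
view1 = [(0.1, 0.2), (0.5, 0.4), (0.9, 0.7)]
view2_x = [0.15, 0.45, 0.80]
a_ranges = [(-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (0.0, 0.5)]

appearances = list(generate_appearances(view1, view2_x, a_ranges, steps=3))
```

The number of generated appearances grows as `steps ** 4` per group, which is why the later slides worry about coarse sampling and storage.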

Page 16

Estimate the Range of Values of the Parameters

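$$\begin{bmatrix} x'_1 & y'_1 & x''_1 & 1 \\ x'_2 & y'_2 & x''_2 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x'_N & y'_N & x''_N & 1 \end{bmatrix} \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \\ a_4 & b_4 \end{bmatrix} = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \vdots & \vdots \\ x_N & y_N \end{bmatrix}$$

or

$$P c_1 = p_x \quad \text{and} \quad P c_2 = p_y$$

Using SVD:

$$P = U_P W_P V_P^T$$

$$c_1 = \sum_i \frac{u_i^P \cdot p_x}{w_{ii}^P}\, v_i^P \quad \text{and} \quad c_2 = \sum_i \frac{u_i^P \cdot p_y}{w_{ii}^P}\, v_i^P$$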
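The SVD-based solution can be written out term by term. The sketch below assumes NumPy, uses made-up reference coordinates and coefficients, and adds a small threshold `eps` for near-zero singular values (that threshold is an assumption, not something stated on the slide).

```python
import numpy as np

def solve_with_svd(P, p, eps=1e-10):
    """Least-squares solution of P c = p as sum_i ((u_i . p) / w_i) v_i."""
    U, w, Vt = np.linalg.svd(P, full_matrices=False)
    c = np.zeros(P.shape[1])
    for u_i, w_i, v_i in zip(U.T, w, Vt):
        if w_i > eps:                      # skip (near-)zero singular values
            c += (u_i @ p) / w_i * v_i
    return c

# Hypothetical reference-view coordinates for N = 5 points; rows are (x', y', x'', 1).
P = np.array([[0.10, 0.20, 0.15, 1.0],
              [0.50, 0.40, 0.45, 1.0],
              [0.90, 0.70, 0.80, 1.0],
              [0.30, 0.90, 0.20, 1.0],
              [0.70, 0.10, 0.95, 1.0]])
true_a = np.array([0.6, -0.2, 0.5, 0.1])   # made-up coefficients
p_x = P @ true_a                           # x-coordinates in the "novel" view

c1 = solve_with_svd(P, p_x)
```

Because the same matrix `P` appears in both systems, the decomposition only needs to be computed once and can be reused for `p_y`.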

Page 17

Estimate the Range of Values of the Parameters (cont’d)

• Assume normalized coordinates:

$$p_x \in [0,1], \qquad p_y \in [0,1]$$

• Use Interval Arithmetic (Moore, 1966):

$$P c_1^I = p_x^I, \qquad P c_2^I = p_y^I$$

(note that the solutions $c_1^I$, $c_2^I$ will be identical)

Interval operations, for $t = [t_1, t_2]$ and $r = [r_1, r_2]$:

$$t + r = [t_1 + r_1,\; t_2 + r_2]$$

$$t * r = [\min(t_1 r_1, t_1 r_2, t_2 r_1, t_2 r_2),\; \max(t_1 r_1, t_1 r_2, t_2 r_1, t_2 r_2)]$$
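The two interval operations are enough to bound any coefficient that is a weighted sum of coordinates lying in [0, 1]. The class and helper names below (`Interval`, `scale`, `coefficient_interval`) and the example weights are illustrative assumptions, not part of the original slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        # [t1, t2] + [r1, r2] = [t1 + r1, t2 + r2]
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # [t1, t2] * [r1, r2] = [min of the four products, max of the four products]
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

def scale(k, interval):
    """Multiply an interval by a real constant k (treated as the interval [k, k])."""
    return Interval(k, k) * interval

def coefficient_interval(weights, p_intervals):
    """Interval enclosure of sum_k weights[k] * p_k, with each p_k in its interval."""
    total = Interval(0.0, 0.0)
    for w, p in zip(weights, p_intervals):
        total = total + scale(w, p)
    return total

t, r = Interval(-1.0, 2.0), Interval(0.5, 3.0)
unit = Interval(0.0, 1.0)                        # normalized coordinate range
bound = coefficient_interval([2.0, -1.0, 0.5], [unit, unit, unit])
```

In the setting of the slides, the weights would come from a row of the pseudo-inverse of P, so large weights (an ill-conditioned P) directly produce wide coefficient intervals.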

Page 18

Example

Page 19

Preconditioning the Reference Views

• Transform the original views to new views

$$P_n = P C$$

such that $P_n$ has the best possible condition.

[Figure: effect of the condition number of P on the intervals]

Page 20

Preconditioning the Reference Views (cont’d)

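$$(U_P W_P V_P^T)(U_C W_C V_C^T) = U_{P_n} W_{P_n} V_{P_n}^T$$

• Choosing:

$$U_C = V_P$$

• This implies:

$$U_P (W_P W_C) V_C^T = U_{P_n} W_{P_n} V_{P_n}^T, \qquad W_{P_n} = W_C W_P$$

• Thus:

$$W_C = W_P^{-1}$$

(so that $W_{P_n} = I$ and $P_n$ has condition number 1)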
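The effect of this choice of C can be verified numerically. The sketch assumes NumPy and a made-up matrix P; it also takes V_C to be the identity, which the slide leaves unspecified, so treat that as an assumption of this example.

```python
import numpy as np

def precondition(P):
    """Return (C, P_n) with P_n = P C, using U_C = V_P and W_C = W_P^{-1}."""
    U, w, Vt = np.linalg.svd(P, full_matrices=False)
    C = Vt.T @ np.diag(1.0 / w)      # V_C is taken as the identity (assumption)
    return C, P @ C

# Hypothetical reference-view matrix; rows are (x', y', x'', 1).
P = np.array([[0.10, 0.20, 0.15, 1.0],
              [0.50, 0.40, 0.45, 1.0],
              [0.90, 0.70, 0.80, 1.0],
              [0.30, 0.90, 0.20, 1.0],
              [0.70, 0.10, 0.95, 1.0]])

C, P_n = precondition(P)
cond_before = np.linalg.cond(P)
cond_after = np.linalg.cond(P_n)
```

After the transformation all singular values of `P_n` equal 1, so `cond_after` should be 1 up to round-off while `cond_before` is much larger.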

Page 21

Example (preconditioned views)

Page 22

Decouple Image Coordinates

• Same transformation generates the x- and y-coordinates:

$$x = a_1 x' + a_2 y' + a_3 x'' + a_4$$

$$y = b_1 x' + b_2 y' + b_3 x'' + b_4$$

• Represent only the x-coordinates in the index table:

$$x = a_1 x' + a_2 y' + a_3 x'' + a_4$$

• For each group, store the following entry:

$$(\text{model}, v_1, v_2, \text{group})$$

Page 23

Hypothesis Generation and Verification

1. Take intersection of hypotheses

model

2. Apply constraints to reject invalid hypotheses

Page 24

How to Choose the Scene Groups?

• Use convex grouping to extract salient scene groups.

Page 25

Implementation Issues

• Space requirements

– select salient groups

– reject groups giving rise to ill-conditioned matrices

– coarse sampling of parameters

• Index computation and table size

Page 26

Important Implementation Issues (cont’d)

• Sampling step (i.e., parameters of AFoVs)

• Noise tolerance

actual:

$$x_i = a_1 x'_i + a_2 y'_i + a_3 x''_i + a_4$$

predicted:

$$\hat{x}_i = \hat{a}_1 x'_i + \hat{a}_2 y'_i + \hat{a}_3 x''_i + \hat{a}_4$$

$$|(x_i + e) - q_j^*| = |(x_i - q_j^*) + e| \le |x_i - q_j^*| + e$$

make additional entries in a neighborhood around the indexed location
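One way to picture the neighborhood entries is a quantized hash table in which each predicted appearance is stored in its own cell and in the adjacent cells. Everything in this sketch is an assumption made for illustration: the quantization step, the radius, the dictionary-based table, and the entry names.

```python
from collections import defaultdict
from itertools import product

def quantize(xs, step):
    """Map coordinates to integer cell indices."""
    return tuple(int(round(x / step)) for x in xs)

def neighborhood(key, radius):
    """All keys whose cells differ from `key` by at most `radius` along each axis."""
    offsets = range(-radius, radius + 1)
    return [tuple(k + d for k, d in zip(key, delta))
            for delta in product(offsets, repeat=len(key))]

def insert(table, xs, entry, step, radius=1):
    """Store `entry` at the indexed location and in the surrounding cells."""
    for key in neighborhood(quantize(xs, step), radius):
        table[key].add(entry)

def lookup(table, xs, step):
    """Return the entries stored in the cell that `xs` falls into."""
    return table.get(quantize(xs, step), set())

table = defaultdict(set)
entry = ("model_A", 1, 2, "group_0")      # (model, v1, v2, group); hypothetical names
predicted = [0.42, 0.10, 0.77]            # predicted x-coordinates of a group
insert(table, predicted, entry, step=0.05)

noisy = [0.45, 0.08, 0.80]                # the same group observed with noise
hits = lookup(table, noisy, step=0.05)
```

The cost of the tolerance is visible in the table size: with radius 1 and three coordinates, one appearance occupies 27 cells.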

Page 27

Experiments and Results

Model objects and reference views used in our experiments

Page 28

Experiments and Results (cont’d)

reference views reference views

novel view novel view

Page 29

Experiments and Results (cont’d)

reference views

novel view novel view

Page 30

Experiments and Results (cont’d)

reference views reference views

novel view novel view

Page 31

Criticism of the Method

• Relies heavily on feature extraction

• It has high memory requirements.

• The index table might represent unrealistic model appearances.

• Indexing based on hashing is not very efficient.

• No explicit ranking of hypotheses.

Page 32

Improving AFoVs Recognition Framework

• Reject unrealistic appearances

• Reduce storage requirements and improve speed

• Develop a probabilistic hypothesis generation scheme

– Learn shape appearance

– Rank hypotheses

• Represent object appearance more efficiently using improved indexing schemes and probabilistic models.

W. Li, G. Bebis, and N. Bourbakis, "Integrating Algebraic Functions of Views with Indexing and Learning for 3D Object Recognition", IEEE Workshop on Learning in Computer Vision and Pattern Recognition (in conjunction with CVPR04), Washington DC, June 28, 2004.

Page 33

Combine Indexing with Learning

• Sample the space of appearances sparsely and represent the samples in a K-d tree

• Sample the space of views densely and represent the samples using probabilistic models.

• Given a novel view:

(1) Use K-d tree to retrieve a small number of candidate models

(2) For each candidate model, compute the probability that it might have produced the novel view

(3) Verify most likely hypotheses first

Page 34

Combine Indexing with Learning (cont’d)

• The first stage provides hypothetical matches fast.

• The second stage evaluates the feasibility of hypothetical matches fast, without having to apply verification explicitly.

• Only “highly likely” hypotheses are verified explicitly.

Page 35

Improved Framework

[Figure: block diagram of the improved framework]

TRAINING PHASE: Reference views; Extract model groups; Estimate the range of AFoVs parameters (using SVD & IA); Sampling AFoVs parameter space (coarse and dense); Validate views; K-d Tree (coarse samples); Random Projection, low-dimensional representation, and manifold learning using EM (dense samples)

RECOGNITION PHASE: New image; Extract image groups; Access K-d Tree; Retrieve hypothetical matches; Rank hypotheses; Estimate AFoVs parameters; Verify hypotheses; Recognition results

Page 36

Eliminate Unrealistic Model Appearances

• Under the assumption of linear transformations, many unrealistic views could be generated.

• Impose rigidity constraints to eliminate them.

– Storage requirements can be reduced significantly.

– Recognition becomes faster and more efficient.

$$a_1^2 + a_2^2 + a_3^2 - b_1^2 - b_2^2 - b_3^2 - 2(b_1 b_3 - a_1 a_3)\, r_{11} - 2(b_2 b_3 - a_2 a_3)\, r_{12} = 0$$

$$a_1 b_1 + a_2 b_2 + a_3 b_3 + (a_1 b_3 + a_3 b_1)\, r_{11} + (a_2 b_3 + a_3 b_2)\, r_{12} = 0$$
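The two rigidity constraints can be tested on coefficients produced by a genuinely rigid motion. The sketch rests on assumptions the slide does not spell out: the first reference view is taken at the identity pose, r11 and r12 are read as the first two entries of the first row of the second reference view's rotation matrix, translation is ignored, and the poses and helper names are made up.

```python
import numpy as np

def rotation(ax, ay, az):
    """3D rotation matrix built from rotations about the x, y and z axes (radians)."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rz @ ry @ rx

def afov_coefficients(R, U):
    """Coefficients (a1..a3), (b1..b3) expressing the view with rotation U in terms
    of view 1 (identity pose) and the x-coordinates of view 2 (rotation R)."""
    a3 = U[0, 2] / R[0, 2]
    b3 = U[1, 2] / R[0, 2]
    a = np.array([U[0, 0] - a3 * R[0, 0], U[0, 1] - a3 * R[0, 1], a3])
    b = np.array([U[1, 0] - b3 * R[0, 0], U[1, 1] - b3 * R[0, 1], b3])
    return a, b

def rigidity_residuals(a, b, r11, r12):
    """Left-hand sides of the two constraints; both are zero for a rigid motion."""
    g1 = (a @ a - b @ b
          - 2 * (b[0] * b[2] - a[0] * a[2]) * r11
          - 2 * (b[1] * b[2] - a[1] * a[2]) * r12)
    g2 = (a @ b
          + (a[0] * b[2] + a[2] * b[0]) * r11
          + (a[1] * b[2] + a[2] * b[1]) * r12)
    return g1, g2

R = rotation(0.2, 0.5, 0.1)       # pose of the second reference view (hypothetical)
U = rotation(-0.3, 0.9, 0.4)      # pose of the novel view (hypothetical)
a, b = afov_coefficients(R, U)
rigid = rigidity_residuals(a, b, R[0, 0], R[0, 1])

# A made-up non-rigid coefficient set: the y-coordinate is stretched by a factor of 2.
stretched = rigidity_residuals(np.array([1.0, 0.0, 0.0]),
                               np.array([0.0, 2.0, 0.0]), R[0, 0], R[0, 1])
```

A sampled coefficient set whose residuals are far from zero, like `stretched`, corresponds to an unrealistic view and could be dropped before it is stored.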

Page 37

Eliminate Unrealistic Model Appearances

Unrealistic Views (without constraints)

Realistic Views (with constraints)

Page 38

Indexing Appearances

• Sample the space of views “coarsely” and represent the samples in an index table.

• Hashing might not work very well in this case ...

• Need an improved indexing scheme.

Page 39

Range Search vs Nearest Neighbor Search

Nearest Neighbor Search Range Search

• Range search is not appropriate when storing a sparse number of views.

• K-d trees perform a nearest-neighbor search.

Page 40

K-d Trees for Indexing

[Figure: K-d tree built over the points P1, P2, P3, P4]

• K-d trees perform a nearest-neighbor search.
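A nearest-neighbor query on a K-d tree can be sketched in a few lines of plain Python. This is a generic textbook version, not the indexing code used in the work described here; the four 2D points stand in for P1..P4 and the dictionary-based node layout is an assumption.

```python
def build_kdtree(points, depth=0):
    """Recursively split the points at the median along alternating axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def squared_distance(p, q):
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def nearest(node, query, best=None):
    """Return the stored point closest to `query`."""
    if node is None:
        return best
    point = node["point"]
    if best is None or squared_distance(point, query) < squared_distance(best, query):
        best = point
    diff = query[node["axis"]] - point[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the other side only if the splitting plane is closer than the best match.
    if diff ** 2 < squared_distance(best, query):
        best = nearest(far, query, best)
    return best

points = [(0.2, 0.7), (0.8, 0.3), (0.5, 0.9), (0.1, 0.1)]   # stand-ins for P1..P4
tree = build_kdtree(points)
result = nearest(tree, (0.75, 0.35))
```

Unlike a hash lookup, the query always returns the closest stored appearance, even when the stored samples are sparse and none falls in the query's own cell.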

Page 41

Learning Geometric Appearance

• We can pre-compute the views that an object can produce off-line.

• These views form a manifold in lower dimensional space.

• Model object appearance using a pdf.

– Sample the space of appearances.

– Fit a parametric model (e.g., mixtures of Gaussians using EM).

– Use mutual information theory to choose the number of components.

• EM has problems when the dimensionality of the data is high.

• Apply “Random Projection” first, then run EM algorithm.
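The dimensionality-reduction step can be illustrated with a Gaussian random matrix. This is a sketch under assumptions: NumPy is available, the sampled appearances are made-up uniform vectors, and the projection uses i.i.d. normal entries scaled by the square root of the target dimension, which is one common construction and not necessarily the one used by the authors. The EM fit itself is not shown.

```python
import numpy as np

def random_projection(data, target_dim, seed=0):
    """Project the rows of `data` onto `target_dim` random Gaussian directions."""
    rng = np.random.default_rng(seed)
    source_dim = data.shape[1]
    R = rng.normal(size=(source_dim, target_dim)) / np.sqrt(target_dim)
    return data @ R

rng = np.random.default_rng(1)
views = rng.uniform(0.0, 1.0, size=(500, 40))    # hypothetical sampled appearances
low_dim = random_projection(views, target_dim=8)
```

The mixture model would then be fitted to the rows of `low_dim` instead of the original high-dimensional samples.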

Page 42

Manifolds of Real Objects: An Example

• Only a small number of parameters needs to be stored for each model.

Page 43

Hypothesis Ranking

• Each hypothesis generated by the K-d tree is ranked by computing its probability using mixture models.

• For each test group, we compute two probabilities, one from the x-coordinates and the other from the y-coordinates.

• The overall probability for a particular hypothesis is computed according to the following equation:

$$p(g_o \mid O_j) = \frac{\log\big(p(g_x \mid O_j)\, p(g_y \mid O_j)\big)}{\log\big(\max_i p(g_x \mid O_i)\, \max_i p(g_y \mid O_i)\big)}$$

where $i = 1, \ldots, H$, and $g_x$, $g_y$ denote the x- and y-coordinates of the test group.
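The ranking idea can be sketched with toy mixture models. The example assumes that a hypothesis is scored by the log of the product of its x- and y-probabilities; the 1-D Gaussian mixtures, the scalar features, and the model names are all hypothetical simplifications (real appearance features would be vectors).

```python
import math

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def mixture_pdf(x, components):
    """Density of a 1-D Gaussian mixture given (weight, mean, std) triples."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)

def rank_hypotheses(x_value, y_value, models):
    """Sort model names by log(p_x * p_y), most likely first."""
    scores = {}
    for name, (mix_x, mix_y) in models.items():
        p_x = mixture_pdf(x_value, mix_x)
        p_y = mixture_pdf(y_value, mix_y)
        scores[name] = math.log(p_x) + math.log(p_y)    # equals log(p_x * p_y)
    return sorted(scores, key=scores.get, reverse=True), scores

# Hypothetical candidate models returned by the index, each with an x- and a y-mixture.
models = {
    "model_A": ([(1.0, 0.0, 1.0)], [(1.0, 0.0, 1.0)]),
    "model_B": ([(0.5, 3.0, 1.0), (0.5, -3.0, 1.0)], [(1.0, 2.0, 0.5)]),
}
order, scores = rank_hypotheses(0.1, -0.2, models)
```

Verification would then start from the first name in `order` and could stop as soon as a hypothesis is confirmed.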

Page 44

Reference Views

1st Reference view

2nd Reference view

Page 45

Reference Views (cont’d)

1st Reference view

2nd Reference view

Page 46

Test Views

(a) (b) (c)

(d) (e) (f)

Page 47

Test Views (cont’d)

Hypothesis rejected Hypothesis rejected

Page 48

Integrate Geometric Appearance with Intensity Appearance

• Geometric information alone does not discriminate well between objects that have similar “geometric” appearance but possibly different “intensity” appearance.

• Integrate geometric and intensity appearance during hypothesis verification to improve discrimination power and robustness.

W. Li, G. Bebis, and N. Bourbakis, "3D Object Recognition Using 2D Views", IEEE Transactions on Image Processing (under revision).

Page 49

Dense Correspondences

• For each group of corresponding points, apply triangulation recursively to get denser correspondences.

• Divide triangles into four sub-triangles by considering the middle point of each side of each triangle.
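The recursive midpoint subdivision can be sketched as follows. The triangle coordinates and the function names are illustrative assumptions; the point of the example is that applying the same subdivision to corresponding triangles in two views yields new points in the same order, so the new points are in correspondence as well.

```python
def midpoint(p, q):
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def subdivide(triangle, levels):
    """Split a triangle into 4**levels sub-triangles using the edge midpoints."""
    if levels == 0:
        return [triangle]
    a, b, c = triangle
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    children = [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    result = []
    for child in children:
        result.extend(subdivide(child, levels - 1))
    return result

def dense_points(triangle, levels):
    """Unique vertices of the subdivided triangle, in a reproducible order."""
    seen = []
    for tri in subdivide(triangle, levels):
        for vertex in tri:
            if vertex not in seen:
                seen.append(vertex)
    return seen

triangle = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))   # hypothetical corresponding triangle
sub_triangles = subdivide(triangle, levels=2)
points = dense_points(triangle, levels=2)
```

Two levels of subdivision turn one triangle with 3 vertices into 16 triangles with 15 distinct vertices.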

Page 50

Refine AFoVs parameters

(before refinement) (after refinement)

Page 51

Predict Intensity Appearance - Example

Reference view 1 Reference view 2

Test view Prediction

Page 52

Predict Intensity Appearance - Example

Reference view 1 Reference view 2

Test view Prediction

Page 53

Predict Intensity Appearance - Example

(hypothesis rejected)

(hypothesis accepted)