INTRODUCTION TO MODEL-BASED 3-D OBJECT LOCATION
Emanuele Trucco
Signal and Image Processing Research Area
School of Engineering and Physical Sciences
Heriot-Watt University
CONTENTS
1. Problem definition: identification vs. location
2. 3-D shape representations: view centred and object centred
3. VC 1: eigenspaces
4. VC 2: active shape models
5. OC 1: full perspective
6. OC 2: weak perspective
7. ICP: location without correspondence
1. PROBLEM DEFINITION
3-D model-based location: estimating the position and orientation of a known 3-D object from an image
ASSUMPTION: A model must be available, i.e., the object has been identified.
IDENTIFICATION VS LOCATION
Identification = which model in my database matches the data in the image?
Aka classification, recognition …
Location = given that an image object matches a given model, where (rotation and translation) is that object in 3-D space?
Here, we assume a sequential process:
identify first, then use model to locate
notice: not always applied!
2. 3-D REPRESENT.S: AN INCOMPLETE LIST
Geometric models (object-centred)
- primitives (gen’d cones, geons, etc)
- CAD-like
Appearance models (view-centred)
- aspect graphs
- Active shape/appearance models (ASM/AAM)
- Eigenspaces
- Statistical learning
Others
- Invariants
Notice: focus on shape - but shape not whole story!
TWO IMPORTANT TYPES OF SHAPE MODELS
OBJECT-CENTRED GEOMETRIC MODELS:
- Model: CAD-like description based on detectable features (e.g., lines, surface patches and spatial relations)
- All co-ords expressed in ref. frame rigidly attached to obj.
- Cannot be compared directly with images
VIEW-CENTRED MODELS:
- Model: set of views under different conditions
- Basis for current visual learning approaches
- Can be compared directly with images
VISUAL EXAMPLES
[Figures: a view-centred model and an object-centred model]
AN UNPRETENTIOUS COMPARISON
OBJECT CENTRED:
- better for measurements (e.g., photogrammetry)
- CAD-like, geometric model must be feasible (e.g., deformable objects a typical problem)
- compact
VIEW CENTRED:
- better for complex objects (e.g., deformable, articulated, unpredictable)
- not so good for exact measurements
- can be expensive (memory intensive)
3. VIEW-CENTRED 1: EIGENSPACES
KEY IDEAS:
- img X as 1-D vector, x, obtained by scanning rows:
  $\mathbf{x} = [x_{11}, x_{12}, \dots, x_{21}, x_{22}, \dots, x_{NN}]^T$ (an $N^2$-vector for an $N \times N$ img)
- matching: compare imgs by correlation = dot product:
  $c = X_1 \circ X_2 = \mathbf{x}_1^T \mathbf{x}_2$
- build (= learn) compact object repr. from set of views x1 ,…, xV (i.e., do not store full imgs)
- reduce repr. size by principal component analysis
EIGENSPACES (cont’d)
A COMPACT MODEL USING PCA
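$\mathbf{x}_j = \bar{\mathbf{x}} + \sum_{i=1}^{n} g_{ij} \mathbf{e}_i$
where
- e1 , …, en eigenvectors of Q=XXT (covariance) associated with the n nonzero eigenvalues of Q;
- gij is the representation of the img xj in eigenspace;
- $\bar{\mathbf{x}} = \frac{1}{n} \sum_{j=1}^{n} \mathbf{x}_j$ is the mean img.
THE BIG DEAL: keep only first important eigenvectors!
$\mathbf{x}_j \approx \bar{\mathbf{x}} + \sum_{i=1}^{k} g_{ij} \mathbf{e}_i$ with k<<n !!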
EIGENSPACES (cont’d)
BUILDING THE MODEL
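- Project all examples into eigenspace to get $\mathbf{g}_j = [g_{1j}, \dots, g_{kj}]^T$:
  $\mathbf{g}_j = [\mathbf{e}_1 | \dots | \mathbf{e}_k]^T (\mathbf{x}_j - \bar{\mathbf{x}})$
- The 3-D object model is the resulting curve in eigenspace. E.g., varying only 1 appearance parameter:
  [Figure: model curve in the eigenspace with axes e1, e2, e3]
In general: m appear. params (e.g., various orient angles, illum.) -> hypersurface (manifold)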
EIGENSPACES (cont’d)
LOCATION:
- get input image
- project into eigenspace -> g
- find closest point to g on manifold (model)
- associated appearance parameters give pose etc.
SOME COMMENTS
- Discrete manifold, so approximated pose only (but can interpolate)
- Extends naturally to recognition (using one manifold per 3-D object)
- Closest-point problem not trivial
- Universal vs. object-specific eigenspaces
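The model-building and location steps can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions, not the method of any specific paper: the function names (`build_eigenspace`, `locate`) are hypothetical, the eigenvectors are obtained from a thin SVD of the centred image matrix (whose left singular vectors are the eigenvectors of Q = XXT), and the closest point on the discrete manifold is found by brute force.

```python
import numpy as np

def build_eigenspace(views, k):
    """views: (V, D) array, one row-scanned image per row. Keeps k eigenvectors."""
    mean = views.mean(axis=0)
    X = (views - mean).T                      # D x V, columns are centred images
    # Left singular vectors of X are the eigenvectors of Q = X X^T.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    E = U[:, :k]                              # D x k, first k eigenvectors
    G = E.T @ X                               # k x V, column j is g_j
    return mean, E, G

def locate(image, mean, E, G, params):
    """Project an image into eigenspace and return the appearance parameter
    of the closest stored point g_j (brute-force search, no interpolation)."""
    g = E.T @ (image - mean)
    j = np.argmin(np.linalg.norm(G - g[:, None], axis=0))
    return params[j], g

# Synthetic example (assumption): views depend on a single rotation angle.
rng = np.random.default_rng(0)
angles = np.arange(0, 360, 10)
a, b, c = rng.normal(size=(3, 64))
views = np.array([c + a * np.cos(np.radians(t)) + b * np.sin(np.radians(t))
                  for t in angles])
mean, E, G = build_eigenspace(views, k=2)
pose, g = locate(views[7], mean, E, G, angles)
```

With real images the search would normally use more eigenvectors and a faster closest-point structure; the brute-force search here only illustrates the idea.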
4. VIEW-CENTRED 2: ACTIVE SHAPE MODELS [Cootes, Taylor et al., CVIU’95 etc]
IDEA:
- Another application of PCA !
- Learn shape variation of contours from a set of examples (extends to grey levels, AAM).
- Same idea as eigenspaces, BUT basic element is contour (vector of contour co-ords), not full image
- See tracking of deformable objects (e.g., Baumberg & Hogg)
...
[Figures: training set, mean img, and first to fourth modes of variation]
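The idea of applying PCA to contour vectors can be sketched as follows. This is a hedged illustration only: it assumes the training contours are already aligned (the alignment step is not shown), and the function names (`train_shape_model`, `shape_params`, `synthesize`) are hypothetical.

```python
import numpy as np

def train_shape_model(contours, n_modes):
    """contours: (n_examples, 2 * n_points) array of aligned contour co-ords.
    Returns the mean shape, the first n_modes modes and their variances."""
    mean = contours.mean(axis=0)
    _, S, Vt = np.linalg.svd(contours - mean, full_matrices=False)
    modes = Vt[:n_modes]                               # one mode per row
    variances = S[:n_modes] ** 2 / (len(contours) - 1)
    return mean, modes, variances

def shape_params(contour, mean, modes):
    """Coefficients of a contour on the modes of variation."""
    return modes @ (contour - mean)

def synthesize(mean, modes, b):
    """Shape generated by mode coefficients b."""
    return mean + b @ modes

# Synthetic training set (assumption): circles of varying radius, so a
# single mode (the radius change) explains all the variation.
t = np.linspace(0, 2 * np.pi, 20, endpoint=False)
radii = np.array([1.0, 1.5, 2.0, 2.5])
contours = np.array([np.concatenate([r * np.cos(t), r * np.sin(t)]) for r in radii])
mean, modes, variances = train_shape_model(contours, n_modes=1)
b = shape_params(contours[0], mean, modes)
rebuilt = synthesize(mean, modes, b)
```

Varying one coefficient of `b` while keeping the others at zero generates the kind of shapes shown for each mode.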
ACTIVE APPEARANCE MODELS [Cootes, Edwards and Taylor ECCV 1998]
IDEA: extend Active Shape Models by
1. modelling shape and texture variations;
2. dividing large variation ranges into smaller intervals assigned to a set of sub-models
SUB-MODEL VISIBILITY CONSTRAINT
Different models use different sets of features, such that no feature is ever occluded in the training set of any sub-model.
ACTIVE APPEARANCE MODELS 2
FOR EXAMPLE: face appearance as head rotates -90 to +90 deg (0 deg is frontoparallel)
5 models sufficient, roughly centred at -90, -45, 0, 45, 90 deg
[Figure: for the contour component, some features disappear under rotation between Model k and Model k+1]
ACTIVE APPEARANCE MODELS 3
EXTENDED MODEL
$x = \bar{x} + Q_s c$, $g = \bar{g} + Q_g c$
where: $\bar{x}$ mean shape, $\bar{g}$ mean texture, $Q_s$, $Q_g$ matrices describing modes of variation.
TO GENERATE IMAGES FROM c:
1. Generate texture g(c) ;
2. Warp texture using shape x(c) .
ACTIVE APPEARANCE MODELS 4
EXAMPLE: ROTATING HEAD
Pose representation = single rot angle, $\theta$.
Assume model $c = c(\theta)$:
$c = c_0 + c_c \cos(\theta) + c_s \sin(\theta)$
with $c_0$, $c_c$, $c_s$ vectors estimated from the training set.
(I.e., elliptical shape variation with $\theta$, correct if affine proj; elliptical variation is approximation for texture)
TRAINING
- Assume known orientation $\theta_j$ for each j-th training example;
- find best-fit model parameters $c_j$ (ext. model eqs.);
- estimate $c_0$, $c_c$, $c_s$ by regression from equation above.
ACTIVE APPEARANCE MODELS 5
ESTIMATING THE ROTATION ANGLE
Acquire new image and fit the model to get $c$;
let $R_c^{-1}$ be the pseudo-inverse of $(c_c | c_s)$, i.e., $R_c^{-1} (c_c | c_s) = I_2$;
if $(x_a, y_a)^T = R_c^{-1} (c - c_0)$
then $\theta = \arctan(y_a / x_a)$
TRACKING THROUGH WIDE ROTATION ANGLES
Track orientation angle, use to switch to most adequate model in set.
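The training regression and the angle estimate can be sketched in NumPy. This is a minimal sketch under assumptions: the function names (`fit_pose_model`, `estimate_angle`) are hypothetical, the regression is ordinary least squares, and `arctan2` is used in place of `arctan` so that the quadrant of the angle is resolved.

```python
import numpy as np

def fit_pose_model(C, thetas):
    """Regress c = c0 + cc*cos(theta) + cs*sin(theta).
    C: (n, d) array with one parameter vector c_j per row; thetas in radians."""
    A = np.column_stack([np.ones_like(thetas), np.cos(thetas), np.sin(thetas)])
    coeffs, *_ = np.linalg.lstsq(A, C, rcond=None)    # shape (3, d)
    return coeffs[0], coeffs[1], coeffs[2]

def estimate_angle(c, c0, cc, cs):
    """Recover theta from model parameters c via the pseudo-inverse of (cc | cs)."""
    Rc_inv = np.linalg.pinv(np.column_stack([cc, cs]))  # Rc_inv @ (cc | cs) = I_2
    xa, ya = Rc_inv @ (c - c0)
    return np.arctan2(ya, xa)

# Synthetic check (assumption): noise-free parameters generated from the model.
rng = np.random.default_rng(1)
c0, cc, cs = rng.normal(size=(3, 5))
thetas = np.linspace(-np.pi / 2, np.pi / 2, 9)
C = c0 + np.outer(np.cos(thetas), cc) + np.outer(np.sin(thetas), cs)
f0, fc, fs = fit_pose_model(C, thetas)
c_new = c0 + cc * np.cos(0.3) + cs * np.sin(0.3)
theta_hat = estimate_angle(c_new, f0, fc, fs)
```

In a tracker, `theta_hat` would then select the sub-model whose angular range contains the estimate.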
5. OBJ-CENTRED 1: FULL PERSPECTIVE [Lowe PAMI’91 -> Trucco&Verri’98]
PURPOSE: find R and T bringing model to 3-D position generating the perspective image
OBJ-CENTRED 1 (cont’d)
IDEA:
1. calibrated persp. projection $(x_i, y_i)^T$ of model point $(X_i^m, Y_i^m, Z_i^m)^T$:
   $x_i = f \dfrac{r_{11} X_i^m + r_{12} Y_i^m + r_{13} Z_i^m + T_1}{r_{31} X_i^m + r_{32} Y_i^m + r_{33} Z_i^m + T_3}$
   $y_i = f \dfrac{r_{21} X_i^m + r_{22} Y_i^m + r_{23} Z_i^m + T_2}{r_{31} X_i^m + r_{32} Y_i^m + r_{33} Z_i^m + T_3}$
2. match N scene and model points, N > 6, thus getting data $(x_i, y_i)^T$ and $(X_i^m, Y_i^m, Z_i^m)^T$;
3. solve linearized system iteratively (Newton), given initial guess ($\phi_1, \phi_2, \phi_3$: parameters of R)
OBJ CENTRED 1 (cont’d)
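2 linearized (first order Taylor) eqs for each point:
$\delta x_i = \sum_{j=1}^{3} \left[ \dfrac{\partial x_i}{\partial T_j} \Delta T_j + \dfrac{\partial x_i}{\partial \phi_j} \Delta \phi_j \right]$
$\delta y_i = \sum_{j=1}^{3} \left[ \dfrac{\partial y_i}{\partial T_j} \Delta T_j + \dfrac{\partial y_i}{\partial \phi_j} \Delta \phi_j \right]$
where $\delta x_i$, $\delta y_i$ are the residuals between observed and predicted image co-ords, and $\Delta T_j$, $\Delta \phi_j$ the corrections to the current parameters.
SOME COMMENTS
- calibration required!
- fully projective version exists [Araújo, Carceroni & Brown CVIU’98]
- iterative method: some care needed (e.g., step)
- can be applied to lines (instead of points)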
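The iterative solution of the linearized system can be sketched as a Gauss-Newton loop. This is an illustration under stated assumptions, not the formulation of the cited references: R is parameterised here by a rotation vector (axis times angle) standing in for the three rotation parameters, the partial derivatives are approximated by central finite differences rather than derived analytically, and all function names are hypothetical.

```python
import numpy as np

def rodrigues(phi):
    """Rotation matrix from a rotation vector phi (axis * angle)."""
    angle = np.linalg.norm(phi)
    if angle < 1e-12:
        return np.eye(3)
    k = phi / angle
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

def project(params, model_pts, f):
    """params = (phi1, phi2, phi3, T1, T2, T3). Returns (N, 2) image points."""
    R, T = rodrigues(params[:3]), params[3:]
    cam = model_pts @ R.T + T                 # model points in camera co-ords
    return f * cam[:, :2] / cam[:, 2:3]       # perspective projection

def locate_full_perspective(model_pts, img_pts, f, params0, iters=20, eps=1e-6):
    """Gauss-Newton refinement of the six pose parameters."""
    params = np.asarray(params0, dtype=float).copy()
    for _ in range(iters):
        residual = (img_pts - project(params, model_pts, f)).ravel()
        J = np.empty((residual.size, 6))
        for j in range(6):                    # finite-difference Jacobian column j
            step = np.zeros(6)
            step[j] = eps
            J[:, j] = (project(params + step, model_pts, f)
                       - project(params - step, model_pts, f)).ravel() / (2 * eps)
        delta, *_ = np.linalg.lstsq(J, residual, rcond=None)
        params += delta                       # full step; damping omitted
        if np.linalg.norm(delta) < 1e-10:
            break
    return rodrigues(params[:3]), params[3:]

# Synthetic example (assumption): cube corners, noise-free image points.
model = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                 dtype=float)
true = np.array([0.1, -0.2, 0.15, 0.3, -0.1, 8.0])
img = project(true, model, f=1.0)
R_est, T_est = locate_full_perspective(model, img, 1.0, [0, 0, 0, 0, 0, 7.0])
```

The undamped full step works here because the initial guess is close; as the comments above note, the step needs care when it is not.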
6. OBJ-CENTRED 2: WEAK PERSPECTIVE [Alter MIT ‘92 -> Trucco&Verri’98]
PURPOSE: find camera co-ords, $P_0, P_1, P_2$, of model points $P_0^m, P_1^m, P_2^m$, given weak-perspective projs, $p_0, p_1, p_2$
WP = orthographic proj followed by scaling -> use right triangles in diagram !
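OBJ-CENTRED 2 (cont’d)
IDEA:
1. from right triangles (see diagram):
   $h_1^2 + d_{01}^2 = (s D_{01})^2$
   $h_2^2 + d_{02}^2 = (s D_{02})^2$
   $(h_1 - h_2)^2 + d_{12}^2 = (s D_{12})^2$
   with $d_{ij} = \| p_i - p_j \|$ and $D_{ij} = \| P_i - P_j \|$ (known from the model, since the transformation is rigid).
   Eliminating $h_1$, $h_2$ gives a biquadratic in s, $a s^4 + b s^2 + c = 0$, with a, b, c functions of $d_{ij}$ and $D_{ij}$, and then
   $P_0 = \frac{1}{s} (x_0, y_0, w)$
   $P_1 = \frac{1}{s} (x_1, y_1, h_1 + w)$
   $P_2 = \frac{1}{s} (x_2, y_2, h_2 + w)$
   s is scale factor, w irrecoverable depth offset [why?]
2. compute the rigid transformation R, T aligning camera and model co-ords using correspondences $P_j^m \leftrightarrow P_j$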
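Step 1 can be sketched by eliminating h1 and h2 from the three right-triangle equations. This is a sketch derived here from those equations, not a transcription of the coefficients in the cited references: with u = s^2, the code solves qa*u^2 - 2*qb*u + qc = 0, where `qa`, `qb`, `qc` and the function name are this sketch's own conventions. Only the larger root keeps both h1^2 and h2^2 non-negative, and the result is defined up to the reflection (h1, h2) -> (-h1, -h2).

```python
import numpy as np

def weak_perspective_three_points(model_pts, img_pts):
    """model_pts: (3, 3) model points; img_pts: (3, 2) weak-perspective projections.
    Returns s, h1, h2 with h1 >= 0 (the mirrored solution -h1, -h2 is also valid)."""
    D2 = lambda i, j: np.sum((model_pts[i] - model_pts[j]) ** 2)  # squared D_ij
    d2 = lambda i, j: np.sum((img_pts[i] - img_pts[j]) ** 2)      # squared d_ij
    D01, D02, D12 = D2(0, 1), D2(0, 2), D2(1, 2)
    d01, d02, d12 = d2(0, 1), d2(0, 2), d2(1, 2)
    A = D01 + D02 - D12
    B = d01 + d02 - d12
    # From 2*h1*h2 = u*A - B, squared and combined with the first two equations:
    qa = 4 * D01 * D02 - A ** 2
    qb = 2 * (D01 * d02 + D02 * d01) - A * B
    qc = 4 * d01 * d02 - B ** 2
    u = (qb + np.sqrt(max(qb * qb - qa * qc, 0.0))) / qa          # larger root
    s = np.sqrt(u)
    h1 = np.sqrt(max(u * D01 - d01, 0.0))
    h2 = np.sqrt(max(u * D02 - d02, 0.0))
    if u * A - B < 0:                 # sign of h1*h2 fixes the relative sign
        h2 = -h2
    return s, h1, h2

# Example (assumption): a right triangle tilted by 60 deg about the x axis,
# then scaled by s = 2.
model = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
img = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
s, h1, h2 = weak_perspective_three_points(model, img)
```

The camera co-ords then follow from the expressions for P0, P1, P2 above, with the depth offset w left free.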
7. ITERATIVE CLOSEST POINT MATCHING (ICP)
WHAT IF IMG-MODEL CORRESPONDENCES ARE UNKNOWN?
The previous methods cannot be applied !!
IDEA: If the estimate $\hat{R}, \hat{t}$ is close enough to the real $R, t$, a backprojected feature, $m_j$, will be very close to the corresponding image feature, $f_j$.
THEREFORE: Given $f_j$, assume the closest $m_k$ is the correspondence (and get it right most of the time ...)
[Figure: example of closest-point matches, marked as ok or wrong]
ICP ALGORITHM FOR RANGE DATA [Besl & McKay PAMI ‘92; Luong IJCV ‘94]
Assuming set I of 3-D points $p_i$, i = 1, ... Np, and set M of model points $P_j$, j = 1, ... Nm (Np and Nm need not be equal):
1. For each $p_i$, compute closest model point, $P_j$;
2. Compute least-squares estimate $\hat{R}, \hat{t}$ of rigid motion aligning I and M;
3. Apply motion to data points: $p_i \leftarrow \hat{R} p_i + \hat{t}$;
4. If convergence not reached, go to 1;
5. Return $\hat{R}, \hat{t}$
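The five steps can be sketched in NumPy as follows. This is a minimal illustration under assumptions: closest points are found by brute force, the least-squares rigid motion uses the standard SVD solution for matched point sets, convergence is declared when the mean squared alignment error stops changing, and the function names are hypothetical.

```python
import numpy as np

def best_rigid_motion(P, Q):
    """Least-squares R, t such that R @ P[i] + t ~= Q[i] (SVD solution)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def icp(data, model, max_iters=50, tol=1e-12):
    """Align data points to model points without known correspondences."""
    R_total, t_total = np.eye(3), np.zeros(3)
    pts = data.copy()
    prev_err = np.inf
    for _ in range(max_iters):
        # 1. closest model point for each data point (brute force)
        dist2 = ((pts[:, None, :] - model[None, :, :]) ** 2).sum(axis=2)
        nearest = model[dist2.argmin(axis=1)]
        # 2. least-squares rigid motion for the current matches
        R, t = best_rigid_motion(pts, nearest)
        # 3. apply motion to data points and accumulate the total motion
        pts = pts @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        # 4. convergence test on the mean squared alignment error
        err = ((pts - nearest) ** 2).sum(axis=1).mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    # 5. return the accumulated estimate
    return R_total, t_total

# Synthetic example (assumption): a 3x3x3 grid model and a small known motion.
model = np.array([[x, y, z] for x in (-1, 0, 1) for y in (-1, 0, 1)
                  for z in (-1, 0, 1)], dtype=float)
a = 0.1
R_true = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
t_true = np.array([0.05, -0.03, 0.02])
data = (model - t_true) @ R_true               # so that R_true @ p + t_true = P
R_est, t_est = icp(data, model)
```

The motion in the example is small enough that every closest point is the true match; with a poor initial estimate the same loop can settle in a local minimum.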
ICP: COMMENTS
1. Great: no correspondences needed! But price: additional search problem (closest point, not trivial computationally). Correspondence vs. minimisation is a common trade-off in vision!
2. Convergence = min alignment error (local!), or max number of iterations
3. In practice, numerical optimization of residual -> usual problems: e.g., quality of initial guess, basin of convergence
4. Robust estimator at each iteration improves result (but costs additional time) [Trucco Fusiello Roberto PRL ‘99]
5. Image data (i.e., not 3-D): see Besl&McKay or Zhang