Model-Based Hand Tracking… · 3D Hand Model - Used as generative model - Constructed from 37...

Hand tracking using a quadric surface model and Bayesian filter Roberto Cipolla, A. Thayananthan, Bjorn Stenger and Phil Torr Department of Engineering

Post on 26-Sep-2020


Hand tracking using a quadric surface model and Bayesian filter

Roberto Cipolla, A. Thayananthan, Bjorn Stenger and Phil Torr

Department of Engineering

The Problem

The Problem

Background - Pedestrian detection

Background - using exemplars

Hierarchical matching with trees

Template-based Detection

• A large number of templates is generated off-line to handle global motion and finger articulation.

• Need for

– Inexpensive template-matching function

• Distance Transform and Chamfer Matching

– Efficient search structure

• Bayesian Tree structure

Overview

• Building the 3D hand model and generating templates

• Learning the kinematic prior and building the tree

• Formulating the likelihood

• Tree-based Bayesian filtering

• Detection and tracking experiments

3D hand model

Statistical framework

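● Hand model has internal state $\theta_t$

● Use prediction (prior) and measurement from image observation $D_t$

Prior: $p(\theta_t \mid D_{1:t-1})$

Likelihood: $p(D_t \mid \theta_t)$

Posterior (Bayes rule): $p(\theta_t \mid D_{1:t}) = k\, p(\theta_t \mid D_{1:t-1})\, p(D_t \mid \theta_t)$

[Figure: three plots of $p(X)$ against $X$]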

Learning the kinematic prior

[Figure panels: data collection with a CyberGlove; kinematic hand model]

Inexpensive Likelihood

Tree-based Bayesian Filter

Overview

• Building the 3D hand model and generating templates

• Learning the kinematic prior

• Formulating the likelihood

• Tree-based Bayesian filtering

• Detection and tracking experiments

Building the hand model

Quadrics

Quadric: second-degree implicit surface defined by points $X$ satisfying $X^T Q X = 0$.

Ellipsoid: $\operatorname{rank}(Q) = 4$

Cone, Cylinder: $\operatorname{rank}(Q) = 3$

Pair of Planes: $\operatorname{rank}(Q) = 2$
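The rank classification on this slide can be checked numerically. The following is a minimal sketch, assuming NumPy and simple axis-aligned example surfaces; the helper names `ellipsoid` and `on_quadric` and the specific matrices are illustrative choices, not part of the original slides.

```python
import numpy as np

def ellipsoid(a, b, c):
    """4x4 matrix Q of an axis-aligned ellipsoid centred at the origin."""
    return np.diag([1.0 / a**2, 1.0 / b**2, 1.0 / c**2, -1.0])

def on_quadric(Q, point, tol=1e-9):
    """True if the 3D point satisfies X^T Q X = 0 in homogeneous coordinates."""
    X = np.append(np.asarray(point, dtype=float), 1.0)
    return abs(X @ Q @ X) < tol

# Hypothetical example surfaces, one per class named on the slide.
Q_ellipsoid = ellipsoid(2.0, 1.0, 1.0)
Q_cone = np.diag([1.0, 1.0, -1.0, 0.0])      # x^2 + y^2 - z^2 = 0
Q_cylinder = np.diag([1.0, 1.0, 0.0, -1.0])  # x^2 + y^2 = 1
Q_planes = np.diag([0.0, 0.0, 1.0, -1.0])    # z = +1 and z = -1

ranks = [int(np.linalg.matrix_rank(Q))
         for Q in (Q_ellipsoid, Q_cone, Q_cylinder, Q_planes)]
print(ranks)
```

Using homogeneous 4-vectors keeps every surface type in the same matrix form, which is what makes a single projection routine possible later on.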

Clipping Quadrics

For modelling more general shapes, truncate quadrics by finding points $X$ which satisfy:

$X^T Q X = 0$ and $X^T \Pi X \ge 0$
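As an illustration of truncation, the sketch below clips a unit cylinder between two planes perpendicular to the z axis. It assumes NumPy; the clipping matrix is called `Pi` here, and its construction from a pair of planes z = z0 and z = z1 is one possible choice made for this example, as are the helper names.

```python
import numpy as np

def clipping_matrix(z0, z1):
    """Symmetric 4x4 matrix Pi with X^T Pi X = (z - z0) * (z1 - z).

    The product is non-negative exactly when z0 <= z <= z1 (assumed z0 < z1).
    """
    Pi = np.zeros((4, 4))
    Pi[2, 2] = -1.0
    Pi[2, 3] = Pi[3, 2] = 0.5 * (z0 + z1)
    Pi[3, 3] = -z0 * z1
    return Pi

def on_truncated_quadric(Q, Pi, point, tol=1e-9):
    """True if the point lies on Q and inside the clipping region of Pi."""
    X = np.append(np.asarray(point, dtype=float), 1.0)
    return bool(abs(X @ Q @ X) < tol and X @ Pi @ X >= -tol)

Q_cylinder = np.diag([1.0, 1.0, 0.0, -1.0])  # x^2 + y^2 = 1
Pi = clipping_matrix(0.0, 3.0)               # keep the part with 0 <= z <= 3

inside = on_truncated_quadric(Q_cylinder, Pi, (1.0, 0.0, 1.5))
clipped = on_truncated_quadric(Q_cylinder, Pi, (1.0, 0.0, 4.0))
```

A finger segment could be approximated this way: an infinite cylinder or cone supplies the surface, and the clipping region cuts it to finite length.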

Hand Model

• 37 truncated quadrics

• 27 degrees of freedom

Projection of a Quadric

Assuming a normalized projective camera $P = [I \mid 0]$

Parameterize 3D points: $X(s) = \begin{bmatrix} x \\ s \end{bmatrix}$

$X(s)^T Q X(s) = 0$

Projection of a Quadric (2)

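Write $Q = \begin{bmatrix} A & b \\ b^T & c \end{bmatrix}$, so that $X(s)^T Q X(s) = 0$ becomes

$\begin{bmatrix} x^T & s \end{bmatrix} \begin{bmatrix} A & b \\ b^T & c \end{bmatrix} \begin{bmatrix} x \\ s \end{bmatrix} = 0$

$c s^2 + 2 b^T x\, s + x^T A x = 0$

Condition for $X(s)$ to be on the contour generator of $Q$:

$x^T (b b^T - c A)\, x = 0$

that is, $x^T C x = 0$ with $C = c A - b b^T$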
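The projection can be tried on a case with a known answer. The sketch below assumes NumPy, writes the quadric in blocks as Q = [[A, b], [b^T, c]], and uses a unit sphere on the optical axis as a test surface; the helper names and the sphere are assumptions made for illustration. The outline conic is taken as C = cA - bb^T, which follows from requiring the quadratic in s to have a double root.

```python
import numpy as np

def project_quadric(Q):
    """Outline conic C = c*A - b b^T of quadric Q under the camera P = [I | 0]."""
    A, b, c = Q[:3, :3], Q[:3, 3], Q[3, 3]
    return c * A - np.outer(b, b)

def contour_generator_point(Q, x):
    """3D point where the ray through outline point x touches Q.

    s0 = -b^T x / c is the double root of c s^2 + 2 b^T x s + x^T A x = 0.
    """
    b, c = Q[:3, 3], Q[3, 3]
    s0 = -(b @ x) / c
    return x / s0

def sphere(center, radius):
    """Quadric matrix of the sphere |X - center|^2 = radius^2."""
    m = np.asarray(center, dtype=float)
    Q = np.eye(4)
    Q[:3, 3] = Q[3, :3] = -m
    Q[3, 3] = m @ m - radius**2
    return Q

center = np.array([0.0, 0.0, 5.0])
Q = sphere(center, 1.0)
C = project_quadric(Q)

# A point on the image outline: a circle of radius 1/sqrt(24) around the origin.
x = np.array([1.0 / np.sqrt(24.0), 0.0, 1.0])
touch = contour_generator_point(Q, x)
```

For this sphere the outline is a circle, and the recovered contact point lies at distance 1 from the sphere centre, as expected for a tangent ray.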

Projecting the Hand Model

[Figure panels: 3D model; contours]

3D Hand Model

- Used as generative model

- Constructed from 37 truncated quadrics (ellipsoids, cones)

- Efficient contour projection

- 21 degrees of freedom + 6 pose = 27

3D Hand Model

- Used as generative model

- Constructed from 37 truncated quadrics (ellipsoids, cones)

- Efficient contour projection

- 27 degrees of freedom

Single View Tracking

Stereo Tracking

Kinematic prior for articulated hand

Capturing Hand Articulation

[Figure panels: data collection with a CyberGlove; kinematic hand model]

Dimensionality Reduction

● Analyse joint angle data sets using PCA

● 95% of variance is captured by first eight eigenvectors

[Figure panels: first, second, third and fourth principal components]

Dimensionality Reduction

● Analyse joint angle data sets using PCA

● 95% of variance is captured by first eight eigenvectors

[Figure panels: first, second, third and fourth principal components]

Motion Reconstruction Example

[Figure panels: original motion (20 DOF); reconstructions from 1, 3 and 5 eigenvectors]
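The PCA analysis and the reconstruction from a few eigenvectors can be sketched as follows. This assumes NumPy and uses synthetic data in place of the glove recordings, which are not available here; the array sizes, the three latent factors, and the helper names are all assumptions made for the example.

```python
import numpy as np

def pca(data):
    """Return mean, principal axes (as rows), and explained-variance ratios."""
    mean = data.mean(axis=0)
    _, sing, axes = np.linalg.svd(data - mean, full_matrices=False)
    var = sing**2
    return mean, axes, var / var.sum()

def reconstruct(data, mean, axes, k):
    """Project onto the first k principal axes and map back to joint angles."""
    coeffs = (data - mean) @ axes[:k].T
    return mean + coeffs @ axes[:k]

# Synthetic stand-in for glove data: 500 poses of 20 joint angles,
# driven by 3 hidden factors plus a little noise (all hypothetical).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 20))
angles = latent @ mixing + 0.01 * rng.normal(size=(500, 20))

mean, axes, ratio = pca(angles)
# Smallest number of components whose cumulative variance reaches 95%.
k95 = int(np.searchsorted(np.cumsum(ratio), 0.95) + 1)
err1 = np.abs(reconstruct(angles, mean, axes, 1) - angles).max()
err3 = np.abs(reconstruct(angles, mean, axes, 3) - angles).max()
```

On real joint-angle data the same `k95` computation would give the number of eigenvectors needed for a chosen variance level, and `reconstruct` with increasing `k` corresponds to the progressively better motion reconstructions shown in the panels.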

Trajectories in Eigenspace

[Figure panels: four basis configurations; trajectories]

Building the search tree

State-space partitioning

Templates at grid centres become nodes of the tree. Different levels are obtained by subdividing each partition.

[Figure panels: grid-based partitioning of eigenspace; search tree]
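One way to picture the construction is a recursive grid subdivision. The sketch below is an assumption-laden illustration: it splits each cell into 2^d equal children at every level, works in a 2D eigenspace, and stores only the cell centre where a template would be generated; the slides do not specify the branching factor or the dimensionality used.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    center: tuple     # state at the cell centre (a template would be rendered here)
    half_size: tuple  # half-width of the cell along each axis
    children: list = field(default_factory=list)

def build_tree(center, half_size, levels):
    """Recursively split a box into 2^d equal cells, `levels` times."""
    node = Node(tuple(center), tuple(half_size))
    if levels == 0:
        return node
    child_half = [h / 2 for h in half_size]
    offsets = [[]]
    for h in child_half:  # all sign combinations of (+/- h) per axis
        offsets = [o + [s * h] for o in offsets for s in (-1, 1)]
    for off in offsets:
        child_center = [c + o for c, o in zip(center, off)]
        node.children.append(build_tree(child_center, child_half, levels - 1))
    return node

def count_leaves(node):
    """Number of finest-level cells below (or at) this node."""
    return 1 if not node.children else sum(count_leaves(c) for c in node.children)

# Hypothetical 2D eigenspace region [-4, 4] x [-4, 4], subdivided three times.
root = build_tree(center=(0.0, 0.0), half_size=(4.0, 4.0), levels=3)
```

Each level of the tree then holds a coarser or finer set of templates covering the same region of state space.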

Template-based Detection and Tracking

• A large number of templates is generated off-line to handle global motion and finger articulation.

• Need for

– Inexpensive template-matching function

• Distance Transform and Chamfer Matching

– Efficient search structure

• Bayesian Tree structure

Likelihood function

Distance Transform and Chamfer Matching

Feature Detection

Camera Image

Canny edge map

Distance Transform

Search Template

Distance Image & Chamfer Matching

• Distance Image gives the distance to the nearest edge feature at every pixel location in the image.

• Calculated only once for each frame.

• Chamfer distance is a specific distance function that is calculated at each pixel by propagating the nearest distances from neighbouring pixels.

[Figure: distance $d$ from pixel $(x,y)$ to the nearest edge feature]
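The propagation idea can be sketched with the classic two-pass chamfer transform. The version below is plain Python and uses 3-4 integer weights (3 for axial steps, 4 for diagonal steps, divided by 3 at the end); the choice of weights is an assumption, since the slides do not say which mask was used.

```python
INF = 10**9

def chamfer_distance_transform(edges):
    """Two-pass 3-4 chamfer transform; `edges` is a 2D list of 0/1 flags.

    Returns approximate Euclidean distances (in pixels) to the nearest edge pixel.
    """
    h, w = len(edges), len(edges[0])
    d = [[0 if edges[y][x] else INF for x in range(w)] for y in range(h)]
    # Forward pass: propagate from neighbours already visited (left and above).
    for y in range(h):
        for x in range(w):
            for dy, dx, cost in ((0, -1, 3), (-1, 0, 3), (-1, -1, 4), (-1, 1, 4)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    d[y][x] = min(d[y][x], d[yy][xx] + cost)
    # Backward pass: propagate from the right and below.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            for dy, dx, cost in ((0, 1, 3), (1, 0, 3), (1, 1, 4), (1, -1, 4)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    d[y][x] = min(d[y][x], d[yy][xx] + cost)
    return [[v / 3.0 for v in row] for row in d]

# Hypothetical 5x5 edge map with a single edge pixel in the centre.
edges = [[0] * 5 for _ in range(5)]
edges[2][2] = 1
dist = chamfer_distance_transform(edges)
```

Two raster scans are enough regardless of how many edge pixels the image contains, which is why the distance image is cheap to compute once per frame.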

Distance Image & Chamfer Matching

• The cost of matching a template at a given image location can be calculated by adding the distances to the nearest edge feature from each of the template points.

• The nearest distances are readily obtained from the distance image.

• A single template match at a given location typically costs around 200 additions and one division, which is inexpensive.

• 50,000 hypothesis evaluations can be made in less than 1 second on a 1 GHz PC with 256 MB of memory.

$C = \frac{1}{N} \sum_{i=1}^{N} d_i$
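The matching cost is the mean of the distance-image values under the template points. A minimal sketch in plain Python follows; the tiny distance image, the three-point template, and the decision to raise an error for out-of-image points are all assumptions made for the example.

```python
def chamfer_cost(dist_image, template_points, offset):
    """Mean distance-image value under the template placed at `offset` (ox, oy).

    Handling of points outside the image is an assumption here: they raise
    IndexError instead of being clamped or penalised.
    """
    ox, oy = offset
    total = 0.0
    for tx, ty in template_points:
        x, y = tx + ox, ty + oy
        if not (0 <= y < len(dist_image) and 0 <= x < len(dist_image[0])):
            raise IndexError("template point outside image")
        total += dist_image[y][x]        # one addition per template point
    return total / len(template_points)  # one division per template

# Hypothetical distance image for a vertical edge along column 2 of a 4x5 image.
dist_image = [[abs(x - 2) for x in range(5)] for _ in range(4)]
# Hypothetical template: a short vertical stroke of three points.
template = [(0, 0), (0, 1), (0, 2)]

costs = {ox: chamfer_cost(dist_image, template, (ox, 0)) for ox in range(5)}
best_offset = min(costs, key=costs.get)
```

With a template of about 200 points this loop performs the roughly 200 additions and one division quoted on the slide.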

Distance Image & Chamfer Matching

• Distance Image provides a smooth cost function.

• Efficient search techniques can be used to find the correct template.

Distance Image & Chamfer Matching

Distance Image & Chamfer Matching

Distance Image & Chamfer Matching

Distance Image & Chamfer Matching

Distance Image & Chamfer Matching

Tree-based Bayesian filtering

Tree-based Search

Pruning the Search-Tree

Pruning the Search-Tree

[Figure: search tree with node values 9.0, 20.0, 17.0]

Pruning the Search-Tree

[Figure: search tree with node values 9.0, 20.0, 17.0]

Pruning the Search-Tree

[Figure: search tree with node values 9.0, 20.0, 17.0 and, below them, 10.0, 4.5, 6.7]

Pruning the Search-Tree

[Figure: search tree with node values 9.0, 20.0, 17.0 and, below them, 10.0, 4.5, 6.7]
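A coarse-to-fine search with pruning might look like the sketch below. It rests on several assumptions that the slides do not confirm: the node values are read as matching costs, the three lower values are taken to be children of the 9.0 node, pruning uses a fixed per-level threshold, and the thresholds and the extra child cost under the 20.0 node are hypothetical. Costs are stored in the tree here, whereas a real tracker would compute them on demand by template matching.

```python
def leaf(cost):
    return {"cost": cost, "children": {}}

def search(node, thresholds, depth=0, path=()):
    """Return [(cost, path)] for leaves reached without exceeding any threshold.

    A node whose cost exceeds the threshold for its level is discarded together
    with its whole subtree, so its descendants are never evaluated.
    """
    results = []
    for name, child in node["children"].items():
        if child["cost"] > thresholds[depth]:
            continue  # prune this subtree
        child_path = path + (name,)
        if child["children"]:
            results += search(child, thresholds, depth + 1, child_path)
        else:
            results.append((child["cost"], child_path))
    return results

tree = {"children": {
    "a": {"cost": 9.0, "children": {"a1": leaf(10.0), "a2": leaf(4.5), "a3": leaf(6.7)}},
    "b": {"cost": 20.0, "children": {"b1": leaf(18.0)}},  # hypothetical child, never visited
    "c": leaf(17.0),
}}

survivors = search(tree, thresholds=[12.0, 8.0])  # hypothetical thresholds
best = min(survivors)
```

With these numbers only the subtree under the 9.0 node is expanded, and the 4.5 leaf is returned as the best match.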

Tracking as Inference

● Object has internal state $X_t$

● Image observation $D_t$

Prior: $p(X_t \mid D_{1:t-1})$

Likelihood: $p(D_t \mid X_t)$

Posterior (Bayes rule): $p(X_t \mid D_{1:t}) = k\, p(X_t \mid D_{1:t-1})\, p(D_t \mid X_t)$

[Figure: three plots of $p(X)$ against $X$]

Recursive Formulation

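● Prediction Step:

$p(X_t \mid D_{1:t-1}) = \int p(X_t \mid X_{t-1})\, p(X_{t-1} \mid D_{1:t-1})\, dX_{t-1}$

(prior = state transitions × previous estimate, integrated over $X_{t-1}$)

● Measurement Step:

$p(X_t \mid D_{1:t}) = k\, p(D_t \mid X_t)\, p(X_t \mid D_{1:t-1})$

(posterior = $k$ × likelihood × prior)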
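The two steps can be written out for a state space with a finite number of cells, where the integral in the prediction step becomes a sum. The sketch below is plain Python; the three-state example, the transition matrix, and the likelihood values are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def predict(posterior_prev, transition):
    """Prediction: prior[i] = sum_j p(X_t = i | X_t-1 = j) * posterior_prev[j]."""
    n = len(posterior_prev)
    return [sum(transition[i][j] * posterior_prev[j] for j in range(n))
            for i in range(n)]

def update(prior, likelihood):
    """Measurement: posterior = k * likelihood * prior, with k normalising to 1."""
    unnorm = [l * p for l, p in zip(likelihood, prior)]
    k = 1.0 / sum(unnorm)
    return [k * u for u in unnorm]

# transition[i][j] = p(X_t = i | X_t-1 = j); each column sums to 1.
transition = [
    [0.8, 0.1, 0.0],
    [0.2, 0.8, 0.2],
    [0.0, 0.1, 0.8],
]
posterior_prev = [1.0 / 3.0] * 3  # previous estimate: uniform
likelihood = [0.1, 0.2, 0.7]      # p(D_t | X_t = i) for the new observation

prior = predict(posterior_prev, transition)
posterior = update(prior, likelihood)
```

Repeating `predict` and `update` for each new frame gives the recursive filter; the posterior from one frame becomes the previous estimate for the next.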

Bayesian-Tree

• The search-tree is brought into a Bayesian framework by adding the prior knowledge from the previous frame.

• The Bayesian-Tree can be thought of as approximating the posterior probability at different resolutions.

[Figure panels: state space partitioning; estimation of posterior pdf]

Experimental results

Tracking Results

Rotating in clutter

Opening and closing

Overview

• Building the 3D hand model and generating templates

• Learning the kinematic prior

• Efficiently evaluating the likelihood

• Tree-based Bayesian filtering

• Detection and tracking together