1 Articulated Pose Estimation in a Learned Smooth Space of Feasible Solutions Taipeng Tian, Rui Li and Stan Sclaroff Computer Science Dept. Boston University

Upload: tiffany-hall

Post on 17-Dec-2015

Articulated Pose Estimation in a Learned Smooth Space of Feasible Solutions

Taipeng Tian, Rui Li and Stan Sclaroff

Computer Science Dept.

Boston University

Introduction

• Motivating application
  – Gesture recognition
  – Fixed gesture lexicon
  – For example:

Aircraft signaler hand gestures

Traffic controller hand signals

Basketball referee hand signals

Problem Definition

Input (observation): silhouette (Alt moments)

Pose estimation

Output: 2D projected marker positions

Related Work: Pose Estimation from a Single Image

• Geometry based
  – Taylor, CVIU ’01
  – Barron & Kakadiaris, IVC ’01
  – Parameswaran & Chellappa, CVPR ’04

• Learning based
  – Rosales & Sclaroff, HUMO ’00
  – Agarwal & Triggs, CVPR ’04

• Others
  – Lee & Cohen, CVPR ’04
  – Shakhnarovich, Viola & Darrell, ICCV ’03
  – Mori, Ren, Efros & Malik, CVPR ’04
  – Many more …

Idea 1 : Learning Mappings

• Specialized Mapping Architecture (SMA) [Rosales and Sclaroff, NIPS ’01]

• Relevance Vector Regression [Agarwal and Triggs, CVPR ’04]

Image features → Pose

Idea 2 : Exploring the Solution Space

• Simulated annealing [Deutscher et al., CVPR ’00]

• Markov chain Monte Carlo (MCMC) [Lee and Cohen, CVPR ’04]

• etc …

• Requires an accurate body model, typically with many degrees of freedom (DOF).

• Explores the pose space for a solution consistent with the observations.

• Difficult for high DOF.

• Computationally intensive.

Key Observations

• We have a constrained set of poses.
• It is not necessary to explore the full parameter space.
• Combine the two ideas:
  – Learn mappings
  – Explore a constrained space (i.e. a learned model of body poses)

Aircraft signaler hand gestures

Traffic controller hand signals

Basketball referee hand signals

Overview of Framework

Learning phase (Y: training data, X: latent space), in two steps:

• Learn the rendering function Φ(·)
• Learn a model of human body poses

Pose inference phase (input silhouette → output pose):

$$\min_{x,y} \; \|\Phi(y) - s\|^2 + \lambda_1 L_Y(x, y)$$

Learning a Model of Human Poses

• The Gaussian Process Latent Variable Model (GPLVM) [Neil Lawrence, NIPS ’04] is used.

• The GPLVM was originally used for visualizing high-dimensional data.

• Grochow et al. (SIGGRAPH ’04) use it to solve the inverse kinematics problem for human motion animation.

• Here, we use it for automated articulated body pose inference.

Gaussian Process Latent Variable Model (GPLVM) Overview

[Figure: probabilistic mapping between the higher-dimensional space (y) and the lower-dimensional latent space (x)]

GPLVM Training : Learning a Model of Body Poses

• Given: a training set of 2D projected marker positions $\{y_i\}$ (each $y_i$ is D-dimensional)

• Goal: learn the parameters $\{x_i\}, \alpha, \beta, \gamma$
  – $\{x_i\}$: the corresponding latent variable values for each training data point
  – $\alpha, \beta, \gamma$: variables related to the kernel

Kernel Function

• Also known as the covariance function.
• Measures the similarity of the latent variables x and x'.

$$k(x, x') = \alpha \exp\left(-\frac{\gamma}{2}\|x - x'\|^2\right) + \delta_{x,x'}\,\beta^{-1}$$

• For a data set of size N, we form an N × N kernel matrix K, in which $K_{i,j} = k(x_i, x_j)$.
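The kernel matrix can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the RBF form k(x, x') = α exp(−γ/2 ‖x − x'‖²) + δ β⁻¹; the function name `rbf_kernel_matrix` and the parameter values are illustrative and do not come from the talk.

```python
import numpy as np

def rbf_kernel_matrix(X, alpha, beta, gamma):
    """K[i, j] = alpha * exp(-gamma/2 * ||x_i - x_j||^2) + delta_ij / beta."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between the latent points (rows of X).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # The delta term only contributes on the diagonal (i == j).
    return alpha * np.exp(-0.5 * gamma * sq_dist) + np.eye(len(X)) / beta

# Three 2D latent points; the values are arbitrary.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel_matrix(X, alpha=1.0, beta=100.0, gamma=1.0)
print(K.shape)  # N x N matrix for N latent points
```

Nearby latent points give entries close to α, distant points give entries close to zero, and the β⁻¹ term on the diagonal keeps K well conditioned.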

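GPLVM Training: Learning a Model of Body Poses

• For a single dimension d, the likelihood of y given the Gaussian Process (GP) model parameters is:

$$p(\{y_{i,d}\} \mid \{x_i\}, \alpha, \beta, \gamma) = \frac{1}{\sqrt{(2\pi)^N |K|}} \exp\left(-\frac{1}{2} Y_d^T K^{-1} Y_d\right)$$

where $Y_d$ collects the d-th coordinate of all N training points.

• The joint likelihood for D dimensions is:

$$p(\{y_i\} \mid \{x_i\}, \alpha, \beta, \gamma) = \prod_{d=1}^{D} p(\{y_{i,d}\} \mid \{x_i\}, \alpha, \beta, \gamma)$$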
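Taking the logarithm of the joint likelihood shows where the terms of the training objective come from. The short derivation below is a sketch, assuming the single-dimension Gaussian likelihood with a kernel matrix K shared by all D dimensions, and writing $Y_d$ for the vector of the d-th coordinate of all N training points:

$$
\begin{aligned}
\ln p(\{y_i\} \mid \{x_i\}, \alpha, \beta, \gamma)
&= \sum_{d=1}^{D} \ln p(\{y_{i,d}\} \mid \{x_i\}, \alpha, \beta, \gamma) \\
&= -\frac{DN}{2}\ln 2\pi \;-\; \frac{D}{2}\ln|K| \;-\; \frac{1}{2}\sum_{d=1}^{D} Y_d^T K^{-1} Y_d
\end{aligned}
$$

The first term does not depend on the parameters, so only the log-determinant and the quadratic term matter when optimizing.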

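To learn the GPLVM from the training set $\{y_i\}$, we maximize the following posterior:

$$p(\{x_i\}, \alpha, \beta, \gamma \mid \{y_i\})$$

placing the priors

$$p(x) = N(x \mid 0, I), \qquad p(\alpha, \beta, \gamma) \propto \frac{1}{\alpha\beta\gamma}$$

Negative log:

$$L = \frac{D}{2}\ln|K| + \frac{1}{2}\sum_{d} Y_d^T K^{-1} Y_d + \frac{1}{2}\sum_{i}\|x_i\|^2 + \ln(\alpha\beta\gamma)$$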
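The negative log posterior can be evaluated directly. The sketch below assumes the objective L = D/2 ln|K| + 1/2 Σ_d Y_dᵀ K⁻¹ Y_d + 1/2 Σ_i ‖x_i‖² + ln(αβγ) with the RBF kernel k(x, x') = α exp(−γ/2 ‖x − x'‖²) + δ β⁻¹; the function name and the toy data are hypothetical, and a real training loop would minimize this value over the latent points and kernel parameters.

```python
import numpy as np

def gplvm_neg_log_posterior(X, Y, alpha, beta, gamma):
    """Negative log posterior of a GPLVM, up to an additive constant.

    X: (N, q) latent points, Y: (N, D) training poses.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    N, D = Y.shape
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = alpha * np.exp(-0.5 * gamma * sq_dist) + np.eye(N) / beta
    # slogdet is numerically safer than log(det(K)).
    _, log_det_K = np.linalg.slogdet(K)
    # sum_d Y_d^T K^{-1} Y_d, computed for all output dimensions at once.
    quad = np.sum(Y * np.linalg.solve(K, Y))
    latent_prior = 0.5 * np.sum(X ** 2)
    hyper_prior = np.log(alpha * beta * gamma)
    return 0.5 * D * log_det_K + 0.5 * quad + latent_prior + hyper_prior

# Toy data: 5 poses of dimension 4 with 2D latent points (random, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
Y = rng.normal(size=(5, 4))
print(gplvm_neg_log_posterior(X, Y, alpha=1.0, beta=10.0, gamma=1.0))
```

Solving K⁻¹Y with `np.linalg.solve` avoids forming the inverse explicitly; the cost still grows cubically with N, which is one reason to restrict K to a subset of the training poses.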

Computationally intensive: a subset of the training poses is chosen to compute the kernel matrix. This subset of poses is called the active set.

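• For a new pair (x, y), we can predict using

$$p(\{x_i\}, \alpha, \beta, \gamma, x' \mid \{y_i\}, y')$$

$$L_Y(x, y) = -\ln p(\{x, y\} \mid \{x_i, y_i\}, \alpha, \beta, \gamma) = \frac{\|y - f(x)\|^2}{2\sigma^2(x)} + \frac{D}{2}\ln\sigma^2(x) + \frac{1}{2}\|x\|^2$$

where f(x) and σ²(x) are the GP predictive mean and variance at x.

• This equation can be minimized by gradient descent to solve for x given y, or for y given x.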
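As a rough illustration, L_Y(x, y) can be computed as follows. The sketch assumes L_Y(x, y) = ‖y − f(x)‖² / (2σ²(x)) + D/2 ln σ²(x) + 1/2 ‖x‖², with the GP predictive mean f(x) = Yᵀ K⁻¹ k(x) and variance σ²(x) = k(x, x) − k(x)ᵀ K⁻¹ k(x), and the RBF kernel with parameters α, β, γ. The function name `gp_pose_cost` and the toy values are hypothetical.

```python
import numpy as np

def gp_pose_cost(x, y, X, Y, alpha, beta, gamma):
    """L_Y(x, y): how well latent point x and pose y match the learned model.

    X: (N, q) training latent points, Y: (N, D) training poses.
    """
    N, D = Y.shape
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = alpha * np.exp(-0.5 * gamma * sq_dist) + np.eye(N) / beta
    # k(x): kernel values between the new latent point and each training point.
    k_x = alpha * np.exp(-0.5 * gamma * np.sum((X - x) ** 2, axis=1))
    K_inv_k = np.linalg.solve(K, k_x)
    f_x = Y.T @ K_inv_k                          # predictive mean f(x)
    var_x = alpha + 1.0 / beta - k_x @ K_inv_k   # k(x, x) - k(x)^T K^{-1} k(x)
    return (np.sum((y - f_x) ** 2) / (2.0 * var_x)
            + 0.5 * D * np.log(var_x)
            + 0.5 * np.sum(x ** 2))

# Toy model with one training pair, so the numbers are easy to follow.
X = np.array([[0.0, 0.0]])
Y = np.array([[1.0, 2.0]])
cost = gp_pose_cost(np.array([0.0, 0.0]), np.array([0.5, 1.0]), X, Y, 1.0, 1.0, 1.0)
print(cost)
```

The first term penalizes poses far from the GP prediction, the second penalizes latent points where the model is uncertain, and the third keeps x near the origin of the latent space.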

GPLVM

[Figure: GPLVM latent space, with axes x_1 and x_2]

GPLVM

Left-hand-raised silhouettes tend to be clustered together

GPLVM

Does not always do a good job

About GPLVM

• Allows mapping to and from the lower dimensional space.

• Allows smooth parameterization (i.e. allows derivatives) in lower dimensional space.

• Two dimensions work well for our data set. (Grochow et al. use 2-5.)

Learning the Forward/Rendering Function

Input: 2D pose → Silhouettes (represented using Alt moments)

Similar to Rosales and Sclaroff

Pose Inference

$$\min_{y} \; \|\Phi(y) - s\|^2 + \lambda_1 \|y\|^2$$

Typical regularization (also used by Agarwal and Triggs)

Data Term

$$\min_{y} \; \|\Phi(y) - s\|^2 + \lambda_1 \|y\|^2$$

• Φ: forward function (rendering function)
• y: 2D projected marker positions
• s: silhouette (Alt moments)

Regularization Term

$$\min_{x,y} \; \|\Phi(y) - s\|^2 + \lambda_1 \|y\|^2$$

Replace $\|y\|^2$ with a prior knowledge term (i.e. the learned model of poses), $L_Y(x, y)$, which is independent of the feature s.

Pose Inference

$$\min_{x,y} \; \|\Phi(y) - s\|^2 + \lambda_1 L_Y(x, y)$$

Solution obtained using conjugate gradient
– Initialization using the active set
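The inference step can be sketched end to end on toy data. This is not the authors' implementation: it assumes the objective ‖Φ(y) − s‖² + λ₁ L_Y(x, y), swaps in plain gradient descent with finite-difference gradients and backtracking in place of conjugate gradient, uses a hypothetical linear renderer `phi`, and initializes from the training pair with the lowest objective as one possible reading of "initialization using the active set". All names and values are illustrative.

```python
import numpy as np

def make_L_Y(X, Y, alpha, beta, gamma):
    """Build L_Y(x, y) from training latents X (N x q) and poses Y (N x D)."""
    N, D = Y.shape
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K_inv = np.linalg.inv(alpha * np.exp(-0.5 * gamma * sq) + np.eye(N) / beta)

    def L_Y(x, y):
        k_x = alpha * np.exp(-0.5 * gamma * np.sum((X - x) ** 2, axis=1))
        f_x = Y.T @ K_inv @ k_x                         # GP predictive mean
        var_x = alpha + 1.0 / beta - k_x @ K_inv @ k_x  # GP predictive variance
        return (np.sum((y - f_x) ** 2) / (2 * var_x)
                + 0.5 * D * np.log(var_x) + 0.5 * np.sum(x ** 2))
    return L_Y

def pose_objective(z, s, phi, L_Y, q, lam1):
    """||phi(y) - s||^2 + lam1 * L_Y(x, y), where z concatenates x and y."""
    x, y = z[:q], z[q:]
    return np.sum((phi(y) - s) ** 2) + lam1 * L_Y(x, y)

def numerical_grad(fun, z, eps=1e-6):
    """Central finite differences (a stand-in for analytic derivatives)."""
    grad = np.zeros_like(z)
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        grad[i] = (fun(z + step) - fun(z - step)) / (2 * eps)
    return grad

def infer_pose(s, phi, L_Y, X, Y, lam1=1.0, iters=100):
    q = X.shape[1]
    fun = lambda z: pose_objective(z, s, phi, L_Y, q, lam1)
    # Assumed initialization: the training pair with the lowest objective.
    z = min((np.concatenate([xi, yi]) for xi, yi in zip(X, Y)), key=fun)
    for _ in range(iters):
        g = numerical_grad(fun, z)
        f_z, step = fun(z), 1.0
        # Backtracking: halve the step until the objective does not increase.
        while step > 1e-10 and not (fun(z - step * g) <= f_z):
            step *= 0.5
        if step <= 1e-10:
            break
        z = z - step * g
    return z[:q], z[q:]

# Toy setup: 3 training pairs, 2D latent space, 4D poses, 2D silhouette features.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Y = np.array([[0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]])
A = np.array([[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]])
phi = lambda y: A @ y   # hypothetical linear rendering function
L_Y = make_L_Y(X, Y, alpha=1.0, beta=100.0, gamma=1.0)
s = phi(np.array([0.9, 0.1, 0.9, 0.1]))
x_hat, y_hat = infer_pose(s, phi, L_Y, X, Y)
print(x_hat, y_hat)
```

Because x and y are optimized jointly, the data term pulls y toward poses that render to the observed features, while L_Y keeps the pair (x, y) close to the learned space of feasible poses.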

Data Collection

• 12 gestures in the flight director lexicon

• Synthesize 6000 (silhouette, pose) pairs using Poser

• 3000 training (Male model)

• 3000 testing (Female model)

[Figure: 3D pose and synthesized silhouettes, sampled uniformly over the frontal view-sphere]

Experiments (Synthetic Data)

(a) Silhouette images generated by Poser 5 (test set)

(b) Estimation from SMA (Specialized Mapping Architecture)

(c) Our approach

(d) Ground truth

Comparison with SMA

Additional Constraints

Additional constraints can be added to achieve a more accurate estimate, e.g. temporal consistency:

$$\min_{x_t, y_t} \; \|\Phi(y_t) - s\|^2 + \lambda_1 L_Y(x_t, y_t) + \lambda_2 \|y_t - y_{t-1}\|^2$$
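The temporal term is a small addition to the per-frame objective. The sketch below assumes the form ‖Φ(y_t) − s‖² + λ₁ L_Y(x_t, y_t) + λ₂ ‖y_t − y_{t−1}‖²; the stand-ins for Φ and L_Y are hypothetical placeholders chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def temporal_objective(x_t, y_t, s, y_prev, phi, L_Y, lam1=1.0, lam2=1.0):
    """Per-frame cost with a temporal consistency penalty on the pose."""
    data = np.sum((phi(y_t) - s) ** 2)            # match the observed features
    prior = lam1 * L_Y(x_t, y_t)                  # stay in the learned pose space
    smooth = lam2 * np.sum((y_t - y_prev) ** 2)   # stay close to the previous pose
    return data + prior + smooth

# Hypothetical stand-ins: identity renderer and a prior that only penalizes ||x||^2.
phi = lambda y: y
L_Y = lambda x, y: 0.5 * np.sum(x ** 2)

cost = temporal_objective(
    x_t=np.zeros(2),
    y_t=np.array([1.0, 2.0]),
    s=np.array([1.0, 2.0]),
    y_prev=np.zeros(2),
    phi=phi,
    L_Y=L_Y,
    lam2=0.5,
)
print(cost)  # only the temporal term is non-zero here: 0.5 * (1 + 4)
```

A larger λ₂ gives smoother pose sequences at the cost of lagging behind fast gestures, so the weight is a tuning choice rather than a fixed value.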

Experiments (Real Data)

(a) Silhouette images of real person

(b) SMA (Specialized Mapping Architecture)

(c) Our approach (without temporal consistency)

(d) Our approach (with temporal consistency)

Conclusion

• Proposed a novel method for pose estimation for a pre-defined gesture lexicon.

• Interesting to note that two dimensions are enough in our case.

• The technique is fast (about 0.1 sec per frame in Matlab).

• Tracking as an extension. [video]

Thank You

Comments after the talk

• Related work
  – Bullets / summary of strengths vs. weaknesses
  – Why do we need this work?
• Include the year of publication for the related work (e.g. the Rosales and Sclaroff work is not mentioned, the Sminchisescu work is not mentioned)
• Order the related work temporally?
• Include an introduction slide and a motivating slide
  – How to motivate this work?
  – State of the art is so and so… We found this common weakness. So we proposed this work.
• Human pose is not mentioned in the intro
• At the end of the talk, say why to use this work over the others
• Why GPLVM and not other reduction techniques, like LLE/PCA/ISOMAP etc.?
• Give a top-level overview of the algorithm. A flow chart view?
• Explain the L(x,y) mapping using an illustration, like the mapping between two planes. Clearly say what the high-dimensional y and the low-dimensional x are
• Give a reference for GPLVM or a website link.
• Add a slide on the math of GPLVM
• The Tikhonov regularization approach of minimizing ||Φ(y) − s|| + regularization term. Usually the regularization term is ||Dx||, but now we chose L(x,y). Explain why
• Add a slide to talk about the temporal constraint.
• Why learn the rendering function? I.e. because we want to take the derivative…
• Give the numbers for the training set; this gives an idea of how good the quantitative results are

Related Work

Model based
• Simulated annealing [Deutscher et al., CVPR ’00]
• Kinematic jump processes [Sminchisescu and Triggs, CVPR ’03]
• Markov chain Monte Carlo (MCMC) [Lee and Cohen, CVPR ’04]
• etc …

Learning based
• Specialized Mapping Architecture (SMA) [Rosales and Sclaroff, NIPS ’01]
• Relevance Vector Regression [Agarwal and Triggs, CVPR ’04]
• Parameter-Sensitive Hashing [Shakhnarovich et al., ICCV ’03]
• etc …

Overview of Framework (Learning Phase)

Two steps:

• Learn the rendering function Φ(·)
• Learn a model of human body poses (using GPLVM)

Overview of Framework (Estimation Phase)

Input Silhouette

Output Pose

Search over learned model of human body pose for solution consistent with observation

Kernel Function

• Measures the similarity of the latent variables x and x'.

$$k(x, x') = \alpha \exp\left(-\frac{\gamma}{2}\|x - x'\|^2\right) + \delta_{x,x'}\,\beta^{-1}$$

  – $\alpha$: how correlated x and x' are in general
  – $\gamma$: spread of the function
  – $\beta^{-1}$: noise in the prediction

• For a data set of size N, we can form an N × N kernel matrix K, in which $K_{i,j} = k(x_i, x_j)$.

GPLVM Training: Learning a Model of Body Poses

To learn the parameters of the GPLVM from the training set $\{y_i\}$, we maximize the following posterior:

$$p(\{x_i\}, \alpha, \beta, \gamma \mid \{y_i\})$$

placing the priors

$$p(x) = N(x \mid 0, I), \qquad p(\alpha, \beta, \gamma) \propto \frac{1}{\alpha\beta\gamma}$$

Gaussian Process Latent Variable Model (GPLVM)

$$L_Y(x, y)$$

• x: low-dimensional parameterization
• y: original space representation
• $L_Y(x, y)$ expresses how well the two values match

[Figure: space of feasible poses]

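• For a new pair (x, y), we can predict using

$$p(\{x_i\}, \alpha, \beta, \gamma, x' \mid \{y_i\}, y')$$

$$L_Y(x, y) = -\ln p(\{x, y\} \mid \{x_i, y_i\}, \alpha, \beta, \gamma) = \frac{\|y - f(x)\|^2}{2\sigma^2(x)} + \frac{D}{2}\ln\sigma^2(x) + \frac{1}{2}\|x\|^2$$

where

$$f(x) = Y^T K^{-1} k(x), \qquad \sigma^2(x) = k(x, x) - k(x)^T K^{-1} k(x)$$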