3/5/2002 Phillip Saltzman: Video Motion Capture (Christoph Bregler and Jitendra Malik, UC Berkeley, 1997)

3/5/2002 Phillip Saltzman

Video Motion Capture

Christoph Bregler

Jitendra Malik

UC Berkeley 1997


Overview

• Challenges
• Review
• Method
• Results
• Conclusions


Challenges

• High Accuracy
• Frequent Inter-part Occlusion
• Low Contrast


Review


Review

• Motion capture on synthetic images
– O’Rourke and Badler, 1980

• 1 DOF marker-free tracking
– Hogg, 1983; Rohr, 1993

• Higher DOF full-body tracking
– Gavrila and Davis, 1995


Review

About the previous work
– All in controlled environments with high contrast and clear edge boundaries
– Most use skintight suits or markers
– Camera calibration needed


Method


Method

Basic Assumptions
– From frame to frame, all pixel intensity changes are local:

– u is the motion model and is written as a matrix equation:
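As a reference for the two assumptions on this slide, one common way to write them is sketched here. The parameter symbol \phi is assumed notation, and H(x, y) stands for the per-pixel matrix of the motion model; an affine model is one possible choice for it.

```latex
% Brightness constancy: a pixel keeps its intensity as it moves by u
I\bigl(x + u_x(x, y, \phi),\; y + u_y(x, y, \phi),\; t + 1\bigr) = I(x, y, t)

% Motion model written as a matrix equation in the parameters \phi
u(x, y, \phi) = H(x, y)\,\phi
```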


Method

Finding Gradients
– Gradient form of the first equation:

– Find a least squares solution for the motion parameters
– Warp image I(t+1) based on the motion parameters
– Find new gradients
– Repeat to minimize the error
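The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method from the paper: it assumes a pure translation motion model (two parameters), uses the usual gradient constraint I_x u_x + I_y u_y + I_t = 0, and treats each image as a hypothetical callable so that warping is just evaluation at shifted coordinates. The names `estimate_translation`, `frame0`, and `frame1` are made up for the example.

```python
import math

def estimate_translation(img0, img1, points, iterations=10, h=1e-4):
    """Estimate (ux, uy) such that img1(x + ux, y + uy) ~= img0(x, y).

    img0 and img1 are hypothetical callables returning intensity at (x, y).
    """
    ux, uy = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations of the per-pixel constraints.
        sxx = sxy = syy = bx = by = 0.0
        for x, y in points:
            wx, wy = x + ux, y + uy            # warp by the current estimate
            it = img1(wx, wy) - img0(x, y)     # temporal difference after warping
            # Spatial gradients of the warped image by central differences.
            ix = (img1(wx + h, wy) - img1(wx - h, wy)) / (2 * h)
            iy = (img1(wx, wy + h) - img1(wx, wy - h)) / (2 * h)
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            bx -= ix * it
            by -= iy * it
        det = sxx * syy - sxy * sxy
        if abs(det) < 1e-12:
            break  # gradients too uniform to solve for both components
        # Solve the 2x2 system by Cramer's rule and update the estimate.
        ux += (bx * syy - by * sxy) / det
        uy += (by * sxx - bx * sxy) / det
    return ux, uy


def frame0(x, y):
    return math.sin(0.3 * x) + math.cos(0.25 * y)

def frame1(x, y):
    # The same pattern moved by (0.4, -0.3) between frames.
    return frame0(x - 0.4, y + 0.3)

grid = [(x, y) for x in range(10) for y in range(10)]
ux, uy = estimate_translation(frame0, frame1, grid)
```

On real images the warp would use interpolation on a pixel grid, and the two translation parameters would be replaced by the full motion parameter vector.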


Method

Motion as twists
– Standard pose matrix to move from object space to camera space (3D)
– Scaled orthographic projection moves to image space
– Requires knowing something about the 3D model of the object in the image. Approximated as ellipsoids.
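A small sketch of the two steps above: apply a pose (rotation plus translation) to bring an object-space point into camera space, then drop the depth and scale to reach image space. The function names and the sample numbers are assumptions chosen for illustration.

```python
import math

def project_scaled_orthographic(point, rotation, translation, scale):
    """Map a 3D object-space point to 2D image coordinates.

    rotation is a 3x3 matrix (list of rows) and translation a 3-vector;
    together they form the pose. Depth is dropped and x, y are scaled.
    """
    cam = [sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
           for i in range(3)]
    return scale * cam[0], scale * cam[1]

def rotation_about_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Rotate a point a quarter turn about z, shift it, and project with scale 2.
R = rotation_about_z(math.pi / 2)
x_im, y_im = project_scaled_orthographic([1.0, 0.0, 5.0], R, [0.5, 0.0, 2.0], scale=2.0)
```

Because the projection ignores depth, moving the point along the camera's z axis leaves its image coordinates unchanged; only the scale factor accounts for distance.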


Method

Motion as twists
– Any rigid motion can be represented as a rotation about an axis and a translation along that axis

– For example,

to make this motion:


Method

Motion as twists
You make this motion:


Method

Motion as twists
– Twists can be represented as a small vector or matrix

– Can be converted to a pose via the matrix exponential
– Encode the motion of a pixel between two frames
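To make the twist-to-pose step concrete, here is a minimal sketch using the standard closed form of the exponential map for a twist with a unit rotation axis. The twist is split into a linear part `v` and an axis `w`, with `theta` as the amount of motion; these names and the helper functions are assumptions for the example, not notation from the slides.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def twist_to_pose(v, w, theta):
    """Return (R, t) for the twist (v, w) applied through theta; w must be unit length."""
    wx, wy, wz = w
    K = [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]   # skew matrix of w
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(theta), math.cos(theta)
    # Rodrigues formula: R = I + sin(theta) K + (1 - cos(theta)) K^2
    R = [[(1.0 if i == j else 0.0) + s * K[i][j] + (1.0 - c) * K2[i][j]
          for j in range(3)] for i in range(3)]
    wxv = cross(w, v)
    w_dot_v = sum(wi * vi for wi, vi in zip(w, v))
    # t = (I - R)(w x v) + w (w . v) theta
    t = [wxv[i] - sum(R[i][j] * wxv[j] for j in range(3)) + w[i] * w_dot_v * theta
         for i in range(3)]
    return R, t

def apply_pose(R, t, p):
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

# Screw motion: a quarter turn about the z axis combined with a slide along it.
R, t = twist_to_pose(v=[0.0, 0.0, 0.5], w=[0.0, 0.0, 1.0], theta=math.pi / 2)
moved = apply_pose(R, t, [1.0, 0.0, 0.0])
```

In this example the point rotates from the x axis onto the y axis while rising along z, which is the "rotation about an axis plus translation along it" picture of a twist.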


Method

Motion as twists
– Linear algebra manipulation allows using the twist vector to write a motion equation for each pixel

– Those equations are put in a vector and used to find a global parameter vector for that object


Method

Kinematic chains
– Body parts represented as multiple connected objects
– Each object can be found from the top pose and an angle and twist for each object down the chain

– More linear algebra is used to find a parameter vector for each body part
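The chain idea can be illustrated with a planar simplification: the pose of a segment is the root pose multiplied by one joint transform per joint down the chain. This sketch assumes 2D revolute joints with pivots given in the rest configuration; the slides describe 3D twists, so treat this only as a reduced example with made-up function names.

```python
import math

def rotate_about(pivot, angle):
    """Homogeneous 3x3 matrix for a planar rotation by angle about pivot."""
    c, s = math.cos(angle), math.sin(angle)
    px, py = pivot
    return [[c, -s, px - c * px + s * py],
            [s, c, py - s * px - c * py],
            [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def segment_pose(root_pose, joint_pivots, joint_angles):
    """Pose of the last segment: root_pose * T_1(angle_1) * ... * T_k(angle_k).

    joint_pivots are in the rest configuration, ordered from the root outward.
    """
    pose = root_pose
    for pivot, angle in zip(joint_pivots, joint_angles):
        pose = matmul(pose, rotate_about(pivot, angle))
    return pose

def transform(pose, point):
    x, y = point
    return (pose[0][0] * x + pose[0][1] * y + pose[0][2],
            pose[1][0] * x + pose[1][1] * y + pose[1][2])

# Two-link arm lying along the x axis at rest, joints at (0, 0) and (1, 0).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pivots = [(0.0, 0.0), (1.0, 0.0)]
tip = transform(segment_pose(identity, pivots, [math.pi / 2, math.pi / 2]), (2.0, 0.0))
```

With both joints bent a quarter turn, the first link points up and the second points left, so the tip that rests at (2, 0) ends up at (-1, 1).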


Method

Multiple cameras
– Adds accuracy because the chance of fully occluded parts drops with each view
– Normal motion equation is:
– H is the system of equations for each pixel
– The unknown is the global parameter vector for each object
– z is the initial position of the pixel
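One plausible reading of the single-view equation, using the names H and z from this slide and \phi as an assumed symbol for the global parameter vector, is a linear system solved in the least-squares sense:

```latex
H \phi + z = 0,
\qquad
\phi = -\left(H^{\top} H\right)^{-1} H^{\top} z
```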


Method

Multiple cameras
– Adding synchronized cameras:
– H becomes a matrix where each column represents a view
– The parameter vector gets a term for each view that represents the pose seen from that view

– z becomes a vector with an initial position for each view.


Method

Support maps
– Limits pixel search to area defined by map for speed
– Value for each pixel in range [0,1], where 1 means pixel is in the region
– Method for finding the map starts as an elliptical guess, but refining it is not described
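One way such a map can be used is as a per-pixel weight in the least-squares solve, so pixels with weight 0 drop out entirely. The sketch below assumes a pure translation model and synthetic gradient constraints; `weighted_translation`, `constraint`, and the arm/torso labels are illustrative names, not part of the original method description.

```python
def weighted_translation(constraints, support):
    """Solve ix*ux + iy*uy + it = 0 in the weighted least-squares sense.

    constraints is a list of (ix, iy, it) per pixel; support holds one weight
    in [0, 1] per pixel, so pixels outside the region contribute nothing.
    """
    sxx = sxy = syy = bx = by = 0.0
    for (ix, iy, it), w in zip(constraints, support):
        sxx += w * ix * ix
        sxy += w * ix * iy
        syy += w * iy * iy
        bx -= w * ix * it
        by -= w * iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("supported pixels do not constrain both components")
    return (bx * syy - by * sxy) / det, (by * sxx - bx * sxy) / det


def constraint(ix, iy, ux, uy):
    # Build a gradient constraint that is exactly satisfied by motion (ux, uy).
    return (ix, iy, -(ix * ux + iy * uy))

gradients = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]
arm = [constraint(ix, iy, 1.0, 2.0) for ix, iy in gradients]     # moves by (1, 2)
torso = [constraint(ix, iy, -3.0, 0.0) for ix, iy in gradients]  # moves by (-3, 0)

# Support map for the arm: weight 1 on arm pixels, 0 on torso pixels.
support_arm = [1.0] * len(arm) + [0.0] * len(torso)
ux, uy = weighted_translation(arm + torso, support_arm)
```

With the arm's support map the solve recovers the arm's motion even though torso pixels with a different motion are present in the same list.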


Method

Algorithm review
Input: Images I(t), I(t+1), pose and IK angles for I(t)

Output: Pose and IK angles for I(t+1)

Find 3D points for each pixel in image

Compute support map for each segment

Initialize poses and IK angles for I(t+1) to those of I(t)

Iterate:

    Compute gradients

    Estimate and update poses and IK angles

    Warp image based on the pose and support map


Method

Initialization
– Algorithm depends on known positions for the first frame
– For multiple views, each first frame must be initialized
– User clicks joint positions, and 3D estimations and joint angles are computed
– Constraints such as symmetry can be enforced


Results


Results

– Single angle
– 53 frames with decent results
– Upper leg hard to track, so IK chain compensates with lower leg and torso

In Lab Movie


Results

– Oblique angle
– Tracking over 45 frames
– Algorithm could track change in scale due to perspective changes

Oblique Lab Movie


Results

– Oldest known “movie”
– High noise and low contrast
– Low frame rate
– Multiple views

Digital Muybridge


Conclusions

Future Work/Shortcomings
– May break with large movements
– Fixed camera only
– Did not show tracking of back limbs
– No timing data
– Few results
