3/5/2002 Phillip Saltzman
Video Motion Capture
Christoph Bregler
Jitendra Malik
UC Berkeley 1997
Overview
• Challenges
• Review
• Method
• Results
• Conclusions
Challenges
• High Accuracy
• Frequent Inter-part Occlusion
• Low Contrast
Review
• Motion capture on synthetic images
  – O'Rourke and Badler, 1980
• 1 DOF marker-free tracking
  – Hogg, 1983; Rohr, 1993
• Higher DOF full-body tracking
  – Gavrila and Davis, 1995
Review
About the previous work
– All in controlled environments with high contrast and clear edge boundaries
– Most use skintight suits or markers
– Camera calibration needed
Method
Basic Assumptions
– From frame to frame, all pixel intensity changes are local
– u is the motion model and is written as a matrix equation
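The equations on this slide did not survive in the transcript. As a hedged reconstruction, assuming the usual brightness-constancy form of gradient-based tracking, the two assumptions can be written as below. The symbol \phi for the motion parameters and the affine form of H are illustrative assumptions, not necessarily what the slide showed.

```latex
% Local intensity change: a pixel keeps its intensity as it moves by u
I\bigl(x + u_x(x, y, \phi),\; y + u_y(x, y, \phi),\; t + 1\bigr) = I(x, y, t)

% Motion model as a matrix equation (affine example, assumed for illustration)
u(x, y, \phi) = H(x, y)\,\phi, \qquad
H(x, y) = \begin{bmatrix} 1 & x & y & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & x & y \end{bmatrix}, \qquad
\phi = [a_1, a_2, a_3, a_4, a_5, a_6]^T
```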
Method
Finding Gradients
– Take the gradient form of the first equation
– Find a least squares solution to the gradient equation
– Warp image I(t+1) based on the solution
– Find new gradients
– Repeat to minimize
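The gradient, least squares, warp, and repeat loop can be sketched on a toy problem. This is a minimal sketch under strong assumptions: a 1D signal instead of an image, and a single translation parameter instead of the full motion model. The names `sample`, `estimate_shift`, and `bump` are hypothetical and do not come from the slides.

```python
import math

def sample(signal, x):
    """Linearly interpolate signal at real-valued x, clamped to the valid range."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    i = min(int(x), len(signal) - 2)
    frac = x - i
    return (1.0 - frac) * signal[i] + frac * signal[i + 1]

def estimate_shift(frame0, frame1, iterations=10):
    """Estimate shift so that frame1(x + shift) is close to frame0(x)."""
    shift = 0.0
    n = len(frame0)
    for _ in range(iterations):
        # Warp frame1 with the current estimate.
        warped = [sample(frame1, x + shift) for x in range(n)]
        num = 0.0
        den = 0.0
        for x in range(1, n - 1):
            grad = 0.5 * (warped[x + 1] - warped[x - 1])  # spatial gradient
            diff = warped[x] - frame0[x]                  # temporal difference
            num += grad * diff
            den += grad * grad
        if den == 0.0:
            break  # flat signal, nothing to estimate
        # One-parameter least squares solution of grad * delta + diff = 0.
        shift -= num / den
    return shift

def bump(center, n=64, sigma=5.0):
    """Toy frame: a Gaussian bump centered at `center`."""
    return [math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2)) for x in range(n)]

frame0 = bump(30.0)
frame1 = bump(32.5)  # same bump moved 2.5 samples to the right
shift = estimate_shift(frame0, frame1)
```

Each pass re-warps the second frame with the current estimate, so the remaining motion shrinks and the linear approximation becomes more accurate with every iteration.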
Method
Motion as twists
– A standard pose matrix moves from object space to camera space (3D)
– Scaled orthographic projection moves to image space
– Requires knowing something about the 3D model of the imaged object, which is approximated as ellipsoids
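The two mappings above can be sketched in a few lines. This is a minimal sketch assuming the pose is given as a 3x3 rotation plus a translation, and that scaled orthographic projection simply drops depth and multiplies by a scale. The function name and argument layout are hypothetical.

```python
def project_scaled_orthographic(point_obj, rotation, translation, scale):
    """Object space -> camera space (pose), then camera space -> image space."""
    # Camera-space point: q_c = R * q_o + t
    cam = [sum(rotation[r][c] * point_obj[c] for c in range(3)) + translation[r]
           for r in range(3)]
    # Scaled orthographic projection: ignore depth, scale x and y.
    return (scale * cam[0], scale * cam[1])

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_xy = project_scaled_orthographic([1.0, 1.0, 1.0], identity, [1.0, 2.0, 3.0], 2.0)
```

Depth never enters the result, which is why some knowledge of the 3D model is needed to place each pixel in object space.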
Method
Motion as twists
– Any rigid motion can be represented as a rotation about an axis and a translation along that axis
– For example, to make this motion: [figure]
Method
Motion as twists
You make this motion: [figure]
Method
Motion as twists
– Twists can be represented as a small vector or matrix
– Can be converted to a pose
– Encode the motion of a pixel between two frames
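A twist is commonly converted to a pose with the matrix exponential. The sketch below assumes a twist given as a linear part `v` and an angular part `w` whose length is the rotation angle, and uses the standard closed form (Rodrigues' formula for the rotation). The helper names are hypothetical and the slide's own formula is not preserved.

```python
import math

def hat(w):
    """Skew-symmetric matrix of w, so that hat(w) times p equals w cross p."""
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def twist_to_pose(v, w):
    """Exponential map of the twist (v, w): returns rotation R and translation t."""
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    theta = math.sqrt(sum(c * c for c in w))
    if theta < 1e-12:
        return eye, [float(c) for c in v]  # no rotation: pure translation
    W = hat(w)
    W2 = matmul(W, W)
    a = math.sin(theta) / theta
    b = (1.0 - math.cos(theta)) / theta ** 2
    c = (theta - math.sin(theta)) / theta ** 3
    R = [[eye[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)] for i in range(3)]
    V = [[eye[i][j] + b * W[i][j] + c * W2[i][j] for j in range(3)] for i in range(3)]
    t = [sum(V[i][k] * v[k] for k in range(3)) for i in range(3)]
    return R, t

# Quarter turn about the z axis combined with one unit of travel along z.
R, t = twist_to_pose([0.0, 0.0, 1.0], [0.0, 0.0, math.pi / 2])
```

The example is a screw motion: the rotation axis and the translation direction coincide, so the six twist numbers fully describe the pose.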
Method
Motion as twists
– Linear algebra manipulation allows using the twist vector to write a motion equation for each pixel
– Those equations are stacked into a vector and used to find a global parameter vector for that object
Method
Kinematic chains
– Body parts are represented as multiple connected objects
– Each object can be found from the top pose and an angle and twist for each object down the chain
– More linear algebra is used to find a parameter vector for each body part
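One standard way to write such a chain is the product-of-exponentials form from robotics. The notation here is an assumption, since the slide's symbols were lost: G is the top pose, \hat{\xi}_i the twist of joint i, and \theta_i its angle.

```latex
% Pose of the k-th body part: top pose followed by each joint down the chain
g_k = G \cdot e^{\hat{\xi}_1 \theta_1} \cdot e^{\hat{\xi}_2 \theta_2} \cdots e^{\hat{\xi}_k \theta_k}
```

In this form a part near the end of the chain depends on every joint angle above it, which is how the chain can compensate for a hard-to-track segment.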
Method
Multiple cameras
– Adds accuracy because the chance of fully occluded parts drops with each view
– The normal motion equation has three parts
– H is the system of equations for each pixel
– The parameter vector is global for each object
– z is the initial position of the pixel
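The equation itself is missing from the transcript. A hedged reconstruction, assuming the linear form implied by the three parts listed above and using \phi as an assumed symbol for the parameter vector:

```latex
% Single-view motion equation, one row of H per pixel
H \phi + z = 0

% Least squares solution for the parameter vector
\hat{\phi} = -\,(H^T H)^{-1} H^T z
```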
Method
Multiple cameras
– Adding synchronized cameras:
– H becomes a matrix where each column represents a view
– The parameter vector gets a term for each view that represents the pose seen from that view
– z becomes a vector with an initial position for each view
Method
Support maps
– Limits pixel search to the area defined by the map, for speed
– Value for each pixel is in the range [0,1], where 1 means the pixel is in the region
– Finding the map starts with an elliptical guess, but refining it is not described
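One plausible way to use a [0,1] support value is as a per-pixel weight in the least squares step, so pixels outside the region contribute nothing. This is a minimal sketch under that assumption, reduced to a single motion parameter. The function name and inputs are hypothetical.

```python
def weighted_update(gradients, residuals, support):
    """Weighted least squares solution of gradient * delta + residual = 0.

    support[i] in [0, 1] says how strongly pixel i belongs to the region.
    """
    num = sum(w * g * r for w, g, r in zip(support, gradients, residuals))
    den = sum(w * g * g for w, g in zip(support, gradients))
    if den == 0.0:
        return 0.0  # no supported pixel carries gradient information
    return -num / den

gradients = [1.0, 1.0, 1.0, 1.0]
residuals = [-2.0, -2.0, 5.0, 5.0]  # last two pixels belong to another body part
delta_masked = weighted_update(gradients, residuals, [1.0, 1.0, 0.0, 0.0])
delta_unmasked = weighted_update(gradients, residuals, [1.0, 1.0, 1.0, 1.0])
```

With the map applied, the estimate follows only the first two pixels. Without it, the pixels from the other part pull the estimate in the wrong direction.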
Method
Algorithm review
Input: Images I(t) and I(t+1), pose and IK angles for I(t)
Output: Pose and IK angles for I(t+1)
Find 3D points for each pixel in image
Compute support map for each segment
Initialize poses and IK angles for I(t+1) to those of I(t)
Iterate:
  Compute gradients
  Estimate and update poses and IK angles
  Warp image based on the pose and support map
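The control flow of the iteration can be sketched as a small driver. This is only a skeleton under stated assumptions: `estimate_update` and `warp` are hypothetical callables supplied by the caller, the parameters are a flat list, and the additive update is a simplification. The 3D point lookup and support map steps are left out.

```python
def track_frame(frame_prev, frame_next, params_prev, estimate_update, warp, iterations=5):
    """Return parameters for frame_next, starting from those of frame_prev."""
    params = list(params_prev)  # initialize I(t+1) with the values from I(t)
    warped = frame_next
    for _ in range(iterations):
        # Gradients and least squares are hidden inside estimate_update.
        delta = estimate_update(frame_prev, warped)
        params = [p + d for p, d in zip(params, delta)]
        # Re-warp the new frame with the updated parameters.
        warped = warp(frame_next, params)
    return params

# Toy stand-ins: a "frame" is one number and the motion is a single offset.
toy_estimate = lambda prev, warped: [warped - prev]
toy_warp = lambda frame, params: frame - params[0]
result = track_frame(10.0, 13.0, [0.0], toy_estimate, toy_warp)
```

The toy callables exist only to exercise the loop; a real tracker would plug in the gradient and warping steps listed above.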
Method
Initialization
– Algorithm depends on known positions for the first frame
– For multiple views, each first frame must be initialized
– User clicks joint positions, and 3D estimations and joint angles are computed
– Values like symmetry can be enforced
Results
– Single angle
– 53 frames with decent results
– Upper leg hard to track, so IK chain compensates with lower leg and torso
In Lab Movie
Results
– Oblique angle
– Tracking over 45 frames
– Algorithm could track change in scale due to perspective changes
Oblique Lab Movie
Results
– Oldest known “movie”
– High noise and low contrast
– Low framerate
– Multiple views
Digital Muybridge
Conclusions
Future Work/Shortcomings
– May break with large movements
– Fixed camera only
– Did not show tracking of back limbs
– No timing data
– Few results