Algorithms & Applications in Computer Vision Lihi Zelnik-Manor Lecture 9: Optical Flow


  • Slide 1
  • Algorithms & Applications in Computer Vision Lihi Zelnik-Manor Lecture 9: Optical Flow
  • Slide 2
  • Last Time: Alignment. Homographies, rotational panoramas, RANSAC, global alignment, warping, blending.
  • Slide 3
  • Today: Motion and Flow. Motion estimation, patch-based motion (optic flow), regularization and line processes, parametric (global) motion, layered motion models.
  • Slide 4
  • Optical Flow: Where did each pixel in image 1 go to in image 2?
  • Slide 5
  • Optical Flow Pierre Kornprobst's Demo
  • Slide 6
  • Introduction: Given a video sequence in which the camera and/or objects are moving, we can better understand the scene if we recover the motions of the camera and objects.
  • Slide 7
  • Scene Interpretation How is the camera moving? How many moving objects are there? Which directions are they moving in? How fast are they moving? Can we recognize their type of motion (e.g. walking, running, etc.)?
  • Slide 8
  • Applications: Recover camera ego-motion. Result by MobilEye (www.mobileye.com).
  • Slide 9
  • Applications: Motion segmentation. Result by: L. Zelnik-Manor, M. Machline, M. Irani, Multi-body Segmentation: Revisiting Motion Consistency. To appear, IJCV.
  • Slide 10
  • Applications: Structure from Motion. (Figure: input / reconstructed shape.) Result by: L. Zhang, B. Curless, A. Hertzmann, S.M. Seitz, Shape and motion under varying illumination: Unifying structure from motion, photometric stereo, and multi-view stereo. ICCV 2003.
  • Slide 11
  • Examples of motion fields: forward motion, rotation, horizontal translation. Closer objects appear to move faster!
  • Slide 12
  • Motion Field & Optical Flow Field. Motion field = real-world 3D motion. Optical flow field = projection of the motion field onto the 2D image. (Figure: a 3D motion vector and its 2D optical flow vector on the CCD.)
  • Slide 13
  • When does it break? The screen is stationary yet displays motion. Homogeneous objects generate zero optical flow. A fixed sphere with a changing light source. Non-rigid texture motion.
  • Slide 14
  • The Optical Flow Field. Still, in many cases it does work. Goal: find for each pixel a velocity vector that says how quickly the pixel is moving across the image and in which direction.
  • Slide 15
  • How do we actually do that?
  • Slide 16
  • Estimating Optical Flow: assume the image intensity is constant. (Figure: frames at time t and time t+dt.)
  • Slide 17
  • Brightness Constancy Equation. Apply a first-order Taylor expansion, simplify notation, and divide by dt (see the reconstruction below). The gradient and the flow are orthogonal. Problem I: one equation, two unknowns.
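A reconstruction of the derivation sketched on this slide (the equation images did not survive the transcript); this is the standard brightness constancy derivation:

```latex
% Brightness constancy: the intensity of a moving point is constant.
I(x + u\,dt,\; y + v\,dt,\; t + dt) = I(x, y, t)
% First-order Taylor expansion:
I + I_x\,u\,dt + I_y\,v\,dt + I_t\,dt \approx I
% Divide by dt to get the brightness constancy (optical flow) equation:
I_x u + I_y v + I_t = 0
\quad\Longleftrightarrow\quad
\nabla I \cdot (u, v)^\top = -I_t
```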
  • Slide 18
  • Problem II: The Aperture Problem. Where did the blue point move to between time t and time t+dt? For points on a line of fixed intensity we can only recover the normal flow. We need additional constraints.
  • Slide 19
  • Use Local Information: sometimes enlarging the aperture can help.
  • Slide 20
  • Local smoothness. Lucas-Kanade (1984): assume constant (u,v) in a small neighborhood.
  • Slide 21
  • Lucas-Kanade (1984). Goal: minimize the summed squared brightness-constancy error over the window, which yields a 2x2 linear system in the 2x1 flow vector (reconstructed below). Method: least squares.
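The 2x2 system named on the slide, reconstructed in its standard form (summations over the window W):

```latex
E(u,v) = \sum_{(x,y) \in W} \bigl( I_x u + I_y v + I_t \bigr)^2
% Setting \partial E/\partial u = \partial E/\partial v = 0 gives the
% 2x2 normal equations in the 2x1 flow vector:
\begin{pmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{pmatrix}
\begin{pmatrix} u \\ v \end{pmatrix}
= - \begin{pmatrix} \sum I_x I_t \\ \sum I_y I_t \end{pmatrix}
```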
  • Slide 22
  • How does Lucas-Kanade behave? We want this matrix to be invertible, i.e., to have no zero eigenvalues.
  • Slide 23
  • How does Lucas-Kanade behave? An edge makes the matrix singular.
  • Slide 24
  • How does Lucas-Kanade behave? Homogeneous regions: both eigenvalues near 0.
  • Slide 25
  • How does Lucas-Kanade behave? Textured regions: two high eigenvalues.
  • Slide 26
  • How does Lucas-Kanade behave? Summary: an edge makes the matrix singular; homogeneous regions have low gradients; highly textured regions are well conditioned. A sketch of this eigenvalue test follows.
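A minimal NumPy sketch of the eigenvalue test discussed above; it is not from the lecture, and the threshold tau is illustrative:

```python
import numpy as np

def classify_window(Ix, Iy, tau=1e-2):
    """Classify a window from the eigenvalues of the 2x2 Lucas-Kanade matrix.

    Ix, Iy: spatial image gradients over one window (2D arrays).
    tau: illustrative eigenvalue threshold.
    """
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    lam_min, lam_max = np.linalg.eigvalsh(A)  # ascending order
    if lam_max < tau:
        return "homogeneous"      # both eigenvalues ~ 0: no information
    if lam_min < tau:
        return "edge"             # one small eigenvalue: aperture problem
    return "corner / texture"     # two large eigenvalues: flow recoverable
```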
  • Slide 27
  • Iterative Refinement: 1) estimate velocity at each pixel using one iteration of Lucas-Kanade estimation; 2) warp one image toward the other using the estimated flow field (easier said than done); 3) refine the estimate by repeating the process. (Sketch below.)
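A hedged Python/OpenCV sketch of this loop, assuming a hypothetical single-level solver lucas_kanade(I0, I1) -> (du, dv):

```python
import cv2
import numpy as np

def refine_flow(I0, I1, n_iters=5):
    """Iteratively estimate flow from I0 to I1 by warp-and-refine."""
    h, w = I0.shape
    u = np.zeros((h, w), np.float32)
    v = np.zeros((h, w), np.float32)
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    for _ in range(n_iters):
        # Warp I1 back toward I0 using the current flow estimate.
        warped = cv2.remap(I1, xs + u, ys + v, cv2.INTER_LINEAR)
        # One Lucas-Kanade step on the residual motion (hypothetical solver).
        du, dv = lucas_kanade(I0, warped)
        u += du
        v += dv
    return u, v
```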
  • Slides 28-31
  • Optical Flow: Iterative Estimation. (Figures: a 1D illustration; starting from an initial guess x0, each iteration produces an estimate and an update that converge toward the true shift.)
  • Slide 32
  • Some Implementation Issues: Warping is not easy (ensure that errors in warping are smaller than the estimate refinement). Warp one image and take derivatives of the other so you don't need to re-compute the gradient after each iteration. It is often useful to low-pass filter the images before motion estimation (for better derivative estimation and better linear approximations to image intensity).
  • Slide 33
  • Other break-downs. Brightness constancy is not satisfied: use correlation-based methods. A point does not move like its neighbors (what is the ideal window size?): use regularization-based methods. The motion is not small, so the Taylor expansion doesn't hold, and aliasing occurs: use multi-scale estimation.
  • Slide 34
  • Optical Flow: Aliasing. Temporal aliasing causes ambiguities in optical flow because images can have many pixels with the same intensity, i.e., how do we know which correspondence is correct? (Figure: the nearest match is correct when there is no aliasing and incorrect under aliasing; actual vs. estimated shift.) To overcome aliasing: coarse-to-fine estimation.
  • Slide 35
  • Multi-Scale Flow Estimation. (Figure: Gaussian pyramids of images I_t and I_t+1; a shift of u=10 pixels at full resolution becomes u=5, u=2.5, and u=1.25 pixels at successively coarser levels.)
  • Slide 36
  • Multi-Scale Flow Estimation. (Figure: run Lucas-Kanade at the coarsest level of the pyramids of I_t and I_t+1, then warp & upsample, repeating down to full resolution; see the sketch below.)
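A sketch of the coarse-to-fine scheme in the diagram, again assuming the hypothetical single-level lucas_kanade solver from the previous sketch; the level count and interpolation choices are illustrative:

```python
import cv2
import numpy as np

def pyramid_flow(I0, I1, levels=4):
    """Coarse-to-fine flow: solve at the top of the pyramid, then refine."""
    pyr0, pyr1 = [I0], [I1]
    for _ in range(levels - 1):
        pyr0.append(cv2.pyrDown(pyr0[-1]))
        pyr1.append(cv2.pyrDown(pyr1[-1]))
    u = v = None
    for J0, J1 in zip(reversed(pyr0), reversed(pyr1)):  # coarse to fine
        h, w = J0.shape
        if u is None:
            u = np.zeros((h, w), np.float32)
            v = np.zeros((h, w), np.float32)
        else:
            # Upsample the coarser flow and double the vectors.
            u = 2 * cv2.resize(u, (w, h))
            v = 2 * cv2.resize(v, (w, h))
        # Warp, then solve for the residual (hypothetical single-level solver).
        ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
        warped = cv2.remap(J1, xs + u, ys + v, cv2.INTER_LINEAR)
        du, dv = lucas_kanade(J0, warped)
        u, v = u + du, v + dv
    return u, v
```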
  • Slide 37
  • Other break-downs (recap). Brightness constancy is not satisfied: use correlation-based methods. A point does not move like its neighbors (what is the ideal window size?): use regularization-based methods. The motion is not small, so the Taylor expansion doesn't hold, and aliasing occurs: use multi-scale estimation.
  • Slide 38
  • Spatial Coherence Assumption * Neighboring points in the scene typically belong to the same surface and hence typically have similar motions. * Since they also project to nearby points in the image, we expect spatial coherence in image flow. Black
  • Slide 39
  • Formalize this Idea. (Figure: a noisy 1D signal u over x, with noisy measurements p(x).) Black
  • Slide 40
  • Regularization: find the best-fitting smoothed function q(x). (Figure: noisy measurements p(x) and the smoothed q(x).) Black
  • Slides 41-43
  • Membrane model: find the best-fitting smoothed function q(x). (Figures: progressively smoothed fits over x.) Black
  • Slide 44
  • Regularization: minimize an energy that is faithful to the data plus a spatial smoothness assumption (reconstructed below for the 1D example). (Figure: x, u, q(x).) Black
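The energy the slide asks to minimize, reconstructed for the 1D example (q is the smoothed function, p the noisy measurements, lambda the regularization weight):

```latex
E(q) = \underbrace{\sum_x \bigl( q(x) - p(x) \bigr)^2}_{\text{faithful to the data}}
     + \lambda \, \underbrace{\sum_x \bigl( q(x+1) - q(x) \bigr)^2}_{\text{spatial smoothness}}
```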
  • Slide 45
  • Bayesian Interpretation Black
  • Slide 46
  • Discontinuities. (Figure: x, u, q(x).) What about this discontinuity? What is happening here? What can we do? Black
  • Slide 47
  • Robust Estimation. Noise distributions are often non-Gaussian, having much heavier tails. Noise samples from the tails are called outliers. Sources of outliers (multiple motions): specularities / highlights; jpeg artifacts / interlacing / motion blur; multiple motions (occlusion boundaries, transparency). (Figure: velocity space with two clusters u1 and u2.) Black
  • Slide 48
  • Occlusion: occlusion, disocclusion, shear. Multiple motions within a finite region. Black
  • Slide 49
  • Coherent Motion Possibly Gaussian. Black
  • Slide 50
  • Multiple Motions Definitely not Gaussian. Black
  • Slide 51
  • Weak membrane model. (Figure: u, v(x) over x.) Black
  • Slide 52
  • Analog line process. (Figure: a penalty function as the lower envelope of a family of quadratics.) Black
  • Slide 53
  • Analog line process. The infimum over the line process defines a robust error function; the minima are the same. Black
  • Slide 54
  • Least Squares Estimation. Standard least-squares estimation gives too much influence to outlying points.
  • Slide 55
  • Robust Regularization: minimize the data and smoothness terms with robust penalties, treating large spatial derivatives as outliers. (Figure: x, u, v(x).) Black
  • Slide 56
  • Robust Estimation. Problem: least-squares estimators penalize deviations between data and model with a quadratic error function, which is extremely sensitive to outliers. Redescending error functions (e.g., Geman-McClure, written out below) help to reduce the influence of outlying measurements. (Figures: error/penalty functions and their influence functions.) Black
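The Geman-McClure penalty named on the slide, with its (redescending) influence function:

```latex
\rho(x, \sigma) = \frac{x^2}{\sigma^2 + x^2},
\qquad
\psi(x, \sigma) = \frac{\partial \rho}{\partial x}
               = \frac{2 x \sigma^2}{(\sigma^2 + x^2)^2}
% \psi \to 0 as |x| \to \infty: outliers lose influence.
```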
  • Slide 57
  • Regularization. Horn and Schunck (1981): add a global smoothness term to the error in the brightness constancy equation and minimize the combined functional (reconstructed below); solve by calculus of variations.
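The Horn-Schunck functional the slide refers to, reconstructed in its standard form:

```latex
E(u, v) = \iint \underbrace{\bigl( I_x u + I_y v + I_t \bigr)^2}_{\text{brightness constancy}}
        + \lambda \, \underbrace{\bigl( \|\nabla u\|^2 + \|\nabla v\|^2 \bigr)}_{\text{smoothness}} \; dx \, dy
```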
  • Slide 58
  • Robust Estimation Black & Anandan (1993) Regularization can over-smooth across edges Use smarter regularization
  • Slide 59
  • Robust Estimation. Black & Anandan (1993): regularization can over-smooth across edges, so use smarter regularization. Minimize robust versions of both the brightness constancy and smoothness terms (reconstructed below).
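The Black & Anandan objective, reconstructed in its standard form (rho_D and rho_S are robust penalties such as Geman-McClure; N(s) denotes the neighbors of pixel s):

```latex
E(u, v) = \sum_{s} \rho_D\bigl( I_x u_s + I_y v_s + I_t,\ \sigma_D \bigr)
        + \lambda \sum_{s} \sum_{n \in \mathcal{N}(s)}
          \Bigl[ \rho_S(u_s - u_n,\ \sigma_S) + \rho_S(v_s - v_n,\ \sigma_S) \Bigr]
```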
  • Slide 60
  • Optimization: gradient descent; coarse-to-fine (pyramid); deterministic annealing. Black
  • Slide 61
  • Example
  • Slide 62
  • Multi-resolution registration
  • Slide 63
  • Optical Flow Results
  • Slide 64
  • Slide 65
  • Parametric motion estimation
  • Slide 66
  • Global (parametric) motion models. 2D models: affine; quadratic; planar projective transform (homography). 3D models: instantaneous camera motion models; homography + epipole; plane + parallax. Szeliski
  • Slide 67
  • Motion models: translation (2 unknowns), affine (6 unknowns), perspective (8 unknowns), 3D rotation (3 unknowns). Szeliski
  • Slide 68
  • Example: Affine Motion. For a panning camera or planar surfaces: only 6 parameters to solve for, and better results. Least-squares minimization over all pixels p: Err(a) = \sum_p [ I_x (a_1 + a_2 x + a_3 y) + I_y (a_4 + a_5 x + a_6 y) + I_t ]^2
  • Slide 69
  • Last lecture: Alignment / motion warping. Alignment: assuming we know the correspondences, how do we get the transformation? Expressed in terms of absolute coordinates of corresponding points; features generally presumed to be detected separately in each frame (e.g., an affine model in absolute coordinates).
  • Slides 70-75
  • Today: flow, parametric motion. Two views presumed in temporal sequence; track or analyze the spatio-temporal gradient. Features are sparse or dense in the first frame; search in the second frame. Motion models are expressed in terms of position change (u_i, v_i): previously an alignment model in absolute coordinates, now a displacement model.
  • Slide 76
  • Other 2D Motion Models: quadratic (instantaneous approximation to planar motion); projective (exact planar motion). Szeliski
  • Slide 77
  • 3D Motion Models: instantaneous camera motion (global parameters plus a local parameter per pixel); homography + epipole (global parameters plus a local parameter); residual planar-parallax motion (global parameters plus a local parameter).
  • Slide 78
  • Segmentation of Affine Motion. (Figure: input = segmentation result.) Result by: L. Zelnik-Manor, M. Irani, Multi-frame estimation of planar motion, PAMI 2000.
  • Slide 79
  • Panoramas. Motion estimation by Andrew Zisserman's group. (Input sequence.)
  • Slide 80
  • Stabilization Result by: L.Zelnik-Manor, M.Irani Multi-frame estimation of planar motion, PAMI 2000
  • Slide 81
  • Discrete Search vs. Gradient Based. Consider image I translated by some displacement: the discrete search method simply searches for the best estimate, while the gradient method linearizes the intensity function and solves for the estimate. Szeliski
  • Slide 82
  • Correlation and SSD. For larger displacements, do template matching: define a small area around a pixel as the template; match the template against each pixel within a search area in the next image; use a match measure such as correlation, normalized correlation, or sum-of-squared differences; choose the maximum (or minimum) as the match; refine to a sub-pixel estimate (Lucas-Kanade). (Sketch below.) Szeliski
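A small sketch of this procedure using OpenCV's matchTemplate with the SSD measure; the patch and search-window sizes are illustrative, and the patch is assumed to lie away from the image borders:

```python
import cv2
import numpy as np

def match_patch(I0, I1, x, y, patch=7, search=15):
    """Find the displacement of the patch around (x, y) from I0 to I1 by SSD."""
    r, s = patch // 2, search // 2
    template = I0[y - r:y + r + 1, x - r:x + r + 1]
    window = I1[y - r - s:y + r + s + 1, x - r - s:x + r + s + 1]
    ssd = cv2.matchTemplate(window, template, cv2.TM_SQDIFF)
    dy, dx = np.unravel_index(np.argmin(ssd), ssd.shape)
    return dx - s, dy - s  # (u, v): SSD is minimized at the best match
```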
  • Slide 83
  • Shi-Tomasi feature tracker: 1. Find good features (min eigenvalue of the 2x2 Hessian). 2. Use Lucas-Kanade to track with pure translation. 3. Use affine registration with the first feature patch. 4. Terminate tracks whose dissimilarity gets too large. 5. Start new tracks when needed. (Sketch below.) Szeliski
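Steps 1-2 map directly onto OpenCV built-ins: goodFeaturesToTrack implements the Shi-Tomasi minimum-eigenvalue detector, and calcOpticalFlowPyrLK implements the pyramidal Lucas-Kanade tracker. A minimal sketch (the affine-registration check of steps 3-4 is omitted, and the parameter values are illustrative):

```python
import cv2

def track(I0, I1, max_corners=200):
    """Detect Shi-Tomasi features in grayscale I0 and track them into I1."""
    pts0 = cv2.goodFeaturesToTrack(I0, maxCorners=max_corners,
                                   qualityLevel=0.01, minDistance=7)
    pts1, status, _err = cv2.calcOpticalFlowPyrLK(I0, I1, pts0, None)
    good = status.ravel() == 1
    return pts0[good], pts1[good]  # matched positions in both frames
```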
  • Slide 84
  • Tracking result
  • Slide 85
  • How well do these techniques work?
  • Slide 86
  • A Database and Evaluation Methodology for Optical Flow. Simon Baker, Daniel Scharstein, J.P. Lewis, Stefan Roth, Michael Black, and Richard Szeliski, ICCV 2007. http://vision.middlebury.edu/flow/
  • Slide 87
  • Limitations of Yosemite: the only sequence used for quantitative evaluation. Limitations: very simple and synthetic; small, rigid motion; minimal motion discontinuities/occlusions. (Figures: images 7 and 8 of Yosemite, ground-truth flow, flow color coding.) Szeliski http://www.youtube.com/watch?v=0xRDFf7UtQE
  • Slide 88
  • Limitations of Yosemite (continued). Current challenges: non-rigid motion; real sensor noise; complex natural scenes; motion discontinuities. We need more challenging and more realistic benchmarks. (Figures as above.) Szeliski
  • Slide 89
  • Realistic synthetic imagery: randomly generate scenes with trees and rocks, with significant occlusions, motion, texture, and blur; rendered using Mental Ray and a lens-shader plugin. (Examples: Rock, Grove.) Szeliski
  • Slide 90
  • Modified stereo imagery: re-crop and resample ground-truth stereo datasets to have motion appropriate for optical flow. (Examples: Venus, Moebius.) Szeliski
  • Slide 91
  • Dense flow with hidden texture: paint the scene with textured fluorescent paint; take two images, one in visible light and one in UV light; move the scene in very small steps using a robot; generate ground truth by tracking the UV images. (Figure: setup; visible; UV; lights; image; cropped.) Szeliski
  • Slide 92
  • Experimental results. Szeliski http://vision.middlebury.edu/flow/
  • Slide 93
  • Conclusions. Difficulty: the data is substantially more challenging than Yosemite. Diversity: substantial variation in difficulty across the various datasets. Motion ground truth vs. interpolation: the best algorithms for one are not the best for the other. Comparison with stereo: the performance of existing flow algorithms appears weak. Szeliski
  • Slide 94
  • Layered Scene Representations
  • Slide 95
  • Motion representations How can we describe this scene? Szeliski
  • Slide 96
  • Block-based motion prediction: break the image up into square blocks; estimate a translation for each block; use this to predict the next frame and code the difference (MPEG-2). Szeliski
  • Slide 97
  • Layered motion: break the image sequence up into layers and describe each layer's motion. Szeliski
  • Slide 98
  • Layered Representation (for scenes with multiple affine motions): estimate the dominant motion parameters; reject pixels which do not fit; iterate to convergence; restart on the remaining pixels.
  • Slide 99
  • Layered motion. Advantages: can represent occlusions / disocclusions; each layer's motion can be smooth; provides video segmentation for semantic processing. Difficulties: how do we determine the correct number of layers? how do we assign pixels? how do we model the motion? Szeliski
  • Slide 100
  • How do we form them? Szeliski
  • Slide 101
  • How do we estimate the layers? 1. Compute coarse-to-fine flow. 2. Estimate affine motion in blocks (regression). 3. Cluster with k-means. 4. Assign pixels to the best-fitting affine region. 5. Re-estimate affine motions in each region. (Sketch of steps 2-3 below.) Szeliski
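A sketch of steps 2-3 under simple assumptions: fit six affine parameters to the flow in each block by linear least squares, then cluster the parameter vectors with k-means; the block size and k are illustrative:

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def affine_params(u, v):
    """Least-squares fit of u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y."""
    h, w = u.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.stack([np.ones_like(xs), xs, ys], axis=-1).reshape(-1, 3).astype(float)
    au = np.linalg.lstsq(A, u.ravel(), rcond=None)[0]
    av = np.linalg.lstsq(A, v.ravel(), rcond=None)[0]
    return np.concatenate([au, av])  # 6 affine parameters

def cluster_blocks(u, v, block=16, k=4):
    """Fit affine motion per block, then k-means cluster the parameters."""
    params = [affine_params(u[i:i + block, j:j + block],
                            v[i:i + block, j:j + block])
              for i in range(0, u.shape[0] - block + 1, block)
              for j in range(0, u.shape[1] - block + 1, block)]
    centers, labels = kmeans2(np.array(params), k, minit='points')
    return centers, labels
```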
  • Slide 102
  • Layer synthesis. For each layer: stabilize the sequence with the affine motion; compute the median value at each pixel. Then determine occlusion relationships. Szeliski
  • Slide 103
  • Results Szeliski
  • Slide 104
  • Recent results: SIFT Flow
  • Slide 105
  • Slide 106
  • Recent GPU Implementation: http://gpu4vision.icg.tugraz.at/ Real-time flow exploiting a robust norm + regularized mapping.
  • Slide 107
  • Today: Motion and Flow. Motion estimation, patch-based motion (optic flow), regularization and line processes, parametric (global) motion, layered motion models.
  • Slide 108
  • Slide Credits: Trevor Darrell, Rick Szeliski, Michael Black