Computer Vision: Object and People Tracking. Video Tracking. Alain Pagani, Kaiserslautern


Post on 21-Apr-2018


  • Video Tracking

    Prof. Didier Stricker

    Dr. Alain Pagani

    Kaiserslautern University

    http://ags.cs.uni-kl.de/

    DFKI Deutsches Forschungszentrum für Künstliche Intelligenz

    http://av.dfki.de

    Computer Vision

    Object and People Tracking


  • Outline

    Video Tracking

    Definition

    Applications

    Bayesian filters for video tracking

    Motion models

    Appearance models

    Dynamic Appearance Models

    Template Update Problem

    Machine learning based models

    Occlusion handling

    Fusion

    Evaluation

    Redetection

  • Video Tracking: definitions

    Also known as: Visual Tracking, Visual Object Tracking, Vision-based Tracking

    Defined as the process of estimating over time the location of one or more objects using a camera [1]

    Objects: person, head, hands, animal, ball, car, plane, semi-finished products, commercial products, cell

    [1] E. Maggio and A. Cavallaro, Video Tracking: Theory and Practice, John Wiley & Sons, Ltd, 2011

  • Video Tracking: details

    Sensor: a video camera

    Input: one (continuous) video sequence

    Images: the only source of data for measurement

    Visible variable (in the Hidden Markov Model)

    Camera might be moving (non-static background)

    State is known in the first frame only

    Online (i.e. sequential) tracking

  • Approaches to tracking

    Sequential (recursive, online)

    + Inexpensive, real-time

    - No future information

    - Cannot revisit past errors

    Batch processing (offline)

    - Expensive, not real-time

    + Considers all information

    + Can correct past errors

    [Figure: time axis with frames t-1, t, t+1 and the full sequence t = 1, ..., T]

    Slide from 1st lecture

  • Tracking applications


    Slide from 1st lecture

  • Video tracking vs. Camera tracking

    Same input: video sequence

    Different aims:

    Video Tracking: track an object inside the video sequence over time

    Camera Tracking: track the pose (rotation + translation) of the camera over time

    Typical application of Camera Tracking: Augmented Reality

  • Video Tracking applications

    Tracking is an essential step in many computer vision based applications

    Detection + Feature Extraction → Tracking → Activity Recognition + Event Recognition → Behavior Analysis + Social Models

    Prithwijit Guha, Amitabha Mukerjee and K. S. Venkatesh, Spatio-temporal Discovery: Appearance + Behavior = Agent, Computer Vision, Graphics and Image Processing, 4338, 516-527, 2007

    http://www.cse.iitk.ac.in/research/pdfs/phd/guha-mukerjee-venkatesh-icvgip07_spatiotemporal-discovery.pdf

    Slide from 1st lecture

  • [1] E. Maggio and A. Cavallaro, Video Tracking: Theory and Practice, John Wiley & Sons, Ltd, 2011

    (Slide from [1])

  • (Slide from [1])

  • Sport analysis

    (Slide from [1])

  • Automatic Sports Coverage

    F. Chen, C. De Vleeschouwer, "Autonomous production of basketball videos from multi-sensored data with personalized viewpoints," WIAMIS, 2009

  • Video Tracking applications

    Medical application and biological research: cell tracking and mitosis detection

    Ryoma Bise, Zhaozheng Yin, and Takeo Kanade, Reliable Cell Tracking by Global Data Association, ISBI 2011

  • Video Tracking applications

    Robotics, industrial production and unmanned vehicles

    Adept Quattro s650

    https://www.youtube.com/watch?v=8unkXTtfEBQ

  • Video Tracking applications

    Interactive video, product placement

    Clickable video

  • Bayesian Filters for Video Tracking

    Components of a Bayesian Filter:

    State $x_t$

    Measurement $z_t$

    Control input $u_t$

    Motion model $p(x_t \mid x_{t-1}, u_t)$

    Measurement model $p(z_t \mid x_t)$

    State is hidden, only measurement is observed

    In visual tracking: no control input

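  • Bayesian Filter: generic equation, components

    State $x_t$ (hidden state)

    Measurement $z_t$ (observation)

    Motion model $p(x_t \mid x_{t-1})$ (state transition probability, dynamic model)

    Measurement model $p(z_t \mid x_t)$ (appearance model, likelihood)

    $$p(x_t \mid z_{1:t}) = \eta \, \underbrace{p(z_t \mid x_t)}_{\text{measurement model}} \int \underbrace{p(x_t \mid x_{t-1})}_{\text{motion model}} \, p(x_{t-1} \mid z_{1:t-1}) \, dx_{t-1}$$

    where $\eta$ is a normalizing constant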

  • Bayesian Filters: robot example

    State $x_t$: 1D (position on the floor)

    Measurement $z_t$: "I see a door" / "I see a wall" (binary)

    Motion model $p(x_t \mid x_{t-1})$: previous position plus a known displacement (linear 1D + zero noise)

    Appearance model (measurement model) $p(z_t \mid x_t)$: certain (defined by the map), zero noise
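
A minimal Python sketch of the robot example above. It assumes a hypothetical corridor map of ten cells, a cyclic corridor, and a known displacement of one cell per step; none of these values come from the slides. With zero noise in both models, the Bayes filter reduces to shifting the belief and discarding positions that contradict the observation.

```python
import numpy as np

def predict(belief, shift):
    """Motion step: deterministic shift by `shift` cells (zero noise).
    Assumption: the corridor is cyclic, so np.roll wraps around."""
    return np.roll(belief, shift)

def update(belief, world_map, saw_door):
    """Measurement step: the map is certain, so the likelihood is 1
    where the map agrees with the observation and 0 elsewhere."""
    likelihood = (world_map == saw_door).astype(float)
    posterior = likelihood * belief
    total = posterior.sum()
    if total == 0.0:
        raise ValueError("observation contradicts the map under the current belief")
    return posterior / total

# Hypothetical map: True = door, False = wall (doors at cells 0, 3 and 7).
world_map = np.array([True, False, False, True, False,
                      False, False, True, False, False])

# The initial position is unknown: uniform belief.
belief = np.full(world_map.size, 1.0 / world_map.size)

# The robot sees a door, then moves one cell per step and sees walls.
observations = [True, False, False, False]
belief = update(belief, world_map, observations[0])
for z in observations[1:]:
    belief = predict(belief, shift=1)
    belief = update(belief, world_map, z)

estimate = int(np.argmax(belief))
print(estimate, belief[estimate])
```

After the first observation the belief is split equally between the three doors; the later wall observations rule out the hypotheses that would currently stand in front of a door.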

  • Bayesian Filters for Video Tracking: state

    State $x_t$:

    Point in image (2D)

    Bounding box

    Ellipse

    Shape

    Articulated body

  • Bayesian Filters for Video Tracking: state

    State $x_t$

  • Exemplary state: bounding box

    $x_t = (x, y, w, h)$

    $(x, y)$: center of the box

    $(w, h)$: size of the box (optional)

  • Exemplary state: 3D bounding box

    $x_t = (x, y, \theta)$

    $(x, y)$: position on the map (road)

    $\theta$: angular rotation (yaw angle)

    (fixed bounding box size)

    [Figure panels: 2D map with position $(x, y)$, video sequence, zoomed view]

  • Video Tracking: usual states

    2D scene interpretation

    Position of centroid (fixed bounding box size): 2D

    Centroid + scale (fixed bounding box aspect ratio): 3D

    Free bounding box: 4D (x, y, w, h)

    Free bounding box + rotation: 5D

    3D scene interpretation

    Position in the real world on a map + orientation: 3D

    3D bounding box: up to 6D

    3D bounding box + rotations: up to 9D

  • Bayesian Filters for Video Tracking: motion model

    Motion model $p(x_t \mid x_{t-1})$

    Depends on how the scene is interpreted:

    In 3D, it is possible to define constraints like

    Constant acceleration

    Constant velocity

    In 2D tracking: the camera projection breaks the linearity, so $p(x_t \mid x_{t-1})$ is very difficult to express and is often extremely simplified:

    Zero velocity + large Gaussian noise (very common)

    Constant velocity + large Gaussian noise

  • Bayesian Filters for Video Tracking: motion model

    Motion model $p(x_t \mid x_{t-1})$

    Zero velocity + large Gaussian noise

    Constant velocity + large Gaussian noise
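
The two simplified 2D motion models can be sketched as sampling functions in Python. This is only an illustration under stated assumptions: the state is a 2D image position, the function names are hypothetical, and the noise level (8 pixels) and the example positions are arbitrary values chosen here, not taken from the slides.

```python
import numpy as np

def zero_velocity(states, sigma, rng):
    """Zero velocity + Gaussian noise: the object is expected to stay
    where it was, and the noise absorbs the unmodelled motion."""
    return states + rng.normal(0.0, sigma, size=states.shape)

def constant_velocity(states, prev_states, sigma, rng):
    """Constant velocity + Gaussian noise: the last displacement
    is repeated, then perturbed by noise."""
    velocity = states - prev_states
    return states + velocity + rng.normal(0.0, sigma, size=states.shape)

rng = np.random.default_rng(0)
n = 2000  # number of samples drawn from p(x_t | x_{t-1})

# Hypothetical positions (in pixels) at times t-2 and t-1.
prev_states = np.tile([100.0, 50.0], (n, 1))
states = np.tile([104.0, 52.0], (n, 1))

pred_zv = zero_velocity(states, sigma=8.0, rng=rng)
pred_cv = constant_velocity(states, prev_states, sigma=8.0, rng=rng)

# Zero velocity predicts around the last position,
# constant velocity predicts one displacement further.
print(pred_zv.mean(axis=0), pred_cv.mean(axis=0))
```

The large noise is what keeps such crude models usable: the prediction only has to cover the true position, and the measurement step then selects within that region.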

  • Bayesian Filters for Video Tracking: measurement

    Measurement $z_t$: the entire image at time $t$

    Full HD resolution, 3 color channels: 1920 × 1080 × 3 ≈ 6.2 million values

    One measurement is a point in a 6.2-million-dimensional space

    Compared with a range-finder or a radar: intractable

    Fortunately, we do not need to model the measurement itself, only the measurement model $p(z_t \mid x_t)$

  • Video Tracking: measurement model

    Measurement model $p(z_t \mid x_t)$ (appearance model)

    Interpretation of the measurement model: suppose the state is $x_t$; what is the probability that the entire image looks like the one I have?

    Without a statistical description of the background, we can reduce this probability to a local neighborhood of the object position defined by $x_t$.

    Often interpreted as: how similar are the image part covered by state $x_t$ and the object I expect to see?

    $p(z_t \mid x_t) = f(z_t, x_t, \text{object\_model})$: this explains the term appearance model
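
One possible way to turn "how similar is the image part covered by the state to the expected object" into a number is sketched below in Python. The slides do not specify a similarity measure; the hue/saturation histogram, the Bhattacharyya coefficient, the exponential mapping and the parameter `lam` are all assumptions made for this illustration, and the image is synthetic.

```python
import numpy as np

def hs_histogram(patch_hs, bins=8):
    """Normalized 2D histogram over hue and saturation (both in [0, 1])."""
    hue = patch_hs[..., 0].ravel()
    sat = patch_hs[..., 1].ravel()
    hist, _, _ = np.histogram2d(hue, sat, bins=bins,
                                range=[[0.0, 1.0], [0.0, 1.0]])
    return hist / hist.sum()

def crop(image_hs, state):
    """Image part covered by a bounding-box state (cx, cy, w, h)."""
    cx, cy, w, h = state
    x0 = int(round(cx - w / 2))
    y0 = int(round(cy - h / 2))
    return image_hs[y0:y0 + int(h), x0:x0 + int(w)]

def likelihood(image_hs, state, ref_hist, lam=20.0):
    """Assumed appearance model: 1 for identical histograms,
    close to 0 for histograms that do not overlap."""
    hist = hs_histogram(crop(image_hs, state))
    bc = np.sum(np.sqrt(hist * ref_hist))  # Bhattacharyya coefficient
    return float(np.exp(-lam * (1.0 - bc)))

# Synthetic hue/saturation image: uniform background plus a 10x10 object.
image_hs = np.zeros((60, 80, 2))
image_hs[..., 0] = 0.6  # background hue
image_hs[..., 1] = 0.5  # background saturation
image_hs[20:30, 40:50, 0] = 0.9  # object hue
image_hs[20:30, 40:50, 1] = 0.4  # object saturation

# Object model: reference histogram taken from the known first-frame state.
ref_hist = hs_histogram(crop(image_hs, (45, 25, 10, 10)))

p_on = likelihood(image_hs, (45, 25, 10, 10), ref_hist)   # box on the object
p_off = likelihood(image_hs, (15, 25, 10, 10), ref_hist)  # box on background
print(p_on, p_off)
```

Only the pixels inside the box enter the computation, which is the reduction to a local neighborhood described above.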

  • Simple Example: PacMan tracking

    Track the pink ghost: Frame 0

    Object model: reference template

    Simple object model based on

    Expected color (hue and saturation)

    Approximate shape

  • Simple Example: PacMan tracking

    Track the pink ghost: Frame 1

    Object model: reference template

    Motion prediction: centered on last position

    Measurement step: based on object model

    New belief

  • Simple Example: PacMan tracking

    Track the pink ghost: Frame 2

    Object model: reference template

    Motion prediction: centered on last position

    Measurement step: based on object model

    New belief: ?

  • PacMan tracking failure example:

    For a human it is still easy to follow: sudden appearance changes are more likely than teleportation (at least in that game)

    In the tracking algorithm, this was not modelled

    Computed belief is correct