Chapter 7 “Tracking”

• Fundamentals

• Object representation

• Object detection

• Object tracking

A. Yilmaz, O. Javed, and M. Shah, Object tracking: A survey, ACM Computing Surveys, Vol. 38, No. 4, 1-45, 2006

Post on 15-Jan-2016




Fundamentals (1)


Fundamentals (2)

Applications of object tracking:

motion-based recognition: human identification based on gait, automatic object detection, etc.

automated surveillance: monitoring a scene to detect suspicious activities or unlikely events

video indexing: automatic annotation and retrieval of the videos in multimedia databases

human-computer interaction: gesture recognition, eye gaze tracking for data input to computers, etc.

traffic monitoring: real-time gathering of traffic statistics to direct traffic flow

vehicle navigation: video-based path planning and obstacle avoidance capabilities


Fundamentals (3)

Tracking task:

In the simplest form, tracking can be defined as the problem of estimating the trajectory of an object in the image plane as it moves around a scene. In other words, a tracker assigns consistent labels to the tracked objects in different frames of a video. Additionally, depending on the tracking domain, a tracker can also provide object-centric information, such as orientation, area, or shape of an object.

Two subtasks:

• Build some model of what you want to track

• Use what you know about where the object was in the previous frame(s) to make predictions about the current frame and restrict the search

Repeat the two subtasks, possibly updating the model


Fundamentals (4)

Tracking objects can be complex due to:

• loss of information caused by projection of 3D world on 2D image

• noise in images

• complex object shapes / motion

• nonrigid or articulated nature of objects

• partial and full object occlusions

• scene illumination changes

• real-time processing requirements

Simplify tracking by imposing constraints:

• Almost all tracking algorithms assume that the object motion is smooth with no abrupt changes

• The object motion is assumed to be of constant velocity

• Prior knowledge about the number and the size of objects, or the object appearance and shape
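
The constant-velocity assumption can be turned into a position prediction that restricts the search in the next frame. The following is a minimal sketch under that assumption; the function names `predict_constant_velocity` and `search_window` and the window radius are illustrative and do not come from the slides.

```python
def predict_constant_velocity(prev_pos, curr_pos):
    """Predict the next (row, col) position, assuming constant velocity."""
    vr = curr_pos[0] - prev_pos[0]   # row velocity in pixels per frame
    vc = curr_pos[1] - prev_pos[1]   # column velocity in pixels per frame
    return (curr_pos[0] + vr, curr_pos[1] + vc)


def search_window(center, radius, height, width):
    """Return (r_min, r_max, c_min, c_max), inclusive, clipped to the image."""
    r, c = center
    return (max(0, r - radius), min(height - 1, r + radius),
            max(0, c - radius), min(width - 1, c + radius))
```

An object seen at (10, 20) and then at (12, 25) would be predicted at (14, 30), and only a small window around that prediction needs to be searched.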


Object Representation (1)

Object representation = Shape + Appearance

Shape representations:

Points. The object is represented by a single point (the centroid) or by a set of points; suitable for tracking objects that occupy small regions in an image.

Primitive geometric shapes. Object shape is represented by a rectangle, ellipse, etc. Object motion for such representations is usually modeled by translation, affine, or projective transformation. Though primitive geometric shapes are more suitable for representing simple rigid objects, they are also used for tracking nonrigid objects.


Object Representation (2)

Object silhouette and contour. Contour = boundary of an object. Region inside the contour = silhouette. Silhouette and contour representations are suitable for tracking complex nonrigid shapes.

Articulated shape models. Articulated objects are composed of body parts (modeled by cylinders or ellipses) that are held together by joints. Example: the human body is an articulated object with torso, legs, hands, head, and feet connected by joints. The relationships between the parts are governed by kinematic motion models, e.g., joint angles.

Skeletal models. The object skeleton can be extracted by applying the medial axis transform to the object silhouette. The skeleton representation can be used to model both articulated and rigid objects.


Object Representation (3)

Object representations. (a) Centroid, (b) multiple points, (c) rectangular patch, (d) elliptical patch, (e) part-based multiple patches, (f) object skeleton, (g) control points on object contour, (h) complete object contour, (i) object silhouette


Object Representation (4)

Appearance representations:

Templates. Formed using simple geometric shapes or silhouettes. Suitable for tracking objects whose poses do not vary considerably during the course of tracking. Self-adaptation of templates during tracking is possible.

http://www.cs.toronto.edu/vis/projects/dudekfaceSequence.html


Object Representation (5)

Probability densities of object appearance can be either parametric (Gaussian and mixture of Gaussians) or nonparametric (histograms).

Characterize an image region by its statistics. If the statistics differ from the background, they should enable tracking.

• nonparametric: histogram (grayscale or color)
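
A nonparametric appearance model of this kind can be sketched as a normalized grayscale histogram. The code below assumes a non-empty region stored as a list of pixel rows with integer values in [0, 256); the function name and the default of 8 bins are illustrative choices, not taken from the slides.

```python
def normalized_histogram(region, bins=8, max_value=256):
    """Normalized grayscale histogram of a region given as a list of pixel rows."""
    counts = [0] * bins
    total = 0
    for row in region:
        for value in row:
            counts[value * bins // max_value] += 1   # map the gray value to its bin
            total += 1
    return [c / total for c in counts]               # entries sum to 1
```

Because the histogram is normalized, regions of different sizes can be compared directly.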


Object Representation (6)

• parametric: 1D Gaussian distribution
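
For reference, the 1D Gaussian density in its standard form. Here x stands for a scalar feature such as a gray value, and μ and σ² would typically be estimated from the pixels of the object region; the symbol names follow the usual convention and are not taken from the slides.

```latex
p(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\,
       \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
```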


Object Representation (7)

• parametric: n-D Gaussian distribution

Example: a 2D Gaussian centered at (1, 3) with a standard deviation of 3 in roughly the (0.878, 0.478) direction and of 1 in the orthogonal direction.
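
One way to make this example concrete is to build the covariance matrix from the two axis directions and their standard deviations, then evaluate the density. This is a minimal sketch restricted to the 2D case; the function names are illustrative, and the direction vector is normalized first because (0.878, 0.478) is only approximately of unit length.

```python
import math


def covariance_from_axes(direction, sigma_major, sigma_minor):
    """2x2 covariance with std sigma_major along `direction`, sigma_minor orthogonal to it."""
    norm = math.hypot(direction[0], direction[1])
    ux, uy = direction[0] / norm, direction[1] / norm   # unit vector along the major axis
    vx, vy = -uy, ux                                     # orthogonal unit vector
    a, b = sigma_major ** 2, sigma_minor ** 2            # variances along the two axes
    off_diag = a * ux * uy + b * vx * vy
    return [[a * ux * ux + b * vx * vx, off_diag],
            [off_diag, a * uy * uy + b * vy * vy]]


def gaussian_density(x, mean, cov):
    """Evaluate the 2D Gaussian density N(x | mean, cov)."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # squared Mahalanobis distance via the closed-form inverse of a 2x2 matrix
    m = (cov[1][1] * dx * dx - 2 * cov[0][1] * dx * dy + cov[0][0] * dy * dy) / det
    return math.exp(-0.5 * m) / (2 * math.pi * math.sqrt(det))


cov = covariance_from_axes((0.878, 0.478), 3.0, 1.0)
peak = gaussian_density((1, 3), (1, 3), cov)   # density at the center
```

The trace of the resulting matrix is 3² + 1² = 10 and its determinant is 3² · 1² = 9, which is a quick sanity check on the construction.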


Object Representation (8)

• parametric: Gaussian Mixture Models (GMM)
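
In its standard form, a Gaussian mixture model is a weighted sum of K Gaussian components. The symbols π_k (mixing weights), μ_k, Σ_k, and K follow the usual textbook convention and are assumptions here, not notation taken from the slides.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \,\mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
```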


Object Representation (9)

Example: Mixture of three Gaussians in 2D space. (a) Contours of constant density for each mixture component. (b) Contours of constant density of mixture distribution p(x). (c) Surface plot of p(x).


Object Representation (10)

Object representations are chosen according to the application

Point representations are appropriate for tracking objects that appear very small in an image (e.g., tracking distant birds)

For objects whose shapes can be approximated by rectangles or ellipses, primitive geometric shape representations are more appropriate (e.g., faces)

For tracking objects with complex shapes, for example, humans, a contour or a silhouette-based representation is appropriate (e.g., surveillance applications)


Object Representation (11)

Feature selection for tracking:

Color: RGB, L*u*v*, L*a*b*, HSV, etc. There is no last word on which color space is more effective; a variety of color spaces have been used

Edges: less sensitive to illumination changes compared to color features. Algorithms that track the object boundary usually use edges as features. Because of its simplicity and accuracy, the most popular edge detection approach is the Canny edge detector

Texture: measure of the intensity variation of a surface which quantifies properties such as smoothness and regularity

In general, the most desirable property of a visual feature is its uniqueness so that the objects can be easily distinguished in the feature space


Object Detection (1)

Object detection mechanism: required by every tracking method either at the beginning or when an object first appears in the video

Point detectors: find interest points in images which have an expressive texture in their respective localities

Segmentation: partition the image into perceptually similar regions


Object Detection (2)

Background subtraction:

Object detection can be achieved by building a representation of the scene called the background model and then finding deviations from the model for each incoming frame. Any significant change in an image region from the background model signifies a moving object. The pixels constituting the regions undergoing change are marked for further processing. Usually, a connected component algorithm is applied to obtain connected regions corresponding to the objects.
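
The two steps described above (marking deviating pixels, then grouping them into connected regions) can be sketched as follows. This assumes grayscale images stored as lists of rows, a fixed global threshold, and 4-connectivity; the function names and these choices are illustrative, not prescribed by the slides.

```python
from collections import deque


def foreground_mask(frame, background, threshold):
    """Mark pixels whose absolute deviation from the background model exceeds the threshold."""
    return [[abs(f - b) > threshold for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def connected_components(mask):
    """Group 4-connected foreground pixels; returns one list of (row, col) per region."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # breadth-first flood fill starting from an unvisited foreground pixel
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions
```

Each returned region is a candidate object; in practice very small regions are often discarded as noise before tracking.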


Object Detection (3)

Frame differencing of temporally adjacent frames:
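
Frame differencing is commonly implemented by thresholding the absolute difference of two temporally adjacent frames. A minimal sketch, assuming grayscale frames stored as lists of rows; the function name and the threshold parameter are illustrative and not taken from the slides.

```python
def frame_difference(prev_frame, curr_frame, threshold):
    """Binary change mask: 1 where |f_t - f_(t-1)| > threshold, else 0."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
            for prow, crow in zip(prev_frame, curr_frame)]
```

For a uniformly bright object that moves two pixels to the right, `frame_difference([[0, 9, 9, 9, 0, 0]], [[0, 0, 0, 9, 9, 9]], 4)` returns `[[0, 1, 1, 0, 1, 1]]`: the object is marked at both its old and its new position, while the overlapping part in the middle is missed.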


Object Detection (4)

Image sequence: ≈ 5 frames/s


Object Detection (5)

Image subtraction: variant 1

Weakness: double image of a vehicle (from the previous and the current frame); a region of constant intensity is split apart


Object Detection (6)

Image subtraction: variant 2

Reference image f_r(r, c): average over a long sequence of images
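
The reference image can be sketched as a pixel-wise mean over the sequence. This assumes equally sized grayscale frames stored as lists of rows; the function name is illustrative.

```python
def mean_reference_image(frames):
    """Pixel-wise average over a non-empty sequence of equally sized frames."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(width)]
            for r in range(height)]
```

Each new frame is then compared against this reference instead of the previous frame, so a moving object no longer leaves a second image at its old position.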


Object Detection (7)


Object Detection (8)

Statistical modeling of background:

Model gradual changes over time with a Gaussian per pixel, I(x, y) ∼ N(μ(x, y), Σ(x, y)), learned from the color observations in several consecutive frames. Once the background model is derived, for every pixel (x, y) in the input frame, the likelihood of its color coming from N(μ(x, y), Σ(x, y)) is computed. Pixels that deviate from the background model are labeled as foreground.
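
A simplified sketch of this idea follows. It assumes grayscale frames, so each pixel has a scalar mean and variance instead of a color mean μ and covariance Σ, and it thresholds the normalized distance from the mean as a stand-in for thresholding the likelihood. The function names, the variance floor `min_variance`, and the factor `k` are illustrative assumptions.

```python
def fit_pixel_gaussians(frames, min_variance=1.0):
    """Per-pixel mean and variance over a list of grayscale frames."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    mean = [[sum(f[r][c] for f in frames) / n for c in range(width)]
            for r in range(height)]
    # floor the variance so that perfectly static pixels do not flag every tiny change
    var = [[max(min_variance, sum((f[r][c] - mean[r][c]) ** 2 for f in frames) / n)
            for c in range(width)] for r in range(height)]
    return mean, var


def foreground_by_deviation(frame, mean, var, k=3.0):
    """Flag pixels lying more than k standard deviations from the background mean."""
    return [[(frame[r][c] - mean[r][c]) ** 2 > k * k * var[r][c]
             for c in range(len(frame[0]))] for r in range(len(frame))]
```

A pixel that normally fluctuates strongly tolerates larger deviations than a static one, which is the advantage over a single global threshold.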


Object Tracking (1)

(a) Point Tracking. Objects detected in consecutive frames are represented by points, and a point matching is done. This approach requires an external mechanism to detect the objects in every frame.

(b) Kernel Tracking. Kernel = object shape and appearance. E.g. kernel = a rectangular template or an elliptical shape with an associated histogram. Objects are tracked by computing the motion (parametric transformation such as translation, rotation, and affine) of the kernel in consecutive frames.

(c)+(d) Silhouette Tracking. Such methods use the information encoded inside the object region (appearance density and shape models). Given the object models, silhouettes are tracked by either shape matching (c) or contour evolution (d). The latter one can be considered as object segmentation applied in the temporal domain using the priors generated from the previous frames.


Object Tracking (2)

Template Matching: brute force method for tracking single objects

Define a search area

Place the template defined from the previous frame at each position of the search area and compute a similarity measure between the template and the candidate

Select the best candidate with the maximal similarity measure

The similarity measure can be a direct template comparison or statistical measures between two probability densities

Limitation of template matching: high computation cost due to the brute-force search. Remedy: limit the object search to the vicinity of its previous position, or use position prediction.
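
The steps above can be sketched as an exhaustive search restricted to a window around the previous position. The sketch assumes grayscale images stored as lists of rows and uses the sum of squared differences (SSD) as one possible direct comparison, where the lowest SSD corresponds to the highest similarity; the function names and the `radius` parameter are illustrative.

```python
def ssd(image, template, top, left):
    """Sum of squared differences between the template and the image patch at (top, left)."""
    return sum((image[top + i][left + j] - template[i][j]) ** 2
               for i in range(len(template)) for j in range(len(template[0])))


def match_template(image, template, prev_pos, radius):
    """Search all positions within `radius` of prev_pos; return the best (top, left)."""
    th, tw = len(template), len(template[0])
    max_top, max_left = len(image) - th, len(image[0]) - tw
    best_pos, best_cost = None, float("inf")
    for top in range(max(0, prev_pos[0] - radius), min(max_top, prev_pos[0] + radius) + 1):
        for left in range(max(0, prev_pos[1] - radius), min(max_left, prev_pos[1] + radius) + 1):
            cost = ssd(image, template, top, left)
            if cost < best_cost:          # lowest SSD = best match
                best_pos, best_cost = (top, left), cost
    return best_pos
```

The cost grows with the square of `radius`, which is why a good position prediction, and hence a small window, matters.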


Object Tracking (3)

Direct comparison: between template t(i,j) and candidate g(i,j)

Bhattacharyya coefficient between two distributions:
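
For two discrete distributions p and q, for example normalized histograms of the template and the candidate, the Bhattacharyya coefficient is commonly written as ρ(p, q) = Σ_u √(p_u · q_u). A minimal sketch, with the function name an illustrative assumption:

```python
import math


def bhattacharyya_coefficient(p, q):
    """Bhattacharyya coefficient of two discrete distributions with the same bins."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))
```

The coefficient lies between 0 (no overlapping bins) and 1 (identical distributions), so the candidate with the largest value is the best match.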


Object Tracking (4)

Example: Eye tracking (direct grayvalue comparison)


Object Tracking (5)

http://robotics.stanford.edu/~birch/headtracker/

Example: Elliptical head tracking using intensity gradients and color histograms


Object Tracking (6)

Mean-shift tracking (instead of brute force search). (a) estimated object location at time t − 1, (b) frame at time t with initial location estimate using the previous object position, (c), (d), (e) location update using mean-shift iterations, (f) final object position at time t.

D. Comaniciu, V. Ramesh, and P. Meer, Kernel-based object tracking. IEEE Trans. Patt. Analy. Mach. Intell. 25, 564–575, 2003
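
The location update can be illustrated with a simplified sketch: given a per-pixel weight map (assumed here to be precomputed, for example from how well each pixel's color matches the target histogram), the window is moved repeatedly to the weighted centroid of the weights it covers. This is not the full method of the cited paper; the square window, the rounding to integer pixel positions, and all names are simplifying assumptions.

```python
def mean_shift(weights, start, radius, max_iter=20):
    """Shift a square window to the weighted centroid of `weights` until it stops moving."""
    height, width = len(weights), len(weights[0])
    r, c = start
    for _ in range(max_iter):
        total = sum_r = sum_c = 0.0
        for y in range(max(0, r - radius), min(height, r + radius + 1)):
            for x in range(max(0, c - radius), min(width, c + radius + 1)):
                w = weights[y][x]
                total += w
                sum_r += w * y
                sum_c += w * x
        if total == 0:
            break                                  # no support inside the window
        new_r, new_c = round(sum_r / total), round(sum_c / total)
        if (new_r, new_c) == (r, c):
            break                                  # converged
        r, c = new_r, new_c
    return (r, c)
```

Unlike the brute-force search, only a handful of window positions are evaluated per frame, starting from the object position in the previous frame.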