
Source: vision.ucsd.edu/~kbranson/kbranson_cvpr04/kbranson_cvpr04_v2.pdf

Tracking Multiple Mouse Contours (without Too Many Samples)

Paper 1334

Abstract

We present a particle filtering algorithm for robustly tracking the contours of multiple deformable moving objects through severe occlusions. Our algorithm combines a multiple blob tracker with a contour tracker in a manner that keeps the required number of samples small. This is a natural combination because both algorithms have complementary strengths. The blob tracker is an effective solution to the model-data association problem and searches a smaller and simpler space. On the other hand, contour tracking gives more fine-tuned results and relies on cues that are available during severe occlusions. Our combination of these two algorithms accentuates the advantages of each. We formulate each importance sampling step of particle filtering so that we first search the blob space jointly over all objects. While the size of this space still increases exponentially with the number of objects, the blob posterior distribution can still be represented with a reasonably small number of samples. Then, we search the more complicated but detailed contour space independently for each object given the results of our blob space search. We demonstrate encouraging results on a challenging video sequence of three identical mice that contains multiple instances of severe occlusion.

1. Introduction

In this paper, we address the problem of tracking the contours of multiple identical mice from video of the side of their cage; see Figure 3(b) for example video frames. Although existing tracking algorithms might work well from an overhead view of the cage, the majority of vivaria are set up in a way that prohibits this view. A solution to the side view tracking problem would be extremely useful for medical researchers wishing to automatically monitor the health and behavior of lab animals. The contour tracking problem is also interesting from a computer vision standpoint. Tracking mouse contours is difficult because they are extremely deformable 3D objects with unconstrained motion. Thus, any mouse contour model general enough to represent most possible deformations is necessarily complex. The biggest challenge to tracking mice from a side view is that the mice occlude one another severely and often. Tracking the mice independently would inevitably result in two trackers following the same mouse. Thus, the data association problem must be handled explicitly by a multitarget tracking algorithm. These algorithms involve searching a space whose size increases exponentially with the number of objects. Directly searching the contour parameterization space for all mice at once is prohibitively expensive.

In addition, tracking individual mouse identities is difficult because the mice are indistinguishable. We cannot rely on object-specific identity models and must instead accurately track the mice during occlusions. This is challenging because mice have few if any trackable features, their behavior is erratic, and edges (particularly between two mice) are hard to detect. Other features of the mouse tracking problem that make it difficult are clutter (the cage bedding, scratches on the cage, and the mice's tails), inconsistent lighting throughout the cage, and moving reflections and shadows cast by the mice.

Our algorithm is of general interest to the tracking community because the challenges to successful mouse tracking are common to many real world tracking applications. While many video sequence testbeds are constructed to show off the novelty of an algorithm, our algorithm is constructed to address the challenges of a specific tracking problem. Thus, our feature extraction algorithm must be powerful, our objects' state representation must be detailed, and our algorithm must be able to search the complex parameter space with a limited number of samples.

We propose a combination of existing blob and contour tracking algorithms, each of which addresses a subset of the challenges above. Our combination accentuates the strengths of each algorithm to address all the challenges specific to mouse tracking. In addition, we capitalize on the independence assumptions made by our model, so much of our search can be done independently for each mouse. This reduces the size and complexity of the search space, and allows our Monte Carlo sampling algorithm to robustly search the complex state parameter space with a reasonable number of samples. Our algorithm works with a detailed representation of the mouse contour to achieve positive results on a video sequence of three mice exploring a cage.

The paper is organized as follows. In Section 2, we describe the algorithms we build off of: the Bayesian Multiple Blob (BraMBLe) tracker [?] and MacCormick et al.'s contour likelihood model [?]. In Section 3, we describe the model assumed for the blob and contour tracking problem. In Section 4, we describe our particle filtering algorithm for optimizing this criterion. In Section 5, we present specific details of our algorithm, and the results for a challenging video sequence. Finally, in Section 6, we discuss the virtues and vices of our algorithm, and ideas for future work to better solve the mouse tracking problem.

2. Related Work

Our algorithm builds off of the BraMBLe tracker [?] and MacCormick et al.'s contour tracking framework [?]. Both of these approaches are based on particle filtering. In this section, we first introduce the standard particle filtering algorithm with the purpose of introducing the notation we will use throughout the paper; see [?] for a complete treatment of particle filtering. Then, we describe the blob and contour tracking algorithms briefly, and point out their strengths and weaknesses for our application.

2.1. Particle Filtering

Particle filtering is a Monte Carlo sampling algorithm for estimating properties of hidden variables given observations in a Markov chain. For tracking from video, x_t, the state at time t, represents the positions of the objects in frame t, and y_t, the observation at time t, is a function of frame t of the video. Particle filtering assumes that we can directly sample from p(x_t | x_{t-1}), the probability of transitioning to state x_t from state x_{t-1}. It also assumes that we can easily evaluate the observation likelihood, p(y_t | x_t), the likelihood of observing y_t in state x_t.

Particle filtering sequentially constructs particle set representations of p(x_t | y_{1:t}), the posterior probability of observing state x_t given the sequence of observations y_{1:t} = (y_1, ..., y_t), using the following recursive factorization:

  p(x_t | y_{1:t}) ∝ p(y_t | x_t) p(x_t | y_{1:t-1})
                   = p(y_t | x_t) ∫_X p(x_t | x_{t-1}) dp(x_{t-1} | y_{1:t-1}).

Let {(x_t^(i), w_t^(i))}_{i=1}^N be the set of N particles representing the filtering distribution at frame t. Particle filtering uses importance sampling to sequentially create the current particle set, {(x_t^(i), w_t^(i))}_{i=1}^N, from the previous set, {(x_{t-1}^(i), w_{t-1}^(i))}_{i=1}^N. In its most standard form, the importance function and importance weights are:

  x_t^(j) ~ p(x_t | y_{1:t-1}) ≈ Σ_{i=1}^N w_{t-1}^(i) p(x_t | x_{t-1}^(i))

  w_t^(j) ∝ p(y_t | x_t^(j)),

where the weights are normalized to sum to one. In order to keep the variance of the weights small, the samples can be resampled according to their weights.
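This standard (bootstrap) importance sampling step can be sketched in code as follows. The transition and likelihood callables are placeholders for whatever models an application supplies; nothing here is specific to the paper's mouse models.

```python
import numpy as np

def particle_filter_step(particles, weights, observation,
                         sample_transition, likelihood, rng):
    """One importance-sampling step of a standard particle filter.

    particles: (N, d) array of samples x_{t-1}^(i)
    weights:   (N,) normalized weights w_{t-1}^(i)
    sample_transition(x, rng) -> one sample from p(x_t | x_{t-1} = x)
    likelihood(y, x)          -> p(y_t | x_t = x)
    """
    n = len(particles)
    # Resample according to the previous weights (keeps weight variance small).
    idx = rng.choice(n, size=n, p=weights)
    # Propagating each survivor through the transition model draws from the
    # mixture sum_i w_{t-1}^(i) p(x_t | x_{t-1}^(i)).
    new_particles = np.array([sample_transition(particles[i], rng) for i in idx])
    # Reweight by the observation likelihood and renormalize to sum to one.
    new_weights = np.array([likelihood(observation, x) for x in new_particles])
    new_weights /= new_weights.sum()
    return new_particles, new_weights
```

A one-dimensional random-walk model with a Gaussian likelihood is enough to exercise the step end to end.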

2.2. BraMBLe Likelihood

The Bayesian Multiple Blob (BraMBLe) tracker [?] provides a solution to tracking multiple occluding blobs/objects from video. The novelty of BraMBLe is its observation likelihood model, which provides a natural and robust solution to the data association problem. Given a hypothesized set of blob positions x, BraMBLe first computes the label l_g(x) of each image location g on a grid as either foreground (within some blob) or background (outside of all blobs). The likelihood of an observed pixel value y_g for each label is modeled as a Gaussian mixture model (a separate background GMM is learned for each grid location; a common GMM is learned for the foreground). The pixel values are assumed to be independent given the state, so the likelihood of the entire frame is assumed to be the product of the individual pixel likelihoods:

  p(y | x) ∝ Π_{g=1}^G p_g(y_g | l_g(x)),

where p_g(y_g | fore) and p_g(y_g | back) are the GMMs.

BraMBLe does suffer from the curse of dimensionality; the space searched by BraMBLe grows exponentially with the number of objects tracked. However, the number of parameters required to describe a blob is minimal. In addition, the likelihood function is smooth in comparison to likelihood functions used for contour tracking (e.g., [?]). This makes the search simpler and more robust.
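Evaluating this likelihood reduces to labeling grid points with the hypothesized ellipses and summing precomputed per-label log-likelihoods. The sketch below assumes the GMM log-likelihoods are already evaluated per grid point; the function and argument names are ours, not BraMBLe's.

```python
import numpy as np

def inside_ellipse(points, mu, a, b, theta):
    """Label grid points as inside the ellipse with center mu, semimajor
    axis a, semiminor axis b, and rotation theta."""
    d = points - mu
    c, s = np.cos(theta), np.sin(theta)
    u = c * d[:, 0] + s * d[:, 1]    # coordinates in the ellipse frame
    v = -s * d[:, 0] + c * d[:, 1]
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0

def blob_log_likelihood(fg_loglik, bg_loglik, inside_blob):
    """BraMBLe-style frame log-likelihood (sketch).

    fg_loglik, bg_loglik: (G,) per-grid-point log p_g(y_g | fore/back),
        precomputed from the foreground / per-location background GMMs.
    inside_blob: (G,) boolean labels l_g(x); True if grid point g falls
        inside some hypothesized blob.
    Pixels are assumed independent given the state, so the frame
    log-likelihood is the sum of the per-pixel terms.
    """
    return np.where(inside_blob, fg_loglik, bg_loglik).sum()
```

For multiple blobs, `inside_blob` would be the elementwise OR of `inside_ellipse` over all hypothesized ellipses.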

Empirically, we found that BraMBLe performs well on the mouse tracking application in locating the approximate positions of the mice (up to permutations of the identity labels). The main failure is that our blob representation of the state of a mouse (an ellipse) was not detailed enough to give precise object positions. This approximate model makes the likelihood higher for some reasonable fits than others, based on the relationship between the model and the object's shape (in the common case that the 2D mouse shape is not precisely elliptical). Because of the invalid independence assumptions made in the likelihood model, some reasonable fits will be evaluated as orders of magnitude better than others. This is particularly evident during occlusions, when the set of reasonable blob fits is large. BraMBLe will choose one fit as best early in the occlusion, and give this fit a disproportionately large weight. The particle sets produced are thus extremely sparse, and BraMBLe cannot later recover from its mistake.

2.3. MacCormick et al.'s Contour Likelihood

We combine the blob likelihood from BraMBLe with the “generic contour likelihood” described in [?]. In contour tracking, the state is represented by parameters which define a B-spline. To determine the likelihood of a video frame given the hypothesized state, edges in the video frame are detected along a sparse set of short measurement lines normal to and centered on the B-spline. The intersection between the hypothesized contour state and the measurement line is the center of the measurement line.

Let y be the binary vector indicating where edges are detected on a measurement line, n be the number of edges detected, and L be the measurement line length. If n = 0, then the measurement line likelihood is p(y | x) = p_01. If n ≥ 1, with probability 1 - p_01, one of these edge detections was produced by the hypothesized intersection, and the rest were produced by background clutter. With probability p_01, they were all produced by background clutter. The location of the edge produced by the hypothesized intersection is Gaussian around the line center with variance σ². Clutter edge detections are uniformly distributed along the line. The number n′ of clutter detections is a Poisson random variable, b(n′) = e^{-λL}(λL)^{n′}/n′!. Thus, for n ≥ 1,

  p(y, n | x) = (1 - p_01) (b(n-1) / L^{n-1}) Σ_i y_i N(i; L/2, σ²) + p_01 b(n) / L^n.

The measurement line observations are assumed to be independent given the state, so the total likelihood is the product of the individual likelihoods.
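The per-line likelihood above is simple enough to write out directly. The following sketch follows the formula as stated (positions are discretized to the integer indices of y); parameter values are illustrative, not the paper's settings.

```python
import math

def gauss(x, mu, sigma):
    """Gaussian density N(x; mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def measurement_line_likelihood(y, p01, lam, sigma):
    """p(y, n | x) for one measurement line, per the generic contour likelihood.

    y: binary sequence; y[i] = 1 iff an edge is detected at position i.
    The hypothesized contour crosses the line at its center, L/2.
    """
    L = len(y) - 1                     # line length; positions run 0..L
    n = sum(y)
    b = lambda k: math.exp(-lam * L) * (lam * L) ** k / math.factorial(k)
    if n == 0:
        return p01                     # no detections: all-clutter hypothesis
    # One detection came from the true contour crossing (Gaussian about L/2),
    # the other n-1 are uniform clutter; or all n are clutter.
    true_edge = sum(yi * gauss(i, L / 2, sigma) for i, yi in enumerate(y))
    return ((1 - p01) * b(n - 1) / L ** (n - 1) * true_edge
            + p01 * b(n) / L ** n)
```

As a sanity check, a detection at the line center should score higher than the same detection at the line's end.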

We found that, even for one mouse, the search performed by contour tracking is much more difficult than that performed by BraMBLe. This is because, for a detailed contour model, the parameter space is larger than for a blob model. In addition, the contour observation likelihood is much less smooth than the blob likelihood: if a pair of contours are not a very good fit to the video frame, then the rankings given to the contours are often not meaningful. However, for contours that are very close to fitting the data, the contour likelihood is usually peaked around a meaningful fit. This is in contrast to BraMBLe, in which the rankings are usually meaningful on a large scale but not meaningful on a small scale. Even for one object, low-level blob tracking is useful for guiding the contour search [?]. Thus, while a generalization of this contour likelihood to multiple targets exists [?, ?], we found that it alone did not work well for our complex multitarget contour model. In addition, there is much more contour signal available during occlusions, both in the silhouette of the occlusion and edges that are detected between pairs of mice. Thus, contour tracking is a useful asset to tracking the mice during occlusions.

3. A Blob and Contour Object Model

We use a particle filtering algorithm to approximate p(x_t | y_{1:t}), the posterior distribution of the state of all k mice in frame t, x_t = x_{t,1:k}, given blob and contour observations for frames 1 through t, y_{1:t} = (y^c_{1:t}, y^b_{1:t}). To perform the blob and contour particle filtering described in Section 4, we must be able to generate samples from and evaluate the transition distributions. These are:

• p(x_{tm} | x_{t-1,m}): the distribution of the contour state at time t for mouse m given the contour state for mouse m at time t-1.

• p(x^b_{tm} | x_{t-1,m}): the distribution of the blob state at time t for mouse m given the contour state for mouse m at time t-1.

• p(x_{tm} | x^b_{tm}, x_{t-1,m}): the distribution of the contour state at time t for mouse m given the blob state for mouse m at time t and the contour state for mouse m at time t-1.

We must also be able to evaluate the observation likelihoods. These are:

• p(y^c_{tm} | x_{tm}): the likelihood of observing contour observations y^c_{tm} for mouse m at time t given the contour state for mouse m at time t.

• p(y^b_t | x_{t,1:k}) = p(y^b_t | x^b_{t,1:k}): the likelihood of observing blob observations y^b_t at time t given the state for all k mice at time t.

Notice that the only model that depends on all k mice is the blob observation likelihood. This is because of the many independence assumptions made about the mouse dynamics and observation likelihood. First, we assume that the mice move independently of one another. Thus, the transition models for all k mice can be factored as the product of the transition models for each mouse. Second, as we discuss in Section 3.3, we assume that the blob and contour observations are independent given the state, so we can factor the observation likelihood as the product of the blob likelihood and the contour likelihood. Third, the assumptions made by the generic contour likelihood allow the contour likelihood to be factored as the product of the likelihood of the relevant measurement lines for each mouse.

In this section, we describe the assumed forms of these distributions. We first describe the representation and parameterization of the blob and contour states in Section 3.1. In Section 3.2, we describe the transition distributions, which make many independence assumptions based on the state parameterization. In Section 3.3, we describe the assumed observation model.

3.1. The Blob-Contour State

We have a combined blob and contour state model. An individual mouse blob is modeled as an ellipse, which we describe by five shape and three velocity parameters. We chose to parameterize the blob by the center x-coordinate, µ_x, the center y-coordinate, µ_y, the semimajor axis length, a, the semiminor axis length, b, the rotation after scaling, θ, the center x-velocity, v_x, the center y-velocity, v_y, and the semimajor axis length velocity, v_a:

  x^b_m = (µ_{xm}, µ_{ym}, a_m, b_m, θ_m, v_x, v_y, v_a)^T.


Figure 1: The 12 B-spline contour templates chosen to represent a mouse contour. The circles are the locations of the measurement lines.

An individual mouse contour is modeled as any affine transformation of any of a discrete set of closed B-spline contour templates. Figure 1 shows the 12 templates we chose. These contours each have 12 knots, but there is no assumed correspondence between these knots. This was necessary because the shape of the 2D mouse image varies so much. Thus, there is no sense of a linear combination of contours, and we cannot use the model described in [?]. The locations of the measurement lines along each contour template are set by hand, as shown in Figure 1. In addition, the minimum and maximum allowed eccentricity, pre-scaling rotation, and post-scaling rotation are set for each contour (no limits on post-scaling rotation are given for mice hanging from the ceiling). Each contour is also labeled as facing left, right, or both left and right.

The contour state was parameterized by the eight parameters describing the ellipse and its velocity, the rotation before scaling, φ, the contour template, c, and whether the template is flipped, f: x_m = (x^b_m, φ_m, c_m, f_m)^T. The state of all k mice is the concatenation of the individual mouse state vectors, x_{1:k} = (x_1^T ... x_k^T)^T.
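For concreteness, the combined per-mouse state can be sketched as a small record type. The field names below are our own shorthand for the paper's notation, not anything the paper defines.

```python
from dataclasses import dataclass

@dataclass
class MouseState:
    """Combined blob + contour state x_m for one mouse (field names are ours)."""
    mu_x: float    # ellipse center x-coordinate
    mu_y: float    # ellipse center y-coordinate
    a: float       # semimajor axis length
    b: float       # semiminor axis length
    theta: float   # rotation after scaling
    v_x: float     # center x-velocity
    v_y: float     # center y-velocity
    v_a: float     # semimajor axis length velocity
    phi: float     # rotation before scaling (contour-only parameter)
    template: int  # contour template index c (one of the 12 templates)
    flipped: bool  # f: whether the template faces the mirrored direction

    def blob(self):
        """The eight-parameter blob sub-state x^b_m."""
        return (self.mu_x, self.mu_y, self.a, self.b, self.theta,
                self.v_x, self.v_y, self.v_a)
```

The joint state x_{1:k} is then simply a list of k such records.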

3.2. Model Dynamics

We now describe the assumed transition distributions. We assume simple Gaussian transition models for continuous parameters. For the discrete contour parameters, we calculate a probability for each contour template based on whether the contour template is the same ((c_{tm}, f_{tm}) = (c_{t-1,m}, f_{t-1,m})) and whether the contour template is facing the same direction (direction(c_{tm}, f_{tm}) = direction(c_{t-1,m}, f_{t-1,m})). All our models are similar.

The contour to blob transition model, p(x^b_{tm} | x_{t-1,m}), is the same as the contour to contour transition model, p(x_{tm} | x_{t-1,m}), for all relevant parameters. We describe how to generate a sample from p(x_{tm} | x_{t-1,m}).

We first generate the continuous shape parameters x^s = (µ_x, µ_y, a, b, θ, φ)^T. These are assumed to follow independent damped constant velocity and/or autoregressive models:

  p(x^s_t | x_{t-1}) = N(x^s_t; (I - Γ)(x^s_{t-1} + Λ v^s_{t-1}) + Γ x̂^s, Σ^s),

where Γ is the diagonal autoregressive constant (which is set to 1 for µ_x, µ_y, and θ), Λ is the diagonal dampening constant (which is set to 0 for b, θ, and φ), and Σ^s is the assumed diagonal covariance matrix. The velocities are then set by subtracting the previous shape state from the current shape state.
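Sampling from this transition model is a single Gaussian draw around a per-parameter prediction. The sketch below treats Γ, Λ, the mean x̂^s, and the noise scales as per-parameter vectors supplied by the caller; for simplicity it keeps a velocity for every shape parameter, whereas the paper tracks only three velocities.

```python
import numpy as np

def sample_shape(xs_prev, vs_prev, gamma, lam, mean, std, rng):
    """Draw x^s_t ~ N((I - Gamma)(x^s_{t-1} + Lambda v^s_{t-1}) + Gamma xhat^s, Sigma^s).

    All arguments are length-6 vectors over (mu_x, mu_y, a, b, theta, phi):
    gamma holds the diagonal of Gamma, lam the diagonal of Lambda, mean is
    xhat^s, and std is the square root of the diagonal of Sigma^s. The values
    passed in are placeholders, not the paper's settings.
    """
    # Damped constant-velocity / autoregressive prediction, per parameter.
    pred = (1.0 - gamma) * (xs_prev + lam * vs_prev) + gamma * mean
    xs_new = pred + std * rng.normal(size=xs_prev.shape)
    # Velocities: current shape minus previous shape.
    vs_new = xs_new - xs_prev
    return xs_new, vs_new
```

With Γ = 0, Λ = 1, and zero noise, the model reduces to exact constant-velocity motion, which makes a convenient check.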

Next, we generate a contour. There is a high probability of the contour in the current frame being the same as the contour in the previous frame. The probability of changing to a different contour is based on whether the current contour and the new contour face the same direction. We first determine which contours are allowed given the generated shape. If no contour is allowed, we rotate the shape after scaling by the minimum amount to allow at least one contour. We then decide which direction (left or right) the generated contour should face (if contours facing both directions are allowed). We flip the direction with probability proportional to the squared eccentricity. Given the direction, we choose an allowed contour facing that direction. If the direction has not changed, we choose the same contour with high probability. All other contours allowed in a given direction are given equal weight.

The assumed model of the contour given the blob and previous contour, p(x_{tm} | x^b_{tm}, x_{t-1,m}), is the same as the contour to contour transition model for parameters used only to describe the contour. The other parameters are assumed to be Gaussian around the blob parameters, with the same variance as the contour to contour transition model.

3.3. The Observation Likelihood Model

We use an observation likelihood model that is a combination of the BraMBLe likelihood and a soft version of the generic contour likelihood. Both likelihoods are dependent on the observation features extracted from the raw video frame. In Section 3.3.1, we describe the blob and contour observations. In Section 3.3.2, we describe our soft generalization of the contour likelihood. Finally, in Section 3.3.3, we describe our combined blob and contour likelihood.

3.3.1. Feature Extraction

BraMBLe relies on the foreground and background models of pixel value being differentiable from one another. The bedding and the mice are similar in color, so using just the color of the pixel does not work. BraMBLe therefore filters the three color channels with both Gaussian and Laplacian of Gaussian filters at a set scale. The Laplacian of Gaussian filter is useful in differentiating the smooth mouse texture from the variable bedding texture. The only noticeable failure for this choice of features is the mouse's shadow on the bedding. This is in between the mouse and lighted bedding in color, and the training data we used to learn the background model did not accurately represent both lighted and shaded modes at many locations in the image. Figure 3(a) shows the log-likelihood ratio of foreground over background for some example frames.

Contour tracking relies on an accurate edge detection algorithm. This is a challenge for our video sequence because there is a large amount of clutter in the scene. Much of the bedding has a high gradient in image intensity, and there are scratches on the cage. It is also difficult to detect edges between the mice and the bedding, because the bedding in the shadow of the mouse is very similar in color to the mouse. In addition, the edges between pairs of mice are also subtle (if visible at all). We tried numerous edge detection methods, and the only algorithm that gave reasonable results was the boundary detection algorithm used by the Berkeley Segmentation Engine (BSE) [?]. The BSE boundary detector computes the posterior probability of an edge based on the brightness, texture, and color gradient using a classifier trained on 12,000 manually labeled images. We credit the BSE's superior performance in part to the texture gradient, which is robust to the types of clutter described. Figure 3(b) shows example images illustrating the BSE boundary detector's performance. The major downside of the BSE boundary detector is that it is computationally intensive: processing one image took over five minutes on a 2.8 GHz machine. We hypothesize that this algorithm can be optimized for tracking applications to reduce this cost.

3.3.2. A Soft Contour Likelihood

Because the BSE boundary detector outputs meaningful probabilities rather than hard edge classifications, we used a soft version of the generic contour likelihood. This was essential for detecting edges between pairs of mice, as BSE often output a weak response for these edges (see Figure 3(b)). The BSE output for location i is p(edge | y_i), the probability that i is an edge given the observations y_i. We then modeled the probability of a binary classification of each pixel along the line, z, given the edge features y along a measurement line as the product

  p(z | y) = Π_{i=0}^L p(edge | y_i)^{z_i} (1 - p(edge | y_i))^{1-z_i}.

We assume equal priors for all z, so this is equal to p(y | z). The probability of observing measurement line y given the hypothesized contour is the sum over all these possibilities:

  p(y | x) = Σ_{z ∈ {0,1}^{L+1}} p(y | z) p(z, n | x),

where p(z, n | x) is the generic contour likelihood described in Section 2.3. While this computation is extremely fast for small L, it grows exponentially with L. To combat this, the sum can be taken only over z such that p(y | z) is large.
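For short measurement lines, this sum can be written out exhaustively; the sketch below enumerates all 2^(L+1) classifications z, whereas in practice the sum would be restricted to z with large p(y | z), as noted above. The hard likelihood p(z, n | x) is passed in as a callable.

```python
import itertools
import math

def soft_line_likelihood(edge_probs, hard_likelihood):
    """Soft contour likelihood for one measurement line (exhaustive sketch).

    edge_probs: list of BSE outputs p(edge | y_i) for positions 0..L.
    hard_likelihood(z) -> p(z, n | x), the generic contour likelihood for a
        binary edge vector z (with n implied by sum(z)).
    Returns sum over all z in {0,1}^(L+1) of p(y | z) p(z, n | x).
    """
    total = 0.0
    for z in itertools.product((0, 1), repeat=len(edge_probs)):
        # p(y | z) = product of Bernoulli terms p_i^{z_i} (1 - p_i)^{1 - z_i}.
        p_y_given_z = math.prod(p if zi else (1.0 - p)
                                for p, zi in zip(edge_probs, z))
        total += p_y_given_z * hard_likelihood(list(z))
    return total
```

Because the p(y | z) terms form a probability distribution over z, plugging in a constant hard likelihood of 1 must return exactly 1, and a hard likelihood of sum(z) returns the expected detection count.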

3.3.3. The Blob-Contour Observation Likelihood

We make the simple assumption that the blob observations y^b and the contour observations y^c are independent given the state of the object: p(y^b, y^c | x) = p(y^b | x) p(y^c | x). Here, p(y^b | x) is the BraMBLe blob likelihood described in Section 2.2 and p(y^c | x) = Π_{m=1}^k p(y^c_m | x_m) is the soft contour likelihood model described in Section 3.3.2. The ICondensation algorithm avoids this assumption by using the blob posterior p(x^b | y^b_{1:t}) as the importance function for contour tracking [?]. For our algorithm, it is essential that p(y^b | x) be included in our likelihood in order to use BraMBLe to solve the data association problem. We feel that this independence assumption is more reasonable than the independence assumptions we have already made (i.e., that adjacent grid pixel values or adjacent measurement lines are conditionally independent).

4. A Blob and Contour Particle Filter

In this section, we describe our algorithm for efficiently sampling from the combined blob and contour posterior distribution for the k mice, p(x_{t,1:k} | y^b_{1:t}, y^c_{1:t}), given the models and independence assumptions described in Section 3. We do this in a manner that accentuates the strengths of the blob and contour tracking algorithms, as described in Section 2. Our sampling algorithm also capitalizes on the numerous independence assumptions: we assume that the contour likelihood and the state dynamics are independent for each mouse.

At each iteration of particle filtering, we begin by sampling using only the blob observations. This step localizes the search space for each mouse, which has two effects. It decreases the amount of space that must be searched in the contour tracking step, like the blob importance function of [?]. It also separates the mice so that we can run contour tracking independently for each mouse. Next, for each mouse independently, we incorporate the contour observations to fine-tune the fit. These two sampling steps result in an algorithm that samples from the product of the posterior marginals,

  q(x_{t,1:k} | y_{1:t}) = Π_{m=1}^k p(x_{tm} | y_{1:t,m}),

where y_{1:t,m} = (y^b_{1:t}, y^c_{1:t-1}, y^c_{tm}). In order to obtain a particle set representation of the posterior distribution of all the mice, we reweight the samples by the importance weight. In Section 4.1, we show how the marginal posterior can be decomposed into the distributions described in Section 3. We also present our particle filtering algorithm that uses this decomposition to generate a particle set representing the marginal posterior for each mouse. In Section 4.2, we describe our algorithm for combining these particle sets to produce a particle set representation of the joint posterior.


4.1. Sampling from the Marginal Posterior

Given {(x_{t-1,1:k}^(i), w_{t-1}^(i))} representing p(x_{t-1,1:k} | y_{1:t-1}), we sample from the marginal

  p(x_{tm} | y_{1:t,m}) = ∫ p(x_{tm} | y_{tm}, x_{t-1,1:k}) dp(x_{t-1,1:k} | y_{1:t-1})
                        ≈ Σ_{i=1}^N p(x_{tm} | y_{tm}, x_{t-1,1:k}^(i)) w_{t-1}^(i),

whereytm = (ybt,yctm) is the blob observation and thecontour observation for the current mouse (recall that weuse different measurement lines for each mouse). This isthe probability that the state for mousem at framet is xtm

given the state of allk mice in the previous frame,xt−1,1:k

and the observations for mousem. We only have a modelof the blob likelihood given the positions of allk mice,p(ybt|xt,1:k), so we introduce dummy state vectors in anintegral to get:

p(x_{tm} | y_{tm}, x_{t-1,1:k}) = ∫ p(x_{tm}, b_{1:k} | y_{tm}, x_{t-1,1:k}) db_{1:k},

where b_{1:k} are the blob state vectors at time t. Using the Markov chain and observation likelihood independence assumptions, we can rewrite this as

∫ p(x_{tm} | b_m, x_{t-1,m}, y_{ctm}) p(b_{1:k} | x_{t-1,1:k}, y_{bt}) db_{1:k}
= p(y_{ctm} | x_{tm}) ∫ p(x_{tm} | b_m, x_{t-1,m}) p(y_{bt} | b_{1:k}) p(b_{1:k} | x_{t-1,1:k}) db_{1:k}.

This decomposition is in terms of the distributions described in Section 3, for which we have assumed simple models. We can therefore generate particles from this distribution from the previous state's particle set, {(x^{(i)}_{t-1,1:k}, w^{(i)}_{t-1})}, using standard particle filtering techniques. There are many ways that this can be done; pseudocode for our choice is shown in Step 1 of Figure 2. The particle set {(x̃^{(i)}_{tm}, w^{(i)}_{tm})} generated in this algorithm represents the marginal posterior distribution for mouse m. We resample in steps 1a (as in standard particle filtering) and 1d. The extra resampling step in 1d localizes the contour search space to the parts of parameter space deemed important by blob tracking.
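Both resampling steps (1a and 1d) can be implemented with the same generic weighted-resampling routine. A minimal sketch, assuming multinomial resampling (the paper does not specify the scheme):

```python
import numpy as np

def resample(particles, weights, rng):
    """Multinomial resampling: draw len(particles) indices with
    probability proportional to the weights, then reset the weights
    to uniform, as after any standard resampling step."""
    w = np.asarray(weights, dtype=float)
    n = len(particles)
    idx = rng.choice(n, size=n, p=w / w.sum())
    return particles[idx], np.full(n, 1.0 / n)
```

Lower-variance schemes such as systematic resampling would also fit here; the choice is orthogonal to the blob/contour structure of the algorithm.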

4.2. Sampling from the Joint Posterior

We use the product of the marginal posteriors as the importance function for sampling from the joint posterior. This is a valid importance function because it has support wherever the joint posterior does, and the importance weight corrects for the coupling between mice introduced by the blob likelihood. The algorithm for sampling from the joint posterior given particle sets {(x̃^{(i)}_{tm}, w^{(i)}_{tm})} representing the marginal posteriors is

For t = 1, 2, ...:

1. Sample from the marginal posteriors:
   For i = 1, ..., N:
   a. Choose x̄^{(i)}_{t-1,1:k} ∼ {(x^{(j)}_{t-1,1:k}, w^{(j)}_{t-1})}.
   b. Generate b̄^{(i)}_{1:k} ∼ p(b_{1:k} | x̄^{(i)}_{t-1,1:k}).
   c. Compute the weight w^{(i)}_b ∝ p(y_{bt} | b̄^{(i)}_{1:k}).
   d. Choose b^{(i)}_{1:k} ∼ {(b̄^{(j)}_{1:k}, w^{(j)}_b)}.
   e. For m = 1, ..., k:
      Generate x̃^{(i)}_{tm} ∼ p(x_{tm} | b^{(i)}_m, x_{t-1,m}).
   f. Compute the weight w^{(i)}_{tm} = p(y_{ctm} | x̃^{(i)}_{tm}).

2. Sample from the joint posterior:
   For i = 1, ..., N:
   a. For m = 1, ..., k:
      Choose x^{(i_m)}_{tm} ∼ {(x̃^{(j)}_{tm}, w^{(j)}_{tm})}.
   b. Concatenate: x^{(i)}_{t,1:k} = (x^{(i_1)}_{t1}, ..., x^{(i_k)}_{tk}).
   c. Compute the importance weight:
      w^{(i)}_t ∝ p(y_{bt} | x^{(i)}_{t,1:k}) ∑_{j=1}^{N} ∏_{m=1}^{k} p(x^{(i)}_{tm} | x^{(j)}_{t-1,m}).

Figure 2: Blob and Contour Particle Filtering
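To make the pseudocode concrete, the following is a minimal NumPy sketch of one filtering iteration, not the authors' implementation. Every `models` attribute (`sample_blob_dynamics`, `blob_likelihood`, `contour_likelihood`, `transition_density`, and so on) is a hypothetical placeholder for the corresponding Section 3 model, and the (N, k, d) array layout is our own assumption:

```python
import numpy as np

def blob_contour_step(particles, weights, models, k, rng):
    """One iteration of the Figure 2 sampler. `particles` is an
    (N, k, d) array of N joint samples over k mouse states; `models`
    bundles placeholder densities standing in for the Section 3 models."""
    N = len(particles)
    # Step 1: sample from the marginal posteriors.
    idx = rng.choice(N, size=N, p=weights / weights.sum())    # 1a
    x_prev = particles[idx]
    b_bar = models.sample_blob_dynamics(x_prev, rng)          # 1b
    w_b = models.blob_likelihood(b_bar)                       # 1c
    idx_b = rng.choice(N, size=N, p=w_b / w_b.sum())          # 1d
    b, x_prev = b_bar[idx_b], x_prev[idx_b]
    x_tilde = models.sample_contour_dynamics(b, x_prev, rng)  # 1e
    w_c = models.contour_likelihood(x_tilde)                  # 1f: (N, k)
    # Step 2: sample from the joint posterior.
    new = np.empty_like(particles)
    for m in range(k):                                        # 2a, 2b
        pm = w_c[:, m] / w_c[:, m].sum()
        new[:, m] = x_tilde[rng.choice(N, size=N, p=pm), m]
    # 2c: w ∝ p(y_bt | x) * sum_j prod_m p(x_tm | x_{t-1,m}^{(j)}).
    # This double loop over (i, j) is O(N^2); fine for a sketch.
    trans = sum(models.transition_density(new, particles[j]).prod(axis=1)
                for j in range(N))
    w_new = models.blob_likelihood_of_states(new) * trans
    return new, w_new / w_new.sum()
```

The only joint computation is the blob resampling (1b–1d) and the final weight; everything between runs per mouse, which is what keeps the sample requirement small.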

shown in Step 2 of Figure 2. To sample from the product of marginals, we simply choose samples independently from each of the marginals and concatenate. We then reweight each sample i by the ratio of the joint posterior and the product of marginals evaluated at the sample:

w^{(i)}_t ∝ [∫ p(x^{(i)}_{t,1:k} | x_{t-1,1:k}, y_t) dp(x_{t-1,1:k} | y_{1:t-1})] / [∏_m p(x^{(i)}_{tm} | y_{1:t,m})].

The numerator factors as proportional to

∫ p(y_{bt} | x^{(i)}_{t,1:k}) ∏_m p(y_{ctm} | x^{(i)}_{tm}) p(x^{(i)}_{tm} | x_{t-1,m}) dp(x_{t-1,1:k} | y_{1:t-1})

≈ p(y_{bt} | x^{(i)}_{t,1:k}) ∏_m p(y_{ctm} | x^{(i)}_{tm}) ∑_j w^{(j)}_{t-1} ∏_m p(x^{(i)}_{tm} | x^{(j)}_{t-1,m}).

We approximate the denominator by ∏_m p(y_{ctm} | x^{(i_m)}_{tm}). Thus, the importance weight is the ratio

w^{(i)}_t ∝ p(y_{bt} | x^{(i)}_{t,1:k}) ∑_{j=1}^{N} ∏_{m=1}^{k} p(x^{(i)}_{tm} | x^{(j)}_{t-1,m}).
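Numerically, this final weight is just the blob likelihood times a sum over previous particles of a product over mice of transition densities. A small sketch (the array shapes are our assumptions):

```python
import numpy as np

def joint_importance_weight(blob_lik, trans_dens):
    """w_t^{(i)} up to proportionality. `blob_lik` is the scalar
    p(y_bt | x_{t,1:k}); `trans_dens[j, m]` holds the transition
    density p(x_tm | x_{t-1,m}^{(j)}) for previous particle j and
    mouse m, an (N_prev, k) array assumed precomputed from the
    Section 3 dynamics model."""
    return blob_lik * trans_dens.prod(axis=1).sum()
```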

5. Experiments

We evaluated our blob and contour tracking algorithm on a video sequence of three identical mice exploring a cage. The model parameters were chosen by hand using a separate video sequence. Many of the parameters were set using our knowledge about the problem. These include the variance of the transition models and the constraints on the state. Other parameters, including the damping and autoregressive constants, were set to the values used in [?]. The contour likelihood parameters were set so that the ranking of the probability of each vector of observed edge detections seemed reasonable. Some of the parameters were chosen



somewhat arbitrarily and never varied – these include the number of measurement lines and the parameters used in the BraMBLe likelihood. The number of samples was chosen to be N = 2000. While we had qualitatively similar performance with N = 1000 samples, the results returned by particle filtering varied quite a bit. We thus chose to present results with N = 2000 samples, for which the output of our particle filtering algorithm was stable. This number of samples compares favorably to the 4000 samples used to track a pair of leaves in [?] and the 1000 samples used to track two people/blobs in [?].

We provide with this paper a video of our results on a short sequence. Summary still frames are shown in Figure 4. One of the main successes of our contour tracking algorithm is that it is resistant to drift – we never lose a mouse, despite the mice's erratic behavior. For instance, we follow a mouse jumping, a mouse falling from the ceiling, and mice making quick turns and accelerations that are not fit by our simple constant velocity model (see Figure 4(a) for examples). This is due to the robustness of the blob tracking component of our algorithm. Another success of our algorithm is that two contours never match the same mouse, showing that our solution to the data association problem is effective. Our blob and contour tracking algorithm is therefore able to preserve the strengths of BraMBLe.

Our algorithm is rarely distracted by background clutter. The only exceptions occur when both algorithms make mistakes: when the blob tracker mistakes shaded bedding for foreground and the contour tracker fits to the edge of a tail (see Figure 4(b) for an example). This implies that the powerful feature extraction methods and the blob and contour combination provide robust observation likelihoods.

Our algorithm accurately tracks the mice completely through some severe occlusions, and partway through many others. This is because of the detailed fit provided by the contour tracking algorithm and its ability to use features available during occlusions. Example successful frames are shown in Figure 4(c). In general, we found that our algorithm often found very reasonable contour fits.

One failure of our algorithm is that it occasionally gets stuck in local optima in which the fit mouse contour faces the wrong direction. We plan to address this problem with a better model of the probability of the direction changing. Figure 4(b) shows examples of this.

The most severe failure of our algorithm was that identity labels were swapped during some of the occlusions. The reason for this is that the fit of our algorithm is heavily biased by the fit of the BraMBLe algorithm. For occlusions in which the contour observation signals are weak, this bias from BraMBLe can dominate. We propose a solution to this problem in Section 6. Figure 4(d) shows some examples of these failures.

For comparison, we also implemented a combined blob-contour tracking algorithm that did not exploit the contour likelihood independence assumption. This algorithm has the disadvantage that samples of all k mice are weighted by the product of the contour likelihoods for each mouse. Thus, the number of samples fitting blobs is the same, but the effective number of samples used when fitting contours is much smaller. The results we observed fit this theory. While this algorithm was also resistant to drift (blob tracking is the same in both algorithms), the contour fits found were less satisfactory. In general, they were less fine-tuned and more variable from frame to frame, which suggests that the algorithm needed more samples. This is particularly evident during occlusions. The fits during occlusions are less fine-tuned to the contour data, and therefore more influenced by the blob tracking results. This causes worse fits in every occlusion in the sequence. In the first 500 frames, this algorithm swapped identities twice more than the algorithm proposed in this paper.

6. Conclusions and Future Work

We have presented a combined blob and contour particle filtering algorithm that performs well on a difficult video sequence of three identical mice exploring a cage. Our algorithm leverages the strengths of multiple blob tracking, contour tracking, and the independence assumptions of our model to accurately track the mouse contours without too many particles. This combination allows robust tracking of multiple occluding objects on real data.

In future work, we plan to explore algorithms that use information from both the past and the future to determine the positions of the mice during an occlusion. We hope that this will solve the main failure of the algorithm proposed in this paper by making the BraMBLe tracker less greedy. Standard Monte Carlo smoothing algorithms are insufficient to solve this problem because the number of possible fits during an occlusion is huge and the BraMBLe likelihood is peaked enough that, for a reasonable number of samples, some potential paths are lost altogether. Monte Carlo smoothing fails because it works by reweighting the filtering particle set based on the future observations. We are thus exploring more heuristic solutions to this problem, in the direction of [?].

Acknowledgments

References





(a) The log-likelihood ratio of foreground over background for selected frames. Image ii shows an occlusion. Image iii shows the tail and shadow of the mice.

(b) The second column shows the BSE output; the third shows the Canny edge detector's output. In image i, an edge between a pair of mice is found. In image ii, BSE gives a weak response to an edge between a pair of mice. In the bottom frame, BSE is robust to scratches.

Figure 3: Features extracted for the (a) blob and (b) contour observation likelihoods.

(a) Tracking results for a mouse jumping, a mouse falling from the ceiling, and a mouse turning quickly.

(b) The contour is fit to a tail and the blob is fit to a shadow; the tracker is robust to scratches on the cage; the contour is flipped.

(c) The first three occlusion sequences in which our tracking algorithm performs well.

(d) The first two occlusion sequences on which our algorithm swaps mouse identities.

Figure 4: Still frame summary of our results. We plot the average affine transformation applied to the contour with the most total weight.
