
Page 1:

MULTI-TARGET TRACKING THROUGH OPPORTUNISTIC CAMERA CONTROL IN A RESOURCE CONSTRAINED MULTIMODAL SENSOR NETWORK

Jayanth Nayak1, Luis Gonzalez-Argueta2, Bi Song2, Amit Roy-Chowdhury2, Ertem Tuncel2
Department of Electrical Engineering, University of California, Riverside
Bourns College of Engineering, Information Processing Laboratory, www.ipl.ee.ucr.edu

ICDSC'08, 9/8/2008

Page 2:

Overview

Introduction
Problem Formulation
Audio and Video Processing
Camera Control Strategy
Computing Final Tracks of All Targets
Experimental Results
Conclusion
Acknowledgements

Page 3:

Motivation

Obtaining multi-resolution video of a highly active environment requires a large number of cameras. Disadvantages:

Cost of buying, installing, and maintaining
Bandwidth limitations
Processing and storage
Privacy

Our goal: minimize the number of cameras through a control mechanism that directs the cameras' attention to the interesting parts of the scene.

Page 4:

Proposed Strategy

Audio sensors direct the pan/tilt/zoom of the camera to the location of the event.
Audio data intelligently turns on the camera, and video data turns off the camera.
Audio and video data are fused to obtain the tracks of all targets in the scene.
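This on/off behavior amounts to a simple event loop. Below is a minimal Python sketch under assumed interfaces; PTZCamera, detect_audio_event, point_at, and targets_in_view are hypothetical placeholders, not the system's actual API.

```python
import time


class PTZCamera:
    """Hypothetical stand-in for a pan/tilt/zoom camera interface."""

    def __init__(self):
        self.powered = False

    def turn_on(self):
        self.powered = True

    def turn_off(self):
        self.powered = False

    def point_at(self, ground_xy):
        # Set pan/tilt/zoom so the field of view covers ground_xy.
        pass

    def targets_in_view(self):
        # A video-based person detector/tracker would populate this.
        return []


def detect_audio_event(audio_sensors):
    """Return an estimated ground-plane location when clustered audio
    exceeds an amplitude threshold, else None (assumed interface)."""
    return None


def control_loop(camera, audio_sensors, poll_s=0.1):
    # Audio turns the camera on and aims it; an empty video view turns it off.
    while True:
        if not camera.powered:
            loc = detect_audio_event(audio_sensors)
            if loc is not None:
                camera.turn_on()
                camera.point_at(loc)
        elif not camera.targets_in_view():
            camera.turn_off()
        time.sleep(poll_s)
```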

Page 5:

Example Scenario

An example scenario where audio can be used to efficiently control two video cameras. There are four tracks that need to be inferred. Time instants of interest, i.e., the initiation and end of each track, mergings, splittings, and crossovers, are indicated directly on the tracks; the mergings and crossovers are further marked with an X. The two innermost tracks coincide over the entire time interval (t2, t3). Cameras C1 and C2 need to be panned, zoomed, and tilted based on their own output and that of the audio sensors a1, . . . , aM.

Page 6:

Relation To Previous Work

Prior work fuses simultaneous audio and video data; our audio and video data are captured over disjoint time intervals.

Prior work uses dense networks of vision sensors; to cover a large field, we focus on controlling a small set of vision sensors.

Our video and audio data are analyzed from dynamic scenes.

Page 7:

Problem Formulation

Audio sensors A = {a1, . . . , aM} are distributed across a ground plane R.
R is also observable from a set of controllable cameras C = {c1, . . . , cL}.
However, the entire region R may not be covered with one set of camera settings.
p-tracks: tracks belonging to targets
a-tracks: tracks obtained by clustering audio
Resolving p-track ambiguity:

Camera Control
Person Matching
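Purely as an illustration of the two track types, they might be represented as simple Python containers; the field names below are placeholders, not the paper's notation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ATrack:
    """Track obtained by clustering audio data (fields are illustrative)."""
    track_id: int
    positions: List[Tuple[float, float]] = field(default_factory=list)  # ground-plane (x, y)
    timestamps: List[float] = field(default_factory=list)


@dataclass
class PTrack:
    """Track belonging to a physical target. Ambiguities between p-tracks
    are resolved via camera control and person matching."""
    target_id: int
    audio_segments: List[ATrack] = field(default_factory=list)
    appearance: List = field(default_factory=list)  # video features, e.g. color histograms
```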

Page 8:

Tracking System Overview

Overall camera control system. Audio sensors A = {a1, . . . , aM} are distributed across regions Ri. The set of audio clusters is denoted by Bt, and Kt− represents the set of confirmed a-tracks estimated from observations before time t. P/T/Z cameras are denoted by C = {c1, . . . , cL}. Ground plane positions are denoted by Otk.

Page 9:

Processing Audio and Video

a-tracks are clusters of audio data whose amplitude is above a threshold; they are tracked using a Kalman filter.

In video, people are detected using histograms of oriented gradients (HOG) and tracked using an auxiliary particle filter.
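As an illustration only (not the paper's implementation): audio-cluster positions can be propagated with a constant-velocity Kalman filter, and people can be detected per frame with OpenCV's default HOG people detector. The paper's video tracker is an auxiliary particle filter, which this sketch does not reproduce.

```python
import cv2
import numpy as np

# --- a-track: constant-velocity Kalman filter on 2D ground-plane position ---
kf = cv2.KalmanFilter(4, 2)              # state [x, y, vx, vy], measurement [x, y]
dt = 1.0
kf.transitionMatrix = np.array([[1, 0, dt, 0],
                                [0, 1, 0, dt],
                                [0, 0, 1,  0],
                                [0, 0, 0,  1]], np.float32)
kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
kf.processNoiseCov = 1e-2 * np.eye(4, dtype=np.float32)
kf.measurementNoiseCov = 1e-1 * np.eye(2, dtype=np.float32)

def update_a_track(audio_xy):
    """Predict, then correct with the centroid of a new audio cluster."""
    kf.predict()
    return kf.correct(np.array(audio_xy, np.float32).reshape(2, 1))

# --- video: per-frame person detection with the default HOG people detector ---
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def detect_people(frame):
    """Return bounding boxes (x, y, w, h) of people detected in a BGR frame."""
    rects, _ = hog.detectMultiScale(frame, winStride=(8, 8))
    return rects
```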

Page 10:

Mapping From Image Plane to Ground Plane

Learned parameters are used to transform tracks from the image plane to the ground plane.
Estimate the projective transformation matrix H during a calibration phase.
Precompute H for each PTZ setting of each camera.
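A minimal sketch of this mapping with OpenCV, assuming image-to-ground point correspondences are collected during calibration for each PTZ setting; the correspondence values below are placeholders.

```python
import cv2
import numpy as np

# Calibration data for one PTZ setting: image-plane points (pixels) and the
# matching ground-plane points (meters). The values below are placeholders.
img_pts    = np.array([[100, 400], [520, 390], [480, 120], [140, 130]], np.float32)
ground_pts = np.array([[0.0, 0.0], [5.0, 0.0], [5.0, 8.0], [0.0, 8.0]], np.float32)

# Estimate the projective transformation H for this PTZ setting; in the
# system this would be precomputed once per setting of each camera.
H, _ = cv2.findHomography(img_pts, ground_pts)

def image_to_ground(track_pts_px, H):
    """Map image-plane track points (N x 2, pixels) to ground-plane meters."""
    pts = np.asarray(track_pts_px, np.float32).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, H).reshape(-1, 2)

# Example: feet positions of a tracked person over two frames.
print(image_to_ground([[300, 380], [310, 360]], H))
```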

(Figure: vanishing line.)

Page 11:

Tracking System Overview

Page 12:

Camera Control

Goal: avoid ambiguity, or disambiguate, when tracks are created or deleted, intersect, or merge.
Set pan/tilt/zoom parameters accordingly.

Page 13:

Setting Camera Parameters

Heuristic algorithm (a sketch follows below):
Cover the ground plane with regions R_i^l, where R_i^l lies in the field of view of camera C_l with camera parameters (P, T, Z)_i^l.
The tracking algorithm specifies a point of interest x from the last known a-track.
If no camera is on, find the region R_i^l containing x.
Reassign a camera and set its parameters if x approaches the boundary of the current R_i^l.
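A rough sketch of the heuristic in Python, assuming each region R_i^l is approximated by an axis-aligned rectangle on the ground plane; the region geometry, margin value, and set_ptz callback are illustrative assumptions, not values from the paper.

```python
# Each region R_i^l is approximated here by an axis-aligned rectangle on the
# ground plane, tagged with the camera l and PTZ setting that covers it.
REGIONS = [
    # (camera_id, (P, T, Z), x_min, x_max, y_min, y_max)  -- placeholder values
    (0, (10.0, -5.0, 1.0), 0.0, 5.0, 0.0, 5.0),
    (0, (40.0, -5.0, 1.0), 5.0, 10.0, 0.0, 5.0),
    (1, (-20.0, -8.0, 2.0), 0.0, 10.0, 5.0, 12.0),
]

def find_region(x):
    """Return the first region whose rectangle contains ground point x."""
    for r in REGIONS:
        cam, ptz, xmin, xmax, ymin, ymax = r
        if xmin <= x[0] <= xmax and ymin <= x[1] <= ymax:
            return r
    return None

def near_boundary(x, region, margin=0.5):
    """True if x is within `margin` meters of the edge of its current region."""
    _, _, xmin, xmax, ymin, ymax = region
    return (x[0] - xmin < margin or xmax - x[0] < margin or
            x[1] - ymin < margin or ymax - x[1] < margin)

def assign_camera(x, current_region, set_ptz):
    """Heuristic from the slide: pick a region for the point of interest x
    (from the last known a-track) and command that camera's PTZ setting."""
    if current_region is None or near_boundary(x, current_region):
        region = find_region(x)
        if region is not None:
            cam, ptz = region[0], region[1]
            set_ptz(cam, ptz)        # hypothetical camera-control callback
            return region
    return current_region
```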

Page 14:

Camera Control Based on Track Trajectories

(Figure: six track-trajectory cases, each plotted as Location (meters) vs. Time (seconds): Intersection, Separation, Merger, Sudden Appearance, Undetected Disappearance, and Sudden Disappearance; annotations mark where the system switches to video.)

Page 15:

Creating Final Tracks Of All Targets

Bipartite graph matching over a set of color histograms

We collect features as the target enters and exits the scene in video.
For every new a-track, features are collected from a small set of frames.
The weight of an edge is the distance between the observed video features.
Additionally, audio data imposes constraints on the weights.
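A minimal sketch of the matching step using SciPy's Hungarian solver (linear_sum_assignment); the L1 histogram distance and the way the audio constraint is folded into the edge weights are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hist_distance(h1, h2):
    """Simple L1 distance between normalized color histograms."""
    return float(np.abs(h1 - h2).sum())

def match_tracks(exit_hists, entry_hists, audio_feasible):
    """Match track exits to track entries.

    exit_hists, entry_hists: lists of color histograms collected from video
        as targets leave / enter the observed scene.
    audio_feasible[i][j]: False if the audio tracks rule out linking exit i
        to entry j (e.g., wrong time order or inconsistent location).
    Returns a list of (exit_index, entry_index) pairs.
    """
    big = 1e6  # effectively forbids audio-infeasible assignments
    cost = np.zeros((len(exit_hists), len(entry_hists)))
    for i, he in enumerate(exit_hists):
        for j, hn in enumerate(entry_hists):
            cost[i, j] = hist_distance(he, hn)
            if not audio_feasible[i][j]:
                cost[i, j] += big
    rows, cols = linear_sum_assignment(cost)
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < big]
```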

Page 16:

Creating Final Tracks Using Bipartite Matching

(Figure: tracking in audio and video versus tracking in audio only, each plotted as Location (meters) vs. Time (seconds), with track entry/exit nodes a–g and the corresponding bipartite graph matchings with and without the audio constraint.)

Tracking in audio and video: three tracks are recovered by matching every node (entry to and exit from the scene) where video was captured.

Tracking in audio only: two tracks are recovered; however, the red and green tracks follow the wrong paths. Audio alone cannot disambiguate the targets once the clusters have merged.

Page 17:

Experimental Results

Inter P-Track Distance at a Merge Event
Inter P-Track Distance at a Crossover Event

Page 18:

Experimental Results (Cont.)

Page 19:

Conclusion

Goal: minimize camera usage in a surveillance system

Save power, bandwidth, storage, and money
Alleviate privacy concerns

Proposed a probabilistic scheme for opportunistically deploying cameras in a multimodal network.
Showed detailed experimental results on real data collected in multimodal networks.
The final set of tracks is computed by bipartite matching.

Page 20:

Acknowledgements

This work was supported by Aware Building (ONR N00014-07-C-0311) and NSF CNS-0551719.

Bi Song2 and Amit Roy-Chowdhury2 were additionally supported by NSF ECCS-0622176 and ARO W911NF-07-1-0485.

Page 21:

Thank You.

Questions?

Jayanth Nayak1
[email protected]

Luis Gonzalez-Argueta2, Bi Song2, Amit Roy-Chowdhury2, Ertem Tuncel2
{largueta,bsong,amitrc,ertem}@ee.ucr.edu

Bourns College of Engineering, Information Processing Laboratory, www.ipl.ee.ucr.edu