video synopsis

Video Synopsis Yael Pritch Alex Rav-Acha Shmuel Peleg The Hebrew University of Jerusalem

Post on 09-Jan-2016

TRANSCRIPT

Page 1: Video Synopsis

Video Synopsis

Yael Pritch Alex Rav-Acha Shmuel Peleg

The Hebrew University of Jerusalem

Page 2: Video Synopsis

Detective Series: “Elementary”

Page 3: Video Synopsis

Video Surveillance Problem

• It took weeks to find these events in video archives.

• The cost of lost information or of a delay may be very high.

Terrorists, London tube, 7-7-05

Cologne Train Bombs, 31-7-06

Page 4: Video Synopsis

Challenges in Video Surveillance

• Millions of surveillance cameras are installed, capturing data 24 hours a day, 365 days a year

• The number of cameras and their resolution are increasing rapidly

• Not enough people to watch captured data

• Human Attention is Lost after ~20 Minutes

• Result: Recorded Video is Lost Video
– Less than 1% of surveillance video is examined

Page 5: Video Synopsis

Handling Surveillance Video

• Object Detection and Tracking
– Background Subtraction

• Object Recognition
– Individual people

• Activity Recognition
– Left luggage; Fight

• A lot of progress has been made. More work remains.

Page 6: Video Synopsis

Handling Surveillance Video

Video Synopsis

• Object Detection and Tracking (Assume Done)
– Background Subtraction

• Object Recognition (Do not use)
– Individual people

• Activity Recognition (Do not use)
– Left luggage; Fight

• A lot of progress has been made. More work remains.

• Let People do the Recognition

Page 7: Video Synopsis
Page 8: Video Synopsis

Video Synopsis

[Figure: Original video and Video Synopsis]

• A fast way to browse & index video archives.

• Summarize a full day of video in a few minutes.

• Events from different times appear simultaneously.

• Human inspection of synopsis!!!

Page 9: Video Synopsis

Synopsis of Surveillance Videos

Human Inspection of Search Results

• Serve queries regarding each camera:
– Generate a 3-minute video showing most activities in the last 24 hours
– Generate the shortest video showing all activities in the last 24 hours

• Each presented activity points back to its original time in the original video

• Orthogonal to Video Analytics

Page 10: Video Synopsis

Non-Chronological Time

Dynamic Mosaicing Video Synopsis

Salvador Dali


Page 11: Video Synopsis

Dynamic Mosaics

Non Chronological Time

Page 12: Video Synopsis

Handheld Stereo Mosaic

Page 13: Video Synopsis

[Figure: strips taken from the original frames in the u-t plane form the Mosaic Image]

Page 14: Video Synopsis

[Figure: Mosaic Image built from a Space-Time Slice in the u-t plane; labels: Frame tk, Frame tl, ua, ub, Visibility region]

Page 15: Video Synopsis

Creating Dynamic Panoramic Movies

[Figure: space-time slices in the u-t plane, from First Slice to Last Slice; First Mosaic - Appearance, Last Mosaic - Disappearance]

Page 16: Video Synopsis

Dynamic Panorama: Iguazu Falls


Page 17: Video Synopsis

From Video In to Video Out

Constructing an aligned Space-Time Volume

[Figure: aligned space-time volume with axes u, v, t]

Alignment: Parallax, Dynamic Scenes, etc.

Page 18: Video Synopsis

Aligned ST Volume: View from Top

[Figure: frames k and k+1 in the u-t plane, for a Stationary Camera and for a Panning Camera]

Page 19: Video Synopsis

Generate Output Video

Sweeping a “Time Front” surface

Time is not chronological any more

Interpolation

Page 21: Video Synopsis

Evolving Time Front

[Figure: time front evolving through the space-time volume; axes u, x, t]

Mapping each time front (TF) to a new frame using spatio-temporal interpolation
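The time-front sweep can be illustrated with a minimal sketch. It assumes a grayscale space-time volume stored as nested lists `volume[t][y][x]` and a time front given as one (possibly fractional) time offset per image column. The names `sweep_time_front` and `front`, and the choice of linear interpolation between neighboring input frames, are assumptions made for illustration and are not taken from the slides.

```python
import math

def sweep_time_front(volume, front, num_frames):
    """Sweep a time front through an aligned space-time volume.

    volume[t][y][x] holds pixel values; front[x] is the input time sampled
    by column x in the first output frame. Output frame k samples column x
    at input time front[x] + k, interpolating linearly between frames.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    out = []
    for k in range(num_frames):
        frame = [[0.0] * W for _ in range(H)]
        for x in range(W):
            # Clamp the sampled time to the valid range of the volume.
            t = min(max(front[x] + k, 0), T - 1)
            t0 = math.floor(t)
            t1 = min(t0 + 1, T - 1)
            a = t - t0  # interpolation weight between frames t0 and t1
            for y in range(H):
                frame[y][x] = (1 - a) * volume[t0][y][x] + a * volume[t1][y][x]
        out.append(frame)
    return out

# Toy volume: 4 frames, 1 row, 3 columns; pixel value = 10 * t + x.
volume = [[[10 * t + x for x in range(3)]] for t in range(4)]
slanted = sweep_time_front(volume, front=[0, 0.5, 1], num_frames=2)
flat = sweep_time_front(volume, front=[0, 0, 0], num_frames=4)
```

With a flat front every output frame equals the corresponding input frame, so time stays chronological; a slanted front makes different columns of one output frame come from different input times.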

Page 22: Video Synopsis

Example: Demolition

Page 23: Video Synopsis


Page 24: Video Synopsis

Example: Racing

Page 25: Video Synopsis


Page 26: Video Synopsis

Dynamic Panorama: Thessaloniki

Page 27: Video Synopsis

Creating Panorama: 4D min-cut

[Figure: aligned space-time volume; axes x, t]

Page 28: Video Synopsis

Mosaic Stitching Examples

Page 29: Video Synopsis

Mosaic Stitching Examples

Page 30: Video Synopsis

Video Synopsis and Indexing

Making a Long Video Short

• 11 million cameras in 2008

• Expected 30 million in 2013

• Recording 24 hours a day, every day

Page 31: Video Synopsis

Explosive growth in cameras…

[Chart: number of cameras, 2009 to 2014; labeled values 11m and 24m]

Page 32: Video Synopsis

Handling the Video Overflow

• Not enough people to watch captured data

• Guards are watching 1% of video

• Automatic Video Analytics covers less than 5%

– Only when events can be accurately defined & detected

• Most video is never watched or examined!!!

Page 33: Video Synopsis

A Recent Example

Page 34: Video Synopsis

Related Work (Video Summary)

• Key frames
C. Kim and J. Hwang. An integrated scheme for object-based video abstraction. In ACM Multimedia, pages 303–311, New York, 2000.

• Collection of short video sequences
A. M. Smith and T. Kanade. Video skimming and characterization through the combination of image and language understanding. In CAIVD, pages 61–70, 1998.

• Adaptive Fast Forward
N. Petrovic, N. Jojic, and T. Huang. Adaptive video fast forward. Multimedia Tools and Applications, 26(3):327–344, August 2005.

Entire frames are used as the fundamental building blocks

• Mosaic images together with some meta-data for video indexing
M. Irani, P. Anandan, J. Bergen, R. Kumar, and S. Hsu. Efficient representations of video sequences and their applications. Signal Processing: Image Communication, 8(4):327–351, 1996.

• Space Time Video montage
H. Kang, Y. Matsushita, X. Tang, and X. Chen. Space-time video montage. In CVPR’06, pages 1331–1338, New York, June 2006.

Page 35: Video Synopsis

Object Based Video Summary

• We proposed an object / event based summary, as opposed to a frame based summary
– Makes it possible to shorten a very long video into a short one
– No fast forward of objects (preserves dynamics)
– Causality is not necessarily kept

Page 36: Video Synopsis

Original video: 24 hours Video Synopsis: 1 minute

Video Synopsis

• Browse Hours in Minutes

• Index back to Original Video

Page 37: Video Synopsis

Video Synopsis

Shift Objects in Time

[Figure: objects shifted along the t axis]

Input Video I(x,y,t)

Synopsis Video S(x,y,t)
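Shifting objects in time can be sketched as follows. The sketch assumes each object is stored as a list of per-frame pixel dictionaries and is pasted onto a fixed background at a chosen synopsis start frame. The names `render_synopsis`, `tubes`, and `starts` are hypothetical, and the hard paste stands in for any blending step.

```python
def render_synopsis(background, tubes, starts, length):
    """Build synopsis frames S by shifting object tubes in time.

    background[y][x] is a single background image.
    tubes[i] is a list of {(x, y): value} dicts, one per frame of object i.
    starts[i] is the synopsis frame at which object i begins to play.
    """
    # Every synopsis frame starts as a copy of the background.
    frames = [[row[:] for row in background] for _ in range(length)]
    for tube, start in zip(tubes, starts):
        for dt, pixels in enumerate(tube):
            t = start + dt
            if 0 <= t < length:
                for (x, y), value in pixels.items():
                    frames[t][y][x] = value  # hard paste, no blending
    return frames

background = [[0, 0, 0, 0]]
# Object A moved left-to-right early in the input video, object B moved
# right-to-left much later; both are shown from synopsis frame 0.
tube_a = [{(0, 0): 1}, {(1, 0): 1}]
tube_b = [{(3, 0): 2}, {(2, 0): 2}]
synopsis = render_synopsis(background, [tube_a, tube_b], starts=[0, 0], length=2)
```

Both objects keep their own motion (no fast forward), but they now appear simultaneously even though they were recorded at different times.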

Page 38: Video Synopsis

How does Video Synopsis work?

Objects Extracted to Database

[Figure: extracted objects labeled with their original times: 09:03, 10:00, 11:08, 14:38, 18:45, 21:50]

Original: 9 hours

Video Synopsis: 30 seconds

Page 40: Video Synopsis

Steps in Video Synopsis

• Detect and track objects, store in database.

• Select relevant objects from database

• Display selected objects in a very short “Video Synopsis”

• In “Video Synopsis”, objects from different times can appear simultaneously

• Index from selected objects into original video

• Cluster similar objects

Page 41: Video Synopsis

Object “Packing”

• Compute object trajectories

• Pack objects in shorter time (minimize overlap)

• Overlay objects on top of time-lapse background

[Figure: Input Video and Synopsis Video in the x-t plane]

Page 42: Video Synopsis

Example: Monitoring a Coffee Station


Page 43: Video Synopsis


Page 44: Video Synopsis

Original Movie Stroboscopic Movie

Page 45: Video Synopsis

Panoramic Synopsis

Panoramic synopsis is possible when the camera is rotating.

Original

Panoramic Video Synopsis

Page 46: Video Synopsis

Endless video – Challenges

• Endless video – finite storage (“forget” events)

• Background changes during long time periods

• Stitching object on a background from a different time

• Fast response to user queries

Page 47: Video Synopsis

2 Phase approach

Online Monitoring

• Online Monitoring (real time)
– Compute background (background model)
– Find Activity Tubes and insert to database
– Handle a queue of objects

• Query Service
– Collect tubes with desired properties (time…)
– Generate Time Lapse Background
– Pack tubes into desired length of synopsis
– Stitching of objects to background

Page 49: Video Synopsis

Extract Tubes

Object Detection and Tracking

• We used a simplification of Background-Cut*
– combining background subtraction with min-cut

• Connect space-time tube components

• Morphological operations

* J. Sun, W. Zhang, X. Tang, and H. Shum. Background cut. In ECCV, pages 628–641, 2006.
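A much simplified sketch of tube extraction is given below. It is not Background-Cut: plain thresholding against a temporal-median background is swapped in for the min-cut step, and the morphological cleanup is omitted. Foreground pixels are then grouped into tubes as connected components in (t, y, x). The function names and the `threshold` value are assumptions.

```python
from collections import deque
from statistics import median

def median_background(frames):
    """Per-pixel temporal median over all frames (frames[t][y][x])."""
    H, W = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(W)] for y in range(H)]

def extract_tubes(frames, threshold=30):
    """Return tubes as sorted lists of (t, y, x) foreground voxels."""
    bg = median_background(frames)
    T, H, W = len(frames), len(bg), len(bg[0])
    # Background subtraction by simple thresholding (stand-in for min-cut).
    foreground = {(t, y, x)
                  for t in range(T) for y in range(H) for x in range(W)
                  if abs(frames[t][y][x] - bg[y][x]) > threshold}
    tubes = []
    while foreground:
        seed = foreground.pop()
        tube, queue = [seed], deque([seed])
        while queue:  # breadth-first search over 26-connected neighbors
            t, y, x = queue.popleft()
            for dt in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        n = (t + dt, y + dy, x + dx)
                        if n in foreground:
                            foreground.remove(n)
                            tube.append(n)
                            queue.append(n)
        tubes.append(sorted(tube))
    return sorted(tubes)

# Five 1x6 frames: one bright object moves right during t=0..2,
# another appears briefly at t=4 on the far side of the image.
frames = [[[0] * 6] for _ in range(5)]
frames[0][0][0] = frames[1][0][1] = frames[2][0][2] = 255
frames[4][0][5] = 255
tubes = extract_tubes(frames)
```

The moving object becomes a single tube because its voxels touch across consecutive frames, while the second object forms its own tube.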

Page 50: Video Synopsis

Extract Tubes

Page 51: Video Synopsis

The Object Queue

• Limited Storage Space with Endless Video
– May need to discard objects

• Estimate object usefulness for future queries
– “Importance” (application dependent)
– Collision Potential
– Age: discard older objects first

• Take mistakes into account….
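One way to picture the object queue is a score-based eviction policy, sketched below. The linear scoring function, its weights, and the field names (`importance`, `collision_potential`, `end_time`) are all hypothetical; the slides name the three criteria but do not say how they are combined.

```python
def discard_score(obj, now, w_importance=1.0, w_collision=1.0, w_age=1.0):
    """Higher score = better candidate for discarding (assumed formula)."""
    age = now - obj["end_time"]
    return (w_age * age
            + w_collision * obj["collision_potential"]
            - w_importance * obj["importance"])

def trim_queue(queue, capacity, now):
    """Keep the `capacity` objects with the lowest discard score."""
    kept = sorted(queue, key=lambda o: discard_score(o, now))[:capacity]
    # Return the survivors in chronological order.
    return sorted(kept, key=lambda o: o["end_time"])

queue = [
    {"id": "a", "end_time": 0, "importance": 0.1, "collision_potential": 0.5},
    {"id": "b", "end_time": 50, "importance": 0.5, "collision_potential": 0.2},
    {"id": "c", "end_time": 90, "importance": 0.9, "collision_potential": 0.1},
]
kept = trim_queue(queue, capacity=2, now=100)
kept_ids = [o["id"] for o in kept]
```

In this toy run the oldest, least important object is the one discarded. In practice the weights would need tuning so that age, measured in frames or seconds, does not swamp the other two terms.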

Page 52: Video Synopsis

2 Phase approach

Query Service

• Online Monitoring (real time)
– Pre-Processing: remove stationary frames
– Compute background (temporal median)
– Find Activity Tubes and insert to database
– Handle a queue of objects

• Query Service
– Collect tubes with desired properties (time…)
– Generate Time Lapse Background
– Pack tubes into desired length of synopsis
– Stitching of objects to background

Page 53: Video Synopsis

Time-Lapse Background

Page 54: Video Synopsis

Time-Lapse Background

• Time Lapse background goals
– Represent background changes over time
– Represent the background of activity tubes

Activity distribution over time (parking lot, 24 hours)

20% night frames

Page 55: Video Synopsis

Tubes Selection

Guidelines for the tubes arrangement:

• Maximum “activity” in synopsis

• Minimum collision between objects

• Preserve causality (temporal consistency)

These guidelines define an energy minimization problem over a time mapping between the input tubes and their appearance times in the output synopsis.

Page 56: Video Synopsis

Energy Minimization Problem

$$E(M) = \sum_{b \in B} E_a(\hat{b}) + \sum_{b, b' \in B} \left( E_t(\hat{b}, \hat{b}') + E_c(\hat{b}, \hat{b}') \right)$$

• $E_a$: Activity Cost (favors synopsis video with maximal activity)

• $E_t$: Temporal consistency Cost (favors synopsis video that preserves original order of events)

• $E_c$: Collision Cost (favors synopsis video with minimal collision between tubes)

• $B$: activity tubes

• $M$: temporal mapping from the input to the synopsis

• $\hat{b}$: the mapping (time shift) of tube $b$ into the synopsis

Page 57: Video Synopsis

Tubes Selection as Energy Minimization

• Each state – temporal mapping of tubes into the synopsis

• Neighboring states - states in which a single activity tube changes its mapping into the synopsis.

• Initial state - all tubes are shifted to the beginning of the synopsis video.
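The state space above can be explored with a small local search, sketched here under several assumptions. Tubes are stored as per-frame pixel sets, the three cost terms are crude unweighted stand-ins (activity lost outside the synopsis, number of colliding pixels, a unit penalty for reversed order), and plain greedy descent is used as the search strategy. All function and field names are hypothetical.

```python
from itertools import combinations

def activity_cost(tube, s, length):
    """Pixels of the tube that fall outside the synopsis window [0, length)."""
    return sum(len(px) for dt, px in enumerate(tube["frames"])
               if not 0 <= s + dt < length)

def collision_cost(a, sa, b, sb):
    """Pixels shared by tubes a and b on frames where both are visible."""
    cost = 0
    for dt, px in enumerate(a["frames"]):
        j = sa + dt - sb  # matching frame index inside tube b
        if 0 <= j < len(b["frames"]):
            cost += len(px & b["frames"][j])
    return cost

def temporal_cost(a, sa, b, sb):
    """1 if the synopsis reverses the original order of the two tubes."""
    return 1 if (a["start"] - b["start"]) * (sa - sb) < 0 else 0

def energy(tubes, shifts, length):
    e = sum(activity_cost(t, s, length) for t, s in zip(tubes, shifts))
    for i, j in combinations(range(len(tubes)), 2):
        e += temporal_cost(tubes[i], shifts[i], tubes[j], shifts[j])
        e += collision_cost(tubes[i], shifts[i], tubes[j], shifts[j])
    return e

def pack(tubes, length):
    """Greedy descent: start with all tubes at 0, move one tube at a time."""
    shifts = [0] * len(tubes)  # initial state from the slide
    best = energy(tubes, shifts, length)
    improved = True
    while improved:
        improved = False
        for i in range(len(tubes)):
            for s in range(length):
                candidate = shifts[:i] + [s] + shifts[i + 1:]  # neighboring state
                e = energy(tubes, candidate, length)
                if e < best:
                    shifts, best, improved = candidate, e, True
    return shifts, best

# Two objects crossing the same pixel; the first listed one occurred later.
late = {"start": 50, "frames": [{(0, 0)}, {(0, 0)}]}
early = {"start": 10, "frames": [{(0, 0)}, {(0, 0)}]}
shifts, final_energy = pack([late, early], length=4)
```

Greedy descent only ever accepts improving moves, so it can stall in a local minimum; a stochastic search over the same neighborhood is a common alternative.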

Page 58: Video Synopsis

Stitching the Synopsis

• Challenge : Different lighting for objects and background

• Assumption : Extracted tubes are surrounded with background pixels

• Our stitching method: a modification of Poisson Editing that adds a weight for the object to keep its original color

Page 59: Video Synopsis

Stitching the Synopsis

• Challenge : objects stitched on time lapse background with possibly different lighting condition (for example : day / night)

• Assumption : no accurate segmentation. Tubes are extracted surrounded with background pixels

• Our stitching method: a modification of Poisson editing that adds a weight for the object to keep its original color

Page 60: Video Synopsis

Stitching the Synopsis

Page 61: Video Synopsis

Stitching the Synopsis

Page 62: Video Synopsis

Webcam in Parking Lot

Typical Webcam Stream (24 hours)

Webcam Synopsis: 20 Seconds

Page 63: Video Synopsis

Video Indexing

Webcam Synopsis: 20 Seconds

Link from the synopsis back to the original video context

synopsis can be used for video indexing

Page 65: Video Synopsis

Link from the synopsis back to the original video context

Video Indexing

Hotspot on Tracked Objects
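Linking a hotspot back to the original video amounts to a lookup from a clicked synopsis position to the tube shown there, followed by undoing that tube's time shift. The sketch below assumes each tube stores one bounding box per frame together with its original and synopsis start frames; the record layout and the name `locate_in_original` are hypothetical.

```python
def locate_in_original(tubes, x, y, t_synopsis):
    """Map a click at (x, y) on synopsis frame t_synopsis to (tube id, original frame).

    Each tube has "boxes": a list of (x0, y0, x1, y1), one per frame,
    plus "synopsis_start" and "original_start" frame numbers.
    Returns None if no tracked object is under the click.
    """
    for tube in tubes:
        dt = t_synopsis - tube["synopsis_start"]
        if 0 <= dt < len(tube["boxes"]):
            x0, y0, x1, y1 = tube["boxes"][dt]
            if x0 <= x <= x1 and y0 <= y <= y1:
                # Undo the time shift to recover the original frame number.
                return tube["id"], tube["original_start"] + dt
    return None

tubes = [
    {"id": "car-1", "original_start": 54000, "synopsis_start": 10,
     "boxes": [(0, 0, 9, 9), (5, 0, 14, 9), (10, 0, 19, 9)]},
    {"id": "person-7", "original_start": 120000, "synopsis_start": 11,
     "boxes": [(50, 20, 55, 40), (51, 20, 56, 40)]},
]
hit = locate_in_original(tubes, x=12, y=4, t_synopsis=11)
miss = locate_in_original(tubes, x=90, y=90, t_synopsis=11)
```

A player could then seek the original recording to the returned frame to show the object in its original context.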

Page 67: Video Synopsis

Who soiled my lawn?

Unexpected Applications

2 hours 20 seconds

Page 68: Video Synopsis

Examples

Page 69: Video Synopsis
Page 70: Video Synopsis

Video Synopsis Should be More Organized

Page 71: Video Synopsis

Clustered Synopsis

Faster and more accurate browsing

[Figure: two clusters, cars and people]

Example: Cluster into 2 clusters based on shape

Continue Examining the ‘Car’ cluster

Page 72: Video Synopsis

Clustering by Motion of ‘Cars’ Class

Synopsis now useful in crowded scenes

[Figure: four motion clusters: Exit, Enter, Up Hill, Right]

Page 73: Video Synopsis

Appearance (Shape) Distance Between Objects

Symmetric Average Nearest Neighbor distance between SIFT descriptors

$$Sd_{ij} = \frac{1}{2N} \left( \sum_k \left| s_k^i - \hat{s}_k^j \right| + \sum_k \left| s_k^j - \hat{s}_k^i \right| \right)$$

• $s_k^i$: the k-th SIFT descriptor in tube i

• $\hat{s}_k^j$: the SIFT descriptor in tube j closest to $s_k^i$

O. Boiman, E. Shechtman, and M. Irani. In Defense of Nearest-Neighbor Based Image Classification. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2008.
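A symmetric average nearest-neighbor distance between two descriptor sets can be sketched in a few lines. The sketch assumes Euclidean distance between descriptors, brute-force nearest-neighbor search, and the same number N of descriptors in both tubes; short tuples stand in for real 128-dimensional SIFT vectors. The function names are hypothetical.

```python
import math

def nearest_distance(descriptor, others):
    """Euclidean distance from one descriptor to its closest match in `others`."""
    return min(math.dist(descriptor, o) for o in others)

def appearance_distance(tube_i, tube_j):
    """Symmetric average nearest-neighbor distance between two descriptor sets."""
    n = len(tube_i)
    if len(tube_j) != n:
        raise ValueError("this sketch assumes the same number of descriptors per tube")
    i_to_j = sum(nearest_distance(d, tube_j) for d in tube_i)
    j_to_i = sum(nearest_distance(d, tube_i) for d in tube_j)
    return (i_to_j + j_to_i) / (2 * n)

# Toy 2-D "descriptors".
tube_i = [(0, 0), (1, 0)]
tube_j = [(0, 0), (1, 3)]
sd = appearance_distance(tube_i, tube_j)
```

Averaging both directions makes the measure symmetric, so the resulting matrix of pairwise distances can be fed directly to a clustering step.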

Page 74: Video Synopsis

Spectral Clustering by Appearance

Cluster 1 Cluster 2

Cluster 3 Cluster 4

Page 75: Video Synopsis

• More Classes : Easy to Remove False Alarm Classes

Gate Trees

Spectral Clustering by Appearance

Page 76: Video Synopsis

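Object Distance: Motion

Trajectory Similarity

$$Md_{ij}(k) = \frac{w(k)}{T_{ij}(k)} \, Sep_{ij}(k)$$

$$Sep_{ij}(k) = \sum_{t \in T_{ij}(k)} \left( (x_t^i - x_{t+k}^j)^2 + (y_t^i - y_{t+k}^j)^2 \right)$$

• $w(k)$: weight encouraging long temporal overlap

• $T_{ij}(k)$: common time of tubes i and j under temporal shift k

• $Sep_{ij}(k)$: space-time trajectory distance

– Computing minimum area between trajectories over all temporal shifts

– Efficient computation using NN and KD trees

[Figure: two trajectories in the x-t plane, offset by a temporal shift k]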
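The trajectory distance can be sketched by brute force over temporal shifts. Trajectories are assumed to be dictionaries mapping a frame number to an (x, y) position. The weight `w(k)` is not defined in the transcript, so the ratio of the shorter trajectory length to the overlap length is used here purely as a hypothetical stand-in, and the exhaustive loop over shifts replaces the NN / KD-tree acceleration mentioned above.

```python
import math

def separation(traj_i, traj_j, k):
    """Sum of squared position differences over the common time under shift k.

    Returns (separation, number of common frames).
    """
    common = [t for t in traj_i if t + k in traj_j]
    sep = sum((traj_i[t][0] - traj_j[t + k][0]) ** 2 +
              (traj_i[t][1] - traj_j[t + k][1]) ** 2 for t in common)
    return sep, len(common)

def motion_distance(traj_i, traj_j, max_shift):
    """Minimum weighted, overlap-normalized separation over all shifts."""
    best = math.inf
    for k in range(-max_shift, max_shift + 1):
        sep, overlap = separation(traj_i, traj_j, k)
        if overlap == 0:
            continue  # no common time at this shift
        # Hypothetical weight: penalize short overlaps.
        w = min(len(traj_i), len(traj_j)) / overlap
        best = min(best, w * sep / overlap)
    return best

# Same path driven at different times, and a parallel path one pixel away.
traj_a = {0: (0, 0), 1: (1, 0), 2: (2, 0)}
traj_b = {10: (0, 0), 11: (1, 0), 12: (2, 0)}
traj_c = {10: (0, 1), 11: (1, 1), 12: (2, 1)}
same_path = motion_distance(traj_a, traj_b, max_shift=12)
parallel = motion_distance(traj_a, traj_c, max_shift=12)
```

Minimizing over shifts makes the distance independent of when each object appeared, which is what allows objects from different times to be clustered by motion.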

Page 77: Video Synopsis

Spectral Clustering by Motion

‘Cars’ Class

[Figure: four motion clusters: Exit, Enter, Up Hill, Right]

Page 78: Video Synopsis

Creating Video Synopsis

• Goals
– Video Synopsis Having Shortest Duration
– Minimal Collision Between Objects

• Approach
– Displaying clustered objects together
– Objects packed in space-time like sardines

Page 79: Video Synopsis

Packing Cost Example

• Packing cars on the top road

Affinity Matrix after Clustering

Arranged Cluster 1 Arranged Cluster 2

Page 80: Video Synopsis

Combining Two Clusters

Low Collision Cost Between Classes

High Collision Cost Between Classes

Page 81: Video Synopsis

An Important Application:

Display Results of Video Analytics

• Display the hundreds of “Blue Cars”

• Display thousands of people going left

• Good for verification of algorithm as well as for deployment

Page 82: Video Synopsis

Two Clusters

Cars

People

Camera in St. Petersburg

• Detect specific events

• Discover activity patterns

Page 83: Video Synopsis

Cars

People

Two Clusters

Camera in China

Page 84: Video Synopsis

Automatically Generated Clusters

Using Only Shape & Motion

[Figure: six clusters: People Left, People Right, Cars Left, Cars Right, Cars Parking, People Misc.]

Page 85: Video Synopsis
Page 86: Video Synopsis