
Object Detection Based on Human Visual Perception

Mahfuzul Haque, Manzur Murshed, Manoranjan Paul

Object Detection for Real-time Surveillance Applications

[Figure: input video frame and output detection mask]

Object Detection: Applications

• Intelligent visual surveillance
  – Event detection
  – Tracking
  – Behaviour analysis
  – Activity recognition
• Remote sensing
• Traffic monitoring
• Context-aware applications

[Pipeline: Object Detection → Feature Extraction → Behaviour Analysis]

Object Detection: How?

Basic Background Subtraction (BBS)

Current frame − Background = Detected object

Challenges with BBS

• Illumination variation
• Local background motion
• Camera displacement
• Shadow and reflection

Because of these, plain BBS is not a practical approach.
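For concreteness, here is the BBS step above as a minimal Python/NumPy sketch; the fixed background frame and the threshold value are illustrative assumptions, not values from the talk:

```python
# Minimal basic background subtraction (BBS) sketch.
# Assumption: a fixed, pre-captured background frame and a hand-picked
# global threshold; neither comes from the talk.
import numpy as np

THRESHOLD = 30  # absolute intensity difference treated as "object"

def bbs(current: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Return a binary mask: 1 where the current frame differs from the
    static background by more than THRESHOLD."""
    diff = np.abs(current.astype(np.int16) - background.astype(np.int16))
    return (diff > THRESHOLD).astype(np.uint8)
```

Each listed challenge defeats this scheme: one static frame and one global threshold cannot absorb illumination change, swaying leaves, camera shake, or shadows. That is what a background model is for.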

Background Modelling

Current frame + Background Model → Detected object

Typical Surveillance Setup

Surveillance video stream → frame-size reduction → frame-rate reduction → Object Detection (driven by a background model) → Feature Extraction → Object Tracking → Behaviour Analysis

State-of-the-art

Terminology in the literature:
• Pixel-based approaches
• Region and texture-based approaches
• Shape-based approaches
• Predictive modelling
• Model initialization approaches
• Nonparametric background modelling
• Stationary foreground detection
• Environment modelling
• Background subtraction / background modelling / background maintenance
• Foreground detection / moving foreground detection
• Object detection / moving object detection

Representative techniques:

Hierarchical (Zhong et al., ICPR, 2008)

Type-2 Fuzzy MOG (Baf et al., LNCS, 2008)

Cascaded Classifiers (Chen et al., WMVS, 2007)

Gaussian Mixture Model with SVM (Zhang et al., THS, 2007)

Generalized Gaussian Mixture Model (Allili et al., CRV, 2007)

Bayesian Formulation (Lee, PAMI, 2005)

Gaussian Mixture Model (Stauffer et al., PAMI, 2000)

Gaussian Mixture Model (Stauffer et al., CVPR, 1999)

Single Gaussian Model (Wren et al., PAMI, 1997)

Pixel-based Background Modelling

[Figure: example scenes and their per-pixel intensity distributions
• sky, cloud, leaf, moving person
• road, shadow, moving car
• floor, shadow, walking people]

Each pixel's intensity x is modelled by a probability distribution P(x); every recurring cause (e.g., sky, cloud, person, leaf) contributes one Gaussian component with its own mean µ and variance σ² along the pixel-intensity axis.

Background Modelling

[Figure: the same pixel across Frame 1 … Frame N sees road, shadow, and a passing car]

Current frame + Background Model → Detected object. But how do we identify which models represent background?

Component  Weight (share of data)  Mean  Variance  Observed as
1          ω₁ (65%)                µ₁    σ₁²       road
2          ω₂ (20%)                µ₂    σ₂²       shadow
3          ω₃ (15%)                µ₃    σ₃²       car

Models are ordered by ω/σ; the first models whose cumulative weight exceeds the background data proportion T, a piece of context information, are treated as background.
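A compact per-pixel sketch of this scheme, in the spirit of Stauffer and Grimson's MOG (CVPR 1999); the parameter values and helper names here are illustrative assumptions, not the talk's:

```python
# Per-pixel mixture-of-Gaussians background model, Stauffer-Grimson style.
# K_MATCH, T, and ALPHA are illustrative values, not from the talk.
from dataclasses import dataclass
from typing import List

K_MATCH = 2.5   # match threshold in standard deviations
T = 0.6         # background data proportion (context information)
ALPHA = 0.01    # learning rate

@dataclass
class Gaussian:
    weight: float
    mean: float
    var: float

def update_pixel(models: List[Gaussian], x: float) -> bool:
    """Update one pixel's mixture with observation x; return True if x is background."""
    # Try to match x against the models in omega/sigma order.
    models.sort(key=lambda g: g.weight / g.var ** 0.5, reverse=True)
    matched = None
    for g in models:
        if abs(x - g.mean) <= K_MATCH * g.var ** 0.5:
            matched = g
            break
    if matched is None:
        # Replace the least probable model with a new one centred on x.
        models[-1] = Gaussian(weight=0.05, mean=x, var=30.0 ** 2)
    else:
        rho = ALPHA  # simplification of alpha * P(x | g)
        matched.mean = (1 - rho) * matched.mean + rho * x
        matched.var = (1 - rho) * matched.var + rho * (x - matched.mean) ** 2
    # Weight update and renormalisation.
    for g in models:
        g.weight = (1 - ALPHA) * g.weight + (ALPHA if g is matched else 0.0)
    total = sum(g.weight for g in models)
    for g in models:
        g.weight /= total
    # Background = first models (by omega/sigma) whose cumulative weight exceeds T.
    models.sort(key=lambda g: g.weight / g.var ** 0.5, reverse=True)
    cum, background = 0.0, []
    for g in models:
        cum += g.weight
        background.append(g)
        if cum > T:
            break
    return matched in background
```

Note the two knobs the talk goes on to criticise: T decides how much of the mixture counts as background, and ALPHA controls how fast weights, means, and variances adapt.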

Typical Surveillance Setup (revisited)

The background model inside the object-detection stage is governed by two parameters: the background data proportion (T) and the model adaptability, i.e., the learning rate (α).

Model adaptability: learning rate (α)

Scenario 1

Test sequence: PETS2001_D1TeC2
[Result grid: first frame, test frame, ground truth, and detections for α ∈ {0.1, 0.01, 0.001} × T ∈ {0.4, 0.6, 0.8}]
(α = learning rate; T = background data proportion)

Scenario 2

Test sequence: VSSN06_camera1
[Result grid: first frame, test frame, ground truth, and detections for α ∈ {0.1, 0.01, 0.001} × T ∈ {0.4, 0.6, 0.8}]
(α = learning rate; T = background data proportion)

Scenario 3

Test sequence: CAVIAR_EnterExitCrossingPaths2cor
[Result grid: first frame, test frame, ground truth, and detections for α ∈ {0.1, 0.01, 0.001} × T ∈ {0.4, 0.6, 0.8}]
(α = learning rate; T = background data proportion)

Observation Summary

• A slow learning rate (α) is not preferable (ghost or back-out effects).
• Simple post-processing does not improve detection quality at a fast learning rate (α).
• The context behaviour must be known in advance.

How can we detect abnormal situations if the context must be known in advance?

“Hey, a mob will be approaching soon, and the background will be visible for only 10% of that duration. Please set T = 0.1.”

Research Goals

• A new object detection technique for unconstrained environments, i.e., no context-dependent information (no T)
• Better detection quality at a fast learning rate (α)
• Better stability across learning rates (α)

The New Technique

• Pixel-based
• MOG for environment modelling
• Incorporates human perceptual characteristics in the underlying background model:
  – Model reference point
  – Model extent

Model Reference Point

[Figure: mixture components (ωₖ, µₖ, σₖ²) for road (65%), shadow (20%), car (15%), ordered by ω/σ]

New: instead of the Gaussian mean µ, each model keeps a reference point b, the most recent observation matched to that model, as the point against which an incoming pixel value x is compared.

Advantages of the reference point b over the mean µ:

• Higher agility than using the mean
• Not tied to the learning rate
• Realistic: an actual intensity value
• No artificial value due to the mean

[Figure: over time, the mean µ lags behind the reference point b]
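A toy illustration of the agility claim, with made-up numbers: after a step change in the scene, the running mean µ creeps toward the new intensity at a rate set by α, while b is simply the latest matched observation.

```python
# Toy demo (made-up values): the mean mu lags after a step change;
# the reference point b tracks it immediately.
alpha, mu, b = 0.01, 100.0, 100.0
for x in [100.0] * 20 + [140.0] * 20:  # scene steps from 100 to 140
    mu = (1 - alpha) * mu + alpha * x  # running-mean update, tied to alpha
    b = x                              # b is just the newest observation
print(round(mu, 1), b)                 # mu ~= 107.3, b = 140.0
```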

[Figure: each component now carries (ωₖ, µₖ, σₖ², bₖ) for road (65%), shadow (20%), car (15%), still ordered by ω/σ, with each distribution anchored at its reference point bₖ]

Model Extent

[Figure: with the mean µ, a pixel matches a model if it lies within an extent x = Kσ of µ; with the reference point b, the extent x = ?]

[Figure: mixture components (ωₖ, µₖ, σₖ², bₖ) for road/shadow/car as before, ordered by ω/σ]

Problems with the extent x = Kσ around µ:

• Depends on the model's standard deviation
• The standard deviation is in turn tied to the learning rate
• Low detection sensitivity during the initial age of a model
• High detection sensitivity in stationary regions
• Adverse consequences:
  – Redundant models introduced
  – Precious models dropped

Model Extent

How is the extent x related to the reference point b?

[Figure: a low extent gives a tight envelope around b; a high extent a loose one]

Human Visual Perception

Images are distorted as they pass through acquisition, compression, processing, transmission, and reproduction.

[Figure: a reference image and distorted outputs from System 1 and System 2]

How is distortion measured?

PSNR = 20 log₁₀(255 / RMSE)

If System 1 achieves x dB and System 2 achieves y dB, a difference of |x − y| < 0.5 dB is not perceivable by the human visual system.
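The slide's formula in code; a straightforward sketch, assuming 8-bit images:

```python
# PSNR = 20 * log10(255 / RMSE), for 8-bit images.
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray) -> float:
    err = reference.astype(np.float64) - distorted.astype(np.float64)
    rmse = np.sqrt(np.mean(err ** 2))
    return float("inf") if rmse == 0 else 20.0 * np.log10(255.0 / rmse)
```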

Human Visual Perception: Our Problem

The same setup, mapped onto our problem: the model's reference point b plays the role of the reference image, and the incoming pixel value x the role of a distorted image. How is the range (the model extent) determined?

x dB = 20 log₁₀(255 / |b − x|)
y dB = 20 log₁₀(255 / (|b − x| + 1))

|x − y| < 0.5 dB ⇒ not perceivable by the human visual system

[Figure: the extent x = ? around the reference point b]
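One way to read these formulas: the extent is the smallest deviation d = |b − x| at which one further intensity level shifts the measure by less than the perceptual threshold. A sketch under that assumption (the function name and this reading are mine, not the talk's):

```python
import math

def perceptual_extent(threshold_db: float = 0.5) -> int:
    """Smallest deviation d such that adding one more intensity level
    changes the PSNR-style measure by less than threshold_db.
    Based on my reconstruction of the slide's formulas, not a quoted rule."""
    d = 1
    while 20.0 * math.log10((d + 1) / d) >= threshold_db:
        d += 1
    return d

# perceptual_extent(0.5) -> 17; perceptual_extent(2.0) -> 4
```

Under this reading, a higher dB threshold yields a tighter extent, which is the trade-off probed on the next slide.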

Are we designing an artificial human eye?

• It's a computer/machine vision application.
• Isn't 0.5 dB too sensitive to envelope shadow, reflection, and noise?

Impact of Human Perceptual Threshold

[Result grid: first frame, test frame, ground truth, and detections at thresholds 0.5 dB, 0.75 dB, 1.0 dB, and 2.0 dB]

Summary of the Technique

• Pixel-based
• Environment modelling: MOG
• New variable in MOG: the most recent observation (b)
• Detection phase:
  – Reference: the most recent observation b, not the Gaussian mean
  – Model extent: based on the human-perceivable threshold, not the Gaussian variance
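Putting the pieces together, a sketch of the detection decision as I read it: compare the pixel against each background model's most recent observation b, within the perceptually derived extent. Function and variable names are mine, not the talk's.

```python
# Sketch of the detection phase: a pixel is background if it lies within
# the perceptual extent of some background model's reference point b.
from typing import List

def is_background(x: float, refs: List[float], extent: float) -> bool:
    """refs holds the reference points b of the models selected as
    background; extent comes from perceptual_extent() above."""
    return any(abs(x - b) <= extent for b in refs)

# Example: extent = perceptual_extent(0.5); is_background(x, [b1, b2], extent)
```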

Experiments

Test sequences: 50 in total, from 8 different sources

Source      Sequences
PETS        9
Wallflower  7
UCF         7
IBM         11
CAVIAR      7
VSSN06      7
Other       2

Scenario distribution: indoor; outdoor; multimodal; shadow and reflection; low background-foreground contrast

Evaluation: qualitative and quantitative
Baselines: Stauffer and Grimson (PAMI, 2000); Lee (PAMI, 2005)
Error measures: false positive (FP), false negative (FN), false classification

Results

[Figures: visual comparison; quantitative analysis; ROC curves for S&G, Lee, and the proposed technique; PDR curves (S&G vs. proposed at α = 0.1; proposed alone; S&G at T = 0.6); instability over all sequences; performance matrix over all sequences]

Research Summary

• A new object detection technique
• Context independent
• Higher stability
• Higher agility (fast learning rate)
• Future directions:
  – Multimodal scenarios
  – Even higher detection quality via a multilevel approach

Q&A

Mahfuzul.Haque@infotech.monash.edu.au
http://www.mahfuzulhaque.com

Thanks!

Image Source

http://www.inkycircus.com/photos/uncategorized/2007/04/25/eye.jpg
