CHAPTER 2

LITERATURE SURVEY

2.1 INTRODUCTION

When a target is tracked in a realistic physical environment, the sensor information about the target is updated with incorrect data arising from thermal noise, false alarms, clutter, occlusions and shadows. Consequently, tracking performance degrades, and the resulting tracking errors are often far worse than those predicted by the tracking filter’s error covariance matrix. The proposed research work provides solutions for efficient tracking of targets in a radar sensor network, a wireless sensor network and a camera sensor network. This chapter gives an overview of existing techniques used for target tracking in various sensor networks. The taxonomy of single and multiple target tracking techniques is presented, and the requirements and challenges in the design of target tracking algorithms are also discussed. Section 2.2 covers tracking of targets in a radar sensor network, Section 2.3 covers tracking of targets in a Wireless Sensor Network (WSN), and Section 2.4 covers tracking of targets in a camera sensor network.

2.2 TRACKING OF TARGETS UNDER RADAR SENSOR NETWORK

Multiple Target Tracking (MTT) is an important topic in radar surveillance, since many applications are based on it: remote sensing and observing systems, ground-based target recognition, detection and tracking, vehicle speed detection and highway safety, target tracking in Air Traffic Control (ATC), aircraft safety, electronic warfare, and ship safety and navigation.

2.2.1 Existing Algorithms

Data association is the basic problem of MTT. Various methods for multiple target tracking that have been analyzed in the literature are described below.

Figure 2.1 shows the classification of the literature on target tracking in radar sensor networks.

The existing literature on target tracking in radar sensor networks can be broadly classified into maneuvering and non-maneuvering target tracking, for both single and multiple targets. Various methodologies, such as data association, position estimation and classification techniques, are available for tracking multiple targets. This thesis mainly focuses on data association and position estimation.

In MTT, the data (location information represented in spherical coordinates) produced by the same source are identified and partitioned into sets of tracks. MTT also determines the number of targets and parameters such as position, velocity and acceleration for each track (Blackman 1986). Li and Jilkov (2000) presented a methodology for identifying new targets, creating new plots and updating existing tracks on each scan. Observations that are not assigned to existing tracks are used to form new tentative tracks. Once a tentative track is formed from the observations, it is updated on successive scans. The gate size and the time allowed for confirming an observation can be chosen as functions of the confidence in the validity of the original observation. A track that is not updated on successive scans has to be deleted.
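The track life cycle described above (tentative, confirmed, deleted) can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of any cited work: the class name `Track`, the function names and the thresholds `confirm_hits` and `max_misses` are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical track record used only for this illustration."""
    track_id: int
    hits: int = 1             # scans in which the track received an observation
    misses: int = 0           # consecutive scans without an observation
    status: str = "tentative"


def update_track(track, observed, confirm_hits=3, max_misses=2):
    """Advance one scan: count the hit or miss, then confirm or delete.

    confirm_hits and max_misses are assumed thresholds; in practice they
    would depend on the confidence in the original observation.
    """
    if track.status == "deleted":
        return track
    if observed:
        track.hits += 1
        track.misses = 0
        if track.status == "tentative" and track.hits >= confirm_hits:
            track.status = "confirmed"
    else:
        track.misses += 1
        if track.misses >= max_misses:
            track.status = "deleted"
    return track


def run_scans(track, detections):
    """Apply a sequence of per-scan detection flags to a track."""
    for observed in detections:
        update_track(track, observed)
    return track.status
```

For example, `run_scans(Track(1), [True, True])` confirms the track after its third observation, while two consecutive missed scans delete it.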

Bar-Shalom and Fortmann (1988) explained that when tracking is performed in an environment that contains clutter and/or more than one object, the measurements need to be associated with the correct tracks. Not all measurements convey information about the tracked object; those that do not are called clutter. Determining which measurements are informative and which are not is usually referred to as data association. The result of this process is a set of tracks for the targets.

Blackman (1986) and Blackman et al (1993) described multiple target tracking for radar applications and the association of sensor data with individual targets. When multiple observations are received by the tracking system, each incoming observation report has to be assigned to a specific target track. A popular mechanism for classifying reports is the “nearest-neighbor rule” (Liggins et al 2009). The idea of the rule is to estimate each target's position at the time of a new position report, and then assign that report to the target nearest to that estimate.
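A minimal sketch of the nearest-neighbor rule is given below. It assumes 2D Cartesian positions and plain Euclidean distance; a practical tracker would normally gate the reports first and use a statistical distance. The function name and the dictionary layout are assumptions made for this example.

```python
import math


def nearest_neighbor_assign(report, predicted_positions):
    """Return the id of the track whose predicted position is closest.

    report: (x, y) position of the new report.
    predicted_positions: dict mapping a track id to its predicted (x, y)
    position at the time of the report.
    """
    if not predicted_positions:
        return None
    return min(
        predicted_positions,
        key=lambda tid: math.dist(report, predicted_positions[tid]),
    )
```

With predicted positions `{"A": (0.0, 0.0), "B": (10.0, 0.0)}`, a report at `(7.0, 1.0)` is assigned to track `"B"`.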

Bar-Shalom and Tse (1975) proposed an all-neighbor Probabilistic Data Association (PDA) approach to correlate sensor data under the assumption of a single target. The PDA method computes the posterior probability of each candidate measurement found in a validation gate, assuming that only one real target is present and that all other measurements are Poisson-distributed clutter. PDA and its extension, Joint PDA (JPDA) (Blackman 1986), are used for tracking single and multiple targets respectively. In the JPDA method, joint posterior probabilities are computed for multiple targets in Poisson clutter.
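The posterior association probabilities used by PDA can be illustrated with a commonly used parametric form for Poisson clutter. The sketch below is a simplified illustration, not the exact formulation of the cited works: the function name, argument layout and default detection and gate probabilities are assumptions, and the squared Mahalanobis distances of the gated measurements are taken as given inputs.

```python
import math


def pda_weights(d2_list, det_s, clutter_density, dim=2, p_detect=0.9, p_gate=0.99):
    """Association probabilities for a single target in Poisson clutter.

    d2_list: squared Mahalanobis distances of the gated measurements.
    det_s: determinant of the innovation covariance S.
    clutter_density: spatial density of the Poisson clutter (assumed > 0).
    dim: dimension of the measurement vector.
    Returns (beta_0, [beta_1, ..., beta_m]); beta_0 is the probability
    that none of the gated measurements originated from the target.
    """
    # Unnormalized Gaussian likelihood term of each gated measurement.
    e = [math.exp(-0.5 * d2) for d2 in d2_list]
    # Term accounting for the "all measurements are clutter" event.
    b = (
        clutter_density
        * math.sqrt((2.0 * math.pi) ** dim * det_s)
        * (1.0 - p_detect * p_gate)
        / p_detect
    )
    total = b + sum(e)
    return b / total, [e_i / total for e_i in e]
```

The weights sum to one, and a measurement closer to the predicted position (smaller distance) receives a larger weight than a more distant one.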

However, these methods are computationally heavy and have no

explicit provision for track initiation. Although many association (Smith and

Sameer 2006) and tracking algorithms (Liggins et al 2009) have been

suggested, it is still difficult to generate and maintain tracks in practice

(Musicki 2007).

Fortmann et al (1983) proposed a new JPDA algorithm for multiple targets in clutter. This was a target-oriented approach, in the sense that a set of established targets is used to form gates in the measurement space and to compute posterior probabilities.

Roecker et al (1995) proposed a multiple-scan, or n-back-scan, JPDA algorithm that addresses measurement-to-track data association in an environment with multiple targets and clutter. It uses multiple scans of measurements along with the present target information to produce better weights for data association.

In MTT, there are a number of methods for classifying the observed data into tracks. Multiple Hypothesis Tracking (MHT) uses a track-splitting technique for accurate decision making from the observed data (Musicki and Suvorova 2008). Under the MHT scheme, the tracking system does not have to commit immediately or irrevocably to a single assignment of each report. If a report is highly correlated with more than one track, an updated copy of each track can be created; subsequent reports can then be used to determine which assignment is correct. As more reports come in, the track associated with the correct assignment will rapidly converge on the true target trajectory, whereas the falsely updated tracks are less likely to be correlated with subsequent reports.

The n-backscan MHT approach requires information collected from the previous n scans to make a decision. Hence it needs more memory to maintain numerous track hypotheses (Feo et al 1997). The main drawback of n-backscan MHT is the exponential increase in computational complexity and memory requirement. Bar-Shalom et al (2007) discussed several theoretical issues relating to the score function for the measurement-to-track association/assignment decision in the track-oriented version of MHT. The score function is the ratio of the Probability Density Function (PDF) of a measurement having originated from a track to the PDF of the measurement having a different origin, and it is called the likelihood ratio.
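The likelihood ratio score can be illustrated for a scalar measurement. The sketch below assumes that the track-origin PDF is Gaussian in the innovation and that the other-origin PDF is a constant (uniform) density; both modelling choices and the function name are assumptions made for this example rather than details taken from Bar-Shalom et al (2007).

```python
import math


def log_likelihood_ratio(innovation, innovation_var, other_origin_density):
    """Log of p(z | from track) / p(z | other origin) for a scalar measurement.

    The track-origin PDF is assumed Gaussian in the innovation; the
    other-origin PDF is assumed to be a constant (uniform) density.
    """
    gauss = math.exp(-0.5 * innovation**2 / innovation_var) / math.sqrt(
        2.0 * math.pi * innovation_var
    )
    return math.log(gauss / other_origin_density)
```

A positive score favors assigning the measurement to the track, and the score decreases as the innovation grows.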

When the system is linear with additive Gaussian noise, there exists an analytical solution to the Bayesian time and measurement update equations. The solution is given by the Kalman Filter (KF). Many books describing different aspects of the KF exist (Simon 2001). Since the system is linear and Gaussian, the posterior distribution remains Gaussian after each update and can therefore be described completely by its first two moments (mean and covariance). The update equations thus consist of a mean update and a covariance update. The original KF (Kalman 1960, Kalman 1961) was defined in continuous time, but a discrete version was soon derived. Much of the classical theory is described in Anderson and Moore (1979). In discretized linearization (Gustafsson 2000), the non-linear continuous-time system is first linearized and then discretized. Anderson and Moore (1979) and Bar-Shalom and Li (1993) have discussed the EKF for discrete-time systems.
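The mean and covariance updates of the KF can be illustrated with a scalar state. The sketch below assumes the linear Gaussian model x_k = a x_{k-1} + w and z_k = h x_k + v; the function names and default noise variances are assumptions made for this example.

```python
def kf_predict(mean, var, a=1.0, q=0.01):
    """Time update for the scalar model x_k = a * x_{k-1} + w, var(w) = q."""
    return a * mean, a * a * var + q


def kf_update(mean, var, z, h=1.0, r=1.0):
    """Measurement update for z = h * x + v, var(v) = r."""
    innovation = z - h * mean
    s = h * h * var + r          # innovation variance
    gain = var * h / s           # Kalman gain
    return mean + gain * innovation, (1.0 - gain * h) * var
```

Starting from mean 0 and variance 1, a measurement z = 2 with r = 1 gives a gain of 0.5, so the updated mean is 1 and the updated variance is 0.5.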

Farina et al (2002) compared the estimation performance (error mean and standard deviation; consistency test) of nonlinear filters such as the Extended Kalman Filter (EKF), statistical linearization, particle filtering and the Unscented Kalman Filter (UKF).

Singer et al (1974) proposed a new optimal filter for target tracking in dense multitarget environments. The sensitivity of the KF's tracking accuracy to data rate, maneuver magnitude, maneuver correlation coefficient and single-look measurement accuracy is discussed.

The problems and issues involved in multitarget ocean tracking using a heterogeneous set of passive acoustic measurements were outlined by Fortmann and Baron (1979). They also described an approach to solving the data association and maneuver detection problems. Their method uses an EKF with both geographic and acoustic states and handles measurement vectors such as bearing/frequency and delay/Doppler difference.

To resolve the problem of track-to-track association in a distributed multisensor situation, He and Zhang (2006) presented independent and dependent sequential track correlation algorithms based on those of Singer (1970) and Bar-Shalom (1981). In that paper, restricted-memory and attenuation-memory track correlation algorithms and sequential classic assignment rules are developed from the sequential track correlation algorithm. In environments with dense targets, interfering noise and crossing tracks, the correlation performance of the sequential algorithms is much better than that of the algorithms of Singer (1970) and Bar-Shalom (1981), at the cost of slightly more computation and memory. However, the computational complexity of these algorithms increases as more environmental parameters are taken into consideration, and their performance degrades as the number of targets increases.

Keuk (1998) derived an optimal combinational method which can be used under different operating conditions. The method, which is related to MHT, uses a sequential likelihood ratio test and benefits from processing signal strength information. Multiscan data association can significantly enhance tracking performance (Battistelli et al 2011) in critical radar surveillance scenarios involving multiple targets, low detection probability, high false alarm probability, evasive target maneuvers and finite radar resolution. Unfortunately, the multiscan data association approach is affected by high dimensionality, which hinders its real-time application to tracking problems with short scan periods, a large number of scans in the association logic and/or many measurements per scan. To address this, Battistelli et al (2011) formulated multiscan association as a multi-commodity or single-commodity flow optimization problem, which allows a relaxation of the association problem that provides close-to-optimal association performance.

Li and Bar-Shalom (1996) presented the conditional PDF of the Nearest Neighbor (NN) measurement under the events of correct and incorrect data association, the probabilities of these data association events, and the propagation of the matrix mean square error conditioned on these events. The development of this error propagation recursion relies heavily on these conditional PDFs and probabilities.

Feo et al (1997) described Interacting Multiple Model Joint Probabilistic Data Association (IMMJPDA) and MHT for improved tracking performance. They also compared the performance of three tracking algorithms, namely Nearest Neighbour (NN) correlation with a KF, IMMJPDA and MHT, in terms of track maintenance probability and tracking errors. Both IMMJPDA and MHT perform better than NN correlation with a KF.

Roecker and McGillem (1988) compared the state vector fusion method and the measurement fusion method for fusing the tracks of two different sensors, and showed that measurement fusion reduces the covariance of the filtered state vector.

Decision-based techniques for maneuvering target tracking, which appeared after the decision-free adaptive KF techniques based on a single model, have become quite popular and have been studied extensively in the literature (Bar-Shalom and Li 1993, Bar-Shalom et al 2001). In decision-based approaches, the state estimation is based on a target motion model determined by maneuver detection. Therefore, making reliable and timely decisions is the key to satisfactory state estimation with these approaches. Many algorithms and techniques have been developed for the detection of maneuvers. A comprehensive survey of such decision-based approaches is given by Li and Jilkov (2002).

Li et al (1999) proposed a general Multiple Model (MM) estimator with Variable Structure (VSMM), called the Model Group Switching (MGS) algorithm. In this estimator, a set of model groups is used, each representing a collection or cluster of closely related system models. The set of models is made adaptive by switching among these groups to follow possible jumps (across groups) of the system mode, in a way that balances the need for the smallest switching delay against the need for a minimum false switching rate.

Li and Jilkov (2001) surveyed the problems and techniques of tracking maneuvering targets in the absence of measurement-origin uncertainty. Mallick et al (2004) proposed a solution to multi-target tracking problems in clutter with a probability of detection less than unity using track-oriented MHT. They also presented multiple hypothesis distributed tracking algorithms for track initialization, gating, hypothesis generation, track update, computation of track likelihood, formation of global hypotheses, and pruning using the pseudo-measurement formulation.

Ru et al (2009) proposed a technique that gives explicit solutions for Gaussian-mixture prior distributions and can be applied to arbitrary prior distributions through Gaussian-mixture approximations. The approach essentially utilizes prior information about the maneuver accelerations in typical tracking engagements and thus allows improved detection performance compared with traditional maneuver detectors.

Zhu and Li (2003) presented a fusion rule for distributed multi-hypothesis decision systems with communication patterns among the sensors and the fusion center. They also proposed a scheme for generating optimum sensor rules and optimum fusion rules, which reduces computation tremendously compared with the commonly used exhaustive search. In addition, they provided a guideline for assigning sensors to nodes in signal detection networks with a given communication pattern.

Myung (2003) explained Maximum Likelihood Estimation (MLE)

for parameter estimation in statistics and in particular, in non-linear modeling

with non-normal data. Hwang et al (2004) proposed a filter based on JPDA

for target-measurement correlation combined with an identity management

algorithm incorporating suitable local information, when available, in a

manner that decreases the uncertainty, as measured by system entropy.

Sathyan and Sinha (2011) presented a new two-stage algorithm for

multitarget tracking using multiple asynchronous passive sensors. The

proposed algorithm used local bearings-only (mono) tracks for each sensor

and combined these tracks to generate complete kinematic (stereo) tracks in

the Cartesian coordinate frame. Once stereo tracks have been formed, in the

second stage known as the stereo tracking stage, bearing measurements are

directly used to update the stereo tracks. They have also used the assignment

based technique to solve various data association problems that arise due to

measurement origin uncertainty.

Scala and Pulford (2005) have described an algorithm for tracking a

maneuvering target in heavy clutter and/or with a low probability of detection.

They have used a computationally efficient algorithm for multi-scan target

tracking based on the Viterbi algorithm, known as the Viterbi Data

Association (VDA) algorithm. Tugnait (2004) proposed a novel suboptimal

filtering algorithm by applying the basic IMM approach and the JPDA

technique for tracking of two highly maneuvering closely spaced targets.

Jeong and Tugnait (2005) presented a filtering algorithm by

applying basic IMM approach and the PDA technique to a two sensor (radar

and infrared) scheme for tracking a highly maneuvering target in a cluttered

environment. Musicki and Suvorova (2008) have described IMM-integrated

PDA (IMM-IPDA), IMM–Joint IPDA (IMM-JIPDA), and linear multitarget

IMM-IPDA (IMM-LMIPDA) filters for tracking maneuvering targets in

clutter. These algorithms use the IMM approximation for multiple model

target trajectory estimation and the PDA approximation to estimate target

trajectories of individual IMM models in clutter. All algorithms recursively

update the probability of target existence, which has been used for false track

discrimination.

For tracking heavily maneuvering targets, a KF with a single motion model is not sufficient. If the true observation is much farther from the prediction than the nearest false observation, the measurement update will be incorrect and thus the next prediction will be in the wrong direction. To deal with this problem, Bar-Shalom and Fortmann (1988) used multiple motion models.

Maneuvering target tracking is an important problem because target accelerations are generally unknown and structural variations also exist as the target moves into or out of the maneuvering mode. Li and Jilkov (2000, 2001, 2002 and 2003) provided a comprehensive survey of the problems and techniques of tracking maneuvering targets. In 2000, they presented a survey of various mathematical models of target dynamics for maneuvering target tracking, including 2D and 3D maneuver models as well as coordinate-uncoupled generic models. This survey emphasized the underlying ideas and assumptions of the models. Li and Jilkov (2001) surveyed the problems and techniques of tracking maneuvering targets in the absence of measurement-origin uncertainty. Li and Jilkov (2002) surveyed the maneuvering target motion models used for tracking ballistic targets.

An improved IMMJPDA algorithm for tracking multiple maneuvering targets in clutter was analyzed by Mao et al (2006). Musicki et al (2007) implemented a near-optimal algorithm for tracking a single maneuvering target in clutter. The algorithm integrates the target existence paradigm with a multi-scan target state estimation algorithm. The target trajectory estimation calculates the target’s a posteriori PDF based on all possible measurement detection histories and all possible target maneuvering model histories.

Puranik and Tugnait (2007) have presented tracking of multiple

maneuvering targets in the presence of clutter using switching multiple target

motion models. A novel suboptimal filtering algorithm was developed by

applying the basic IMM approach and multiscan-JPDA technique. It showed

significant improvement in target position estimates by the proposed IMM

multiscan-JPDA compared with the results of the single-scan IMM/JPDA

algorithm for closely spaced targets.

Gerasimos (2012) has explained a derivative-free Kalman filtering

approach, which is suitable for state estimation based control of a class of

nonlinear systems. The considered systems are first subject to a linearization

transformation, and next state estimation is performed by applying the

standard Kalman filter to the linearized model. The proposed method provides

estimates of the state vector of the nonlinear system without the need for

derivatives and Jacobians calculation and without using linearization

approximations.

From the literature, it is found that multiple target tracking methods are computationally heavy and have no explicit provision for track initiation. Hence, there is a need for an MTT scheme with lower computational complexity and memory requirement, and with a new correlation logic that has a better response time than the existing tracking schemes.

2.3 TRACKING OF TARGETS IN WIRELESS SENSOR NETWORK

Recently, with the rapid development of wireless communication technologies, wireless sensors have become more popular in military and civilian systems. Civilian applications include air traffic and marine control, navigation and person/object tracking (Culler et al 2004, Zhang and Cao 2004). Since sensor nodes are small and cheap, a large number of sensors can be deployed in the field of interest to retrieve real-world information.

Figure 2.2 shows the classification of the literature on target tracking in wireless sensor networks.

The existing literature on target tracking in wireless sensor networks can be broadly classified into single-sensor and multiple-sensor approaches. Tracking is performed in both indoor and outdoor environments using various sensors and techniques. This thesis mainly focuses on multiple target tracking using multiple sensors in a WSN.

Owing to the rapid development of sensor technology, Wireless Sensor Networks are used for person tracking, home monitoring and environment monitoring (Akyildiz et al 2002). Several solutions for human detection and tracking have been proposed in the literature.


Many intelligent environments and security systems deploy WSNs to detect and track targets. To make a WSN economically feasible, the individual nodes have to be low-end, inexpensive devices (Krishnamachari and Sitharama 2004). When a WSN is deployed randomly, samples may not arrive at regular time intervals. There is therefore a need to predict the future position of a target from sensor data and the target dynamics even if events are missed.

Umesh Babu et al (2006) proposed a KF-based method for tracking a target in a sensor network. This approach used acoustic signals and Time Difference of Arrival (TDoA) for detection and localization. Hopper et al (1993) presented a scheme based on the active badge system for identifying targets, but its drawback is limited scalability. Kim et al (2009) proposed a method for object tracking in an indoor environment, in which a Radio Frequency Identification (RFID) system was used to increase the accuracy and resolution of location estimation.

Taketoshi et al (2004) proposed a method to track multiple persons by integrating distributed floor pressure sensors and RFID. The floor pressure sensor data were used to detect a person by finding high pressure in the area where the person was present; when this failed, RFID made it possible to associate the areas with the IDs of the persons in the sensor-covered areas.

Several schemes were able to reliably detect and track the movement of persons in indoor and controlled environments with a unique identification mark such as an RFID tag (Bharghavi et al 2010). Aydin et al (2007) presented a study on the localization and tracking of an object carrying an active RFID tag. For that purpose, Received Signal Strength (RSS) measurements in outdoor, obstacle-free indoor and obstructed indoor environments were analyzed.

For object tracking, the existing method (Roseveare and Natarajan 2012) aimed at minimizing the number of sensing nodes. A common way to reduce the number of sensing nodes is prediction (Xu et al 2004). On the other hand, Chen and Chung (2005) proposed the Alert-based Object Tracking (AbOT) scheme, which tracks irregular object movement without depending on prediction of the object trajectory.

Chun Chen et al (2009) proposed a complete systematic architecture, called the Hierarchical Alert Model Architecture (HAMA), to minimize the number of nodes participating in tracking activities and to manage the location information of all moving objects efficiently.

Yang et al (2008) described probabilistic approaches, such as the Simultaneous Localization and Mapping (SLAM) algorithm, for 2D trajectory tracking with improved efficiency.

The KF is suitable for estimating the state of targets moving with nearly constant velocity, but once the target starts maneuvering, a single KF is no longer suitable for tracking. Various mathematical models representing target motion have been developed in the literature. Bar-Shalom and Birmiwal (1982) explained a KF-based target tracking scheme that provides estimates of the target states. The velocity-model KF performed poorly for a maneuvering target, and the acceleration-model KF performed poorly when the target moved at constant velocity. Li and Jilkov (2003) noted that the exact model of the maneuvering target has to be selected in order to estimate the target state precisely. Hence a Multiple Model (MM) filter is needed for efficient tracking of both maneuvering and non-maneuvering targets. In the Interacting Multiple Model (IMM) estimator, multiple models are used to describe the motion of the target. The IMM uses a bank of KFs to accommodate the various possible target trajectory patterns and conditions (Blom and Bar-Shalom 1988). The final estimate is the weighted sum of the estimates from the sub-filters of the different models (Chen et al 2007), and switching between models is governed by a Markov transition probability matrix (Mazor et al 1998). The IMM estimator with KFs as sub-filters uses a set of models to describe the target motion (Li and Bar-Shalom 1993, Yeddanapudi et al 1997).
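Two steps of the IMM estimator mentioned above, the model probability update driven by the Markov transition matrix and the weighted combination of the sub-filter estimates, can be sketched for a scalar state as follows. The sketch deliberately omits the interaction (mixing) step and the sub-filters themselves; the function names and the list-based layout are assumptions made for this example.

```python
def imm_mode_probabilities(prior_probs, transition, likelihoods):
    """Update model probabilities using the Markov transition matrix.

    transition[i][j] is the probability of switching from model i to j.
    likelihoods[j] is the measurement likelihood under model j's filter.
    """
    n = len(prior_probs)
    # Predicted probability of each model before seeing the measurement.
    predicted = [
        sum(transition[i][j] * prior_probs[i] for i in range(n)) for j in range(n)
    ]
    unnormalized = [likelihoods[j] * predicted[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def imm_combine(means, variances, probs):
    """Combine scalar sub-filter estimates into one mean and variance."""
    mean = sum(p * m for p, m in zip(probs, means))
    # The spread of the sub-filter means adds to the combined variance.
    var = sum(p * (v + (m - mean) ** 2) for p, m, v in zip(probs, means, variances))
    return mean, var
```

With equal prior probabilities, a symmetric transition matrix and likelihoods of 2 and 1, the updated model probabilities are 2/3 and 1/3; combining sub-filter means 0 and 3 (each with variance 1) then gives a mean of 1 and a variance of 3.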

Engin et al (2012) addressed the problem of target tracking based on received signal strengths in a WSN. The Kalman gain matrix was obtained as the solution to an optimization problem. Since each column of the Kalman gain matrix corresponds to one sensor measurement, the optimization problem was formulated with a sparsity-promoting penalty function that penalizes the number of nonzero columns of the Kalman gain.

Sreekanth and Krishna (2011) considered the problem of providing guided navigation in a WSN that supports target tracking. In this work, a constant velocity model is considered, and the location of the target is computed using the predictive regeneration method and the weighted centroid method. The position of the target at a particular time is computed using the weighted centroid algorithm and is then compared with the predicted position. Depending on the comparison, a correction is applied and the position is recomputed.
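The weighted centroid computation and the comparison with the predicted position can be sketched as follows. This is not the method of Sreekanth and Krishna (2011): how the weights are derived (for example from received signal strength) is left open, and the correction rule with a constant `gain` is a hypothetical stand-in for the correction step described in the text.

```python
def weighted_centroid(sensor_positions, weights):
    """Estimate a 2D target position as the weighted mean of sensor positions."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    x = sum(w * px for w, (px, _) in zip(weights, sensor_positions)) / total
    y = sum(w * py for w, (_, py) in zip(weights, sensor_positions)) / total
    return x, y


def correct_position(measured, predicted, gain=0.5):
    """Blend measured and predicted positions; gain is an assumed constant."""
    return tuple(p + gain * (m - p) for m, p in zip(measured, predicted))
```

For sensors at (0, 0), (4, 0) and (0, 4) with weights 1, 1 and 2, the weighted centroid is (1, 2).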

Gireesan et al (2001) described a target tracking application of WSN based on an experimental testbed built with Digi XBee devices, Passive Infrared (PIR) sensors and MaxSonar ultrasonic ranging sensors. The experiments showed that continuous ranging and reporting cannot be expected in practice from low-power sensors. The stabilization time and false positive probability are very significant when sensors are deployed in an outdoor environment. Also, the MaxSonar ultrasonic sensor does not have hardware range inhibition.

The solutions proposed in the existing literature used acoustic, image and PIR sensors for person detection. Tracking was achieved using techniques such as the Particle Filter (Ozdemir 2009), signatures, mobile agents (Yu et al 2004) and the KF (Umesh Babu et al 2006). Although a PIR sensor senses the presence of an object, it can neither classify the object as human or non-human nor count the number of objects. Hence, in addition to the PIR sensor, another sensor is required to identify the person in tracking applications. Randomly deployed sensor nodes exhibit unreliable behavior and might not generate samples at regular time intervals. The target (person) may be detected by more than one sensor or may not be sensed by any sensor. Hence, an algorithm is required to estimate the missing events and the future position of a target from the available measurements.

2.4 TRACKING OF TARGETS IN CAMERA SENSOR NETWORK

Video surveillance has long been used for monitoring highly secured areas such as banks and malls. It is also used in athletic performance analysis, industry and video conferencing. Traditionally, the video streams are monitored online by human operators and stored for future reference.

A considerable amount of work has been devoted to tracking

humans in the view of a single camera. However, single camera tracking can

monitor only a relatively narrow area due to the limited viewing angle of a

camera lens. Recently, growing interest has focused on tracking humans using

distributed monocular cameras (Sato et al 1994).


The increasing need for intelligent visual surveillance in commercial, law enforcement and military applications makes automated visual surveillance one of the main current application domains in computer vision. Vision-based multi-target tracking has been studied extensively, and several algorithms are available in the literature to track people using camera images (Tsagkatakis and Savakis 2011, Iketani et al 1998, Anurag and Davis 2002, Kang et al 2004).

Figure 2.3 shows the classification of the literature on target tracking in camera sensor networks.

The existing literature on target tracking in camera sensor networks can be broadly classified into single-camera and multiple-camera approaches. The major problems associated with visual tracking are variations in the background, camera position and occlusion. This thesis mainly focuses on occlusion handling and background estimation.

However, most of the existing methods perform poorly because of camera position, varying pose, illumination conditions, dynamic backgrounds and occlusion. Classifying multiple detected targets as human, vehicle or animal is another difficult and computationally intensive problem. The features used for object tracking include the template, colour, contour and histogram of gradients of an object image (Dalal and Triggs 2005).

Many existing techniques, such as background modelling techniques, made assumptions that greatly restricted their generality in real-world settings (Comaniciu et al 2003), where scenes often include many other dynamic objects, fast changes in lighting, and complex object interactions such as shadows and reflections that greatly influence the image. Comaniciu et al (2003) proposed kernel-based object tracking, which successfully handles camera motion, partial occlusions, clutter and target scale variations.

Single-camera tracking techniques commonly assumed that distinct targets had distinct appearances with respect to colour, texture, size or contour features (Iketani et al 1998), and they also faced a fundamental limitation caused by changing backgrounds. A real-time people tracking system for an interactive environment used depth-based background subtraction (Krumm et al 2000). These approaches require the objects in the scene to have enough texture information for dense stereo reconstruction, and they build background models assuming a static environment.

The region-based stereo technique avoided many problems of wide-baseline correspondence by matching regions instead of points (Kang et al 2003). To perform the region-based correspondence, this approach required background modelling and assumed that everyone in the scene was wearing uniquely coloured clothing. Moving objects were detected by defining an adaptive background model that accounts for camera motion, approximated by an affine transformation (Kang et al 2004).
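
The idea of an adaptive background model can be illustrated with a minimal sketch. It is not the method of Kang et al (2004): it assumes a stationary camera, so the affine motion compensation is omitted, and it uses a simple running-average update. The function names, the blending factor alpha and the threshold are assumptions chosen for illustration only.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (running average)."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30.0):
    """Flag pixels whose intensity differs from the model by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


# Toy 3x4 grayscale scene: a flat background with one bright moving object.
background = [[10.0] * 4 for _ in range(3)]
frame = [row[:] for row in background]
frame[1][2] = 200.0  # the object occupies a single pixel in this frame

mask = foreground_mask(background, frame)          # detect before updating
background = update_background(background, frame)  # then adapt the model
```

A small alpha makes the model adapt slowly, so gradual lighting changes are absorbed while a briefly visible object still stands out. With a moving camera, each frame would first have to be warped into the coordinate frame of the model.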

The problems associated with automatic real-time visual surveillance include tracking an unwanted target instead of the desired one, changes in the background, occlusions, and the assumption that the background is static (Forsyth and Ponce 2003). Distance metric learning was used to represent reliably both the similarity between different appearances of the object and the difference in appearance between the object and the background (Tsagkatakis and Savakis 2011). The same work detected occlusions by comparing the distance between the object and the background.

The region-based stereo technique required background modelling and, to perform the region-based correspondence, assumed that everyone in the scene was wearing uniquely coloured clothing (Darrell et al 2001).

Moreover, scenes often include many other dynamic objects, fast changes in lighting, and complex object interactions such as shadows and reflections that greatly influence the image (Forsyth and Ponce 2003). In many real-world settings, it was also not possible to place a camera in an ideal location that minimized occlusions, such as a very high overhead view. A robust tracking technique was therefore required to handle frequent and prolonged occlusions (Isard and MacCormick 2001) and to work in crowded areas with multiple views (Anurag and Davis 2002).

Person tracking was performed on offline data using background subtraction and multiple cameras. This introduced some delay in segmentation, because sensor fusion was done by rendering foregrounds from the images of multiple sensors (Krumm et al 2000). Multi-camera techniques needed to establish correspondence between the views and assumed that the appearance of a feature in one view would be similar to its appearance in another view. This assumption failed for widely separated views, where the scene geometry and lighting could result in a lack of commonly observed features and in very different appearances of the same feature.

However, single-camera tracking can monitor only a relatively narrow area because of the limited viewing angle of a camera lens. Recently, interest has grown in tracking humans using distributed monocular cameras. In such a setup, the image of a target within the monitored area is present in at least one of the video sequences produced by the cameras.

Jun et al (2006) proposed a novel vehicle classification scheme for estimating important traffic parameters from video sequences. For robustness, a background update method was used to keep the background static. The desired target was detected through image differencing and then tracked by a KF.
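
The tracking step of such a detect-then-track pipeline can be sketched as follows. This is a minimal, hedged illustration and not the implementation of Jun et al (2006): it assumes a one-dimensional constant-velocity motion model, a position-only measurement taken from the detector, and illustrative noise parameters q and r. The class name KalmanCV1D and the measurement values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KalmanCV1D:
    """Constant-velocity Kalman filter for a single image coordinate."""
    pos: float = 0.0
    vel: float = 0.0
    dt: float = 1.0   # time between frames (assumed)
    q: float = 1e-3   # process noise added to each diagonal term (assumed)
    r: float = 1.0    # measurement noise variance (assumed)
    P: list = field(default_factory=lambda: [[100.0, 0.0], [0.0, 100.0]])

    def predict(self):
        """Propagate state and covariance: x = F x, P = F P F^T + Q."""
        dt = self.dt
        (p00, p01), (p10, p11) = self.P
        self.pos += dt * self.vel
        self.P = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]

    def update(self, z):
        """Correct the prediction with a measured position z (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r              # innovation variance
        k0, k1 = p00 / s, p10 / s     # Kalman gain
        y = z - self.pos              # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.P = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]


# Hypothetical detections: a target moving about 2 pixels per frame,
# with small measurement errors.
measurements = [0.2, 1.9, 4.1, 5.8, 8.2, 9.9, 12.1, 13.8, 16.2, 17.9]
kf = KalmanCV1D()
for z in measurements:
    kf.predict()
    kf.update(z)
```

After the loop, kf.pos and kf.vel hold the filtered position and the estimated velocity. When a detection is missing, for example during an occlusion, calling predict() without update() carries the track forward on the motion model alone.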

When surveillance is performed over a wide area, multi-camera techniques need to provide correspondence between views, and they assume that the appearance of a feature in one view will be similar to its appearance in another view. This assumption fails for widely separated views, where the scene geometry and lighting can result in a lack of commonly observed features (Kang et al 2004).

The KF was the first filter to be used for visual tracking, and various extensions of it have shown considerable success in person tracking (Dalal and Triggs 2005, Zhao 2004, Piater and Crowley 2001). When the state space is discrete and consists of a finite number of states, the Hidden Markov Model (HMM) described by Rabiner (1986 and 1989) can be applied for tracking.
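
As a hedged illustration of HMM-based tracking over a discrete state space, the sketch below decodes the most likely sequence of zones occupied by a target from noisy zone detections, using the standard Viterbi algorithm. The two zones and all probabilities are assumptions made up for the example and are not taken from any of the cited works.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] * trans_p[p][s]) for p in states),
                key=lambda item: item[1],
            )
            scores[s] = score * emit_p[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical two-zone scene: targets tend to stay in their zone,
# and the detector reports the correct zone 80% of the time.
states = ["left", "right"]
start_p = {"left": 0.5, "right": 0.5}
trans_p = {"left": {"left": 0.9, "right": 0.1},
           "right": {"left": 0.1, "right": 0.9}}
emit_p = {"left": {"left": 0.8, "right": 0.2},
          "right": {"left": 0.2, "right": 0.8}}

detections = ["left", "left", "right", "left", "left",
              "right", "right", "right"]
decoded = viterbi(detections, states, start_p, trans_p, emit_p)
```

With these parameters the isolated "right" detection in the third frame is treated as a detector error, while the sustained change at the end of the sequence is accepted as a real transition. For long sequences, log-probabilities would normally be used to avoid numerical underflow.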

People tracking with a particle filter (Osawa 2006) was demonstrated in a cluttered office environment with two people, but the cost of rendering an image from a model for every particle at every time step was not discussed (Saad and Mubarak 2006). Other tracking methods were based on visual hull techniques, which are sensitive to errors in foreground segmentation and are not suited to environments with many occlusions, because the visual hull becomes loose and cannot resolve individuals (Lopez et al 2007).

Valera and Velastin (2005) presented the state of development of intelligent distributed surveillance systems, including a review of the image processing techniques used in the different modules of such systems. They also explained that the ability to recognize objects and humans, and to describe their actions and interactions from information acquired by sensors, is essential for automated visual surveillance. Gian et al (2005) proposed image and video processing techniques for advanced visual surveillance using multi-camera systems, which provide coverage across a wide area and ensure object visibility over a large range of depths. Antonio et al (2007) used a cascaded structure for multiclass detection with a fragment-based approach. Object detection and classification were performed against a dynamic background to overcome the limitations observed with static background models, such as sensitivity to illumination conditions and artifacts caused by the movement of leaves.

Xue et al (2009) proposed a multi-view visual target surveillance system for WSN that autonomously performs target classification and tracking with collaborative online learning and localization. Dunkel (2009) analyzed Complex Event Processing (CEP) for sensor networks and processed complex event streams in real time. The approach was based on semantically rich event models that use ontologies to represent the structural properties of event types. Joshua et al (2010) surveyed state-of-the-art behavior recognition algorithms for transit visual surveillance applications. These techniques are often sensitive to poor resolution and frame rate, drastic illumination changes and frequent occlusions, among other problems common in transit surveillance systems.

Norbert et al (2011) presented a comprehensive review of computer vision techniques for traffic analysis systems, with a specific focus on urban environments. Intelligent transport systems have increasing scope to adopt video analysis for traffic measurement. Traditional methods rely on background estimation and perform top-down classification, which can raise issues under challenging urban conditions. Methods from the object recognition domain (bottom-up) have shown promising results and overcome some of the issues of traditional methods, but are limited in other ways.

Liang et al (2012) proposed a scheme that tracks multiple video targets and recovers their trajectories despite occlusion, interruption and background clutter, using a stochastic sampling algorithm to iteratively solve the spatial graph partition and temporal graph matching problems. The algorithm was designed under the Metropolis-Hastings method and does not need good initializations.

Mirabi and Javadi (2012) presented an algorithm for accurate segmentation and tracking of people in dynamic outdoor environments.

The existing literature proposes solutions for camera angle (Kang et al 2004), pose variation, correspondence between regions (Kang et al 2003), changing background (Iketani et al 1998), differences in appearance between the object and the background (Tsagkatakis and Savakis 2011), multiple views (Krumm et al 2000, Isard and MacCormick 2001, Anurag and Davis 2002, Saad and Mubarak 2006, Xue et al 2009), and background modelling (Darrell et al 2001). For occlusion handling, various techniques have been proposed, such as the KF (Dalal and Triggs 2005, Zhao 2004, Piater and Crowley 2001, Mirabi and Javadi 2012), the minimum allowable distance between the object and the background (Tsagkatakis and Savakis 2011), the HMM (Rabiner 1986 and 1989) and the PF (Osawa 2006). In this thesis, a Combined Gaussian Hidden Markov Model based Kalman Filter (CGHMM-KF) scheme is proposed to accurately detect, classify and track multiple persons in complex scenarios.
