TRAFFIC DENSITY DEDUCTION
USING VIDEO ANALYSIS
Main Project Interim Report
Submitted by
Sachin Sajan George
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
Rajagiri School of Engineering & Technology
Rajagiri Valley, Kakkanad, Kochi- 39
OCTOBER 2015
RAJAGIRI SCHOOL OF ENGINEERING & TECHNOLOGY
RAJAGIRI VALLEY, KAKKANAD, KOCHI- 39
CERTIFICATE
Certified that this is a bonafide record of the project work on Traffic Density Deduction using
Video Analysis done by Sachin Sajan George (University Register Number 12012676) during
Semester Seven of Computer Science and Engineering, in partial fulfillment of the requirements
for the award of the degree of Bachelor of Technology in Computer Science and Engineering of
Mahatma Gandhi University, Kottayam, during the academic year 2015-2016.
Project Guide Head of the Department
Project Coordinator
ACKNOWLEDGEMENT
I am grateful to the almighty God for his blessings and for helping me complete this
project successfully.
I would like to thank the management and Dr. A. Unnikrishnan, Principal, for providing
me with all the necessary facilities, including a very well-furnished lab. I would also like to thank
Mr. Ajit S., Head of the Department of Computer Science and Engineering, for his
encouragement to do this work. I am also grateful to our guide, Ms. Tripti C., Asst. Professor,
Department of Computer Science and Engineering, for her guidance throughout the course of the
work.
I also thank the faculty and non-faculty staff of the Department of Computer Science and
Engineering for helping me in one way or another. Last but not least, I thank all my
classmates in S7 CSE for helping me with their valuable suggestions and their wholehearted
support.
ABSTRACT
City roads are becoming increasingly crowded as the number of vehicles grows day by day.
The latest technology provides the means to manage this problem effectively,
improving the life of the common people, allowing them to travel safely on roads and reach their
destinations on time without getting caught in traffic jams. Reducing traffic can save
time, money and fuel, and in time create a cleaner, greener city. This project aims to create an
open-source solution to traffic density estimation through a video-analysis-based study, the results of
which can be used for effective traffic management. Although the proposed technology exists in other
countries, it is yet to be implemented in this state and is seldom used elsewhere in the
country.
TABLE OF CONTENTS
ACKNOWLEDGEMENT ii
ABSTRACT iii
LIST OF FIGURES v
1 Introduction 1
1.1 Problem Statement 1
1.2 Project Scope and Objective 1
1.3 Design and Implementation Constraints 2
1.4 Assumptions and Dependencies 2
1.5 Development Method 3
2 Related Work 4
3 System Specification 14
3.1 Use Cases 14
3.2 Non-functional Requirements 17
4 System Design 18
4.1 Structural Design 18
4.2 External Interface Design 22
4.3 Component Level Diagram 23
4.3.1 Hardware and Software Requirements 23
5 Project Schedule and Major Milestones 25
6 Risks and Challenges 28
7 Glossary 29
8 Bibliography 30
LIST OF FIGURES
Figure 1.1: The iterative waterfall model 3
Figure 2.1: Detecting a reference object using feature extraction and matching 7
Figure 2.2: Blob Analysis 8
Figure 2.3: The moving pixels detected from the video frame above are shown in white 9
Figure 2.4: Overview of Cascade Classifiers 11
Figure 3.1: Activity Diagram of the project 16
Figure 4.1: Use Case diagram of the first stage of the project 19
Figure 4.2: Use Case diagram of the second stage of the project 20
Figure 4.3: Activity Diagram of the project 21
Figure 4.4: Sample GUI 22
Figure 4.5: Deployment Diagram 23
Figure 5.1: Milestone Chart of the project 25
Figure 5.2: Gantt-Chart of Phase I 26
Figure 5.3: Gantt-Chart of Phase II 27
CS010 710 Main Project Interim Report Department of Computer Science and Engineering
Rajagiri School of Engineering and Technology, Kochi -39
© University Register Number: 12012676 1 Traffic Density Deduction using Video Analysis
1. INTRODUCTION
1.1. Problem Statement
Kerala, God's own country, with one of the highest literacy rates in India, aims to become the most
developed state of India in the field of business as well as research. The growth in population and
the emerging needs of people from various cultures demand that Cochin, the metropolitan city of
Kerala, be the center of attraction for business.
Transportation forms the backbone of business development. But the roads of Cochin
city have a high density of vehicles and are very difficult to manage. Traffic on the road and the
delay caused by traffic jams cause stress to people, and it is hard to manage the vehicles individually.
This leads to the demand for an efficient mechanism to manage vehicles on the road, avoiding
traffic jams and accidents and leading to safer lives on the road.
1.2. Project Scope and Objective
Cochin city, being a part of God's own country, is envisioned to have facilities with
the latest technology, improving the life of the common man and allowing everyone to have a
peaceful and stress-free life. People should be able to travel safely on the road and reach the appropriate
place at the appropriate time without getting caught in traffic jams.
Reducing traffic can save time, money and fuel and in turn create a cleaner, greener city.
This reduces both the pollution of the mind by stress and of the environment by harmful gases.
The theme falls under Public Infrastructure and Utilities (urban mobility and traffic
management) and Digital Infrastructure (better connectivity for better service).
The proposed system aims at reducing traffic congestion at inter-junctions by providing a
video-analysis-based study to determine the density of vehicles in a region. It is hard to manage
the vehicles individually, i.e., surveillance of a mass of cars is more efficient than individual
surveillance.
This project may be extended to traffic re-routing and augmented reality supported windshields
for providing drivers with re-routing information.
1.3. Design and Implementation Constraints
The system performs video analysis on live feed from traffic cameras at inter-junctions, or
from specifically fitted cameras that provide a live feed. The video analysis is done by splitting
the feed into frames and counting the vehicles to calculate the density.
Large objects, such as birds close to the camera, can affect the video feed, producing erroneous results.
In case of rain, the video feed might be blurred and would not provide a solid feed to the system
for effective computation. Rain and other weather can also damage the camera.
Government rules prohibit us from using the live camera feed from speed and
detection cameras. Analysis of a blurred feed can be difficult, especially when considering very fast
moving cars.
1.4. Assumptions and Dependencies
The project is written with and based on open-source software, so the software requirements
are also open source. There exists an external interface to the camera to obtain the
video feed.
It is assumed that the system itself remains unaffected during processing.
Since government policies do not allow us to use a live video feed, pre-recorded videos are used for
the analysis, on the assumption that the process can later be extended to a dynamic feed. The system
also assumes that the camera feed does not contain noise from external factors such as birds or weather.
The video analysis is mainly based on an averaging system for finding the correct
threshold. The system therefore assumes that the initial transition from higher to lower density is to be
discarded.
1.5. Development Method
The method used for the software development is the iterative waterfall model. The
waterfall model is a sequential software development process in which progress is seen as
flowing steadily downwards (like a waterfall) through the phases of conception, initiation,
analysis, design (validation), construction, testing and maintenance. In a typical model, a
project begins with feasibility analysis. On successfully demonstrating the feasibility of the project,
requirements analysis and project planning begin. Design starts after the requirements
analysis is completed, and coding begins after the design is completed. Once the programming is
completed, the code is integrated and testing is done. On successful completion of testing, the
system is installed. After this, the regular operation and maintenance of the system takes place.
Figure 1.1: The iterative waterfall model
Since an iterative waterfall model is being used, even if there is a shortage of
data, or a change in the design or the code, the process can go back to a previous phase and
repeat it. This is not possible if the simple waterfall model is chosen, where the requirements of the
system are frozen. The iteration part involves the redesign and implementation of a task from
the project control list, and the analysis of the current version of the system. The goal of the
design and implementation of any iteration is to be simple, straightforward and modular,
supporting redesign at that stage or as a task added to the project control list.
2. RELATED WORK
In imaging science, image processing is the processing of images using mathematical
operations, applying any form of signal processing for which the input is an image, such as
a photograph or video frame; the output of image processing may be either an image or a set of
characteristics or parameters related to the image. Most image-processing techniques involve
treating the image as a two-dimensional signal and applying standard signal-processing
techniques to it.
Closely related to image processing are computer graphics and computer vision.
Computer vision is often considered high-level image processing out of which a
machine/computer/software intends to decipher the physical contents of an image or a sequence
of images (e.g., videos or 3D full-body magnetic resonance scans).
This project involves vehicle detection and hence the deduction of the traffic density of a road
using appropriate image and video processing techniques. The simplest method that can be used
to detect moving or even static vehicles is background subtraction.
Background subtraction, also known as Foreground Detection, is a technique in the fields
of image processing and computer vision wherein an image's foreground is extracted for further
processing (object recognition etc.). Generally an image's regions of interest are objects (humans,
cars, text etc.) in its foreground. After the stage of image preprocessing (which may
include image denoising, post processing like morphology etc.) object localisation is required
which may make use of this technique. Background subtraction is a widely used approach for
detecting moving objects in videos from static cameras. The rationale in the approach is that of
detecting the moving objects from the difference between the current frame and a reference
frame, often called “background image”, or “background model”. Background subtraction is
mostly done if the image in question is a part of a video stream. Background subtraction provides
important cues for numerous applications in computer vision, for example surveillance tracking
or human pose estimation.
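As a concrete illustration, the current-frame-versus-reference-frame differencing described above can be sketched in a few lines of plain Python (the 4x4 grayscale arrays and the threshold of 30 grey levels are invented toy values, not figures from this project):

```python
def subtract_background(frame, background, threshold=30):
    """Binary foreground mask: a pixel is 'moving' when it differs from the
    reference background image by more than `threshold` grey levels."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]

# Toy 4x4 grayscale road: a bright 2x2 "vehicle" appears in the frame.
background = [[50] * 4 for _ in range(4)]
frame = [row[:] for row in background]
for y in (1, 2):
    for x in (1, 2):
        frame[y][x] = 200

mask = subtract_background(frame, background)
print(sum(map(sum, mask)))  # 4 foreground pixels: the vehicle's footprint
```

In practice the reference frame would be an image of the empty road, and the mask would then be cleaned up and passed to the counting stage.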
However, background subtraction is generally based on a static background hypothesis,
which is often not applicable in real environments. With indoor scenes, reflections or animated
images on screens lead to background changes. In the same way, static background methods have
difficulties with outdoor scenes due to wind, rain or illumination changes brought by the weather.
Thus the practicality of such an approach is low. This project requires detection of vehicles, so
another approach that may be used is object detection.
Object detection is one of the major goals in computer vision that deals with detecting
instances of semantic objects of a certain class (such as humans, buildings, or cars) in images or
videos. Object detection such as face detection and pedestrian detection are among the well-
researched domains. Object detection algorithms typically use extracted features and learning
algorithms to recognize instances of an object category. Object detection has applications in
many areas of computer vision such as image retrieval and video surveillance.
The following models are used for object detection:
- Feature-based object detection: detecting a reference object in a cluttered scene using
  feature extraction and matching. In Matlab, RANSAC can be used to estimate the
  location of the object in the test image.
- Viola-Jones object detection
- SVM classification with histograms of oriented gradients (HOG) features
- Image segmentation and blob analysis
- Gradient-based, derivative-based, and template matching approaches
Feature extraction is a type of dimensionality reduction that efficiently represents interesting
parts of an image as a compact feature vector. This approach is useful when image sizes are large
and a reduced feature representation is required to quickly complete tasks such as image
matching and retrieval.
Feature detection, feature extraction, and matching are often combined to solve common
computer vision problems such as object detection and recognition, content-based image
retrieval, face detection and recognition, and texture classification.
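To make the idea of a compact feature vector concrete, the sketch below computes a toy HOG-style histogram of gradient orientations over a grayscale patch. It is an illustration of the principle only, not the full HOG pipeline (real HOG adds cells, block normalisation and overlapping windows), and the patch values are invented:

```python
import math

def orientation_histogram(patch, bins=4):
    """Toy HOG-style descriptor: a histogram of gradient orientations over a
    grayscale patch, weighted by gradient magnitude."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):            # interior pixels only
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]   # horizontal gradient
            gy = patch[y + 1][x] - patch[y - 1][x]   # vertical gradient
            mag = math.hypot(gx, gy)
            ang = math.atan2(gy, gx) % math.pi       # unsigned orientation
            hist[min(int(ang / math.pi * bins), bins - 1)] += mag
    total = sum(hist) or 1.0
    return [v / total for v in hist]                 # normalised feature vector

# A patch whose intensity rises row by row: one dominant vertical-gradient bin.
patch = [[10 * y for x in range(8)] for y in range(8)]
h = orientation_histogram(patch)
print(h)  # all the mass lands in a single orientation bin
```

A descriptor like this, computed over many cells of a detection window, is what the SVM classifier mentioned above would be trained on.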
Figure 2.1: Detecting a reference object (left) in a cluttered scene (right) using feature extraction and matching.
Common feature extraction techniques include Histogram of Oriented Gradients (HOG),
Speeded Up Robust Features (SURF), Local Binary Patterns (LBP), Haar wavelets, and color
histograms.
Local features and their descriptors are the building blocks of many computer vision
algorithms. Their applications include image registration, object detection and classification,
tracking, and motion estimation. These algorithms use local features to better handle scale
changes, rotation, and occlusion.
Blob Analysis is a fundamental technique of machine vision based on the analysis of
consistent image regions. As such, it is a tool of choice for applications in which the objects being
inspected are clearly discernible from the background. A diverse set of Blob Analysis methods
allows tailored solutions to be created for a wide range of visual inspection problems.
Figure 2.2: Blob Analysis
Let us begin by defining the notions of region and blob.
- Region: any subset of image pixels. In Adaptive Vision Studio, regions are represented
  using the Region data type.
- Blob: a connected region. In Adaptive Vision Studio, blobs (being a special case of
  region) are represented using the same Region data type. They can be obtained from any
  region using a single SplitRegionIntoBlobs filter or (less frequently) directly from an
  image using image segmentation filters from the Image Analysis category.
The basic scenario of the Blob Analysis solution consists of the following steps:
1. Extraction - in the initial step one of the Image Thresholding techniques is applied to
obtain a region corresponding to the objects (or single object) being inspected.
2. Refinement - the extracted region is often flawed by noise of various kinds (e.g. due to
inconsistent lighting or poor image quality). In the Refinement step the region is
enhanced using region transformation techniques.
3. Analysis - in the final step the refined region is subject to measurements and the final
results are computed. If the region represents multiple objects, it is split into individual
blobs each of which is inspected separately.
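The splitting step of the Analysis stage can be sketched in plain Python: an iterative flood fill that separates a binary mask into connected blobs, standing in for a filter like SplitRegionIntoBlobs. The mask below is an invented toy example, not real thresholded footage:

```python
def label_blobs(mask):
    """Split a binary region into connected blobs (4-connectivity) using
    iterative flood fill; returns each blob as a set of (y, x) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                stack, blob = [(y, x)], set()
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    blob.add((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                blobs.append(blob)
    return blobs

# Thresholded frame with two separate "vehicles".
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
blobs = label_blobs(mask)
print(len(blobs), sorted(len(b) for b in blobs))  # 2 blobs, areas [2, 4]
```

Each blob can then be measured separately (area, bounding box, etc.), which is how individual vehicles would be counted.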
Figure 2.3: The moving pixels (foreground) detected from the video frame above are shown in white.
Moving cars are detected using blob analysis. This is followed by image segmentation
using background subtraction to detect objects.
The Viola-Jones algorithm is commonly used for face detection and tracking. Pedestrian
detection is commonly done using an SVM with pre-trained HOG features.
Usage of Gaussian mixture models (GMM) is another effective method that can be used
to detect cars in surveillance. A mixture model is a probabilistic model for representing the
presence of subpopulations within an overall population, without requiring that an observed data
set should identify the sub-population to which an individual observation belongs. Formally a
mixture model corresponds to the mixture distribution that represents the probability
distribution of observations in the overall population.
A Gaussian mixture model (GMM), on the other hand, is a variant of a mixture model in which
each subpopulation is modelled by a Gaussian distribution with its own mean and variance
parameters. A Bayesian version of the Gaussian mixture model additionally places prior
distributions over these parameters.
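As a sketch of how a Gaussian background model separates vehicles from the road, the class below keeps one running Gaussian per pixel. This is a deliberate single-component simplification of the GMM approach described above (real GMM backgrounds keep several Gaussians per pixel to handle multi-modal backgrounds such as waving trees), and all the constants are illustrative assumptions:

```python
import math

class RunningGaussian:
    """Per-pixel background model: one Gaussian per pixel, updated with an
    exponential learning rate."""

    def __init__(self, init, alpha=0.05, min_var=4.0):
        self.mean = float(init)
        self.var = 20.0
        self.alpha = alpha
        self.min_var = min_var  # variance floor avoids an over-confident model

    def is_foreground(self, value, k=2.5):
        # Foreground when the pixel lies k standard deviations from the mean.
        return abs(value - self.mean) > k * math.sqrt(self.var)

    def update(self, value):
        d = value - self.mean
        self.mean += self.alpha * d
        self.var = max((1 - self.alpha) * self.var + self.alpha * d * d, self.min_var)

px = RunningGaussian(init=50)
for _ in range(100):               # the background stays near grey level 50
    px.update(50)
print(px.is_foreground(52))        # False: a small fluctuation
print(px.is_foreground(200))       # True: a bright passing vehicle
```

Because the model keeps updating, slow illumination changes are absorbed into the background while fast-moving vehicles remain foreground.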
Cascade Object Trainers can be used to detect objects whose aspect ratios do not vary
much. This includes stop signs and other road signs, faces, cars or similar objects viewed from
one side.
Cascading is a particular case of ensemble learning based on the concatenation of several
classifiers, using all information collected from the output from a given classifier as additional
information for the next classifier in the cascade. Unlike voting or stacking ensembles, which are
multi-expert systems, cascading is a multistage one.
The first cascading classifier was the face detector of Viola and Jones (2001). The
requirement was that the classifier be fast in order to run on low-powered CPU systems, such
as cameras and phones.
Figure 2.4: Overview of Cascade Classifiers; courtesy of mathworks.com
Classifiers will not accept faces that are upside down (the eyebrows are not in a correct
position) or the side of a face (the nose is no longer in the center, and shadows on the side of
the nose might be missing). Separate cascade classifiers have to be trained for every rotation that
is not in the image plane (side of face), and will have to be retrained or run on rotated features for
every rotation that is in the image plane (face upside down or tilted to the side). Scaling is not a
problem, since the features can be scaled (centerpixel, leftpixels and rightpixels have a
dimension only relative to the rectangle examined). In recent cascades, comparing pixel values
from one part of a rectangle to another has been replaced with Haar wavelets.
To have good overall performance, the following criteria must be met:
Each stage must validate all faces, and can produce many false positives. For example, if
stage 1 were to mark as 'does not contain a face' 20% of rectangles containing a face (false
negative rate=20%), then the total performance of the chain cannot be higher than 80% true
positive, whatever the next stages are, since 20% of faces have already been rejected.
This suggests that a good stage must have 100% true positive and, for example, 40% false
positive; that is, accept all rectangles containing faces and erroneously mark many rectangles
as potentially containing a face, to be eliminated by later stages. Note that for a
first stage, 100% true positive and 40% false positive still gives a lot of false positives: if
only 1 in 1000 rectangles in an image contains a face, there will still be about 400 falsely
accepted rectangles for every real face after the first stage.
If the first stage is very fast (a few operations), we have eliminated 60% of rectangles not
containing a face very quickly.
The training procedure for one stage is therefore to have many weak learners (simple pixel
difference operators), train them as a group (raising their weight if they give a correct result), but be
mindful of having only a few active weak learners so that the computation time remains low.
The first detector of Viola & Jones had 38 stages, with 1 feature in the first stage, then 10,
25, 25 and 50 in the next four stages, for a total of about 6000 features. The early stages remove
unwanted rectangles rapidly to avoid paying the computational cost of the later stages, so that
computation time is spent deeply analyzing the parts of the image that have a high probability of
containing the object.
There is a trade-off between fewer stages with a lower false positive rate per stage or more
stages with a higher false positive rate per stage. Stages with a lower false positive rate are more
complex because they contain a greater number of weak learners. Stages with a higher false
positive rate contain fewer weak learners. Generally, it is better to have a greater number of
simple stages because at each stage the overall false positive rate decreases exponentially. For
example, if the false positive rate at each stage is 50%, then the overall false positive rate of a
cascade classifier with two stages is 25%. With three stages, it becomes 12.5%, and so on.
However, the greater the number of stages, the greater the amount of training data the classifier
requires. Also, increasing the number of stages increases the false negative rate, which results
in a greater chance of rejecting a positive sample by mistake. Set the false positive rate
(FalseAlarmRate) and the number of stages (NumCascadeStages) to yield an acceptable overall
false positive rate; these two parameters can then be tuned experimentally.
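The multiplicative behaviour described above is easy to verify numerically. In the sketch below, the 50% per-stage false positive rate comes from the example in the text, while the 99% per-stage true positive rate is an assumed figure for illustration:

```python
def cascade_rates(stage_tp, stage_fp, num_stages):
    """Overall rates of a cascade: a window is accepted only if every stage
    accepts it, so the per-stage rates multiply across stages."""
    return stage_tp ** num_stages, stage_fp ** num_stages

# The 50%-per-stage false positive example from the text:
tp2, fp2 = cascade_rates(stage_tp=0.99, stage_fp=0.5, num_stages=2)
tp3, fp3 = cascade_rates(stage_tp=0.99, stage_fp=0.5, num_stages=3)
print(fp2, fp3)  # 0.25 and 0.125, as in the text

# The trade-off: more stages also lower the overall true positive rate.
tp20, _ = cascade_rates(stage_tp=0.99, stage_fp=0.5, num_stages=20)
print(round(tp20, 3))  # about 0.818 even at 99% per stage
```

This is why very long cascades need each stage to have a near-perfect true positive rate.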
Training can sometimes terminate early. For example, suppose that training stops after seven
stages, even though you set the number of stages parameter to 20. It is possible that the function
cannot generate enough negative samples. If you run the function again and set the number of
stages to seven, you do not get the same result. The results between stages differ because the
number of positive and negative samples to use for each stage is recalculated for the new number
of stages.
Cascades are usually trained with cost-aware AdaBoost. The sensitivity can be adjusted so
that there are close to 100% true positives and some false positives. The procedure can then be
started again for stage 2, and so on, until the desired accuracy/computation time is reached.
After the initial algorithm, it was understood that training the cascade as a whole can be
optimized to achieve a desired true detection rate with minimal complexity. Examples of such
algorithms are RCBoost, ECBoost and RCECBoost. In their most basic versions, they can be
understood as choosing, at each step, between adding a stage or adding a weak learner to a
previous stage, whichever is less costly, until the desired accuracy is reached. No stage
of the classifier can have a detection rate (sensitivity) below the desired rate, so this is a
constrained optimization problem. To be precise, the total sensitivity will be the product of the
stage sensitivities.
Cascade classifiers are available in OpenCV, with pre-trained cascades for frontal faces.
Training a cascade on a different object is possible (search for training a Haar cascade, for
example), but can currently take a few days.
This project will use one or more of these methods in combination, chosen through trial and
error, to find the most efficient approach. If time permits, a comparative study of the
different combinations of techniques will also be provided.
3. SYSTEM SPECIFICATION
The project is divided into a number of functions which are run sequentially to
obtain the desired output. The first part is to obtain the video feed from the camera. During the
project, a sample video is used to simulate the video feed.
Next, the relevant frames containing the traffic are retrieved from the video feed. These
frames are given to the recognition processor, which applies background subtraction to the frames.
Contour analysis is then performed to obtain the different objects in the
scene; vehicles are identified and counted.
The recognition processor outputs the data to the density calculator, which calculates the
density based on the number of vehicles on the road in comparison to an empty road. The display unit
compares the calculated density with the threshold value and outputs whether the traffic density
is high or not.
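As a sketch of the density calculator and the display unit's threshold comparison (the road capacity of 24 vehicles and the 0.6 threshold below are invented calibration values, not figures from this project):

```python
def traffic_density(vehicle_count, road_capacity):
    """Density as the fraction of the road's capacity in use, where
    `road_capacity` is the vehicle count that would fill the camera's view
    of the otherwise empty reference road."""
    return vehicle_count / road_capacity

def classify(density, threshold=0.6):
    """The display unit's comparison against the density threshold."""
    return "HIGH" if density >= threshold else "LOW"

d = traffic_density(vehicle_count=18, road_capacity=24)
print(round(d, 2), classify(d))          # 0.75 HIGH
print(classify(traffic_density(5, 24)))  # LOW
```

In the full system, the vehicle count would come from the recognition processor and the capacity from calibration against the empty-road reference image.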
3.1. Use Cases
There are mainly two parts to the system. One is the video-camera-to-software interface
and the second is the video analysis tool. The video camera provides the feed to the software
through the interface.
Here the video feed is simply divided into its constituent frames, and a select few frames
are picked corresponding to a pre-determined time interval. The initial case can have no input
feed, in which case there will be no output; thus a video input must be provided.
Pre-conditions include video feed corresponding to the reference image provided.
Post-conditions include a set of video frames on which processing will be done.
Figure 3.1: Activity Diagram of the project
The second stage takes these frames as input; they are processed against the reference
image for background subtraction, and contour analysis is then done to retrieve objects of
interest. Vehicles are then counted from these objects of interest, which contributes to the
calculation of density.
Pre-conditions for this stage include a reference image for the purpose of background
subtraction and the corresponding frames from the video feed, which must be provided by
the previous stage.
Post-conditions include a simple GUI displaying the video feed and density values.
Data flow occurs as per the activity diagram. The only alternate paths are those
caused by exceptions, such as conditions with no video footage or with a bad reference
image used for background subtraction.
3.2. Non-functional Requirements
The system is an open-source method to find the traffic density on roads. There is thus a
performance requirement on the computer running the video analysis tool in real time to
produce the required output. Most of the more efficient algorithms and systems related to such work
are proprietary software, and this system is a direct protest against that, as knowledge and safety
systems should be open to all.
If all of the frames in each second are used for calculation, it can lead to a lot of
inconsistency, extraneous results and unnecessary computation. To reduce this, only certain
relevant frames are taken for processing every few seconds. Selection of these frames is a
difficult task, as many frames may be blurred or may contain streaks of moving cars, which is
undesirable and makes detection extremely hard.
If cars are extremely close to each other and the angle of the camera is not well suited,
those cars may appear as a single blob, i.e., a single vehicle. As the project progresses,
solutions to such problems will be addressed.
Also, certain external factors such as rain, birds, etc. can affect the video feed in real
time, causing erroneous results. Weather can be a huge factor for background
subtraction and other similarly simple methods. Thus, blob detection or other feature-detecting
mechanisms, together with colour thresholding, must be used in conjunction with background
subtraction to produce better results.
4. SYSTEM DESIGN
4.1. Structural Design
The system has two main stages. The first deals with basic pre-processing of the
video for the project; the video-camera-to-software interface is part of this first stage. The
second stage is the video analysis tool. The video camera provides the feed to the software through the
interface.
The design has a number of assumptions around which the processing takes place. Some
of the more important assumptions concern weather conditions: clear, sunny weather is
expected, and rain, storms and other adverse weather will render this design useless and highly
erroneous. Other assumptions include obstacle-free video footage: physical obstacles can
block the view of the camera, giving a useless input feed and again producing a
set of erroneous results. This design compromises on such details. However, provided there are
no such problems, the project is expected to be successful.
Here the video feed is simply divided into its constituent frames, and a select few frames
are picked corresponding to a pre-determined time interval. The initial case can have no input
feed, in which case there will be no output; thus a video input must be provided. Several UML
diagrams, such as use case diagrams and activity diagrams, are shown below with a brief
description of each.
Figure 4.1: Use Case diagram of the first stage of the project
The video is read from the source, which may be recorded video footage or live footage. The constituent frames are then extracted from the video. These frames would be extremely numerous if all of them were considered, since typical cameras record at roughly 24 to 30 frames per second. Practically speaking, there is little difference between the frames of a single second as far as the count of cars is concerned, so not every frame needs to be processed. Therefore, only a few frames are picked every few seconds, with the exact interval determined through trial and error for an optimal solution. These frames are then combined into an object that is transferred to the second stage of the project, where further, more complex processing deduces the traffic density.
The actors here include the camera and the second stage system.
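The frame-selection step described above can be sketched as follows; the frame rate and sampling interval used here are illustrative values, since the actual interval is to be tuned by trial and error as noted.

```python
def pick_frame_indices(total_frames, fps, sample_interval_s):
    """Pick one frame index per sampling interval instead of
    processing every frame of the video."""
    step = max(1, int(fps * sample_interval_s))  # frames to skip per sample
    return list(range(0, total_frames, step))

# e.g. a 10-second clip at 30 fps, sampling one frame every 2 seconds
print(pick_frame_indices(300, 30, 2))  # [0, 60, 120, 180, 240]
```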
Figure 4.2: Use Case diagram of the second stage of the project
Figure 4.3: Activity Diagram of the project
The above activity diagram shows how the activities flow. The flow is simple and sequential, and is self-explanatory from the diagram: as described elsewhere, the system takes the footage, extracts frames, analyses the frames and produces the desired output.
4.2. External Interface Design
A simple GUI is provided to display the video, potential vehicles, the traffic density at a given instant and the average traffic density over a chosen period. The interface can also be used to load other video footage and background images for traffic density deduction. Other interfaces include the one between the computer and the camera, which supplies live footage to the software. Some sample GUI interfaces are shown below.
Figure 4.4: Sample GUI
4.3. Component Level Design
Figure 4.5: Deployment Diagram
The camera gives continuous video footage to the video preprocessor, where frames are extracted. These frames are then transferred to the video analyzer, where further processing such as background subtraction, contour analysis and traffic density calculation takes place. The GUI holds the output or display module; it is the interface between the user and the system, and it displays the camera footage along with the traffic density. Sample GUI interfaces are shown in the previous section.
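The three-stage flow above (camera to preprocessor to analyzer to display) might be organised along the following lines. The function names and the 1-D toy "frames" are illustrative assumptions only, not part of the actual implementation.

```python
# Illustrative staging of the deployment components; all names are assumed.

def preprocess(video_frames, step):
    """Stage 1: keep only every `step`-th frame of the raw footage."""
    return video_frames[::step]

def analyze(frames, background, threshold=30):
    """Stage 2: per-frame density via a simple pixel-difference count
    against a reference background (1-D frames for brevity)."""
    densities = []
    for frame in frames:
        diff = sum(1 for p, b in zip(frame, background) if abs(p - b) > threshold)
        densities.append(diff / len(frame))
    return densities

def display(densities):
    """Stage 3: the GUI would render the footage plus these numbers."""
    for i, d in enumerate(densities):
        print(f"frame {i}: density {d:.2f}")

footage = [[10, 10, 10, 10], [200, 10, 10, 10],
           [200, 200, 10, 10], [200, 200, 200, 10]]
background = [10, 10, 10, 10]
display(analyze(preprocess(footage, 2), background))
```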
4.3.1. Hardware and Software Requirements
The software used includes MATLAB and C++/Java, integrating their features to build the system in question. The hardware can be a simple computer capable of basic image/video processing. The target environment can be as simple as a Raspberry Pi, provided it can handle the processing in real time without undue delay or other complications. The target will, however, have to be equipped with a camera, and since a Raspberry Pi cannot handle a camera more complex than a simple webcam, which is insufficient for our purpose, a somewhat more powerful machine such as a home desktop with minimal specifications but a dedicated video card should be used to process the data efficiently.
5. PROJECT SCHEDULE AND MAJOR MILESTONES
Milestones | Deliverables | Target Date | Current Status
Project Initiation | Studying existing technologies and project planning | 07-Sep-2015 | Achieved
Project Design | Project analysis and design | 15-Oct-2015 | Achieved
Getting Approval | Presentation and submission of interim report | 21-Oct-2015 | Achieved
Software Implementation | Retrieving manageable data from video (21-Jan-2016); processing of data to get desired information (21-Feb-2016) | 01-Mar-2016 | Work-in-Progress
GUI Implementation and Testing | Providing a simple GUI for showing necessary information and testing the entire software | 01-Mar-2016 | Work-in-Progress
Table 5.1: Milestone Chart of the project
A brief description of the milestones, the basic outline of what must be done and when each item is expected to be completed is given in the table above.
Figure 4.2: Gantt-Chart of Phase I
The first phase of the project includes all the preliminary work: gathering data, studying previous related works, identifying improvements to be made and managing requirements, among other things.
The second half of the first phase deals with project analysis and design. Here, what the project is actually supposed to do and achieve is clearly stated, and the most efficient ways of achieving it are researched and simulated if needed. This information feeds into the design of the actual project. The timeline of the first phase is depicted by the Gantt-Chart above.
Figure 4.3: Gantt-Chart of Phase II
The second phase of the project involves implementation, testing and evaluation. Testing and implementation go hand in hand as far as alpha testing is concerned; however, most of the core testing is done in the latter part of implementation.
This project deals with video and image processing to deduce the traffic density of a given region, provided there are reliable reference images and an obstacle-free video feed under ambient light conditions. The initial parts of retrieving frames from the video and retrieving objects can therefore be done fairly easily in a couple of weeks' time. The processing that separates relevant objects from irrelevant ones, however, is a far more tedious task. Further testing and evaluation are done to make sure the requirements are fulfilled. The second phase of the project, which starts at the beginning of 2016, is depicted by the Gantt-Chart above.
6. RISKS AND CHALLENGES
The risks involved in this project include insufficient data and insufficient time, due to which the project may remain incomplete. However, a solid work plan and schedule have been established to ensure no such thing occurs.
The challenges include dealing with varying weather and low-visibility conditions. Low visibility, such as at night, will hinder this project, since detecting objects will be next to impossible unless algorithms are added to detect vehicles by their headlights, or specialized cameras such as night-vision-enabled or infrared cameras are used. Varying weather conditions can also affect the performance of the system drastically, as in the case of rain, where visibility deteriorates heavily.
7. GLOSSARY
UML: Unified Modeling Language
GUI: Graphical User Interface
GMM: Gaussian Mixture Models
8. BIBLIOGRAPHY
[1] N. Abbas, M. Tayyab, M. Tahir, "Real Time Traffic Density Count using Image Processing", International Journal of Computer Applications, Volume 83, December 2013.
[2] P. Srinivas, Y. L. Malathilatha, Dr. M. V. N. K. Prasad, "Image Processing Edge Detection Technique used for Traffic Control Problem", International Journal of Computer Science and Information Technologies, Vol. 4 (1), 2013.
[3] P. Niksaz, "Automatic Traffic Estimation Using Image Processing", International Journal of Signal Processing, Image Processing and Pattern Recognition, Vol. 5, No. 4, December 2012.
[4] D. Rosenbaum, J. Leitloff, F. Kurz, O. Meynberg, and T. Reize, "Real-Time Image Processing For Road Traffic Data Extraction From Aerial Images", ISPRS TC VII Symposium, 100 Years ISPRS, Vienna, Austria, July 5-7, 2010.