A Deep Learning Approach to Predict Accident Occurrence Based on Traffic Dynamics
Farnaz Khaghani
Thesis submitted to the Faculty of the
Virginia Polytechnic Institute and State University
in partial fulfillment of the requirements for the degree of
Master of Science
in
Computer Science and Application
Edward A. Fox, Chair
Farrokh Jazizadeh, Co-chair
Hoda M. Eldardiry
May 11, 2020
Blacksburg, Virginia
Keywords: Deep learning, LSTM, Bidirectional LSTM, Database management, Anomaly
detection
Copyright 2020, Farnaz Khaghani
A Deep Learning Approach to Predict Accident Occurrence Based on Traffic Dynamics
Farnaz Khaghani
(ABSTRACT)
Traffic accidents are a major concern for traffic safety; about 1.25 million deaths are reported
worldwide each year. Hence, it is crucial to have access to real-time data and to rapidly detect
or predict accidents. Accurately predicting the occurrence of a highway car accident any significant
length of time in advance is not feasible, since the vast majority of crashes occur due to
unpredictable human negligence and/or error. However, rapid traffic incident detection could
reduce incident-related congestion and secondary crashes, alleviate the waste of vehicles’ fuel
and passengers’ time, and provide appropriate information for emergency response and field
operation. While the focus of most previously proposed techniques is predicting the number
of accidents in a certain region, the problem of predicting accident occurrence or rapidly
detecting accidents has been little studied. To address this gap, we propose a deep
learning approach and build a deep neural network model based on long short term memory
(LSTM). We apply it to forecast the expected speed values on freeways’ links and identify
the anomalies as potential accident occurrences. Several detailed features such as weather,
traffic speed, and traffic flow of upstream and downstream points are extracted from big
datasets. We assess the proposed approach on a traffic dataset from Sacramento, California.
The experimental results demonstrate the potential of the proposed approach in identifying
the anomalies in speed value and matching them with accidents in the same area. We show
that this approach can achieve a high rate of rapid accident detection and be implemented
in real-time travelers’ information or emergency management systems.
A Deep Learning Approach to Predict Accident Occurrence Based on Traffic Dynamics
Farnaz Khaghani
(GENERAL AUDIENCE ABSTRACT)
Rapid traffic accident detection/prediction is essential for scaling down non-recurrent conges-
tion caused by traffic accidents, avoiding secondary accidents, and accelerating emergency
system responses. In this study, we propose a framework that uses large-scale historical
traffic speed and traffic flow data along with the relevant weather information to obtain
robust traffic patterns. The predicted traffic patterns can be coupled with the real traffic
data to detect anomalous behavior that often results in traffic incidents in the roadways.
Our framework consists of two major steps. First, we estimate the speed values of traffic at
each point based on the historical speed and flow values of locations before and after each
point on the roadway. Second, we compare the estimated values with the actual ones and
introduce the ones that are significantly different as an anomaly. The anomaly points are
the potential points and times that an accident occurs and causes a change in the normal
behavior of the roadways. Our study shows the potential of the approach in detecting
accidents, with promising performance in flagging an accident at a time close to its actual
time of occurrence.
Dedication
To my parents and sister for their unconditional love and support though they were
thousands of miles away from me.
Acknowledgments
I would like to express my sincere gratitude to my advisor, Professor Edward Fox, who has
the attitude and the substance of a genius, for his patience and continuous support. Without
his guidance and persistent help, this thesis would not have been possible. It has been my
utmost privilege to work with him.
My appreciation also extends to Professor Hesham Rakha for his insightful comments and
valuable feedback. His timely suggestions with kindness, enthusiasm and dynamism have
enabled me to complete my thesis. I would also like to thank my committee members, Pro-
fessor Farrokh Jazizadeh and Professor Hoda Eldardiry for their time and cooperation in the
completion of this thesis.
I would especially like to acknowledge and thank students in ‘CS 4624: Multimedia, Hy-
pertext, and Information Access’, Elias Gorine and Jacob Smethurst, for assistance with
preparation, conditioning, and processing the data as well as the development of an SQL-
based data management pipeline.
My sincerest appreciation and gratitude go to my parents and my sister Forough, for their
unfailing love and support, for encouraging me every single day to be a better person, and
for giving me wings to fly although they were thousands of miles away.
Contents

List of Figures

List of Tables

1 Introduction
    1.1 Motivation
    1.2 Problem
    1.3 Research Question
    1.4 Hypothesis
    1.5 Contribution
    1.6 Organization of Thesis

2 Motivating Works
    2.1 CS 4624: Multimedia, Hypertext, and Information Access
    2.2 CEE 5604: Traffic Characteristics and Flow

3 Review of Literature
    3.1 Traffic Accident Prediction Using Classical Techniques
    3.2 Deep Learning Models for Traffic Accident Prediction

4 Methodology
    4.1 Traffic Dynamics at the Time of an Accident
    4.2 Basic Deep Learning Concepts
        4.2.1 Recurrent Neural Network (RNN)
        4.2.2 Long Short Term Memory (LSTM)
        4.2.3 Bidirectional LSTM Recurrent Structure

5 Results
    5.1 Area of Study
    5.2 Data
        5.2.1 Traffic and Accident Data
        5.2.2 Weather Data
    5.3 Deep Learning Model
    5.4 Feature Engineering
    5.5 Results
    5.6 Anomaly Inference

6 Conclusions
    6.1 Conclusion
    6.2 Future Work
        6.2.1 Urban-scale Implementation
        6.2.2 Sensitivity Analysis for Setting the Threshold
        6.2.3 Additional Spatial Dependency

7 User Manual
    7.1 Software Requirement
    7.2 Repository Content

8 Developer Manual
    8.1 Data
    8.2 Database Implementation
        8.2.1 Database Management with SQL
        8.2.2 Database Indexes

Bibliography

Appendices

Appendix A Supplementary Results for Various Road Postmiles
List of Figures

4.1 The fundamental traffic diagrams according to Greenshield [32]
4.2 Position of traffic states at the fundamental diagram when an accident occurs
4.3 An example of time–space diagram for typical temporary capacity reduction (i.e., traffic accident) [14]
4.4 Graphic representation of LSTM gates [1]
4.5 The architecture of Bidirectional LSTM model [23]
5.1 The spatial extent of area of the study
5.2 PeMS homepage, available at http://pems.dot.ca.gov/
5.3 MesoWest weather data API map
5.4 The flow of data through the framework
5.5 Loss value for training for Postmile 517.916
5.6 Actual and prediction of speed values for Postmile 517.916
5.7 Histogram of loss value for training for Postmile 517.916
5.8 Anomaly points for test dataset for Postmile 517.916
5.9 Comparison between the number of actual incidents (reported by CHP) and detected anomaly events
8.1 Loading speed data using parameterized queries
A.1 Loss value for training for Postmile 508.463
A.2 Loss value for training for Postmile 510.293
A.3 Loss value for training for Postmile 511.543
A.4 Loss value for training for Postmile 513.503
A.5 Loss value for training for Postmile 515.173
A.6 Actual and prediction of speed values for Postmile 508.463
A.7 Actual and prediction of speed values for Postmile 510.293
A.8 Actual and prediction of speed values for Postmile 511.543
A.9 Actual and prediction of speed values for Postmile 513.503
A.10 Actual and prediction of speed values for Postmile 515.173
A.11 Histogram of loss value for training for Postmile 508.463
A.12 Loss value for training for Postmile 510.293
A.13 Loss value for training for Postmile 511.543
A.14 Loss value for training for Postmile 513.503
A.15 Loss value for training for Postmile 515.173
List of Tables

5.1 Example of traffic speed data
5.2 A sample of incident data available at PeMS and provided by California Highway Patrol (CHP)
5.3 A sample of weather data retrieved from MesoWest
5.4 Data statistics
5.5 A sample of anomaly points in anomaly DataFrame
5.6 A sample of anomaly events DataFrame
5.7 Variation of performance evaluation metrics for different Postmiles in the area of study
5.8 A sample of anomaly events DataFrame
5.9 Number of the anomaly events in each group
Chapter 1
Introduction
1.1 Motivation
Traffic accidents are a central concern in traffic safety. According to ASIRT (Association
for Safe International Road Travel), more than 38,000 people die annually in crashes and
car accidents on U.S. roadways. An additional 4.4 million are seriously injured and require
immediate medical attention. Traffic accidents and crashes are the dominant cause of death
in the U.S. for people aged 1-54. Reducing the response time of Emergency Medical
Technicians (EMT) to car crashes is key to increasing the survivability of a crash for those involved.
According to studies, counties across the United States with a response time of more than
12 minutes had a motor vehicle collision mortality rate nearly twice that of counties with
a response time of fewer than 7 minutes [5, 13]. For this reason, even a decrease in EMT
response time on the order of fractions of a minute can prove life-saving, especially in serious
collisions at highway speeds. If transportation officials in a state have any advance notice
or warning as to which areas of the state’s interstate highways are most likely to have an
accident at certain times of the day, a decrease in response time might be obtained.
Road traffic accident prediction or rapid detection plays a crucial role in safety manage-
ment and planning. Research on traffic safety has a long tradition as accidents on the roads
are one of the most fatal threats to people. Predicting possible traffic accidents can be a
solution to avoid accidents, reduce damage from them, give the drivers chances to reduce
the damage by quick response and reaction, or improve the emergency management system.
However, predicting the exact time and location of accidents is practically impossible. An
alternative strategy is to detect the occurrence of the accidents rapidly or identify abnormal
behavior that may lead to an accident. Early detection of accidents results in less delay
and inconvenience, faster emergency response, and faster announcements to users to take a
detour. A nationwide survey on the deployment of accident detection algorithms in Traffic
Management Centers (TMC) showed that 90% of survey respondents feel that the exist-
ing algorithms are not applicable in large-scale real-world systems due to the complicated
and time-consuming calibration or unacceptable false alarm rates [47]. In this study, the
main goal is to develop an accident detection model that can extract maximum information
from the traffic data to generate the normal travel pattern of each segment. Consequently,
anomalous behavior could be captured as a potential accident occurrence.
1.2 Problem
The task of accurate traffic forecasting in extreme conditions is difficult mainly due to the
complex nature of traffic accidents. Another problem with accident prediction/detection is
the scarcity of accidents in both space and time. Due to the limited number of samples, it
is challenging to precisely predict the occurrence of individual accidents. A large number of
existing works on traffic accident detection or prediction apply classical models such
as classification or regression on limited data. This leads to unsatisfactory performance. The
classification models mainly create a standard predicting model based on the information
learned from the training set. For example, some work employed classic classifiers to predict
if an accident will occur at a specific location during each time window [8, 12]. Another group
of studies focused on the prediction of the number of accidents in a specific area [3, 4, 36].
This could be more applicable for risk analysis and unsafe area identification rather than
decreasing the emergency response time.
The goal of this research is to establish a deep learning LSTM neural network model that can
address some of these problems. The deep learning approach provides automatic
representation learning from raw data. Instead of common classification and regression
approaches, we propose an anomaly detection approach based on the difference between
predicted and actual values. This eliminates the cumbersome process of labeling the data
as ‘accident’ and ‘non-accident’ and dealing with imbalanced data. Furthermore, the detected anomalous
data represents hazardous situations in addition to potential accidents. Several detailed fea-
tures such as traffic speed, traffic flow, and weather are used to train the predictive model
and identify the abnormal traffic dynamics that lead to an accident.
1.3 Research Question
Do deep learning structures help with the development of tools to predict traffic accidents
occurrence or detect them rapidly? What kind of approaches would fit better for accident
occurrence prediction/detection rather than predicting the number of accidents?
1.4 Hypothesis
With the aforementioned research questions in mind, we present the following hypothesis:
Development of a deep learning model coupled with anomaly detection would be efficient
for accident prediction/detection purposes.
1.5 Contribution
In this work, we propose a deep-learning-based anomaly detection approach to detect/predict
accidents on the roadways. We highlight our contribution as follows:
• We collect and fuse heterogeneous big datasets including weather, time, and traffic for
traffic accident prediction/detection.
• To address the spatial dependency of the traffic features and improve the accuracy of
the prediction, we pass the traffic feature sequences of upstream and downstream of
the target point to train the model accordingly.
• We focus on the detection/prediction of accidents and hazardous dynamics rather than
predicting the number of accidents in a region.
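As an illustration of this input construction, the sequences for one detector station can be assembled by windowing the station's speed series together with upstream and downstream measurements. This is a sketch under stated assumptions: the window length, the feature set, and the array layout below are illustrative, not the thesis's exact configuration.

```python
import numpy as np

def build_sequences(speed, flow_up, flow_down, window=12):
    """Stack the previous `window` observations of local speed plus
    upstream/downstream flow into one training sample per time step.
    Inputs are equal-length 1-D time series; returns inputs of shape
    (n_samples, window, 3) and speed targets of shape (n_samples,)."""
    X, y = [], []
    for t in range(window, len(speed)):
        features = np.stack([speed[t - window:t],
                             flow_up[t - window:t],
                             flow_down[t - window:t]], axis=1)
        X.append(features)          # one (window, 3) sample
        y.append(speed[t])          # next speed value to predict
    return np.array(X), np.array(y)
```

Time-of-day, day-of-week, and weather columns would be appended along the last axis in the same way.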
1.6 Organization of Thesis
The rest of this document is organized as follows: Chapter 3 includes a review of literature
that discusses research into predictive models, and into the needs of projects which could
leverage a framework like this, to help motivate the design of the framework. Chapter 4 gives
an outline of the design and architecture of our proposed model, including a brief discussion
of the functionality of each of the provided modules. Chapter 5 discusses the implementation
of the model on the traffic and accident data collected from the California PeMS system, and
evaluates the success of the developed model. Chapter 6 summarizes the previous sections,
discusses the findings, and presents proposals for future work. We have provided a user and
developer manual in Chapter 7 and Chapter 8, respectively.
Chapter 2
Motivating Works
This research was inspired by CEE 5604 at Virginia Tech. A part of this work was con-
ducted in collaboration with Jacob Smethurst and Elias Gorine as partial fulfillment of the
requirements of CS 4624 at Virginia Tech. In this section, we present a description of these
two courses to illustrate the relation to this project.
2.1 CS 4624: Multimedia, Hypertext, and Information
Access
CS 4624 at Virginia Tech is a class which uses a project-based learning approach to teach
students the architectures, concepts, data, hardware, methods, models, software, standards,
structures, technologies, and issues involved with: networked multimedia (e.g., image, audio,
video) information, access, and systems; hypertext and hypermedia; electronic publishing;
and virtual reality. The project-based learning approach makes use of a single, semester-
long project to guide students through the learning process. The class normally functions
by grouping the students into teams, with each team being responsible for a small project
related to the overall course goal. These teams are assigned to a client and will have to work
together under the guidance of Dr. Fox and the class GTAs to come up with a way to use
the resources provided to accomplish an overarching, semester-long project goal.
2.2 CEE 5604: Traffic Characteristics and Flow
The goal of the course is to provide a background in traffic flow theory for the analysis
of controlled and uncontrolled roadway facilities. The course focuses on traffic flow theory
as it relates to vehicle steady-state longitudinal motion, behavior during non-steady states
(deceleration and acceleration), traffic stream models, heterogeneous traffic stream flow,
lane-gap acceptance and lane changing modeling, and the estimation of delay upstream of
moving and stationary bottlenecks. The requirements of this course are partially fulfilled
through a final project that applies traffic theory concepts to real-world case studies; that
project formed the skeleton of this thesis.
Chapter 3
Review of Literature
To date, several studies have deployed various data sources or a combination of them
for detection of accidents. In this chapter, we review the studies that have utilized classical
machine learning techniques and deep learning models for accident detection.
3.1 Traffic Accident Prediction Using Classical Tech-
niques
There have been numerous studies to investigate methods for classifying spatial units (e.g.,
road segments) at a given time into classes of ‘accident’ and ‘no accident’. Statistical
techniques, image processing [25], pattern recognition, and artificial intelligence methods
[3, 12, 37] have been widely used to address the accident detection problem. For example,
Chang et al. [8] developed a decision tree model to build a classifier that predicts accidents
with training and testing accuracy of 55%. Lin et al. [30] explored multiple machine learning
techniques such as Random Forest, K-Nearest Neighbor, and Bayesian Network, to predict
accidents along the roadways. Yuan et al. [50] evaluate the performance of Support Vector
Machine (SVM), Decision Tree, Random Forest, and Deep Neural Network (DNN) in pre-
dicting and classifying the accidents on roadways. Caliendo et al. [6] employed the Poisson,
Negative Binomial, and Negative Multinomial regression models for the task of predicting
the number of accidents in multi-lane roadways.
Theofilatos [44] applied Random Forest and Bayesian logistic regression models to the
real-time traffic data of urban arterial roads to study the likelihood of road accidents.
Hebert et al. [20] explored various machine learning methods to manage the class imbal-
ance inherent in accident prediction problems. The authors employed the Balanced Random
Forest algorithm, a variant of the Random Forest machine learning algorithm in Apache
Spark. The results from the experimental case study show that 85% of vehicle collisions are
detected.
Among machine learning techniques, Probabilistic Neural Network (PNN) and Support Vec-
tor Machine (SVM) are two important techniques that have been used to detect accidents
[48]. Studies of SVM application in accident detection and prediction are well documented.
It is acknowledged that SVM models provide a higher correct detection rate and lower false
alarm rate when compared to probabilistic neural network models [50]. For example, Li et
al. [29] assessed the application of SVM models for predicting vehicle crashes, and compared
the performance of SVM models with the Back-Propagation Neural Network (BPNN). It has
been shown that SVM does not have the over-fitting problem that often occurs in BPNNs
and is faster to implement for the specific purpose of accident prediction.
More recently, eXtreme Gradient Boosting (XGBoost) has been leveraged to predict the
occurrence and duration of an accident by Meng [33]. Some authors [34, 42] have also shown
that XGBoost shows a better performance in prediction of the likelihood of an accident
when compared to methods like Logistic Regression, Bayesian Regularized Neural Network,
Bagging Average Neural Networks, and Gradient Boosting.
All of the above works use limited features and small-scale traffic accident data
(e.g., one or a small number of roads). Increasing the number of features to improve the
accuracy and expand the spatial scale of the analysis may lead to unsatisfactory performance
and high computational cost. To address these problems, some recent works have used deep
learning approaches.
3.2 Deep Learning Models for Traffic Accident Predic-
tion
A series of recent studies have employed deep learning methods for traffic accident inference.
For example, Convolutional Neural Networks (CNN) have been coupled with vision-based
data (e.g., facial features such as eye movement and blink rates) to detect drivers’ distrac-
tion [26], drowsiness [16], and fatigue [31]. As CNNs work well with vision-based data, they
have been widely used for predicting and analyzing accidents when images and videos are
involved. Shah et al. [43] utilized CNN models to explore and investigate accidents using
the data from closed-circuit television traffic cameras. In another study, Najjar et al. [35]
trained a CNN using historical accident data and satellite images to predict the risk of ac-
cidents on an intersection where they achieved an accuracy of 73%.
Recurrent Neural Networks (RNNs) show promise to work well with sequential data like
time-series. They have also been leveraged for traffic accident prediction thanks to their
generally high performance and the availability of time-series data [46]. For example, Ren
et al. [39] proposed a deep learning approach (RNN) to predict traffic accident risk, where
risk is defined as the number of accidents in a region at a certain time. Chen et al. [10] used
a similar concept of traffic accident risk and developed an Autoencoder deep architecture to
understand the impact of human mobility on traffic accident risk.
More recently, Yuan et al. [51] used the Convolutional Long Short-Term Memory (ConvL-
STM) to predict the number of accidents in a region based on the spatial structure of a road
network, weather information, and volume of traffic. Multiple heterogeneous data have been
collected and integrated using satellite images, traffic camera data, roadway weather
information system data, and rainfall data. However, it still focuses on the prediction of frequency
and number of accidents rather than the time of occurrence. Identifying the frequency and
number of accidents could generate useful information for safety analysis. However, for an
efficient emergency management system, predicting the occurrence of an accident or
detecting the accident promptly could provide more beneficial information. This study
addresses this gap by employing a deep learning model.
Chapter 4
Methodology
In this section, we first look at traffic fundamentals and how an accident impacts the
dynamics of traffic flow. The principles and fundamentals of traffic theory help us to better
understand and interpret the input features and structure of our model. Next, we review
the basic deep learning models that are going to be used for this study.
4.1 Traffic Dynamics at the Time of an Accident
Traffic accidents are one of the important sources of traffic jams. Accidents cause a temporary
local reduction of capacity. To explain the change in the traffic parameters, we need to look
at the triangular fundamental diagram (Figure 4.1). The fundamental diagram of traffic
flow represents the relation between the traffic features (i.e., flow, speed, and density).
As presented in Figure 4.2, when an accident occurs, the traffic moves from an uncongested
state (point A) to a congested state (point B). This change in the states affects the speed
and flow of the vehicles. In other words, it is going to create a shock-wave that will form
a queue after the bottleneck (i.e., accident location). This phenomenon is often shown in
the space-time diagram and will create a draw-up draw-down cycle in the speed-time graph.
Figure 4.3 illustrates the concept of shock-wave and how the speed of the vehicles is going
to change when the shock-wave happens. In normal cases (i.e., non-accident), the traffic
conditions do not vary significantly in sequences of time series between the upstream and
Figure 4.1: The fundamental traffic diagrams according to Greenshield [32]
downstream. On the other hand, traffic conditions between the upstream and downstream
fluctuate rapidly when an accident occurs. This fluctuation is a result of the shock-waves
caused by the accident. Mathematically, the speed of a shock-wave (i.e., the speed at which
congestion travels backward from the temporal bottleneck formed because of the accident)
can be derived from the traffic characteristics (i.e., flow rate and density) of the upstream
and downstream. Hence, the change in the speed dynamics when an accident occurs could
be observed more significantly at the road sections after the accident location [41]. That
being said, to detect or predict an accident, we should look for the anomalies where the
queue is formed (backward from the accident location). However, since the loop detectors
(the main source of traffic data in this study) are spaced roughly 0.1 miles apart,
some anomalies may be observed in the upward direction as well. This information about
Figure 4.2: Position of traffic states at the fundamental diagram when an accident occurs
the general dynamics of traffic at the time of an accident could enhance our understanding
of the anomaly points and how they should be interpreted.
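The shock-wave speed alluded to above follows from conservation of vehicles across the interface between two traffic states: it is the slope (q_d − q_u)/(k_d − k_u) of the line joining the two states on the flow-density fundamental diagram. A small sketch; the flow and density numbers are made-up illustrations, not measurements from the study area.

```python
def shockwave_speed(q_up, k_up, q_down, k_down):
    """Shock-wave speed between an upstream state (flow q_up in veh/h,
    density k_up in veh/mi) and a downstream state. A negative result
    means the wave propagates backward, i.e., the queue grows upstream
    against the direction of travel."""
    return (q_down - q_up) / (k_down - k_up)

# Free flow (1800 veh/h at 30 veh/mi) meeting a congested state at the
# accident (600 veh/h at 150 veh/mi): the wave moves at -10 mi/h.
w = shockwave_speed(1800.0, 30.0, 600.0, 150.0)
```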
4.2 Basic Deep Learning Concepts
In this section, we introduce the basic concepts of neural networks and deep learning
architectures.
4.2.1 Recurrent Neural Network (RNN)
A recurrent neural network is a neural network that contains at least one feedback loop. In
other words, the connections between nodes form a directed graph along a temporal
Figure 4.3: An example of time–space diagram for typical temporary capacity reduction (i.e., traffic accident) [14]
sequence [52]. If the input vector at timestamp t is denoted as x_t, the hidden layer vector
as h_t, the weight matrices as W_h and U_h, and the bias as b_h, then o_t is the output
sequence, which is a function of the current hidden state. The RNN iteratively computes the
hidden layer and output using the following recursive procedure:

h_t = σ(W_h x_t + U_h h_{t-1} + b_h)    (4.1)

and,

o_t = σ(W_o h_t + b_o)    (4.2)

where W_o and b_o denote the weight matrix and bias for the output, respectively.
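Equations 4.1 and 4.2 amount to one matrix-vector recurrence per time step. A minimal NumPy sketch (the dimensions are arbitrary, and σ is taken to be the logistic sigmoid, as in the text):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_step(x_t, h_prev, W_h, U_h, b_h, W_o, b_o):
    """One RNN time step: Eq. 4.1 updates the hidden state from the
    current input and the previous hidden state; Eq. 4.2 maps the
    hidden state to the output."""
    h_t = sigmoid(W_h @ x_t + U_h @ h_prev + b_h)  # Eq. 4.1
    o_t = sigmoid(W_o @ h_t + b_o)                 # Eq. 4.2
    return h_t, o_t
```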
4.2.2 Long Short Term Memory (LSTM)
LSTM is a special type of RNN that makes it easier to remember past data in memory
and avoids the vanishing gradient problem of RNNs [17]. LSTMs are trained using
back-propagation. An LSTM can remove or add information to the cell state, which allows
information to flow along the network, and regulates it using the ‘gate’ concept. In
an LSTM network, three gates are present: 1) Input gate (i_t), 2) Forget gate (f_t), and 3)
Output gate (o_t) (Figure 4.4). The LSTM architecture is specified as follows [49]:

i_t = σ(W_i x_t + U_i h_{t-1} + b_i)    (4.3)

c_t = tanh(W_c x_t + U_c h_{t-1} + b_c)    (4.4)

f_t = σ(W_f x_t + U_f h_{t-1} + b_f)    (4.5)

o_t = σ(W_o x_t + U_o h_{t-1} + b_o)    (4.6)

s_t = s_{t-1} ∘ f_t + c_t ∘ i_t    (4.7)

h_t = s_t ∘ o_t    (4.8)
where h_t denotes the hidden state, s_t denotes the cell state at time t, and ∘ denotes the Hadamard
product [21]. The gates can learn which data in a sequence are most important to keep or throw
away. With this consideration, the gates can pass relevant information down the long chain of
sequences for predictions. This architecture makes LSTMs well suited for time-series prediction.
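Equations 4.3–4.8 can be written out as a single NumPy step. This sketch follows the text's formulation literally, including h_t = s_t ∘ o_t in Eq. 4.8 (a common variant applies tanh to the cell state first); the parameter names mirror the symbols above, and the dimensions are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, s_prev, p):
    """One LSTM time step per Eqs. 4.3-4.8. `p` maps the symbol names
    used in the text (W_i, U_i, b_i, ...) to NumPy arrays."""
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])  # Eq. 4.3, input gate
    c_t = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev + p["b_c"])  # Eq. 4.4, candidate
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])  # Eq. 4.5, forget gate
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])  # Eq. 4.6, output gate
    s_t = s_prev * f_t + c_t * i_t   # Eq. 4.7, elementwise (Hadamard) products
    h_t = s_t * o_t                  # Eq. 4.8, as written in the text
    return h_t, s_t
```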
Figure 4.4: Graphic representation of LSTM gates [1]
4.2.3 Bidirectional LSTM Recurrent Structure
Bidirectional LSTMs build on traditional LSTMs and were introduced to improve model
performance on sequence classification problems [24]. The arrangement of the LSTM
memory block enables the network to store and retrieve information over long periods (Figure
4.5). One drawback of the standard LSTM networks is that they only have access to the
previous context but not to future context. In problems where all time steps of the input
sequence are available, Bidirectional LSTMs train two LSTMs instead of one on the input sequence (i.e., one on the sequence as-is and one on a reversed copy of it). This results in faster and even fuller learning on the problem [19].
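The two-pass idea can be sketched with a plain recurrent step in NumPy. The tanh cell and the sizes are illustrative stand-ins; a Bidirectional wrapper does the same with LSTM cells:

```python
import numpy as np

def run_rnn(seq, W, U, b):
    """Run a simple tanh RNN over a sequence; return the final hidden state."""
    h = np.zeros(U.shape[0])
    for x in seq:
        h = np.tanh(W @ x + U @ h + b)
    return h

def bidirectional_encode(seq, fwd, bwd):
    """Two recurrent passes: one over the sequence as-is and one over a
    reversed copy; the final states are concatenated."""
    h_f = run_rnn(seq, *fwd)
    h_b = run_rnn(seq[::-1], *bwd)
    return np.concatenate([h_f, h_b])

rng = np.random.default_rng(2)
mk = lambda: (rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4))
seq = rng.normal(size=(5, 3))
encoding = bidirectional_encode(seq, mk(), mk())   # shape (8,)
```

The backward pass is what gives the model access to "future" context at each point, which is valid here because the entire input window is available before prediction.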
In our framework, for each location on the roadways (denoted by stations where the loop
detectors are located and collect the speed and flow data), we construct a Bidirectional
LSTM model. The input X is the historical value of the dependent variables (i.e., speed
and flow values of upstream and downstream points). Furthermore, the time components
that influence the traffic conditions (i.e., time of the day, day of the week, and day of the
month) will be added to the input vector. Finally, the weather information collected for
each timestamp will form the additional input features. The bidirectional LSTM is a good
fit for speed prediction as the LSTM can potentially capture temporal autocorrelation in the
data. Once the model is trained based on the historical data, the speed values are estimated.
Thereafter, the anomalous behavior can be classified by setting a threshold for loss values
and examining the actual traffic data with the corresponding pattern.
Figure 4.5: The architecture of Bidirectional LSTM model [23]
Chapter 5
Results
5.1 Area of Study
We choose a 20-mile section of freeway I-5 N in the Sacramento area, as the study area
(Figure 5.1). According to the California Office of Traffic Safety, there are over 3,000 traffic
accidents per year in Sacramento that result in death or serious injuries. We looked at traffic and accident data for the 6 months from July to December 2018. All the data we collected about Sacramento are described below.
5.2 Data
5.2.1 Traffic and Accident Data
Traffic flow data used for empirical assessments was provided by the California Department
of Transportation (Caltrans) Performance Measurement System (PeMS) [9]. PeMS gets its
data from ITS, Vehicle Detector Stations (VDS), traffic counters (e.g., traffic census stations and weigh-in-motion (WIM) sensors), and other data sets like California Highway Patrol (CHP) incident data, the Caltrans Photolog, etc. Caltrans PeMS consists of 18,300 detector stations and collects traffic data every 30 seconds. To account for possible malfunction of detectors and sensors, PeMS uses a process called data imputation to compile 30-second
Figure 5.1: The spatial extent of area of the study
data sets without any gaps and aggregate them into 5-minute increments [45]. PeMS is a
real-time Archive Data Management System (rt-ADMS) that collects, stores, and processes
raw data in real-time [9]. An advantage of using PeMS compared to raw inductive loop or
sensor data is that the PeMS platform and algorithms manage the data fusion and most
of the pre-processing and cleaning of the data. However, the drawback is the aggregation of data into 5-minute increments, which decreases the temporal resolution of the dataset. As our analysis is mainly at the macro scale (macroscopic behavior of traffic), the aggregation may not significantly impact the results.
PeMS uses Postmiles to measure locations on state highways. The system uses two types
Table 5.1: Example of traffic speed data

Time  Postmile (Abs)  Postmile (CA)  VDS      Agg Speed  # Lane Points  % Observed
0:00  3.335           3.425          1118352  68.1       4              100
0:00  2.56            2.65           1114720  70.4       4              100
0:00  2.195           2.285          1118348  67.3       4              100
0:00  1.143           1.233          1118333  67.5       4              100
0:00  0.22            R.31           1114091  71.1       6              100
0:05  533.515         3.57           317377   67.6       2              0
0:05  530.517         0.572          316096   67.6       2              0
0:05  524.95          29.657         317843   67.6       3              0
0:05  524.193         28.9           318236   67.6       3              0
0:05  523.39          28.097         315927   67.6       4              0
0:05  523.247         27.954         315969   67.6       4              0
0:05  520.744         25.451         315054   67.6       4              0
0:05  519.874         24.581         318632   67.6       4              0
of Postmiles and includes both in the dataset. The jurisdictional (Caltrans) Postmiles are
assigned to physical boxes and geometric features on freeways when they are built. Absolute
Postmiles reflect the actual distance along a freeway from its beginning to its terminus. PeMS
uses absolute Postmiles to compute the distance between detectors. With this definition,
the absolute Postmile is a unique value for each freeway and will be used as the location
indicator of road sections in our analysis. Road sections are defined as the section of the
road between two correctly working stations (Postmile) [15].
Traffic speed and flow data are grouped by day and interstate highway. For each day of data, seven pieces of data per Postmile are included (Table 5.1): Time, Absolute Postmile, Caltrans Postmile, Vehicle Detector Station (VDS) ID, Aggregate Speed/Flow, Number of Lane Points, and Percent Observed.
• The Time value begins at 00:00, representing 12:00 AM, and continues throughout the
day, using a 24-hour clock.
• The Absolute Postmile is a measure of the location of the reading, using the statewide
Figure 5.2: PeMS homepage, available at http://pems.dot.ca.gov/
mile markers for the particular highway.
• The Caltrans Postmile is a more convoluted measure of the location of the reading; it
measures the distance from the county line of the county in which the reading occurs.
• The Vehicle Detector Station, or VDS, is the unique identifier of the station to which
the loop detector belongs.
• The Aggregate Speed/Flow value is the average of the speeds/flows of vehicles passing
over the loop detector.
Table 5.2: A sample of incident data available at PeMS and provided by California Highway Patrol (CHP)

Incident ID  Start Time      Duration (mins)  Freeway  CA PM    Abs PM  Area        Location                   Description
18258118     10/01/18 00:05  62               I-5 N    R17.865  680.5   Redding     I5 N / Twin View Blvd Ofr  1182-Trfc Collision-No Inj
18258129     10/01/18 00:23  210              I-5 N    26.567   143.2   Altadena    I5 N / Ca134 W Con         CZP-Assist with Construction
18258138     10/01/18 00:33  38               I-5 N    25.067   141.7   Central LA  I5 N / So Colorado Blvd    Hit and Run No Injuries
• The number of lane-points indicates the number of detector data points used to make
the computation.
• The percent observed indicates how much of the data is observed (actual data received that met all diagnostic tests) as opposed to imputed. The percent observed is very useful in determining the quality of the data.
For this project, the most important pieces of data for each speed reading are the Time
value, the Absolute Postmile, and the Aggregate Speed value. Traffic flow data is organized
in almost the same way, with the main difference being the inclusion of an Aggregate Flow
value instead of an Aggregate Speed value.
The traffic incident data, also available on PeMS, has 10 pieces of data per entry (see Table 5.2): Incident ID, Start Time, Duration, Freeway, Caltrans Postmile, Absolute Postmile, Source, Area, Location, and Description.
• The Incident ID is a unique identifier for the incident.
• The Start Time is a 24-hour clock timestamp of the form MM/DD/YY HH:MM that
represents when the incident occurred.
• The Duration is a value in minutes that represents the duration of the incident that
occurred, as assessed by the police department reporting the incident.
• The Freeway is the name and travel direction of the interstate highway, for example,
“I5-N”.
• The Caltrans Postmile and Absolute Postmile are measures of the location of the incident, using the same conventions as discussed for the traffic speed and flow data.
• The Source is typically “CHP”, which represents the California Highway Patrol.
• The Area is the county in which the incident took place.
• The Location includes the information from the Freeway value, as well as the nearest
cross-street.
• The Description is a text description of the incident that also includes a four-digit
incident code. For example, the entry for a traffic collision with no injuries is “1182-
Trfc Collision-No Inj”.
For this project, the most important pieces of data for each incident report are the Start
Time, Duration, Freeway, Absolute Postmile, and Description. For each of these data types
(traffic speed, flow, and incidents) contained in a CSV file, we begin our data preprocessing
by loading each into a table in an SQLite3 database. After using Python’s CSV reader
capabilities to create objects of each type of data point, we use SQLite3’s helpful Python
library to insert these data points into a raw database.
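The loading step can be sketched as follows. The table name and column layout here are hypothetical, mirroring the fields of Table 5.1; the actual DataLoader.py schema may differ:

```python
import csv
import sqlite3

def load_speed_csv(csv_path, db_path):
    """Read a speed CSV and insert its rows into an SQLite3 table.

    Assumed column order: time, abs postmile, CA postmile, VDS id,
    aggregate speed, number of lane points, percent observed.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS speed (
        time TEXT, abs_pm REAL, ca_pm TEXT, vds INTEGER,
        agg_speed REAL, lane_points INTEGER, pct_observed INTEGER)""")
    with open(csv_path, newline="") as f:
        rows = [tuple(r) for r in csv.reader(f)]
    conn.executemany("INSERT INTO speed VALUES (?,?,?,?,?,?,?)", rows)
    conn.commit()
    conn.close()
```

`executemany` with parameter placeholders is the idiomatic way to bulk-insert CSV rows without string formatting.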
5.2.2 Weather Data
The weather data is collected from the MesoWest database [22]. MesoWest is a cooperative project first developed at the University of Utah in 1996. The MesoWest project provides access to current and archived weather observations across the United States. These
Table 5.3: A sample of weather data retrieved from MesoWest

Date Time             Air Temperature (Celsius)  Wind Speed (m/s)  Rainfall (mm)  Weather Condition
2018-10-02T13:53:00Z  17.8                       2.57              0.025          Clear
2018-10-02T13:55:00Z  18                         2.57              0              Mostly Clear
2018-10-02T14:35:00Z  18                         5.14              0              Partly Cloudy
2018-10-04T00:53:00Z  21.7                       4.12              0.254          Rain
data are available through the traditional suite of web products and an API service. MesoWest data can be downloaded directly from the website. Figure 5.3 shows the weather stations for which data is available in the Sacramento area.
Figure 5.3: MesoWest weather data API map
For each weather monitoring station, different variables are available. This dataset includes wind features (e.g., speed, gust, or direction), temperature, cloud layer, weather condition, pressure, altimeter, and many more.
For this study, we use the rainfall, temperature, wind speed, and general weather condition.
Table 5.3 presents an example of the weather data we used as the input of our model. The
flow of data is shown in Figure 5.4. A summary of the data statistics is presented in Table
5.4. It should be noted that there were some missing data in the flow dataset. Before training the model, the timestamps (data points) with missing flow values were excluded.
Figure 5.4: The flow of data through the framework
Table 5.4: Data statistics

Feature      Count  Mean    Standard Deviation
Speed        43617  63.88   4.40
Flow         43513  111.54  61.52
Temperature  43617  19.30   7.61
Rainfall     43617  0.29    0.46
Wind Speed   43617  2.37    1.83
5.3 Deep Learning Model
In an effort to build a model for accident prediction/detection, we construct a neural network based on a series of combinations of deep learning primitives. This deep learning architecture allows us to predict future speed values based on the historical upstream and downstream speed and flow data. The input features are selected considering the dynamics of traffic after an accident occurs and based on the fundamentals of traffic engineering and flow theories. Since we are dealing with time series and previous states are important in the prediction, we employed an LSTM architecture as the baseline of our model. In order to take the spatial dependencies of road sections into account, we passed
downstream and upstream information as input features when training our model. To increase the training rate, we selected the bidirectional wrapper, as it is a better fit for our purpose.
5.4 Feature Engineering
Preparing the data for time-series forecasting (LSTMs in particular) can be tricky. Intuitively, we need to predict the value at the current time step by using the history (the n time steps before it). The temporal resolution of the traffic data is 5 minutes, and we chose 5 time steps to make the sequences. Hence, the model looks at the 25 minutes before each point during training. In our experiment, we select the traffic flow and speed of the past 25 minutes, which is a time sequence of 5 data points, for 5 stations on the upstream and 5 stations on the downstream to predict the coming traffic speed. We select 80% of the data to
train our model. Additionally, the weather features (i.e., temperature, rainfall, wind speed,
and weather condition) are added to the input vector.
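The windowing described above can be sketched as follows. The number of feature columns (24) and the random data are illustrative, not the exact layout used in this work:

```python
import numpy as np

def make_sequences(features, target, n_steps=5):
    """Slide an n_steps window (25 minutes at 5-minute resolution) over the
    feature matrix; each sample predicts the target at the next timestamp."""
    X, y = [], []
    for t in range(n_steps, len(features)):
        X.append(features[t - n_steps:t])
        y.append(target[t])
    return np.array(X), np.array(y)

# Toy example: 100 timestamps, 24 feature columns (standing in for the
# speed/flow of upstream and downstream stations, time components, weather).
rng = np.random.default_rng(3)
feats = rng.normal(size=(100, 24))
speed = rng.normal(loc=65, scale=5, size=100)
X, y = make_sequences(feats, speed)
train = int(0.8 * len(X))            # 80% of the data for training
X_train, y_train = X[:train], y[:train]
```

The resulting X has shape (samples, 5, 24), which is the (batch, timesteps, features) layout that recurrent layers expect.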
5.5 Results
After conditioning the data and building the time sequences, the bidirectional LSTM model is built using the Keras [11] framework on top of TensorFlow [2], using the Adam optimizer [27] with mean squared error as the loss function. The model was trained for 10 epochs with a batch size of 32. We used the Google Colab environment to run the analysis
on a GPU.
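A minimal Keras sketch of such a model is shown below. The layer width (64 units) and input shape are illustrative assumptions, as the exact hyperparameters are not restated here; the optimizer, loss, epochs, and batch size follow the setup above:

```python
import tensorflow as tf

def build_model(n_steps=5, n_features=24):
    """Bidirectional LSTM regressor predicting a single speed value."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_steps, n_features)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(1),      # predicted speed
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")
    return model

model = build_model()
# Training as in this section (X_train/y_train from the windowing step):
# model.fit(X_train, y_train, epochs=10, batch_size=32)
```

The Bidirectional wrapper trains forward and backward LSTMs and concatenates their outputs before the dense regression head.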
In this section, we present the results of the analysis. The prediction and anomaly detection have been done for 6 different locations (with an average distance of 1 mile between each location). We present the results for one sample location in this section. However, the graphs and results for other locations are presented in the Appendix for further demonstration.
Figure 5.5 shows the loss values after training the model for 10 epochs. It can be seen that the model learns at a satisfactory rate. It should be noted that with more epochs the model tended to overfit. We selected 10 epochs to avoid overfitting and better capture the anomalies in future steps. An example of how well the model predicts
Figure 5.5: Loss Value for training for Postmile 517.916
the speed values is presented in Figure 5.6. Since traffic speed shows a regular pattern in normal and undisrupted situations, it is expected that the model does not capture the extreme points as well as the other points. Indeed, the extreme points are the abnormal points of the data set, which potentially are formed due to a disruption to the traffic flow (i.e., accidents,
hazards, or road closures). In order to find the extreme points and investigate the potential
for describing the accidents, we calculate the Mean Absolute Error (MAE) on the training
data. The idea is to find the points where the actual value is significantly different from the predicted one. Figure 5.7 shows the histogram of loss values for the training data. We picked a threshold of 1.5, as not much of the loss is larger than that. Using the threshold, we can
Figure 5.6: Actual and prediction of speed values for Postmile 517.916
Table 5.5: A sample of anomaly points in the anomaly DataFrame

Time          Loss        Threshold  Anomaly
11/1/18 6:30  4.16351712  1.5        True
11/1/18 6:35  6.4858084   1.5        True
11/1/18 6:40  7.89771065  1.5        True
11/1/18 6:45  8.5084442   1.5        True
11/1/18 6:50  8.67621649  1.5        True
11/1/18 6:55  8.83714712  1.5        True
11/1/18 7:00  8.66085566  1.5        True
turn the problem into a simple binary classification task:
• If the reconstruction loss for an example is below the threshold, we will classify it as a
normal speed,
• Alternatively, if the loss is higher than the threshold, we will classify it as an anomaly.
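The thresholding step above can be sketched as follows, with toy speed values chosen for illustration:

```python
import numpy as np

def classify_anomalies(y_true, y_pred, threshold=1.5):
    """Per-timestamp absolute error compared against a fixed loss threshold;
    points whose loss exceeds the threshold are flagged as anomalies."""
    loss = np.abs(y_true - y_pred)
    return loss, loss > threshold

# Two depressed speed readings stand in for a disruption to the traffic flow.
y_true = np.array([65.2, 64.8, 55.0, 48.3, 64.9])
y_pred = np.array([65.0, 65.1, 63.8, 62.0, 64.7])
loss, is_anomaly = classify_anomalies(y_true, y_pred)
```

Here the two readings where the model's prediction misses badly are flagged, while the well-predicted points are classified as normal speeds.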
For each location on the roadway, a DataFrame including the anomaly points is generated. Table 5.5 presents an example of the anomaly points. We then calculate the MAE values for the test data. We build a DataFrame containing the loss and the anomalies (values above the
Figure 5.7: Histogram of loss value for training for Postmile 517.916
Table 5.6: A sample of the anomaly events DataFrame

ID  Start time    End time
1   11/1/18 6:30  11/1/18 8:25
2   11/5/18 7:20  11/5/18 9:15
3   11/6/18 7:00  11/6/18 9:25
threshold). Figure 5.8 shows the anomalies found in the test data. The red dots (anomalies)
are mostly located at the extreme values of speed. In typical anomaly detection problems, one point may suffice to describe the anomaly event. However, in our case and due to the temporal resolution of the data, a single anomaly point may be noise or unrelated to any specific extreme situation. As can be seen in Figure 5.8, in many cases anomaly points are located on a draw-down and draw-up cycle similar to the ones in Figure 4.3. This observation appears consistent with the dynamics of traffic when a disruption like an accident occurs. Hence, we aggregated the anomaly points that formed a cycle into one anomaly event. The results have been stored in a secondary DataFrame for interpretation. A sample of this DataFrame is presented in Table 5.6. We further removed the anomaly points that were not followed by any consecutive point and did not form a cycle, as they might indicate noise or random
Figure 5.8: Anomaly points for test dataset for Postmile 517.916
fluctuation in speed values. Once the anomaly events have been produced, the connection
with accidents could be explored.
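The aggregation of consecutive anomaly points into events (and the removal of isolated points) can be sketched as follows, assuming the 5-minute resolution of the data:

```python
from datetime import datetime, timedelta

def aggregate_events(timestamps, gap=timedelta(minutes=5)):
    """Group anomaly points spaced one 5-minute step apart into events and
    drop isolated single points, which may be noise."""
    events, run = [], [timestamps[0]]
    for t in timestamps[1:]:
        if t - run[-1] <= gap:
            run.append(t)
        else:
            if len(run) > 1:                 # keep only multi-point cycles
                events.append((run[0], run[-1]))
            run = [t]
    if len(run) > 1:
        events.append((run[0], run[-1]))
    return events

# Five consecutive anomaly points followed by one isolated point.
pts = [datetime(2018, 11, 1, 6, 30) + timedelta(minutes=5 * i) for i in range(5)]
pts += [datetime(2018, 11, 1, 9, 0)]         # singleton, treated as noise
events = aggregate_events(sorted(pts))
```

The run of points from 6:30 to 6:50 is merged into a single event, matching the event structure of Table 5.6, while the lone 9:00 point is discarded.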
Considering the spatial heterogeneity, we present the results for 6 Postmile stations in the
region of study. For each station, we looked at the number of reported incidents (traffic
collisions) by CHP for 5 miles on the upstream and 5 miles on the downstream. The results
are presented in Figure 5.9. The results show that the number of detected anomalies is
close to the number of accidents within that area. The lower number of detected anomalies compared to the actual accident reports is in line with real-world accident situations. In reality, not all accidents cause a significant change in traffic flow. Fast response and clearance times, low accident severity, and light traffic at the time of the accident are among the reasons that may lead to non-significant changes in traffic conditions.
The total number of accidents is generally a good indicator of long-term prediction. However,
it might not be good at predicting short-term traffic accidents, especially if the goal is
improved emergency management or travelers’ information systems. Quantitatively, incident detection performance has been assessed by the performance measures used in past studies [28, 38].
Figure 5.9: Comparison between the number of actual incidents (reported by CHP) and detected anomaly events
• Detection Rate (DR) is the ratio of the number of detected accidents to the total number of actual accidents (i.e., reported by CHP).
DR = (Total number of detected incidents / Total number of actual incidents) * 100% (5.1)
• False Alarm Rate (FAR) is defined as the ratio of the number of false alarms (i.e., in our case, anomalies that did not match any accident reports from CHP) to the total number of detected anomalies.
FAR = (Total number of false alarm cases / Total number of algorithm applications) * 100% (5.2)
• Mean Time to Detect (MTTD) is defined as the average time between the actual start of the accident and the time when the model captured the start of the accident.
MTTD = Total time elapsed between detecting incidents / Total number of incidents detected (5.3)
• Performance Index (PI) integrates all 3 performance measures (DR, FAR, and MTTD) to evaluate the overall performance of the anomaly detection framework [40]. Lower values of PI are associated with better performance of the model. Since DR can be 100% or FAR can be 0% during training, the PI measure is slightly modified with the constants (1.01 and 0.001) to handle such cases, similar to [40].
PI = (1.01 − DR/100) * (FAR/100 + 0.001) * MTTD (5.4)
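A literal implementation of Eqs. (5.1), (5.2), and (5.4) can be sketched as below; DR and FAR are taken in percent here, and the example counts are illustrative rather than values from this study:

```python
def detection_rate(detected, actual):
    """DR, Eq. (5.1): percentage of actual incidents that were detected."""
    return detected / actual * 100.0

def false_alarm_rate(false_alarms, detections):
    """FAR, Eq. (5.2): percentage of detections not matching any CHP report."""
    return false_alarms / detections * 100.0

def performance_index(dr, far, mttd):
    """PI, Eq. (5.4); the 1.01 and 0.001 constants guard the degenerate
    DR = 100% and FAR = 0% cases so PI never collapses to zero."""
    return (1.01 - dr / 100.0) * (far / 100.0 + 0.001) * mttd

dr = detection_rate(8, 12)            # e.g., 8 of 12 incidents detected
far = false_alarm_rate(3, 22)         # e.g., 3 false alarms among 22 detections
pi = performance_index(dr, far, 14.0) # MTTD in minutes
```

Even a perfect detector (DR = 100%, FAR = 0%) yields a small positive PI proportional to its MTTD, which is the point of the two constants.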
Table 5.7 presents the performance evaluation metrics. DR values show potential in the
detection of a good number of actual accidents using the anomaly detection framework. The
relatively low value of DR in some cases can be explained by the nature of accidents and
how some accidents may not generate significant changes in the speed of the vehicles (as
the recovery may happen quickly or the traffic has been light in the pre-accident state). On
the other hand, FAR values vary among different locations. The high FAR values could be
explained by the fact that some of the detected anomalies are associated with non-accident
disruptions in the traffic. For example, an anomaly could be related to a traffic hazard,
animal crossing, defective traffic signals, or closure on the road. Since the focus of this study
Table 5.7: Variation of performance evaluation metrics for different Postmiles in the area of study

Postmile Station  DR     FAR    MTTD (mins)  PI
516.593           0.667  0.136  13.1         0.031
515.173           0.651  0.292  16.8         0.066
513.503           0.870  0.299  14.4         0.066
511.543           0.650  0.267  12.0         0.044
510.293           0.692  0.286  19.5         0.075
508.463           0.739  0.195  15.4         0.046
was on collisions, these types of events were excluded from the accident dataset. However, further investigation could explain the causes of the anomalies in more detail.
Interestingly, MTTD values tend to be small, which verifies the potential of the proposed
framework for implementation in emergency response systems. Overall, the aggregation of these metrics (i.e., PI) is promising, as shown in Table 5.7. It should be noted that if the threshold for anomalies is changed, the framework may be able to capture more accidents and improve the model’s performance.
This result ties well with previous studies wherein incident detection algorithms have been
developed to leverage large-scale traffic data for traffic accident detection considering the
data resolution and scale of analysis. The most related study in terms of resolution and
scope of the study achieved an average DR of 85%, FAR of 0.15%, MTTD of 10 minutes, and PI of 0.0025, where the authors used the high-resolution INRIX data and denoised thresholds for anomaly detection [7]. Even though we did not replicate the previously reported
method proposed by Chakraborty et al. [7], our results show promising potential for accident
detection/prediction. We speculate that the differences in performance among the algorithms might be due to the threshold setting and data resolution used in this study, which should be further investigated in future work.
5.6 Anomaly Inference
In this section, we investigate the short-term behavior of the anomalies and test whether they match reported accidents. In general, the anomaly event observations are classified into three major groups. The anomaly events in the first group are the ones that happen close in time to an accident report at the upstream points. The second group includes the events that matched an accident occurring at the downstream points. The third group contains the anomalies that did not match any close upstream or downstream
accident. Table 5.8 presents the anomaly events with the matched potential causes for station 516.593. To match the accidents in the CHP dataset with the detected anomaly events, for each anomaly event, we looked at the accidents that occurred 5 miles before and after the anomaly. The accidents that happened at a similar time or close to the timespan of an anomaly have been identified as the potential cause of that anomaly.
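The matching rule can be sketched as follows. The 30-minute time slack is an illustrative choice on our part, standing in for the "similar or close" time criterion; the 5-mile radius follows the text:

```python
from datetime import datetime, timedelta

def match_incident(event_start, event_end, station_pm, incidents,
                   radius_miles=5.0, slack=timedelta(minutes=30)):
    """Return the first CHP incident within 5 miles (by absolute Postmile)
    of the station whose start time falls near the anomaly's timespan."""
    for inc_time, inc_pm, desc in incidents:
        if (abs(inc_pm - station_pm) <= radius_miles
                and event_start - slack <= inc_time <= event_end + slack):
            return (inc_time, inc_pm, desc)
    return None

# The first row of Table 5.8: anomaly 6:30-8:25 at station 516.593 matched
# to a collision reported at 6:21 at Postmile 517.2.
incidents = [(datetime(2018, 11, 1, 6, 21), 517.2,
              "1181-Trfc Collision-Minor Inj")]
hit = match_incident(datetime(2018, 11, 1, 6, 30),
                     datetime(2018, 11, 1, 8, 25), 516.593, incidents)
```

Events for which this search returns nothing correspond to the N/A rows of Table 5.8 (group 3).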
In the first group, the anomaly is detected at a point before the reported accident. From the temporal point of view, in almost all of the observations, the start time of the anomaly event is after the time of the accident. This is in line with traffic theory fundamentals and the shock-wave concept. As traffic theory states, when an accident occurs, a stationary bottleneck is formed and congestion (i.e., a draw-down and draw-up cycle) forms upstream. This explains the observations in group 1.
Interestingly, in group 2, the detected anomaly’s timestamp is before that of the reported accident. A summary of the number of anomaly events in each group is presented in Table 5.9. From the spatial point of view, the anomalies in this group vary in their location relative to the location of the accident. In some cases, the location of the detected anomaly is before the reported accident, while in other cases it is after the accident location. This makes sense since our model predicts the speed values based on the upstream and downstream data. This might also be due to the nature of the bidirectional training process
Table 5.8: A sample of anomaly events with matched potential incident causes for station 516.593

Anomaly Event                        Incident
Start DateTime    End DateTime       Start Time        Location  Type
11/1/2018 6:30    11/1/2018 8:25     11/1/2018 6:21    517.2     1181-Trfc Collision-Minor Inj
11/5/2018 7:20    11/5/2018 9:15     N/A               N/A       N/A
11/6/2018 7:00    11/6/2018 9:25     N/A               N/A       N/A
11/6/2018 16:50   11/6/2018 17:00    11/6/2018 17:23   519.7     1182-Trfc Collision-No Inj
11/6/2018 19:05   11/6/2018 19:05    11/6/2018 17:38   520       1183-Trfc Collision-Unkn Inj
11/7/2018 8:00    11/7/2018 8:25     N/A               N/A       N/A
11/8/2018 7:45    11/8/2018 9:00     11/8/2018 8:47    514.4     1182-Trfc Collision-No Inj
11/8/2018 16:25   11/8/2018 18:15    11/8/2018 18:51   503.5     1183-Trfc Collision-Unkn Inj
11/9/2018 17:25   11/9/2018 17:55    11/9/2018 13:53   521.5     1183-Trfc Collision-Unkn Inj
11/13/2018 7:40   11/13/2018 8:50    11/13/2018 7:02   510.2     1182-Trfc Collision-No Inj
11/14/2018 8:00   11/14/2018 8:05    11/14/2018 7:01   520.3     1182-Trfc Collision-No Inj
11/14/2018 16:25  11/14/2018 16:30   11/14/2018 16:03  517.2     1183-Trfc Collision-Unkn Inj
11/15/2018 7:15   11/15/2018 7:20    11/15/2018 7:39   515.8     1183-Trfc Collision-Unkn Inj
11/15/2018 7:35   11/15/2018 7:40    11/15/2018 7:38   515.8     1183-Trfc Collision-Unkn Inj
11/16/2018 15:55  11/16/2018 16:55   11/16/2018 14:37  518.7     1182-Trfc Collision-No Inj
11/21/2018 15:45  11/21/2018 17:50   11/21/2018 15:59  518.7     1182-Trfc Collision-No Inj
11/28/2018 8:25   11/28/2018 8:35    11/28/2018 7:01   521.5     1182-Trfc Collision-No Inj
11/28/2018 17:15  11/28/2018 17:40   11/28/2018 18:02  519       1183-Trfc Collision-Unkn Inj
11/29/2018 9:00   11/29/2018 13:45   11/29/2018 11:55  515.6     1182-Trfc Collision-No Inj
11/29/2018 14:00  11/29/2018 18:05   11/29/2018 18:13  517.2     1125-Traffic Hazard
11/29/2018 19:50  11/29/2018 20:30   11/29/2018 22:38  520.5     1183-Trfc Collision-Unkn Inj
Table 5.9: Number of anomaly events in each group

Postmile  Group 1  Group 2  Group 3
516.593   10       8        4
515.173   9        9        6
513.503   8        21       10
511.543   10       9        10
510.293   6        7        5
and the fuller learning process. The overall results show that over the locations of the study, the anomaly points potentially highlight accidents occurring in the region. However, tuning the threshold could yield even better results.
Chapter 6
Conclusions
6.1 Conclusion
The problem of accident forecasting is an important problem for transportation and public safety. Forecasting an accident, or detecting its occurrence as fast as possible, could accelerate the emergency response and lead to faster clearance. In this research, an anomaly detection approach based on predictions from a deep learning framework was proposed for traffic accident detection/prediction. We employed a deep bidirectional LSTM for speed prediction based on
the historical data of traffic speed and flow of upstream and downstream points. Several
traffic and environmental features were retrieved from big datasets over 20 miles of the I-5 N
freeway in the Sacramento area across 6 months. We used the traffic features for downstream
and upstream of each point to address the spatial dependency. The proposed methodology
consists of two major steps.
First, the speed values at each location of the freeway were predicted. The bidirectional
LSTM model was trained with the traffic features of upstream and downstream points (i.e.,
speed and flow) and weather features (i.e., temperature, rainfall, wind speed, and general
weather condition). We further added the temporal features that are important in generic traffic conditions: time components (hour and minute of the day), day of the month, and day of the week (to take the effect of weekends and weekdays into account). Second, the
anomaly detection module was used to capture the points where the predicted and actual
speed values were significantly different. We set a threshold for loss value based on the
history of loss values in training and testing to capture the anomaly points.
Broadly speaking, the problem of accident detection/prediction has often been addressed
from the risk perspective where the objective is to analyze the number of accidents in a
region. However, it would be of special interest to capture the occurrence of an accident. To
illuminate this uncharted area, this work showed that anomaly detection using deep learning
techniques offers promising solutions to traffic accident prediction/detection and interpreta-
tion of the causes, if unique data properties are well handled. One of the contributions of
this study is the deep-learning-based anomaly detection approach to detect and predict an
accident. Contrary to the studies that focused on the prediction of the number of accidents
in a region, this study focuses on the identification of potential accidents downstream or upstream of a location on the roadways. While the number of accidents is more applicable in
risk analysis, the current approach could be implemented in traveler’s information and emer-
gency management systems. Furthermore, as presented in the results, the detected anomalies
could be due to non-accident events like hazards or construction zones. This approach has
been functional, but not optimal, and could be further improved in future avenues of research.
6.2 Future Work
As mentioned before, traffic accident detection and prediction is a complex problem. This thesis and project were designed with future work in mind. Three major areas are outlined where this project could be expanded in such future work.
6.2.1 Urban-scale Implementation
Future studies should aim to replicate the results on a larger scale for better capturing
the anomalies and related accidents. This is the most obvious way this project could be expanded. The current study focuses on a 20-mile section of I-5 in the Sacramento area. Although
the factors we utilized in this study can reveal and predict some traffic accident patterns,
they are far from complete, and other factors, such as driver behavior, road characteristics,
light conditions, and special events, are important as well.
6.2.2 Sensitivity Analysis for Setting the Threshold
The limitations of the present study include the threshold for defining an anomaly point. One concern about the anomaly findings was the choice of threshold value for anomaly detection. The model performance can be improved by employing sensitivity analysis on
different threshold values. Future research should consider the potential effects of threshold
values more carefully. In this study, we chose the threshold value based on the observations
of loss values in the training and testing sets. However, future research could continue to explore the effect of different threshold values on detecting accidents and how they impact the performance evaluation metrics (i.e., detection rate, false alarm rate, and mean time to detect). One potential method would be using total variation denoising.
6.2.3 Additional Spatial Dependency
One limitation of this study is the lack of spatial features and dependencies in the feature groups. Due to the complexity of traffic accidents, there is no definite answer for the spatial extent over which an accident’s impact can be observed. Though we looked for accident occurrence within 5 miles upstream and downstream of each point to address this problem, future research should examine an automated algorithm to look for the potential accident occurrence in each region.
Chapter 7
User Manual
The traffic data used for the case study in this project, as well as the relevant project code, can be found at https://github.com/farnazkgn/DeepLearningPredictingAccidents [18].
7.1 Software Requirement
The project requires the following software for use:
• SQLite3 for data conditioning and management
• Python 3 or later for data acquisition
• Jupyter Notebook for model development
7.2 Repository Content
The notable files available in the repository are as follows:
• Bidirectional model.ipynb
• DataLoader.py
• combined.db
• combined.csv
• data (directory)
– flows (directory)
– incidents (directory)
– speeds (directory)
– weather (directory)
Bidirectional model.ipynb is the Jupyter Notebook responsible for training the deep learning model and generating results. The dataset used for the analysis of the case study in this project is stored as combined.db and combined.csv.
DataLoader.py is the script responsible for aggregating data from the raw data files into combined.db. The raw data retrieved from PeMS consists of a separate .txt file of traffic information for each day. DataLoader.py aggregates all of these .txt files and stores them in the proper format for further analysis.
To run the script, execute “python DataLoader.py” from the command line. Successful script execution should take ten to twenty minutes and produce the following output in the terminal:
~ $ python DataLoader.py
Database created and successfully connected to SQLite
Creating table: combined_data
~ $
Running the script creates combined.db, which contains the full, unsorted dataset. To produce combined.csv for use in training and evaluating the model, some processing of the dataset using SQL is required. The first step is to launch the SQLite command-line interface:
~ $ sqlite3
Then, use the following commands to open combined.db and export the sorted contents to
combined.csv:
.open combined.db
.headers on
.mode csv
.output combined.csv
SELECT * from combined_data ORDER BY time;
.quit
Exporting the sorted data should take less than a minute. After completion, it is advisable
to briefly browse combined.csv to ensure there are no major issues with data quality. At this
point, Bidirectional model.ipynb can be run using your preferred Jupyter Notebook manager.
Bidirectional model.ipynb includes the deep learning model. The prerequisite libraries to run the models are TensorFlow and Keras. There are multiple methods to install these packages. The easiest is to use pip and enter the following on the command line (it is recommended to use the latest stable release with CPU and GPU support):
$ pip install --upgrade tensorflow
Once TensorFlow is installed, install the Keras library using the following command:
$ pip install keras
The description of each step of the deep learning model is available in the notebook. It is recommended to run the model on a GPU since it is faster and more efficient. We used the Google Colab environment to run the code. Google Colab is a free cloud service based on Jupyter Notebooks that provides free GPU access. To use Google Colab, upload the Jupyter notebook and either upload the dataset directly or use the Google Drive connection. Once these requirements are met, you can run the code like any other Jupyter notebook.
Chapter 8
Developer Manual
8.1 Data
Traffic and incident data for California can be found at http://pems.dot.ca.gov/. PeMS is also an Archived Data User Service (ADUS) that provides over ten years of data for historical analysis. To use this site, you must apply for an account. Registering takes only a few minutes, and accounts are typically approved within one to two business days. Further instructions on the data provided by PeMS and how to use the website can be found at:
http://pems.dot.ca.gov/Papers/PeMS_Intro_User_Guide_v6.pdf
As noted, the weather data was provided by MesoWest. The first step is to select a weather station. The MesoWest home page provides quick access to station data through the “Station Search” section. Additional information about how to search for a station is available at https://mesowest.utah.edu/html/help/userguide.html. Once the weather station is chosen, users can download the weather variables (e.g., temperature, wind direction, heat index, snow depth, etc.) for different periods using the following API: https://developers.synopticdata.com/. To use this API, users must create an account, which is approved immediately and ready to use.
8.2 Database Implementation
The database management module was developed by Elias Gorine and Jacob Smethurst in fulfillment of their CS4624 term project. Further details and information can be found in the project report available in VTechWorks [18].
The database schema is constructed so that for every five-minute interval between 12 AM
on July 1, 2018, and 11:55 PM on November 30, 2018, there is data for traffic speed, traffic
flow, and weather (air temperature, wind speed, and precipitation) at each of the 30 data
collection stations between Postmile 504 and Postmile 520 on I-5 N in California. The full list
of locations for the traffic data collection stations used for this project is [504.223, 504.793,
506.383, 507.504, 507.953, 508.463, 509.013, 510.094, 510.293, 510.643, 511.341, 511.543,
512.073, 512.435, 512.753, 513.503, 513.998, 514.662, 515.173, 515.973, 516.593, 517.093,
517.916, 518.543, 518.864, 519.193, 519.571, 519.863, 519.874, 520.744]. However, it can be
further expanded for other freeways in California depending on the area of analysis.
The first column in the database schema is a timestamp in the form MM/DD/YYYY HH:
MM. Then, for each of the listed data collection station locations (x), there is a column for
each of the data types in the form sx, fx, tx, wx, and px. Column sx represents the traffic
speed at station location x, column fx represents the traffic flow at station location x, and
so on with t = air temperature, w = wind speed, and p = precipitation.
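Because each column name embeds the station's Postmile (and therefore contains a dot), the identifiers must be quoted in SQL. A hedged sketch of how this schema can be generated for a subset of stations:

```python
import sqlite3

# Sketch: generate the combined_data schema described above for a subset of
# stations. Column names such as "s504.223" contain a dot and must be quoted.
stations = ["504.223", "504.793"]  # first two of the 30 Postmile locations
columns = ['"time" TEXT']
for x in stations:
    # speed, flow, air temperature, wind speed, precipitation
    for prefix in ("s", "f", "t", "w", "p"):
        columns.append(f'"{prefix}{x}" REAL')

create_sql = f"CREATE TABLE combined_data ({', '.join(columns)})"
conn = sqlite3.connect(":memory:")
conn.execute(create_sql)
print(len(columns))  # 11: one timestamp plus five columns per station
```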
8.2.1 Database Management with SQL
The flow_reader.py, incident_reader.py, and speed_reader.py scripts aggregate every value in the raw data files into a SQLite database. These scripts take longer to execute than DataLoader.py, which organizes data specifically in the schema used by our model, but they conveniently aggregate all the data so that it can be manipulated for any desired application. That is to say, DataLoader.py is less extensible and built directly for our application, whereas the flow_reader.py, incident_reader.py, and speed_reader.py scripts are more extensible but run more slowly and parse possibly unneeded data from the raw files.
We attempted to make our database population scripts modular to promote extensibility
by future developers. We recognize that successful accident detection models may include
many more factors than our code does, such as the vertical gradient of the road, a measure
of the road degradation, or a measure of the tightness of a curve in the road. If a developer
is seeking to add a new factor to the model, they should first find a high-quality source for
the data and ensure they have a way of finding the location and time of each data point in
the form used by our project.
To create a new database reader module, a developer can follow the example of flow_reader.py, incident_reader.py, or speed_reader.py. These files create a class with fields corresponding to the different columns in the CSV data file. They then use Python's CSV reader capabilities to parse the files and populate an array of the new class objects. Depending on the data used by the developer, additional work may be required to create a timestamp in the same format as the rest of the data entries.
Then, db_loader.py can easily be adapted to load this list of Python objects into the SQLite3 database. All insert and update queries managed by the db_loader.py script are protected by parameterized queries: the insert queries are set up as prepared statements, and as each data point is read out of its associated list, its values are broken down by column and bound to the statement as dynamic parameters.
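This pattern can be sketched with Python's sqlite3 module, which supports exactly this placeholder style; the table and rows below are illustrative rather than the project's actual schema.

```python
import sqlite3

# Illustrative sketch of the parameterized-insert pattern: the SQL text with
# "?" placeholders is prepared once, and each row is bound as dynamic
# parameters at execution time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE speed_raw (time TEXT, pm_abs REAL, speed REAL)")

rows = [
    ("07/01/2018 00:00", 504.223, 64.2),
    ("07/01/2018 00:05", 504.223, 63.8),
]
conn.executemany(
    "INSERT INTO speed_raw (time, pm_abs, speed) VALUES (?, ?, ?)", rows
)
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM speed_raw").fetchone()[0])  # 2
```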
This decision was made in part because it is not possible to verify the integrity of all of the data being loaded into the database. Rather than verifying that each data point contains no hidden SQL queries, we instead chose to use prepared statements and dynamic parameters (bind variables). The primary goal was to prevent SQL injection attacks, which may come from unverified data sources. Although all of the data sources for this iteration of the project are government-based and trustworthy, future developers extending the project may attempt to pull data from various sources. These sources may be compromised, and if a malicious SQL statement ends up within the source data, unpredictable outcomes are possible. For this reason, we highly recommend following the same practices as the provided source code and utilizing parameterized queries and bind variables.
Figure 8.1: Loading speed data using parameterized queries
Another issue to consider is performance. By its nature, the db_loader.py script loads hundreds of thousands, or even millions, of records into the DB. SQLite, like other databases, builds execution plans to determine the best strategy to execute a query. An execution plan can be stored in an execution plan cache, but this only works if the SQL statement to be executed stays the same, which is not the case with our loads since the data parameters vary. Without bind variables, the database would handle each insert operation by building a whole new execution plan, wasting a great deal of time and hurting performance.
By using bind variables, the actual values of our data points are not written into the SQL text; instead, the bind variables act as placeholders within the prepared statement. This means that the SQL statement does not change and the same execution plan can be reused, improving performance. Considering the volume of data that can potentially be loaded in extensions of this project, it is recommended to follow the aforementioned practices to ensure that database performance does not suffer in the build phase.
8.2.2 Database Indexes
Indexes are added to the frequently accessed columns and tables of our database. An index
is a data structure that helps improve the performance of queries. By default, SQLite uses
B-trees (balanced trees, not binary trees) to organize indexes. The general rule we followed
was adding indexes to the most frequently searched and accessed columns of each table.
While it is not necessary, we recommend that any tables added as extensions to the project
use indexes where appropriate. If the table has columns that are frequently accessed, then
an index may be added to improve performance.
Keep in mind that indexes have some overhead themselves. They occupy space on the disk
and in the DB memory itself. Moreover, with each update/insertion/deletion, the index will
also have to be updated. Having too many indexes, or indexes on unnecessary columns,
can introduce performance issues. For these reasons, we added indexes to columns of tables
which would be joined together or searched frequently. For example, if we needed to join the
flow_raw and speed_raw tables using the pm_abs (absolute Postmiles) columns, it would
be wise to create indexes on these columns and tables. In SQLite this is quite easy:
CREATE INDEX Postmiles_index_flow ON flow_raw(pm_abs);
CREATE INDEX Postmiles_index_speed ON speed_raw(pm_abs);
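One quick way to confirm that an index is actually being used is to inspect the query plan. The sketch below uses Python's sqlite3 module with a simplified flow_raw table that only illustrates the idea.

```python
import sqlite3

# Sketch: verify that SQLite uses the new index by inspecting the query plan.
# The simplified flow_raw table here is for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flow_raw (time TEXT, pm_abs REAL, flow REAL)")
conn.execute("CREATE INDEX Postmiles_index_flow ON flow_raw(pm_abs)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM flow_raw WHERE pm_abs = 504.223"
).fetchall()
# The plan's detail column should mention Postmiles_index_flow.
print(plan[0][-1])
```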
Bibliography
[1] Understanding RNN and LSTM. https://towardsdatascience.com/
understanding-rnn-and-lstm-f7cdf6dfc14e. Accessed: 2020-04-22.
[2] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig
Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat,
Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Joze-
fowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga,
Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit
Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasude-
van, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke,
Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on hetero-
geneous systems, 2015. URL https://www.tensorflow.org/. Accessed: 2020-02-18.
[3] Joaquín Abellán, Griselda López, and Juan de Oña. Analysis of traffic accident severity
using decision rules via decision trees. Expert Systems with Applications, 40(15):6047–
6054, 2013.
[4] Ruth Bergel-Hayat, Mohammed Debbarh, Constantinos Antoniou, and George Yannis.
Explaining the road accident risk: weather effects. Accident Analysis & Prevention, 60:
456–465, 2013.
[5] James P Byrne, N Clay Mann, Mengtao Dai, Stephanie A Mason, Paul Karanicolas,
Sandro Rizoli, and Avery B Nathens. Association between emergency medical service
response time and motor vehicle crash mortality in the United States. JAMA surgery,
154(4):286–293, 2019.
[6] Ciro Caliendo, Maurizio Guida, and Alessandra Parisi. A crash-prediction model for
multilane roads. Accident Analysis & Prevention, 39(4):657–670, 2007.
[7] Pranamesh Chakraborty, Chinmay Hegde, and Anuj Sharma. Data-driven parallelizable
traffic incident detection using spatio-temporally denoised robust thresholds. Trans-
portation research part C: emerging technologies, 105:81–99, 2019.
[8] Li-Yen Chang and Wen-Chieh Chen. Data mining of tree-based models to analyze
freeway accident frequency. Journal of safety research, 36(4):365–375, 2005.
[9] Chao Chen. Freeway performance measurement system (PeMS). UC Berkeley:
California Partners for Advanced Transportation Technology, 2003. URL https:
//escholarship.org/uc/item/6j93p90t.
[10] Quanjun Chen, Xuan Song, Harutoshi Yamada, and Ryosuke Shibasaki. Learning deep
representation from big and heterogeneous data for traffic accident inference. In Thir-
tieth AAAI Conference on Artificial Intelligence, 2016.
[11] François Chollet et al. Keras, 2015. URL https://github.com/fchollet/keras.
Accessed: 2020-02-18.
[12] Miao Chong, Ajith Abraham, and Marcin Paprzycki. Traffic accident analysis using
machine learning paradigms. Informatica, 29(1), 2005.
[13] Marie Crandall. Rapid emergency medical services response saves lives of persons injured
in motor vehicle crashes. JAMA surgery, 154(4):293–294, 2019.
[14] Francois Dion, Hesham Rakha, and Youn-Soo Kang. Comparison of delay estimates at
under-saturated and over-saturated pre-timed signalized intersections. Transportation
Research Part B: Methodological, 38(2):99–122, 2004.
[15] Yanjie Duan, Yisheng Lv, Yu-Liang Liu, and Fei-Yue Wang. An efficient realization
of deep learning for traffic data imputation. Transportation research part C: emerging
technologies, 72:168–181, 2016.
[16] Kartik Dwivedi, Kumar Biswaranjan, and Amit Sethi. Drowsy driver detection using
representation learning. In 2014 IEEE international advance computing conference
(IACC), pages 995–999. IEEE, 2014.
[17] Felix A Gers, Jürgen Schmidhuber, and Fred Cummins. Learning to forget: Continual
prediction with LSTM. 9th International Conference on Artificial Neural Networks:
ICANN ’99, 1999.
[18] Elias Gorine, Farnaz Khaghani, Junkai Zeng, and Jacob Smethurst. Deep learning
predicting accidents. http://hdl.handle.net/10919/98230, 2020. Accessed: 2020-
05-06, Virginia Tech, CS4624 team term project.
[19] Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. Speech recognition with
deep recurrent neural networks. In 2013 IEEE international conference on acoustics,
speech and signal processing, pages 6645–6649. IEEE, 2013.
[20] Antoine Hébert, Timothée Guédon, Tristan Glatard, and Brigitte Jaumard. High-
resolution road vehicle collision prediction for the City of Montreal. arXiv preprint
arXiv:1905.08770, 2019.
[21] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computa-
tion, 9(8):1735–1780, 1997.
[22] John Horel, Michael Splitt, L Dunn, J Pechmann, B White, C Ciliberti, S Lazarus,
J Slemmer, D Zaff, and J Burks. MesoWest: Cooperative mesonets in the western
United States. Bulletin of the American Meteorological Society, 83(2):211–226, 2002.
[23] Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, and Eric Xing. Harnessing
deep neural networks with logic rules. arXiv preprint arXiv:1603.06318, 2016.
[24] Zhiheng Huang, Wei Xu, and Kai Yu. Bidirectional LSTM-CRF models for sequence
tagging. arXiv preprint arXiv:1508.01991, 2015.
[25] Yong-Kul Ki. Accident detection system using image processing and MDR. Interna-
tional Journal of Computer Science and Network Security IJCSNS, 7(3):35–39, 2007.
[26] Whui Kim, Hyun-Kyun Choi, Byung-Tae Jang, and Jinsu Lim. Driver distraction
detection using single convolutional neural network. In 2017 international conference
on information and communication technology convergence (ICTC), pages 1203–1205.
IEEE, 2017.
[27] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization.
CoRR, abs/1412.6980, 2014. URL http://arxiv.org/abs/1412.6980.
[28] Xiangmin Li, William HK Lam, and Mei Lam Tam. New automatic incident detec-
tion algorithm based on traffic data collected for journey time estimation. Journal of
transportation engineering, 139(8):840–847, 2013.
[29] Xiugang Li, Dominique Lord, Yunlong Zhang, and Yuanchang Xie. Predicting motor
vehicle crashes using support vector machine models. Accident Analysis & Prevention,
40(4):1611–1618, 2008.
[30] Lei Lin, Qian Wang, and Adel W Sadek. A novel variable selection method based
on frequent pattern tree for real-time traffic accident risk prediction. Transportation
Research Part C: Emerging Technologies, 55:444–459, 2015.
[31] Bappaditya Mandal, Liyuan Li, Gang Sam Wang, and Jie Lin. Towards detection of
bus driver fatigue based on robust visual analysis of eye state. IEEE Transactions on
Intelligent Transportation Systems, 18(3):545–557, 2016.
[32] Adolf D May. Traffic flow fundamentals. Transportation Research Board, 1990.
[33] Hailang Meng, Xinhong Wang, and Xuesong Wang. Expressway crash prediction based
on traffic big data. In Proceedings of the 2018 International Conference on Signal
Processing and Machine Learning, pages 11–16, 2018.
[34] Saleh R Mousa, Peter R Bakhit, and Sherif Ishak. An extreme gradient boosting method
for identifying the factors contributing to crash/near-crash events: a naturalistic driving
study. Canadian Journal of Civil Engineering, 46(8):712–721, 2019.
[35] Alameen Najjar, Shun’ichi Kaneko, and Yoshikazu Miyanaga. Combining satellite im-
agery and open data to map road safety. In Thirty-First AAAI Conference on Artificial
Intelligence, 2017.
[36] Jutaek Oh, Simon P Washington, and Doohee Nam. Accident prediction model for
railway-highway interfaces. Accident Analysis & Prevention, 38(2):346–356, 2006.
[37] VA Olutayo and AA Eludire. Traffic accident analysis using decision trees and neural
networks. International Journal of Information Technology and Computer Science, 2:
22–28, 2014.
[38] Emily Parkany and Chi Xie. A complete review of incident detection algorithms &
their deployment: what works and what doesn’t. Technical report, 2005. URL http:
//www.uvm.edu/~transctr/pdf/netc/netcr37_00-7.pdf.
[39] Honglei Ren, You Song, JingXin Liu, Yucheng Hu, and Jinzhi Lei. A deep learn-
ing approach to the prediction of short-term traffic accident risk. arXiv preprint
arXiv:1710.09543, 2017.
[40] Jimmy SJ Ren, Wei Wang, Jiawei Wang, and Stephen Liao. An unsupervised feature
learning approach to improve automatic incident detection. In 2012 15th International
IEEE Conference on Intelligent Transportation Systems, pages 172–177. IEEE, 2012.
[41] Paul I Richards. Shock waves on the highway. Operations research, 4(1):42–51, 1956.
[42] Matthias Schlögl, Rainer Stütz, Gregor Laaha, and Michael Melcher. A comparison of
statistical learning methods for deriving determining factors of accident occurrence from
an imbalanced high resolution dataset. Accident Analysis & Prevention, 127:134–149,
2019.
[43] Ankit Parag Shah, Jean-Baptiste Lamare, Tuan Nguyen-Anh, and Alexander Hauptmann. CADP: A novel dataset for CCTV traffic camera based accident analysis. In 2018
15th IEEE International Conference on Advanced Video and Signal Based Surveillance
(AVSS), pages 1–9. IEEE, 2018.
[44] Athanasios Theofilatos. Incorporating real-time traffic and weather data to explore road
accident likelihood and severity in urban arterials. Journal of safety research, 61:9–21,
2017.
[45] Pravin Varaiya. Freeway Performance Measurement System, PeMS v3, Phase 1. UC
Berkeley: California Partners for Advanced Transportation Technology, 2001. URL
https://escholarship.org/uc/item/20p1j2w7.
[46] Xuesong Wang and Mohamed Abdel-Aty. Temporal and spatial analyses of rear-end
crashes at signalized intersections. Accident Analysis & Prevention, 38(6):1137–1150,
2006.
[47] Billy M Williams and Angshuman Guin. Traffic management center use of incident
detection algorithms: Findings of a nationwide survey. IEEE Transactions on intelligent
transportation systems, 8(2):351–358, 2007.
[48] Rongjie Yu and Mohamed Abdel-Aty. Utilizing support vector machine in real-time
crash risk evaluation. Accident Analysis & Prevention, 51:252–259, 2013.
[49] Rose Yu, Yaguang Li, Cyrus Shahabi, Ugur Demiryurek, and Yan Liu. Deep learning:
A generic approach for extreme condition traffic forecasting. In Proceedings of the 2017
SIAM international Conference on Data Mining, pages 777–785. SIAM, 2017.
[50] Zhuoning Yuan, Xun Zhou, Tianbao Yang, James Tamerius, and Ricardo Mantilla. Pre-
dicting traffic accidents through heterogeneous urban data: A case study. In Proceedings
of the 6th International Workshop on Urban Computing (UrbComp 2017), Halifax, NS,
Canada, volume 14, 2017.
[51] Zhuoning Yuan, Xun Zhou, and Tianbao Yang. Hetero-convlstm: A deep learning
approach to traffic accident prediction on heterogeneous spatio-temporal data. In Pro-
ceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery
& Data Mining, pages 984–992, 2018.
[52] Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. Recurrent neural network regu-
larization. arXiv preprint arXiv:1409.2329, 2014.
Appendices
Appendix A
Supplementary Results for Various
Road Postmiles
Figure A.1: Loss value for training for Postmile 508.463
Figure A.2: Loss value for training for Postmile 510.293
Figure A.3: Loss value for training for Postmile 511.543
Figure A.4: Loss value for training for Postmile 513.503
Figure A.5: Loss value for training for Postmile 515.173
Figure A.6: Actual and prediction of speed values for Postmile 508.463
Figure A.7: Actual and prediction of speed values for Postmile 510.293
Figure A.8: Actual and prediction of speed values for Postmile 511.543
Figure A.9: Actual and prediction of speed values for Postmile 513.503
Figure A.10: Actual and prediction of speed values for Postmile 515.173
Figure A.11: Histogram of loss value for training for Postmile 508.463
Figure A.12: Histogram of loss value for training for Postmile 510.293
Figure A.13: Histogram of loss value for training for Postmile 511.543
Figure A.14: Histogram of loss value for training for Postmile 513.503
Figure A.15: Histogram of loss value for training for Postmile 515.173