A Deep Learning Approach to Predict Accident Occurrence Based on Traffic Dynamics
Farnaz Khaghani
Thesis submitted to the Faculty of the
Virginia Polytechnic Institute and State University
in partial fulfillment of the requirements for the degree of
Master of Science
in
Computer Science and Application
Edward A. Fox, Chair
Farrokh Jazizadeh, Co-chair
Hoda M. Eldardiry
May 11, 2020
Blacksburg, Virginia
Keywords: Deep learning, LSTM, Bidirectional LSTM, Database management, Anomaly
detection
Copyright 2020, Farnaz Khaghani
A Deep Learning Approach to Predict Accident Occurrence Based on Traffic Dynamics
Farnaz Khaghani
(ABSTRACT)
Traffic accidents are a major concern for traffic safety; about 1.25 million deaths are reported
worldwide each year. Hence, it is crucial to have access to real-time data and to rapidly detect
or predict accidents. Accurately predicting the occurrence of a highway car accident any significant
length of time in advance is not feasible, since the vast majority of crashes occur due to
unpredictable human negligence and/or error. However, rapid traffic incident detection could
reduce incident-related congestion and secondary crashes, alleviate the waste of vehicles’ fuel
and passengers’ time, and provide appropriate information for emergency response and field
operation. While the focus of most previously proposed techniques is predicting the number
of accidents in a certain region, the problem of predicting accident occurrence or rapidly
detecting accidents has been little studied. To address this gap, we propose a deep
learning approach and build a deep neural network model based on long short term memory
(LSTM). We apply it to forecast the expected speed values on freeways’ links and identify
the anomalies as potential accident occurrences. Several detailed features such as weather,
traffic speed, and traffic flow of upstream and downstream points are extracted from big
datasets. We assess the proposed approach on a traffic dataset from Sacramento, California.
The experimental results demonstrate the potential of the proposed approach in identifying
the anomalies in speed value and matching them with accidents in the same area. We show
that this approach can achieve a high rate of rapid accident detection and be implemented
in real-time travelers’ information or emergency management systems.
A Deep Learning Approach to Predict Accident Occurrence Based on Traffic Dynamics
Farnaz Khaghani
(GENERAL AUDIENCE ABSTRACT)
Rapid traffic accident detection/prediction is essential for scaling down non-recurrent conges-
tion caused by traffic accidents, avoiding secondary accidents, and accelerating emergency
system responses. In this study, we propose a framework that uses large-scale historical
traffic speed and traffic flow data along with the relevant weather information to obtain
robust traffic patterns. The predicted traffic patterns can be coupled with the real traffic
data to detect anomalous behavior that often results in traffic incidents in the roadways.
Our framework consists of two major steps. First, we estimate the speed values of traffic at
each point based on the historical speed and flow values of locations before and after each
point on the roadway. Second, we compare the estimated values with the actual ones and
introduce the ones that are significantly different as an anomaly. The anomaly points are
the potential points and times that an accident occurs and causes a change in the normal
behavior of the roadways. Our study shows the potential of the approach in detecting
accidents, with promising performance in flagging an accident at a time close to its actual
time of occurrence.
Dedication
To my parents and sister for their unconditional love and support though they were
thousands of miles away from me.
Acknowledgments
I would like to express my sincere gratitude to my advisor, Professor Edward Fox, who has
the attitude and the substance of a genius, for his patience and continuous support. Without
his guidance and persistent help, this thesis would not have been possible. It has been my
utmost privilege to work with him.
My appreciation also extends to Professor Hesham Rakha for his insightful comments and
valuable feedback. His timely suggestions with kindness, enthusiasm and dynamism have
enabled me to complete my thesis. I would also like to thank my committee members, Pro-
fessor Farrokh Jazizadeh and Professor Hoda Eldardiry for their time and cooperation in the
completion of this thesis.
I would especially like to acknowledge and thank students in ‘CS 4624: Multimedia, Hy-
pertext, and Information Access’, Elias Gorine and Jacob Smethurst, for assistance with
preparation, conditioning, and processing the data as well as the development of an SQL-
based data management pipeline.
My sincerest appreciation and gratitude go to my parents and my sister Forough, for their
unfailing love and support, for encouraging me every single day to be a better person, and
for giving me wings to fly although they were thousands of miles away.
Contents

List of Figures

List of Tables

1 Introduction
    1.1 Motivation
    1.2 Problem
    1.3 Research Question
    1.4 Hypothesis
    1.5 Contribution
    1.6 Organization of Thesis

2 Motivating Works
    2.1 CS 4624: Multimedia, Hypertext, and Information Access
    2.2 CEE 5604: Traffic Characteristics and Flow

3 Review of Literature
    3.1 Traffic Accident Prediction Using Classical Techniques
    3.2 Deep Learning Models for Traffic Accident Prediction

4 Methodology
    4.1 Traffic Dynamics at the Time of an Accident
    4.2 Basic Deep Learning Concepts
        4.2.1 Recurrent Neural Network (RNN)
        4.2.2 Long Short Term Memory (LSTM)
        4.2.3 Bidirectional LSTM Recurrent Structure

5 Results
    5.1 Area of Study
    5.2 Data
        5.2.1 Traffic and Accident Data
        5.2.2 Weather Data
    5.3 Deep Learning Model
    5.4 Feature Engineering
    5.5 Results
    5.6 Anomaly Inference

6 Conclusions
    6.1 Conclusion
    6.2 Future Work
        6.2.1 Urban-scale Implementation
        6.2.2 Sensitivity Analysis for Setting the Threshold
        6.2.3 Additional Spatial Dependency

7 User Manual
    7.1 Software Requirement
    7.2 Repository Content

8 Developer Manual
    8.1 Data
    8.2 Database Implementation
        8.2.1 Database Management with SQL
        8.2.2 Database Indexes

Bibliography

Appendices

Appendix A Supplementary Results for Various Road Postmiles
List of Figures

4.1 The fundamental traffic diagrams according to Greenshield [32]
4.2 Position of traffic states at the fundamental diagram when an accident occurs
4.3 An example of time–space diagram for typical temporary capacity reduction (i.e., traffic accident) [14]
4.4 Graphic representation of LSTM gates [1]
4.5 The architecture of Bidirectional LSTM model [23]
5.1 The spatial extent of area of the study
5.2 PeMS homepage, available at http://pems.dot.ca.gov/
5.3 MesoWest weather data API map
5.4 The flow of data through the framework
5.5 Loss value for training for Postmile 517.916
5.6 Actual and prediction of speed values for Postmile 517.916
5.7 Histogram of loss value for training for Postmile 517.916
5.8 Anomaly points for test dataset for Postmile 517.916
5.9 Comparison between the number of actual incidents (reported by CHP) and detected anomaly events
8.1 Loading speed data using parameterized queries
A.1 Loss value for training for Postmile 508.463
A.2 Loss value for training for Postmile 510.293
A.3 Loss value for training for Postmile 511.543
A.4 Loss value for training for Postmile 513.503
A.5 Loss value for training for Postmile 515.173
A.6 Actual and prediction of speed values for Postmile 508.463
A.7 Actual and prediction of speed values for Postmile 510.293
A.8 Actual and prediction of speed values for Postmile 511.543
A.9 Actual and prediction of speed values for Postmile 513.503
A.10 Actual and prediction of speed values for Postmile 515.173
A.11 Histogram of loss value for training for Postmile 508.463
A.12 Loss value for training for Postmile 510.293
A.13 Loss value for training for Postmile 511.543
A.14 Loss value for training for Postmile 513.503
A.15 Loss value for training for Postmile 515.173
List of Tables

5.1 Example of traffic speed data
5.2 A sample of incident data available at PeMS and provided by California Highway Patrol (CHP)
5.3 A sample of weather data retrieved from MesoWest
5.4 Data statistics
5.5 A sample of anomaly points in anomaly DataFrame
5.6 A sample of anomaly events DataFrame
5.7 Variation of performance evaluation metrics for different Postmiles in the area of study
5.8 A sample of anomaly events DataFrame
5.9 Number of the anomaly events in each group
Chapter 1
Introduction
1.1 Motivation
Traffic accidents are a central concern in traffic safety. According to ASIRT (Association
for Safe International Road Travel), more than 38,000 people die annually in crashes and
car accidents on U.S. roadways. An additional 4.4 million are seriously injured and require
immediate medical attention. Traffic accidents and crashes are the dominant cause of death
in the U.S. for people aged 1-54. Reducing the response time of Emergency Medical
Technicians (EMT) to car crashes is key to increasing the survivability of a crash for those involved.
According to studies, counties across the United States with a response time of more than
12 minutes had a motor vehicle collision mortality rate nearly twice that of counties with
a response time of fewer than 7 minutes [5, 13]. For this reason, even a decrease in EMT
response time on the order of fractions of a minute can prove life-saving, especially in serious
collisions at highway speeds. If transportation officials in a state have any advance notice
or warning as to which areas of the state’s interstate highways are most likely to have an
accident at certain times of the day, a decrease in response time might be obtained.
Road traffic accident prediction or rapid detection plays a crucial role in safety manage-
ment and planning. Research on traffic safety has a long tradition as accidents on the roads
are one of the most fatal threats to people. Predicting possible traffic accidents can be a
solution to avoid accidents, reduce damage from them, give the drivers chances to reduce
the damage by quick response and reaction, or improve the emergency management system.
However, predicting the exact time and location of accidents is practically impossible. An
alternative strategy is to detect the occurrence of the accidents rapidly or identify abnormal
behavior that may lead to an accident. Early detection of accidents results in less delay
and inconvenience, faster emergency response, and faster announcements to users to take a
detour. A nationwide survey on the deployment of accident detection algorithms in Traffic
Management Centers (TMC) showed that 90% of survey respondents feel that the exist-
ing algorithms are not applicable in large-scale real-world systems due to the complicated
and time-consuming calibration or unacceptable false alarm rates [47]. In this study, the
main goal is to develop an accident detection model that can extract maximum information
from the traffic data to generate the normal travel pattern of each segment. Consequently,
anomalous behavior could be captured as a potential accident occurrence.
1.2 Problem
The task of accurate traffic forecasting in extreme conditions is difficult mainly due to the
complex nature of traffic accidents. Another problem with accident prediction/detection is
the scarcity of accidents in both space and time. Due to the limited number of samples, it
is challenging to precisely predict the occurrence of individual accidents. A large number of
existing works on traffic accident detection or prediction apply classical models such
as classification or regression on limited data. This leads to unsatisfactory performance. The
classification models mainly create a standard predicting model based on the information
learned from the training set. For example, some work employed classic classifiers to predict
if an accident will occur at a specific location during each time window [8, 12]. Another group
of studies focused on the prediction of the number of accidents in a specific area [3, 4, 36].
This could be more applicable for risk analysis and unsafe area identification rather than
decreasing the emergency response time.
The goal of this research is to establish a deep learning LSTM neural network model that can
address some of these problems. The deep learning approach provides automatic
representation learning from raw data. Instead of common classification and regression
approaches, we propose an anomaly detection approach based on the difference between
predicted and actual values. This eliminates the cumbersome process of labeling the data
as ‘accident’ and ‘non-accident’ and dealing with imbalanced data. Furthermore, the detected anomalous
data represents hazardous situations in addition to potential accidents. Several detailed fea-
tures such as traffic speed, traffic flow, and weather are used to train the predictive model
and identify the abnormal traffic dynamics that lead to an accident.
1.3 Research Question
Do deep learning structures help with the development of tools to predict traffic accidents
occurrence or detect them rapidly? What kind of approaches would fit better for accident
occurrence prediction/detection rather than predicting the number of accidents?
1.4 Hypothesis
With the aforementioned research questions in mind, we present the following hypothesis:
Development of a deep learning model coupled with anomaly detection would be efficient
for accident prediction/detection purposes.
1.5 Contribution
In this work, we propose a deep-learning-based anomaly detection approach to detect/predict
accidents on the roadways. We highlight our contribution as follows:
• We collect and fuse heterogeneous big datasets including weather, time, and traffic for
traffic accident prediction/detection.
• To address the spatial dependency of the traffic features and improve the accuracy of
the prediction, we pass the traffic feature sequences of upstream and downstream of
the target point to train the model accordingly.
• We focus on the detection/prediction of accidents and hazardous dynamics rather than
predicting the number of accidents in a region.
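As an illustration of this input construction, the sequences for one detector station can be assembled by windowing the station's speed series together with upstream and downstream measurements. This is a sketch under stated assumptions: the window length, the feature set, and the array layout below are illustrative, not the thesis's exact configuration.

```python
import numpy as np

def build_sequences(speed, flow_up, flow_down, window=12):
    """Stack the previous `window` observations of local speed plus
    upstream/downstream flow into one training sample per time step.
    Inputs are equal-length 1-D time series; returns inputs of shape
    (n_samples, window, 3) and speed targets of shape (n_samples,)."""
    X, y = [], []
    for t in range(window, len(speed)):
        features = np.stack([speed[t - window:t],
                             flow_up[t - window:t],
                             flow_down[t - window:t]], axis=1)
        X.append(features)          # one (window, 3) sample
        y.append(speed[t])          # next speed value to predict
    return np.array(X), np.array(y)
```

Time-of-day, day-of-week, and weather columns would be appended along the last axis in the same way.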
1.6 Organization of Thesis
The rest of this document is organized as follows: Chapter 3 includes a review of literature
that discusses research into predictive models, and into the needs of projects which could
leverage a framework like this, to help motivate the design of the framework. Chapter 4 gives
an outline of the design and architecture of our proposed model, including a brief discussion
of the functionality of each of the provided modules. Chapter 5 discusses the implementation
of the model on the traffic and accident data collected from the California PeMS system, and
evaluates the success of the developed model. Chapter 6 summarizes the previous sections,
discusses the findings, and presents proposals for future work. We have provided a user and
developer manual in Chapter 7 and Chapter 8, respectively.
Chapter 2
Motivating Works
This research was inspired by CEE 5604 at Virginia Tech. A part of this work was con-
ducted in collaboration with Jacob Smethurst and Elias Gorine as partial fulfillment of the
requirements of CS 4624 at Virginia Tech. In this section, we present a description of these
two courses to illustrate the relation to this project.
2.1 CS 4624: Multimedia, Hypertext, and Information
Access
CS 4624 at Virginia Tech is a class which uses a project-based learning approach to teach
students the architectures, concepts, data, hardware, methods, models, software, standards,
structures, technologies, and issues involved with: networked multimedia (e.g., image, audio,
video) information, access, and systems; hypertext and hypermedia; electronic publishing;
and virtual reality. The project-based learning approach makes use of a single, semester-
long project to guide students through the learning process. The class normally functions
by grouping the students into teams, with each team being responsible for a small project
related to the overall course goal. These teams are assigned to a client and will have to work
together under the guidance of Dr. Fox and the class GTAs to come up with a way to use
the resources provided to accomplish an overarching, semester-long project goal.
2.2 CEE 5604: Traffic Characteristics and Flow
The goal of the course is to provide a background in traffic flow theory for the analysis
of controlled and uncontrolled roadway facilities. The course focuses on traffic flow theory
as it relates to vehicle steady-state longitudinal motion, behavior during non-steady states
(deceleration and acceleration), traffic stream models, heterogeneous traffic stream flow,
lane-gap acceptance and lane changing modeling, and the estimation of delay upstream of
moving and stationary bottlenecks. The requirements of this course are partially fulfilled
through a final project that applies traffic theory concepts to real-world case studies; that
project formed the skeleton of this thesis.
Chapter 3
Review of Literature
To date, several studies have deployed various data sources or a combination of them
for detection of accidents. In this chapter, we review the studies that have utilized classical
machine learning techniques and deep learning models for accident detection.
3.1 Traffic Accident Prediction Using Classical Tech-
niques
There have been numerous studies to investigate methods for classifying spatial units (e.g.,
road segments) at a given time into classes of ‘accident’ and ‘no accident’. Statistical
techniques, image processing [25], pattern recognition, and artificial intelligence methods
[3, 12, 37] have been widely used to address the accident detection problem. For example,
Chang et al. [8] developed a decision tree model to build a classifier that predicts accidents
with training and testing accuracy of 55%. Lin et al. [30] explored multiple machine learning
techniques such as Random Forest, K-Nearest Neighbor, and Bayesian Network, to predict
accidents along the roadways. Yuan et al. [50] evaluate the performance of Support Vector
Machine (SVM), Decision Tree, Random Forest, and Deep Neural Network (DNN) in pre-
dicting and classifying the accidents on roadways. Caliendo et al. [6] employed the Poisson,
Negative Binomial, and Negative Multinomial regression models for the task of predicting
the number of accidents in multi-lane roadways.
Theofilatos [44] applied Random Forest and Bayesian logistic regression models to the
real-time traffic data of urban arterial roads to study the likelihood of road accidents.
Hebert et al. [20] explored various machine learning methods to manage the class imbal-
ance inherent in accident prediction problems. The authors employed the Balanced Random
Forest algorithm, a variant of the Random Forest machine learning algorithm in Apache
Spark. The results from the experimental case study show that 85% of vehicle collisions are
detected.
Among machine learning techniques, Probabilistic Neural Network (PNN) and Support Vec-
tor Machine (SVM) are two important techniques that have been used to detect accidents
[48]. Studies of SVM application in accident detection and prediction are well documented.
It is acknowledged that SVM models provide a higher correct detection rate and lower false
alarm rate when compared to probabilistic neural network models [50]. For example, Li et
al. [29] assessed the application of SVM models for predicting vehicle crashes, and compared
the performance of SVM models with the Back-Propagation Neural Network (BPNN). It has
been shown that SVM does not have the over-fitting problem that often occurs in BPNNs
and is faster to implement for the specific purpose of accident prediction.
More recently, eXtreme Gradient Boosting (XGBoost) has been leveraged to predict the
occurrence and duration of an accident by Meng [33]. Some authors [34, 42] have also shown
that XGBoost shows a better performance in prediction of the likelihood of an accident
when compared to methods like Logistic Regression, Bayesian Regularized Neural Network,
Bagging Average Neural Networks, and Gradient Boosting.
All of the above works use limited features and small-scale traffic accident data
(e.g., one or a small number of roads). Increasing the number of features to improve the
accuracy and expand the spatial scale of the analysis may lead to unsatisfactory performance
and high computational cost. To address these problems, some recent works have used deep
learning approaches.
3.2 Deep Learning Models for Traffic Accident Predic-
tion
A series of recent studies have employed deep learning methods for traffic accident inference.
For example, Convolutional Neural Networks (CNN) have been coupled with vision-based
data (e.g., facial features such as eye movement and blink rates) to detect drivers’ distrac-
tion [26], drowsiness [16], and fatigue [31]. As CNNs work well with vision-based data, they
have been widely used for predicting and analyzing accidents when images and videos are
involved. Shah et al. [43] utilized CNN models to explore and investigate accidents using
the data from closed-circuit television traffic cameras. In another study, Najjar et al. [35]
trained a CNN using historical accident data and satellite images to predict the risk of ac-
cidents on an intersection where they achieved an accuracy of 73%.
Recurrent Neural Networks (RNNs) show promise to work well with sequential data like
time-series. They have also been leveraged for traffic accident prediction thanks to their
generally high performance and the availability of time-series data [46]. For example, Ren
et al. [39] proposed a deep learning approach (RNN) to predict traffic accident risk, where
risk is defined as the number of accidents in a region at a certain time. Chen et al. [10] used
a similar concept of traffic accident risk and developed an Autoencoder deep architecture to
understand the impact of human mobility on traffic accident risk.
More recently, Yuan et al. [51] used the Convolutional Long Short-Term Memory (ConvL-
STM) to predict the number of accidents in a region based on the spatial structure of a road
network, weather information, and volume of traffic. Multiple heterogeneous data have been
collected and integrated using satellite images, traffic camera data, roadway weather
information system data, and rainfall data. However, it still focuses on the prediction of frequency
and number of accidents rather than the time of occurrence. Identifying the frequency and
number of accidents could generate useful information for safety analysis. However, for an
efficient emergency management system, predicting the occurrence of an accident or
detecting the accident promptly could provide more beneficial information. This study
addresses this gap by employing a deep learning model.
Chapter 4
Methodology
In this section, we first look at traffic fundamentals and how an accident impacts the
dynamics of traffic flow. The principles and fundamentals of traffic theory help us to better
understand and interpret the input features and structure of our model. Next, we review
the basic deep learning models that are going to be used for this study.
4.1 Traffic Dynamics at the Time of an Accident
Traffic accidents are one of the important sources of traffic jams. Accidents cause a temporary
local reduction of capacity. To explain the change in the traffic parameters, we need to look
at the triangular fundamental diagram (Figure 4.1). The fundamental diagram of traffic
flow represents the relation between the traffic features (i.e., flow, speed, and density).
As presented in Figure 4.2, when an accident occurs, the traffic moves from an uncongested
state (point A) to a congested state (point B). This change in the states affects the speed
and flow of the vehicles. In other words, it is going to create a shock-wave that will form
a queue after the bottleneck (i.e., accident location). This phenomenon is often shown in
the space-time diagram and will create a draw-up draw-down cycle in the speed-time graph.
Figure 4.3 illustrates the concept of shock-wave and how the speed of the vehicles is going
to change when the shock-wave happens. In normal cases (i.e., non-accident), the traffic
conditions do not vary significantly in sequences of time series between the upstream and
Figure 4.1: The fundamental traffic diagrams according to Greenshield [32]
downstream. On the other hand, traffic conditions between the upstream and downstream
fluctuate rapidly when an accident occurs. This fluctuation is a result of the shock-waves
caused by the accident. Mathematically, the speed of a shock-wave (i.e., the speed at which
congestion travels backward from the temporal bottleneck formed because of the accident)
can be derived from the traffic characteristics (i.e., flow rate and density) of the upstream
and downstream. Hence, the change in the speed dynamics when an accident occurs could
be observed more significantly at the road sections after the accident location [41]. That
being said, to detect or predict an accident, we should look for the anomalies where the
queue is formed (backward from the accident location). However, since the loop detectors
(the main source of traffic data in this study) are spaced roughly 0.1 miles apart,
some anomalies may be observed in the upward direction as well. This information about
Figure 4.2: Position of traffic states at the fundamental diagram when an accident occurs
the general dynamics of traffic at the time of an accident could enhance our understanding
of the anomaly points and how they should be interpreted.
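The shock-wave speed alluded to above follows from conservation of vehicles across the interface between two traffic states: it is the slope (q_d − q_u)/(k_d − k_u) of the line joining the two states on the flow-density fundamental diagram. A small sketch; the flow and density numbers are made-up illustrations, not measurements from the study area.

```python
def shockwave_speed(q_up, k_up, q_down, k_down):
    """Shock-wave speed between an upstream state (flow q_up in veh/h,
    density k_up in veh/mi) and a downstream state. A negative result
    means the wave propagates backward, i.e., the queue grows upstream
    against the direction of travel."""
    return (q_down - q_up) / (k_down - k_up)

# Free flow (1800 veh/h at 30 veh/mi) meeting a congested state at the
# accident (600 veh/h at 150 veh/mi): the wave moves at -10 mi/h.
w = shockwave_speed(1800.0, 30.0, 600.0, 150.0)
```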
4.2 Basic Deep Learning Concepts
In this section, we introduce the basic concepts of neural networks and deep learning
architectures.
4.2.1 Recurrent Neural Network (RNN)
A recurrent neural network is a neural network that contains at least one feedback loop. In
other words, the connections between nodes form a directed graph along a temporal
Figure 4.3: An example of time–space diagram for typical temporary capacity reduction (i.e., traffic accident) [14]
sequence [52]. If the input vector at timestamp t is denoted as x_t, the hidden layer vector
as h_t, the weight matrices as W_h and U_h, and the bias as b_h, then o_t is the output
sequence, which is a function of the current hidden state. The RNN iteratively computes the
hidden layer and output using the following recursive procedure:

h_t = σ(W_h x_t + U_h h_{t-1} + b_h)    (4.1)

and,

o_t = σ(W_o h_t + b_o)    (4.2)

where W_o and b_o denote the weight matrix and bias for the output, respectively.
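Equations 4.1 and 4.2 amount to one matrix-vector recurrence per time step. A minimal NumPy sketch (the dimensions are arbitrary, and σ is taken to be the logistic sigmoid, as in the text):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_step(x_t, h_prev, W_h, U_h, b_h, W_o, b_o):
    """One RNN time step: Eq. 4.1 updates the hidden state from the
    current input and the previous hidden state; Eq. 4.2 maps the
    hidden state to the output."""
    h_t = sigmoid(W_h @ x_t + U_h @ h_prev + b_h)  # Eq. 4.1
    o_t = sigmoid(W_o @ h_t + b_o)                 # Eq. 4.2
    return h_t, o_t
```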
4.2.2 Long Short Term Memory (LSTM)
LSTM is a special type of RNN that makes it easier to remember past data in memory
and avoids the vanishing gradient problem of RNNs [17]. LSTMs are trained using
back-propagation. An LSTM can remove or add information to the cell state, which allows
information to flow along the network, and regulates it using the ‘gate’ concept. In
an LSTM network, three gates are present: 1) Input gate (i_t), 2) Forget gate (f_t), and 3)
Output gate (o_t) (Figure 4.4). The LSTM architecture is specified as follows [49]:

i_t = σ(W_i x_t + U_i h_{t-1} + b_i)    (4.3)

c_t = tanh(W_c x_t + U_c h_{t-1} + b_c)    (4.4)

f_t = σ(W_f x_t + U_f h_{t-1} + b_f)    (4.5)

o_t = σ(W_o x_t + U_o h_{t-1} + b_o)    (4.6)

s_t = s_{t-1} ∘ f_t + c_t ∘ i_t    (4.7)

h_t = s_t ∘ o_t    (4.8)
where h_t denotes the hidden state, s_t denotes the cell state at time t, and ∘ denotes the Hadamard
product [21]. The gates can learn which data in a sequence are most important to keep or throw
away. With this consideration, the gates can pass relevant information down the long chain of
sequences for predictions. This architecture makes LSTMs well suited for time-series prediction.
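Equations 4.3–4.8 can be written out as a single NumPy step. This sketch follows the text's formulation literally, including h_t = s_t ∘ o_t in Eq. 4.8 (a common variant applies tanh to the cell state first); the parameter names mirror the symbols above, and the dimensions are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, s_prev, p):
    """One LSTM time step per Eqs. 4.3-4.8. `p` maps the symbol names
    used in the text (W_i, U_i, b_i, ...) to NumPy arrays."""
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])  # Eq. 4.3, input gate
    c_t = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev + p["b_c"])  # Eq. 4.4, candidate
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])  # Eq. 4.5, forget gate
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])  # Eq. 4.6, output gate
    s_t = s_prev * f_t + c_t * i_t   # Eq. 4.7, elementwise (Hadamard) products
    h_t = s_t * o_t                  # Eq. 4.8, as written in the text
    return h_t, s_t
```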
Figure 4.4: Graphic representation of LSTM gates [1]
4.2.3 Bidirectional LSTM Recurrent Structure
Bidirectional LSTMs build on traditional LSTMs and were introduced to improve model
performance on sequence classification problems [24]. The arrangement of the LSTM
memory block enables the network to store and retrieve information over long periods (Figure
4.5). One drawback of the standard LSTM networks is that they only have access to the
previous context but not to future context. In problems where all time steps of the input
sequence are available, Bidirectional LSTMs train two LSTMs instead of one on the input sequence (i.e., one on the sequence as-is and one on a reversed copy of it). This results in faster and even fuller learning on the problem [19].
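The two-pass idea can be sketched with a plain recurrent step in NumPy. The tanh cell and the sizes are illustrative stand-ins; a Bidirectional wrapper does the same with LSTM cells:

```python
import numpy as np

def run_rnn(seq, W, U, b):
    """Run a simple tanh RNN over a sequence; return the final hidden state."""
    h = np.zeros(U.shape[0])
    for x in seq:
        h = np.tanh(W @ x + U @ h + b)
    return h

def bidirectional_encode(seq, fwd, bwd):
    """Two recurrent passes: one over the sequence as-is and one over a
    reversed copy; the final states are concatenated."""
    h_f = run_rnn(seq, *fwd)
    h_b = run_rnn(seq[::-1], *bwd)
    return np.concatenate([h_f, h_b])

rng = np.random.default_rng(2)
mk = lambda: (rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4))
seq = rng.normal(size=(5, 3))
encoding = bidirectional_encode(seq, mk(), mk())   # shape (8,)
```

The backward pass is what gives the model access to "future" context at each point, which is valid here because the entire input window is available before prediction.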
In our framework, for each location on the roadways (denoted by stations where the loop
detectors are located and collect the speed and flow data), we construct a Bidirectional
LSTM model. The input X is the historical value of the dependent variables (i.e., speed
and flow values of upstream and downstream points). Furthermore, the time components
that influence the traffic conditions (i.e., time of the day, day of the week, and day of the
month) will be added to the input vector. Finally, the weather information collected for
each timestamp will form the additional input features. The bidirectional LSTM is a good
fit for speed prediction as the LSTM can potentially capture temporal autocorrelation in the
data. Once the model is trained based on the historical data, the speed values are estimated.
Thereafter, the anomalous behavior can be classified by setting a threshold for loss values
and examining the actual traffic data with the corresponding pattern.
Figure 4.5: The architecture of Bidirectional LSTM model [23]
Chapter 5
Results
5.1 Area of Study
We choose a 20-mile section of freeway I-5 N in the Sacramento area, as the study area
(Figure 5.1). According to the California Office of Traffic Safety, there are over 3,000 traffic
accidents per year in Sacramento that result in death or serious injuries. We looked at traffic and accident data for the 6 months from July to December 2018. All the data we collected about Sacramento are described below.
5.2 Data
5.2.1 Traffic and Accident Data
Traffic flow data used for empirical assessments was provided by the California Department
of Transportation (Caltrans) Performance Measurement System (PeMS) [9]. PeMS gets its
data from ITS, Vehicle Detector Stations (VDS), traffic counters (e.g., traffic census stations and weigh-in-motion (WIM) sensors), and other data sets like California Highway Patrol (CHP) incident data, the Caltrans Photolog, etc. Caltrans PeMS consists of 18,300 detector stations and collects traffic data every 30 seconds. To account for possible malfunction of detectors and sensors, PeMS uses a process called data imputation to compile 30-second
Figure 5.1: The spatial extent of area of the study
data sets without any gaps and aggregate them into 5-minute increments [45]. PeMS is a
real-time Archive Data Management System (rt-ADMS) that collects, stores, and processes
raw data in real-time [9]. An advantage of using PeMS compared to raw inductive loop or
sensor data is that the PeMS platform and algorithms manage the data fusion and most
of the pre-processing and cleaning of the data. However, the drawback is the aggregation of data into 5-minute increments, which decreases the temporal resolution of the dataset. As our analysis is mainly at the macro scale (macroscopic behavior of traffic), the aggregation may not significantly impact the results.
PeMS uses Postmiles to measure locations on state highways. The system uses two types
Table 5.1: Example of traffic speed data

Time  Postmile (Abs)  Postmile (CA)  VDS      Agg Speed  # Lane Points  % Observed
0:00  3.335           3.425          1118352  68.1       4              100
0:00  2.56            2.65           1114720  70.4       4              100
0:00  2.195           2.285          1118348  67.3       4              100
0:00  1.143           1.233          1118333  67.5       4              100
0:00  0.22            R.31           1114091  71.1       6              100
0:05  533.515         3.57           317377   67.6       2              0
0:05  530.517         0.572          316096   67.6       2              0
0:05  524.95          29.657         317843   67.6       3              0
0:05  524.193         28.9           318236   67.6       3              0
0:05  523.39          28.097         315927   67.6       4              0
0:05  523.247         27.954         315969   67.6       4              0
0:05  520.744         25.451         315054   67.6       4              0
0:05  519.874         24.581         318632   67.6       4              0
of Postmiles and includes both in the dataset. The jurisdictional (Caltrans) Postmiles are
assigned to physical boxes and geometric features on freeways when they are built. Absolute
Postmiles reflect the actual distance along a freeway from its beginning to its terminus. PeMS
uses absolute Postmiles to compute the distance between detectors. With this definition,
the absolute Postmile is a unique value for each freeway and will be used as the location
indicator of road sections in our analysis. Road sections are defined as the section of the
road between two correctly working stations (Postmile) [15].
Traffic speed and flow data are grouped by day and interstate highway. For each day of data, seven pieces of data per Postmile are included (Table 5.1): Time, Absolute Postmile, Caltrans Postmile, Vehicle Detector Station (VDS) ID, Aggregate Speed/Flow, Number of Lane Points, and Percent Observed.
• The Time value begins at 00:00, representing 12:00 AM, and continues throughout the
day, using a 24-hour clock.
• The Absolute Postmile is a measure of the location of the reading, using the statewide
Figure 5.2: PeMS homepage, available at http://pems.dot.ca.gov/
mile markers for the particular highway.
• The Caltrans Postmile is a more convoluted measure of the location of the reading; it
measures the distance from the county line of the county in which the reading occurs.
• The Vehicle Detector Station, or VDS, is the unique identifier of the station to which
the loop detector belongs.
• The Aggregate Speed/Flow value is the average of the speeds/flows of vehicles passing
over the loop detector.
Table 5.2: A sample of incident data available at PeMS and provided by California Highway Patrol (CHP)

Incident ID  Start Time      Duration (mins)  Freeway  CA PM    Abs PM  Area        Location                   Description
18258118     10/01/18 00:05  62               I-5 N    R17.865  680.5   Redding     I5 N / Twin View Blvd Ofr  1182-Trfc Collision-No Inj
18258129     10/01/18 00:23  210              I-5 N    26.567   143.2   Altadena    I5 N / Ca134 W Con         CZP-Assist with Construction
18258138     10/01/18 00:33  38               I-5 N    25.067   141.7   Central LA  I5 N / So Colorado Blvd    Hit and Run No Injuries
• The number of lane-points indicates the number of detector data points used to make
the computation.
• The percent observed indicates how much of the data is observed (actual data received that met all diagnostic tests) as opposed to imputed. The percent observed is very useful in determining the quality of the data.
For this project, the most important pieces of data for each speed reading are the Time
value, the Absolute Postmile, and the Aggregate Speed value. Traffic flow data is organized
in almost the same way, with the main difference being the inclusion of an Aggregate Flow
value instead of an Aggregate Speed value.
The traffic incident data, also available on PeMS, has 10 pieces of data per entry (see Table 5.2): Incident ID, Start Time, Duration, Freeway, Caltrans Postmile, Absolute Postmile, Source, Area, Location, and Description.
• The Incident ID is a unique identifier for the incident.
• The Start Time is a 24-hour clock timestamp of the form MM/DD/YY HH:MM that
represents when the incident occurred.
• The Duration is a value in minutes that represents the duration of the incident that
occurred, as assessed by the police department reporting the incident.
• The Freeway is the name and travel direction of the interstate highway, for example,
“I5-N”.
• The Caltrans Postmile and Absolute Postmile are measures of the location of the incident, using the same conventions as discussed for the traffic speed and flow data.
• The Source is typically “CHP”, which represents the California Highway Patrol.
• The Area is the county in which the incident took place.
• The Location includes the information from the Freeway value, as well as the nearest
cross-street.
• The Description is a text description of the incident that also includes a four-digit
incident code. For example, the entry for a traffic collision with no injuries is “1182-
Trfc Collision-No Inj”.
For this project, the most important pieces of data for each incident report are the Start
Time, Duration, Freeway, Absolute Postmile, and Description. For each of these data types
(traffic speed, flow, and incidents) contained in a CSV file, we begin our data preprocessing
by loading each into a table in an SQLite3 database. After using Python’s CSV reader
capabilities to create objects of each type of data point, we use SQLite3’s helpful Python
library to insert these data points into a raw database.
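The loading step can be sketched as follows. The table name and column layout here are hypothetical, mirroring the fields of Table 5.1; the actual DataLoader.py schema may differ:

```python
import csv
import sqlite3

def load_speed_csv(csv_path, db_path):
    """Read a speed CSV and insert its rows into an SQLite3 table.

    Assumed column order: time, abs postmile, CA postmile, VDS id,
    aggregate speed, number of lane points, percent observed.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS speed (
        time TEXT, abs_pm REAL, ca_pm TEXT, vds INTEGER,
        agg_speed REAL, lane_points INTEGER, pct_observed INTEGER)""")
    with open(csv_path, newline="") as f:
        rows = [tuple(r) for r in csv.reader(f)]
    conn.executemany("INSERT INTO speed VALUES (?,?,?,?,?,?,?)", rows)
    conn.commit()
    conn.close()
```

`executemany` with parameter placeholders is the idiomatic way to bulk-insert CSV rows without string formatting.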
5.2.2 Weather Data
The weather data is collected from the MesoWest database [22]. MesoWest is a cooperative project first developed at the University of Utah in 1996. The MesoWest project provides access to current and archived weather observations across the United States. These
Table 5.3: A sample of weather data retrieved from MesoWest

Date Time             Air Temperature (Celsius)  Wind Speed (m/s)  Rainfall (mm)  Weather Condition
2018-10-02T13:53:00Z  17.8                       2.57              0.025          Clear
2018-10-02T13:55:00Z  18                         2.57              0              Mostly Clear
2018-10-02T14:35:00Z  18                         5.14              0              Partly Cloudy
2018-10-04T00:53:00Z  21.7                       4.12              0.254          Rain
data are available through the traditional suite of web products and an API service. MesoWest data can be downloaded directly from the website. Figure 5.3 shows the weather stations for which data is available in the Sacramento area.
Figure 5.3: MesoWest weather data API map
For each weather monitoring station, different variables are available. This dataset includes wind features (e.g., speed, gust, or direction), temperature, cloud layer, weather condition, pressure, altimeter, and many more.
For this study, we use the rainfall, temperature, wind speed, and general weather condition.
Table 5.3 presents an example of the weather data we used as the input of our model. The
flow of data is shown in Figure 5.4. A summary of the data statistics is presented in Table
5.4. It should be noted that there were some missing data in the flow dataset. Before training the model, the timestamps (data points) with missing flow values were excluded.
Figure 5.4: The flow of data through the framework
Table 5.4: Data statistics

Feature      Count  Mean    Standard Deviation
Speed        43617  63.88   4.40
Flow         43513  111.54  61.52
Temperature  43617  19.30   7.61
Rainfall     43617  0.29    0.46
Wind Speed   43617  2.37    1.83
5.3 Deep Learning Model
In an effort to build a model for accident prediction/detection, we construct a neural network based on a series of combinations of deep learning primitives. This deep learning architecture allows us to predict future speed values based on the historical upstream and downstream speed and flow data. The input features are selected considering the dynamics of traffic after an accident occurs and based on the fundamentals of traffic engineering and flow theories. Since we are dealing with time series and previous states are important in the prediction, we employed an LSTM architecture as the baseline of our model. In order to take the spatial dependencies of road sections into account, we passed
downstream and upstream information as input features when training our model. To increase the training rate, we selected the bidirectional wrapper, as it is a better fit for our purpose.
5.4 Feature Engineering
Preparing the data for time-series forecasting (LSTMs in particular) can be tricky. Intuitively, we need to predict the value at the current time step by using the history (the n time steps before it). The temporal resolution of the traffic data is 5 minutes, and we chose 5 time steps to make the sequences. Hence, the model looks at the 25 minutes before each point during training. In our experiment, we select the traffic flow and speed of the past 25 minutes, which is a time sequence of 5 data points, for 5 stations on the upstream and 5 stations on the downstream to predict the coming traffic speed. We select 80% of the data to
train our model. Additionally, the weather features (i.e., temperature, rainfall, wind speed,
and weather condition) are added to the input vector.
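The windowing described above can be sketched as follows. The number of feature columns (24) and the random data are illustrative, not the exact layout used in this work:

```python
import numpy as np

def make_sequences(features, target, n_steps=5):
    """Slide an n_steps window (25 minutes at 5-minute resolution) over the
    feature matrix; each sample predicts the target at the next timestamp."""
    X, y = [], []
    for t in range(n_steps, len(features)):
        X.append(features[t - n_steps:t])
        y.append(target[t])
    return np.array(X), np.array(y)

# Toy example: 100 timestamps, 24 feature columns (standing in for the
# speed/flow of upstream and downstream stations, time components, weather).
rng = np.random.default_rng(3)
feats = rng.normal(size=(100, 24))
speed = rng.normal(loc=65, scale=5, size=100)
X, y = make_sequences(feats, speed)
train = int(0.8 * len(X))            # 80% of the data for training
X_train, y_train = X[:train], y[:train]
```

The resulting X has shape (samples, 5, 24), which is the (batch, timesteps, features) layout that recurrent layers expect.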
5.5 Results
After conditioning the data and building the time sequences, the bidirectional LSTM model is built using the Keras [11] framework on top of TensorFlow [2], using the Adam optimizer [27] with mean squared error as the loss function. The model was trained for 10 epochs with a batch size of 32. We used the Google Colab environment to run the analysis
on a GPU.
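A minimal Keras sketch of such a model is shown below. The layer width (64 units) and input shape are illustrative assumptions, as the exact hyperparameters are not restated here; the optimizer, loss, epochs, and batch size follow the setup above:

```python
import tensorflow as tf

def build_model(n_steps=5, n_features=24):
    """Bidirectional LSTM regressor predicting a single speed value."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_steps, n_features)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(1),      # predicted speed
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")
    return model

model = build_model()
# Training as in this section (X_train/y_train from the windowing step):
# model.fit(X_train, y_train, epochs=10, batch_size=32)
```

The Bidirectional wrapper trains forward and backward LSTMs and concatenates their outputs before the dense regression head.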
In this section, we present the results of the analysis. The prediction and anomaly detection have been done for 6 different locations (with an average distance of 1 mile between each location). We present the results for one sample location in this section. However, the graphs and results for other locations are presented in the Appendix for further demonstration.
Figure 5.5 shows the loss values after training the model for 10 epochs. It can be seen that the model learns at a satisfactory rate. It should be noted that with more epochs the model tended to overfit. We selected 10 epochs to avoid overfitting and better capture the anomalies in future steps. An example of how well the model predicts
Figure 5.5: Loss Value for training for Postmile 517.916
the speed values is presented in Figure 5.6. Since traffic speed shows a regular pattern in normal and undisrupted situations, it is expected that the model does not capture the extreme points as well as the other points. Indeed, the extreme points are the abnormal points of the data set, which potentially are formed due to a disruption to the traffic flow (i.e., accidents,
hazards, or road closures). In order to find the extreme points and investigate the potential
for describing the accidents, we calculate the Mean Absolute Error (MAE) on the training
data. The idea is to find the points where the actual value is significantly different from the predicted one. Figure 5.7 shows the histogram of loss values for the training data. We picked a threshold of 1.5, as not much of the loss is larger than that. Using the threshold, we can
Figure 5.6: Actual and prediction of speed values for Postmile 517.916
Table 5.5: A sample of anomaly points in the anomaly DataFrame

Time          Loss        Threshold  Anomaly
11/1/18 6:30  4.16351712  1.5        True
11/1/18 6:35  6.4858084   1.5        True
11/1/18 6:40  7.89771065  1.5        True
11/1/18 6:45  8.5084442   1.5        True
11/1/18 6:50  8.67621649  1.5        True
11/1/18 6:55  8.83714712  1.5        True
11/1/18 7:00  8.66085566  1.5        True
turn the problem into a simple binary classification task:
• If the reconstruction loss for an example is below the threshold, we will classify it as a
normal speed,
• Alternatively, if the loss is higher than the threshold, we will classify it as an anomaly.
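The thresholding step above can be sketched as follows, with toy speed values chosen for illustration:

```python
import numpy as np

def classify_anomalies(y_true, y_pred, threshold=1.5):
    """Per-timestamp absolute error compared against a fixed loss threshold;
    points whose loss exceeds the threshold are flagged as anomalies."""
    loss = np.abs(y_true - y_pred)
    return loss, loss > threshold

# Two depressed speed readings stand in for a disruption to the traffic flow.
y_true = np.array([65.2, 64.8, 55.0, 48.3, 64.9])
y_pred = np.array([65.0, 65.1, 63.8, 62.0, 64.7])
loss, is_anomaly = classify_anomalies(y_true, y_pred)
```

Here the two readings where the model's prediction misses badly are flagged, while the well-predicted points are classified as normal speeds.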
For each location on the roadway, a DataFrame including the anomaly points is generated. Table 5.5 presents an example of the anomaly points. We then calculate the MAE values for the test data. We build a DataFrame containing the loss and the anomalies (values above the
Figure 5.7: Histogram of loss value for training for Postmile 517.916
Table 5.6: A sample of the anomaly events DataFrame

ID  Start time    End time
1   11/1/18 6:30  11/1/18 8:25
2   11/5/18 7:20  11/5/18 9:15
3   11/6/18 7:00  11/6/18 9:25
threshold). Figure 5.8 shows the anomalies found in the test data. The red dots (anomalies)
are mostly located at the extreme values of speed. In typical anomaly detection problems, one point may suffice to describe the anomaly event. However, in our case and due to the temporal resolution of the data, a single anomaly point may be noise or unrelated to any specific extreme situation. As can be seen in Figure 5.8, in many cases anomaly points are located on a draw-down and draw-up cycle similar to the ones in Figure 4.3. This observation appears consistent with the dynamics of traffic when a disruption like an accident occurs. Hence, we aggregated the anomaly points that formed a cycle into one anomaly event. The results have been stored in a secondary DataFrame for interpretation. A sample of this DataFrame is presented in Table 5.6. We further removed the anomaly points that were not followed by any consecutive point and did not form a cycle, as they might indicate noise or random
Figure 5.8: Anomaly points for test dataset for Postmile 517.916
fluctuation in speed values. Once the anomaly events have been produced, the connection
with accidents could be explored.
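The aggregation of consecutive anomaly points into events (and the removal of isolated points) can be sketched as follows, assuming the 5-minute resolution of the data:

```python
from datetime import datetime, timedelta

def aggregate_events(timestamps, gap=timedelta(minutes=5)):
    """Group anomaly points spaced one 5-minute step apart into events and
    drop isolated single points, which may be noise."""
    events, run = [], [timestamps[0]]
    for t in timestamps[1:]:
        if t - run[-1] <= gap:
            run.append(t)
        else:
            if len(run) > 1:                 # keep only multi-point cycles
                events.append((run[0], run[-1]))
            run = [t]
    if len(run) > 1:
        events.append((run[0], run[-1]))
    return events

# Five consecutive anomaly points followed by one isolated point.
pts = [datetime(2018, 11, 1, 6, 30) + timedelta(minutes=5 * i) for i in range(5)]
pts += [datetime(2018, 11, 1, 9, 0)]         # singleton, treated as noise
events = aggregate_events(sorted(pts))
```

The run of points from 6:30 to 6:50 is merged into a single event, matching the event structure of Table 5.6, while the lone 9:00 point is discarded.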
Considering the spatial heterogeneity, we present the results for 6 Postmile stations in the
region of study. For each station, we looked at the number of reported incidents (traffic
collisions) by CHP for 5 miles on the upstream and 5 miles on the downstream. The results
are presented in Figure 5.9. The results show that the number of detected anomalies is
close to the number of accidents within that area. The lower number of detected anomalies compared to the actual accident reports is in line with real-world accident situations. In reality, not all accidents cause a significant change in traffic flow. Fast response and clearance times, low accident severity, and light traffic at the time of the accident are among the reasons that may lead to non-significant changes in traffic conditions.
The total number of accidents is generally a good indicator of long-term prediction. However,
it might not be good at predicting short-term traffic accidents, especially if the goal is
improved emergency management or travelers’ information systems. Quantitatively, incident detection performance has been assessed by the performance measures used in past studies [28, 38].
Figure 5.9: Comparison between the number of actual incidents (reported by CHP) and detected anomaly events
• Detection Rate (DR) is the ratio of the number of detected accidents to the total number of actual accidents (i.e., reported by CHP).
DR = (Total number of detected incidents / Total number of actual incidents) * 100% (5.1)
• False Alarm Rate (FAR) is defined as the ratio of the number of false alarms (i.e., in our case, anomalies that did not match any accident reports from CHP) to the total number of detected anomalies.
FAR = (Total number of false alarm cases / Total number of algorithm applications) * 100% (5.2)
• Mean Time to Detect (MTTD) is defined as the average time between the actual start of the accident and the time when the model captured the start of the accident.
MTTD = Total time elapsed between detecting incidents / Total number of incidents detected (5.3)
• Performance Index (PI) integrates all 3 performance measures (DR, FAR, and MTTD) to evaluate the overall performance of the anomaly detection framework [40]. Lower values of PI are associated with better performance of the model. Since DR can be 100% or FAR can be 0% during training, the PI measure is slightly modified with the constants (1.01 and 0.001) to handle such cases, similar to [40].
PI = (1.01 − DR/100) * (FAR/100 + 0.001) * MTTD (5.4)
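A literal implementation of Eqs. (5.1), (5.2), and (5.4) can be sketched as below; DR and FAR are taken in percent here, and the example counts are illustrative rather than values from this study:

```python
def detection_rate(detected, actual):
    """DR, Eq. (5.1): percentage of actual incidents that were detected."""
    return detected / actual * 100.0

def false_alarm_rate(false_alarms, detections):
    """FAR, Eq. (5.2): percentage of detections not matching any CHP report."""
    return false_alarms / detections * 100.0

def performance_index(dr, far, mttd):
    """PI, Eq. (5.4); the 1.01 and 0.001 constants guard the degenerate
    DR = 100% and FAR = 0% cases so PI never collapses to zero."""
    return (1.01 - dr / 100.0) * (far / 100.0 + 0.001) * mttd

dr = detection_rate(8, 12)            # e.g., 8 of 12 incidents detected
far = false_alarm_rate(3, 22)         # e.g., 3 false alarms among 22 detections
pi = performance_index(dr, far, 14.0) # MTTD in minutes
```

Even a perfect detector (DR = 100%, FAR = 0%) yields a small positive PI proportional to its MTTD, which is the point of the two constants.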
Table 5.7 presents the performance evaluation metrics. DR values show potential in the
detection of a good number of actual accidents using the anomaly detection framework. The
relatively low value of DR in some cases can be explained by the nature of accidents and
how some accidents may not generate significant changes in the speed of the vehicles (as
the recovery may happen quickly or the traffic has been light in the pre-accident state). On
the other hand, FAR values vary among different locations. The high FAR values could be
explained by the fact that some of the detected anomalies are associated with non-accident
disruptions in the traffic. For example, an anomaly could be related to a traffic hazard,
animal crossing, defective traffic signals, or closure on the road. Since the focus of this study
Table 5.7: Variation of performance evaluation metrics for different Postmiles in the area of study

Postmile Station  DR     FAR    MTTD (mins)  PI
516.593           0.667  0.136  13.1         0.031
515.173           0.651  0.292  16.8         0.066
513.503           0.870  0.299  14.4         0.066
511.543           0.650  0.267  12.0         0.044
510.293           0.692  0.286  19.5         0.075
508.463           0.739  0.195  15.4         0.046
was on collisions, these types of events were excluded from the accident dataset. However, further investigation could explain the causes of the anomalies in more detail.
Interestingly, MTTD values tend to be small, which verifies the potential of the proposed
framework for implementation in emergency response systems. Overall, the aggregation of these metrics (i.e., PI) is promising, as shown in Table 5.7. It should be noted that if the threshold for anomalies is changed, the framework may be able to capture more accidents and improve the model’s performance.
This result ties well with previous studies wherein incident detection algorithms have been
developed to leverage large-scale traffic data for traffic accident detection considering the
data resolution and scale of analysis. The most related study in terms of resolution and
scope of the study achieved an average DR of 85%, FAR of 0.15%, MTTD of 10 minutes, and PI of 0.0025, where the authors used the high-resolution INRIX data and denoised thresholds for anomaly detection [7]. Even though we did not replicate the previously reported
method proposed by Chakraborty et al. [7], our results show promising potential for accident
detection/prediction. We speculate that the differences in performance among the algorithms might be due to the threshold setting and data resolution used in this study, which should be further investigated in future work.
5.6 Anomaly Inference
In this section, we investigate the short-term behavior of the anomalies and test whether they match reported accidents. In general, the anomaly event observations are classified into three major groups. The anomaly events in the first group are the ones that happen close in time to an accident report at the upstream points. The second group includes the events that matched an accident occurring at the downstream points. The third group contains the anomalies that did not match any close upstream or downstream
accident. Table 5.8 presents the anomaly events with the matched potential causes for station 516.593. To match the accidents in the CHP dataset with the detected anomaly events, for each anomaly event, we looked at the accidents that occurred 5 miles before and after the anomaly. The accidents that happened at a similar time or close to the timespan of an anomaly have been identified as the potential cause of that anomaly.
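The matching rule can be sketched as follows. The 30-minute time slack is an illustrative choice on our part, standing in for the "similar or close" time criterion; the 5-mile radius follows the text:

```python
from datetime import datetime, timedelta

def match_incident(event_start, event_end, station_pm, incidents,
                   radius_miles=5.0, slack=timedelta(minutes=30)):
    """Return the first CHP incident within 5 miles (by absolute Postmile)
    of the station whose start time falls near the anomaly's timespan."""
    for inc_time, inc_pm, desc in incidents:
        if (abs(inc_pm - station_pm) <= radius_miles
                and event_start - slack <= inc_time <= event_end + slack):
            return (inc_time, inc_pm, desc)
    return None

# The first row of Table 5.8: anomaly 6:30-8:25 at station 516.593 matched
# to a collision reported at 6:21 at Postmile 517.2.
incidents = [(datetime(2018, 11, 1, 6, 21), 517.2,
              "1181-Trfc Collision-Minor Inj")]
hit = match_incident(datetime(2018, 11, 1, 6, 30),
                     datetime(2018, 11, 1, 8, 25), 516.593, incidents)
```

Events for which this search returns nothing correspond to the N/A rows of Table 5.8 (group 3).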
In the first group, the anomaly is detected at a point before the reported accident. From the temporal point of view, in almost all of the observations, the start time of the anomaly event is after the time of the accident. This is in line with traffic theory fundamentals and the shock-wave concept. As traffic theory states, when an accident occurs, a stationary bottleneck is formed and congestion (i.e., a draw-down and draw-up cycle) forms upstream. This explains the observations in group 1.
Interestingly, in group 2, the detected anomaly’s timestamp is before that of the reported accident. A summary of the number of anomaly events in each group is presented in Table 5.9. From the spatial point of view, the anomalies in this group vary in their location relative to the location of the accident. In some cases, the location of the detected anomaly is before the reported accident, while in other cases it is after the accident location. This makes sense since our model predicts the speed values based on the upstream and downstream data. This might also be due to the nature of the bidirectional training process
Table 5.8: A sample of anomaly events with matched potential incident causes for station 516.593

Anomaly Event                        Incident
Start DateTime    End DateTime       Start Time        Location  Type
11/1/2018 6:30    11/1/2018 8:25     11/1/2018 6:21    517.2     1181-Trfc Collision-Minor Inj
11/5/2018 7:20    11/5/2018 9:15     N/A               N/A       N/A
11/6/2018 7:00    11/6/2018 9:25     N/A               N/A       N/A
11/6/2018 16:50   11/6/2018 17:00    11/6/2018 17:23   519.7     1182-Trfc Collision-No Inj
11/6/2018 19:05   11/6/2018 19:05    11/6/2018 17:38   520       1183-Trfc Collision-Unkn Inj
11/7/2018 8:00    11/7/2018 8:25     N/A               N/A       N/A
11/8/2018 7:45    11/8/2018 9:00     11/8/2018 8:47    514.4     1182-Trfc Collision-No Inj
11/8/2018 16:25   11/8/2018 18:15    11/8/2018 18:51   503.5     1183-Trfc Collision-Unkn Inj
11/9/2018 17:25   11/9/2018 17:55    11/9/2018 13:53   521.5     1183-Trfc Collision-Unkn Inj
11/13/2018 7:40   11/13/2018 8:50    11/13/2018 7:02   510.2     1182-Trfc Collision-No Inj
11/14/2018 8:00   11/14/2018 8:05    11/14/2018 7:01   520.3     1182-Trfc Collision-No Inj
11/14/2018 16:25  11/14/2018 16:30   11/14/2018 16:03  517.2     1183-Trfc Collision-Unkn Inj
11/15/2018 7:15   11/15/2018 7:20    11/15/2018 7:39   515.8     1183-Trfc Collision-Unkn Inj
11/15/2018 7:35   11/15/2018 7:40    11/15/2018 7:38   515.8     1183-Trfc Collision-Unkn Inj
11/16/2018 15:55  11/16/2018 16:55   11/16/2018 14:37  518.7     1182-Trfc Collision-No Inj
11/21/2018 15:45  11/21/2018 17:50   11/21/2018 15:59  518.7     1182-Trfc Collision-No Inj
11/28/2018 8:25   11/28/2018 8:35    11/28/2018 7:01   521.5     1182-Trfc Collision-No Inj
11/28/2018 17:15  11/28/2018 17:40   11/28/2018 18:02  519       1183-Trfc Collision-Unkn Inj
11/29/2018 9:00   11/29/2018 13:45   11/29/2018 11:55  515.6     1182-Trfc Collision-No Inj
11/29/2018 14:00  11/29/2018 18:05   11/29/2018 18:13  517.2     1125-Traffic Hazard
11/29/2018 19:50  11/29/2018 20:30   11/29/2018 22:38  520.5     1183-Trfc Collision-Unkn Inj
Table 5.9: Number of anomaly events in each group

Postmile  Group 1  Group 2  Group 3
516.593   10       8        4
515.173   9        9        6
513.503   8        21       10
511.543   10       9        10
510.293   6        7        5
and the fuller learning process. The overall results show that over the locations of the study, the anomaly points potentially highlight accidents occurring in the region. However, tuning the threshold could yield even better results.
Chapter 6
Conclusions
6.1 Conclusion
The problem of accident forecasting is an important problem for transportation and public safety. Forecasting an accident, or detecting its occurrence as fast as possible, could accelerate the emergency response and lead to faster clearance. In this research, an anomaly detection approach based on predictions from a deep learning framework was proposed for traffic accident detection/prediction. We employed a deep bidirectional LSTM for speed prediction based on
the historical data of traffic speed and flow of upstream and downstream points. Several
traffic and environmental features were retrieved from big datasets over 20 miles of the I-5 N
freeway in the Sacramento area across 6 months. We used the traffic features for downstream
and upstream of each point to address the spatial dependency. The proposed methodology
consists of two major steps.
First, the speed values at each location of the freeway were predicted. The bidirectional
LSTM model was trained with the traffic features of upstream and downstream points (i.e.,
speed and flow) and weather features (i.e., temperature, rainfall, wind speed, and general
weather condition). We further added the temporal features that are important in generic traffic conditions: time components (hour and minute of the day), day of the month, and day of the week (to take the effect of weekends and weekdays into account). Second, the
anomaly detection module was used to capture the points where the predicted and actual
speed values were significantly different. We set a threshold for loss value based on the
history of loss values in training and testing to capture the anomaly points.
Broadly speaking, the problem of accident detection/prediction has often been addressed
from the risk perspective where the objective is to analyze the number of accidents in a
region. However, it would be of special interest to capture the occurrence of an accident. To
illuminate this uncharted area, this work showed that anomaly detection using deep learning
techniques offers promising solutions to traffic accident prediction/detection and interpreta-
tion of the causes, if unique data properties are well handled. One of the contributions of
this study is the deep-learning-based anomaly detection approach to detect and predict an
accident. Contrary to the studies that focused on the prediction of the number of accidents
in a region, this study focuses on the identification of potential accidents downstream or upstream of a location on the roadways. While the number of accidents is more applicable in
risk analysis, the current approach could be implemented in traveler’s information and emer-
gency management systems. Furthermore, as presented in the results, the detected anomalies
could be due to non-accident events like hazards or construction zones. This approach has
been functional, but not optimal, and could be further improved in future avenues of research.
6.2 Future Work
As mentioned before, traffic accident detection and prediction is a complex problem. This thesis and project were designed with future work in mind. Three major areas are outlined where this project could be expanded in such future work.
6.2.1 Urban-scale Implementation
Future studies should aim to replicate the results on a larger scale for better capturing
the anomalies and related accidents. This is the most obvious way this project could be expanded. The current study focuses on a 20-mile section of I-5 in the Sacramento area. Although
the factors we utilized in this study can reveal and predict some traffic accident patterns,
they are far from complete, and other factors, such as driver behavior, road characteristics,
light conditions, and special events, are important as well.
6.2.2 Sensitivity Analysis for Setting the Threshold
The limitations of the present study include the threshold for defining an anomaly point. One concern about the anomaly findings was the choice of threshold value for anomaly detection. The model performance can be improved by employing sensitivity analysis on
different threshold values. Future research should consider the potential effects of threshold
values more carefully. In this study, we chose the threshold value based on the observations
of loss values in the training and testing sets. However, future research could continue to explore the effect of different threshold values on detecting accidents and how they impact the performance evaluation metrics (i.e., detection rate, false alarm rate, and mean time to detect). One potential method would be using total variation denoising.
6.2.3 Additional Spatial Dependency
One limitation of this study is the lack of spatial features and dependencies in the feature groups. Due to the complexity of traffic accidents, there is no definite answer for the spatial extent over which an accident’s impact can be observed. Though we looked for accident occurrence within 5 miles upstream and downstream of each point to address this problem, future research should examine an automated algorithm to look for the potential accident occurrence in each region.
Chapter 7
User Manual
The traffic data used for the case study in this project, as well as the relevant project code, can be found at https://github.com/farnazkgn/DeepLearningPredictingAccidents [18].
7.1 Software Requirement
The project requires the following software for use:
• SQLite3 for data conditioning and management
• Python 3 or later for data acquisition
• Jupyter Notebook for model development
7.2 Repository Content
The notable files available in the repository are as follows:
• Bidirectional model.ipynb
• DataLoader.py
• combined.db
• combined.csv
• data (directory)
– flows (directory)
– incidents (directory)
– speeds (directory)
– weather (directory)
Bidirectional model.ipynb is the Jupyter Notebook responsible for training the deep learning model and generating results. The dataset used for the analysis of the case study in this project is stored as combined.db and combined.csv.
DataLoader.py is the script responsible for aggregating data from the raw data files into combined.db. The raw data retrieved from PeMS consists of a separate .txt file of traffic information for each day. DataLoader.py aggregates all of these .txt files and stores them in the proper format for further analysis.
To run the script, execute “python DataLoader.py” from the command line. Successful script execution should take ten to twenty minutes and produce the following output in the terminal:
~ $ python DataLoader.py
Database created and successfully connected to SQLite
Creating table: combined_data
~ $
Running the script creates combined.db, which contains the full, unsorted dataset. To produce combined.csv for use in training and evaluating the model, some processing of the dataset using SQL is required. The first step is to launch the SQLite command-line interface:
~ $ sqlite3
Then, use the following commands to open combined.db and export the sorted contents to
combined.csv:
.open combined.db
.headers on
.mode csv
.output combined.csv
SELECT * from combined_data ORDER BY time;
.quit
Exporting the sorted data should take less than a minute. After completion, it is advisable
to briefly browse combined.csv to ensure there are no major issues with data quality. At this
point, Bidirectional model.ipynb can be run using your preferred Jupyter Notebook manager.
Bidirectional model.ipynb includes the deep learning model. The prerequisite libraries to run the models are TensorFlow and Keras. There are multiple methods to install these packages. The easiest is to use pip and enter the following on the command line (it is recommended to use the latest stable release with CPU and GPU support):
$ pip install --upgrade tensorflow
Once TensorFlow is installed, install the Keras library using the following command:
$ pip install keras
The description of each step of the deep learning model is available in the notebook. It is recommended to run the model on a GPU since it is faster and more efficient. We used the Google Colab environment to run the code. Google Colab is a free cloud service based on Jupyter Notebooks that provides free GPU access. To use Google Colab, upload the Jupyter notebook and either upload the dataset directly or use the Google Drive connection. Once these requirements are met, you can run the code like any other Jupyter notebook.
Chapter 8
Developer Manual
8.1 Data
Traffic and incident data for California can be found at http://pems.dot.ca.gov/. PeMS is also an Archived Data User Service (ADUS) that provides over ten years of data for historical analysis. To use this site, you must apply for an account. Registering takes only a few minutes, and accounts are typically approved within one to two business days. Further instructions on the data provided by PeMS and how to use the website can be found at:
http://pems.dot.ca.gov/Papers/PeMS_Intro_User_Guide_v6.pdf
As noted, the weather data was provided by MesoWest. The first step is to select a weather station. The MesoWest home page provides quick access to station data through the “Station Search” section. Additional information about how to search for a station is available at https://mesowest.utah.edu/html/help/userguide.html. Once the weather station is chosen, users can download the weather variables (e.g., temperature, wind direction, heat index, snow depth, etc.) for different periods using the following API: https://developers.synopticdata.com/. To use this API, users must create an account, which is approved immediately and ready to use.
8.2 Database Implementation
The database management module was developed by Elias Gorine and Jacob Smethurst in fulfillment of their CS4624 term project. Further details and information can be found in the project report available in VTechWorks [18].
The database schema is constructed so that for every five-minute interval between 12 AM
on July 1, 2018, and 11:55 PM on November 30, 2018, there is data for traffic speed, traffic
flow, and weather (air temperature, wind speed, and precipitation) at each of the 30 data
collection stations between Postmile 504 and Postmile 520 on I-5 N in California. The full list
of locations for the traffic data collection stations used for this project is [504.223, 504.793,
506.383, 507.504, 507.953, 508.463, 509.013, 510.094, 510.293, 510.643, 511.341, 511.543,
512.073, 512.435, 512.753, 513.503, 513.998, 514.662, 515.173, 515.973, 516.593, 517.093,
517.916, 518.543, 518.864, 519.193, 519.571, 519.863, 519.874, 520.744]. However, it can be
further expanded for other freeways in California depending on the area of analysis.
The first column in the database schema is a timestamp in the form MM/DD/YYYY HH:
MM. Then, for each of the listed data collection station locations (x), there is a column for
each of the data types in the form sx, fx, tx, wx, and px. Column sx represents the traffic
speed at station location x, column fx represents the traffic flow at station location x, and
so on with t = air temperature, w = wind speed, and p = precipitation.
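Because each column name embeds the station's Postmile (and therefore contains a dot), the identifiers must be quoted in SQL. A hedged sketch of how this schema can be generated for a subset of stations:

```python
import sqlite3

# Sketch: generate the combined_data schema described above for a subset of
# stations. Column names such as "s504.223" contain a dot and must be quoted.
stations = ["504.223", "504.793"]  # first two of the 30 Postmile locations
columns = ['"time" TEXT']
for x in stations:
    # speed, flow, air temperature, wind speed, precipitation
    for prefix in ("s", "f", "t", "w", "p"):
        columns.append(f'"{prefix}{x}" REAL')

create_sql = f"CREATE TABLE combined_data ({', '.join(columns)})"
conn = sqlite3.connect(":memory:")
conn.execute(create_sql)
print(len(columns))  # 11: one timestamp plus five columns per station
```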
8.2.1 Database Management with SQL
The flow_reader.py, incident_reader.py, and speed_reader.py scripts aggregate every value in the raw data files into a SQLite database. These scripts take longer to execute than DataLoader.py, which organizes data specifically in the schema used by our model, but they conveniently aggregate all the data so that it can be manipulated for any desired application. That is to say, DataLoader.py is less extensible and built directly for our application, whereas the flow_reader.py, incident_reader.py, and speed_reader.py scripts are more extensible but run more slowly and parse possibly unneeded data from the raw files.
We attempted to make our database population scripts modular to promote extensibility
by future developers. We recognize that successful accident detection models may include
many more factors than our code does, such as the vertical gradient of the road, a measure
of the road degradation, or a measure of the tightness of a curve in the road. If a developer
is seeking to add a new factor to the model, they should first find a high-quality source for
the data and ensure they have a way of finding the location and time of each data point in
the form used by our project.
To create a new database reader module, a developer can follow the example of flow_reader.py, incident_reader.py, or speed_reader.py. These files create a class with fields corresponding to the different columns in the CSV data file. They then use Python's CSV reader capabilities to parse the files and populate an array of the new class objects. Depending on the data used by the developer, additional work may be required to create a timestamp in the same format as the rest of the data entries.
Then, db_loader.py can easily be adapted to load this list of Python objects into the SQLite3 database. All insert and update queries managed by the db_loader.py script are protected by parameterized queries: the insert queries are set up as prepared statements, and as each data point is read out of its associated list, its values are broken down by column and bound to the statement as dynamic parameters.
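This pattern can be sketched with Python's sqlite3 module, which supports exactly this placeholder style; the table and rows below are illustrative rather than the project's actual schema.

```python
import sqlite3

# Illustrative sketch of the parameterized-insert pattern: the SQL text with
# "?" placeholders is prepared once, and each row is bound as dynamic
# parameters at execution time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE speed_raw (time TEXT, pm_abs REAL, speed REAL)")

rows = [
    ("07/01/2018 00:00", 504.223, 64.2),
    ("07/01/2018 00:05", 504.223, 63.8),
]
conn.executemany(
    "INSERT INTO speed_raw (time, pm_abs, speed) VALUES (?, ?, ?)", rows
)
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM speed_raw").fetchone()[0])  # 2
```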
This decision was made in part because it is not possible to verify the integrity of all of the data being loaded into the database. Rather than verifying that each data point contains no hidden SQL queries, we instead chose to use prepared statements and dynamic parameters (bind variables). The primary goal was to prevent SQL injection attacks, which may come from unverified data sources. Although all of the data sources for this iteration of the project are government-based and trustworthy, future developers extending the project may attempt to pull data from various sources. These sources may be compromised, and if a malicious SQL statement ends up within the source data, unpredictable outcomes are possible. For this reason, we highly recommend following the same practices as the provided source code and utilizing parameterized queries and bind variables.
Figure 8.1: Loading speed data using parameterized queries
Another issue to consider is performance. By its nature, the db_loader.py script loads hundreds of thousands, or even millions, of records into the DB. SQLite, like other databases, builds execution plans to determine the best strategy to execute a query. An execution plan can be stored in an execution plan cache, but this only works if the SQL statement to be executed stays the same, which is not the case with our loads since the data parameters vary. Without bind variables, the database would handle each insert operation by building a whole new execution plan, wasting a great deal of time and hurting performance.
By using bind variables, the actual values of our data points are not written into the SQL text; instead, the bind variables act as placeholders within the prepared statement. This means that the SQL statement does not change and the same execution plan can be reused, improving performance. Considering the volume of data that can potentially be loaded in extensions of this project, it is recommended to follow the aforementioned practices to ensure that database performance does not suffer in the build phase.
8.2.2 Database Indexes
Indexes are added to the frequently accessed columns and tables of our database. An index
is a data structure that helps improve the performance of queries. By default, SQLite uses
B-trees (balanced trees, not binary trees) to organize indexes. The general rule we followed
was adding indexes to the most frequently searched and accessed columns of each table.
While it is not necessary, we recommend that any tables added as extensions to the project
use indexes where appropriate. If the table has columns that are frequently accessed, then
an index may be added to improve performance.
Keep in mind that indexes have some overhead themselves. They occupy space on the disk
and in the DB memory itself. Moreover, with each update/insertion/deletion, the index will
also have to be updated. Having too many indexes, or indexes on unnecessary columns,
can introduce performance issues. For these reasons, we added indexes to columns of tables
which would be joined together or searched frequently. For example, if we needed to join the
flow_raw and speed_raw tables using the pm_abs (absolute Postmiles) columns, it would
be wise to create indexes on these columns and tables. In SQLite this is quite easy:
CREATE INDEX Postmiles_index_flow ON flow_raw(pm_abs);
CREATE INDEX Postmiles_index_speed ON speed_raw(pm_abs);
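One quick way to confirm that an index is actually being used is to inspect the query plan. The sketch below uses Python's sqlite3 module with a simplified flow_raw table that only illustrates the idea.

```python
import sqlite3

# Sketch: verify that SQLite uses the new index by inspecting the query plan.
# The simplified flow_raw table here is for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flow_raw (time TEXT, pm_abs REAL, flow REAL)")
conn.execute("CREATE INDEX Postmiles_index_flow ON flow_raw(pm_abs)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM flow_raw WHERE pm_abs = 504.223"
).fetchall()
# The plan's detail column should mention Postmiles_index_flow.
print(plan[0][-1])
```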
Bibliography
[1] Understanding RNN and LSTM. https://towardsdatascience.com/
understanding-rnn-and-lstm-f7cdf6dfc14e. Accessed: 2020-04-22.
[2] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig
Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat,
Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Joze-
fowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga,
Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit
Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasude-
van, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke,
Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on hetero-
geneous systems, 2015. URL https://www.tensorflow.org/. Accessed: 2020-02-18.
[3] Joaquín Abellán, Griselda López, and Juan de Oña. Analysis of traffic accident severity
using decision rules via decision trees. Expert Systems with Applications, 40(15):6047–
6054, 2013.
[4] Ruth Bergel-Hayat, Mohammed Debbarh, Constantinos Antoniou, and George Yannis.
Explaining the road accident risk: weather effects. Accident Analysis & Prevention, 60:
456–465, 2013.
[5] James P Byrne, N Clay Mann, Mengtao Dai, Stephanie A Mason, Paul Karanicolas,
Sandro Rizoli, and Avery B Nathens. Association between emergency medical service
response time and motor vehicle crash mortality in the United States. JAMA surgery,
154(4):286–293, 2019.
[6] Ciro Caliendo, Maurizio Guida, and Alessandra Parisi. A crash-prediction model for
multilane roads. Accident Analysis & Prevention, 39(4):657–670, 2007.
[7] Pranamesh Chakraborty, Chinmay Hegde, and Anuj Sharma. Data-driven parallelizable
traffic incident detection using spatio-temporally denoised robust thresholds. Trans-
portation research part C: emerging technologies, 105:81–99, 2019.
[8] Li-Yen Chang and Wen-Chieh Chen. Data mining of tree-based models to analyze
freeway accident frequency. Journal of safety research, 36(4):365–375, 2005.
[9] Chao Chen. Freeway performance measurement system (PeMS). UC Berkeley:
California Partners for Advanced Transportation Technology, 2003. URL https:
//escholarship.org/uc/item/6j93p90t.
[10] Quanjun Chen, Xuan Song, Harutoshi Yamada, and Ryosuke Shibasaki. Learning deep
representation from big and heterogeneous data for traffic accident inference. In Thir-
tieth AAAI Conference on Artificial Intelligence, 2016.
[11] François Chollet et al. Keras, 2015. URL https://github.com/fchollet/keras.
Accessed: 2020-02-18.
[12] Miao Chong, Ajith Abraham, and Marcin Paprzycki. Traffic accident analysis using
machine learning paradigms. Informatica, 29(1), 2005.
[13] Marie Crandall. Rapid emergency medical services response saves lives of persons injured
in motor vehicle crashes. JAMA surgery, 154(4):293–294, 2019.
[14] Francois Dion, Hesham Rakha, and Youn-Soo Kang. Comparison of delay estimates at
under-saturated and over-saturated pre-timed signalized intersections. Transportation
Research Part B: Methodological, 38(2):99–122, 2004.
[15] Yanjie Duan, Yisheng Lv, Yu-Liang Liu, and Fei-Yue Wang. An efficient realization
of deep learning for traffic data imputation. Transportation research part C: emerging
technologies, 72:168–181, 2016.
[16] Kartik Dwivedi, Kumar Biswaranjan, and Amit Sethi. Drowsy driver detection using
representation learning. In 2014 IEEE international advance computing conference
(IACC), pages 995–999. IEEE, 2014.
[17] Felix A Gers, Jürgen Schmidhuber, and Fred Cummins. Learning to forget: Continual
prediction with LSTM. 9th International Conference on Artificial Neural Networks:
ICANN ’99, 1999.
[18] Elias Gorine, Farnaz Khaghani, Junkai Zeng, and Jacob Smethurst. Deep learning
predicting accidents. http://hdl.handle.net/10919/98230, 2020. Accessed: 2020-
05-06, Virginia Tech, CS4624 team term project.
[19] Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. Speech recognition with
deep recurrent neural networks. In 2013 IEEE international conference on acoustics,
speech and signal processing, pages 6645–6649. IEEE, 2013.
[20] Antoine Hébert, Timothée Guédon, Tristan Glatard, and Brigitte Jaumard. High-
resolution road vehicle collision prediction for the City of Montreal. arXiv preprint
arXiv:1905.08770, 2019.
[21] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computa-
tion, 9(8):1735–1780, 1997.
[22] John Horel, Michael Splitt, L Dunn, J Pechmann, B White, C Ciliberti, S Lazarus,
J Slemmer, D Zaff, and J Burks. MesoWest: Cooperative mesonets in the western
United States. Bulletin of the American Meteorological Society, 83(2):211–226, 2002.
[23] Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, and Eric Xing. Harnessing
deep neural networks with logic rules. arXiv preprint arXiv:1603.06318, 2016.
[24] Zhiheng Huang, Wei Xu, and Kai Yu. Bidirectional LSTM-CRF models for sequence
tagging. arXiv preprint arXiv:1508.01991, 2015.
[25] Yong-Kul Ki. Accident detection system using image processing and MDR. Interna-
tional Journal of Computer Science and Network Security IJCSNS, 7(3):35–39, 2007.
[26] Whui Kim, Hyun-Kyun Choi, Byung-Tae Jang, and Jinsu Lim. Driver distraction
detection using single convolutional neural network. In 2017 international conference
on information and communication technology convergence (ICTC), pages 1203–1205.
IEEE, 2017.
[27] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization.
CoRR, abs/1412.6980, 2014. URL http://arxiv.org/abs/1412.6980.
[28] Xiangmin Li, William HK Lam, and Mei Lam Tam. New automatic incident detec-
tion algorithm based on traffic data collected for journey time estimation. Journal of
transportation engineering, 139(8):840–847, 2013.
[29] Xiugang Li, Dominique Lord, Yunlong Zhang, and Yuanchang Xie. Predicting motor
vehicle crashes using support vector machine models. Accident Analysis & Prevention,
40(4):1611–1618, 2008.
[30] Lei Lin, Qian Wang, and Adel W Sadek. A novel variable selection method based
on frequent pattern tree for real-time traffic accident risk prediction. Transportation
Research Part C: Emerging Technologies, 55:444–459, 2015.
[31] Bappaditya Mandal, Liyuan Li, Gang Sam Wang, and Jie Lin. Towards detection of
bus driver fatigue based on robust visual analysis of eye state. IEEE Transactions on
Intelligent Transportation Systems, 18(3):545–557, 2016.
[32] Adolf D May. Traffic flow fundamentals. Transportation Research Board, 1990.
[33] Hailang Meng, Xinhong Wang, and Xuesong Wang. Expressway crash prediction based
on traffic big data. In Proceedings of the 2018 International Conference on Signal
Processing and Machine Learning, pages 11–16, 2018.
[34] Saleh R Mousa, Peter R Bakhit, and Sherif Ishak. An extreme gradient boosting method
for identifying the factors contributing to crash/near-crash events: a naturalistic driving
study. Canadian Journal of Civil Engineering, 46(8):712–721, 2019.
[35] Alameen Najjar, Shun’ichi Kaneko, and Yoshikazu Miyanaga. Combining satellite im-
agery and open data to map road safety. In Thirty-First AAAI Conference on Artificial
Intelligence, 2017.
[36] Jutaek Oh, Simon P Washington, and Doohee Nam. Accident prediction model for
railway-highway interfaces. Accident Analysis & Prevention, 38(2):346–356, 2006.
[37] VA Olutayo and AA Eludire. Traffic accident analysis using decision trees and neural
networks. International Journal of Information Technology and Computer Science, 2:
22–28, 2014.
[38] Emily Parkany and Chi Xie. A complete review of incident detection algorithms &
their deployment: what works and what doesn’t. Technical report, 2005. URL http:
//www.uvm.edu/~transctr/pdf/netc/netcr37_00-7.pdf.
[39] Honglei Ren, You Song, JingXin Liu, Yucheng Hu, and Jinzhi Lei. A deep learn-
ing approach to the prediction of short-term traffic accident risk. arXiv preprint
arXiv:1710.09543, 2017.
[40] Jimmy SJ Ren, Wei Wang, Jiawei Wang, and Stephen Liao. An unsupervised feature
learning approach to improve automatic incident detection. In 2012 15th International
IEEE Conference on Intelligent Transportation Systems, pages 172–177. IEEE, 2012.
[41] Paul I Richards. Shock waves on the highway. Operations research, 4(1):42–51, 1956.
[42] Matthias Schlögl, Rainer Stütz, Gregor Laaha, and Michael Melcher. A comparison of
statistical learning methods for deriving determining factors of accident occurrence from
an imbalanced high resolution dataset. Accident Analysis & Prevention, 127:134–149,
2019.
[43] Ankit Parag Shah, Jean-Baptiste Lamare, Tuan Nguyen-Anh, and Alexander Hauptmann. CADP: A novel dataset for CCTV traffic camera based accident analysis. In 2018
15th IEEE International Conference on Advanced Video and Signal Based Surveillance
(AVSS), pages 1–9. IEEE, 2018.
[44] Athanasios Theofilatos. Incorporating real-time traffic and weather data to explore road
accident likelihood and severity in urban arterials. Journal of safety research, 61:9–21,
2017.
[45] Pravin Varaiya. Freeway Performance Measurement System, PeMS v3, Phase 1. UC
Berkeley: California Partners for Advanced Transportation Technology, 2001. URL
https://escholarship.org/uc/item/20p1j2w7.
[46] Xuesong Wang and Mohamed Abdel-Aty. Temporal and spatial analyses of rear-end
crashes at signalized intersections. Accident Analysis & Prevention, 38(6):1137–1150,
2006.
[47] Billy M Williams and Angshuman Guin. Traffic management center use of incident
detection algorithms: Findings of a nationwide survey. IEEE Transactions on intelligent
transportation systems, 8(2):351–358, 2007.
[48] Rongjie Yu and Mohamed Abdel-Aty. Utilizing support vector machine in real-time
crash risk evaluation. Accident Analysis & Prevention, 51:252–259, 2013.
[49] Rose Yu, Yaguang Li, Cyrus Shahabi, Ugur Demiryurek, and Yan Liu. Deep learning:
A generic approach for extreme condition traffic forecasting. In Proceedings of the 2017
SIAM international Conference on Data Mining, pages 777–785. SIAM, 2017.
[50] Zhuoning Yuan, Xun Zhou, Tianbao Yang, James Tamerius, and Ricardo Mantilla. Pre-
dicting traffic accidents through heterogeneous urban data: A case study. In Proceedings
of the 6th International Workshop on Urban Computing (UrbComp 2017), Halifax, NS,
Canada, volume 14, 2017.
[51] Zhuoning Yuan, Xun Zhou, and Tianbao Yang. Hetero-convlstm: A deep learning
approach to traffic accident prediction on heterogeneous spatio-temporal data. In Pro-
ceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery
& Data Mining, pages 984–992, 2018.
[52] Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. Recurrent neural network regu-
larization. arXiv preprint arXiv:1409.2329, 2014.
Appendices
Appendix A
Supplementary Results for Various
Road Postmiles
Figure A.1: Loss value for training for Postmile 508.463
Figure A.2: Loss value for training for Postmile 510.293
Figure A.3: Loss value for training for Postmile 511.543
Figure A.4: Loss value for training for Postmile 513.503
Figure A.5: Loss value for training for Postmile 515.173
Figure A.6: Actual and prediction of speed values for Postmile 508.463
Figure A.7: Actual and prediction of speed values for Postmile 510.293
Figure A.8: Actual and prediction of speed values for Postmile 511.543
Figure A.9: Actual and prediction of speed values for Postmile 513.503
Figure A.10: Actual and prediction of speed values for Postmile 515.173
Figure A.11: Histogram of loss value for training for Postmile 508.463
Figure A.12: Histogram of loss value for training for Postmile 510.293
Figure A.13: Histogram of loss value for training for Postmile 511.543
Figure A.14: Histogram of loss value for training for Postmile 513.503
Figure A.15: Histogram of loss value for training for Postmile 515.173