Travel Demand Model Evaluation: A Graph-Theoretic Approach

Meead Saberi (corresponding author)
Institute of Transport Studies
Department of Civil Engineering
Monash University
Melbourne, Australia
Tel: (+61) 3 990 50236
Email: [email protected]

Taha H. Rashidi
School of Civil & Environmental Engineering
University of New South Wales
Sydney, Australia
Tel: (+61) 2 9385 5063
Email: [email protected]

Milad Ghasri
School of Civil & Environmental Engineering
University of New South Wales
Sydney, Australia
Tel: (+61) 2 9385 5063
Email: [email protected]

Kenneth Ewe
Department of Civil Engineering
Monash University
Melbourne, Australia
Tel: (+61) 3 990 50236
Email: [email protected]

Word count: 6000 words + 6 tables and figures = 7,500 words

Submitted for presentation only to the 96th Annual Meeting of the Transportation Research Board, National Research Council, Washington DC, January 2017

August 2016


Saberi et al. 2

ABSTRACT

Modeling demand for the transport system requires the development of complicated mathematical structures that reflect the response of users to the capacity and service provided in the transport network, including the infrastructure and modes of transport. Such a complicated competitive environment, in which travelers try to achieve better service at lower cost, can be evaluated from numerous angles when modeled. Classical travel demand models are typically evaluated based on the fit to the observed origin-destination table (mostly unavailable) or on the likelihood value of the models used in the process of generating the origin-destination table. These conventional evaluation techniques do not assess the goodness-of-fit of the model in a robust way. This paper presents a graph-theoretic methodology to evaluate travel demand models. An innovative travel demand model, developed using several advanced data mining techniques, is considered for the evaluation exercise. Statistical and mathematical properties of the modeled networks are compared against the observed networks over time. Networks are constructed based on the trip origin-destination matrices. The proposed evaluation approach focuses on network structure attributes that reflect network properties from many angles, as a means of evaluating aspects of the goodness-of-fit of the models that are not usually captured by traditional evaluation and validation methods. Results demonstrate the complexity involved in the development, evaluation, and validation of travel demand models, stressing the need for methods such as the approach proposed in this study.


INTRODUCTION

A system consisting of several non-identical elements connected by diverse interactions can be viewed as a complex network in which the nodes are the elements of the system, the links represent the interactions between the elements, and flows move along the links. Network science is an interdisciplinary research area inspired by numerous studies of computer, social, and biological networks (1-4). Numerous past studies explored human mobility patterns from the urban to the global scale from a network perspective (5-13). Several more recent studies used non-traditional data sources such as social media, smart cards, and mobile phones to study urban transportation characteristics using network-driven measures (14-18). Despite the growing number of studies using non-traditional data sources and network theory concepts to characterize urban transportation patterns, there is still a lack of connection to classical travel demand modeling approaches. The full potential of network science for analyzing interactions among places, among people, and between people and places in the context of travel demand modeling still requires further exploration. In this paper, we propose a network-theoretic methodology to evaluate and validate travel demand models and demonstrate its applicability in evaluating different disaggregate travel demand models of Melbourne, Australia.

Travel demand modeling approaches currently used in practice include trip-based, tour-based, and activity-based models. Trip-based models encompass the classical four-step models, which consist of trip generation, trip distribution, mode choice, and route assignment. Trip-based models are typically developed at the aggregate level of zones. They lack a time dimension and overestimate mode shift. Generally, the main criticism of aggregate travel demand modeling is that it is policy insensitive and not behavioral (19). To address these shortcomings of aggregate models, tour-based and activity-based models were introduced in the last few decades (20-22). Disaggregate agent-based travel demand estimation paradigms, mainly represented by tour-based and activity-based models, propose a structure that captures the spatial and temporal distribution of activities generated by transportation network users based on the expected utility of the activities taking place at the end of trips. Activity-based models are based on the premise that travel demand derives from people's needs and desires to participate in activities. The behavioral basis of such agent-based models and the fact that they can incorporate a wide range of explanatory variables in their structures have resulted in their deployment by transport agencies, consultancies, and software developers around the world. Although there is no standard structure for tour-based/activity-based models (23), most frameworks include a representation of the number, purposes (activities), timing, location, party composition, and travel mode of travelers' tours and stops. Surveys on daily travel behavior, which include information on travel durations and the reasons for travel, allow for an extensive look at common trends and travel patterns that arise from the activities each individual prioritizes on a day-to-day basis.

Given the extensive data required for the development of activity/tour-based models and the computational cost of developing and calibrating such highly complicated models, four-step models remain the most widely deployed travel demand models around the world. Borrowing concepts from both aggregate and disaggregate modeling structures, transferability models have recently been introduced. They are less computationally cumbersome than activity-based models and offer a higher level of precision and a sounder representation of individual behavior than aggregate models (24). Recent advances in the area of transferability models have yet to be examined in terms of model precision. Doing so is one of the major contributions of this paper, which couples one of the most recent transferability models with an emerging network-theoretic approach. A recent study by Saberi et al. (25) demonstrated that urban travel demand patterns can be characterized by a set of statistical network measures that could potentially be used as a new methodology to calibrate and validate travel demand models. In this paper, we propose a complex network-driven methodology for travel demand model evaluation and validation. The proposed methodology is not a replacement for the traditional model evaluation approaches; rather, it is a supplementary alternative that is capable of reflecting new aspects of the goodness-of-fit of models from a network perspective. The new methodology views travel demand as a complex network consisting of vertices and edges representing origins-destinations and trips, respectively. It applies network-theoretic statistical properties to compare model outcomes and real-world observations.

The remainder of the paper is organized as follows. First, the travel demand modeling approach and data used in this study are explained. Next, existing approaches for model evaluation and validation are discussed. The main idea behind complex network theory and its application to evaluate and validate travel demand models is then explained. The probability distributions of various network parameters from the observed and selected demand models are discussed. Finally, selected goodness-of-fit measures for the models are explained, applied, and discussed, followed by concluding remarks and directions for future research.

DISAGGREGATE TRAVEL DEMAND MODELING

In this section, we present a travel demand modeling structure proposed by Ghasri et al. (24) in which trip purpose, travel mode, time of day, commute distance, and attributes of the trip destination are jointly modeled using (i) Decision Tree (DT), (ii) Modified Decision Tree (MDT), and (iii) Random Forest (RF). The model estimation is performed using the Victorian Integrated Survey of Travel and Activity (VISTA) of 2007 as the training set, and its transferability is examined using data from 2009.

Modeling Structure

In the first step, the total number of daily trips for each person in the survey is modeled using a trip generation model (TGM). Building on a tour-based structure paradigm, the framework then models attributes of the first trip of the day using a module called the First Trip Attributes Model (FTAM). This module is immediately followed by a similar module, the Next Trips Attributes Model (NTAM), which explains attributes of the second and subsequent trips in the daily tour, if there are any. The NTAM module is built on the assumption that each trip on a tour is affected by its immediate predecessor in the tour. The attributes of the previous trip considered in NTAM include trip purpose, trip time, trip mode, commute distance, and attributes of the trip destination. The three aforementioned data mining techniques are used to examine the relationship between the independent variables and the trip attributes (dependent variables) in the TGM, FTAM, and NTAM structures within a simultaneous modeling framework. For a comprehensive discussion of the model structure, refer to Ghasri et al. (24).

Decision Tree Model

The decision tree (DT) model is built by categorizing the observed data based on explanatory variables to achieve categories that are homogeneous with respect to the dependent variable (26). The model can be used to predict the outcomes of the target (dependent) variables based on the input (independent) variables. Developing decision tree models involves recursive partitioning, where at each node of the tree the best classifier among all of the available variables is selected to split the data into two relatively purer subsets. This process repeats until either no further classification is possible or further classification no longer improves purity. There are a variety of algorithms for developing decision trees, including the ID3, C4.5, and CART algorithms (27). The classification and regression trees (CART) algorithm produces a classification tree if the dependent variable is categorical or a regression tree if the dependent variable is numerical. Classification trees predict outcomes based on the class of the data, while regression trees predict outcomes based on the data values. To determine which attribute is the 'best' classifier, an indicator called the Gini impurity index is calculated. The Gini index is defined based on a Gini function, which is calculated using the proportion of observations in a class as in equation 1, where f(p_i) is the Gini function for class i and p_i is the proportion of class i. The Gini impurity index for a node is calculated using the Gini functions for the different classes of the dependent variable in that node, as expressed in equation 2, where f(p_{i,A}) is the Gini function for class i in node A and p_{i,A} is the proportion of class i at node A.

$$ f(p_i) = p_i (1 - p_i) \tag{1} $$

$$ I(A) = \sum_i f(p_{i,A}) = \sum_i p_{i,A} (1 - p_{i,A}) \tag{2} $$

The Gini impurity index is used to calculate the impurity reduction of each node, which is the metric used to decide on the best classifier at each node, as in equation 3, where A_L and A_R are the immediate left and right child nodes of node A, p(A) is the proportion of observations in node A, and I(A) is the Gini impurity index of node A. The CART algorithm splits each node by selecting the independent variable that produces the highest Gini impurity reduction.

$$ \Delta I = p(A) I(A) - p(A_L) I(A_L) - p(A_R) I(A_R) \tag{3} $$

Modified Decision Tree Model

The modified decision tree (MDT) model differs from DT by assigning a distribution at the leaf nodes. In the DT model, the most frequently observed value of the dependent variable is returned as the value of each leaf. This value is used in the simulation process as the model's prediction for that leaf. The deviation from the observed value, or the error rate (misclassification rate), is minimized when the model is applied to the training data. Although this increases accuracy at the individual level, it overlooks diversity at the system level. In the MDT framework, which retains the diversity of outcomes at the leaf nodes, the model's performance at the disaggregate level improves relative to DT according to goodness-of-fit indicators defined on the dependent variables (24).
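The split criterion of equations 1 to 3 and the difference between DT and MDT leaves can be illustrated with a minimal Python sketch. It is not the implementation used by Ghasri et al. (24): the toy labels are hypothetical, p(A) is assumed to be the share of all observations that fall in node A, and the MDT leaf is assumed to draw a class at random from the leaf's observed distribution.

```python
import random
from collections import Counter


def gini_impurity(labels):
    """Equation 2: sum over classes of p_i * (1 - p_i) within one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())


def impurity_reduction(parent, left, right, n_total):
    """Equation 3, assuming p(.) is the share of all n_total observations in a node."""
    return (len(parent) / n_total * gini_impurity(parent)
            - len(left) / n_total * gini_impurity(left)
            - len(right) / n_total * gini_impurity(right))


def dt_leaf_prediction(leaf_labels):
    """DT leaf: return the most frequently observed class."""
    return Counter(leaf_labels).most_common(1)[0][0]


def mdt_leaf_prediction(leaf_labels, rng):
    """MDT-style leaf (assumed mechanism): sample from the leaf's observed classes."""
    return rng.choice(leaf_labels)


# Hypothetical root node of travel-mode labels and one candidate split.
parent = ["car"] * 6 + ["transit"] * 4
left = ["car"] * 5 + ["transit"] * 1
right = ["car"] * 1 + ["transit"] * 3

root_gini = gini_impurity(parent)
gain = impurity_reduction(parent, left, right, n_total=len(parent))
print(f"I(A) = {root_gini:.3f}, impurity reduction = {gain:.3f}")
print("DT leaf prediction:", dt_leaf_prediction(left))
print("MDT leaf draw:", mdt_leaf_prediction(left, random.Random(0)))
```

Under these assumptions the DT leaf always returns "car" for the left node, whereas repeated MDT draws would return "transit" about one time in six, which is the system-level diversity the DT leaf discards.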

Random Forest Model

The random forest (RF) model was first developed by Breiman and Cutler (28). The model builds on Breiman's idea of bootstrap aggregation, or 'bagging', which increases the accuracy of the model while retaining the bias. This process involves sampling a data set with replacement to create several new data sets and running a baseline predictor to obtain a trained model for each new set. On an unseen sample of data, the predictions are usually averaged if the dependent variable is numerical or voted on if the dependent variable is categorical. Bootstrap aggregation is combined with the random subspace method to develop a set of decision trees, where subsets are chosen randomly and the outputs are combined based on the majority vote. The error of the random forest model is measured by its out-of-bag (OOB) error (29). The OOB prediction for each training record is obtained using only the decision trees whose bootstrap samples do not contain that record. The OOB error provides an unbiased error estimate, as shown by Breiman (29). The RF model calculates variable importance by permuting the variable within the training data set and measuring the deterioration in OOB error (30). The variable importance can be normalized by the standard deviation of the change in OOB error, where a high normalized score means that the variable is of high importance.

Travel Data

VISTA includes the daily travel diaries of individual members of 17,100 households in 2007-08. The data contain all personal travel outside the home location to different destinations. The data also include basic household and demographic information. On average, residents of the surveyed areas make 3-4 trips per weekday, with a total of 11.6 million trips per day for the entire population of the Melbourne metropolitan area.
The median trip distance traveled on a weekday is 4.5 km. The dominant mode of travel is personal car (driver or passenger) with a 75.4% share. Public transportation accounts for only 9.1% of the trips. VISTA data is publicly available via http://economicdevelopment.vic.gov.au/transport/research-and-data/vista.

MODEL CALIBRATION AND VALIDATION: TRADITIONAL APPROACHES

The evaluation of traditional travel demand models involves calibration and validation of each of the steps/sub-models and cross-checking traffic counts at cordon lines. This section outlines this process, while later in the paper we propose a network-based approach for travel demand model evaluation.

Validation is an essential stage in travel demand modeling. Travel demand models are first calibrated by estimating parameters using the base year and considering one or more goodness-of-fit measures. Once the models are reasonably calibrated, they are validated using either a test (out-of-sample) data set or auxiliary data such as link flows or cordon counts. Out-of-sample validation examines the accuracy of the model by looking at the precision of the predicted values of the dependent variable. Auxiliary data, in contrast, are not directly modeled as part of the dependent variables; they are simulated and then cross-checked against the observed values. For instance, in the trip generation and distribution steps of a classical four-step model, the base year zonal socioeconomic attributes and a number of network attributes are used to develop trip production/attraction and distribution models. Once the models are calibrated, the simulated trip length frequency distributions and average trip lengths across different trip purposes are compared against the corresponding observed auxiliary data to evaluate the precision of the models. The mode choice step calibration requires data from household travel surveys and is usually validated in an out-of-sample validation process.
Finally, calibration and validation of the traffic assignment step involves comparing observed traffic counts with the volumes estimated by the model. Different statistical measures are often used in the calibration and validation process of travel demand models, as discussed in the following (31, 32).

Percent Differences in Volumes

The percentage difference between observed link flows and model-predicted traffic volumes is a commonly used indicator to evaluate and validate a calibrated traffic assignment model (33). An absolute difference in volume can also be used; however, percentage differences are usually preferred in practice. The differences in volumes should be within a specified accuracy target, which is typically around 5%.

Correlation Coefficient

Pearson's correlation coefficient, commonly denoted by the symbol 'r', is a measure of how similar the predicted values are to the observed data (32). The value of this coefficient ranges from -1 to 1, where results closer to 1 are more acceptable. Alternatively, the agreement can be assessed by regressing the observed flows against the simulated ones, where a slope of one is desirable.

Root Mean Square Error (RMSE)

The root mean square error (RMSE) is calculated as shown in equation 4, where the sum of the squared differences between the predicted and observed values is divided by the number of data points minus one, n - 1, before the square root is taken (34). An aggregate percent RMSE below 30% is generally considered acceptable.


$$ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(\mathrm{Predicted}_i - \mathrm{Observed}_i\right)^2}{n - 1}} \tag{4} $$
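As a worked illustration of the three measures above, the following Python sketch computes the percentage difference, Pearson's r, and the RMSE of equation 4 for a handful of links. The link volumes are hypothetical, and the percent RMSE is assumed here to be the RMSE divided by the mean observed volume, which is one common convention rather than a definition taken from this paper.

```python
import math


def percent_difference(predicted, observed):
    """Signed percentage difference of a predicted volume from the observed one."""
    return 100.0 * (predicted - observed) / observed


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between predicted and observed values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    return cov / (sd_p * sd_o)


def rmse(predicted, observed):
    """Equation 4: squared differences summed, divided by n - 1, square-rooted."""
    n = len(predicted)
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / (n - 1))


def percent_rmse(predicted, observed):
    """Assumed convention: RMSE as a percentage of the mean observed value."""
    return 100.0 * rmse(predicted, observed) / (sum(observed) / len(observed))


# Hypothetical traffic counts and model volumes on four links.
observed = [1000, 2000, 1500, 500]
predicted = [1100, 1900, 1500, 600]

first_link_diff = percent_difference(predicted[0], observed[0])
r = pearson_r(predicted, observed)
error = rmse(predicted, observed)
error_pct = percent_rmse(predicted, observed)
print(f"diff = {first_link_diff:.1f}%, r = {r:.4f}, RMSE = {error:.1f}, %RMSE = {error_pct:.1f}")
```

In this toy case the first link is over-predicted by 10%, which would miss a 5% target, even though the correlation is close to 1 and the percent RMSE is well under 30%. The measures can therefore disagree about whether a model is acceptable.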

Peak Hour Validation Targets

The peak hour validation check is concerned only with flows during the morning and evening peaks, with the idea that a model should be accurate when the network is at capacity. The predicted and observed peak hour volumes are cross-compared to examine whether they meet a certain degree of accuracy (32).

Vehicle Kilometers Traveled (VKT) or Vehicle Miles Traveled (VMT)

Travel demand model evaluation based on VKT or VMT reflects the accuracy of the overall demand model in an indirect fashion.

The existing evaluation and validation methods overlook the importance of interaction between places and the connectivity of the origin-destination travel demand networks. The underlying processes of urban travel demand are driven by the interaction strength between places (35). As a result, the reproducibility of empirically observed interactions by travel demand models is of critical importance to ensure that the modeled travel demand patterns produce accurate outcomes.

NETWORK-THEORETIC METHODOLOGY

A network is an arrangement of points, known as nodes or vertices, connected by lines, known as edges. The analysis of networks can produce a significant amount of information on the relationships and interactions among specific entities of the network at the individual and/or system level, allowing knowledge-based decisions to be made to optimize network performance.

Network theory has applications in many areas and, consequently, there are many different types of networks, including technological, social, informational, and biological networks (36-38). An urban transportation network can also be viewed as a complex, densely connected network of individuals' activity spaces. Because of the computational and practical complications involved in considering every trip generated by individuals (some activity-based and tour-based models are developed at the highly disaggregate level of travelers), the majority of travel demand models work at the aggregate zone-based level, where pairs of nodes i and j represent origin and destination zones. ODs are connected by links with positive weights w_ij > 0 if one or more trips are made between the nodes. If no trip is made between a pair of nodes, w_ij = 0. In such networks, one could possibly find an indirect path between any pair of nodes; w_ij quantifies the number of trips between pairs of nodes per unit of time. Figure 1 provides a visual comparison of the topology and structure of the observed and modeled travel demand networks. As explained earlier, the 2007 household travel survey (HTS) is used to develop three models using the DT, RF, and MDT methods. These models are then used to simulate travel attributes for 2009, for which the 2009 HTS is available for validation purposes.
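The construction of the weighted OD network described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the zone names and trip records are hypothetical, links are treated as directed origin-to-destination pairs, and the connectivity 2L/N^2, node degree, and node flux follow the verbal definitions given in this paper. Other conventions (for example, undirected links or the exclusion of intra-zonal trips) would give different values.

```python
def build_od_network(trips):
    """Aggregate (origin, destination) trip records into edge weights w_ij."""
    weights = {}
    for origin, destination in trips:
        key = (origin, destination)
        weights[key] = weights.get(key, 0) + 1
    return weights


def network_summary(weights):
    """Basic network statistics of a weighted, directed OD network."""
    nodes = {zone for pair in weights for zone in pair}
    n_nodes = len(nodes)
    n_links = len(weights)            # only pairs with w_ij > 0 are stored
    total_trips = sum(weights.values())
    degree = {zone: 0 for zone in nodes}
    flux = {zone: 0 for zone in nodes}
    for (i, j), w in weights.items():
        for zone in {i, j}:           # a set, so an intra-zonal link counts once
            degree[zone] += 1         # links connected to the node
            flux[zone] += w           # trips starting from or ending at the node
    return {
        "N": n_nodes,
        "L": n_links,
        "T": total_trips,
        "delta": 2 * n_links / n_nodes ** 2,
        "mean_weight": total_trips / n_links,
        "degree": degree,
        "flux": flux,
    }


# Hypothetical trip records between three zones.
trips = [("A", "B")] * 10 + [("B", "A")] * 5 + [("A", "C")] * 3
summary = network_summary(build_od_network(trips))
print(summary)
```

For the toy records, N = 3, L = 3, and T = 18, so the connectivity is 2(3)/3^2 = 0.667 and the mean link weight is 18/3 = 6 trips.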


FIGURE 1. Complex network structure of travel demand in Melbourne from real-world observations (a and e), RF model (b and f), DT model (c and g), and MDT model (d and h) in 2007 and 2009.

Nomenclature

N = number of nodes in the network
L = number of edges in the network
T = total number of trips
δ = network connectivity (2L/N^2)
a_ij = elements of the adjacency matrix
a^w_ij = elements of the weighted adjacency matrix
F_i = flux of a given node i
k_i = degree of a given node i
w_ij = weight of a given edge between node i and node j
c_i = clustering coefficient of a given node i
b_i = betweenness centrality of a given node i
b_ij = betweenness centrality of a given edge between node i and node j
<F> = mean node flux in the network
<k> = mean node degree in the network
<w> = mean edge weight in the network
<c> = mean clustering coefficient in the network
<wc> = mean weighted clustering coefficient in the network
CV(F) = coefficient of variation of node flux in the network
CV(k) = coefficient of variation of node degree in the network
CV(w) = coefficient of variation of edge weight in the network
C = network clustering coefficient
dT = average shortest path
wdT = average weighted shortest path
φ = network diameter
wφ = weighted network diameter
ξ = network dissimilarity

Table 1 summarizes and compares the statistical properties (36) of the observed and modeled travel demand networks developed in this study for the city of Melbourne using VISTA data. Note that the original networks are first reduced to their largest connected component (LCC), a subgraph in which any two nodes are connected by at least one path. The resulting connected observed networks in 2007 and 2009 comprise a larger number of nodes N than the modeled networks. The number of links L also varies widely between the observed and modeled networks. This might be explained by the fact that models tend to find patterns in databases while neglecting outliers or standalone instances. This can result in more homogeneous patterns in which the outcome represents the major stream of travelers. The DT model has the smallest network connectivity, δ = 0.596 in 2007 and δ = 0.676 in 2009, compared to the connectivity of the observed and other modeled networks, mainly because it is strongly affected by the dominant outcomes at the leaves. The RF model has the highest network connectivity in 2007, δ = 0.739, while the observed network of travel demand has the highest connectivity in 2009, δ = 0.818.

TABLE 1. Network-theoretic characteristics of the travel demand patterns from RF, DT, and MDT models versus real-world observation in 2007 and 2009.

           Observation            RF                DT               MDT
             2007     2009     2007     2009     2007     2009     2007     2009
N              63       64       47       48       47       48       47       48
L           1,399    1,676      816      804      658      779      736      856
δ           0.705    0.818    0.739    0.698    0.596    0.676    0.666    0.743
T          46,335   63,942   47,109   37,823   24,120   32,270   40,314   52,458
<F>         735.5    999.1    993.6    778.1    499.9    662.1    849.0    1,086
<k>          22.2     26.2     17.4     16.7       14     16.2     15.6     17.8
<w>          33.1     38.1     57.2     46.4     35.7     40.8     54.2     60.9
CV(F)       1.336    1.061    1.031    0.748    0.957    0.740    0.968    0.729
CV(k)       0.548    0.512    0.376    0.403    0.396    0.418    0.383    0.398
CV(w)       6.767    5.862    5.113    3.710    4.055    3.476    4.613    3.916
<c>         0.623    0.669    0.535    0.519    0.469    0.511    0.502    0.541
<wc>        3.824    5.140    7.136    7.735    6.258    8.449    6.886   10.145
C           1.283    1.454    1.283    0.854    1.283    0.854    0.749    0.854
dT          1.750    1.660    1.670    1.732    1.840    1.765    1.751    1.675
wdT         2.937    2.658    5.924    7.554    8.814    8.213    6.723    7.413
φ               4        4        3        4        4        4        4        3
wφ             24       10       19       88       88       48       20       67

The node degree k is the number of links connected to a node in a network. The average node degree <k> in the observed networks is consistently larger than in the modeled networks. This indicates greater interaction between places (or destinations) across the observed network than across the modeled networks, which is in line with the previous findings about the number of nodes and links. Also, the variability of k measured by the coefficient of variation CV(k) is greater in the observed network of travel demand, suggesting a larger heterogeneity in the interaction between nodes that has not been reasonably captured by any of the models.

The node flux F is the number of trips starting from or ending at a node. Node flux represents the attractiveness of a place or destination and its interaction strength in connection to other nodes in the network. The DT model has the lowest average node flux <F> compared to the observed and other modeled networks. This suggests that interactions between places are significantly lower in the DT model. However, the larger coefficient of variation of node flux CV(F) in the observed network suggests a more heterogeneous distribution of interaction strengths across the network of trips in the real world compared to the modeled networks. The average number of trips on each link is represented by the average link weight <w>, where w_ij represents the weight, or the total number of trips, between each pair of nodes. Unlike node flux and node degree, the average link weight in the observed network is consistently lower in 2007 and 2009 than in all the modeled networks, possibly because trips are more widely distributed among nodes and links in the observed networks. In the same vein, CV(w), which represents the variation of link weights across the network, is significantly larger in the observed network. This suggests a more heterogeneous distribution of trips between nodes in the observed network, which has not been reproduced by the developed models.

The clustering coefficient c is a measure of the degree to which the nodes in a network tend to cluster together locally, and is measured by the fraction of paths of length two in the network that are closed. The mean clustering coefficient <c> is consistently larger in the observed network in 2007 and 2009 than in the modeled networks. This suggests that the observed network of trips is more locally connected, meaning that trips are more widely distributed in the observed networks. The clustering coefficient also reflects the formation of groups or communities in a network. Interestingly, the average weighted clustering coefficient <wc>, in which the number of trips between neighboring nodes is also taken into account, is considerably smaller in the observed network (Table 1), in contrast to the average clustering coefficient <c>. Alternatively, the network clustering coefficient C is a global measure of the extent to which nodes in a network are clustered, and is measured by the ratio of the number of triangles to the number of connected triples in a network. The MDT model is found to have the lowest C in 2007, while C remains constant across the other networks.
In 2009, the observed network has the largest C while C remains 21  constant across the modeled networks. 22   23  The term dT is defined as the average shortest path length between each pair of nodes in the network using 24  the adjacency matrix. Similarly, wdT is the average weighted shortest path length using the weight matrix. 25  Average shortest path is often used to measure network efficiency. While dT slightly varies across the 26  observed and modeled networks, wdT is significantly smaller in the observed network of travel demand 27  compared to the modeled network. The network diameter φ is defined as the longest shortest path 28  between each pair of nodes in the network. The weighted network diameter wφ is similarly the longest 29  weighted shortest path using the weight matrix. The diameter represents the linear size of a network. φ is 30  almost the same across all the networks while wφ varies significantly. 31   32  To further compare the network structural differences between the observations and model outcomes, 33  next, the complementary cumulative distribution functions (CCDF) of node degree k, node flux F, link 34  weight w, and node and link betweenness centrality normalized by the mean value of each variable are 35  plotted. CCDF is defined as Pr(X>x) = 1 – F(x) where F(x) is the CDF of the variable of interest. Figure 36  2 shows that k and F have very similar distributions in all networks while a slight deviation from the 37  observed distribution is observed in the modeled distributions of w. Results suggest that the modeled 38  networks are consistently reproducing similar statistical properties to the observed network. 39   40  Figure 3 shows the distribution of node betweenness centrality Pr(b) and weighted betweenness centrality 41  Pr(wb) of all the networks in 2007 and 2009. Betweenness centrality represents the centrality of a node or 42  link in the network. 
It is computed as the fraction of shortest paths in the entire network that pass through 43  a particular node or link. The RF model is slightly deviated from the observed, DT, and MDT 44  distributions for lower values of node wb suggesting that the RF model is not performing as good as DT 45  and MDT models. 46   47   48  
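The mean-normalized complementary cumulative distribution used for Figures 2 and 3 can be illustrated with a minimal Python sketch. It assumes an empirical CCDF computed directly from a sample, without binning or power-law fitting; the function name `normalized_ccdf` and the toy degree values are hypothetical and are not taken from the paper.

```python
from collections import Counter

def normalized_ccdf(values):
    """Return sorted (x / mean, Pr(X > x)) pairs for a sample of values."""
    n = len(values)
    mean = sum(values) / n
    counts = Counter(values)
    points = []
    remaining = n  # samples not yet passed while scanning in increasing order
    for x in sorted(counts):
        remaining -= counts[x]  # now the number of samples strictly greater than x
        points.append((x / mean, remaining / n))
    return points

# Hypothetical node degrees for a six-node toy network (not data from the paper).
degrees = [1, 2, 2, 3, 4, 6]
ccdf = normalized_ccdf(degrees)
for x_norm, prob in ccdf:
    print(f"k/k0 = {x_norm:.3f}, Pr(K > k) = {prob:.3f}")
```

Plotting these pairs on logarithmic axes for the observed and modeled networks would give curves comparable in form to those described above; the same function could be applied to node flux, link weight, or betweenness values.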


FIGURE 2. Cumulative distribution functions of (a) node degree k, (b) node flux F, and (c) link weight w, normalized by the mean in each distribution k0, F0, and w0. Dashed lines represent power law fits.

FIGURE 3. Cumulative distribution functions of (a) and (c) node betweenness centrality b, (b) and (d) node weighted betweenness centrality wb, normalized by the mean in each distribution b0 and wb0.

[Figure 2 panels (a)-(f): log-log plots of Pr(k) versus k/k0, Pr(F) versus F/F0, and Pr(w) versus w/w0 for 2007 and 2009; series: DT, MDT, Observation, RF.]

[Figure 3 panels (a)-(d): log-log plots of Pr(b) versus node b/b0 and Pr(wb) versus node wb/wb0; series: MDT, DT, Observation, RF.]


NETWORK DISSIMILARITY

In this section, we apply two network dissimilarity measures to quantify the observed differences in the distributions of network statistics, and use them as goodness-of-fit measures for model evaluation and validation.

Kullback-Leibler Divergence

We first use the Kullback-Leibler divergence (39, 40) to measure the difference between the probability distributions of node degree k, node flux F, link weight w, node betweenness centrality b, node weighted betweenness centrality wb, link betweenness centrality b, and link weighted betweenness centrality wb, as presented in the previous section. The Kullback-Leibler divergence is defined as:

$$D_{KL}(P \parallel Q) = \sum_{i} P(i) \ln \frac{P(i)}{Q(i)}$$

where P and Q are the probability distributions of the variables of interest (node degree, node flux, etc.). The Kullback-Leibler divergence provides a non-negative measure where

$$D_{KL}(P \parallel Q) = 0 \quad \text{if and only if} \quad P = Q$$

Figure 4 shows the computed Kullback-Leibler divergence for the RF, DT, and MDT networks against the observed networks in 2007 and 2009. For the 2007 data, the RF model performs better than the DT and MDT models when looking at the distributions of k, F, and w. However, the Kullback-Leibler divergence of the RF model is consistently larger for the node and link weighted betweenness centrality measures. Results from 2009, however, suggest that the RF and MDT models perform better than the DT model across the majority of the selected network statistics. This is in line with the findings of Ghasri et al. (24) that the RF and MDT approaches are more transferable, while MDT provides a better fit to the data at the aggregate level.
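The Kullback-Leibler divergence defined above can be sketched in a few lines of Python for two discrete distributions given over the same bins. This is a minimal illustration under stated assumptions: the paper does not say how the distributions were binned or which network plays the role of P, so the bin values, the variable names, and the treatment of empty bins below are all hypothetical.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i P(i) * ln(P(i) / Q(i)) for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("P and Q must be defined over the same bins")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # assumed convention: a bin with P(i) = 0 contributes nothing
        if q_i == 0.0:
            return math.inf  # P has mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total

# Hypothetical binned distributions (each sums to 1); not data from the paper.
observed = [0.5, 0.3, 0.2]  # plays the role of P
modeled = [0.4, 0.4, 0.2]   # plays the role of Q
d_obs_mod = kl_divergence(observed, modeled)
d_self = kl_divergence(observed, observed)
print(f"D_KL(observed || modeled) = {d_obs_mod:.4f}")
print(f"D_KL(observed || observed) = {d_self:.4f}")
```

Note that the divergence is not symmetric, so swapping the observed and modeled distributions generally gives a different value.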

FIGURE 4. Kullback-Leibler divergence of node degree k, node flux F, link weight w, node betweenness centrality b, node weighted betweenness centrality wb, link betweenness centrality b, and link weighted betweenness centrality wb for RF, DT, and MDT models: (a) 2007 and (b) 2009.


Structural Dissimilarity

We define an edge-based network dissimilarity measure ξ for a one-to-one comparison of the weight matrices as follows:

$$\xi = \frac{w_{ij}^{obs} - w_{ij}^{model}}{w_{ij}^{obs}}$$

where $w_{ij}^{obs}$ is the weight of a given edge between node i and node j in the observed network and $w_{ij}^{model}$ is the weight of the same edge in the modeled network.

Figure 5 illustrates the probability distribution of the network dissimilarity measure for the RF, DT, and MDT models in 2007 and 2009. All modeled networks exhibit a similar pattern, in which the majority of the edge weights in the modeled networks fall within the 10%-20% dissimilarity range. Results demonstrate the relative consistency of the three demand models in reproducing networks similar to the observed travel demand network.
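The edge-based measure ξ can be sketched as follows in Python. The sketch assumes that the weight matrices are stored as dictionaries keyed by (origin, destination), that an edge missing from the modeled network has zero weight, and that edges with zero observed weight are skipped because ξ is undefined there. These choices, the function name, and the toy trip counts are illustrative assumptions rather than details given in the paper.

```python
def edge_dissimilarity(w_obs, w_model):
    """Return xi = (w_obs - w_model) / w_obs for every edge with observed trips."""
    xi = {}
    for edge, obs in w_obs.items():
        if obs == 0:
            continue  # assumption: skip edges where xi would divide by zero
        # Assumption: an edge absent from the modeled network has weight 0.
        xi[edge] = (obs - w_model.get(edge, 0.0)) / obs
    return xi

# Hypothetical trip counts keyed by (origin, destination); not data from the paper.
observed = {("A", "B"): 100.0, ("A", "C"): 50.0, ("B", "C"): 20.0}
modeled = {("A", "B"): 90.0, ("A", "C"): 60.0}
xi = edge_dissimilarity(observed, modeled)
for edge, value in sorted(xi.items()):
    print(edge, f"{value:+.2f}")
```

A histogram of the resulting values, or of their magnitudes, would give a distribution of the kind summarized in Figure 5.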

FIGURE 5. Probability distribution of network dissimilarity measure based on wij for RF, DT, and MDT models: (a) 2007 and (b) 2009.

CONCLUSION

Existing travel demand model evaluation and validation methods are often based on the distribution of distance traveled, journey duration, and cross-checking against a limited number of observed link/cordon traffic counts. These methods overlook the importance of the complex interaction between places (or destinations) in cities. Viewing urban travel demand as a complex network, we proposed a new method for evaluating and validating travel demand models. The proposed method is not a replacement for traditional methods; rather, it is a supplementary tool to ensure that the modeled travel demand network has a structure similar to that of the observed network from different perspectives, each of which can reflect some attributes of the network. The proposed method is applied to three disaggregate transferability models developed by Ghasri et al. (24), namely Decision Tree (DT), Modified Decision Tree (MDT), and Random Forest (RF). We used a series of network statistics, including distributions of node degree, node flux, link weight, and betweenness centrality, to compare the performance of the travel demand models against each other and against observations. We also used the Kullback-Leibler divergence and an edge-based measure of network dissimilarity to quantify the extent to which the modeled network statistics agree with the real-world observations. Results suggest that the studied demand models consistently reproduce network properties close to the patterns in the observed travel demand network.


Nonetheless, each of the modeling approaches outperforms the other two for some of the examined network-level indicators. This is a crucial finding, as it reveals the significance of using a diverse set of performance measures when evaluating travel demand models. For example, had we only considered the probability distribution of network dissimilarity measures, we would have concluded that the three modeling approaches perform equally well. Alternatively, if one only considers the link measurements, the decision tree approach is found to be the best approach for forecasting (for the 2009 data), while for almost every other network statistic, we found that the other two methods outperform the DT approach. An important contribution here is, therefore, to highlight the complexity involved in the development, evaluation, and validation of travel demand models, which calls for advanced evaluation techniques reflecting a wide range of attributes of the observed and modeled data, travelers, and the demand network. We continue to explore the effectiveness of the evaluated modeling methods on other data sets as ongoing work. Further, by examining the proposed method on other types of travel demand data and models combined with social network characteristics, we can develop a better understanding of the behavioral properties of the travel demand network attributes.

REFERENCES

1. Faloutsos, M., Faloutsos, P., Faloutsos, C., 1999. On Power-Law Relationships of the Internet Topology. In Conference of the ACM Special Interest Group on Data Communications (SIGCOMM). ACM Press, New York, NY, 251–262.
2. Yook, S.H., Jeong, H., Barabási, A.L., 2002. Modeling the internet's large-scale topology. Proceedings of the National Academy of Sciences of the United States (PNAS), 99(22), 13382-13386.
3. Siganos, G., Faloutsos, M., Faloutsos, P., Faloutsos, C., 2003. Power laws and the AS-level internet topology. IEEE/ACM Transactions on Networking, 11(4), 514-524.
4. Watts, D.J., 2003. Six degrees. The science of a connected age. W. W. Norton & Co. Inc., New York.
5. Brockmann, D., Hufnagel, L., Geisel, T., 2006. The scaling laws of human travel. Nature 439, 462-465.
6. González, M.C., Hidalgo, C.A., Barabási, A.L., 2008. Understanding individual human mobility patterns. Nature 453, 779-782.
7. Bazzani, A., Giorgini, B., Rambaldi, S., Gallotti, R., Giovannini, L., 2010. Statistical laws in urban mobility from microscopic GPS data in the area of Florence. J. Stat. Mech. 2010, P05001.
8. Song, C., Koren, T., Wang, P., Barabási, A., 2010. Modelling the scaling properties of human mobility. Nat. Phys. 6, 818–823.
9. Roth, C., Kang, S. M., Batty, M., Barthélemy, M., 2011. Structure of urban movements: polycentric activity and entangled hierarchical flows. PLoS ONE 6, e15923.
10. Woolley-Meza, O., Thiemann, C., Grady, D., Lee, J.J., Seebens, H., Blasius, B., Brockmann, D., 2011. Complexity in human transportation networks: A comparative analysis of worldwide air transportation and global cargo ship movements. European Physical Journal B 84, 589-600.
11. Noulas, A., Scellato, S., Lambiotte, R., Pontil, M., Mascolo, C., 2012. A Tale of Many Cities: Universal Patterns in Human Urban Mobility. PLoS ONE 7(5): e37027.
12. Schneider, C.M., Belik, V., Couronné, T., Smoreda, Z., González, M.C., 2013. Unravelling daily human mobility motifs. Journal of The Royal Society Interface, volume 10, 1-8.
13. Simini, F., González, M.C., Maritan, A., Barabási, A.L., 2012. A universal model for mobility and migration patterns. Nature 484, 96–100.
14. Colak, S., Schneider, C.M., Wang, P., González, M.C., 2013. On the role of spatial dynamics and topology on network flows. New Journal of Physics 15: 113037.
15. Wang, P., Hunter, T., Bayen, A., Schechtner, K., González, M.C., 2012. Understanding Road Usage Patterns in Urban Areas. Scientific Reports, volume 2, doi:10.1038/srep01001.
16. Hasan, S., Schneider, C.M., Ukkusuri, S.V., González, M.C., 2013. Spatio-temporal Patterns of Urban Human Mobility. Journal of Statistical Physics, volume 151.


17. Iqbal, M.S., Choudhury, C.F., Wang, P., González, M.C., 2014. Development of Origin-Destination Matrices Using Mobile Phone Call Data. Transportation Research C (40), 63-74.
18. Widhalm, P., Yang, Y., Ulm, M., Athavale, S., González, M.C., 2015. Discovering urban activity patterns in cell phone data. Transportation 42(4), 597-623.
19. Axhausen, K., Garling, T., 1992. Activity-based approaches to travel analysis: conceptual frameworks, models, and research problems. Transport Reviews, 12(4), 323-341.
20. Bowman, J.L., Ben-Akiva, M.E., 2001. Activity-based disaggregate travel demand model system with activity schedules. Transportation Research Part A, 35, 1-28.
21. Vovsha, P., Bradley, M., Bowman, J., 2004. Activity-based travel forecasting models in the United States: Progress since 1995 and Prospects for the Future. In the EIRASS Conference on Progress in Activity-Based Analysis, May 28-31, Vaeshartelt Castle, Maastricht, The Netherlands.
22. Axhausen, K. W., 2000. Activity-based modelling: Research directions and possibilities, New Look at Multi-Modal Modeling. Department of Environment, Transport and the Regions, London, Cambridge and Oxford.
23. Castiglione, J., Bradley, M., Gliebe, J., 2015. Activity-Based Travel Demand Models: A Primer. No. SHRP 2 Report S2-C46-RR-1.
24. Ghasri, M., Rashidi, T.H., Waller, S.T., 2015. A Novel System of Disaggregate Models for Travel Demand Modelling. Using Decision Tree and Random Forest Concepts. In the 94th Annual Meeting of the Transportation Research Board, Washington DC, USA, 11-15 January, paper #15-3781.
25. Saberi, M., Mahmassani, H., Brockmann, D., Hosseini, A., 2016. A Complex Network Perspective for Characterizing Urban Travel Demand Patterns: Graph Theoretical Analysis of Large-Scale Origin-Destination Demand Networks. Transportation. doi:10.1007/s11116-016-9706-6.
26. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification and regression trees. Wadsworth Statistics/Probability. Chapman and Hall/CRC.
27. Kim, J. W., Lee, B. H., Shaw, M. J., Chang, H. L., Nelson, M., 2001. Application of decision-tree induction techniques to personalized advertisements on internet storefronts. International Journal of Electronic Commerce, 5(3): 45-62.
28. Breiman, L., Cutler, A., 2004. Random Forests. Department of Statistics, University of California, Berkeley.
29. Breiman, L., 1996. Bagging Predictors. Machine Learning, 24, 123-140.
30. Breiman, L., 2001. Random forests. Machine Learning 45, 5-32.
31. Federal Highway Administration, 2010. Travel Model Validation and Reasonableness Checking Manual. Second Edition. Last accessed July 2016 via https://www.fhwa.dot.gov/planning/tmip/publications/other_reports/validation_and_reasonableness_2010/fhwahep10042.pdf
32. Wegmann, F., Everett, J., 2008. Minimum Travel Demand Model Calibration and Validation Guidelines for State of Tennessee. Center for Transportation Research, University of Tennessee, Knoxville.
33. Florian, M., Nguyen, S., 1976. An application and validation of equilibrium trip assignment methods. Transportation Science, 10(4), 374-390.
34. Wu, C. H., Ho, J. M., Lee, D. T., 2004. Travel-time prediction with support vector regression. IEEE Transactions on Intelligent Transportation Systems, 5(4), 276-281.
35. Batty, M., 2013. The New Science of Cities. MIT Press, Cambridge.
36. Newman, M., 2001. The structure of scientific collaboration networks. Proc. Nat. Acad. Sci. USA 98(2), 404–409.
37. Newman, M., 2010. Networks: an introduction. Oxford University Press, Oxford.
38. Newman, M., Park, J., 2003. Why social networks are different from other types of networks. Phys. Rev. E 68(3), 036122.
39. Kullback, S., Leibler, R.A., 1951. On information and sufficiency. Annals of Mathematical Statistics, 22(1): 79–86. doi:10.1214/aoms/1177729694.
40. Kullback, S., 1959. Information Theory and Statistics. John Wiley & Sons.